NAÏVE BAYES ALGORITHM
What Is the Naive Bayes Algorithm?
It is a classification technique based on Bayes’ Theorem with an
independence assumption among predictors. In simple terms, a Naive Bayes
classifier assumes that the presence of a particular feature in a class is
unrelated to the presence of any other feature.
The Naïve Bayes classifier is a popular supervised machine learning
algorithm used for classification tasks such as text classification. It belongs
to the family of generative learning algorithms, which means that it models the
distribution of inputs for a given class or category. This approach is based on
the assumption that the features of the input data are conditionally
independent given the class, allowing the algorithm to make predictions quickly
and accurately.
In statistics, naive Bayes classifiers are considered simple
probabilistic classifiers that apply Bayes’ theorem, which updates the
probability of a hypothesis from prior knowledge as data is observed. The
naive Bayes classifier assumes that all features in the input data are
independent of each other, which is often not true in real-world scenarios.
However, despite this simplifying assumption, the naive Bayes classifier is
widely used because of its efficiency and good performance in many real-world
applications.
Moreover, it is worth noting that naive Bayes classifiers are among the
simplest Bayesian network models, yet they can achieve high accuracy levels
when coupled with kernel density estimation. This technique involves using a
kernel function to estimate the probability density function of the input data,
allowing the classifier to improve its performance in complex scenarios where
the data distribution is not well-defined. As a result, the naive Bayes
classifier is a powerful tool in machine learning, particularly in text
classification, spam filtering, and sentiment analysis, among others.
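To make the kernel density idea concrete, here is a minimal sketch of a KDE-based Naive Bayes classifier in Python; the class name KDENaiveBayes and the 1e-12 density floor are illustrative choices, not a standard library API:

import numpy as np
from scipy.stats import gaussian_kde

class KDENaiveBayes:
    """Naive Bayes with one univariate KDE per (class, feature) pair."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.priors_ = {c: np.mean(y == c) for c in self.classes_}
        # Independence assumption: estimate each feature's density separately per class.
        self.kdes_ = {c: [gaussian_kde(X[y == c, j]) for j in range(X.shape[1])]
                      for c in self.classes_}
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Log-posterior up to a constant: log prior + sum of per-feature log densities.
        scores = np.column_stack([
            np.log(self.priors_[c])
            + sum(np.log(self.kdes_[c][j](X[:, j]) + 1e-12) for j in range(X.shape[1]))
            for c in self.classes_])
        return self.classes_[np.argmax(scores, axis=1)]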
For example, a fruit may be considered to be an apple if it is red,
round, and about 3 inches in diameter. Even if these features depend on each
other or upon the existence of the other features, all of these properties
independently contribute to the probability that this fruit is an apple, which
is why it is known as ‘Naive’.
A Naive Bayes model is easy to build and particularly useful for very large data
sets. Along with its simplicity, Naive Bayes can outperform even highly
sophisticated classification methods.
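As a rough illustration of that simplicity, a complete model takes only a few lines; a minimal sketch using scikit-learn's GaussianNB on its bundled iris data (assuming scikit-learn is installed):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB().fit(X_train, y_train)  # learns per-class feature means and variances
print("test accuracy:", model.score(X_test, y_test))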
Bayes’ theorem provides a way of computing the posterior probability P(c|x)
from P(c), P(x), and P(x|c). Look at the equation below:

P(c|x) = P(x|c) * P(c) / P(x)

Above,
- P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
- P(c) is the prior probability of the class.
- P(x|c) is the likelihood, which is the probability of the predictor given the class.
- P(x) is the prior probability of the predictor.
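In code, the theorem is a one-liner; a minimal sketch (the function name is illustrative):

def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(c|x) = P(x|c) * P(c) / P(x)."""
    return likelihood * prior / evidence

# Numbers from the weather example worked through below:
print(posterior(prior=9/14, likelihood=3/9, evidence=5/14))  # P(Yes | Sunny) = 0.6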
How Do Naive Bayes Algorithms Work?
Let’s understand it using an example. Below I have a training data set of
weather and the corresponding target variable ‘Play’ (suggesting possibilities
of playing). Now, we need to classify whether players will play or not based on
weather conditions. Let’s follow the steps below to perform it.
1. Convert the data set into a frequency table.
2. Create a likelihood table by finding the probabilities, e.g. Overcast probability = 0.29 and probability of playing = 0.64.
3. Use the Naive Bayesian equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction.
Problem: Players will play if the weather is sunny. Is this statement correct?

We can solve it using the above-discussed method of posterior probability.

P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)

Here P(Sunny | Yes) * P(Yes) is the numerator, and P(Sunny) is the denominator.

We have P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, and P(Yes) = 9/14 = 0.64.

Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60. The competing posterior is
P(No | Sunny) = P(Sunny | No) * P(No) / P(Sunny) = 0.40 * 0.36 / 0.36 = 0.40,
so ‘Yes’ has the higher probability and the statement is correct.
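The same calculation can be scripted from the frequency table. Below is a minimal sketch in Python; the Sunny counts (3 Yes / 2 No, out of 14 rows with 9 ‘Yes’) follow from the figures above, while the Overcast and Rainy counts are taken from the standard version of this weather data set:

from collections import Counter

# Joint (weather, play) frequency counts for the 14-row weather data set.
counts = Counter({("Sunny", "Yes"): 3, ("Sunny", "No"): 2,
                  ("Overcast", "Yes"): 4,  # standard table: Overcast is all "Yes"
                  ("Rainy", "Yes"): 2, ("Rainy", "No"): 3})
total = sum(counts.values())  # 14 observations

def posterior(weather, play):
    class_count = sum(n for (w, p), n in counts.items() if p == play)
    weather_count = sum(n for (w, p), n in counts.items() if w == weather)
    prior = class_count / total                          # P(play)
    likelihood = counts[(weather, play)] / class_count   # P(weather | play)
    evidence = weather_count / total                     # P(weather)
    return likelihood * prior / evidence                 # Bayes' theorem

print(posterior("Sunny", "Yes"))  # 0.6 -> predict "Yes"
print(posterior("Sunny", "No"))   # 0.4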
Naive Bayes uses a similar method to predict the probability of different
classes based on various attributes. The algorithm is mostly used in text
classification (NLP) and in problems having multiple classes.
What Are the Pros and Cons of Naive Bayes?
Pros:
· It is easy and fast to predict the class of the test data set. It also performs well in multi-class prediction.
· When the assumption of independence holds, the classifier performs better than other machine learning models, such as logistic regression or decision trees, and requires less training data.
· It performs well with categorical input variables compared to numerical variable(s). For numerical variables, a normal distribution is assumed (a bell curve, which is a strong assumption).
Cons:
· If a categorical variable has a category in the test data set that was not observed in the training data set, the model will assign it a 0 (zero) probability and will be unable to make a prediction. This is often known as “Zero Frequency”. To solve this, we can use a smoothing technique; one of the simplest is Laplace estimation, sketched after this list.
· On the other side, Naive Bayes is also known to be a bad estimator, so the probability outputs from predict_proba are not to be taken too seriously.
· Another limitation of this algorithm is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
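A minimal sketch of Laplace (add-one) smoothing for a categorical likelihood; the function is illustrative, and scikit-learn exposes the same idea through the alpha parameter of its Naive Bayes classes:

def smoothed_likelihood(category_count, class_count, n_categories, alpha=1.0):
    """P(x = v | c) with Laplace smoothing: never exactly zero."""
    return (category_count + alpha) / (class_count + alpha * n_categories)

# A weather category never seen with class "Yes" (0 of 9 rows, 3 categories):
print(smoothed_likelihood(0, 9, 3))  # ~0.083 instead of 0.0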
Applications of Naive Bayes Algorithms
· Real-time Prediction: The Naive Bayesian classifier is an eager learning classifier and it is super fast. Thus, it can be used for making predictions in real time.
· Multi-class Prediction: This algorithm is also well known for its multi-class prediction feature. Here we can predict the probabilities of multiple classes of the target variable.
· Text Classification / Spam Filtering / Sentiment Analysis: Naive Bayesian classifiers, mostly used in text classification (due to better results in multi-class problems and the independence rule), have higher success rates compared to other algorithms. As a result, they are widely used in spam filtering (identifying spam e-mail) and sentiment analysis (in social media analysis, to identify positive and negative customer sentiment); see the sketch after this list.
· Recommendation System: A Naive Bayes classifier and collaborative filtering together build a recommendation system that uses machine learning and data mining techniques to filter unseen information and predict whether a user would like a given resource or not.
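As a sketch of the text-classification use case, here is a minimal spam filter built with scikit-learn; the four training messages are invented purely for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus; real spam filters train on thousands of messages.
messages = ["win a free prize now", "free cash offer, click now",
            "meeting rescheduled to monday", "lunch with the project team"]
labels = ["spam", "spam", "ham", "ham"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())  # word counts -> Naive Bayes
clf.fit(messages, labels)
print(clf.predict(["free prize offer"]))  # ['spam']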
Conclusion
The Naive Bayes algorithm is one of the most popular and simplest machine learning classification algorithms.
It is based on Bayes’ Theorem for calculating probabilities and conditional probabilities.
You can use it for real-time and multi-class predictions, text classification, spam filtering, sentiment analysis, and a lot more.
Hitansh Lakkad
Business Analytics intern at
Hunnarvi Technologies Pvt Ltd in collaboration with Nanobi Analytics.
VIEWS ARE PERSONAL
#naivebayesalgorithm
#datascience #businessanalytics #hunnarvi #nanobi #isme