Naïve Bayes is a probabilistic machine learning algorithm based on Bayes’ Theorem.
It is called “naïve” because it assumes that all features (input variables) are independent of each other — which is rarely true in real life, but this assumption still works surprisingly well in many practical cases.
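Under this assumption, for a class $C$ and features $x_1, x_2, \dots, x_n$, the joint likelihood factorizes into a product of per-feature likelihoods:

$$P(x_1, x_2, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)$$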
Bayes’ Theorem

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

Where:

- $P(A \mid B)$: Probability of class A given feature B (posterior probability)
- $P(B \mid A)$: Probability of feature B given class A (likelihood)
- $P(A)$: Probability of class A (prior probability)
- $P(B)$: Probability of feature B (evidence)
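As a minimal sketch, the theorem maps directly to a single line of Python. The function name and arguments below are illustrative, and the numbers are taken from the spam example worked out later in this tutorial:

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# P(Spam) = 0.4, P("Offer"|Spam) = 1.0, P("Offer") = 0.6
print(posterior(prior=0.4, likelihood=1.0, evidence=0.6))  # 0.666... = P(Spam|"Offer")
```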
Working of Naïve Bayes Classifier

1. Calculate prior probabilities for each class.
2. Calculate likelihood (probability of each feature given a class).
3. Apply Bayes’ theorem to find the posterior probability for each class.
4. The class with the highest posterior probability is the predicted class.
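These four steps translate directly into code. Below is a minimal from-scratch sketch, assuming categorical features and no smoothing; the function and variable names are illustrative, not a library API:

```python
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (features, label) pairs; features is a dict like {"offer": "yes"}."""
    label_counts = Counter(label for _, label in samples)
    priors = {c: n / len(samples) for c, n in label_counts.items()}        # Step 1: priors
    feature_counts = defaultdict(Counter)
    for features, label in samples:
        feature_counts[label].update(features.items())
    likelihoods = {c: {fv: n / label_counts[c] for fv, n in counts.items()}
                   for c, counts in feature_counts.items()}                # Step 2: likelihoods
    return priors, likelihoods

def predict(features, priors, likelihoods):
    scores = {}
    for c, prior in priors.items():
        score = prior                                    # Step 3: P(C) * product of P(x|C)
        for fv in features.items():
            score *= likelihoods[c].get(fv, 0.0)         # unseen feature value -> 0 (no smoothing)
        scores[c] = score
    return max(scores, key=scores.get)                   # Step 4: highest posterior score wins
```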
Example — Email Spam Classification
Let’s say we want to classify whether an email is “Spam” or “Not Spam” using the presence of the word “Offer”.
Step 1: Given Data
| Email | Word “Offer” Present | Class |
|---|---|---|
| 1 | Yes | Spam |
| 2 | Yes | Spam |
| 3 | No | Not Spam |
| 4 | Yes | Not Spam |
| 5 | No | Not Spam |
Step 2: Calculate Priors

$$P(\text{Spam}) = \frac{2}{5} = 0.4, \qquad P(\text{Not Spam}) = \frac{3}{5} = 0.6$$
Step 3: Calculate Likelihoods

$$P(\text{“Offer”} \mid \text{Spam}) = \frac{2}{2} = 1.0, \qquad P(\text{“Offer”} \mid \text{Not Spam}) = \frac{1}{3} \approx 0.33$$
Step 4: Apply Bayes’ Theorem

We want to find whether an email containing “Offer” is spam:

$$P(\text{Spam} \mid \text{“Offer”}) \propto P(\text{“Offer”} \mid \text{Spam}) \cdot P(\text{Spam}) = 1.0 \times 0.4 = 0.4$$

$$P(\text{Not Spam} \mid \text{“Offer”}) \propto P(\text{“Offer”} \mid \text{Not Spam}) \cdot P(\text{Not Spam}) = \frac{1}{3} \times 0.6 = 0.2$$

But since we only compare these scores across classes, we can ignore the evidence term $P(\text{“Offer”})$ because it is the same for all classes.
Step 5: Compare and Classify

Since $0.4 > 0.2$, the email containing “Offer” is classified as Spam.
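Plugging the table from Step 1 into the `train`/`predict` sketch from the “Working” section above (illustrative code, not a library API) reproduces this result:

```python
data = [
    ({"offer": "yes"}, "Spam"),
    ({"offer": "yes"}, "Spam"),
    ({"offer": "no"},  "Not Spam"),
    ({"offer": "yes"}, "Not Spam"),
    ({"offer": "no"},  "Not Spam"),
]
priors, likelihoods = train(data)
print(priors)                                          # {'Spam': 0.4, 'Not Spam': 0.6}
print(predict({"offer": "yes"}, priors, likelihoods))  # Spam
```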
Applications of Naïve Bayes

- Spam filtering (Gmail, Yahoo Mail)
- Sentiment analysis (Positive/Negative reviews)
- Document classification
- Medical diagnosis
- Weather prediction