Naïve Bayes is a probabilistic machine learning algorithm based on Bayes’ Theorem.
It is called “naïve” because it assumes that all features (input variables) are conditionally independent of one another given the class. This assumption is rarely true in real life, but it still works surprisingly well in many practical cases.
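Written out, the independence assumption is what makes the model tractable: the joint likelihood of all $n$ features given a class $C$ factorizes into a product of per-feature terms (the standard form, stated here for clarity):

$$P(x_1, x_2, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)$$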
## Bayes’ Theorem

$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)}$$
Where:

- $P(A \mid B)$: probability of class A given feature B (posterior probability)
- $P(B \mid A)$: probability of feature B given class A (likelihood)
- $P(A)$: probability of class A (prior probability)
- $P(B)$: probability of feature B (evidence)
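As a quick numeric sketch, the theorem is a one-line computation. The values below are illustrative assumptions, not taken from the example later in this tutorial:

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Illustrative values: P(B|A) = 0.8, P(A) = 0.3, P(B) = 0.5
print(posterior(0.8, 0.3, 0.5))  # 0.48
```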
## Working of Naïve Bayes Classifier

- Calculate prior probabilities for each class.
- Calculate likelihoods (the probability of each feature given a class).
- Apply Bayes’ theorem to find the posterior probability for each class.
- The class with the highest posterior probability is the predicted class.
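A minimal from-scratch sketch of these steps, assuming categorical features and plain maximum-likelihood estimates with no smoothing; the function names here are illustrative, not from any particular library:

```python
from collections import Counter, defaultdict

def train_naive_bayes(X, y):
    """Steps 1-2: estimate priors P(class) and likelihoods P(feature_j = v | class)."""
    n = len(y)
    class_counts = Counter(y)
    priors = {c: count / n for c, count in class_counts.items()}
    value_counts = defaultdict(Counter)  # (class, feature index) -> counts of values
    for features, c in zip(X, y):
        for j, v in enumerate(features):
            value_counts[(c, j)][v] += 1

    def likelihood(c, j, v):
        return value_counts[(c, j)][v] / class_counts[c]

    return priors, likelihood

def predict(x, priors, likelihood):
    """Steps 3-4: multiply the prior by the per-feature likelihoods (the evidence
    is the same for every class, so it is dropped) and return the top class."""
    scores = dict(priors)
    for c in scores:
        for j, v in enumerate(x):
            scores[c] *= likelihood(c, j, v)
    return max(scores, key=scores.get)

# One binary feature: is the word "Offer" present? (See the example below.)
X = [("Yes",), ("Yes",), ("No",), ("Yes",), ("No",)]
y = ["Spam", "Spam", "Not Spam", "Not Spam", "Not Spam"]
priors, likelihood = train_naive_bayes(X, y)
print(predict(("Yes",), priors, likelihood))  # Spam
```

One caveat of the unsmoothed estimates: a feature value never seen with a class drives that class’s score to zero, which is why practical implementations add Laplace smoothing.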
## Example: Email Spam Classification
Let’s say we want to classify whether an email is “Spam” or “Not Spam” using the presence of the word “Offer”.
### Step 1: Given Data

| Email | Word “Offer” Present | Class |
|---|---|---|
| 1 | Yes | Spam |
| 2 | Yes | Spam |
| 3 | No | Not Spam |
| 4 | Yes | Not Spam |
| 5 | No | Not Spam |
### Step 2: Calculate Priors

$$P(\text{Spam}) = \frac{2}{5} = 0.4 \qquad P(\text{Not Spam}) = \frac{3}{5} = 0.6$$
### Step 3: Calculate Likelihoods

$$P(\text{Offer} = \text{Yes} \mid \text{Spam}) = \frac{2}{2} = 1 \qquad P(\text{Offer} = \text{Yes} \mid \text{Not Spam}) = \frac{1}{3} \approx 0.33$$
### Step 4: Apply Bayes’ Theorem

We want to find whether an email containing “Offer” is spam.

$$P(\text{Spam} \mid \text{Offer} = \text{Yes}) = \frac{P(\text{Offer} = \text{Yes} \mid \text{Spam}) \times P(\text{Spam})}{P(\text{Offer} = \text{Yes})}$$

Since we are only comparing classes, we can ignore the evidence term $P(\text{Offer} = \text{Yes})$: it is the same for every class.

$$P(\text{Spam} \mid \text{Offer} = \text{Yes}) \propto 1 \times 0.4 = 0.4$$
$$P(\text{Not Spam} \mid \text{Offer} = \text{Yes}) \propto 0.33 \times 0.6 = 0.198$$
### Step 5: Compare and Classify

Since $0.4 > 0.198$:

$$\boxed{\text{Email is classified as SPAM.}}$$
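As a sanity check, here is a small standalone sketch reproducing Steps 2–5 with exact fractions (variable names are our own). Note that the exact Not Spam score is $0.2$; the $0.198$ above comes from rounding $\frac{1}{3}$ to $0.33$:

```python
from fractions import Fraction

# Step 2: priors from the five-email table.
priors = {"Spam": Fraction(2, 5), "Not Spam": Fraction(3, 5)}
# Step 3: P(Offer = Yes | class).
likelihood = {"Spam": Fraction(2, 2), "Not Spam": Fraction(1, 3)}

# Step 4: unnormalized posteriors (evidence dropped; it is equal across classes).
scores = {c: likelihood[c] * priors[c] for c in priors}
print(scores)                        # {'Spam': Fraction(2, 5), 'Not Spam': Fraction(1, 5)}

# Step 5: pick the class with the highest score.
print(max(scores, key=scores.get))   # Spam, since 2/5 > 1/5
```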
## Applications of Naïve Bayes

- Spam filtering (Gmail, Yahoo Mail)
- Sentiment analysis (Positive/Negative reviews)
- Document classification
- Medical diagnosis
- Weather prediction