Evaluation of a classifier by confusion matrix in data mining
How to evaluate a classifier?
A classifier can be evaluated by building its confusion matrix, which shows how many predictions were correct and how many were wrong for each class.
The confusion matrix for the class labels positive (+VE) and negative (-VE) is shown below:
| | Actual +VE (Target) | Actual -VE (Target) | |
| Predicted +VE (Model) | A = True +VE | B = False +VE | +VE prediction: P = A / (A + B) |
| Predicted -VE (Model) | C = False -VE | D = True -VE | -VE prediction: D / (C + D) |
| | Sensitivity: A / (A + C) | Specificity: D / (B + D) | Accuracy: (A + D) / (A + B + C + D) |
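As a minimal sketch of how the four cells of the table above can be filled in, the Python snippet below counts them from two lists of labels. The function name `confusion_counts` and the sample labels are made up for illustration. A counts tuples predicted +VE that are actually +VE, B counts tuples predicted +VE that are actually -VE, C counts tuples predicted -VE that are actually +VE, and D counts tuples predicted -VE that are actually -VE.

```python
def confusion_counts(actual, predicted, positive="+VE"):
    """Return the cells (A, B, C, D) of a two-class confusion matrix."""
    a = b = c = d = 0
    for truth, guess in zip(actual, predicted):
        if guess == positive and truth == positive:
            a += 1  # predicted +VE, actually +VE
        elif guess == positive:
            b += 1  # predicted +VE, actually -VE
        elif truth == positive:
            c += 1  # predicted -VE, actually +VE
        else:
            d += 1  # predicted -VE, actually -VE
    return a, b, c, d


# Hypothetical labels, for illustration only.
actual = ["+VE", "+VE", "+VE", "-VE", "-VE", "-VE", "-VE", "+VE"]
predicted = ["+VE", "+VE", "-VE", "-VE", "+VE", "-VE", "-VE", "+VE"]

print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```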
Accuracy:
Accuracy is the proportion of all predictions that are correct.
Accuracy = (A + D) / (A + B + C + D)
Error-Rate:
Error Rate = 1 – Accuracy
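A short Python sketch of these two measures, using made-up counts for A, B, C and D. It also computes the error rate directly from the wrong predictions (B and C) to show that it agrees with 1 minus the accuracy.

```python
import math

# Hypothetical counts, named after the cells of the confusion matrix.
A, B, C, D = 50, 10, 5, 35

total = A + B + C + D

# The parentheses around (A + D) matter: without them only D is divided.
accuracy = (A + D) / total
# Wrong predictions are the off-diagonal cells B and C.
error_rate = (B + C) / total

print(accuracy)    # 0.85
print(error_rate)  # 0.15

# The error rate is the complement of accuracy, up to floating-point rounding.
print(math.isclose(error_rate, 1 - accuracy))  # True
```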
+VE predictions:
The +VE prediction value is the proportion of +VE predictions that are correct.
+VE predictions = A / (A+B)
-VE predictions:
The -VE prediction value is the proportion of -VE predictions that are correct.
-VE predictions = D / (C + D)
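Both prediction values can be computed from the same four counts. The sketch below uses a hypothetical helper, `prediction_values`, with made-up counts. Returning `None` when the classifier never predicts a class is one possible convention for the undefined case, not the only one.

```python
def prediction_values(a, b, c, d):
    """Return (+VE prediction value, -VE prediction value).

    A value is None when the classifier made no predictions of that kind,
    because the proportion would require dividing by zero.
    """
    positive = a / (a + b) if (a + b) else None
    negative = d / (c + d) if (c + d) else None
    return positive, negative


print(prediction_values(40, 10, 5, 45))  # (0.8, 0.9)
print(prediction_values(0, 0, 5, 45))    # (None, 0.9)
```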
Precision:
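Precision measures correctness: of the tuples that the classifier assigned to a class, how many actually belong to that class.
- For the +VE class: tuples that are +VE and that the classifier predicted as +VE, out of all tuples predicted as +VE
- For the -VE class: tuples that are -VE and that the classifier predicted as -VE, out of all tuples predicted as -VE
Precision (+VE class) = A / (A + B), where A + B is the number of tuples predicted as +VE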
Recall:
Recall = A / (number of actual +VE tuples) = A / (A + C)
Sensitivity (Recall):
Sensitivity is the true +VE rate.
Sensitivity is the proportion of the actual +VE cases that are correctly identified.
Sensitivity (Recall) = A / (A + C)
F-Measure:
F-Measure is the harmonic mean of recall and precision.
F-Measure = (2 * Precision * Recall) / (Precision + Recall)
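The three measures can be sketched together in Python. The function name `precision_recall_f` and the counts are assumptions for illustration, and the sketch assumes none of the denominators is zero. Because the F-Measure is a harmonic mean, it can also be written directly in terms of the counts as 2A / (2A + B + C), which the last line checks.

```python
import math


def precision_recall_f(a, b, c):
    """Return (precision, recall, F-Measure) for the +VE class."""
    precision = a / (a + b)  # correct +VE predictions / all +VE predictions
    recall = a / (a + c)     # correct +VE predictions / all actual +VE tuples
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts: A = 40, B = 10, C = 5.
precision, recall, f_measure = precision_recall_f(40, 10, 5)
print(round(precision, 3), round(recall, 3), round(f_measure, 3))
# 0.8 0.889 0.842

# Equivalent form of the F-Measure written with the raw counts.
print(math.isclose(f_measure, 2 * 40 / (2 * 40 + 10 + 5)))  # True
```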
Specificity:
Specificity is the true -VE rate.
Specificity is the proportion of the actual -VE cases that are correctly identified.
Specificity = D / (B + D)
Note: The specificity of one class is the same as the sensitivity of the other class.
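The note can be checked with a small Python sketch. The helper `swap_classes` is hypothetical: it re-labels the matrix so that -VE plays the role of the positive class, which exchanges A with D and B with C. The counts are made up.

```python
def sensitivity(a, b, c, d):
    return a / (a + c)


def specificity(a, b, c, d):
    return d / (b + d)


def swap_classes(a, b, c, d):
    """Re-label the matrix so that -VE is treated as the positive class."""
    return d, c, b, a


counts = (40, 10, 5, 45)  # hypothetical A, B, C, D
swapped = swap_classes(*counts)

# Specificity for +VE equals sensitivity once the class roles are swapped.
print(specificity(*counts) == sensitivity(*swapped))  # True
print(sensitivity(*counts) == specificity(*swapped))  # True
```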