Holdout method for evaluating a classifier in data mining

Holdout method:

The available data is randomly divided into separate, non-overlapping data sets, which need not be the same size, e.g.,

  1. Training set
  2. Test set
  3. Validation set

Training set:

  • It is the data set used to build (train) the model.

Test set:

  • It is a subset of the data that the model has not seen during training, used to assess the performance of the final model.

Validation set:

  • It is also a held-out data set, used to assess the model while it is being built, for example to tune parameters or to choose between candidate models.
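The three sets described above can be produced with one random shuffle followed by two cuts. The following is only a minimal sketch: the function name `three_way_split` and the 60/20/20 proportions are assumptions chosen for illustration, not values prescribed by the holdout method.

```python
import random

def three_way_split(records, train_frac=0.6, val_frac=0.2, seed=0):
    """Randomly partition records into training, validation and test sets.

    The fractions are illustrative assumptions; whatever is left after the
    training and validation cuts becomes the test set.
    """
    shuffled = list(records)               # copy so the input is not reordered
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n = len(shuffled)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]

# Hypothetical data: 100 record ids
train_set, validation_set, test_set = three_way_split(range(100))
print(len(train_set), len(validation_set), len(test_set))  # 60 20 20
```

Because every record lands in exactly one slice, the three sets never overlap, which is what keeps the test set unseen.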

For example:

Suppose the data is divided into 3 parts; 2 parts are used for training and 1 part for testing.

Training set for model construction

  • 2/3 of the data

Test set for accuracy estimation

  • 1/3 of the data
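The 2/3 and 1/3 split can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a definitive implementation: the helper names (`holdout_split`, `majority_class`, `accuracy`), the toy data and the majority-class "model" are all hypothetical stand-ins for a real classifier.

```python
import random
from collections import Counter

def holdout_split(records, train_fraction=2/3, seed=0):
    """Randomly partition records into a training set and a test set."""
    shuffled = list(records)               # copy so the caller's data is not reordered
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def majority_class(train):
    """Toy 'model' (an assumption): always predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]

def accuracy(predicted_label, test):
    """Fraction of test records whose true label equals the predicted label."""
    return sum(label == predicted_label for _, label in test) / len(test)

# Hypothetical data: 30 (feature, label) pairs
data = [(i, "yes" if i % 3 else "no") for i in range(30)]

train, test = holdout_split(data)      # 2/3 for model construction, 1/3 for testing
model = majority_class(train)          # built from the training set only
print(len(train), len(test))           # 20 10
print(accuracy(model, test))           # accuracy estimated on unseen records
```

The accuracy is computed only on records the model never saw during construction, which is the point of holding them out.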
