Overfitting of a decision tree:
Before discussing overfitting, let’s review training data and test data:
Training data is the data used to build (train) the tree.
Test data is data held back from training; it is used to assess how well the trained tree predicts on examples it has not seen.
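As a rough illustration of the two sets, here is a minimal sketch that assumes a simple random hold-out split. The function name split_rows, the 25% test fraction, the fixed seed, and the toy rows are all assumptions made for this example.

```python
import random

def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (training_rows, test_rows)."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = rows[:]                 # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Made-up rows of (feature value, label).
rows = [(x, int(x >= 10)) for x in range(20)]
train, test = split_rows(rows)
print(len(train), len(test))
```

The tree is built only from train; test is kept aside and used afterwards to check the predictions.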
Overfitting means the tree has too many unnecessary branches. Many of these branches reflect anomalies in the training data that are caused by outliers and noise, so an overfitted tree fits the training data very closely but predicts poorly on test data.
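The effect can be seen on a toy problem. The sketch below is only an illustration under stated assumptions: a tiny one-feature tree using Gini impurity, made-up data whose true rule is "label is 1 when x >= 10", and two training labels flipped on purpose to act as noise. All helper names (build, predict, accuracy, and so on) are hypothetical.

```python
from collections import Counter

def majority(rows):
    """Most common label among (x, label) rows."""
    return Counter(label for _, label in rows).most_common(1)[0][0]

def gini(rows):
    """Gini impurity of the labels in rows (0.0 means all labels agree)."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())

def best_split(rows):
    """Threshold with the lowest weighted Gini, or None if x has one value."""
    best = None
    values = sorted({x for x, _ in rows})
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for r in rows if r[0] <= t]
        right = [r for r in rows if r[0] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[0]:
            best = (score, t, left, right)
    return best

def build(rows, max_depth=None, depth=0):
    """Grow a tree until leaves are pure, or until max_depth if it is given."""
    split = best_split(rows)
    too_deep = max_depth is not None and depth >= max_depth
    if gini(rows) == 0.0 or split is None or too_deep:
        return {"leaf": majority(rows)}
    _, t, left, right = split
    return {"t": t,
            "left": build(left, max_depth, depth + 1),
            "right": build(right, max_depth, depth + 1)}

def predict(node, x):
    while "leaf" not in node:
        node = node["left"] if x <= node["t"] else node["right"]
    return node["leaf"]

def accuracy(node, rows):
    return sum(predict(node, x) == y for x, y in rows) / len(rows)

def count_leaves(node):
    if "leaf" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

# Made-up data: labels at x = 3 and x = 15 are flipped in the training set only.
noisy = {3, 15}
train = [(x, int(x >= 10) ^ int(x in noisy)) for x in range(20)]
test = [(x + 0.25, int(x >= 10)) for x in range(20)]

full = build(train)                 # grown until every leaf is pure
small = build(train, max_depth=1)   # a single split

print(accuracy(full, train), accuracy(full, test))    # 1.0 0.9
print(accuracy(small, train), accuracy(small, test))  # 0.9 1.0
print(count_leaves(full), count_leaves(small))
```

The fully grown tree adds branches just to isolate the two noisy points, so it scores perfectly on the training data but worse on the test data than the single-split tree.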
How to avoid overfitting?
There are two techniques to avoid overfitting:
Pre-pruning means stopping the growth of the tree early, before it is fully grown.
Post-pruning means letting the tree grow with no size limit and then, once the tree is complete, pruning its branches.
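Both techniques can be sketched on a small example. This is not a definitive implementation: it assumes a depth limit (max_depth) as the pre-pruning stop rule and reduced-error pruning on a separate validation set as the post-pruning method, which is only one of several possible strategies. The helper names and the toy data (true rule "label is 1 when x >= 10", with two training labels flipped as noise) are hypothetical.

```python
from collections import Counter

def majority(rows):
    return Counter(label for _, label in rows).most_common(1)[0][0]

def gini(rows):
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())

def best_split(rows):
    """Threshold with the lowest weighted Gini, or None if x has one value."""
    best = None
    values = sorted({x for x, _ in rows})
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for r in rows if r[0] <= t]
        right = [r for r in rows if r[0] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[0]:
            best = (score, t, left, right)
    return best

def build(rows, max_depth=None, depth=0):
    """Grow a tree; max_depth is the pre-pruning stop rule (None = no limit)."""
    split = best_split(rows)
    stop_early = max_depth is not None and depth >= max_depth
    if gini(rows) == 0.0 or split is None or stop_early:
        return {"leaf": majority(rows)}
    _, t, left, right = split
    return {"t": t, "majority": majority(rows),
            "left": build(left, max_depth, depth + 1),
            "right": build(right, max_depth, depth + 1)}

def predict(node, x):
    while "leaf" not in node:
        node = node["left"] if x <= node["t"] else node["right"]
    return node["leaf"]

def accuracy(node, rows):
    return sum(predict(node, x) == y for x, y in rows) / len(rows)

def count_leaves(node):
    if "leaf" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

def post_prune(node, val_rows):
    """Reduced-error pruning, bottom-up: replace a subtree with a leaf when
    that does not increase errors on the validation rows reaching it."""
    if "leaf" in node:
        return node
    left_rows = [r for r in val_rows if r[0] <= node["t"]]
    right_rows = [r for r in val_rows if r[0] > node["t"]]
    pruned = dict(node,
                  left=post_prune(node["left"], left_rows),
                  right=post_prune(node["right"], right_rows))
    leaf = {"leaf": node["majority"]}
    subtree_errors = sum(predict(pruned, x) != y for x, y in val_rows)
    leaf_errors = sum(predict(leaf, x) != y for x, y in val_rows)
    return leaf if leaf_errors <= subtree_errors else pruned

# Made-up data: labels at x = 3 and x = 15 are flipped in the training set only.
noisy = {3, 15}
train = [(x, int(x >= 10) ^ int(x in noisy)) for x in range(20)]
validation = [(x + 0.25, int(x >= 10)) for x in range(20)]

full = build(train)                       # no size limit
pre_pruned = build(train, max_depth=1)    # growth stopped early
post_pruned = post_prune(full, validation)  # grown fully, then cut back

for name, tree in [("full", full), ("pre", pre_pruned), ("post", post_pruned)]:
    print(name, count_leaves(tree), accuracy(tree, validation))
```

On this toy data both pruned trees end up with a single split, while the unpruned tree keeps extra branches that only describe the two noisy points. In practice the rows used to guide post-pruning should be kept separate from the test data used for the final assessment.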
Advantages of pre-pruning and post-pruning:
- Pruning keeps the tree from growing unnecessarily large.
- Pruning reduces the complexity of the tree.