Boosting in data mining

What is Boosting?

Boosting is an efficient ensemble algorithm that converts a collection of weak learners into a strong learner.

Example:

Suppose we want to check whether an email is a "spam email" or a "safe email".

In this case, we can come up with several simple rules, such as:

  • Rule 1: The email contains only links to some websites.
    • Decision: It is spam.
  • Rule 2: The email comes from an official, verified email address.
    • Decision: It is not spam.
  • Rule 3: The email requests private bank details, e.g., a bank account number or a father's/mother's name.
    • Decision: It is spam.

Now the question is: are the 3 rules discussed above enough to classify an email as "spam" or not?

  • Answer: These 3 rules are not enough. Each of these 3 rules is a weak learner on its own, so we need to boost them into a stronger learner.
  • Boosting does this by combining the weak learners and assigning a weight to each one, as the sketch below illustrates.
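
A minimal sketch in plain Python of such a weighted vote over the three rules above. The rule functions, features, and weights here are invented for illustration; in real boosting, the weights would be learned from each rule's training error (e.g., AdaBoost's alpha values).

```python
def rule_links_only(email):
    # Weak rule 1: flag emails whose body is mostly links.
    return 1 if email["link_ratio"] > 0.8 else -1  # 1 = spam, -1 = not spam

def rule_official_sender(email):
    # Weak rule 2: trust mail from a verified official address.
    return -1 if email["official_sender"] else 1

def rule_asks_bank_details(email):
    # Weak rule 3: flag requests for private bank details.
    return 1 if email["asks_bank_details"] else -1

# Hypothetical weights; boosting would learn these from training error.
weighted_rules = [(0.4, rule_links_only),
                  (0.9, rule_official_sender),
                  (1.2, rule_asks_bank_details)]

def strong_classifier(email):
    # Weighted vote: the sign of the weighted sum of weak decisions.
    score = sum(w * rule(email) for w, rule in weighted_rules)
    return "spam" if score > 0 else "not spam"

email = {"link_ratio": 0.9, "official_sender": False, "asks_bank_details": True}
print(strong_classifier(email))  # -> spam
```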

Boosting generally achieves greater accuracy than bagging.

How does the boosting algorithm work?

The boosting algorithm generates multiple weak learners and combines their predictions to form one strong learner. These weak learners are obtained by applying a base learning algorithm to different distributions (re-weightings or re-samplings) of the given sample of the data set.
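
A runnable sketch of this loop using scikit-learn's AdaBoostClassifier (assuming scikit-learn 1.2+, where the weak-learner argument is named estimator; older releases call it base_estimator). Each weak learner is a one-level decision tree ("stump"), and each round re-weights the samples that earlier rounds misclassified:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

boosted = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # the weak learner
    n_estimators=50,                                # 50 boosting rounds
    random_state=0,
)
boosted.fit(X_train, y_train)  # misclassified samples gain weight each round
print("test accuracy:", boosted.score(X_test, y_test))
```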

Is Random Forest a boosting algorithm?

No, Random Forest is not a boosting algorithm. A random forest is an averaging (bagging) ensemble method: it reduces the variance of individual trees by growing many trees on random samples of the given data set and then averaging their predictions.
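
For contrast, a minimal random-forest sketch with scikit-learn (assumed installed): many trees are grown independently on bootstrap samples of the data, and the forest averages their votes to reduce variance.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,  # trees are grown independently, not sequentially
    bootstrap=True,    # each tree sees a bootstrap sample of the data
    random_state=0,
)
forest.fit(X, y)
print("training accuracy:", forest.score(X, y))
```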

Is Boosting better than bagging?

Boosting tries to reduce bias, while bagging mainly reduces variance and provides higher stability. When over-fitting is the problem, bagging is the better remedy; boosting, on the other hand, can make over-fitting worse.
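
A quick side-by-side sketch under the same scikit-learn assumption (1.2+ for the estimator argument): bagging and boosting combine the same weak tree on the same data, so their held-out accuracies can be compared directly; the exact numbers depend on the data set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

weak = DecisionTreeClassifier(max_depth=1)
bagged = BaggingClassifier(estimator=weak, n_estimators=50, random_state=0)
boosted = AdaBoostClassifier(estimator=weak, n_estimators=50, random_state=0)

for name, model in (("bagging", bagged), ("boosting", boosted)):
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```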

Why does boosting not overfit?

Boosting often resists over-fitting, and some of the reasons are mentioned below:

  • As iterations move forward, the impact of each change is localized.
  • Parameters are not jointly optimized; estimation proceeds one stage at a time.
  • This stage-wise estimation slows down the learning process of the algorithm, which acts as a form of regularization (illustrated in the sketch below).
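
One way to see this stage-wise behavior, assuming scikit-learn: staged_predict replays the fitted model after every boosting round, so we can watch held-out accuracy as rounds are added, with a small learning_rate keeping each step deliberately slow.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=200,
    learning_rate=0.05,  # small steps: slow, stage-wise learning
    random_state=0,
).fit(X_train, y_train)

# Accuracy of the same fitted model after 1, 50, 100, 150, 200 rounds.
staged = list(model.staged_predict(X_test))
for i in (0, 49, 99, 149, 199):
    print(f"round {i + 1:3d}: test accuracy = "
          f"{accuracy_score(y_test, staged[i]):.3f}")
```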

Is gradient boosting supervised or unsupervised?

Gradient boosting is a supervised machine learning technique, and it is used for both classification and regression.
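
A minimal supervised sketch with scikit-learn's GradientBoostingRegressor (assumed installed): the model needs labeled pairs (X, y) at training time, which is exactly what makes it supervised.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reg = GradientBoostingRegressor(random_state=0)
reg.fit(X_train, y_train)  # the labels y_train are required for training
print("R^2 on held-out data:", reg.score(X_test, y_test))
```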

Does boosting use bootstrap?

Yes, boosting can be combined with bootstrap-style sampling; for example, gradient boosting can fit each round on a random subsample of the training data.
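
For example, in scikit-learn's GradientBoostingClassifier, setting subsample below 1.0 fits each round on a random fraction of the training data (sampling without replacement rather than a true bootstrap, but in the same spirit); this variant is known as stochastic gradient boosting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, random_state=0)

sgb = GradientBoostingClassifier(
    n_estimators=100,
    subsample=0.8,  # each round's tree sees a random 80% of the data
    random_state=0,
).fit(X, y)
print("training accuracy:", sgb.score(X, y))
```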

Why does boosting reduce bias?

Each new weak learner in boosting is fitted to the mistakes of the current ensemble, so the training error keeps being pushed down; this is why boosting helps to reduce bias.
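
A small sketch of this, assuming scikit-learn: AdaBoost's staged_score reports training accuracy after each round, which typically climbs as later learners correct earlier mistakes, i.e., the bias on the training data shrinks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
scores = list(ada.staged_score(X, y))  # training accuracy after each round
for rounds in (1, 10, 50, 100):
    print(f"after {rounds:3d} rounds: training accuracy = {scores[rounds - 1]:.3f}")
```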

Types of boosting algorithms:

Three main types of boosting algorithms, sketched side by side below the list, are as follows:

  1. XGBoost algorithm
  2. AdaBoost algorithm
  3. Gradient tree boosting algorithm
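
A minimal sketch of the three, assuming both scikit-learn and the separate xgboost package are installed:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, random_state=0)

models = {
    "XGBoost": XGBClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "Gradient tree boosting": GradientBoostingClassifier(n_estimators=50,
                                                         random_state=0),
}
for name, model in models.items():
    model.fit(X, y)
    print(f"{name}: training accuracy = {model.score(X, y):.3f}")
```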
