Principles and differences between bagging and boosting

Bagging and Boosting both combine existing classification or regression algorithms to build a more powerful learner. More precisely, they are ensemble methods: they assemble weak classifiers into a strong classifier.

First, a word about bootstrapping: it is a sampling method with replacement, so the same sample may be drawn more than once.

1. Bagging (bootstrap aggregating)

The bagging algorithm proceeds as follows:

A) Draw training sets from the original sample set. In each round, n training samples are drawn from the original set by bootstrapping (so some samples may be drawn multiple times while others may not be drawn at all). After k rounds of sampling we obtain k training sets, which are independent of each other.

B) Train one model on each training set, so the k training sets yield k models. (No particular algorithm is prescribed here; any classification or regression method can be used depending on the problem, such as decision trees or perceptrons.)

C) For classification problems, the k models from the previous step vote and the majority class is taken as the result; for regression problems, the outputs of the models are averaged to give the final result. (All models carry equal weight.) A minimal sketch follows.
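
Below is a minimal sketch of this procedure with decision trees as the base learner. The function names (bagging_fit, bagging_predict) and parameters are illustrative, not any library's API; NumPy and scikit-learn are assumed to be available, along with integer class labels starting at 0.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, k=25, seed=0):
    """Steps A and B: draw k bootstrap samples and fit one tree per sample."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)   # sampling with replacement (bootstrap)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Step C: majority vote over the k models; every model has the same weight."""
    preds = np.array([m.predict(X) for m in models])   # shape (k, n_samples)
    # most frequent label per column (assumes non-negative integer labels)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
```

For regression, the only change would be to replace the vote with the mean of the k predictions.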

 

2. Boosting

The main idea is to assemble weak classifiers into a strong classifier. Under the PAC (Probably Approximately Correct) learning framework, weak classifiers can indeed be combined into a strong classifier.

Two core questions about Boosting:

1) How to change the weight or probability distribution of the training data in each round?

By increasing the weights of the examples that the weak classifier misclassified in the previous round and decreasing the weights of the examples it classified correctly, the next classifier is made to focus on the data that was previously misclassified.

2) How to combine weak classifiers?

The weak classifiers are combined linearly through an additive model. For example, AdaBoost uses weighted majority voting: it increases the weight of classifiers with a small error rate and decreases the weight of classifiers with a large error rate, as sketched below.
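
Here is a minimal AdaBoost-style sketch with decision stumps; it assumes labels in {-1, +1}, and the function names are illustrative rather than any library's API.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, rounds=50):
    """y must contain labels -1 and +1."""
    n = len(X)
    w = np.full(n, 1.0 / n)                    # start from uniform example weights
    stumps, alphas = [], []
    for _ in range(rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)  # weighted error rate
        alpha = 0.5 * np.log((1 - err) / err)  # small error -> large classifier weight
        w *= np.exp(-alpha * y * pred)         # raise weights of misclassified examples
        w /= w.sum()                           # renormalize to a distribution
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # weighted majority vote: the additive model sign(sum_m alpha_m * h_m(x))
    score = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(score)
```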

A boosting tree reduces the residual step by step: each round fits a new tree to the current residual, and the models produced in every step are summed to obtain the final model, as in the sketch below.
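
A minimal sketch of this residual-fitting idea for regression (the learning rate, tree depth, and function names are illustrative assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosted_tree_fit(X, y, rounds=100, lr=0.1):
    f0 = y.mean()                          # initial constant model
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(rounds):
        residual = y - pred                # what the current model still gets wrong
        tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
        pred += lr * tree.predict(X)       # superimpose this round's model
        trees.append(tree)
    return f0, trees

def boosted_tree_predict(f0, trees, X, lr=0.1):
    return f0 + lr * sum(t.predict(X) for t in trees)
```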

 

3. The difference between Bagging and Boosting

They differ in the following four respects:

1) Sample selection:

Bagging: Training sets are drawn from the original set with replacement, and the training sets drawn in different rounds are independent of each other.

Boosting: The training set stays the same in every round, but the weight of each example changes; the weights are adjusted according to the previous round's classification results.

2) Example weight:

Bagging: Uniform sampling is used, so every example has equal weight.

Boosting: The example weights are continually adjusted according to the errors; the more often an example is misclassified, the larger its weight becomes.

3) Prediction function:

Bagging: All prediction functions have equal weight.

Boosting: Each weak classifier has its own weight, and classifiers with a smaller classification error receive a larger weight.

4) Parallel computing:

Bagging: The individual prediction functions can be generated in parallel.

Boosting: The prediction functions can only be generated sequentially, because each model's parameters depend on the results of the previous round.

 

4. Summary

Both methods integrate several classifiers into a single classifier, but they integrate them in different ways and therefore obtain different results. Plugging a single classification algorithm into either framework generally improves its accuracy to some degree, at the cost of extra computation.

Combining decision trees with these frameworks yields the following well-known algorithms (a usage sketch follows the list):

1) Bagging + decision tree = random forest

2) AdaBoost + decision tree = boosted tree

3) Gradient Boosting + Decision Tree = GBDT
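
All three combinations are available off the shelf in scikit-learn; the snippet below is a minimal usage sketch on a synthetic dataset, with illustrative parameter values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, model in [
    ("Bagging + decision trees (random forest)", RandomForestClassifier(n_estimators=100)),
    ("AdaBoost + decision trees (boosted tree)", AdaBoostClassifier(n_estimators=100)),
    ("Gradient boosting + decision trees (GBDT)", GradientBoostingClassifier(n_estimators=100)),
]:
    model.fit(X_tr, y_tr)
    print(name, model.score(X_te, y_te))
```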

Reference: https://www.cnblogs.com/liuwu265/p/4690486.html


Origin: blog.csdn.net/Matrix_cc/article/details/105365177