A summary of the Boosting family of machine learning algorithms: AdaBoost

Preface

As the saying goes, "three cobblers with their wits combined equal one Zhuge Liang", and the same philosophy exists in machine learning. Boosting is a representative example: a family of algorithms that combines a series of weak learners into a strong learner. The working mechanism of the Boosting family is: (1) first train a base learner from the initial training set; (2) then adjust the distribution of the training samples according to the performance of that base learner, so that samples the previous base learner misclassified receive more attention in subsequent rounds; (3) train the next base learner on the adjusted sample distribution; (4) repeat this until the number of base learners reaches a preset value T, and finally combine the T base learners by a weighted vote. (For a concrete walkthrough of this process, see Lecture 8 of the courseware for Prof. Hsuan-Tien Lin's Machine Learning Techniques course.)

In the Boosting family, the most representative implementation is AdaBoost.

1. Introduction to AdaBoost

AdaBoost is short for "Adaptive Boosting". Its adaptivity lies in the fact that samples misclassified by the previous base classifier have their weights increased, while correctly classified samples have their weights decreased; the reweighted samples are then used to train the next base classifier. In each round of iteration a new weak classifier is added, and the process continues until the error rate falls below a predetermined threshold or a pre-specified maximum number of iterations is reached, at which point the final strong classifier is determined.

2. AdaBoost algorithm

Input: training data set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$ with $y_i \in \{-1, +1\}$ (a binary classification problem); a weak learning algorithm;

Output: final classifier G(x)

(1) Initialize the weight distribution of the training data. In the first round the weights are assumed to be uniform, namely:

$$D_1 = (w_{11}, \ldots, w_{1i}, \ldots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$$

(2) Learn base classifiers repeatedly; in each round m = 1, 2, ..., M, execute the following steps:

        (a) Using the training dataset weighted by the current distribution Dm, learn a basic classifier Gm(x);

        (b) Calculate the classification error rate of the base classifier $G_m(x)$ on the weighted training data set:

$$e_m = \sum_{i=1}^{N} P(G_m(x_i) \neq y_i) = \sum_{i=1}^{N} w_{mi}\, I(G_m(x_i) \neq y_i)$$

        (c) Calculate the coefficient of the base classifier $G_m(x)$:

$$\alpha_m = \frac{1}{2} \log \frac{1 - e_m}{e_m}$$

ps. The log here denotes the natural logarithm ln.

            Here $\alpha_m$ represents the importance of $G_m(x)$ in the final classifier. From the formula, $\alpha_m$ increases as $e_m$ decreases (for $e_m \leq 1/2$), which means the smaller the classification error rate, the greater the weight of that classifier in the final classifier. (A small numerical illustration is given after step (3) below.)

        (d) Update the weight distribution of the training data set:

$$D_{m+1} = (w_{m+1,1}, \ldots, w_{m+1,i}, \ldots, w_{m+1,N}), \quad w_{m+1,i} = \frac{w_{mi}}{Z_m} \exp(-\alpha_m y_i G_m(x_i)), \quad i = 1, 2, \ldots, N$$

        where $Z_m = \sum_{i=1}^{N} w_{mi} \exp(-\alpha_m y_i G_m(x_i))$ is the normalization factor that makes $D_{m+1}$ a probability distribution.

       The updated weight $w_{m+1,i}$ can also be written in the following form:

$$w_{m+1,i} = \begin{cases} \dfrac{w_{mi}}{Z_m}\, e^{-\alpha_m}, & G_m(x_i) = y_i \\[6pt] \dfrac{w_{mi}}{Z_m}\, e^{\alpha_m}, & G_m(x_i) \neq y_i \end{cases}$$

        It can be seen that the weights of misclassified samples are increased for the next round of training, while the weights of correctly classified samples are decreased.

(3) Construct a linear combination of the base classifiers:

$$f(x) = \sum_{m=1}^{M} \alpha_m G_m(x)$$

to get the final classifier:

$$G(x) = \operatorname{sign}(f(x)) = \operatorname{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right)$$

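As a small numerical illustration of steps (c) and (d) (the numbers are made up purely for illustration): suppose the weighted error rate in round $m$ is $e_m = 0.3$. Then

$$\alpha_m = \frac{1}{2} \ln \frac{1 - 0.3}{0.3} \approx 0.4236,$$

so before normalization each misclassified sample has its weight multiplied by $e^{\alpha_m} \approx 1.53$, while each correctly classified sample has its weight multiplied by $e^{-\alpha_m} \approx 0.65$; after dividing by $Z_m$, every misclassified sample carries more weight in round $m+1$ than it did in round $m$, and every correctly classified sample carries less, exactly as stated above.
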
3. AdaBoost algorithm explanation

AdaBoost can be understood as a binary classification learning method in which the model is an additive model, the loss function is the exponential loss, and the learning algorithm is the forward stagewise algorithm.

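To make these terms concrete, the additive model and the exponential loss have the standard forms (with $b(x; \gamma)$ a basis function with parameters $\gamma$, and $\beta$ its coefficient):

$$f(x) = \sum_{m=1}^{M} \beta_m\, b(x; \gamma_m), \qquad L(y, f(x)) = \exp(-y f(x))$$

In AdaBoost the basis functions are the base classifiers $G_m(x)$ and the coefficients $\beta_m$ are the $\alpha_m$ above.
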

The general idea of the forward stagewise algorithm: since the model to be learned is an additive model, we can work from front to back, learning only one basis function and its coefficient at each step, and gradually approach the optimization objective.

This reduces the problem of solving for all parameters from m = 1 to M simultaneously to solving for the parameters of a single basis function in each round.

The specific steps of the forward stagewise algorithm are:

(1) Initialize $f_0(x) = 0$;

(2) For m = 1, 2, ..., M, minimize the loss to obtain the parameters of the m-th basis function,

$$(\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \sum_{i=1}^{N} L\big(y_i,\; f_{m-1}(x_i) + \beta\, b(x_i; \gamma)\big),$$

and update $f_m(x) = f_{m-1}(x) + \beta_m b(x; \gamma_m)$;

(3) Obtain the additive model $f(x) = f_M(x) = \sum_{m=1}^{M} \beta_m b(x; \gamma_m)$.

Looking again at the form of the AdaBoost classifier, it can be seen that AdaBoost is a special case of the forward stagewise algorithm, in which the basis functions are the base classifiers and the loss function is the exponential loss; a sketch of the argument follows.

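Following the standard derivation (notation as above, with $\bar{w}_{mi} = \exp(-y_i f_{m-1}(x_i))$ the unnormalized sample weights), the forward stagewise step in round $m$ with exponential loss minimizes

$$\sum_{i=1}^{N} \bar{w}_{mi}\, e^{-\alpha y_i G(x_i)} = e^{-\alpha} \sum_{y_i = G(x_i)} \bar{w}_{mi} + e^{\alpha} \sum_{y_i \neq G(x_i)} \bar{w}_{mi}.$$

Setting the derivative with respect to $\alpha$ to zero gives

$$\alpha_m = \frac{1}{2} \log \frac{\sum_{y_i = G_m(x_i)} \bar{w}_{mi}}{\sum_{y_i \neq G_m(x_i)} \bar{w}_{mi}} = \frac{1}{2} \log \frac{1 - e_m}{e_m},$$

which is exactly the coefficient formula in step (2)(c) of Section 2; the weight update in step (2)(d) likewise follows from the definition of $\bar{w}_{m+1,i}$.
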

4. AdaBoost instance

After all this discussion the mathematics is still fairly abstract, so here is an example of using the AdaBoost algorithm for binary classification:

Adaboost algorithm principle analysis and example + code (concise and easy to understand)

The author of that post implements the code in MATLAB; I plan to see if I can write a Python version for reference tonight~ A rough sketch of what that might look like is below.
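A minimal Python sketch of the algorithm in Section 2, using one-dimensional decision stumps as the base classifiers; the stump search, function names, and toy data are my own illustration and are not taken from the linked post:

```python
import numpy as np

def stump_predict(X, dim, thresh, polarity):
    """Decision stump: predict +1/-1 by thresholding a single feature."""
    pred = np.ones(len(X))
    if polarity == 1:
        pred[X[:, dim] <= thresh] = -1.0
    else:
        pred[X[:, dim] > thresh] = -1.0
    return pred

def train_stump(X, y, w):
    """Pick the stump (dim, thresh, polarity) with the lowest weighted error."""
    best, best_err = None, np.inf
    for dim in range(X.shape[1]):
        for thresh in np.unique(X[:, dim]):
            for polarity in (1, -1):
                pred = stump_predict(X, dim, thresh, polarity)
                err = np.sum(w[pred != y])          # weighted error e_m
                if err < best_err:
                    best, best_err = (dim, thresh, polarity), err
    return best, best_err

def adaboost_train(X, y, M=10):
    """AdaBoost training loop; returns a list of (alpha_m, stump_m)."""
    N = len(y)
    w = np.full(N, 1.0 / N)                         # step (1): uniform weights
    classifiers = []
    for m in range(M):
        stump, e = train_stump(X, y, w)             # steps (2a)-(2b)
        e = max(e, 1e-10)                           # avoid division by zero
        alpha = 0.5 * np.log((1.0 - e) / e)         # step (2c)
        pred = stump_predict(X, *stump)
        w = w * np.exp(-alpha * y * pred)           # step (2d): reweight ...
        w = w / w.sum()                             # ... and normalize by Z_m
        classifiers.append((alpha, stump))
    return classifiers

def adaboost_predict(X, classifiers):
    """Final classifier: G(x) = sign(sum_m alpha_m * G_m(x))."""
    f = np.zeros(len(X))
    for alpha, stump in classifiers:
        f += alpha * stump_predict(X, *stump)
    return np.sign(f)

if __name__ == "__main__":
    # Toy 1-D data (illustrative only), labels in {-1, +1}
    X = np.arange(10, dtype=float).reshape(-1, 1)
    y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1], dtype=float)
    clfs = adaboost_train(X, y, M=5)
    print("training accuracy:", np.mean(adaboost_predict(X, clfs) == y))
```

The stump search here is a brute-force scan over features and thresholds, which is enough to illustrate the boosting loop; in practice any weak learner that accepts sample weights can be plugged in.
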


5. Summary

AdaBoost is one of the most popular algorithms among Boosting methods. It uses weak classifiers as base classifiers and weights the input data with a weight vector; in each round of iteration, the weight vector is updated based on the weighted error rate of the weak classifier before proceeding to the next iteration. In each iteration the coefficient of the weak classifier is also computed, and the size of this coefficient determines the importance of that weak classifier in the final prediction. The combination of these two ideas is the strength of the AdaBoost algorithm.

  Advantages: low generalization error, easy to implement, applicable to most classifiers, no parameters to tune

  Disadvantages: sensitive to outliers

Finally, here is a framework diagram of AdaBoost that I found online, to help reinforce the idea~


That's all~

2018.04.18


