Reading Notes - Li Hang "Statistical Learning Methods" CH08

Chapter 8 Boosting Methods

Boosting method: in classification problems, it learns multiple classifiers by changing the weights of the training samples, then linearly combines these classifiers to improve classification performance;

8.1 The Boosting Method AdaBoost

"Three cobblers with their wits combined equal Zhuge Liang" — i.e., several weak learners combined can match a strong one;

  • Strongly learnable & weakly learnable: a concept is strongly learnable if it can be learned with high accuracy, and weakly learnable if it can be learned only slightly better than random guessing; the two turn out to be equivalent (PDF p. 153);
  • AdaBoost algorithm: turns a "weak learning algorithm" into a "strong learning algorithm"; in each round it increases the weights of the samples misclassified by the previous round's weak classifier and decreases the weights of those classified correctly, and finally combines the weak classifiers by weighted majority voting (see PDF p. 155 for details);
    
    
  • Algorithm properties: AdaBoost keeps reducing the training error (the classification error rate on the training data set) during learning, and the training error drops at an exponential rate; AdaBoost adapts to the error rates of the individual weak classifiers — hence "Ada", for adaptive;
    
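The weight-update and weighted-voting mechanics above can be sketched in a few lines of NumPy. This is a minimal illustration with decision stumps as the weak classifiers; the function names and the brute-force stump search are my own, not the book's:

```python
import numpy as np

def adaboost_train(X, y, n_rounds=10):
    """AdaBoost with decision stumps as weak classifiers.
    y must take values in {-1, +1}. Returns [(alpha, feature, thr, pol), ...]."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                    # sample weights, initially uniform
    ensemble = []
    for _ in range(n_rounds):
        # exhaustively pick the stump with the smallest weighted error
        best, best_err = None, np.inf
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] <= thr, 1, -1)
                    err = w[pred != y].sum()
                    if err < best_err:
                        best_err, best = err, (j, thr, pol)
        j, thr, pol = best
        pred = pol * np.where(X[:, j] <= thr, 1, -1)
        alpha = 0.5 * np.log((1 - best_err) / max(best_err, 1e-12))
        # raise the weights of misclassified samples, lower the rest, renormalise
        w = w * np.exp(-alpha * y * pred)
        w = w / w.sum()
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def adaboost_predict(ensemble, X):
    """Weighted majority vote of the weak classifiers."""
    score = np.zeros(X.shape[0])
    for alpha, j, thr, pol in ensemble:
        score += alpha * pol * np.where(X[:, j] <= thr, 1, -1)
    return np.sign(score)
```

On the data of the book's Example 8.1 (x = 0..9, y = 1,1,1,-1,-1,-1,1,1,1,-1), three rounds select splits equivalent to the book's thresholds 2.5, 8.5, 5.5 and reach zero training error.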

8.3 Explanation of the AdaBoost Algorithm

  • Another interpretation of AdaBoost: it is a binary classification learning method in which the model is an additive model, the loss function is the exponential loss, and the learning algorithm is the forward stagewise algorithm;

  • Forward stagewise algorithm: solves the optimization problem of empirical risk minimization, i.e., loss-function minimization; it learns only one basis function and its coefficient at each step, working from front to back, so it gradually approaches the optimization objective;
    
  • Relationship with AdaBoost: AdaBoost is a special case of the forward stagewise additive algorithm, in which the model is an additive model of basic classifiers and the loss function is the exponential loss (see PDF p. 161 for the proof);
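In symbols, following the book's notation, the additive model and the forward stagewise step can be written as:

```latex
% Additive model: a linear combination of basis functions b(x; \gamma_m)
f(x) = \sum_{m=1}^{M} \beta_m \, b(x; \gamma_m)

% Forward stagewise: at step m, learn only one basis function and coefficient
(\beta_m, \gamma_m) = \arg\min_{\beta,\, \gamma} \sum_{i=1}^{N}
    L\bigl(y_i,\; f_{m-1}(x_i) + \beta \, b(x_i; \gamma)\bigr)

f_m(x) = f_{m-1}(x) + \beta_m \, b(x; \gamma_m)

% AdaBoost is the special case with basic classifiers as the b's and
% the exponential loss:
L(y, f(x)) = \exp\bigl(-y \, f(x)\bigr)
```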

8.4 Boosting trees

The boosting method that uses classification trees or regression trees as the basic classifier;
Boosting tree model: an additive model (i.e., a linear combination of basis functions) learned by the forward stagewise algorithm; the boosting method with decision trees as basis functions is called the boosting tree;

Boosting tree algorithm: the forward stagewise algorithm; for regression with squared loss, each new tree simply fits the residuals of the current model;

Gradient boosting algorithm: for a general loss function, use the value of the negative gradient of the loss at the current model as an approximation of the residual in the regression boosting tree algorithm, and fit a regression tree to it;
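A minimal sketch of this idea, using squared loss, where the negative gradient is exactly the residual, and depth-1 regression trees as the base learners (the function names, the shrinkage parameter `lr`, and the stump search are my own illustrative choices, not the book's pseudocode):

```python
import numpy as np

def fit_stump(X, r):
    """Depth-1 regression tree: choose (feature, threshold) minimising
    the squared error of the two leaf means fitted to targets r."""
    best, best_sse = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j])[:-1]:       # keep both sides non-empty
            left = X[:, j] <= thr
            cl, cr = r[left].mean(), r[~left].mean()
            sse = ((r[left] - cl) ** 2).sum() + ((r[~left] - cr) ** 2).sum()
            if sse < best_sse:
                best_sse, best = sse, (j, thr, cl, cr)
    return best

def predict_stump(stump, X):
    j, thr, cl, cr = stump
    return np.where(X[:, j] <= thr, cl, cr)

def gbrt_fit(X, y, n_trees=100, lr=0.1):
    """Gradient boosting for regression with squared loss L = (y - f)^2 / 2:
    the negative gradient -dL/df = y - f is exactly the residual, so each
    round fits a small regression tree to the current residuals."""
    f = np.full(len(y), y.mean())                 # f_0: best constant model
    trees = []
    for _ in range(n_trees):
        r = y - f                                 # negative gradient = residual
        stump = fit_stump(X, r)
        f = f + lr * predict_stump(stump, X)      # shrunken forward-stagewise step
        trees.append(stump)
    return y.mean(), trees

def gbrt_predict(f0, trees, X, lr=0.1):           # lr must match the one used in fit
    pred = np.full(X.shape[0], f0)
    for stump in trees:
        pred += lr * predict_stump(stump, X)
    return pred
```

On small data such as the book's Example 8.2 (x = 1..10 with the ten given y values), this loop drives the training error down round by round, mirroring how the book's boosting tree repeatedly fits residuals.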

          
In this chapter, I have not fully understood many of the algorithms; the basic concepts are grasped, and the details need further study later.
