(VII) Introduction to AdaBoost

Author: chen_h
WeChat & QQ: 862251340
WeChat public account: coderpai


(I) Getting Started with Ensemble Learning in Machine Learning

(II) The Bagging Method

(III) The Random Forest Algorithm with Python

(IV) Implementing and Interpreting Random Forests in Python

(V) How to Implement the Bagging Algorithm from Scratch in Python

(VI) How to Implement the Random Forest Algorithm from Scratch in Python

(VII) Introduction to AdaBoost


Boosting is an ensemble technique that attempts to create a strong classifier from a number of weak classifiers. In this article, we will introduce the AdaBoost method. After reading this article, you will know:

  • The basic principle of the boosting algorithm;
  • How the AdaBoost algorithm is used to boost decision trees;
  • How to make predictions with a learned AdaBoost model;
  • How best to prepare your data for use with the AdaBoost algorithm.

This article was written for developers and assumes no background in statistics or mathematics. It focuses on how the algorithm works and how to use it for predictive modeling. If you have any questions, feel free to leave a comment. So, here we go.

The Boosting Ensemble Method

Boosting is a general ensemble method that creates a strong classifier from a number of weak classifiers. It works by building a model from the training data, then building a second model that corrects the errors of the first. Models are added until the training set is predicted perfectly or until a maximum number of models has been added. AdaBoost was the first really successful boosting algorithm, and it remains the best starting point for understanding boosting. Many modern boosting methods build on AdaBoost, most notably stochastic gradient boosting.

Learning an AdaBoost Model from Data

AdaBoost is best suited for improving the performance of decision trees on a binary classification problem.

AdaBoost was originally called AdaBoost.M1 by its authors, Freund and Schapire. More recently it is often referred to as discrete AdaBoost, because it is used for classification rather than regression.

AdaBoost can be used to boost the performance of any machine learning algorithm, but it works best with weak learners: models that achieve accuracy just above random chance on a classification problem.

AdaBoost is most commonly used with one-level decision trees. Because these trees are so short and contain only a single classification decision, they are often called decision stumps.

Each instance in the training dataset is weighted, with the initial weight set to:

weight(xi) = 1/n

where xi is the i-th training instance and n is the number of training instances.
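
A minimal sketch of this initialization in Python (NumPy is an assumption here; the original post shows no code):

    import numpy as np

    n = 5                        # number of training instances
    weights = np.full(n, 1 / n)  # every instance starts with weight 1/n
    print(weights)               # [0.2 0.2 0.2 0.2 0.2]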

How to Train One Model

A weak classifier (a decision stump) is prepared on the training data using the weighted samples. Only binary classification problems are supported, so each decision stump makes one decision on one input variable and outputs +1.0 or -1.0 for the first or second class.
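
As a sketch of one such weighted stump, assuming scikit-learn's DecisionTreeClassifier (not part of the original post) with max_depth=1 and the sample_weight argument of fit:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # toy data: one input variable, labels in {-1, +1}
    X = np.array([[1.0], [2.0], [3.0], [4.0]])
    y = np.array([-1, -1, 1, 1])
    w = np.full(len(y), 1 / len(y))   # uniform initial weights

    # a one-level tree (a decision stump) fit on the weighted samples
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)
    print(stump.predict(X))           # [-1 -1  1  1]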

The misclassification rate is calculated for the trained model. Conventionally, it is calculated as:

error = (N - correct) / N

where error is the misclassification rate, correct is the number of training instances the model predicted correctly, and N is the total number of training instances. For example, if the model correctly predicted 78 of 100 training instances, the error or misclassification rate would be (100 - 78) / 100, or 0.22.
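
Computed directly in Python:

    correct, N = 78, 100
    error = (N - correct) / N
    print(error)  # 0.22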

For the weighted training instances, this is modified to:

error = sum(w(i) * terror(i)) / sum(w)

This is the weighted sum of the misclassification errors, where w(i) is the weight of training instance i and terror(i) is the prediction error for training instance i: 1 if it was misclassified and 0 if it was classified correctly.

For example, suppose we have three training instances with weights 0.01, 0.5, and 0.2, predicted values of -1, -1, and -1, and actual output values of -1, 1, and -1. The terrors are then 0, 1, and 0, and the misclassification rate is calculated as:

error = (0.01*0 + 0.5*1 + 0.2*0) / (0.01 + 0.5 + 0.2) = 0.704
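
The same weighted calculation as a short NumPy sketch (NumPy is assumed):

    import numpy as np

    w = np.array([0.01, 0.5, 0.2])  # instance weights
    terror = np.array([0, 1, 0])    # 1 = misclassified, 0 = correct

    error = np.sum(w * terror) / np.sum(w)
    print(round(error, 3))          # 0.704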

A stage value is then calculated for the trained model; it weights any predictions the model makes. The stage value is calculated as follows:

stage = ln((1-error) / error)

where stage is the stage value used to weight predictions from the model, ln() is the natural logarithm, and error is the misclassification rate of the model. The effect of the stage weighting is that more accurate models contribute more weight to the final prediction.
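
For example, continuing the sketch above with the misclassification rate of 0.704:

    import math

    error = 0.704
    stage = math.log((1 - error) / error)
    print(round(stage, 3))  # -0.866

Note that the stage value is negative here because the error is above 0.5: a weak classifier that does worse than random chance receives a negative weight in the final vote.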

The training weights are then updated, giving more weight to incorrectly predicted instances and less weight to correctly predicted instances.

For example, the weight (w) of a single training instance is updated using:

w = w * exp(stage * terror)

where w is the weight of a specific training instance, exp() is the exponential function (the numeric constant e raised to a power), stage is the stage value of the weak classifier, and terror is the error the weak classifier made when predicting the output variable for that instance, calculated as:

terror = 0 if(y == p), otherwise 1

where y is the output variable for the training instance and p is the prediction from the weak learner.

This leaves the weight unchanged if the training instance was classified correctly and increases it if the instance was misclassified.
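
A sketch of this update for the three-instance example above, with an assumed stage value of 0.5 used purely for illustration:

    import numpy as np

    w = np.array([0.01, 0.5, 0.2])  # current instance weights
    terror = np.array([0, 1, 0])    # per-instance error indicator
    stage = 0.5                     # assumed stage value (illustrative)

    # exp(0) = 1 leaves correctly classified weights unchanged;
    # only the misclassified instance's weight grows (0.5 -> ~0.824)
    w = w * np.exp(stage * terror)
    print(w)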

Making Predictions with AdaBoost

Predictions are made by calculating the weighted average of the weak classifiers.

For a new input instance, each weak learner calculates a predicted value of either +1.0 or -1.0. The predicted values are weighted by each weak learner's stage value. The prediction of the ensemble model is the sum of these weighted predictions: if the sum is positive, the first class is predicted; if it is negative, the second class is predicted.

For example, suppose five weak classifiers predict the values 1.0, 1.0, -1.0, 1.0, and -1.0. By majority vote, the model would seem to predict 1.0, the first class. But suppose the five weak classifiers have stage values of 0.2, 0.5, 0.8, 0.2, and 0.9 respectively. Calculating the weighted sum of the predictions gives -0.8, which is an ensemble prediction of -1.0, the second class.
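
The weighted vote from this example as a sketch:

    import numpy as np

    preds = np.array([1.0, 1.0, -1.0, 1.0, -1.0])  # weak classifier outputs
    stages = np.array([0.2, 0.5, 0.8, 0.2, 0.9])   # per-classifier stage values

    weighted_sum = np.sum(stages * preds)
    print(round(weighted_sum, 1))                  # -0.8
    print(1 if weighted_sum > 0 else -1)           # -1, the second class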

Summary

In this article, you discovered the boosting method for machine learning. You learned:

  • How boosting turns weak classifiers into a strong classifier;
  • That AdaBoost was the first successful boosting algorithm;
  • How an AdaBoost model is learned by weighting the training instances and the weak classifiers;
  • How an AdaBoost model makes predictions from the weighted predictions of its weak classifiers.

Origin blog.csdn.net/CoderPai/article/details/97147488