Random forest and gradient boosting tree

The boosting tree model:

The boosting method uses an additive model (that is, a linear combination of basis functions) together with a forward stagewise algorithm. Boosting with decision trees as the basis functions is called a boosting tree. For classification problems the decision tree is a binary classification tree. The boosting tree model can be written as an additive model of decision trees: $f_M(x) = \sum_{m=1}^{M} T(x; \Theta_m)$, where $T(x; \Theta_m)$ denotes a decision tree, $\Theta_m$ are the parameters of that tree, and $M$ is the number of trees.

The boosting tree algorithm uses the forward stagewise algorithm. First set the initial boosting tree $f_0(x) = 0$; the model at step $m$ is $f_m(x) = f_{m-1}(x) + T(x; \Theta_m)$, where $f_{m-1}(x)$ is the current model and the parameters $\Theta_m$ of the next decision tree are determined by empirical risk minimization.
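Spelled out, the empirical-risk-minimization step at stage $m$ is (this is the standard forward stagewise formulation; $L$ is whatever loss function the problem calls for and $N$ is the number of training samples):

$$\hat{\Theta}_m = \arg\min_{\Theta_m} \sum_{i=1}^{N} L\bigl(y_i,\; f_{m-1}(x_i) + T(x_i; \Theta_m)\bigr).$$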

The boosting tree for regression
Assuming the squared-error loss function $L(y, f(x)) = (y - f(x))^2$, the loss at step $m$ becomes

$$L\bigl(y, f_{m-1}(x) + T(x; \Theta_m)\bigr) = \bigl(y - f_{m-1}(x) - T(x; \Theta_m)\bigr)^2 = \bigl(r - T(x; \Theta_m)\bigr)^2,$$

where $r = y - f_{m-1}(x)$ is the residual of the current model. So for regression problems, the boosting tree algorithm only needs to fit each new tree to the residuals of the current model.
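A minimal sketch of this residual-fitting loop, assuming scikit-learn's `DecisionTreeRegressor` as the base learner (the class name `BoostingTreeRegressor` and all parameter values are illustrative, not from the original post):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class BoostingTreeRegressor:
    """Forward stagewise boosting tree for regression with squared-error loss."""

    def __init__(self, n_trees=100, max_depth=2):
        self.n_trees = n_trees
        self.max_depth = max_depth
        self.trees = []

    def fit(self, X, y):
        f = np.zeros(len(y))                  # f_0(x) = 0
        for _ in range(self.n_trees):
            residual = y - f                  # r = y - f_{m-1}(x)
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, residual)             # fit the next tree to the residuals
            f += tree.predict(X)              # f_m(x) = f_{m-1}(x) + T(x; Theta_m)
            self.trees.append(tree)
        return self

    def predict(self, X):
        # the final model is the sum of all the fitted trees
        return np.sum([tree.predict(X) for tree in self.trees], axis=0)
```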

But for a general loss function, the optimization at each step is often not so easy. For this problem, Friedman proposed the gradient boosting algorithm, an approximation based on steepest descent. The key idea is to use the value of the negative gradient of the loss function at the current model,

$$r_{mi} = -\left[\frac{\partial L\bigl(y_i, f(x_i)\bigr)}{\partial f(x_i)}\right]_{f(x) = f_{m-1}(x)},$$

as an approximation of the residuals in the regression boosting-tree algorithm, and to fit a regression tree to these values.
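As a sketch of how the fitting target changes with the loss (the helper function and the loss names below are illustrative, not from the post): with squared error the negative gradient is exactly the residual, while for absolute error it is the sign of the residual, so the boosting loop above only needs its fitting target swapped.

```python
import numpy as np

def negative_gradient(y, f, loss="squared"):
    """Value of -dL(y, f)/df at the current model f_{m-1}; the new tree is fit to this."""
    if loss == "squared":    # L = (y - f)^2 / 2  ->  negative gradient = y - f (the residual)
        return y - f
    if loss == "absolute":   # L = |y - f|        ->  negative gradient = sign(y - f)
        return np.sign(y - f)
    raise ValueError(f"unknown loss: {loss}")
```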

Random Forest vs. Gradient Boosting Tree
At the algorithm level, a random forest constructs its training sets by randomly sampling the data set (bootstrapping), on the view that this randomization benefits the model's generalization performance on the test set. The gradient boosting tree instead finds the optimal linear combination of all decision trees based on the training data.
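A minimal sketch of that randomization, assuming scikit-learn trees (the helper names `fit_random_forest` and `forest_predict` are illustrative): each tree sees a bootstrap sample of the rows, and `max_features` makes each split consider only a random subset of the features.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def fit_random_forest(X, y, n_trees=100, max_features="sqrt"):
    trees = []
    n = len(y)
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)                          # bootstrap sample of the rows
        tree = DecisionTreeRegressor(max_features=max_features)   # random feature subset at each split
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def forest_predict(trees, X):
    # average the predictions of the individual trees
    return np.mean([t.predict(X) for t in trees], axis=0)
```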

Random forests are easier to train than gradient boosting trees: essentially only one hyperparameter needs to be set, the number of features randomly considered at each node, and setting it to $\log_2$ of the total number of features or to the square root of the total number of features gives good results in most cases. Gradient boosting trees have more hyperparameters, including the number of trees, their depth, and the learning rate.
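For concreteness, a hedged example of those settings with scikit-learn (the specific numbers are arbitrary placeholders, not recommendations from the post):

```python
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

# Random forest: the main knob is the feature-subset size per split
rf = RandomForestRegressor(n_estimators=500, max_features="sqrt")   # or max_features="log2"

# Gradient boosting: number of trees, tree depth, and learning rate interact
gbt = GradientBoostingRegressor(n_estimators=500, max_depth=3, learning_rate=0.05)
```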

Random forests are harder to overfit than gradient boosting trees.

The gradient boosting tree is quite sensitive to noise. From the bias-variance perspective, if the data are noisy, the boosting algorithm may show higher model variance; in other cases, however, boosting can often achieve better results. Random forests do not build the ensemble on model residuals and can often achieve very low variance.
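A small experiment sketch for probing that claim on synthetic data (the dataset, noise level, and model settings are all arbitrary assumptions, and no particular outcome is implied):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=20, random_state=0)
# inject heavy label noise to compare how the two ensembles cope with it
y_noisy = y + np.random.default_rng(0).normal(scale=y.std(), size=len(y))

for name, model in [("random forest", RandomForestRegressor(n_estimators=300, random_state=0)),
                    ("gradient boosting", GradientBoostingRegressor(n_estimators=300, random_state=0))]:
    score = cross_val_score(model, X, y_noisy, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {score:.3f}")
```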
————————————————
Copyright notice: this is an original article by CSDN blogger "Marnez", released under the CC 4.0 BY-SA license; please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/ma412410029/article/details/84590204
