Here are a few good references:
- the original paper by the author (Tianqi Chen)
- an excellent blog post
- a Zhihu article
- a cnblogs (Blog Garden) post
The notes below are my compilation of these sources.
1. Algorithm Introduction
XGBoost is a gradient boosting model that builds an ensemble of decision trees. What makes it distinctive is that it uses a second-order Taylor expansion of the loss plus an explicit regularization term, improving on the ordinary gradient boosting decision tree (GBDT).
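To make the "second-order" part concrete, here is a minimal pure-Python sketch (toy data and function names are my own, not the real xgboost library API): for squared loss the gradient is g = ŷ − y and the hessian is h = 1, and XGBoost's closed-form optimal weight for a leaf is w* = −G/(H + λ), where G and H sum the gradients and hessians of the samples in that leaf.

```python
# Hedged sketch of XGBoost's second-order idea for a single leaf,
# assuming squared loss and a hypothetical toy dataset.

def grad_hess_squared_loss(y_true, y_pred):
    """For l(y, yhat) = (yhat - y)^2 / 2: g_i = yhat_i - y_i, h_i = 1."""
    g = [yp - yt for yt, yp in zip(y_true, y_pred)]
    h = [1.0] * len(y_true)
    return g, h

def optimal_leaf_weight(g, h, lam=1.0):
    """Closed-form leaf weight from the paper: w* = -G / (H + lambda)."""
    G, H = sum(g), sum(h)
    return -G / (H + lam)

y_true = [3.0, 5.0, 7.0]
y_pred = [4.0, 4.0, 4.0]  # predictions of the already-fixed model
g, h = grad_hess_squared_loss(y_true, y_pred)
w = optimal_leaf_weight(g, h, lam=1.0)
print(w)  # -> 0.75: the leaf shifts predictions toward the residuals
```

Note how the leaf weight depends only on the aggregated statistics G and H, which is why any twice-differentiable loss can be plugged in.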
(quoted from the second reference)
The objective function of the model is the sum of the loss over all samples plus a regularization term; the smaller the objective value, the better the model performs.
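Written out, the objective described above is (following the notation of the XGBoost paper):

```latex
\mathrm{Obj}(\theta) = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^{2}
```

where l is the per-sample loss, f_k is the k-th tree, T is the number of leaves in a tree, w_j are the leaf weights, and γ, λ are the regularization coefficients.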
The reason part of the objective is a constant is that each new tree is added on top of the original model: the previous trees are already fixed, so their loss is a constant within the current iteration.
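Concretely, at iteration t the objective expands by a second-order Taylor approximation around the fixed prediction of the previous t−1 trees:

```latex
\mathrm{Obj}^{(t)}
= \sum_{i=1}^{n} l\!\left(y_i,\; \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)
\;\approx\;
\sum_{i=1}^{n} \left[\, g_i f_t(x_i) + \tfrac{1}{2}\, h_i f_t^{2}(x_i) \,\right] + \Omega(f_t) + \mathrm{const}
```

where g_i and h_i are the first and second derivatives of l with respect to the previous prediction ŷ_i^{(t−1)}, and the constant collects the loss of the already-determined model, which does not affect the choice of f_t.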
The formula in Tianqi Chen's paper looks like this:
One key transformation here: the objective is originally a sum over individual samples, but every sample that falls into the same leaf receives the same predicted value, namely that leaf's weight. The sum over samples can therefore be regrouped as a sum over leaves, where each leaf's term aggregates the gradient statistics of all the samples assigned to it.
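In symbols, with q(x) mapping a sample to its leaf and I_j = {i : q(x_i) = j} the set of samples in leaf j, the regrouping is:

```latex
\sum_{i=1}^{n} \left[\, g_i w_{q(x_i)} + \tfrac{1}{2}\, h_i w_{q(x_i)}^{2} \,\right]
+ \gamma T + \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^{2}
= \sum_{j=1}^{T} \left[\, G_j w_j + \tfrac{1}{2}\left(H_j + \lambda\right) w_j^{2} \,\right] + \gamma T
```

where G_j = Σ_{i∈I_j} g_i and H_j = Σ_{i∈I_j} h_i. Each leaf's term is a simple quadratic in w_j, so minimizing it gives the optimal leaf weight w_j* = −G_j / (H_j + λ).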