CART Decision Tree Essentials

https://www.jianshu.com/p/fb97b21aeb1d

What is a decision tree

Answer: In essence, a decision tree induces a set of classification rules from the training data, chosen so that the rules fit the training data well while still generalizing well.
Equivalently, it can be viewed as estimating a conditional probability model from the training data.

Regression tree building rules and loss function

  For continuous values, the CART classification tree uses the Gini index to score each candidate split point of a feature. That suits classification, but for regression the usual criterion is the squared error (variance) instead. For any splitting feature A and any split point s that divides the data into two subsets D1 and D2, the CART regression tree looks for the feature and split point that minimize the squared error within D1 and within D2 and, at the same time, minimize the sum of the two. The expression is:

$$\min_{A,s}\left[\min_{c_1}\sum_{x_i \in D_1(A,s)}(y_i - c_1)^2 + \min_{c_2}\sum_{x_i \in D_2(A,s)}(y_i - c_2)^2\right]$$

  where c_1 and c_2 are the output values of D1 and D2; the inner minima are attained at the sample means of the two subsets.
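  The split search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of any particular library: the names `sse` and `best_split` are hypothetical, candidate split points are taken as midpoints between consecutive distinct feature values, and the toy data is made up.

```python
import numpy as np


def sse(y):
    """Sum of squared deviations from the mean (0.0 for an empty array)."""
    if y.size == 0:
        return 0.0
    return float(np.sum((y - y.mean()) ** 2))


def best_split(X, y):
    """Return (feature index A, split point s, total squared error).

    Assumption of this sketch: candidate split points are the midpoints
    between consecutive distinct values of each feature.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])  # sorted distinct values
        for s in (values[:-1] + values[1:]) / 2.0:
            left = X[:, j] <= s      # D1: samples with feature j <= s
            total = sse(y[left]) + sse(y[~left])  # error of D1 plus D2
            if total < best[2]:
                best = (j, float(s), total)
    return best


# Made-up toy data: one feature, two clearly separated groups
X = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
feature, threshold, error = best_split(X, y)
print(feature, threshold, error)
```

  Using the subset mean inside `sse` is what makes each inner minimization over c_1 and c_2 trivial: the mean is the constant that minimizes the squared error of a subset.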

  Suppose the input space has been divided into M units R_1, R_2, ..., R_M, and each unit R_m has a fixed output value c_m, so the regression tree model can be expressed as

$$f(x) = \sum_{m=1}^{M} c_m I(x \in R_m)$$

  When the partition of the input space is fixed, the squared error

$$\sum_{x_i \in R_m} (y_i - f(x_i))^2$$

  can be used to represent the prediction error of the regression tree on the training data, and the criterion of minimizing the squared error gives the optimal output value on each unit: the optimal c_m is the mean of the outputs y_i of all training samples x_i that fall in R_m.
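  As a small check of this criterion, the sketch below fixes a partition, sets each c_m to the mean of the targets in its unit, and evaluates the piecewise-constant model. The two-unit partition at 6.5, the data, and the names `predict` and `squared_error` are all assumptions made up for illustration.

```python
import numpy as np

# Hypothetical one-dimensional partition: R_1 = (-inf, 6.5], R_2 = (6.5, +inf)
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])
region = (x > 6.5).astype(int)  # 0 for R_1, 1 for R_2

# Output value of each unit: the mean of y_i over the x_i in that unit
c = np.array([y[region == m].mean() for m in range(2)])


def predict(x_new):
    """f(x) = sum_m c_m * I(x in R_m) for the partition above."""
    return c[(np.asarray(x_new) > 6.5).astype(int)]


def squared_error(values):
    """Total squared error on the training data for given unit outputs."""
    return float(np.sum((y - values[region]) ** 2))


training_error = squared_error(c)
# Moving the outputs away from the unit means can only increase the error
perturbed_error = squared_error(c + np.array([0.1, -0.1]))
print(c, training_error, perturbed_error)
```

  Any other choice of constants, such as the perturbed values in the last lines, gives a larger squared error on the training data, which is why the unit mean is used as the output value.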

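  The loss function of a decision tree is usually a regularized maximum likelihood function: a term measuring how well the tree fits the training data plus a penalty on the complexity of the tree.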
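  One common textbook way to write such a loss for a classification tree is given below. The notation is an assumption that does not appear in the text above: |T| is the number of leaf nodes of tree T, N_t is the number of training samples in leaf t, N_{tk} is the number of those samples belonging to class k, and α ≥ 0 is the regularization weight.

```latex
C_\alpha(T) = \sum_{t=1}^{|T|} N_t H_t(T) + \alpha |T|,
\qquad
H_t(T) = -\sum_{k} \frac{N_{tk}}{N_t} \log \frac{N_{tk}}{N_t}
```

  The first term equals the negative log-likelihood of the training labels under the leaf class frequencies, and α|T| penalizes larger trees, so minimizing C_α(T) trades fit against tree size.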

An explanation of the classification tree loss function that makes sense:

https://blog.csdn.net/wjc1182511338/article/details/76793598
