Decision Trees (Machine Learning)

Decision Tree Algorithms

CART (Classification And Regression Tree)
ID3
C4.5
CHAID (CHi-squared Automatic Interaction Detector): performs multi-level splits when building the tree [7]
MARS: extends decision trees to handle numeric data better
A decision tree is a predictive model that maps a sample's attributes to a predicted label. Such trees are also known as classification trees or regression trees.

Each sample is expressed as follows:

$(\mathbf{x}, Y) = (x_1, x_2, x_3, \ldots, x_k, Y)$

where $Y$ is the target value and the vector $\mathbf{x}$ consists of the attributes $x_1, x_2, x_3, \ldots$ from which the target value is predicted.
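As a minimal illustration of this representation (the attribute names and values below are made-up, watermelon-style examples, not data from the post), a labeled sample in Python might look like:

```python
# One labeled sample (x, Y): the attribute vector x and its target value Y.
x = ("clear", "curled", "dull")  # x1 = texture, x2 = root, x3 = knock sound
Y = "good"                       # target value: the class label

dataset = [(x, Y)]  # a data set is a list of such (x, Y) pairs
```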

 

[Figure 1: image from Zhou Zhihua, Machine Learning, Chapter 4]

Decision trees are built by splitting on information entropy. The splitting rule in the book is: evaluate every candidate attribute and select the one whose split yields the greatest information gain. The information gain can be written as:

$D - \sum_{i=1}^{j} \frac{c_i}{c} D_i$

 

Here, smaller entropy is better and larger information gain is better. From the information gain equation, $D$ (the entropy before the split) is the same for every candidate attribute, so the smaller $\sum_{i=1}^{j} \frac{c_i}{c} D_i$ is, the better the result. So I took a shortcut: minimizing $\sum_{i=1}^{j} \frac{c_i}{c} D_i$ achieves the same effect as maximizing the information gain. Here $D_i$ is the information entropy of the $i$-th branch produced by splitting on the current attribute, $c_i$ is the number of samples in that branch, and $c$ is the total number of samples. For example, to evaluate splitting the watermelon data set on the "texture" attribute, which has three values (clear, slightly blurry, and blurry), the information gain is the entropy before the split minus the weighted average of the entropies of the three resulting subsets.
The entropy itself is computed per branch after the split, from the proportion $p_i$ of each class (e.g. positive and negative samples) among that branch's samples:

$-\sum_{i=1}^{j} p_i \log_2 p_i$
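To make both formulas concrete, here is a minimal Python sketch (the function names and the toy "texture" data are my own illustration, not code from the post): `entropy` implements $-\sum_i p_i \log_2 p_i$, and `weighted_branch_entropy` implements the term $\sum_{i=1}^{j} \frac{c_i}{c} D_i$ that is minimized to maximize the gain.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy -sum(p_i * log2(p_i)) over the class proportions p_i."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def weighted_branch_entropy(values, labels):
    """The term sum_i (c_i / c) * D_i: the weighted entropies of the
    branches produced by splitting on one attribute."""
    total = len(labels)
    branches = {}
    for v, y in zip(values, labels):
        branches.setdefault(v, []).append(y)
    return sum(len(ys) / total * entropy(ys) for ys in branches.values())

# Toy watermelon-style example (made-up samples): split on "texture".
texture = ["clear", "clear", "slightly blurry", "blurry", "clear", "blurry"]
label   = ["good",  "good",  "bad",             "bad",    "good",  "bad"]

gain = entropy(label) - weighted_branch_entropy(texture, label)
print(round(gain, 3))  # information gain of splitting on texture
```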

 

Algorithm core process:

S1] if all samples at the current node belong to the same class, stop and return a leaf
S2] if all samples at the current node take the same values on all remaining attributes, stop and return a majority-class leaf
S3] otherwise, find the optimal partition attribute (the one with the greatest information gain)
S4] split on that attribute and recursively create a subtree for each branch (see the sketch below)
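Steps S1]–S4] map directly onto a short recursive function. The following is a self-contained sketch (the `build_tree` function, the majority-vote fallback in S2], and the dict-based tree layout are my own assumptions, not code from the post):

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """rows: list of attribute tuples; labels: class label per row;
    attrs: indices of attributes still available for splitting."""
    # S1] all samples belong to the same class -> return a leaf
    if len(set(labels)) == 1:
        return labels[0]
    # S2] no attributes left, or all rows identical on them -> majority leaf
    if not attrs or all(all(r[a] == rows[0][a] for a in attrs) for r in rows):
        return Counter(labels).most_common(1)[0][0]
    # S3] pick the attribute with the largest information gain
    def gain(a):
        total = len(labels)
        branches = {}
        for r, y in zip(rows, labels):
            branches.setdefault(r[a], []).append(y)
        return entropy(labels) - sum(
            len(ys) / total * entropy(ys) for ys in branches.values()
        )
    best = max(attrs, key=gain)
    # S4] split on the best attribute and recurse on each branch
    tree = {"attr": best, "branches": {}}
    for v in set(r[best] for r in rows):
        sub = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        tree["branches"][v] = build_tree(
            [r for r, _ in sub],
            [y for _, y in sub],
            [a for a in attrs if a != best],
        )
    return tree

# Usage with toy data: attribute 0 is "texture", attribute 1 is "root".
tree = build_tree(
    [("clear", "curled"), ("blurry", "stiff"), ("clear", "stiff")],
    ["good", "bad", "good"],
    attrs=[0, 1],
)
print(tree)  # e.g. {'attr': 0, 'branches': {'clear': 'good', 'blurry': 'bad'}}
```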


References:

  Decision Tree

  Machine Learning Assignment 4: Decision Trees and Pruning

  
