Summary of all notes: "Machine Learning" (Watermelon Book) reading notes
You may want to read this first: Reading Notes on Statistical Learning Methods (5) - Decision Trees
1. Basic process
A decision tree is a common type of machine learning model.
2. Split selection
Splits are chosen by information gain, gain ratio, or the Gini index; the specific procedures are described in detail in the article linked above.
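As a minimal sketch of these three criteria (function names are my own, not from the book):

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a label sequence, in bits
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, split_groups):
    # Gain = entropy before the split - weighted entropy of the child groups
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in split_groups)

def gini(labels):
    # Gini index: probability that two randomly drawn samples differ in label
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())
```

The gain ratio (C4.5) divides the information gain by the split's intrinsic value to penalize attributes with many distinct values.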
3. Pruning
Pruning (both pre-pruning and post-pruning) is covered in detail in the article linked above.
4. Continuous and missing values
Continuous values: continuous attributes are handled by bi-partition, i.e. choosing a threshold that splits the value range in two; this is the mechanism adopted by the C4.5 decision tree algorithm.
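Bi-partition can be sketched as follows: the candidate thresholds are the midpoints between consecutive distinct sorted values, and the one maximizing information gain is chosen (helper names are my own):

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a label sequence, in bits
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def candidate_thresholds(values):
    # C4.5 considers the midpoints between consecutive distinct sorted values
    vs = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(vs, vs[1:])]

def best_bipartition(values, labels):
    # Return (threshold, gain) maximizing information gain of the split v <= t
    base = entropy(labels)
    best_t, best_gain = None, -1.0
    for t in candidate_thresholds(values):
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        gain = (base - len(left) / len(labels) * entropy(left)
                     - len(right) / len(labels) * entropy(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain
```

Unlike a discrete attribute, a continuous attribute may be split again with a different threshold deeper in the tree.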
Missing values: the information gain is computed on the samples whose value is observed, and a sample with a missing value is sent down every branch, with its weight scaled by the fraction of samples going to each branch; that is, the same sample enters different child nodes with different probabilities.
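A minimal sketch of this weighted redistribution, assuming each sample is a `(features, label, weight)` triple (the representation and function name are my own, not from the book):

```python
def split_with_missing(samples, attr):
    # Each sample is (features_dict, label, weight). A sample missing `attr`
    # is sent down every branch, its weight scaled by that branch's share
    # of the total weight of samples with an observed value.
    observed = [s for s in samples if attr in s[0]]
    missing = [s for s in samples if attr not in s[0]]
    total = sum(w for _, _, w in observed)
    branches = {}
    for f, y, w in observed:
        branches.setdefault(f[attr], []).append((f, y, w))
    for group in branches.values():
        share = sum(w for _, _, w in group) / total  # branch's weight share
        for f, y, w in missing:
            group.append((f, y, w * share))
    return branches
```

For example, if two observed samples take value "green" and one takes "red", a sample with a missing value enters the "green" branch with weight 2/3 and the "red" branch with weight 1/3.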
5. Multivariate decision trees
Instead of finding a single optimal split attribute for each non-leaf node, a multivariate decision tree tests a linear combination of attributes at each non-leaf node, so its decision boundaries can be oblique hyperplanes rather than axis-parallel segments.
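Such a node test can be sketched as follows (a hypothetical helper; in practice the weight vector would be learned, e.g. by a linear classifier, rather than fixed by hand):

```python
def oblique_node(weights, threshold):
    # Node test on a linear combination of attributes: sum(w_i * x_i) <= t.
    # The resulting boundary is an oblique hyperplane, not axis-parallel.
    def test(x):
        return sum(w * xi for w, xi in zip(weights, x)) <= threshold
    return test
```

For example, `oblique_node([1.0, -1.0], 0.0)` sends a sample left whenever its first attribute is no larger than its second, a boundary no single-attribute split can express.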
Next chapter: Watermelon Book reading notes (5) - Neural Networks