[Sklearn] Data classification and prediction with the decision tree algorithm (Excel data can be swapped in directly)
1. Model Principle
A decision tree is a tree-structured model for classification and regression. It assigns data to categories or predicted values through a series of decision rules. The model principle and the mathematical model of the decision tree are described below:
1.1 Model principle
The basic idea of a decision tree is to start from the root node and, through a series of internal nodes and branches, split the data set into subsets according to the values of different features. Splitting continues until the leaf nodes are reached, and each leaf node is then assigned a category (for classification) or a predicted value (for regression). Building a decision tree therefore comes down to two questions: which feature to select at each node, and how to divide the data set on that feature.
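As a rough illustration of the root-to-leaf idea, the sketch below fits a shallow tree with scikit-learn and prints its rules. The Iris data set, `max_depth=2`, and `random_state=0` are assumptions chosen only to keep the output short; they are not the data or settings used in this article.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Iris is a stand-in data set for illustration only (an assumption,
# not the Excel data this article works with).
iris = load_iris()
X, y = iris.data, iris.target

# A shallow tree keeps the printed rules short enough to read.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

# Each "feature <= threshold" line is a decision node;
# lines containing "class:" are leaf nodes.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)

# Predicting a sample means following one path from the root to a leaf.
sample = X[:1]
leaf_id = clf.apply(sample)[0]      # index of the leaf the sample lands in
predicted = clf.predict(sample)[0]  # class assigned to that leaf
print(leaf_id, predicted)
```

Reading the printed rules from top to bottom follows the same path the model takes: each test sends the sample down one branch until a leaf with a class label is reached.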
The main steps in building a decision tree:
- Feature selection: From all available features, select the best one as the splitting feature for the current node. The choice is usually made with a metric such as information gain or Gini impurity, which measures how well a split on each feature separates the classes.
- Divide the dataset: according to the selected features &