Based on bagging;
Multiple CART decision trees are built independently, forming a forest; the final output is decided jointly by all trees (e.g., by majority vote);
Two sources of randomness (see the sketch below):
1) Random sampling of the input data: a portion of the data is drawn from the full dataset with replacement (bootstrap sampling);
2) Random sampling of features: each decision tree is built from a random subset of all features (m features are selected from the M total, and the best of those m is chosen as the split at each node);
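To make the two sources of randomness concrete, here is a minimal hand-built sketch on top of scikit-learn's DecisionTreeClassifier. The toy dataset, the tree count, and the choice m = sqrt(M) are illustrative assumptions; feature subsets are drawn per split (via max_features), which is one common variant of the scheme described above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # toy data: 200 samples, M = 10 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)

n_samples, M = X.shape
m = int(np.sqrt(M))                       # number of features tried at each split
trees = []
for _ in range(25):
    # randomness 1: bootstrap -- draw rows with replacement from the full data
    rows = rng.integers(0, n_samples, size=n_samples)
    # randomness 2: each split considers only m of the M features
    tree = DecisionTreeClassifier(max_features=m)
    tree.fit(X[rows], y[rows])
    trees.append(tree)

# joint decision: majority vote across the forest
votes = np.stack([t.predict(X) for t in trees])
majority = (votes.mean(axis=0) > 0.5).astype(int)
```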
Advantages:
1) Not prone to overfitting; strong resistance to noise;
2) Highly parallelizable, so training is fast;
3) Gives an unbiased estimate of the generalization error (via out-of-bag samples);
4) Insensitive to the loss of some features;
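The unbiased estimate in point 3 comes from the out-of-bag (OOB) samples: each bootstrap leaves out roughly a third of the rows, and those held-out rows can score the trees that never saw them. A short sketch using scikit-learn's oob_score option; the synthetic dataset is illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
# oob_score=True asks the forest to evaluate itself on out-of-bag samples
rf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
rf.fit(X, y)
print("OOB accuracy estimate:", rf.oob_score_)
```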
Random forest parameter tuning (a tuning sketch follows the list below):
1. Base algorithm type: ID3, C4.5, or CART;
2. Number of trees (n_estimators): typical range (0, 100]; more sub-trees improve model performance but reduce speed;
3. Number of random features (max_features): common choices are log(N), N/3, sqrt(N), or N; increasing the number of random features improves model performance but reduces the diversity of the individual trees and reduces speed;
4. Maximum depth of the tree: range $[1,\infty)$; -1 indicates a fully grown tree;
5. Minimum number of records in a leaf node (min_samples_leaf): the minimum number of samples a leaf node may contain; at least 2, usually around 50. Smaller leaves make it easier for the model to capture noise in the training data: the training data is fit better, but the model becomes more complex;
6. Minimum percentage of records in a leaf node: the minimum number of samples in a leaf node expressed as a proportion of those in its parent node;
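Putting the parameters above together, here is a hedged tuning sketch using scikit-learn. Note the scikit-learn spellings (n_estimators, max_features, max_depth, min_samples_leaf) and that scikit-learn uses max_depth=None, not -1, for a fully grown tree; the grid values and dataset are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=12, random_state=0)

param_grid = {
    "n_estimators": [20, 50, 100],           # more trees: better, but slower
    "max_features": ["sqrt", "log2", None],  # features tried at each split
    "max_depth": [5, 10, None],              # None = fully grown tree
    "min_samples_leaf": [2, 10, 50],         # larger leaves = simpler model
}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```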