"Machine Learning" (watermelon book) Summary

Chapter 1

(To be supplemented)

 

Chapter 2: Model Evaluation and Selection

1. Empirical Error and Overfitting

The error of a learner on the training set is called the training error or empirical error; the error on new, unseen samples is called the generalization error. Since the generalization error cannot be measured directly, the empirical error is used to estimate it.

The empirical error should be kept at an appropriate level rather than driven as small as possible: if it is too small, the model easily overfits; if it is too large, the model underfits. The sketch below illustrates the contrast.
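A minimal sketch of this idea, assuming NumPy is available; the noisy sine data and the polynomial models are illustrative, not from the book:

```python
# Fit polynomials of increasing degree to noisy data to contrast the empirical
# (training) error with an estimate of the generalization error on held-out points.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = np.sin(np.pi * x) + rng.normal(0, 0.2, 60)   # noisy target
x_train, y_train = x[:40], y[:40]                # training split
x_test,  y_test  = x[40:], y[40:]                # held-out split

for degree in (1, 3, 9, 12):
    coeffs = np.polyfit(x_train, y_train, degree)        # fit on training data
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse  = np.mean((np.polyval(coeffs, x_test)  - y_test) ** 2)
    # The empirical error keeps shrinking as the degree grows, while the
    # held-out error eventually rises again once the model starts to overfit.
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```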

2. Model Evaluation Methods

A test set, kept mutually exclusive with the training set as far as possible, is used to estimate the learner's generalization error. The main methods for splitting a data set into a training set and a test set are:

Hold-out: split the data set directly into two mutually exclusive subsets, one used as the training set and the other as the test set, with stratified sampling.

  • Repeat the random split several times and average the results; as a compromise between training-set and test-set size, roughly 2/3 to 4/5 of the data is used for training (a sketch follows below).
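A minimal sketch of the hold-out method, assuming scikit-learn is available; the iris data set, the decision-tree classifier, and the 1/3 test fraction are illustrative choices:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

scores = []
for seed in range(10):                       # repeat the random split and average
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=1 / 3, stratify=y,   # stratified: class ratios preserved
        random_state=seed)
    clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    scores.append(clf.score(X_te, y_te))

print("mean hold-out accuracy over 10 splits:", np.mean(scores))
```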

Cross-validation: first partition the data set into k mutually disjoint subsets of equal size (again using stratified sampling), then in turn take the union of every k-1 subsets as the training set and the remaining subset as the test set; this yields k rounds of training and testing, and the k test results are finally averaged.

  • Special case: leave-one-out (LOO), where k equals the number of samples; the evaluation is accurate, but the overhead is too large (a sketch follows below).
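A minimal sketch of k-fold cross-validation and leave-one-out, assuming scikit-learn; the iris data set, k = 10, and the logistic-regression learner are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)

# Stratified 10-fold: 10 disjoint, equally sized, class-balanced subsets.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
print("10-fold CV mean accuracy:", cross_val_score(clf, X, y, cv=cv).mean())

# Leave-one-out: k equals the number of samples, so the model is fit 150 times.
print("LOO mean accuracy:", cross_val_score(clf, X, y, cv=LeaveOneOut()).mean())
```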

Bootstrap: based on bootstrap sampling. Each time, one sample is picked at random from the data set D and a copy of it is placed into the data set D'; after repeating this m times, the resulting D' containing m samples is used as the training set, and D \ D' is used as the test set.

  • Bagging and Random Forest are based on this (a sketch follows below).
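A minimal sketch of bootstrap sampling with NumPy; the data set D here is just m integer indices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000
D = np.arange(m)                         # indices of the m samples in D

boot_idx = rng.integers(0, m, size=m)    # draw m samples with replacement -> D'
D_prime = D[boot_idx]                    # training set D' (may contain duplicates)
oob_mask = ~np.isin(D, boot_idx)         # samples never drawn: D \ D'
D_test = D[oob_mask]                     # out-of-bag test set

# About 1/e ≈ 36.8% of the samples are expected to stay out of D'.
print("out-of-bag fraction:", len(D_test) / m)
```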

3. Performance Metrics

Error rate and accuracy

Precision and recall: PR curve.

ROC and AUC: the ROC curve plots the true positive rate against the false positive rate; AUC is the Area Under the ROC Curve.
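A minimal sketch of these metrics, assuming scikit-learn; the synthetic binary data set and the logistic-regression scorer are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_recall_curve,
                             precision_score, recall_score, roc_auc_score, roc_curve)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)                # hard labels
y_score = clf.predict_proba(X_te)[:, 1]   # scores for ranking-based metrics

print("accuracy :", accuracy_score(y_te, y_pred))   # error rate = 1 - accuracy
print("precision:", precision_score(y_te, y_pred))
print("recall   :", recall_score(y_te, y_pred))
print("AUC      :", roc_auc_score(y_te, y_score))

precision, recall, _ = precision_recall_curve(y_te, y_score)  # points on the P-R curve
fpr, tpr, _ = roc_curve(y_te, y_score)                        # points on the ROC curve
```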

 

Chapter 8: Ensemble Learning

Ensemble learning accomplishes a learning task by building and combining multiple learners. Related concepts: homogeneous / heterogeneous ensembles, base learner, weak / strong learner, individual learner / component learner / ensemble.

The basic idea of ensembles: by combining multiple learners, obtain generalization performance significantly better than that of any single learner. Requirements on the individual learners: they should be accurate and diverse.

Several representative ensemble methods:

1. Boosting: the individual learners are strongly interdependent and are generated serially.

  • First train a base learner from the initial training set, then adjust the distribution of the training samples according to that learner's performance so that the samples it misclassified receive more attention in later training; train the next base learner on the adjusted sample distribution, and repeat until the number of base learners reaches the specified value T. Finally, the T base learners are combined with weights (a sketch follows below).
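A minimal sketch of Boosting using AdaBoost, its representative algorithm, assuming scikit-learn; the synthetic data set and T = 50 rounds are illustrative choices:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Each round reweights the training samples so that previously misclassified
# ones get more attention; the default base learner is a depth-1 decision tree,
# and the T weighted base learners are combined for the final prediction.
boost = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
print("AdaBoost test accuracy:", boost.score(X_te, y_te))
```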

2. Bagging: the individual learners have weak interdependence and are generated in parallel.

  • Use the bootstrap method to generate T sample sets, each containing m samples. Train one base learner on each sample set, and then combine the T base learners.
  • When combining the outputs of the base learners, Bagging usually uses simple voting for classification tasks and simple averaging for regression tasks (a sketch follows below).
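A minimal from-scratch sketch of Bagging for classification, assuming NumPy and scikit-learn decision trees; T = 25 and the synthetic data set are illustrative choices:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rng = np.random.default_rng(0)
T, m = 25, len(X_tr)
learners = []
for _ in range(T):
    idx = rng.integers(0, m, size=m)     # bootstrap sample of size m
    tree = DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
    learners.append(tree)

# Simple voting: each base learner predicts, and the majority class wins.
votes = np.stack([t.predict(X_te) for t in learners])     # shape (T, n_test)
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
print("Bagging test accuracy:", np.mean(majority == y_te))
```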

3. Random Forest: differs from Bagging in that RF uses decision trees as base learners and introduces random attribute (feature) selection when building each tree. Because diversity comes from both sample perturbation and attribute perturbation, RF often performs better than plain Bagging; a comparison sketch follows below.
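A minimal sketch contrasting Bagging with Random Forest, assuming scikit-learn; the synthetic data set and T = 100 trees are illustrative. The key difference is that RF also samples a random subset of features at each split (max_features):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Bagging over full-feature decision trees (the default base learner):
# diversity comes from sample perturbation only.
bag = BaggingClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Random Forest: diversity from sample perturbation AND random feature choice.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            random_state=0).fit(X_tr, y_tr)

print("Bagging test accuracy:", bag.score(X_te, y_te))
print("RF      test accuracy:", rf.score(X_te, y_te))
```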

4. Combination Strategies

Averaging: simple or weighted averaging, for numerical outputs.

Voting: majority, plurality or weighted voting, for class labels.

Learning method: train another learner to perform the combination; the typical representative is Stacking.
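A minimal sketch of the averaging and voting strategies with NumPy; the toy predictions from three hypothetical base learners are made up for illustration:

```python
import numpy as np

# Numerical outputs (e.g. regression values or class probabilities): averaging.
h_outputs = np.array([[0.70, 0.60, 0.20],     # base learner 1
                      [0.80, 0.40, 0.10],     # base learner 2
                      [0.60, 0.55, 0.30]])    # base learner 3
simple_avg = h_outputs.mean(axis=0)
weights = np.array([0.5, 0.3, 0.2])           # illustrative learner weights
weighted_avg = weights @ h_outputs
print("simple average  :", simple_avg)
print("weighted average:", weighted_avg)

# Class-label outputs: majority voting per test sample.
h_labels = np.array([[1, 0, 2],
                     [1, 2, 2],
                     [0, 0, 2]])
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, h_labels)
print("majority vote   :", majority)

# The learning strategy (Stacking) would instead train a second-level learner
# on the base learners' outputs; it is omitted here to keep the sketch short.
```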

 
