Notes on Zhou Zhihua's Machine Learning, Chapter 2: Model Evaluation

2.1 Empirical error and overfitting

What we really want is a learner that performs well on new samples. To achieve this, the learner should, as far as possible, learn from the training samples the "general rules" that apply to all potential samples, so that it can make correct judgments when faced with new samples.

Factors leading to overfitting: the learner's capacity is too strong, so it learns even the non-general peculiarities contained in the training samples.

Factors leading to underfitting: the learner's capacity is too weak.


2.2 Evaluation methods

The test set and the training set should be as mutually exclusive as possible; that is, test samples should, as far as possible, not appear in the training set.

2.2.1 Hold-out method

This method is relatively simple and commonly used. The following splits the data into training and test sets with the sklearn package:

from sklearn.model_selection import train_test_split

# Hold out 20% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
2.2.2 Cross-validation

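The figure here illustrated k-fold cross-validation: D is partitioned into k mutually exclusive subsets of similar size; each round trains on k-1 subsets and tests on the remaining one, and the k test results are averaged (k = 10 is common). A minimal sklearn sketch, reusing the placeholder arrays X, y from above with a placeholder estimator:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# 10-fold cross-validation: train on 9 folds, evaluate on the held-out fold, 10 rounds
scores = cross_val_score(LogisticRegression(), X, y, cv=10)
print(scores.mean())  # average score over the 10 folds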

2.2.3 Bootstrapping

Given a dataset D containing m samples, draw from it m times using random sampling with replacement to obtain a new dataset D′. The probability that a particular sample is never drawn in all m draws is $(1-\frac{1}{m})^m$, which in the limit equals $\frac{1}{e} \approx 0.368$.

D′ can then be used as the training set, and the samples that were never drawn serve as the test set. This method suits small datasets, where it is difficult to split off effective training and test sets.
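A minimal numpy sketch of this sampling scheme, reusing the placeholder arrays X, y:

import numpy as np

m = len(X)  # assumes X, y are numpy arrays as in the earlier snippets
# Draw m indices uniformly with replacement: the bootstrap sample D'
idx = np.random.randint(0, m, size=m)
X_train, y_train = X[idx], y[idx]
# Samples never drawn (about 36.8% in expectation) form the test set
oob = np.setdiff1d(np.arange(m), idx)
X_test, y_test = X[oob], y[oob]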

2.2.4 Parameter tuning and the final model

After model selection is done, the training data and test data should be merged and the model retrained on them; this model, trained on all the samples, is the final model delivered to the user.
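A minimal sketch of that final refit, reusing the placeholder split and a placeholder estimator:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Once tuning is finished, refit on every available sample
X_all = np.concatenate([X_train, X_test])
y_all = np.concatenate([y_train, y_test])
final_model = LogisticRegression().fit(X_all, y_all)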

2.3 Performance measures

2.3.1 Error rate and accuracy

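For a learner f on a dataset D with m samples, the standard definitions from the book are:

$E(f;D) = \frac{1}{m}\sum_{i=1}^{m}\mathbb{I}(f(x_i) \neq y_i), \qquad acc(f;D) = 1 - E(f;D)$

where $\mathbb{I}(\cdot)$ is the indicator function.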

2.3.2 precision, recall and F1

Precision is also known as the positive predictive value, and recall as the true positive rate: precision measures what fraction of the predicted positives are truly positive, recall what fraction of the true positives are found.

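In terms of the confusion-matrix counts (TP true positives, FP false positives, FN false negatives), the standard definitions are:

$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$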
One summary when evaluating a model is its "balance point" (Break-Even Point, BEP), the value at which precision equals recall. But BEP is somewhat oversimplified; more commonly used is the F1 measure:
$F1 = \frac{2 \times P \times R}{P + R} = \frac{2 \times TP}{\text{total samples} + TP - TN}$

A point not mentioned in the book: every threshold yields a point on the P-R curve and hence an F value; the maximum F value over all thresholds can be taken as the classifier's F_score.
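A minimal sklearn sketch of these measures on made-up labels:

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]  # made-up ground-truth labels
y_pred = [1, 0, 0, 1, 1, 1]  # made-up predictions
# precision = TP/(TP+FP), recall = TP/(TP+FN), F1 = their harmonic mean
print(precision_score(y_true, y_pred))  # 0.75
print(recall_score(y_true, y_pred))     # 0.75
print(f1_score(y_true, y_pred))         # 0.75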

2.3.3 ROC and AUC

True positive rate and false positive rate:
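Their standard definitions are:

$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{TN + FP}$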
In competitions, AUC (the area under the ROC curve) is also commonly used as a measure for classification algorithms.
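A minimal sklearn sketch on made-up scores; note that AUC is computed from predicted scores or probabilities, not hard labels:

from sklearn.metrics import roc_auc_score

y_true  = [0, 0, 1, 1]            # made-up ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8]   # made-up predicted positive-class scores
print(roc_auc_score(y_true, y_score))  # 0.75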

2.4 Comparison tests

2.4.1 Hypothesis testing

By derivation, one obtains the formula for the probability that a learner with generalization error rate ε is measured to have test error rate ε̂:
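Reconstructed from the book, for m test samples this is the binomial probability:

$P(\hat{\epsilon};\epsilon) = \binom{m}{\hat{\epsilon} \times m} \epsilon^{\hat{\epsilon} \times m} (1-\epsilon)^{m - \hat{\epsilon} \times m}$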
In hypothesis testing, the "hypothesis" is some judgment or conjecture about the distribution of the learner's generalization error rate.

If the test error rate is below the critical value, then at confidence level 1-α we can conclude that the learner's generalization error rate is no greater than $\epsilon_0$; otherwise, the hypothesis is rejected.

2.4.2 Cross-validated t-test

The idea is to apply a paired t-test to the results obtained from repeated hold-out evaluation or k-fold cross-validation.
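A minimal scipy sketch, assuming two learners scored on the same five folds (the error lists are made up):

from scipy.stats import ttest_rel

errors_a = [0.12, 0.10, 0.15, 0.11, 0.13]  # per-fold error rates of learner A
errors_b = [0.14, 0.13, 0.16, 0.12, 0.15]  # per-fold error rates of learner B
t_stat, p_value = ttest_rel(errors_a, errors_b)  # paired t-test across folds
# If p_value is below the chosen alpha, the performance difference is significant
print(t_stat, p_value)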

This part involves quite a lot of statistics; if you don't have much background, I recommend first reading a plain-language statistics primer.


Source: blog.csdn.net/weixin_41992565/article/details/91044860