"Machine Learning" (Zhou Zhihua) study notes --- model evaluation and selection

1, evaluation methods

1.1 hold-out method

D = S ∪ T, S ∩ T = Ø

S: training set

T: test set

Note: the split should keep the data distribution consistent between S and T, for example by stratified sampling that preserves the class proportions.
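A minimal sketch of a stratified hold-out split, using only the Python standard library. The function name `stratified_holdout`, the 30% test ratio and the toy labels are assumptions made for illustration; they are not from the book.

```python
import random
from collections import defaultdict

def stratified_holdout(labels, test_ratio=0.3, seed=0):
    """Return (train_idx, test_idx), keeping class proportions roughly equal."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    # Split each class separately so both parts see the same class mix.
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_ratio)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy data (hypothetical): 70 samples of class 0 and 30 of class 1.
labels = [0] * 70 + [1] * 30
S, T = stratified_holdout(labels, test_ratio=0.3)
```

Here S and T are disjoint index lists that together cover D, and T keeps the 70/30 class mix of the full data set.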

1.2 cross validation

D = D1 ∪ D2 ∪ ... ∪ Dk, Di ∩ Dj = Ø (i ≠ j)

In each round, the union of k-1 subsets is used as the training set and the remaining subset as the test set; the k test results are then averaged.

The most common value of k is 10; 5 and 20 are also used.
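The k rounds of training and testing can be sketched as an index generator. This is an illustrative version under simple assumptions (plain shuffling, no stratification); the name `k_fold_splits` is hypothetical.

```python
import random

def k_fold_splits(m, k=10, seed=0):
    """Yield (train_idx, test_idx) for each of the k folds over m samples."""
    idx = list(range(m))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k mutually exclusive subsets D1..Dk.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # The training set is the union of the other k-1 subsets.
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, test

splits = list(k_fold_splits(m=20, k=5))
```

Each sample appears in exactly one test fold, so every sample is used for testing once and for training k-1 times.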

1.3 bootstrapping

Given a data set D with m samples, randomly pick one sample from D, copy it into D', and put it back into D so that it may be picked again in later draws. Repeating this m times yields a data set D' containing m samples.

D' is used as the training set and D \ D' as the test set.

Applications: useful when the data set is small and it is hard to split it effectively into training and test sets.
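A small sketch of the sampling procedure, assuming samples are identified by their index in D. The function name `bootstrap_split` is hypothetical. The probability that a given sample is never drawn in m draws is (1 - 1/m)^m, which tends to 1/e ≈ 0.368, so roughly 36.8% of D ends up in the test set.

```python
import random

def bootstrap_split(m, seed=0):
    """Draw m indices with replacement (D'); the never-drawn ones form D \\ D'."""
    rng = random.Random(seed)
    train = [rng.randrange(m) for _ in range(m)]  # D', may contain repeats
    chosen = set(train)
    test = [i for i in range(m) if i not in chosen]  # D \ D'
    return train, test

train, test = bootstrap_split(10000)
# Fraction of D that never made it into D'; expected to be close to 1/e.
oob_fraction = len(test) / 10000
```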

2, performance metrics

Mean squared error (regression tasks)

Error rate, accuracy (classification tasks)

Precision, recall and F1

ROC and AUC

Cost-sensitive error rate and cost curves (different types of errors incur different costs)

3, comparison tests
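The metrics listed above can be computed directly from their definitions. The following is a sketch for binary labels in {0, 1}; the function names and toy data are assumptions. AUC is computed here from its ranking interpretation (the fraction of positive/negative pairs ranked correctly, ties counted as one half) instead of by integrating the ROC curve.

```python
def mse(y_true, y_pred):
    """Mean squared error for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def error_rate(y_true, y_pred):
    """Fraction of misclassified samples; accuracy is 1 - error_rate."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred):
    """Precision P, recall R and F1 = 2PR / (P + R) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def auc(y_true, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

p, r, f1 = precision_recall_f1(y_true, y_pred)  # each equals 2/3
err = error_rate(y_true, y_pred)                # 2 of 6 samples are wrong
area = auc(y_true, scores)                      # 8 of 9 pairs ranked correctly
```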

3.1 Hypothesis testing

3.2 Cross-validated t-test

3.3 McNemar test

3.4 Friedman test and Nemenyi post-hoc test

4, bias and variance
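As a sketch of two of the tests listed above: the paired t statistic over k-fold test error rates, and the McNemar statistic with continuity correction. The function names and numbers are made up for illustration. The critical values in the comments (2.776 for a two-sided t-test with 4 degrees of freedom, 3.8415 for chi-square with 1 degree of freedom, both at α = 0.05) are standard table values. The t statistic assumes the per-fold differences are not all identical, otherwise the variance is zero.

```python
import math

def paired_t_statistic(err_a, err_b):
    """|sqrt(k) * mean / std| of the per-fold error differences (k-1 dof)."""
    diffs = [a - b for a, b in zip(err_a, err_b)]
    k = len(diffs)
    mean = sum(diffs) / k
    var = sum((d - mean) ** 2 for d in diffs) / (k - 1)  # sample variance
    return abs(math.sqrt(k) * mean / math.sqrt(var))

def mcnemar_statistic(e01, e10):
    """(|e01 - e10| - 1)^2 / (e01 + e10), chi-square with 1 degree of freedom.

    e01: samples learner A gets wrong and learner B gets right;
    e10: samples learner A gets right and learner B gets wrong.
    """
    return (abs(e01 - e10) - 1) ** 2 / (e01 + e10)

# Hypothetical 5-fold test error rates of two learners A and B.
err_a = [0.10, 0.12, 0.11, 0.13, 0.09]
err_b = [0.08, 0.11, 0.10, 0.10, 0.08]
t = paired_t_statistic(err_a, err_b)
reject_t = t > 2.776            # two-sided, alpha = 0.05, 4 dof

chi2 = mcnemar_statistic(15, 5)
reject_mcnemar = chi2 > 3.8415  # alpha = 0.05, 1 dof
```

In both toy cases the statistic exceeds the critical value, so the hypothesis that the two learners perform equally would be rejected at α = 0.05.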
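For reference, the standard bias-variance decomposition of the expected generalization error can be written as follows. It assumes squared loss and zero-mean label noise. Here f(x; D) is the prediction of the model trained on D, \bar{f}(x) is its expectation over training sets, y is the true label of x, and y_D is the label of x observed in D.

```latex
\begin{align*}
\bar{f}(x) &= \mathbb{E}_D\left[ f(x; D) \right] \\
\mathrm{bias}^2(x) &= \left( \bar{f}(x) - y \right)^2 \\
\mathrm{var}(x) &= \mathbb{E}_D\left[ \left( f(x; D) - \bar{f}(x) \right)^2 \right] \\
\varepsilon^2 &= \mathbb{E}_D\left[ \left( y_D - y \right)^2 \right] \\
E(f; D) &= \mathrm{bias}^2(x) + \mathrm{var}(x) + \varepsilon^2
\end{align*}
```

The bias term measures how far the average prediction is from the truth, the variance term measures how much predictions change across training sets of the same size, and the noise term is a lower bound on the error that any learner can reach on the task.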

 


Origin www.cnblogs.com/avecle/p/11628580.html