What is cross-validation (Cross Validation)?

Overview

Validation is a way of measuring the quality of a trained machine learning model.
Cross-validation is a commonly used model selection method that holds out part of the data set to validate the model.

Common methods

There are three common kinds of cross-validation:

1. Simple cross-validation

The data set is split into two parts (or three), for example 70% as the training set and 30% as the validation set. Models are trained on the 70% under different parameter settings, each one is then evaluated on the 30% that was not used for training, and the best-performing model is chosen.
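
The split-then-select procedure above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the function names `holdout_split` and `select_by_holdout`, and the `fit` and `error` callables passed in by the caller, are hypothetical and not taken from any particular library.

```python
import random

def holdout_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples, then split them into a training and a validation set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # random order before cutting
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def select_by_holdout(samples, candidates, fit, error, train_fraction=0.7, seed=0):
    """Train every candidate setting on the training set and
    return the setting with the lowest error on the validation set."""
    train, valid = holdout_split(samples, train_fraction, seed)
    scored = [(error(fit(c, train), valid), c) for c in candidates]
    return min(scored, key=lambda pair: pair[0])[1]

# With 10 samples and the default 70/30 split, 7 go to training and 3 to validation.
train, valid = holdout_split(range(10))
```

Here `fit(candidate, train)` is assumed to return a trained model and `error(model, valid)` a number where lower is better. Any other convention works as long as both callables agree on it.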

2. S-fold cross-validation

The data set is partitioned into S disjoint subsets of similar size. The model is trained on S-1 of the subsets and validated on the remaining one. This is repeated for each of the S possible choices of validation subset, and the model with the best average result over the S runs is selected.
[Note] The validation set is different in each run.
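
As a rough sketch of S-fold cross-validation, the code below builds the S train/validation splits and averages the validation error over them. It uses only the standard library. The names `s_fold_indices` and `cross_val_error`, and the `fit` and `error` callables, are assumptions made for this illustration.

```python
import random

def s_fold_indices(n, s, seed=0):
    """Yield (train_indices, valid_indices) for each of the S folds over n samples."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::s] for i in range(s)]
    for k in range(s):
        valid = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, valid

def cross_val_error(samples, s, fit, error, seed=0):
    """Average validation error of one model setting over the S folds."""
    errors = []
    for train_idx, valid_idx in s_fold_indices(len(samples), s, seed):
        model = fit([samples[i] for i in train_idx])
        errors.append(error(model, [samples[i] for i in valid_idx]))
    return sum(errors) / len(errors)
```

To choose between several model settings, one would call `cross_val_error` once per setting with the same `seed` (so every setting sees the same folds) and keep the setting with the lowest average error.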

3. Leave-one-out cross-validation

This is a special case of S-fold cross-validation with S = N, where N is the size of the data set: each run holds out a single sample for validation. It is typically used when the data set is small (for example, fewer than 100 samples, or even fewer).
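
A minimal sketch of the S = N case follows, again with the standard library only. The function name `leave_one_out_error` and the `fit` and `error` callables are hypothetical. The toy "model" in the usage lines simply predicts the mean of its training samples.

```python
def leave_one_out_error(samples, fit, error):
    """Hold out each sample once, train on the other N-1, and average the errors."""
    samples = list(samples)
    total = 0.0
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]  # everything except sample i
        model = fit(train)
        total += error(model, samples[i])
    return total / len(samples)

# Toy usage: the model is the training mean, the error is the squared difference.
mean_fit = lambda train: sum(train) / len(train)
squared_error = lambda model, x: (x - model) ** 2
print(leave_one_out_error([1.0, 2.0, 3.0], mean_fit, squared_error))  # prints 1.5
```

Because the model is retrained N times, this is only practical when N is small, which matches the small-data setting described above.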

Details

Whenever the data set is split, the selection should be made by random sampling so that the samples are independent and identically distributed, because the theory of machine learning is built on this assumption.

