Supervised learning is one branch of machine learning. When a model is trained on labeled data, cross-validation is commonly used to select the model parameters.
The labeled data is divided into three sets: a training set, a (cross-)validation set, and a test set.
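As a minimal sketch of such a split, the snippet below shuffles the labeled samples and cuts them into three parts. The function name three_way_split, the 60/20/20 proportions, and the toy data are assumptions made for illustration; the text above does not prescribe any particular ratio.

```python
import random

def three_way_split(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the samples and cut them into training, validation, and test lists.

    The fractions are illustrative defaults, not values taken from the text.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains becomes the test set
    return train, val, test

# Toy labeled data: (feature, label) pairs, purely hypothetical.
data = [(x, x % 2) for x in range(100)]
train_set, val_set, test_set = three_way_split(data)
```

Shuffling before cutting matters when the data is stored in some order (for example sorted by label), since otherwise the three sets would not be drawn from the same distribution.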
Some parameters of a machine learning model must be specified in advance. They are constants fixed before training (often called hyperparameters), unlike the parameters obtained by minimizing the objective function during training. Choosing them from experience alone is not necessarily reliable, so cross-validation is run before the final training to choose the value of each such constant.
Model selection, or more precisely model parameter selection, proceeds as follows (taking an SVM with a polynomial kernel as an example, the parameter to select is the degree k of the polynomial):
1 Training: use the Training Set to fit the linear model, the quadratic model, the cubic model, and so on (the degree refers to the order of z = Z_θ(x)). For each specified degree, use a numerical optimization algorithm to find the parameters θ that minimize the Training error.
2 Parameter optimization: run each model M(1, θ_1) ... M(k, θ_k) ... M(n, θ_n) obtained in step 1 on the Cross Validation Set, compute its Cross Validation error, and select the degree k that minimizes this error.
3 Final training: with the degree k fixed, train on the Training Set + CV Set to obtain the parameters θ_k' that minimize the Training error, which gives the model M(k, θ_k').
Evaluation:
Run the model M(k, θ_k') obtained in step 3 on the Test Set (the selected model is fixed at this point and its parameters can no longer be changed) and compute the Test error as an estimate of the model's generalization ability, that is, its prediction accuracy on unseen data.
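Steps 1 to 3 and the final evaluation can be sketched in a few lines. Note the substitution: to keep the example short and dependent only on NumPy, it uses least-squares polynomial regression (numpy.polyfit) rather than an SVM with a polynomial kernel, so the degree k plays the same role but the model is different. The synthetic data, the 60/20/20 split, the candidate degrees 1 to 6, and the helper name mse are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): a noisy cubic.
x = rng.uniform(-1.0, 1.0, size=300)
y = 1.0 - 2.0 * x + 3.0 * x**3 + rng.normal(scale=0.1, size=x.size)

# The samples are already in random order, so slicing gives a 60/20/20 split.
x_train, y_train = x[:180], y[:180]
x_val, y_val = x[180:240], y[180:240]
x_test, y_test = x[240:], y[240:]

def mse(theta, xs, ys):
    """Mean squared error of the polynomial with coefficients theta."""
    return float(np.mean((np.polyval(theta, xs) - ys) ** 2))

degrees = range(1, 7)  # candidate values of the degree k

# Step 1, training: for each degree k, find theta_k minimizing the training error.
thetas = {k: np.polyfit(x_train, y_train, deg=k) for k in degrees}

# Step 2, parameter optimization: pick the k with the lowest validation error.
val_errors = {k: mse(thetas[k], x_val, y_val) for k in degrees}
best_k = min(val_errors, key=val_errors.get)

# Step 3, final training: refit with degree best_k on training + validation data.
x_fit = np.concatenate([x_train, x_val])
y_fit = np.concatenate([y_train, y_val])
theta_final = np.polyfit(x_fit, y_fit, deg=best_k)

# Evaluation: the model is now fixed; the test set is used exactly once.
test_error = mse(theta_final, x_test, y_test)
print("selected degree:", best_k, "test error:", test_error)
```

The test set is never consulted while choosing k. If it were, the Test error would no longer be an unbiased estimate of performance on unseen data.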
The above is basic cross-validation. A variant that balances effectiveness against time cost is n-fold cross-validation, which is the most commonly used method in machine learning experiments: