Watermelon Book Reading Notes (2): Model Evaluation and Selection

Index of all notes: "Machine Learning" (Watermelon Book) Reading Notes Summary

1. Empirical error and overfitting

We call the difference between a learner's actual predicted output and the sample's true output the "error". The error of the learner on the training set is called the "training error" or "empirical error", and the error on new samples is called the "generalization error". Obviously our goal is a learner with small generalization error, but since we do not know in advance what new samples will look like, we can only try to minimize the empirical error. However, when the empirical error is driven too low, to the point where every training sample is classified correctly, overfitting is likely to occur.

Many factors can lead to overfitting. The most common cause is that the learner's capacity is so strong that it picks up peculiarities of the training samples as if they were general properties, whereas underfitting is usually caused by insufficient learning capacity. Underfitting is relatively easy to overcome, for example by growing more branches in decision tree learning or by increasing the number of training epochs in neural networks; overfitting, by contrast, is much harder to deal with.
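
As a small hedged illustration of this contrast (my own numpy sketch, not code from the book): fitting polynomials of increasing degree to a few noisy points shows that too little capacity underfits, while excessive capacity drives the training error to nearly zero yet raises the error on new samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small noisy sample from an underlying quadratic relationship.
x_train = np.linspace(-1, 1, 10)
y_train = x_train**2 + rng.normal(0, 0.1, size=x_train.shape)
x_test = np.linspace(-1, 1, 100)
y_test = x_test**2 + rng.normal(0, 0.1, size=x_test.shape)

for degree in (1, 2, 9):  # low, adequate, and excessive model capacity
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree={degree}: training error={train_err:.4f}, test error={test_err:.4f}")
```

With degree 9 and only 10 training points the fit passes through every sample (training error near zero), but its error on the new samples is larger than the degree-2 fit: the pattern the note describes as overfitting.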

2. Evaluation methods

  1. Hold-out method: split the dataset into a training set and a test set.
  2. Cross-validation: for example, divide the data into four folds 1, 2, 3, 4; first use folds 1, 2, 3 as the training set and fold 4 as the test set, then folds 1, 2, 4 as the training set and fold 3 as the test set, and so on.
  3. Bootstrapping: based on bootstrap sampling; it is useful when the dataset is small and hard to split effectively into training/test sets, and it can also generate multiple different training sets from the original data, which is very helpful for ensemble learning (see the sketch after this list).
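
A minimal numpy sketch of the three splitting strategies (my own illustration, not code from the book; the dataset size of 20 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
indices = rng.permutation(n)

# Hold-out: a single split into training and test sets (here 70% / 30%).
split = int(0.7 * n)
train_idx, test_idx = indices[:split], indices[split:]

# k-fold cross-validation: each fold serves once as the test set.
k = 4
folds = np.array_split(indices, k)
for i in range(k):
    test_fold = folds[i]
    train_folds = np.concatenate([folds[j] for j in range(k) if j != i])
    # a learner would be trained on train_folds and evaluated on test_fold here

# Bootstrap: draw n indices with replacement; out-of-bag samples form the test set.
boot = rng.integers(0, n, size=n)
oob = np.setdiff1d(indices, boot)  # roughly 36.8% of samples are never drawn
```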

Parameter tuning is just as important as model selection. Most learning algorithms have parameters that need to be set, and small differences in their values can lead to significant differences in the performance of the resulting model.
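
As a hedged illustration of tuning (again my own sketch, not from the book), one common pattern is to reserve part of the training data as a validation set and pick the parameter value, here a polynomial degree, that performs best on it:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data split into a training portion and a held-out validation portion.
x = np.linspace(-1, 1, 30)
y = x**2 + rng.normal(0, 0.1, size=x.shape)
perm = rng.permutation(len(x))
train_idx, val_idx = perm[:20], perm[20:]

# Treat the polynomial degree as the parameter to tune: keep the value
# with the smallest validation error.
best_degree, best_err = None, np.inf
for degree in range(1, 8):
    coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
    err = np.mean((np.polyval(coeffs, x[val_idx]) - y[val_idx]) ** 2)
    if err < best_err:
        best_degree, best_err = degree, err
print("selected degree:", best_degree)
```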

3. Performance measurement

A performance measure is an evaluation criterion for the generalization ability of a model. Whether a model is "good" is relative: which model is best depends not only on the algorithm and the data, but also on the task requirements.
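
The note does not list concrete measures, but the most basic ones in the chapter are classification error rate (and accuracy) and mean squared error for regression; a minimal sketch of computing them:

```python
import numpy as np

# Classification: error rate and accuracy.
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])
error_rate = np.mean(y_pred != y_true)
accuracy = 1.0 - error_rate

# Regression: mean squared error.
t_true = np.array([1.0, 2.0, 3.0])
t_pred = np.array([1.1, 1.9, 3.3])
mse = np.mean((t_pred - t_true) ** 2)

print(error_rate, accuracy, mse)
```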

4. Comparative tests

What is being compared?

  1. Performance on the test set versus performance on the training set
  2. Test sets of different sizes
  3. Test sets of the same size but containing different test samples
  4. The algorithm itself may be random: the same parameters run several times on the same test set can give different results

For the specific statistical test methods, refer directly to the worked examples in the textbook.
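
As one hedged example of such a test (the chapter covers several; the fold error rates below are made up and scipy is assumed to be available), a paired t-test over the per-fold error rates of two learners:

```python
import numpy as np
from scipy import stats

# Hypothetical error rates of two learners on the same k = 10 folds.
errors_a = np.array([0.10, 0.12, 0.09, 0.11, 0.13, 0.10, 0.12, 0.11, 0.10, 0.12])
errors_b = np.array([0.12, 0.14, 0.11, 0.13, 0.15, 0.12, 0.13, 0.14, 0.12, 0.13])

# Paired t-test on the per-fold differences; a small p-value suggests the
# performance difference is unlikely to be due to chance alone.
result = stats.ttest_rel(errors_a, errors_b)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```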

5. Bias and variance

The difference between the learner's expected prediction and the true label is called the bias. The bias-variance decomposition shows that generalization performance is jointly determined by the capability of the learning algorithm, the sufficiency of the data, and the difficulty of the learning task itself. For a given learning task, achieving good generalization performance requires both a small bias, i.e., the model can fit the data well, and a small variance, i.e., the effect of perturbations in the data is small.
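
For reference, a standard statement of the decomposition the note alludes to (notation is the usual one, not copied from the note): with $f(\boldsymbol{x};D)$ the prediction learned from training set $D$, $\bar{f}(\boldsymbol{x})$ its expectation over training sets, $y$ the true label, and $y_D$ the possibly noisy observed label,

$$
E(f;D) \;=\; \underbrace{\bigl(\bar{f}(\boldsymbol{x}) - y\bigr)^2}_{\mathrm{bias}^2(\boldsymbol{x})}
\;+\; \underbrace{\mathbb{E}_D\bigl[\bigl(f(\boldsymbol{x};D) - \bar{f}(\boldsymbol{x})\bigr)^2\bigr]}_{\mathrm{var}(\boldsymbol{x})}
\;+\; \underbrace{\mathbb{E}_D\bigl[(y_D - y)^2\bigr]}_{\varepsilon^2}
$$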

Next chapter: Watermelon Book Reading Notes (3): Linear Models

Origin: blog.csdn.net/qq_41485273/article/details/112702796