Error and Gradient Descent (Li Hongyi "Machine Learning")

Datawhale 202211 · Li Hongyi's "Machine Learning" (deep learning track), P5-P8: Error and Gradient Descent



Foreword

This section introduces error and gradient descent, the foundations of model training. It also covers several strategies for handling common problems during training; the treatment here is fairly theoretical.


1. Error

Where did the error come from?

Error - the overall accuracy of the model on unseen data;
Bias - the gap between the model's average prediction and the true value;
Variance - the spread of the model's predictions across different training sets (its stability).

In practice, low bias and low variance are hard to achieve at the same time: we are estimating an unlimited real distribution from a limited training sample. If we trust the data completely and ignore prior knowledge about the model, pushing training accuracy up to reduce bias, we easily overfit and increase the model's variance. If we lean more on prior knowledge instead, adding constraints to improve stability, we increase the bias. Trading off bias against variance is therefore a central topic.
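The trade-off can also be seen numerically. The sketch below is a hypothetical illustration of my own (the sine target, noise level, and polynomial degrees are arbitrary choices, not from the course): a simple model fitted to many resampled training sets shows large bias and small variance, while a complex one shows the reverse.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical illustration: fit a simple and a complex model to many
# independently drawn training sets, then measure bias and variance of
# their predictions at a single test point.
rng = np.random.default_rng(42)
true_f = lambda x: np.sin(2 * np.pi * x)
x_test, n_sets = 0.25, 500

for degree in (1, 9):
    preds = []
    for _ in range(n_sets):
        x = rng.uniform(0, 1, 20)                 # 20 training points
        y = true_f(x) + rng.normal(0, 0.3, 20)    # noisy labels
        p = Polynomial.fit(x, y, degree)
        preds.append(p(x_test))
    preds = np.asarray(preds)
    bias2 = (preds.mean() - true_f(x_test)) ** 2  # (E[prediction] - truth)^2
    print(f"degree {degree}: bias^2 = {bias2:.3f}, variance = {preds.var():.3f}")
# Typical result: the degree-1 model has large bias and small variance,
# the degree-9 model the reverse.
```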

Estimation

  • Estimating the mean of x

Assume x has mean μ and variance σ^2. To estimate the mean:

  • Draw N sample points x^1, ..., x^N
  • Compute the sample mean m = (1/N) Σ_n x^n; in general m ≠ μ
  • Across many repeated samples, the expectation is E[m] = μ, so m is an unbiased estimator of the mean
  • Its spread around μ is Var[m] = σ^2 / N, which shrinks as N grows

  • Estimating the variance of x

  • Compute s^2 = (1/N) Σ_n (x^n − m)^2
  • Across repeated samples, E[s^2] = ((N−1)/N) σ^2 ≠ σ^2, so s^2 systematically underestimates the true variance (a biased estimator); both facts are verified numerically below
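These two results are easy to check with a quick simulation. The sketch below is my own illustration (the distribution and constants are arbitrary), comparing empirical averages over many repeated samples with the formulas above:

```python
import numpy as np

# Numerical check: the sample mean is unbiased, while the 1/N sample
# variance underestimates sigma^2 by a factor of (N - 1)/N.
rng = np.random.default_rng(0)
mu, sigma, N, trials = 3.0, 2.0, 5, 100_000

samples = rng.normal(mu, sigma, size=(trials, N))
m = samples.mean(axis=1)                         # sample mean of each trial
s2 = ((samples - m[:, None]) ** 2).mean(axis=1)  # 1/N sample variance

print("E[m]   ≈", m.mean(),  "vs mu =", mu)                  # unbiased
print("Var[m] ≈", m.var(),   "vs sigma^2/N =", sigma**2 / N)
print("E[s^2] ≈", s2.mean(), "vs (N-1)/N * sigma^2 =", (N - 1) / N * sigma**2)
```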

Diagnosis

Poor performance even on the training set - large bias - underfitting.
Good performance on the training set but poor performance on the test set - large variance - overfitting.

  • Large bias (underfitting)
  • Redesign the model: add more features as input, or consider a more complex model with more parameters
  • Large variance (overfitting)
  • Train with more data
  • Data augmentation: cropping, shifting, ...

Model selection


  • Cross-validation

Cross-validation splits the original training set into two parts: one part is used for training and the other as a validation set.

  • Select among candidate models using the new training/validation split
  • Then retrain the chosen model on the entire training set before touching the test set
  • N-fold cross-validation
  • Split the training set into N folds and train each candidate model N times, holding out a different fold for validation each time
  • Choose the candidate with the best average validation error, then retrain it on the full training set (see the sketch below)
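A minimal N-fold cross-validation sketch, assuming a polynomial-degree selection task of my own choosing (the dataset and candidate degrees are placeholders, not from the course):

```python
import numpy as np
from numpy.polynomial import Polynomial

def n_fold_cv_error(x, y, degree, n_folds=5, seed=0):
    """Average validation MSE of a degree-`degree` polynomial over n folds."""
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, n_folds)
    errors = []
    for k in range(n_folds):
        val = folds[k]                                   # held-out fold
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        model = Polynomial.fit(x[train], y[train], degree)
        errors.append(np.mean((model(x[val]) - y[val]) ** 2))
    return np.mean(errors)                               # average over folds

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 60)

scores = {d: n_fold_cv_error(x, y, d) for d in (1, 3, 5, 9)}
best = min(scores, key=scores.get)   # lowest average validation error wins
print(scores, "-> best degree:", best)
# Finally, retrain the chosen model on the full training set before testing.
```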

2. Gradient descent

Adjusting the learning rate

A learning rate that is too small makes the loss decrease very slowly; one that is too large makes the loss oscillate or even diverge, so it helps to plot the loss against the number of updates while tuning. Beyond hand-tuning, the lecture introduces adaptive schemes such as Adagrad, which divides each parameter's learning rate by the root of the sum of that parameter's squared past gradients.
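As a concrete example, here is a minimal sketch of the Adagrad update; only the update rule follows the lecture, while the quadratic loss is a placeholder of mine:

```python
import numpy as np

# Adagrad: each parameter's step is scaled by 1 / sqrt(sum of its squared
# past gradients). The loss ||w - w*||^2 is a toy placeholder.
def grad(w):
    return 2 * (w - np.array([1.0, -2.0]))    # gradient of ||w - w*||^2

w = np.zeros(2)
eta, eps = 1.0, 1e-8
g_sq_sum = np.zeros_like(w)                   # running sum of squared gradients

for t in range(100):
    g = grad(w)
    g_sq_sum += g ** 2
    w -= eta / (np.sqrt(g_sq_sum) + eps) * g  # per-parameter adaptive step

print(w)  # ≈ [1.0, -2.0], the minimizer
```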

Stochastic gradient descent

Ordinary gradient descent computes the gradient of the total loss over all training examples before each update. Stochastic gradient descent instead updates the parameters using the loss on a single example at a time, trading exact gradients for many more, faster (and noisier) updates.
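A minimal SGD sketch for linear regression (the synthetic data and step size are placeholder choices of mine):

```python
import numpy as np

# Stochastic gradient descent: each update uses the squared-error gradient
# of a single randomly chosen example rather than the full batch.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(0, 0.1, 200)

w = np.zeros(3)
eta = 0.05
for epoch in range(20):
    for i in rng.permutation(len(X)):   # shuffle the examples each epoch
        err = X[i] @ w - y[i]           # residual on one example
        w -= eta * err * X[i]           # single-example gradient step

print(w)  # ≈ [2.0, -1.0, 0.5]
```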

Feature scaling

When input features have very different ranges, the error surface is elongated and the gradient does not point straight at the minimum, so gradient descent zig-zags. Rescaling the features so that every dimension has a comparable range makes the loss contours closer to circles and the updates more direct. A common recipe is standardization: for each input dimension, subtract that dimension's mean and divide by its standard deviation.
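A minimal standardization sketch (the example matrix is a placeholder of mine):

```python
import numpy as np

# Standardize each input dimension to zero mean and unit variance so all
# features end up on a comparable scale.
def standardize(X):
    mean = X.mean(axis=0)        # per-dimension mean over the dataset
    std = X.std(axis=0)          # per-dimension standard deviation
    return (X - mean) / std, mean, std

X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 200.0]])
X_scaled, mean, std = standardize(X)
print(X_scaled.mean(axis=0))     # ≈ [0, 0]
print(X_scaled.std(axis=0))      # ≈ [1, 1]
# At test time, reuse the mean and std computed from the training data.
```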

3. Reference documents

From Datawhale

Li Hongyi's "Machine Learning" open source content 1:
https://linklearner.com/datawhale-homepage/#/learn/detail/93
Li Hongyi's "Machine Learning" open source content 2:
https://github.com/datawhalechina/leeml-notes
Li Hongyi's "Machine Learning" open source content 3:
https://gitee.com/datawhalechina/leeml-notes

From the official sources

Li Hongyi's "Machine Learning" official address
http://speech.ee.ntu.edu.tw/~tlkagk/courses.html
Li Mu's "Hands-On Deep Learning" official address
https://zh-v2.d2l.ai/

From the community

Gradient basic knowledge reference
https://blog.csdn.net/weixin_50967907/article/details/127259554


Summary

1. This course is very beginner-friendly, and the reasoning behind each step is explained well.
2. Unfortunately, many of the concepts are hard to internalize without hands-on practice; pairing the course with some practical exercises is worthwhile, and there will be chances to follow up later.
3. Thanks to the Datawhale friends for their support, especially within the study group. Onward.
