Error and Gradient Descent (Li Hongyi "Machine Learning")

Datawhale 202211 · Li Hongyi's "Machine Learning" (deep learning track), P5-P8: Error and Gradient Descent



Foreword

This section introduces error and gradient descent, the foundations of model training. It also covers several strategies for handling common problems during training; the treatment here is fairly theoretical.


1. Error

Where did the error come from?

Error - the overall accuracy of the model on unseen data;
Bias - the gap between the model's average prediction and the true value;
Variance - the spread of the model's predictions across different training sets (its stability).

In practice, low bias and low variance are hard to achieve at the same time: we are estimating an unlimited real distribution from a limited training sample. If we trust the data completely and ignore prior knowledge about the model, pushing training accuracy up to reduce bias, we easily overfit and increase the model's variance. If we lean more on prior knowledge instead, adding constraints to improve stability, we increase the bias. Trading off bias against variance is therefore a central topic.
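The trade-off can also be seen numerically. The sketch below is a hypothetical illustration of my own (the sine target, noise level, and polynomial degrees are arbitrary choices, not from the course): a simple model fitted to many resampled training sets shows large bias and small variance, while a complex one shows the reverse.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical illustration: fit a simple and a complex model to many
# independently drawn training sets, then measure bias and variance of
# their predictions at a single test point.
rng = np.random.default_rng(42)
true_f = lambda x: np.sin(2 * np.pi * x)
x_test, n_sets = 0.25, 500

for degree in (1, 9):
    preds = []
    for _ in range(n_sets):
        x = rng.uniform(0, 1, 20)                 # 20 training points
        y = true_f(x) + rng.normal(0, 0.3, 20)    # noisy labels
        p = Polynomial.fit(x, y, degree)
        preds.append(p(x_test))
    preds = np.asarray(preds)
    bias2 = (preds.mean() - true_f(x_test)) ** 2  # (E[prediction] - truth)^2
    print(f"degree {degree}: bias^2 = {bias2:.3f}, variance = {preds.var():.3f}")
# Typical result: the degree-1 model has large bias and small variance,
# the degree-9 model the reverse.
```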

Estimation

  • Estimating the mean of x

Assume x has mean μ and variance σ^2. To estimate the mean:

  • Draw N sample points x^1, ..., x^N
  • Compute the sample mean m = (1/N) Σ_n x^n; in general m ≠ μ
  • Across many repeated samples, the expectation is E[m] = μ, so m is an unbiased estimator of the mean
  • Its spread around μ is Var[m] = σ^2 / N, which shrinks as N grows

  • Estimating the variance of x

  • Compute s^2 = (1/N) Σ_n (x^n − m)^2
  • Across repeated samples, E[s^2] = ((N−1)/N) σ^2 ≠ σ^2, so s^2 systematically underestimates the true variance (a biased estimator); both facts are verified numerically below
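These two results are easy to check with a quick simulation. The sketch below is my own illustration (the distribution and constants are arbitrary), comparing empirical averages over many repeated samples with the formulas above:

```python
import numpy as np

# Numerical check: the sample mean is unbiased, while the 1/N sample
# variance underestimates sigma^2 by a factor of (N - 1)/N.
rng = np.random.default_rng(0)
mu, sigma, N, trials = 3.0, 2.0, 5, 100_000

samples = rng.normal(mu, sigma, size=(trials, N))
m = samples.mean(axis=1)                         # sample mean of each trial
s2 = ((samples - m[:, None]) ** 2).mean(axis=1)  # 1/N sample variance

print("E[m]   ≈", m.mean(),  "vs mu =", mu)                  # unbiased
print("Var[m] ≈", m.var(),   "vs sigma^2/N =", sigma**2 / N)
print("E[s^2] ≈", s2.mean(), "vs (N-1)/N * sigma^2 =", (N - 1) / N * sigma**2)
```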

Diagnosis

Poor performance even on the training set - large bias - underfitting.
Good performance on the training set but poor performance on the test set - large variance - overfitting.

  • Large bias (underfitting)
  • Redesign the model: add more features as input, or consider a more complex model with more parameters
  • Large variance (overfitting)
  • Train with more data
  • Data augmentation: cropping, shifting, ...

Model selection


  • Cross-validation

Cross-validation splits the original training set into two parts: one part is used for training and the other as a validation set.

  • Select among candidate models using the new training/validation split
  • Then retrain the chosen model on the entire training set before touching the test set
  • N-fold cross-validation
  • Split the training set into N folds and train each candidate model N times, holding out a different fold for validation each time
  • Choose the candidate with the best average validation error, then retrain it on the full training set (see the sketch below)
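A minimal N-fold cross-validation sketch, assuming a polynomial-degree selection task of my own choosing (the dataset and candidate degrees are placeholders, not from the course):

```python
import numpy as np
from numpy.polynomial import Polynomial

def n_fold_cv_error(x, y, degree, n_folds=5, seed=0):
    """Average validation MSE of a degree-`degree` polynomial over n folds."""
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, n_folds)
    errors = []
    for k in range(n_folds):
        val = folds[k]                                   # held-out fold
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        model = Polynomial.fit(x[train], y[train], degree)
        errors.append(np.mean((model(x[val]) - y[val]) ** 2))
    return np.mean(errors)                               # average over folds

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 60)

scores = {d: n_fold_cv_error(x, y, d) for d in (1, 3, 5, 9)}
best = min(scores, key=scores.get)   # lowest average validation error wins
print(scores, "-> best degree:", best)
# Finally, retrain the chosen model on the full training set before testing.
```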

2. Gradient descent

Adjusting the learning rate

A learning rate that is too small makes the loss decrease very slowly; one that is too large makes the loss oscillate or even diverge, so it helps to plot the loss against the number of updates while tuning. Beyond hand-tuning, the lecture introduces adaptive schemes such as Adagrad, which divides each parameter's learning rate by the root of the sum of that parameter's squared past gradients.
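As a concrete example, here is a minimal sketch of the Adagrad update; only the update rule follows the lecture, while the quadratic loss is a placeholder of mine:

```python
import numpy as np

# Adagrad: each parameter's step is scaled by 1 / sqrt(sum of its squared
# past gradients). The loss ||w - w*||^2 is a toy placeholder.
def grad(w):
    return 2 * (w - np.array([1.0, -2.0]))    # gradient of ||w - w*||^2

w = np.zeros(2)
eta, eps = 1.0, 1e-8
g_sq_sum = np.zeros_like(w)                   # running sum of squared gradients

for t in range(100):
    g = grad(w)
    g_sq_sum += g ** 2
    w -= eta / (np.sqrt(g_sq_sum) + eps) * g  # per-parameter adaptive step

print(w)  # ≈ [1.0, -2.0], the minimizer
```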

Stochastic gradient descent

Ordinary gradient descent computes the gradient of the total loss over all training examples before each update. Stochastic gradient descent instead updates the parameters using the loss on a single example at a time, trading exact gradients for many more, faster (and noisier) updates.
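A minimal SGD sketch for linear regression (the synthetic data and step size are placeholder choices of mine):

```python
import numpy as np

# Stochastic gradient descent: each update uses the squared-error gradient
# of a single randomly chosen example rather than the full batch.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(0, 0.1, 200)

w = np.zeros(3)
eta = 0.05
for epoch in range(20):
    for i in rng.permutation(len(X)):   # shuffle the examples each epoch
        err = X[i] @ w - y[i]           # residual on one example
        w -= eta * err * X[i]           # single-example gradient step

print(w)  # ≈ [2.0, -1.0, 0.5]
```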

Feature scaling

When input features have very different ranges, the error surface is elongated and the gradient does not point straight at the minimum, so gradient descent zig-zags. Rescaling the features so that every dimension has a comparable range makes the loss contours closer to circles and the updates more direct. A common recipe is standardization: for each input dimension, subtract that dimension's mean and divide by its standard deviation.
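A minimal standardization sketch (the example matrix is a placeholder of mine):

```python
import numpy as np

# Standardize each input dimension to zero mean and unit variance so all
# features end up on a comparable scale.
def standardize(X):
    mean = X.mean(axis=0)        # per-dimension mean over the dataset
    std = X.std(axis=0)          # per-dimension standard deviation
    return (X - mean) / std, mean, std

X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 200.0]])
X_scaled, mean, std = standardize(X)
print(X_scaled.mean(axis=0))     # ≈ [0, 0]
print(X_scaled.std(axis=0))      # ≈ [1, 1]
# At test time, reuse the mean and std computed from the training data.
```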

3. Reference documents

From Datawhale

Li Hongyi's "Machine Learning" open source content 1:
https://linklearner.com/datawhale-homepage/#/learn/detail/93
Li Hongyi's "Machine Learning" open source content 2:
https://github.com/datawhalechina/leeml-notes
Li Hongyi's "Machine Learning" open source content 3:
https://gitee.com/datawhalechina/leeml-notes

From the official sources

Li Hongyi's "Machine Learning" official address
http://speech.ee.ntu.edu.tw/~tlkagk/courses.html
Li Mu's "Hands-On Deep Learning" official address
https://zh-v2.d2l.ai/

From the community

Gradient basic knowledge reference
https://blog.csdn.net/weixin_50967907/article/details/127259554


Summary

1. This course is very beginner-friendly, and the reasoning behind each step is explained well.
2. Unfortunately, many of the concepts are hard to internalize without hands-on practice; pairing the course with some practical exercises is worthwhile, and there will be chances to follow up later.
3. Thanks to the Datawhale friends for their support, especially within the study group. Onward.
