Table of Contents
1.3 Learning rate: a hyperparameter (manually configurable)
1.6 Network optimization and hyperparameter selection
1.6.1 How to choose hyperparameters
Use dropout to prevent overfitting
Why can Dropout solve overfitting?
Theoretical knowledge
1.1 Multilayer perceptron
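The note itself does not include the model code; as a minimal sketch of a multilayer perceptron in tf.keras (the 784-dimensional input, e.g. a flattened 28x28 image, and the layer widths are illustrative assumptions, not values from the original note):

```python
import tensorflow as tf

# A minimal multilayer perceptron: a stack of fully connected
# (Dense) layers with nonlinear activations.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),             # flattened input vector
    tf.keras.layers.Dense(64, activation='relu'),    # hidden layer 1
    tf.keras.layers.Dense(64, activation='relu'),    # hidden layer 2
    tf.keras.layers.Dense(10, activation='softmax')  # class probabilities
])
model.summary()
```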
1.2 Gradient descent method
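For reference, the standard update rule of gradient descent (the notation is mine, not from the original note): each step moves the parameters against the gradient of the loss, scaled by the learning rate $\eta$ covered in section 1.3:

$$\theta_{t+1} = \theta_t - \eta \, \nabla_\theta L(\theta_t)$$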
1.3 Learning rate: a hyperparameter (manually configurable)
An inappropriate learning rate: too large and training diverges or oscillates; too small and convergence is very slow
Local extreme points: gradient descent can also get stuck at a local extremum instead of the global minimum
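The learning rate is set by hand when constructing the optimizer. A small sketch in tf.keras (the optimizer choice and the specific values are illustrative, not from the note):

```python
import tensorflow as tf

# The learning rate is passed manually to the optimizer.
# Typical symptoms of a poor choice (values are illustrative):
opt_too_large  = tf.keras.optimizers.SGD(learning_rate=1.0)   # may diverge or oscillate
opt_too_small  = tf.keras.optimizers.SGD(learning_rate=1e-6)  # very slow, may stall near a local extremum
opt_reasonable = tf.keras.optimizers.SGD(learning_rate=0.01)  # a common starting point
```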
1.4 Backpropagation algorithm
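For reference, the chain-rule recursion that backpropagation implements (standard notation, assuming pre-activations $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ and activations $a^{(l)} = \sigma(z^{(l)})$; these symbols are mine, not from the original note):

$$\delta^{(l)} = \big((W^{(l+1)})^{\top} \delta^{(l+1)}\big) \odot \sigma'\big(z^{(l)}\big), \qquad \frac{\partial L}{\partial W^{(l)}} = \delta^{(l)} \big(a^{(l-1)}\big)^{\top}$$

The error signal $\delta$ is computed layer by layer from the output backwards, which is what gives the algorithm its name.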
1.5 Optimizers
- SGD (stochastic gradient descent)
- Adam optimizer
Common parameters: the learning rate and the moment-decay rates beta_1 and beta_2
- RMSprop
Code
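The original code block does not survive in this copy; as a hedged sketch of how these three optimizers are constructed in tf.keras (the parameter values shown are the library defaults):

```python
import tensorflow as tf

# SGD: plain stochastic gradient descent
sgd = tf.keras.optimizers.SGD(learning_rate=0.01)

# Adam: common parameters are the learning rate and the two
# moment-decay rates beta_1 / beta_2 (defaults shown)
adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)

# RMSprop: keeps a moving average of squared gradients
rmsprop = tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9)

# An optimizer is attached to a model at compile time, e.g.:
# model.compile(optimizer=adam,
#               loss='sparse_categorical_crossentropy',
#               metrics=['accuracy'])
```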
1.6 Network optimization and hyperparameter selection
1.6.1 How to choose hyperparameters
So how do we improve the fitting ability of the network? Increase its capacity: add more layers, or add more neurons per layer.
Note:
The number of neurons in a single layer should not be too small: too few neurons create an information bottleneck and make the model underfit.
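A hypothetical illustration of the bottleneck (model and widths are mine, not from the note): if a narrow layer sits between wide ones, everything the network can express must squeeze through it, and the model underfits no matter how wide the other layers are:

```python
import tensorflow as tf

# Anti-pattern (illustrative): the 2-unit middle layer is an
# information bottleneck for the whole network.
bottlenecked = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(2, activation='relu'),   # <- bottleneck
    tf.keras.layers.Dense(10, activation='softmax'),
])
```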
About 130,000 trainable parameters
The results improved
Judging from the accuracy (a diagnostic sketch follows the list below):
- Underfitting: scores are low on both the training data and the test data
- Overfitting: the score is low on the test data, but high on the training data
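A sketch of both steps in tf.keras (layer widths and data names are illustrative assumptions): widen the layers to raise capacity, read the trainable-parameter count from model.summary() (on the order of the 130,000 mentioned above), then compare training and test accuracy to tell underfitting from overfitting:

```python
import tensorflow as tf

# Larger-capacity model: with a 784-dim input, two 128-unit hidden
# layers give roughly 1e5 trainable parameters.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.summary()  # prints the trainable parameter count

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Diagnose the fit by comparing the two accuracies
# (x_train/y_train, x_test/y_test are placeholders for your data):
# train_acc = model.evaluate(x_train, y_train)[1]
# test_acc  = model.evaluate(x_test, y_test)[1]
# low train_acc and low test_acc   -> underfitting
# high train_acc but low test_acc  -> overfitting
```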
Use dropout to prevent overfitting
Dropout is related to random forests and other ensemble methods: each training step trains a different randomly thinned sub-network, so the full network behaves like an averaged ensemble of many smaller networks.
Dropout is mentioned in the AlexNet paper (Krizhevsky et al., 2012).
Why can Dropout solve overfitting?
The network cannot develop a preference for (over-rely on) any particular neuron, since any neuron may be dropped during training.
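Mechanically, dropout zeroes a random subset of activations at each training step and rescales the survivors so the expected value is unchanged. A minimal NumPy sketch of "inverted dropout" (the rate and shapes are illustrative; this is a teaching sketch, not the note's own code):

```python
import numpy as np

def dropout(activations, rate=0.5, training=True):
    """Inverted dropout: zero each unit with probability `rate` and
    rescale the survivors by 1/(1-rate) so the expected activation
    is unchanged."""
    if not training or rate == 0.0:
        return activations  # at inference time, dropout is a no-op
    keep_prob = 1.0 - rate
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.ones((4, 8))          # a batch of activations (illustrative)
print(dropout(a, rate=0.5))  # roughly half the entries are zeroed
```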
Therefore, our training principle:
First make sure capacity is not a problem (the network is able to overfit the data), then take measures to suppress overfitting.
The best way to suppress overfitting is to increase the amount of training data.
Hyperparameter tuning is largely a matter of experience
General principles of network construction
General principle: ensure that the neural network capacity is sufficient to fit the data
1. Increase the network capacity until it overfits
2. Take measures to suppress overfitting
3. Continue increasing the network capacity until it overfits again
1.7 Hands-on practice
How to add a Dropout layer to the network
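A minimal sketch in tf.keras (the rate and layer widths are illustrative assumptions): a Dropout layer is inserted between Dense layers and is only active during training:

```python
import tensorflow as tf

# Dropout layers are placed after the layers whose outputs they drop.
# rate=0.5 means each unit is zeroed with probability 0.5 during
# training; at inference time Dropout is automatically disabled.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax'),
])
```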
Reducing the network size is another way to suppress overfitting.