A summary of neural network optimization methods
As a neural network goes from shallow to deep, simply stacking more layers does not by itself improve the model's performance. This post summarizes some common optimization techniques:
Neural Network Optimization I: using regularization to improve the model's generalization ability
Common regularization methods:
- L1, L2 regularization
- dropout regularization
- Data augmentation to increase the number of training samples
- Early stopping to choose an appropriate number of training iterations
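Of these, dropout is the easiest to sketch in code. Below is a minimal NumPy sketch of inverted dropout; the function name `dropout_forward` and the keep probability are illustrative, not from the original:

```python
import numpy as np

def dropout_forward(a, keep_prob=0.8, rng=None):
    """Inverted dropout: randomly zero units, then divide by keep_prob
    so the expected value of the activations is unchanged."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(a.shape) < keep_prob  # keep each unit with prob keep_prob
    return (a * mask) / keep_prob

# Apply to a batch of activations
a = np.ones((4, 5))
a_drop = dropout_forward(a, keep_prob=0.8, rng=np.random.default_rng(0))
# surviving entries are scaled to 1/0.8 = 1.25, dropped entries are 0
```

Because of the 1/keep_prob rescaling, no change is needed at test time: dropout is simply switched off.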
Neural Network Optimization II: gradient optimization
Common gradient methods:
- Gradient descent
  - Batch gradient descent
  - Mini-batch gradient descent
  - Stochastic gradient descent (SGD)
- Gradient descent with momentum (Momentum GD)
- Nesterov momentum
- AdaGrad
- RMSprop
- Adam
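The momentum and Adam updates in the list above can be sketched as single-step NumPy functions. This is a minimal sketch: the function names, learning rates, and the toy objective f(w) = w² are illustrative assumptions, not from the original.

```python
import numpy as np

def momentum_step(w, grad, v, lr=0.01, beta=0.9):
    """One step of gradient descent with momentum: v is an
    exponentially weighted moving average of past gradients."""
    v = beta * v + (1 - beta) * grad
    return w - lr * v, v

def adam_step(w, grad, m, s, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: first-moment (m) and second-moment (s) estimates
    with bias correction; t is the 1-based iteration count."""
    m = beta1 * m + (1 - beta1) * grad
    s = beta2 * s + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(s_hat) + eps), m, s

# Minimize f(w) = w**2 (gradient 2w) with each optimizer
w_m = np.array([5.0]); v = np.zeros_like(w_m)
for _ in range(200):
    w_m, v = momentum_step(w_m, 2 * w_m, v, lr=0.1)

w_a = np.array([5.0]); m = np.zeros_like(w_a); s = np.zeros_like(w_a)
for t in range(1, 1001):
    w_a, m, s = adam_step(w_a, 2 * w_a, m, s, t, lr=0.05)
```

Both runs drive w toward the minimum at 0; Adam combines the momentum idea (m) with RMSprop's per-parameter scaling (s), which is why its parameter list is β1, β2, ε.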
Neural Network Optimization III: network initialization tricks and hyperparameter tuning
- Network initialization tricks
  - Normalize the network inputs (so every feature has the same range of values)
  - Weight initialization
    - Set the variance of the weights W[l] to 1/n[l-1] (Xavier initialization, suited to tanh):
      W[l] = np.random.randn(n[l], n[l-1]) * np.sqrt(1/n[l-1])
    - Set the variance of the weights W[l] to 2/n[l-1] (He initialization, suited to ReLU):
      W[l] = np.random.randn(n[l], n[l-1]) * np.sqrt(2/n[l-1])
    - Set the variance of the weights W[l] to 2/(n[l-1]+n[l]) (Glorot variant):
      W[l] = np.random.randn(n[l], n[l-1]) * np.sqrt(2/(n[l-1]+n[l]))
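Putting the np.sqrt(2/n[l-1]) rule above into a loop over all layers gives a complete initializer. A minimal sketch; the function name `init_params` and the layer sizes are illustrative:

```python
import numpy as np

def init_params(layer_dims, rng=None):
    """He initialization: W[l] ~ N(0, 2/n[l-1]), biases start at zero.
    layer_dims = [n0, n1, ..., nL] gives the size of each layer."""
    rng = np.random.default_rng() if rng is None else rng
    params = {}
    for l in range(1, len(layer_dims)):
        params[f"W{l}"] = (rng.standard_normal((layer_dims[l], layer_dims[l - 1]))
                           * np.sqrt(2 / layer_dims[l - 1]))
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

# A 784-128-64-10 network (sizes are illustrative)
params = init_params([784, 128, 64, 10], rng=np.random.default_rng(0))
```

Scaling by the fan-in keeps the variance of each layer's pre-activations roughly constant, which is what prevents activations and gradients from exploding or vanishing as depth grows.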
- Hyperparameter tuning
  Deeper networks have more hyperparameters to tune; common ones include:
  - Learning rate α
  - Momentum factor β for gradient descent with momentum
  - Adam parameters β1, β2, ε
  - Number of network layers
  - Number of neurons in each hidden layer
  - Learning rate decay parameters
  - Mini-batch size (number of samples per training batch)
  - L1/L2 regularization factor λ
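When searching over these hyperparameters, random search with log-scale sampling is a common recommendation for multiplicative parameters like α and β. A small sketch; the ranges, function name, and the fixed set of batch sizes are illustrative assumptions:

```python
import numpy as np

def sample_hyperparams(rng):
    """One random-search draw: α and (1-β) are sampled on a log scale
    because their effect on training is multiplicative."""
    alpha = 10 ** rng.uniform(-4, -1)            # learning rate in [1e-4, 1e-1)
    beta = 1 - 10 ** rng.uniform(-3, -1)         # momentum in (0.9, 0.999]
    batch_size = int(2 ** rng.integers(5, 10))   # 32, 64, ..., 512
    return {"alpha": alpha, "beta": beta, "batch_size": batch_size}

rng = np.random.default_rng(0)
trials = [sample_hyperparams(rng) for _ in range(20)]
```

Sampling β via 1 - 10^u concentrates draws near 1 (0.9, 0.99, 0.999), where small changes matter most; sampling β uniformly in [0.9, 1) would waste most trials on nearly identical behavior.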