1. Neural network training process
- Forward propagation: compute the loss
- Backpropagation: compute gradients and update the parameters
- Repeat until the loss is minimized
Model training is the process of guiding the model to update its parameters through optimization algorithms such as SGD and Adam.
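The loop described above can be sketched for a one-parameter linear model. The data, weight, and learning rate below are illustrative assumptions, not taken from the text:

```python
# Minimal training loop sketch: forward pass -> loss -> gradient -> update,
# repeated until the loss is small. Model: y_hat = w * x, squared-error loss.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (x, y) pairs with true w = 2
w = 0.0    # the single trainable weight
lr = 0.05  # learning rate (eta)

for epoch in range(200):
    grad = 0.0
    for x, y in data:
        y_hat = w * x            # forward propagation
        grad += (y_hat - y) * x  # dL/dw of 0.5*(y_hat - y)^2, by the chain rule
    w -= lr * grad               # SGD parameter update

print(round(w, 3))  # w converges close to 2.0
```

Each epoch multiplies the error (w − 2) by a fixed contraction factor, so the weight converges geometrically to the minimizer.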
2. Forward propagation
Forward propagation computes and stores the output of each layer of the neural network, that is, the intermediate variables, in order from input to output.
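As a sketch, a forward pass through a tiny two-layer scalar network can save every intermediate variable in a cache. The network shape, `sigmoid` activation, and `cache` naming are illustrative assumptions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    """Compute the output in input-to-output order, storing intermediates."""
    cache = {}
    cache["x"] = x
    cache["z1"] = w1 * x                 # pre-activation of layer 1
    cache["a1"] = sigmoid(cache["z1"])   # activation of layer 1
    cache["z2"] = w2 * cache["a1"]       # linear output layer
    return cache["z2"], cache

out, cache = forward(x=1.0, w1=0.5, w2=2.0)
```

The returned `cache` holds exactly the intermediate variables that backpropagation will need later.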
3. Backpropagation
Backward propagation computes and stores the gradients of the neural network's intermediate variables and parameters, in order from output to input. Its principle is the chain rule.
The intermediate values stored in the forward pass are reused during backpropagation to avoid recomputation. Because these intermediate results must be preserved, model training requires more memory (GPU memory) than model inference.
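Continuing the sketch above, the backward pass applies the chain rule from output to input, reading each needed value out of the stored cache instead of recomputing it. The two-layer scalar network and squared-error loss are illustrative assumptions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    z1 = w1 * x
    a1 = sigmoid(z1)
    z2 = w2 * a1                  # model output
    return z2, {"x": x, "a1": a1, "w2": w2}

def backward(z2, y, cache):
    """Chain rule from output to input, reusing forward-pass intermediates."""
    dz2 = z2 - y                                  # dL/dz2 for L = 0.5*(z2 - y)^2
    dw2 = dz2 * cache["a1"]                       # reuses stored a1
    da1 = dz2 * cache["w2"]
    dz1 = da1 * cache["a1"] * (1 - cache["a1"])   # sigmoid'(z1) from stored a1
    dw1 = dz1 * cache["x"]                        # reuses stored input
    return dw1, dw2

z2, cache = forward(x=1.0, w1=0.5, w2=2.0)
dw1, dw2 = backward(z2, 1.0, cache)
```

Every quantity read from `cache` is exactly why the forward intermediates must stay in memory during training: dropping them would force a second forward computation.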
4. Gradient descent
Update the weights and biases along the direction of gradient descent, i.e. the negative gradient; moving along the gradient direction itself would increase the loss. The update rule is w ← w − η·∂L/∂w, where η is the learning rate.
The learning rate can be neither too large nor too small: too large and the updates overshoot and oscillate or diverge, too small and convergence is very slow, as the figure below shows:
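The effect of the learning rate can be demonstrated numerically. The quadratic loss L(w) = (w − 3)² and the two values of η below are illustrative assumptions:

```python
def descend(eta, steps=50, w=0.0):
    """Run gradient descent on L(w) = (w - 3)^2 with learning rate eta."""
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # dL/dw
        w = w - eta * grad      # step against the gradient
    return w

good = descend(eta=0.1)  # converges toward the minimum at w = 3
bad = descend(eta=1.1)   # each step overshoots further: divergence
print(abs(good - 3.0) < 1e-3, abs(bad - 3.0) > 1e3)  # prints: True True
```

Here each update multiplies the error (w − 3) by (1 − 2η), so |1 − 2η| < 1 converges and |1 − 2η| > 1 blows up, which is precisely the too-large versus well-chosen learning rate trade-off.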