Overview of the neural network training process

1. Neural network training process

  • Forward propagation: compute the output and the loss
  • Backpropagation: compute gradients and update the parameters
  • Repeat until the loss is minimized

Model training is the process of using an optimization algorithm such as SGD or Adam to iteratively update the model's parameters.
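The three steps above can be sketched as a minimal training loop. This is a toy example on synthetic linear data with plain SGD; all names and values here are illustrative, not taken from the original post.

```python
import numpy as np

# Synthetic data: y = x @ true_w + true_b (noiseless, for illustration)
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))
true_w, true_b = np.array([1.5, -2.0, 0.5]), 0.3
y = x @ true_w + true_b

w, b = np.zeros(3), 0.0
lr = 0.1  # learning rate
for step in range(200):
    # 1) forward propagation: compute predictions and the loss
    pred = x @ w + b
    loss = np.mean((pred - y) ** 2)
    # 2) backpropagation: gradients of the MSE loss w.r.t. w and b
    grad_pred = 2 * (pred - y) / len(y)
    grad_w = x.T @ grad_pred
    grad_b = grad_pred.sum()
    # 3) update parameters along the negative gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(loss)  # the loss shrinks toward 0 as training repeats
```

A framework such as PyTorch automates steps 1 and 2 (autograd) and step 3 (its optimizers), but the loop structure is the same.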

2. Forward propagation

  Forward propagation computes and stores the results of each layer of the neural network (that is, the intermediate variables) in order from input to output.

[Figure: network structure and forward computation graph]
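A minimal sketch of such a forward pass, for an assumed tiny two-layer MLP (the layer sizes and names are illustrative): each intermediate result is computed in input-to-output order and kept in a cache.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))               # batch of 4 inputs, 3 features
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)

cache = {}                                # intermediate variables, stored for backprop
cache["z1"] = x @ W1 + b1                 # first linear layer
cache["h1"] = np.maximum(0, cache["z1"])  # ReLU activation
cache["z2"] = cache["h1"] @ W2 + b2       # second linear layer (the output)

print(cache["z2"].shape)
```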

3. Backpropagation

  Backward propagation computes and stores the gradients of the neural network's intermediate variables and parameters in order from output to input. It is based on the chain rule.

  The intermediate values stored in the forward pass are reused during backpropagation to avoid recomputation. Because these intermediate results must be kept, model training requires more memory (GPU memory) than model inference.
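As a sketch, here is backpropagation through an assumed tiny two-layer MLP with an MSE loss. The backward pass applies the chain rule from output to input and reuses the intermediates `z1`, `h1`, `z2` cached during the forward pass; a finite-difference check confirms one gradient entry. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
y = rng.normal(size=(4, 2))
W1, b1 = rng.normal(size=(3, 5)) * 0.5, np.zeros(5)
W2, b2 = rng.normal(size=(5, 2)) * 0.5, np.zeros(2)

# forward pass: compute and store the intermediates
z1 = x @ W1 + b1
h1 = np.maximum(0, z1)           # ReLU
z2 = h1 @ W2 + b2
loss = np.mean((z2 - y) ** 2)

# backward pass: chain rule from output to input, reusing z1, h1
dz2 = 2 * (z2 - y) / z2.size     # dL/dz2
dW2 = h1.T @ dz2                 # reuses h1 from the forward pass
db2 = dz2.sum(axis=0)
dh1 = dz2 @ W2.T
dz1 = dh1 * (z1 > 0)             # reuses z1 (the ReLU mask)
dW1 = x.T @ dz1
db1 = dz1.sum(axis=0)

# numerical check of one gradient entry against a finite difference
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
lossp = np.mean((np.maximum(0, x @ W1p + b1) @ W2 + b2 - y) ** 2)
print(abs((lossp - loss) / eps - dW1[0, 0]) < 1e-4)  # True
```

Recomputing `h1` and `z1` in the backward pass would double the forward work; caching them trades memory for speed, which is why training needs more memory than inference.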

4. Gradient descent

  The loss function increases along the gradient direction, so the weights and biases are updated in the opposite direction, i.e. along the negative gradient: w ← w − η · ∂L/∂w, where η is the learning rate.


  The learning rate cannot be too large or too small: too large, and the updates overshoot the minimum and may diverge; too small, and convergence is very slow.
[Figure: learning rate too large vs. too small]
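The effect of the learning rate can be sketched on the toy function f(w) = w², whose gradient is 2w (a function chosen here purely for illustration):

```python
def descend(lr, steps=20, w0=1.0):
    """Run gradient descent on f(w) = w**2 and return the final w."""
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w          # w <- w - eta * f'(w)
    return w

print(abs(descend(0.1)))    # moderate eta: converges toward the minimum at 0
print(abs(descend(0.001)))  # too small: barely moves after 20 steps
print(abs(descend(1.5)))    # too large: |w| grows each step, the iteration diverges
```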

Origin blog.csdn.net/python_plus/article/details/130715981