Vanishing and Exploding Gradients in Deep Neural Networks

        When a neural network has many layers, it is prone to vanishing or exploding gradients. What are exploding and vanishing gradients? At the start of training the parameters are random, so the loss can be very large; during backpropagation the gradient is a product of many layer-wise terms, so it can grow extremely large (explode) or shrink toward zero (vanish), and in either case the network cannot converge. A common remedy for exploding gradients is gradient clipping.
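A minimal NumPy sketch (not from the original post) of why depth causes this: the backpropagated gradient is a product of many layer Jacobians, so its norm tends either to shrink toward zero or to blow up as the number of layers grows. The dimensions, layer count, and weight scales below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_layers = 64, 50

# Small weights -> vanishing gradient, large weights -> exploding gradient.
for scale in (0.5, 1.5):
    grad = np.ones(dim)
    for _ in range(n_layers):
        W = rng.normal(0.0, scale / np.sqrt(dim), size=(dim, dim))
        grad = W.T @ grad  # chain rule: multiply by one layer's Jacobian
    print(f"scale={scale}: gradient norm after {n_layers} layers = {np.linalg.norm(grad):.3e}")
```

Running this prints a norm on the order of 1e-15 for the small-weight case and around 1e8 for the large-weight case, which is the vanishing/exploding behavior described above.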

        Gradient clipping comes in two main forms: ordinary (element-wise value) clipping and L2-norm clipping, and each can be applied globally (to all gradients at once) or partially (to individual gradient tensors).
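A hedged sketch of these clipping variants using PyTorch utilities; the original post does not name a framework, and the toy model and the thresholds (0.5, 1.0) here are illustrative assumptions. In practice you would pick one variant; they are shown in sequence only for demonstration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# 1) Value clipping: clamp every gradient element into [-0.5, 0.5].
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=0.5)

# 2) Global L2-norm clipping: rescale all gradients together so the norm of the
#    concatenated gradient vector does not exceed 1.0.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

# 3) Partial (per-tensor) L2-norm clipping: clip each parameter's gradient
#    separately instead of using one global norm.
for p in model.parameters():
    if p.grad is not None:
        torch.nn.utils.clip_grad_norm_([p], max_norm=1.0)

opt.step()
```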
