Recurrent Neural Networks: Vanishing and Exploding Gradients

2019-08-27 15:42:00

Problem Description: Why do recurrent neural networks suffer from vanishing or exploding gradients, and what improvements address these problems?

Solution:

A recurrent neural network can be trained with BPTT (Back Propagation Through Time), which is essentially a simple variant of the standard back-propagation algorithm. If the recurrent network is unrolled over time into a T-layer feed-forward network, BPTT is no different from ordinary back-propagation on that unrolled graph. One of the original design goals of recurrent neural networks is to capture long-distance dependencies between inputs; a minimal sketch of the unrolled computation and its gradients follows below.
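As a concrete illustration (not taken from the original post), here is a minimal NumPy sketch of BPTT on a vanilla tanh RNN unrolled for T steps. The variable names, dimensions, and the squared-error loss on the final hidden state are all illustrative assumptions.

import numpy as np

# Minimal BPTT sketch for a vanilla RNN: h_t = tanh(W_x x_t + W_h h_{t-1}).
# Loss is squared error between the final hidden state and a target vector.
T, d_in, d_h = 8, 3, 5
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.3, size=(d_h, d_in))
W_h = rng.normal(scale=0.3, size=(d_h, d_h))
xs = rng.normal(size=(T, d_in))
target = rng.normal(size=d_h)

# Forward pass: unroll the recurrence over T time steps.
hs = [np.zeros(d_h)]
for t in range(T):
    hs.append(np.tanh(W_x @ xs[t] + W_h @ hs[-1]))

# Backward pass (BPTT): walk the unrolled graph from t = T back to t = 1,
# exactly as ordinary back-propagation would on a T-layer feed-forward net.
dW_x = np.zeros_like(W_x)
dW_h = np.zeros_like(W_h)
dh = hs[-1] - target                       # dL/dh_T for the squared-error loss
for t in reversed(range(T)):
    da = dh * (1.0 - hs[t + 1] ** 2)       # back through the tanh nonlinearity
    dW_x += np.outer(da, xs[t])            # accumulate shared-weight gradients
    dW_h += np.outer(da, hs[t])
    dh = W_h.T @ da                        # gradient flowing back to h_{t-1}

Because the weights W_x and W_h are shared across time steps, their gradients are accumulated over every step of the unrolled network, which is the only real difference from back-propagation in a plain feed-forward model.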

From a structural point of view, recurrent neural networks should be able to do this. In practice, however, networks trained with BPTT often fail to capture long-distance dependencies between inputs, and this failure is caused mainly by vanishing gradients.
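The mechanism can be made precise with a standard derivation (the notation here is assumed, not taken from the original post). For a vanilla RNN with hidden state h_t = tanh(W_x x_t + W_h h_{t-1}), the gradient of the loss L that reaches an early state h_k is a product of per-step Jacobians:

\frac{\partial L}{\partial h_k}
  = \frac{\partial L}{\partial h_T} \prod_{t=k+1}^{T} \frac{\partial h_t}{\partial h_{t-1}}
  = \frac{\partial L}{\partial h_T} \prod_{t=k+1}^{T} \operatorname{diag}\!\left(1 - h_t^{2}\right) W_h

Each factor contains W_h and a tanh derivative bounded by 1, so when the relevant singular values of W_h stay below 1 the product shrinks exponentially in T - k (vanishing gradient), and when they exceed 1 it can grow exponentially (exploding gradient). This is why weight updates end up dominated by nearby time steps, and distant inputs contribute almost nothing.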

 

Origin: www.cnblogs.com/TIMHY/p/11418914.html