RNN (Recurrent Neural Network): A Preliminary Study

Without further ado, this article takes a first look at recurrent neural networks: their origin, applications, main purpose, the problems they run into, and the solutions to those problems. The details are as follows:

 

  • Origin: Recurrent neural networks trace back to the network proposed by Hopfield in 1982; the core idea is to use historical information to help with the current decision.
  • Application: Traditional machine learning algorithms rely heavily on hand-crafted features, so image recognition, speech recognition, and natural language processing built on them hit a bottleneck at feature extraction. Fully connected neural networks, for their part, have too many parameters and cannot exploit the temporal information in the data. As more effective recurrent architectures were introduced, the recurrent neural network's ability to mine temporal information and to learn deep representations of its input was put to full use, leading to breakthroughs in speech recognition, language modeling, machine translation, and time-series analysis.
  • Structure: A key concept in recurrent neural networks is time: at each time step, the network produces an output for the current input combined with the influence of the model's current state. As shown in Figure 1, the main module A of the recurrent neural network receives not only the input X_t from the input layer but also a recurrent edge that feeds in the current state. At each time step t, module A reads the input X_t and the state passed along from the previous step, outputs a value h_t, and passes its state on to the next step. In theory, a recurrent neural network can therefore be regarded as infinitely many replicas of the same network structure. In practice, for reasons of optimization, a truly infinite loop cannot be realized, so the loop body is generally unrolled over time; the unrolled structure is shown in Figure 2, and a minimal sketch of one such unrolled cell appears in the code after this list.
  • Figure 1: Schematic of the classic structure of the recurrent neural network
  • Figure 2: The structure of the recurrent neural network unrolled through time
  • Main purpose: processing and predicting sequence data.
  • Problems encountered: In text mining in particular, the gap between the position to be predicted and the relevant information can grow large; in short, long-term dependencies arise, which a plain recurrent network handles poorly.
  • Solution: To solve the long-term dependency problem, Sepp Hochreiter and Jürgen Schmidhuber proposed the long short-term memory (LSTM) architecture in 1997. By using "gate" structures that let information selectively affect the state at each time step, the LSTM can decide more effectively which information to forget and which to retain; a sketch of one LSTM step also appears after this list.
  • Structure: The input gate and the forget gate are the core of the LSTM.
  • Forget gate: Its main function is to let the recurrent neural network "forget" information that is no longer useful.
  • Principle: Based on the current input X_t, the previous state c_{t-1}, and the previous output h_{t-1}, the network decides which part of the memory should be forgotten.
  • Input gate: After the recurrent neural network has "forgotten" part of its previous state, it also needs to take in the latest memory from the current input.
  • Principle: Based on X_t, c_{t-1}, and h_{t-1}, the input gate decides which part of the new information enters the current state c_t.
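
To make the unrolled structure of Figure 2 concrete, here is a minimal NumPy sketch of a single-layer RNN cell stepped through a short sequence. All names and sizes here (rnn_step, the weight shapes, the toy dimensions) are illustrative assumptions, not code from the original article.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step: combine the current input x_t with the previous state h_prev."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Toy dimensions, chosen only for illustration: 4-dim inputs, 8-dim state, 5 steps.
input_dim, hidden_dim, T = 4, 8, 5
rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(input_dim, hidden_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

xs = rng.normal(size=(T, input_dim))  # the input sequence X_1 .. X_T
h = np.zeros(hidden_dim)              # initial state of module A

for t in range(T):
    # The same weights are reused at every step: this is the "replication
    # of the same structure" that unrolling makes explicit.
    h = rnn_step(xs[t], h, W_xh, W_hh, b_h)  # h_t is both the output and the next state
    print(f"t={t}: h_t[:3] = {h[:3]}")
```

Because W_xh and W_hh are shared across all time steps, unrolling the loop for T steps is exactly equivalent to copying the same cell T times, as Figure 2 depicts.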

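In the same spirit, here is a minimal sketch of one LSTM step with the standard forget, input, and output gates; the names and dimensions are again illustrative assumptions. Note that this common formulation computes the gates from X_t and h_{t-1} only, whereas the description above also conditions on c_{t-1} (peephole-style connections), which is omitted here for simplicity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step: forget, input, and output gates plus a candidate memory."""
    z = np.concatenate([x_t, h_prev]) @ W + b  # all four gate pre-activations at once
    f, i, o, g = np.split(z, 4)
    f = sigmoid(f)             # forget gate: what to erase from c_prev
    i = sigmoid(i)             # input gate: what new information to admit
    o = sigmoid(o)             # output gate: what part of the state to expose
    g = np.tanh(g)             # candidate new memory built from x_t and h_prev
    c_t = f * c_prev + i * g   # selectively forget, then selectively add
    h_t = o * np.tanh(c_t)     # output h_t at this time step
    return h_t, c_t

# Toy dimensions, chosen only for illustration.
input_dim, hidden_dim = 4, 8
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(input_dim + hidden_dim, 4 * hidden_dim))
b = np.zeros(4 * hidden_dim)

h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for t in range(5):
    h, c = lstm_step(rng.normal(size=input_dim), h, c, W, b)
print("final h_t[:3] =", h[:3])
```

Because the forget gate f and input gate i take values between 0 and 1, the cell state c_t can carry information across many steps largely unchanged, which is what lets the LSTM cope with long-term dependencies.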
 


Source: blog.csdn.net/jinhao_2008/article/details/78700064