Understanding LSTMs

1. RNNs

  An RNN can be viewed as many copies of the same ordinary network chained together, with each copy passing its output on to the next one.

  Unrolling the RNN across time steps, we get the following figure:

  [Figure: an RNN unrolled through time]
   From this chain structure it is easy to see that RNNs are naturally suited to sequential information.

2. The problem of long-term dependencies

  As the gap between the relevant information and the point where it is needed for a prediction grows, it becomes difficult for RNNs to connect the two.

  LSTMs, however, can solve this problem.

3. LSTM networks

  Long Short Term Memory networks are usually called LSTMs. LSTMs are designed to avoid the long-term dependency problem mentioned above; their essence is the ability to remember information over very long spans.

  All RNNs have the form of a chain of repeated copies of the same module. In a conventional RNN this repeating module is very simple, for example a single tanh layer.
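  To make this concrete, here is a minimal NumPy sketch of one step of such a module; the parameter names (W_xh, W_hh, b_h) and the use of NumPy are illustrative assumptions, not something defined in this post.

    import numpy as np

    def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
        # The entire repeating module of a vanilla RNN is a single
        # tanh layer over the current input and the previous
        # hidden state.
        return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)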

  [Figure: the repeating module in a standard RNN contains a single tanh layer]
   LSTMs have a similar chain structure, but the repeating module is not a single tanh layer; instead it contains four layers that interact in a special way.

  [Figure: the repeating module in an LSTM contains four interacting layers]
   First, the notation used below: x_t denotes the input at time step t, h_t the hidden state (the output at step t), and C_t the cell state.

3.1 The core idea behind LSTMs

  The most critical part of the LSTM is the cell state, which runs through every cell; in the figure it is the horizontal line drawn in green.

  [Figure: the cell state, the horizontal line running through the cell]
  The cell state is like a conveyor belt: the vector runs straight along the entire chain with only a few minor linear interactions, so it is easy for information to pass through the cell unchanged (which is how long-term memory is retained).

  The LSTM adds or removes information from the cell state through structures called gates.

  A gate selectively lets information through. It is built from a sigmoid neural-network layer followed by a pointwise multiplication operation.

  [Figure: a gate, a sigmoid layer followed by a pointwise multiplication]
   The sigmoid layer outputs values between 0 and 1, which act as weights on the corresponding information: 0 means "let nothing through", and 1 means "let all information through".
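  As a sketch of this mechanism (continuing the illustrative NumPy setup above), a gate is a sigmoid layer whose output weights multiply, pointwise, the vector being controlled. The helper apply_gate is hypothetical, purely for illustration:

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def apply_gate(x_t, h_prev, W, b, signal):
        # The sigmoid layer maps the previous hidden state and the
        # current input to weights in (0, 1); 0 blocks the
        # corresponding entry of `signal`, 1 passes it through fully.
        g = sigmoid(np.concatenate([h_prev, x_t]) @ W + b)
        return g * signal

  The three LSTM gates below all follow this same pattern.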

  An LSTM has three such gates to protect and control the cell state: the forget gate layer, the input gate layer, and the output gate layer.

3.2 A step-by-step walk through the LSTM

  3.2.1 The forget gate

    The first step of the LSTM is to decide what information to discard from the cell state. This decision is made by a sigmoid layer called the forget gate layer.

  [Figure: the forget gate layer]
   Take a language model that predicts the next word from all of the preceding context. The cell state might store the gender of the current subject so that the correct pronouns can be used (retaining information); when we start describing a new subject, we want to forget the old subject's gender (forgetting information).
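   Concretely, the forget gate computes f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f), the standard formulation. In the running NumPy sketch (W_f and b_f are illustrative parameter names):

    def forget_gate(x_t, h_prev, W_f, b_f):
        # f_t = sigmoid(W_f . [h_{t-1}, x_t] + b_f)
        # Each entry of f_t in (0, 1) says how much of the
        # corresponding entry of the old cell state to keep.
        return sigmoid(np.concatenate([h_prev, x_t]) @ W_f + b_f)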

  3.2.2 The input gate

  The next step of the LSTM is to decide what new information to add to the cell state. This has two parts: 1. a sigmoid layer called the input gate layer decides which values will be updated; 2. a tanh layer creates a vector of candidate values, C̃_t, to be added to the state.

  [Figure: the input gate layer and the tanh candidate layer]
   In our language model, this is where the gate adds the new subject's gender information to the cell state, replacing the old information being forgotten.
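  In the running sketch, the two parts of this step look as follows (W_i, b_i, W_C, b_C are again illustrative names):

    def input_gate(x_t, h_prev, W_i, b_i, W_C, b_C):
        z = np.concatenate([h_prev, x_t])
        # i_t = sigmoid(W_i . z + b_i): which entries get updated.
        i_t = sigmoid(z @ W_i + b_i)
        # C~_t = tanh(W_C . z + b_C): candidate new values.
        C_tilde = np.tanh(z @ W_C + b_C)
        return i_t, C_tilde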

  With the structures above, the old cell state C_{t-1} can now be updated to the new cell state C_t.
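  The update itself is just two pointwise operations on the conveyor belt; in the sketch:

    def update_cell_state(C_prev, f_t, i_t, C_tilde):
        # C_t = f_t * C_{t-1} + i_t * C~_t: forget part of the old
        # state, then add the gated candidate values.
        return f_t * C_prev + i_t * C_tilde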

  [Figure: updating the cell state from C_{t-1} to C_t]
  3.2.3 The output gate

  Finally, we need to decide what to output. The output depends on the cell state C_t, but is a filtered version of it.

  First, a sigmoid layer decides which parts of C_t will be output. Then C_t is passed through a tanh layer (squashing its values to between -1 and 1), and the result is multiplied by the output of the sigmoid layer to give the final output.
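  In the sketch, this step reads (W_o and b_o are illustrative names):

    def output_gate(x_t, h_prev, C_t, W_o, b_o):
        # o_t = sigmoid(W_o . [h_{t-1}, x_t] + b_o)
        o_t = sigmoid(np.concatenate([h_prev, x_t]) @ W_o + b_o)
        # h_t = o_t * tanh(C_t): squash the cell state into
        # (-1, 1), then let the output gate filter it.
        return o_t * np.tanh(C_t)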

  In the language model, when the model has just seen a pronoun, it may want to output a verb next, and that output is closely tied to the pronoun's information, such as whether the verb should take its singular or plural form; so we only want to output the pronoun-related information from the cell state in order to make correct predictions.
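  Putting the pieces together, one full LSTM step in this running sketch looks like this (the params dict and its keys are assumptions of the sketch):

    def lstm_step(x_t, h_prev, C_prev, params):
        f_t = forget_gate(x_t, h_prev, params["W_f"], params["b_f"])
        i_t, C_tilde = input_gate(x_t, h_prev, params["W_i"],
                                  params["b_i"], params["W_C"],
                                  params["b_C"])
        C_t = update_cell_state(C_prev, f_t, i_t, C_tilde)
        h_t = output_gate(x_t, h_prev, C_t, params["W_o"], params["b_o"])
        return h_t, C_t

  Iterating lstm_step over a sequence while carrying (h_t, C_t) forward reproduces the chain structure from section 1, with the cell state acting as the long-term conveyor belt.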

  [Figure: the output gate layer]
Origin www.cnblogs.com/wengyuyuan/p/11795819.html