LSTM (Long Short-Term Memory)

Why LSTM?

        To improve a recurrent neural network's ability to capture long-range dependencies, an effective solution is to introduce a gating mechanism that controls the rate at which information accumulates: selectively adding new information and selectively forgetting previously accumulated information. Networks of this type are called gated recurrent neural networks (Gated RNNs). Among the many gated RNN variants, the most classic is the long short-term memory network, or LSTM.

Why is it called LSTM (long short-term memory)?


        The name LSTM (Long Short-Term Memory) looks a bit strange: "long short-term memory". It actually reflects the principle behind the algorithm: controlling how long memories are kept. The human brain, for example, does not remember all information equally; it has both short-term memory and long-term memory. LSTM is designed around the same idea.

How to understand memory on a deeper level:

LSTM refers to a long "short-term memory". In a recurrent network, the hidden state acts as a short-term memory: it is rewritten at every time step. The network weights, updated slowly during training, act as long-term memory. LSTM's memory cell sits in between: it can retain information for much longer than the ordinary hidden state, yet far shorter than the lifetime of the weights; hence, a long short-term memory.

What is the LSTM gating mechanism?

        
        In a digital circuit, a gate is a binary variable in {0, 1}: 0 represents a closed state in which no information is allowed to pass, and 1 represents an open state in which all information is allowed to pass.

        The LSTM network introduces a gating mechanism (Gating Mechanism) to control the paths along which information is transmitted. Unlike the binary gates of a digital circuit, LSTM's gates take "soft" values between 0 and 1, so information can pass through partially.
        An ordinary RNN maintains only short-term memory; LSTM adds a memory cell and the machinery for managing it. The control logic of this added part is more complicated. Put simply, the added memory cell is controlled by three gates: the forget gate, the input gate, and the output gate. Note that these added gates are used to control the memory cell.
The functions of the three gates are:

1. The forget gate determines how much of the previous memory is retained (it controls how much information in the internal state of the previous moment needs to be forgotten).
2. The input gate determines how much of the current input is written into memory (it controls how much information in the candidate state at the current moment needs to be saved).
3. The output gate determines how much of the memory is output (it controls how much information in the internal state at the current moment needs to be output to the external state).
        Together, these operations constitute the processing of the memory cell. Compared with an ordinary RNN, LSTM thus adds a gated memory cell and controls what it stores, forgets, and outputs, as the sketch below makes concrete.
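        To make the three gates concrete, here is a minimal single-step LSTM cell in NumPy. This is a sketch under illustrative assumptions, not a reference implementation: the parameter names (W_*, U_*, b_* for each gate) are hypothetical, and in practice all of them are learned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    x_t:    input vector at time t
    h_prev: external (hidden) state from time t-1
    c_prev: internal (cell) state from time t-1
    params: dict of (W, U, b) per gate -- illustrative names
    """
    W_f, U_f, b_f = params["f"]  # forget gate parameters
    W_i, U_i, b_i = params["i"]  # input gate parameters
    W_o, U_o, b_o = params["o"]  # output gate parameters
    W_c, U_c, b_c = params["c"]  # candidate state parameters

    # Gates take soft values in (0, 1) via the sigmoid function.
    f_t = sigmoid(W_f @ x_t + U_f @ h_prev + b_f)  # how much old memory to keep
    i_t = sigmoid(W_i @ x_t + U_i @ h_prev + b_i)  # how much new info to write
    o_t = sigmoid(W_o @ x_t + U_o @ h_prev + b_o)  # how much memory to expose

    # Candidate state: new information proposed at this step.
    c_tilde = np.tanh(W_c @ x_t + U_c @ h_prev + b_c)

    # Cell update: selectively forget old memory, selectively add new memory.
    c_t = f_t * c_prev + i_t * c_tilde

    # External state: a gated view of the internal memory.
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```

        With hidden size n and input size m, each W is an (n, m) matrix, each U is (n, n), and each b is a length-n vector. A forget gate f_t close to 1 keeps the old memory; an input gate i_t close to 1 writes the new candidate in.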

What are LSTMs?

        In one sentence, LSTM is an upgraded version of RNN: if the best an RNN can do is understand a sentence, then the best an LSTM can do is understand a paragraph. The details are as follows:

        LSTM, short for Long Short-Term Memory networks, is a special kind of RNN that can learn long-term dependencies. The LSTM paper was first published in 1997. Thanks to its unique design, LSTM is well suited to processing and predicting events separated by very long intervals and delays in time series.

Structure diagram of an ordinary recurrent neural network and of LSTM

        All recurrent neural networks have the form of a chain of repeating neural-network modules. In a standard RNN, this repeating module has a very simple structure, such as a single tanh layer. Its structure is as follows:

[Figure: the repeating module in a standard RNN contains a single tanh layer]
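        In code, this repeating module really is just one line; a minimal NumPy sketch (the names W_x, W_h, b are illustrative):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of a vanilla RNN: the entire repeating module is a single tanh layer."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)
```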

        LSTM avoids the long-term dependency problem: long-term information can be remembered. Inside, LSTM has a more complex structure. Through its gate-controlled states it can choose what information to pass along, remembering the information that must be kept for a long time and forgetting what is unimportant. Its structure is as follows:

[Figure: the repeating module in an LSTM, with gates controlling the memory cell]

The above LSTM structure can be studied in more detail with the video below:

[Video: LSTM structure analysis]

Figure 6.7: The recurrent unit structure of the LSTM network
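        In practice, deep learning frameworks provide a ready-made LSTM layer, so the gates never have to be written by hand. A minimal usage sketch with PyTorch's nn.LSTM (assuming PyTorch is available; the sizes are arbitrary):

```python
import torch
import torch.nn as nn

# batch of 4 sequences, 10 time steps each, 8 features per step
x = torch.randn(4, 10, 8)

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

# output: the hidden state h_t at every time step; (h_n, c_n): the final states
output, (h_n, c_n) = lstm(x)
print(output.shape)  # torch.Size([4, 10, 16])
print(h_n.shape)     # torch.Size([1, 4, 16])
print(c_n.shape)     # torch.Size([1, 4, 16])
```

        Here output contains the external state h_t at every time step, while h_n and c_n are the final external and internal states, mirroring the h and c of the recurrent unit above.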

—————————————————
Partial reference from: https://blog.csdn.net/qq_38251616/article/details/125613533

Origin blog.csdn.net/m0_48241022/article/details/132379005