Why LSTM?
Why is it called LSTM (long short-term memory)?
The name LSTM (Long Short-Term Memory) looks a little odd at first. It reflects the principle behind the algorithm: controlling how long information is kept in memory. The human brain works in a similar way. It does not remember everything; it has both short-term memory and long-term memory. LSTM is designed around this idea.
How to understand memory at a deeper level:
The LSTM gating mechanism
1. The forget gate determines how much of the previous memory is retained (it controls how much of the internal state from the previous time step is forgotten).
2. The input gate determines how much of the current input is written into memory (it controls how much of the candidate state at the current time step is saved).
3. The output gate determines how much of the memory is output (it controls how much of the internal state at the current time step is passed to the external state).
Together, these three steps make up the processing done by the memory unit. Compared with an ordinary RNN, an LSTM adds a memory unit whose state is carried forward as an extra output at each time step.
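The three gates described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses a single-unit (scalar) cell for readability, whereas real implementations work with vectors and weight matrices. The names lstm_step and params, the (w_x, w_h, b) layout, and the weight values are all hypothetical and untrained.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    params maps each of "f", "i", "g", "o" to a (w_x, w_h, b) tuple
    (a hypothetical layout chosen for this sketch).
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    f = sigmoid(pre("f"))    # forget gate: share of the previous memory kept
    i = sigmoid(pre("i"))    # input gate: share of the candidate written to memory
    g = math.tanh(pre("g"))  # candidate state computed from the current input
    o = sigmoid(pre("o"))    # output gate: share of the memory exposed as output

    c_t = f * c_prev + i * g   # new internal (cell) state
    h_t = o * math.tanh(c_t)   # new external (hidden) state
    return h_t, c_t

# Arbitrary illustrative weights, not trained values.
params = {
    "f": (0.5, 0.1, 1.0),
    "i": (0.6, -0.2, 0.0),
    "g": (1.0, 0.3, 0.0),
    "o": (0.4, 0.2, 0.0),
}

h, c = 0.0, 0.0
for x_t in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x_t, h, c, params)
    print(f"x={x_t:+.2f}  c={c:+.4f}  h={h:+.4f}")
```

Note that the function returns two values, h_t and c_t. The second one is the memory that an ordinary RNN does not have.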
What are LSTMs?
In one sentence: LSTM is an advanced version of the RNN. If the most an RNN can do is understand a sentence, then the most an LSTM can do is understand a paragraph. The details are as follows:
LSTM, short for Long Short-Term Memory network, is a special kind of RNN that can learn long-term dependencies. The LSTM paper was first published in 1997. Thanks to its design, LSTM is well suited to processing and predicting important events that are separated by very long intervals and delays in a time series.
Structure diagrams of an ordinary recurrent neural network and an LSTM
All recurrent neural networks take the form of a chain of repeating neural network modules. In a standard RNN, the repeating module has a very simple structure, such as a single tanh layer. Its structure is as follows:
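As a rough code-level counterpart to the chain of repeating modules, the sketch below applies one tanh layer at every time step. It assumes a single-unit (scalar) RNN, and the names rnn_step and run_rnn and the weight values are hypothetical, chosen only for illustration.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One step of a single-unit RNN: the repeating module is one tanh layer."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.8, w_h=0.5, b=0.0):
    """Apply the same module at every time step, forming a chain."""
    h = 0.0
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# A single non-zero input followed by zeros: watch how its trace shrinks.
states = run_rnn([1.0, 0.0, 0.0, 0.0])
print(states)
```

With these weights, the trace of the first input gets smaller at every later step, because the only memory is the hidden state, which is rescaled and squashed again each time.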
LSTM is designed to avoid the long-term dependency problem, so information can be remembered over long spans. The repeating module of an LSTM has a more complex internal structure: through its gates it regulates the information that is passed along, retaining what needs to be remembered for a long time and forgetting what is unimportant. Its structure is as follows:
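To see how gating lets memory be kept or dropped, the sketch below repeats the memory update c_t = forget * c_{t-1} + write * candidate with fixed gate values. This is a simplifying assumption: in a real LSTM the gate values are recomputed at every step from the current input and the previous output. The function name carry_memory and the numbers are hypothetical.

```python
def carry_memory(c0, forget, write, candidate, steps):
    """Repeat c_t = forget * c_{t-1} + write * candidate with fixed gate values."""
    c = c0
    for _ in range(steps):
        c = forget * c + write * candidate
    return c

# Forget gate fully open, input gate closed: the memory survives 100 steps.
kept = carry_memory(c0=1.0, forget=1.0, write=0.0, candidate=0.7, steps=100)

# Forget gate half open: the memory fades toward zero.
faded = carry_memory(c0=1.0, forget=0.5, write=0.0, candidate=0.7, steps=100)

# Forget gate closed, input gate open: the memory is replaced in one step.
overwritten = carry_memory(c0=1.0, forget=0.0, write=1.0, candidate=0.7, steps=1)

print(kept, faded, overwritten)
```

The first case is the one that matters for long-term dependencies: when the forget gate stays close to 1 and the input gate close to 0, the stored value passes through many time steps essentially unchanged.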
The LSTM structure above is explained further in the following video:
LSTM structure analysis
LSTM network structure analysis
—————————————————
Partially based on: https://blog.csdn.net/qq_38251616/article/details/125613533