[Machine Learning Core Summary] What Is a Long Short-Term Memory Network (LSTM)?

A recurrent neural network (RNN) has some capacity for memory, but unfortunately it can only retain short-term memory, so it performs poorly on tasks that depend on information from far back in the sequence. What can be done about this?

Researchers looked to human memory for inspiration. Human memory is selective: we do not remember everything that happens at every moment. We keep what is important and discard what is not.

Borrowing this mechanism, researchers redesigned the "small box" (the cell) used in an RNN and introduced the "gate" mechanism. A gate is a small switch that determines how much information is retained. Its value lies between 0 and 1: 1 means the information is kept entirely, and 0 means it is discarded entirely.
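
To make the idea of a gate concrete, here is a minimal sketch in plain Python. It assumes the gate value is produced by a sigmoid function, which is the usual choice in LSTM implementations but is not stated in the text above. The names `sigmoid` and `apply_gate` and the example numbers are illustrative only.

```python
import math

def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def apply_gate(gate_scores, values):
    """Scale each value by its gate: near 1 keeps it, near 0 drops it."""
    return [sigmoid(s) * v for s, v in zip(gate_scores, values)]

memory = [2.0, -3.0, 0.5]
# Hypothetical gate scores: strongly keep, strongly discard, undecided.
gate_scores = [10.0, -10.0, 0.0]
gated = apply_gate(gate_scores, memory)
```

With these scores the first entry passes through almost unchanged, the second is almost erased, and the third is halved, because a score of 0 gives a gate value of 0.5.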

The new small box has three gates:

  1. The forget gate determines how much of the existing information in the small box is kept, that is, which unimportant memories are discarded.
  2. The input gate determines how much of the current input is saved into the small box, that is, which new things are remembered.
  3. The output gate determines how much of the information in the small box is output.

The modified small box can take in the current input through the input gate and also use the forget gate to hold on to important information from the past. This is the long short-term memory (LSTM) model.
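
The way the three gates work together in one time step can be sketched as follows. This is a simplified illustration that assumes the commonly used LSTM formulation (sigmoid gates, a tanh candidate, and weights applied to the previous output joined with the current input), none of which is spelled out in the text above. The function names, the `params` layout, and the hand-picked weights are hypothetical and nothing here is trained.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, v, b):
    """Compute W @ v + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x, h_prev, c_prev, params):
    """One time step of an LSTM cell (standard formulation, assumed here).

    params maps a gate name to (W, b); each W acts on the list h_prev + x.
    """
    z = h_prev + x  # join previous output and current input

    def layer(name, activation):
        W, b = params[name]
        return [activation(a) for a in affine(W, z, b)]

    f = layer("forget", sigmoid)       # how much old memory to keep
    i = layer("input", sigmoid)        # how much new information to store
    o = layer("output", sigmoid)       # how much of the memory to output
    g = layer("candidate", math.tanh)  # proposed new information

    # New memory: kept part of the old memory plus admitted new information.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Output: the part of the (squashed) memory that the output gate lets through.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# Hand-picked (not trained) parameters: 2 memory units, 1 input feature.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
params = {
    "forget": (zeros, [20.0, 20.0]),    # forget gate close to 1: keep old memory
    "input": (zeros, [-20.0, -20.0]),   # input gate close to 0: ignore new input
    "output": (zeros, [20.0, 20.0]),    # output gate close to 1: output the memory
    "candidate": (zeros, [0.0, 0.0]),
}
h, c = lstm_step([1.0], [0.0, 0.0], [0.5, -0.5], params)
```

With the forget gate near 1 and the input gate near 0, the memory `c` stays essentially equal to the previous memory `[0.5, -0.5]`, which is the "keep the important past" behaviour described above. Flipping the two biases would instead overwrite the memory with the new candidate.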

Changing the structure of the small box yields many LSTM variants, such as the Minimal Gated Unit (MGU) and the Simple Recurrent Unit (SRU).

But the most popular is the Gated Recurrent Unit (GRU). A GRU has only two gates. The update gate combines the roles of the forget gate and the input gate: it determines which old information is discarded and which new information is added. The reset gate determines how much of the network's state from the previous time step is used when computing the new candidate state, which helps capture short-term memory. With a more concise structure, higher computational efficiency, and results comparable to LSTM, GRU is becoming more and more popular.
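
A single GRU step can be sketched in the same spirit. This is a minimal illustration, not a reference implementation. It assumes sigmoid gates and a tanh candidate, and it uses the convention that an update gate value near 1 means "replace with new information". Some libraries use the opposite convention. The function names, the `make_params` helper, and the weights are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, v, b):
    """Compute W @ v + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]

def gru_step(x, h_prev, params):
    """One time step of a GRU cell; params maps a name to (W, b)."""
    z = h_prev + x  # join previous state and current input
    W_u, b_u = params["update"]
    W_r, b_r = params["reset"]
    W_c, b_c = params["candidate"]

    u = [sigmoid(a) for a in affine(W_u, z, b_u)]  # update gate
    r = [sigmoid(a) for a in affine(W_r, z, b_r)]  # reset gate

    # The reset gate scales the previous state before it feeds the candidate.
    reset_h = [rk * hk for rk, hk in zip(r, h_prev)]
    cand = [math.tanh(a) for a in affine(W_c, reset_h + x, b_c)]

    # The update gate blends old state and candidate in a single step.
    return [(1.0 - uk) * hk + uk * ck for uk, hk, ck in zip(u, h_prev, cand)]

# Hand-picked (not trained) parameters: 2 state units, 1 input feature.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def make_params(update_bias):
    return {
        "update": (zeros, [update_bias, update_bias]),
        "reset": (zeros, [0.0, 0.0]),
        "candidate": (zeros, [0.0, 0.0]),
    }

h_keep = gru_step([1.0], [0.5, -0.5], make_params(-20.0))    # update gate near 0
h_replace = gru_step([1.0], [0.5, -0.5], make_params(20.0))  # update gate near 1
```

Here `h_keep` stays essentially at the old state `[0.5, -0.5]`, while `h_replace` moves to the candidate, which is all zeros with these weights. One gate decides both what is discarded and what is added, which is why the GRU needs fewer parameters than the LSTM.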

Origin blog.csdn.net/RuanJian_GC/article/details/131547426