[Deep Learning] RNN and LSTM

I recommend a blog post that explains RNNs and LSTMs in a very easy-to-understand way:

https://www.jianshu.com/p/9dc9f41f0b29

Below is my own understanding and summary; criticism and corrections are welcome~


I had always been curious about how an RNN updates its weights when it generates a sentence word by word, especially in setups where the first half of the sequence is input-only and the second half is output-only. In fact, after the whole sentence has been generated, the loss for every word is computed together, and the weights are then updated in a single step (backpropagation through time).
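A minimal PyTorch sketch of this idea: the RNN emits one token per step, but the per-token losses are summed into one loss and the weights are updated in a single backward pass. All names here (`vocab_size`, `hidden_size`, the toy sequence) are illustrative, not from the original post.

```python
import torch
import torch.nn as nn

vocab_size, hidden_size = 20, 32
rnn = nn.RNN(vocab_size, hidden_size, batch_first=True)
head = nn.Linear(hidden_size, vocab_size)
optimizer = torch.optim.SGD(
    list(rnn.parameters()) + list(head.parameters()), lr=0.1
)
loss_fn = nn.CrossEntropyLoss()

# Toy task: predict the next token at every step of one sentence.
tokens = torch.randint(0, vocab_size, (1, 8))              # (batch=1, seq_len=8)
inputs = nn.functional.one_hot(tokens[:, :-1], vocab_size).float()
targets = tokens[:, 1:]                                    # shifted by one step

hidden_states, _ = rnn(inputs)                             # run the whole sequence
logits = head(hidden_states)                               # (1, 7, vocab_size)

# One loss over all time steps, one gradient update for the whole sentence.
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
optimizer.zero_grad()
loss.backward()                                            # BPTT through every step
optimizer.step()
```

So even though generation looks sequential, training treats the sentence as one unit: every step's error contributes to a single gradient step.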


The ingenious design of the LSTM is what solves the RNN's short-term memory problem.

An LSTM has an information pathway (the cell state) that carries earlier information directly to later time steps. Inside each cell, gates decide which parts to remember and which to forget; the cell then combines this step's input and output and merges the result back into that upper pathway. The pathway keeps absorbing each step's information, retaining what is useful and discarding what is not, to arrive at the final result.
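A sketch of a single LSTM step, with the "upper pathway" as the cell state `c` and the gates that decide what to forget, what to write, and what to output. It uses explicit matrices instead of `nn.LSTM` so the gating logic is visible; the sizes and random initialization are illustrative assumptions.

```python
import torch

input_size, hidden_size = 10, 16
W = torch.randn(4 * hidden_size, input_size + hidden_size) * 0.1
b = torch.zeros(4 * hidden_size)

def lstm_step(x, h, c):
    z = W @ torch.cat([x, h]) + b
    f, i, o, g = z.chunk(4)
    f = torch.sigmoid(f)        # forget gate: which parts of c to keep
    i = torch.sigmoid(i)        # input gate: which new parts to write
    o = torch.sigmoid(o)        # output gate: which parts of c to emit
    g = torch.tanh(g)           # candidate information from this step
    c = f * c + i * g           # cell state: forget some, absorb some
    h = o * torch.tanh(c)       # hidden output, summarized from c
    return h, c

h = torch.zeros(hidden_size)
c = torch.zeros(hidden_size)
for x in torch.randn(5, input_size):    # five time steps
    h, c = lstm_step(x, h, c)
```

Note how `c` is only ever scaled and added to, never pushed through a squashing nonlinearity between steps; that is what lets information flow to the far end of the sequence largely intact.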

Origin: blog.csdn.net/Sun7_She/article/details/80738617