Introduction to LSTMs

LSTM

f_t denotes the forget gate. For each input, the LSTM first decides which memories to discard: multiplying f_t elementwise with the cell state at time t-1 scales each memory component, so entries of f_t near 0 are forgotten and entries near 1 are kept.
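In the standard formulation (notation assumed from the common LSTM literature; W_f and b_f are the forget gate's weight matrix and bias), this gate is computed from the previous hidden state and the current input:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)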

The memory gate (input gate) controls whether the data at time t is incorporated into the cell state. The tanh function extracts the candidate information from the current input vector, giving g_t, while the sigmoid function controls how much of that candidate enters the cell state at this step, giving i_t.
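Written out under the same assumed notation (\odot denotes elementwise multiplication), the input gate, the candidate memory, and the resulting cell-state update are:

i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
g_t = \tanh(W_g \cdot [h_{t-1}, x_t] + b_g)
c_t = f_t \odot c_{t-1} + i_t \odot g_t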

The output gate first combines the current input with the output vector from the previous step and applies the sigmoid function to extract the relevant information, giving o_t. The current cell state is then compressed into the interval (-1, 1) by the tanh function, and the product of the two yields the hidden state at this step.
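In the same notation, the output gate and the resulting hidden state are:

o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
h_t = o_t \odot \tanh(c_t)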

(Figure source: LSTM - Long Short-Term Memory Recurrent Neural Network - Zhihu, zhihu.com)
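To tie the three gates together, here is a minimal sketch of one LSTM step in PyTorch. The function name, weight layout, and shapes are illustrative assumptions, not the internals of nn.LSTM:

import torch

def lstm_step(x_t, h_prev, c_prev, W, b):
    # Illustrative layout: W has shape (4 * hidden_size, hidden_size + input_size),
    # b has shape (4 * hidden_size,); one slice per gate.
    z = torch.cat([h_prev, x_t], dim=-1) @ W.T + b
    i_t, f_t, g_t, o_t = z.chunk(4, dim=-1)
    i_t = torch.sigmoid(i_t)            # input gate
    f_t = torch.sigmoid(f_t)            # forget gate
    g_t = torch.tanh(g_t)               # candidate memory
    o_t = torch.sigmoid(o_t)            # output gate
    c_t = f_t * c_prev + i_t * g_t      # new cell state
    h_t = o_t * torch.tanh(c_t)         # new hidden state
    return h_t, c_t

Stacking the four gate projections into a single matrix W mirrors how most implementations batch the gate computations into one matrix multiply.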

Single layer LSTM

import torch
import torch.nn as nn

# single-layer LSTM: 100-dim inputs, 20-dim hidden state
lstm = nn.LSTM(input_size=100, hidden_size=20, num_layers=1)
print(lstm)                        # LSTM(100, 20)
print(lstm._parameters.keys())     # weight_ih_l0, weight_hh_l0, bias_ih_l0, bias_hh_l0
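A quick usage sketch (shapes assume the default batch_first=False, so inputs are (seq_len, batch, input_size)):

x = torch.randn(5, 3, 100)       # seq_len=5, batch=3, input_size=100
output, (h_n, c_n) = lstm(x)     # initial hidden/cell states default to zeros
print(output.shape)              # torch.Size([5, 3, 20]) - h_t at every time step
print(h_n.shape, c_n.shape)      # torch.Size([1, 3, 20]) each - final h and c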

Origin: blog.csdn.net/qq_40107571/article/details/131587642