Table of Contents
3 TensorFlow 2.0 RNN recurrent neural network
3.1.1 Simple implementation of LSTM network
3.1.2 GRU, a variant of the LSTM network
3.3 Implementation code example
4 Specific implementation code
4.1 Airline comment sentiment prediction
4.1.4 Algorithm improvement: bidirectional RNN
4.2 Beijing air pollution sequence forecast
4.2.2 Preparation before training the model
4.2.5 Optimizing the LSTM model
0 Preface:
When it comes to natural language processing, you cannot get around RNNs. My TF2.0 study notes did not cover this part in detail, and learning it also took a long time.
Theoretical knowledge
1 RNN model structure
Ordinary neural networks can only take and process one input at a time, and successive inputs are treated as completely unrelated. However, some tasks need to handle sequence information, where earlier inputs are related to later ones.
Simple understanding
https://zhuanlan.zhihu.com/p/30844905
1.1 RNN structure
1.2 Specific diagram
1.3 Calculation process
1.4 RNN summary diagram
2 Backpropagation of RNN
https://blog.csdn.net/qq_32241189/article/details/80461635
For the detailed calculation, please refer to this blog post. The key point is how W, U, and V are updated during backpropagation.
Also note how S remembers past information.
Note: 1. W, U, and V here are the same at every time step (weight sharing).
2. The hidden state can be understood as: S = f(current input + summary of past memory)
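The note above can be sketched as a bare NumPy forward pass. The sizes and weight values below are illustrative placeholders, not from the article:

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 3)) * 0.1   # input -> hidden weights
W = rng.normal(size=(4, 4)) * 0.1   # hidden -> hidden weights, reused at every step
b = np.zeros(4)

def rnn_step(x_t, s_prev):
    # S = f(current input + past memory summary)
    return np.tanh(U @ x_t + W @ s_prev + b)

s = np.zeros(4)                         # initial memory
for x_t in rng.normal(size=(5, 3)):     # a 5-step sequence of 3-dim inputs
    s = rnn_step(x_t, s)                # the same U, W, b at each time step (weight sharing)
print(s.shape)
```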
3 TensorFlow 2.0 RNN recurrent neural network
3.1 LSTM network
The Long Short-Term Memory network, generally called LSTM, is a special type of RNN that can learn long-term dependencies. LSTM has achieved considerable success on many problems and is widely used; it is the de facto standard for RNNs.
3.1.1 Simple implementation of LSTM network
LSTM controls the flow of information through gates: a gate is a way of letting information through selectively. With its gates, an LSTM can block information entirely, pass it completely, or pass only part of it.
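The gating idea can be sketched numerically: a sigmoid produces values in (0, 1), and multiplying element-wise decides how much of each component passes. The numbers below are illustrative, not from the article:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

candidate = np.array([2.0, -1.0, 0.5])         # information wanting to pass through
gate = sigmoid(np.array([10.0, -10.0, 0.0]))   # ~1: pass fully, ~0: block, 0.5: pass partly
passed = gate * candidate                      # element-wise gating
print(passed)
```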
If you don't understand LSTM enough, you can refer to this blog to understand step by step:
https://blog.csdn.net/shijing_0214/article/details/52081301
3.1.2 GRU, a variant of the LSTM network
GRU (Gated Recurrent Unit). It merges the forget gate and the input gate into a single update gate, and merges the cell state and the hidden state, making the model structure simpler than LSTM's.
Compared with LSTM, the GRU structure is simpler: its update gate determines how much of the previous state is blended with the new candidate state. In short, compared with an LSTM network, a GRU is simpler to construct, requires less computation, and achieves comparable results.
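The "less computation" claim can be checked by comparing the parameter counts of Keras GRU and LSTM layers of the same size; the dimensions below are illustrative:

```python
import tensorflow as tf

inp = tf.keras.Input(shape=(None, 8))      # (batch, time, 8 features)
lstm_out = tf.keras.layers.LSTM(16)(inp)
gru_out = tf.keras.layers.GRU(16)(inp)

# The LSTM has four gate/candidate blocks, the GRU only three,
# so the GRU ends up with fewer weights for the same unit count.
lstm_params = tf.keras.Model(inp, lstm_out).count_params()
gru_params = tf.keras.Model(inp, gru_out).count_params()
print(lstm_params, gru_params)
```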
3.2 Implementation of Keras
- Keras supports several RNN variants:
- layers.SimpleRNN
- layers.LSTM
- layers.GRU
Note:
The input to a TF2.0 LSTM must follow a specific format.
Sequence input format:
(batch, sequence length, dimension of each input step)
A simple example (the airline example implemented later):
First dimension: the batch size
Second dimension: the sequence length, e.g. the (padded) length of an airline review
Third dimension: the feature dimension, i.e. the word vector corresponding to each of the batch × length words
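The three-dimensional input format can be verified directly; the sizes below are placeholders:

```python
import tensorflow as tf

# Format: (batch, sequence length, dimension of each input step)
x = tf.random.normal((32, 20, 8))   # 32 reviews, 20 words each, 8-dim word vectors

out = tf.keras.layers.LSTM(16)(x)                          # last hidden state only
print(tuple(out.shape))                                    # (32, 16)

seq = tf.keras.layers.LSTM(16, return_sequences=True)(x)   # one output per time step
print(tuple(seq.shape))                                    # (32, 20, 16)
```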
3.3 Implementation code example
- Univariate sequence: prediction of airline review sentiment
- Multivariate sequence: weather/pollution forecast (select a sequence window and forecast using the previous three days of observations)
4 Specific implementation code
Only part of the code is shown here, for analysis.
4.1 Airline comment sentiment prediction
4.1.1 Data processing
4.1.2 Training model
4.1.3 Result analysis
Overfitting.
Use recurrent dropout to suppress overfitting.
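A minimal sketch of recurrent dropout in a sentiment model; the vocabulary size, embedding dimension, and unit counts are placeholders, not the article's actual values:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=50),
    # dropout acts on the layer's inputs, recurrent_dropout on the recurrent
    # state passed between time steps; both help suppress overfitting
    tf.keras.layers.LSTM(64, dropout=0.5, recurrent_dropout=0.5),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
```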
4.1.4 Algorithm improvement: bidirectional RNN
We see that the accuracy of the previous training is 93%. If we need to optimize further: just as a linked list has a doubly linked variant, an RNN also has a bidirectional variant.
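A minimal sketch of the bidirectional variant, with placeholder dimensions:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=50),
    # Bidirectional runs one LSTM forward and one backward over the sequence
    # and concatenates their outputs, so the layer's output size doubles (64 -> 128)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
```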
4.2 Beijing air pollution sequence forecast
4.2.1 Data processing
The data looks as follows:
Processing date
Since the feature cbwd is an object (string) column, we convert it with one-hot encoding.
Finally we get the data we need
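The cbwd conversion above can be sketched with pandas; the sample values are illustrative, not the article's actual data:

```python
import pandas as pd

# cbwd (wind direction) is an object/string column, so it is
# expanded into one-hot columns
df = pd.DataFrame({'pm2.5': [129, 148, 159, 181],
                   'cbwd': ['SE', 'NW', 'SE', 'cv']})
df = df.join(pd.get_dummies(df.cbwd))   # one 0/1 column per wind direction
del df['cbwd']                          # drop the original object column
print(df.columns.tolist())
```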
4.2.2 Preparation before training the model
Standardization
Note:
We standardize the input features using statistics computed from the training data only; the prediction targets are not standardized.
The output becomes two-dimensional: (batch, final output)
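The standardization note, as a sketch: fit the mean and standard deviation on the training split only, then reuse the same statistics on the test split. The arrays here are synthetic placeholders:

```python
import numpy as np

# synthetic features; in the article these would be the pollution features
train = np.random.randn(100, 5) * 3.0 + 10.0
test = np.random.randn(20, 5) * 3.0 + 10.0

# fit the statistics on the training split only...
mean = train.mean(axis=0)
std = train.std(axis=0)

# ...then apply the same statistics to both splits; labels stay untouched
train_std = (train - mean) / std
test_std = (test - mean) / std
```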
4.2.3 Basic model
4.2.4 Simple LSTM
The result: the loss drops further.
Further analysis of the LSTM:
4.2.5 Optimizing the LSTM model
The training curve now looks good.
4.2.6 Forecast
Steps:
- Save the model
- Load the model
- Prepare the test data
After prediction, we need to verify that the number of predicted rows matches the number of test-data rows (a one-to-one correspondence).
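The save → load → predict steps can be sketched as follows; the model architecture, file name, and shapes are placeholders standing in for the trained pollution model:

```python
import numpy as np
import tensorflow as tf

# placeholder model, standing in for the trained pollution model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10, 3)),     # (sequence length, features)
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

model.save('pollution_model.h5')                              # 1. save the model
restored = tf.keras.models.load_model('pollution_model.h5')   # 2. load it back

x_test = np.random.randn(20, 10, 3).astype('float32')         # 3. test data
pred = restored.predict(x_test)
# one prediction row per test row (one-to-one correspondence)
print(pred.shape)
```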
Reference
- Introduction to RNN:
  - An article to understand the basics of RNN (recurrent neural networks): https://zhuanlan.zhihu.com/p/30844905
- Variants of RNN:
  - RNN recurrent neural network and LSTM long short-term memory model, an introduction: https://blog.csdn.net/yoyofu007/article/details/80361422
- Step-by-step analysis of LSTM:
  - Simple understanding of LSTM neural networks: https://blog.csdn.net/shijing_0214/article/details/52081301
- RNN backpropagation:
  - Deep learning: RNN (recurrent neural networks): https://blog.csdn.net/qq_32241189/article/details/80461635