TensorFlow 2.0 Introduction and Practice Study Notes (14) - RNN Recurrent Neural Network

Table of Contents

0 Preface:

Theoretical knowledge

1 RNN model structure

1.1 RNN structure

1.2 Specific diagram

1.3 Calculation process

1.4 RNN summary diagram

2 Backpropagation of RNN

3 TensorFlow 2.0 RNN recurrent neural network

3.1 LSTM network

3.1.1 Simple implementation of LSTM network

3.1.2 GRU, a variant of LSTM

3.2 Implementation in Keras

3.3 Implementation code example

4 Specific implementation code

4.1 Airline comment sentiment prediction

4.1.1 Data processing

4.1.2 Training model

4.1.3 Result analysis

4.1.4 Algorithm improvement - bidirectional RNN

4.2 Beijing air pollution sequence forecast

4.2.1 Data processing

4.2.2 Preparation before training the model

4.2.3 Basic model

4.2.4 Simple LSTM

4.2.5 Optimizing the LSTM model

4.2.6 Forecast

Reference


0 Preface:

When it comes to natural language processing, there is no getting around RNNs. My earlier TF2.0 study notes did not explain this topic in detail, and this part also took a long time to learn.

Theoretical knowledge

1 RNN model structure

Ordinary neural networks (fully connected networks and CNNs) can only take and process one input at a time, and successive inputs are treated as completely unrelated. However, some tasks need to handle sequence information, that is, cases where the previous input is related to the subsequent input.

For a simple, intuitive introduction, see:

https://zhuanlan.zhihu.com/p/30844905

1.1 RNN structure

1.2 Specific diagram

1.3 Calculation process

1.4 RNN summary diagram

2 Backpropagation of RNN

https://blog.csdn.net/qq_32241189/article/details/80461635

For the detailed calculations, please refer to this blog post. The key point is how W, U, and V are updated during backpropagation.

It also helps to understand how the hidden state S remembers past inputs.

Note:

1. W, U, and V are the same at every time step (weight sharing).

2. The hidden state can be understood as: S = f(current input + summary of past memory)
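Written out with this notation (x_t the input at time t, S_t the hidden state, O_t the output, and f, g activation functions such as tanh and softmax), the standard RNN forward pass is:

$$
S_t = f(U x_t + W S_{t-1}), \qquad O_t = g(V S_t)
$$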

3 TensorFlow 2.0 RNN recurrent neural network

3.1 LSTM network

The Long Short-Term Memory network, generally called LSTM, is a special type of RNN that can learn long-term dependencies. LSTM has achieved considerable success on many problems and is widely used; it is the de facto standard for RNNs.

3.1.1 Simple implementation of LSTM network

LSTM controls the flow of information through gates: a gate is a mechanism for letting information pass selectively. Through its gates, an LSTM can block information entirely, let it pass completely, or let only part of it through.

If LSTM is still unclear, this blog post walks through it step by step:

https://blog.csdn.net/shijing_0214/article/details/52081301
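As a minimal sketch of what a simple LSTM implementation looks like in tf.keras (the vocabulary size, embedding width, unit count, and the binary-classification head are illustrative assumptions, not taken from the original post):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Minimal sketch: an LSTM-based binary classifier.
# Vocab size (10000), embedding width (64), and units (64) are assumptions.
model = tf.keras.Sequential([
    layers.Embedding(input_dim=10000, output_dim=64),  # word ids -> word vectors
    layers.LSTM(64),                                   # gated recurrent layer
    layers.Dense(1, activation='sigmoid')              # binary output
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
```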

3.1.2 GRU, a variant of LSTM

GRU stands for Gated Recurrent Unit. It merges the forget gate and the input gate into a single update gate, and merges the cell state and hidden state, giving a model structure simpler than LSTM's.

Compared with LSTM, the GRU is structurally simpler: its update gate determines how the previous hidden state and the new candidate state are blended. In short, relative to an LSTM, a GRU is simpler to build, cheaper to compute, and comparable in effect.
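In tf.keras, swapping the LSTM layer for a GRU is a one-line change (the unit count is again an assumption):

```python
from tensorflow.keras import layers

# Drop-in replacement for the LSTM layer in the sketch above;
# a GRU has fewer parameters than an equally sized LSTM.
gru_layer = layers.GRU(64)
```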

3.2 Implementation in Keras

  • Keras supports several RNN variants:
  • layers.SimpleRNN
  • layers.LSTM
  • layers.GRU

Note :

The input to an LSTM in TF2.0 must follow a specific format.

Sequence input format:

(batch, sequence length, dimension of each input step)

A simple example (the airline example implemented later):

First dimension: the batch size

Second dimension: the sequence length, e.g. the (padded) length of an airline review

Third dimension: the feature dimension, i.e. the word vector (embedding) corresponding to each word
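A quick shape check makes this concrete (all numbers below are illustrative):

```python
import tensorflow as tf

# A fake batch: 32 reviews, each padded to 40 tokens,
# each token represented by a 16-dimensional word vector.
x = tf.random.normal((32, 40, 16))   # (batch, sequence length, features)
out = tf.keras.layers.LSTM(64)(x)    # returns the last hidden state per sequence
print(out.shape)                     # (32, 64)
```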

3.3 Implementation code example

  • Univariate sequence - airline review sentiment prediction

  • Multivariate sequence - weather forecasting (select a sequence and use the previous three days' data to make the forecast)

4 Specific implementation code

Only part of the code is posted here, for analysis.

4.1 Airline comment sentiment prediction

4.1.1 Data processing
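The post's own preprocessing code is not reproduced here; below is a hedged sketch of the usual steps, assuming the common airline tweets CSV with text and airline_sentiment columns (the file name, vocabulary size, and sequence length are all assumptions):

```python
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

df = pd.read_csv('Tweets.csv')                           # assumed file name
df = df[df.airline_sentiment != 'neutral']               # keep positive/negative only
labels = (df.airline_sentiment == 'positive').astype(int).values

tokenizer = Tokenizer(num_words=10000)                   # assumed vocabulary size
tokenizer.fit_on_texts(df.text)
sequences = tokenizer.texts_to_sequences(df.text)
data = pad_sequences(sequences, maxlen=40)               # assumed max review length
```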

4.1.2 Training model
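Continuing the preprocessing sketch above, a minimal training setup might look like this (layer sizes, epochs, and the validation split are assumptions):

```python
from tensorflow.keras import layers, Sequential

model = Sequential([
    layers.Embedding(10000, 64, input_length=40),  # (batch, 40) -> (batch, 40, 64)
    layers.LSTM(64),
    layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(data, labels, epochs=10, batch_size=128, validation_split=0.2)
```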

4.1.3 Result analysis

Overfitting

Use recurrent dropout to suppress the overfitting.
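In tf.keras, recurrent dropout is a parameter of the LSTM layer itself (the two rates below are illustrative):

```python
from tensorflow.keras import layers

# dropout applies to the layer's inputs; recurrent_dropout applies to the
# recurrent state, which is the variant that suppresses overfitting here.
lstm = layers.LSTM(64, dropout=0.2, recurrent_dropout=0.5)
```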

4.1.4 Algorithm improvement - bidirectional RNN

We saw that the earlier training reaches about 93% accuracy. To optimize further: just as a singly linked list has a doubly linked counterpart, RNNs also come in a bidirectional form.
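In tf.keras this is the Bidirectional wrapper (unit count and dropout rates as assumed before):

```python
from tensorflow.keras import layers

# Reads each sequence forwards and backwards; the two directions'
# outputs are concatenated by default.
bi_lstm = layers.Bidirectional(
    layers.LSTM(64, dropout=0.2, recurrent_dropout=0.5))
```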

4.2 Beijing air pollution sequence forecast

4.2.1 Data processing

A first look at the data:

Processing the date columns:

Since the feature cbwd is of object (string) type, we apply one-hot encoding.

Finally we obtain the data we need.
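A sketch of this preprocessing with pandas, assuming the usual Beijing PM2.5 dataset layout (file name and column names are assumptions):

```python
import pandas as pd

df = pd.read_csv('PRSA_data_2010.1.1-2014.12.31.csv')    # assumed file name
df = df.dropna(subset=['pm2.5'])                          # drop rows missing the target

# Merge the year/month/day/hour columns into a single datetime index.
df.index = pd.to_datetime(df[['year', 'month', 'day', 'hour']])
df = df.drop(columns=['No', 'year', 'month', 'day', 'hour'])

# cbwd (wind direction) is a string column: expand it into one-hot columns.
df = df.join(pd.get_dummies(df.cbwd, prefix='cbwd'))
df = df.drop(columns=['cbwd'])
```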

4.2.2 Preparation before training the model

Standardization

Note:

We only standardize the input data (fitting the statistics on the training split), not the target values.

The output becomes two-dimensional: (batch, final output).
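A sketch of the standardization plus sliding-window construction (the lookback and delay are assumptions; pm2.5 is assumed to be column 0):

```python
import numpy as np

values = df.values.astype('float32')         # from the dataframe above
target = values[:, 0].copy()                 # keep raw targets, per the note above
split = int(len(values) * 0.8)
mean = values[:split].mean(axis=0)           # statistics from the training split only
std = values[:split].std(axis=0)
features = (values - mean) / std

lookback = 5 * 24                            # assumed: previous 5 days of hourly data
delay = 24                                   # assumed: predict one day ahead
X, y = [], []
for i in range(len(features) - lookback - delay):
    X.append(features[i:i + lookback])
    y.append(target[i + lookback + delay])
X = np.array(X)                              # (batch, timesteps, features)
y = np.array(y).reshape(-1, 1)               # (batch, final output)
```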

4.2.3 Basic model
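As a point of comparison, a non-recurrent baseline can be a small densely connected network on the flattened window (sizes are assumptions):

```python
from tensorflow.keras import layers, Sequential

baseline = Sequential([
    layers.Flatten(input_shape=(X.shape[1], X.shape[2])),  # flatten the time window
    layers.Dense(32, activation='relu'),
    layers.Dense(1)                                        # regression output
])
baseline.compile(optimizer='adam', loss='mae')
baseline.fit(X, y, epochs=10, batch_size=128, validation_split=0.2)
```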

4.2.4 Simple LSTM
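A minimal single-layer LSTM regressor on the same windows (the unit count and training settings are assumptions):

```python
from tensorflow.keras import layers, Sequential

model = Sequential([
    layers.LSTM(32, input_shape=(X.shape[1], X.shape[2])),
    layers.Dense(1)                          # regression output
])
model.compile(optimizer='adam', loss='mae')
history = model.fit(X, y, epochs=20, batch_size=128, validation_split=0.2)
```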

The validation loss drops noticeably compared with the baseline model.

Further analysis of LSTM:

 

4.2.5 Optimizing the LSTM model
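One way to optimize, sketched here under assumed hyperparameters, is to stack recurrent layers (return_sequences=True so the second layer sees the full sequence) and reduce the learning rate when the validation loss plateaus:

```python
from tensorflow.keras import layers, Sequential, callbacks

model = Sequential([
    layers.LSTM(32, return_sequences=True,   # pass the whole sequence downstream
                input_shape=(X.shape[1], X.shape[2])),
    layers.LSTM(16),
    layers.Dense(1)
])
model.compile(optimizer='adam', loss='mae')

lr_schedule = callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3)
history = model.fit(X, y, epochs=50, batch_size=128,
                    validation_split=0.2, callbacks=[lr_schedule])
```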

 

Training goes well.

4.2.6 Forecast

 

Steps:

  • Save the model
  • Load the model
  • Predict on the test data

When verifying the predictions, make sure the number of predicted values matches the number of test rows (one-to-one correspondence).
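The three steps in tf.keras (the file name and the X_test windows are assumptions):

```python
import tensorflow as tf

model.save('pollution_lstm.h5')                             # 1. save the trained model
restored = tf.keras.models.load_model('pollution_lstm.h5')  # 2. load it back

preds = restored.predict(X_test)                            # 3. predict on test windows
assert len(preds) == len(X_test)                            # one prediction per test row
```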

Reference

https://zhuanlan.zhihu.com/p/30844905

https://blog.csdn.net/qq_32241189/article/details/80461635

https://blog.csdn.net/shijing_0214/article/details/52081301
