RNN: tensorflow BasicLSTMCell + MNIST handwritten digit classification dataset, complete code + github


First, the principles

For more on the principles of RNNs, see the blog post
"Detailed explanation of recurrent neural networks (RNN) + the principles of the tensorflow implementation".
On LSTM:
We know an RNN takes inputs from all previous time steps into account, but this leads to vanishing or exploding gradients, so it cannot learn long-range features well. An LSTM is therefore used, which:
(1) controls the input (x, h) (by multiplying it with a value in [0, 1] produced by a sigmoid);
(2) controls how much of the previously "accumulated" state needs to be forgotten;
(3) controls how much needs to be output at the current time step.
This achieves more flexible control.
The structure is shown in the diagram below.
(LSTM structure diagram)
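To make the three gates concrete, here is a minimal numpy sketch of a single LSTM step (an illustration only; the weight names W_i, W_f, W_o, W_c are assumptions, biases are omitted, and none of this appears in the code below):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W_i, W_f, W_o, W_c):
    z = np.concatenate([x, h_prev])          # current input joined with previous hidden output
    i = sigmoid(W_i @ z)                     # (1) input gate: how much of the new input to take in
    f = sigmoid(W_f @ z)                     # (2) forget gate: how much of the accumulated state to keep
    o = sigmoid(W_o @ z)                     # (3) output gate: how much to emit at this time step
    c = f * c_prev + i * np.tanh(W_c @ z)    # update the cell ("accumulated") state
    h = o * np.tanh(c)                       # hidden output of this time step
    return h, c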
The following two blog posts explain this in great detail:
"Explanation of the num_units parameter in BasicLSTMCell"
(this article explains clearly how the tf.nn.rnn_cell.BasicLSTMCell class produces the variables shown above);
"Deep learning notes 2: understanding the input and output of an LSTM network".

Second, the code in detail

1. Code analysis

There are two files in total: one is RNN_LSTM_Classfication.py, the other is simple_RNN.py.

RNN_LSTM_Classfication.py

Imports and data loading; the data loading is explained in the blog post "Some ERRORs when running a convolutional neural network in tensorflow".

from tensorflow import keras, Session, transpose, global_variables_initializer
from modular.simple_RNN import simple_RNN
from modular.compute_accuracy import compute_accuracy
from modular.random_choose import corresponding_choose
# load MNIST from a local copy of mnist.npz and one-hot encode the labels
(train_x_image, train_y), (test_x_image, test_y) = keras.datasets.mnist.load_data(path='/home/xiaoshumiao/.keras/datasets/mnist.npz')
train_y = keras.utils.to_categorical(train_y)
test_y = keras.utils.to_categorical(test_y)
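After loading, a quick shape check (the print lines below are not in the original code) shows what the network consumes: the images stay as 28x28 arrays and the labels become one-hot vectors.

print(train_x_image.shape, train_y.shape)  # (60000, 28, 28) (60000, 10)
print(test_x_image.shape, test_y.shape)    # (10000, 28, 28) (10000, 10)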

parameter settings

epochs = 1000
n_classes = 10
batch_size = 200  # number of samples per batch
chunk_size = 28  # features per time step (the pixels in one image row)
n_chunk = 28  # number of time steps (the rows of one image)
rnn_size = 128  # length of the hidden layer, i.e. the number of hidden neurons
learning_rate = 0.001

Instantiate the RNN class we defined ourselves

rnn = simple_RNN(chunk_size, n_chunk, rnn_size, batch_size, n_classes, learning_rate)

Session()

with Session() as sess:
    sess.run(global_variables_initializer())
    for i in range(epochs):
        # pick a random training batch and scale pixel values to [0, 1]
        train_data = corresponding_choose(train_x_image, batch_size, m=0)
        train_x_betch = train_data.row_2(train_x_image) / 255.
        train_y_betch = train_data.row(train_y)
        sess.run(rnn.train, feed_dict={rnn.X: train_x_betch, rnn.y: train_y_betch})
        if i % 20 == 0:
            # every 20 steps, evaluate on a random test batch of 200 samples
            test_data = corresponding_choose(test_x_image, 200, m=0)
            test_x_betch = test_data.row_2(test_x_image) / 255.
            test_y_betch = test_data.row(test_y)
            b = sess.run(rnn.result, feed_dict={rnn.X: test_x_betch})
            c = sess.run(compute_accuracy(b, transpose(test_y_betch)))
            print(c)
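corresponding_choose, compute_accuracy and add_layer are the author's own helper modules (they are in the github repo linked at the end). For readers without the repo, a rough numpy stand-in for the batch selection above might look like this (an illustrative substitute, not the repo's code):

import numpy as np

# pick batch_size random sample indices, then index images and one-hot labels with them
idx = np.random.choice(len(train_x_image), batch_size, replace=False)
train_x_betch = train_x_image[idx] / 255.  # (200, 28, 28), scaled to [0, 1]
train_y_betch = train_y[idx]               # (200, 10)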

simple_RNN.py

Imports

from tensorflow import placeholder,float32,transpose, nn, reduce_mean, multiply, log, reduce_sum, train
from add_layer import add_layer
from tensorflow.python.ops.rnn import dynamic_rnn

data input

class simple_RNN(object):
    def __init__(self,chunk_size,n_chunk,hidden_chunk_size, batch_size,  n_class, learning_rate):
        self.X = placeholder(float32, [None, n_chunk, chunk_size])  # (batch, 28, 28): one image row per time step
        self.y = placeholder(float32, [None, n_class])  # one-hot labels

Define the LSTM cell,
initialize its state (see the structure explained in the two links above),
and build the RNN computation graph with dynamic_rnn to obtain the output of the last time step.

self.LSTM_cell = nn.rnn_cell.BasicLSTMCell(hidden_chunk_size, forget_bias=1.0, state_is_tuple=True)
self.init_state = self.LSTM_cell.zero_state(batch_size, float32)
self.output, self.states = dynamic_rnn(self.LSTM_cell, self.X, initial_state=self.init_state, dtype=float32)
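With the shapes used here (batch_size=200, n_chunk=28, rnn_size=128), dynamic_rnn returns output with shape (200, 28, 128), the hidden output at every time step, and states as an LSTMStateTuple (c, h) with each element of shape (200, 128); states[1] is h at the last time step. A quick check that could be run inside the Session block of RNN_LSTM_Classfication.py (not part of the original code):

import numpy as np

out, st = sess.run([rnn.output, rnn.states], feed_dict={rnn.X: train_x_betch})
# expected: (200, 28, 128) (200, 128) True -- the final hidden state matches the last-step output
print(out.shape, st.h.shape, np.allclose(st.h, out[:, -1, :]))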

The output of the last time step is then passed through a fully connected layer to obtain a 10-dimensional output.

self.result = add_layer(transpose(self.states[1]), hidden_chunk_size, n_class, activation_function=nn.softmax)

Training: the cross-entropy loss on the fully connected layer's output, minimized with the Adam optimizer.

self.loss = reduce_mean(-reduce_sum(multiply(transpose(self.y),log(self.result)),reduction_indices=[0]))
self.train = train.AdamOptimizer(learning_rate).minimize(self.loss)
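The loss above is the standard cross entropy averaged over the batch; the transposes appear to be there only because add_layer works with (n_class, batch)-shaped tensors. A common, more numerically stable variation (not the original code) is to have the final layer return raw logits of shape (batch, n_class) with no softmax and use TensorFlow's built-in cross entropy; it is sketched here as comments because it assumes a hypothetical logits tensor:

# hypothetical: 'logits' is the (batch, n_class) output of the final layer, before softmax
# self.loss = reduce_mean(nn.softmax_cross_entropy_with_logits_v2(labels=self.y, logits=logits))
# self.train = train.AdamOptimizer(learning_rate).minimize(self.loss)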

2. Problems encountered

(1)
The first problem troubled me for a long time. Because I learned RNNs from the Mofan Python tutorials, I first computed a 128-dimensional hidden-layer vector by hand and only then fed it into the cell. But after reading the blog posts above I found that num_units in tf.nn.rnn_cell.BasicLSTMCell is the number of neurons in the hidden layer (that is, the dimension of that layer's output vector), and that when building the RNN computation graph with dynamic_rnn, the input it needs is the raw data. As the two posts linked above also explain, the LSTM's input at each step is the previous step's hidden output together with the raw data. In short, we do not need to compute the hidden-layer neurons by hand.
The same can be seen from tf.nn.rnn_cell.BasicLSTMCell itself (screenshot omitted): the input data does not need any extra preprocessing.

(2)

test_data = corresponding_choose(test_x_image, 200, m=0)

At first the test batch did not match the training batch_size (the batch dimension is fixed when the zero state is created), so the test batch is also set to 200 here; it can be changed if needed.
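The restriction comes from zero_state(batch_size, float32), which fixes the batch dimension when the graph is built. A hedged variation (not the original code) that removes it: when dynamic_rnn is given only dtype and no initial_state, it creates a zero state matching whatever batch is actually fed in, so training and test batches could then have different sizes.

# variation: let dynamic_rnn build the zero state itself (self.init_state is then unnecessary)
self.output, self.states = dynamic_rnn(self.LSTM_cell, self.X, dtype=float32)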

(3)
For variable initialization, don't use anything else; honestly just use
global_variables_initializer()

Third, experimental results + github

The following results were obtained:
(screenshot of the printed test accuracy omitted)
Overall, the results are fairly decent.

The complete code (including the self-written helper modules it calls), along with the program comments and the points that need attention, is all on github; take a look if you need it.

https://github.com/wangjunhe8127/tensorflow-BasicLSTMCell-mnist

May the force be with you!



Origin blog.csdn.net/def_init_myself/article/details/105372941