RNN with TensorFlow BasicLSTMCell: MNIST handwritten digit classification, complete code + GitHub
First, the principles
For more background on RNNs, see the blog post:
Recurrent neural networks (RNN) in detail: principles + TensorFlow implementation
On LSTM:
A plain RNN carries information from every earlier time step, but this causes vanishing or exploding gradients, so it cannot learn features well. An LSTM is used instead. It adds three gates:
(1) one controls the input (x, h), by multiplying it by a value in [0, 1] produced by a sigmoid;
(2) one controls how much of the previously accumulated state is forgotten;
(3) one controls how much is output at the current time step.
Together they give more flexible control.
Diagram is as follows:
The following two blog posts explain this in great detail.
Explanation of the num_units parameter in BasicLSTMCell
This article clearly explains how the tf.nn.rnn_cell.BasicLSTMCell class computes the intermediate variables shown in the diagram above.
Deep learning notes 2: understanding the inputs and outputs of an LSTM network
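The three gates listed above can be sketched as one LSTM time step in NumPy. This is only a minimal illustration, not the TensorFlow source: the function name lstm_step, the random weights, and the gate ordering (i, j, f, o) are assumptions of the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b, forget_bias=1.0):
    """One LSTM time step (illustrative sketch).

    x: (batch, input_dim); h_prev, c_prev: (batch, num_units)
    W: (input_dim + num_units, 4 * num_units); b: (4 * num_units,)
    """
    # The raw input and the previous hidden output enter the cell together.
    z = np.concatenate([x, h_prev], axis=1) @ W + b
    i, j, f, o = np.split(z, 4, axis=1)
    # (2) forget gate scales the old state, (1) input gate scales the new candidate.
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    # (3) output gate decides how much of the state is exposed.
    h = sigmoid(o) * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
batch, input_dim, num_units = 4, 28, 128
W = rng.normal(scale=0.1, size=(input_dim + num_units, 4 * num_units))
b = np.zeros(4 * num_units)
x = rng.random((batch, input_dim))
h0 = np.zeros((batch, num_units))
c0 = np.zeros((batch, num_units))
h1, c1 = lstm_step(x, h0, c0, W, b)
print(h1.shape, c1.shape)  # (4, 128) (4, 128)
```

Each gate is a sigmoid, so it multiplies its target by a number between 0 and 1, which is exactly the "multiply by a decimal in [0, 1]" idea in point (1).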
Second, the code in detail
1. Code analysis
There are two files:
one is RNN_LSTM_Classfication.py,
the other is simple_RNN.py.
RNN_LSTM_Classfication.py
Imports and data loading. Data loading is explained in the blog post "Some errors when running a convolutional neural network in TensorFlow".
from tensorflow import keras, Session, transpose, global_variables_initializer
from modular.simple_RNN import simple_RNN
from modular.compute_accuracy import compute_accuracy
from modular.random_choose import corresponding_choose
(train_x_image, train_y), (test_x_image, test_y) = keras.datasets.mnist.load_data(path='/home/xiaoshumiao/.keras/datasets/mnist.npz')
train_y = keras.utils.to_categorical(train_y)
test_y = keras.utils.to_categorical(test_y)
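For readers unfamiliar with keras.utils.to_categorical: it turns the integer labels into one-hot vectors. A rough NumPy equivalent, assuming 10 classes, might look like this (one_hot is a made-up name for illustration):

```python
import numpy as np

def one_hot(labels, n_classes=10):
    """Turn integer labels of shape (n,) into one-hot rows of shape (n, n_classes)."""
    return np.eye(n_classes, dtype=np.float32)[labels]

labels = np.array([5, 0, 4])
encoded = one_hot(labels)
print(encoded.shape)  # (3, 10)
print(encoded[0])     # a 1 in position 5, zeros elsewhere
```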
Parameter settings
epochs = 1000
n_classes = 10
batch_size = 200  # number of samples per batch
chunk_size = 28
n_chunk = 28
rnn_size = 128  # length of the hidden layer, i.e. the number of hidden neurons
learning_rate = 0.001
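With chunk_size = 28 and n_chunk = 28, each 28x28 image is read as a sequence of 28 time steps, one image row of 28 pixels per step. A small NumPy sketch of that layout, using random integers as a stand-in for MNIST pixels (not the real data):

```python
import numpy as np

chunk_size, n_chunk = 28, 28
rng = np.random.default_rng(0)

# Stand-in for one batch of MNIST images with pixel values 0..255.
images = rng.integers(0, 256, size=(200, n_chunk, chunk_size))
batch = images.astype(np.float32) / 255.0  # (batch, time steps, features per step)

# What the cell sees at time step 0: the first row of every image.
step_0 = batch[:, 0, :]
print(batch.shape, step_0.shape)  # (200, 28, 28) (200, 28)
```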
Instantiate the self-defined RNN class
rnn = simple_RNN(chunk_size, n_chunk, rnn_size, batch_size, n_classes, learning_rate)
Session()
with Session() as sess:
    sess.run(global_variables_initializer())
    for i in range(epochs):
        # pick a training batch and scale the pixels to [0, 1]
        train_data = corresponding_choose(train_x_image, batch_size, m=0)
        train_x_betch = train_data.row_2(train_x_image) / 255.
        train_y_betch = train_data.row(train_y)
        sess.run(rnn.train, feed_dict={rnn.X: train_x_betch, rnn.y: train_y_betch})
        if i % 20 == 0:
            # every 20 steps, evaluate on a test batch of the same size
            test_data = corresponding_choose(test_x_image, 200, m=0)
            test_x_betch = test_data.row_2(test_x_image) / 255.
            test_y_betch = test_data.row(test_y)
            b = sess.run(rnn.result, feed_dict={rnn.X: test_x_betch})
            c = sess.run(compute_accuracy(b, transpose(test_y_betch)))
            print(c)
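corresponding_choose and compute_accuracy are my own helpers from the repository and are not listed here. As a rough idea of what such helpers do, here is a hypothetical NumPy sketch; the names sample_batch and accuracy and their signatures are invented for illustration and are not the repository code.

```python
import numpy as np

def sample_batch(images, labels, batch_size, rng):
    """Pick the same random rows from images and labels (hypothetical helper)."""
    idx = rng.choice(len(images), size=batch_size, replace=False)
    return images[idx] / 255.0, labels[idx]

def accuracy(probs, one_hot_labels):
    """Fraction of samples whose arg-max prediction matches the label.

    Both arrays are laid out as (n_classes, batch), like rnn.result in the article.
    """
    return float(np.mean(probs.argmax(axis=0) == one_hot_labels.argmax(axis=0)))

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(1000, 28, 28))   # stand-in for MNIST images
labels = np.eye(10)[rng.integers(0, 10, size=1000)]  # stand-in one-hot labels
x, y = sample_batch(images, labels, 200, rng)
perfect = accuracy(y.T, y.T)  # comparing labels with themselves gives 1.0
print(x.shape, y.shape, perfect)  # (200, 28, 28) (200, 10) 1.0
```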
simple_RNN.py
Imports
from tensorflow import placeholder, float32, transpose, nn, reduce_mean, multiply, log, reduce_sum, train
from add_layer import add_layer
from tensorflow.python.ops.rnn import dynamic_rnn
Data input
class simple_RNN(object):
    def __init__(self, chunk_size, n_chunk, hidden_chunk_size, batch_size, n_class, learning_rate):
        self.X = placeholder(float32, [None, n_chunk, chunk_size])  # 200, 28, 28
        self.y = placeholder(float32, [None, n_class])
Define a cell,
initialize its state (see the structure in the two links above),
then build the RNN graph and take the output of the last time step.
        self.LSTM_cell = nn.rnn_cell.BasicLSTMCell(hidden_chunk_size, forget_bias=1.0, state_is_tuple=True)
        self.init_state = self.LSTM_cell.zero_state(batch_size, float32)
        self.output, self.states = dynamic_rnn(self.LSTM_cell, self.X, initial_state=self.init_state, dtype=float32)
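Conceptually, building the RNN graph means applying the same cell along the time axis and keeping both the per-step outputs and the final state. The NumPy sketch below shows that loop and the resulting shapes. To keep it short, a plain tanh step stands in for the LSTM cell; that substitution, the name unroll, and the random weights are all assumptions of the sketch.

```python
import numpy as np

def unroll(step, x, state):
    """Apply `step` along the time axis of x, which has shape (batch, time, features)."""
    outputs = []
    for t in range(x.shape[1]):
        out, state = step(x[:, t, :], state)
        outputs.append(out)
    return np.stack(outputs, axis=1), state

rng = np.random.default_rng(0)
batch, n_chunk, chunk_size, rnn_size = 200, 28, 28, 128
W_x = rng.normal(scale=0.1, size=(chunk_size, rnn_size))
W_h = rng.normal(scale=0.1, size=(rnn_size, rnn_size))

def tanh_step(x_t, h):
    # Plain tanh RNN step used in place of the LSTM cell (a simplification).
    h_new = np.tanh(x_t @ W_x + h @ W_h)
    return h_new, h_new

x = rng.random((batch, n_chunk, chunk_size))
outputs, final_h = unroll(tanh_step, x, np.zeros((batch, rnn_size)))
print(outputs.shape, final_h.shape)             # (200, 28, 128) (200, 128)
print(np.allclose(outputs[:, -1, :], final_h))  # True
```

The last line is the reason the final hidden state can be used as "the output of the last time step": for the hidden output h the two are the same tensor.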
The output obtained is then fed into a fully connected layer, which gives a 10-dimensional output.
        self.result = add_layer(transpose(self.states[1]), hidden_chunk_size, n_class, activation_function=nn.softmax)
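add_layer is another of my own helpers and is not listed here. Judging only from the transposes in the code above, it appears to work on column vectors, one sample per column. A hypothetical NumPy stand-in under that assumption (dense_columns and softmax_columns are invented names, and the real helper may differ):

```python
import numpy as np

def softmax_columns(z):
    """Softmax over axis 0, so each column (one sample) sums to 1."""
    z = z - z.max(axis=0, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

def dense_columns(h_t, W, b):
    """Hypothetical stand-in for add_layer; h_t has shape (in_size, batch)."""
    return softmax_columns(W @ h_t + b)

rng = np.random.default_rng(0)
h = rng.normal(size=(200, 128))            # final hidden state, (batch, rnn_size)
W = rng.normal(scale=0.1, size=(10, 128))  # (n_class, rnn_size)
b = np.zeros((10, 1))
result = dense_columns(h.T, W, b)
print(result.shape)                          # (10, 200)
print(np.allclose(result.sum(axis=0), 1.0))  # True
```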
Training works the same way as for a fully connected network: a cross-entropy loss on the output of the fully connected layer, minimized with Adam.
        self.loss = reduce_mean(-reduce_sum(multiply(transpose(self.y), log(self.result)), reduction_indices=[0]))
        self.train = train.AdamOptimizer(learning_rate).minimize(self.loss)
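The loss line above is a cross-entropy: sum -y * log(p) over the classes (axis 0), then average over the samples. A small NumPy version with a hand-checkable example follows; the eps term is an addition of this sketch to avoid log(0) and is not in the code above.

```python
import numpy as np

def cross_entropy(y_t, probs, eps=1e-12):
    """Mean over samples of -sum_k y_k * log(p_k); both arrays are (n_class, batch)."""
    per_sample = -np.sum(y_t * np.log(probs + eps), axis=0)
    return float(per_sample.mean())

# Two samples, two classes, one-hot labels stored as columns.
y_t = np.array([[1.0, 0.0],
                [0.0, 1.0]])
probs = np.array([[0.5, 0.2],
                  [0.5, 0.8]])
loss = cross_entropy(y_t, probs)
print(round(loss, 4))  # 0.4581, i.e. (-ln 0.5 - ln 0.8) / 2
```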
2. Problems and solutions
(1),
the first question also troubled by the problem for a long time, because I was tired of Mo according to RNN python learning their lessons according to Mo trouble, we first manually calculated 128-dimensional vector layer because then into the cell. But I looked Bowen found, num_units is the number of nerve (the dimensions of this layer output vectors due) Yuan tf.nn.rnn_cell.BasicLSTMCell function defined in the hidden layer, and when building RNN computing map dynamic_rnn, we need input is raw data. Also refer to the link above two blog, LSTM input is hidden neurons and output raw data on a moment. In summary therefore, we do not achieve the results manually calculate the hidden layer neurons.
The same can be seen in tf.nn.rnn_cell.BasicLSTMCell itself;
therefore, the input data does not need any extra processing.
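One way to convince yourself that the cell owns the mapping from raw input to hidden units is to count its parameters. Assuming the usual four-gate LSTM layout, where the input and the previous hidden output are concatenated and multiplied by one kernel, the arithmetic for this article's sizes is as follows (lstm_param_count is a made-up helper name):

```python
def lstm_param_count(input_dim, num_units):
    """Kernel and bias sizes of one LSTM layer with four gates."""
    kernel = (input_dim + num_units) * 4 * num_units
    bias = 4 * num_units
    return kernel, bias

kernel, bias = lstm_param_count(input_dim=28, num_units=128)
print(kernel, bias, kernel + bias)  # 79872 512 80384
```

The 28 in (28 + 128) is the raw row of pixels, so the projection to 128 dimensions already happens inside the cell.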
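(2)
test_data = corresponding_choose(test_x_image, 200, m=0)
At first the batch size of the test samples did not match that of the training samples. Because init_state is created with zero_state(batch_size, float32), the batch size is fixed when the graph is built, so I set the test batch to 200 as well; otherwise the code has to be changed in some other way.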
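The mismatch can be reproduced in miniature with NumPy. This is only an analogy, not the actual TensorFlow error: a state array created for 200 samples cannot be combined with a batch of 100, while a state sized from the input itself works for any batch. The simplified first_step function is an assumption of the sketch.

```python
import numpy as np

rnn_size = 128
fixed_state = np.zeros((200, rnn_size))  # like a zero state built for batch_size = 200

def first_step(x_t, h, W_x):
    """Combine one time step of input with the previous hidden state (simplified)."""
    return np.tanh(x_t @ W_x + h)

W_x = np.zeros((28, rnn_size))
ok = first_step(np.zeros((200, 28)), fixed_state, W_x)  # a batch of 200 works

try:
    first_step(np.zeros((100, 28)), fixed_state, W_x)   # a batch of 100 does not
    mismatch = False
except ValueError:
    mismatch = True

# Sizing the state from the input avoids the mismatch.
x_small = np.zeros((100, 28))
flexible = first_step(x_small, np.zeros((x_small.shape[0], rnn_size)), W_x)
print(ok.shape, mismatch, flexible.shape)  # (200, 128) True (100, 128)
```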
(3)
To initialize the variables, do not use anything else; simply stick with
global_variables_initializer()
Third, experimental results + GitHub
The results obtained are as follows:
Overall, the results are quite good.
The complete code (including the helper modules I wrote myself), further comments in the program, and the points that need attention are all on GitHub; take a look if you need them.
https://github.com/wangjunhe8127/tensorflow-BasicLSTMCell-mnist
May the force be with you!