MNIST LSTM training, testing, model saving, loading and recognition

Original article; please credit the source when reprinting: http://blog.csdn.net/wanggao_1990/article/details/77964504

Each digit (0-9) in the MNIST database corresponds to a 28x28 single-channel image. Each row (or column) of the image can be regarded as a feature vector, and the full set of rows (or columns) can be regarded as a sequence. A digit can then be modeled by an RNN (LSTM) with an input size of 28 and a sequence length of 28. For a given digit, say 0, the row-to-row dynamics are captured well by the RNN, and these continuous row-to-row changes together represent the specific pattern of that digit. RNNs can therefore be used for character recognition.
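To make the sequence view concrete, a flattened 784-dimensional MNIST vector can be reshaped into 28 time steps of 28 features each; a minimal NumPy sketch (not part of the training script below):

import numpy as np

image = np.random.rand(784)          # stands in for one flattened 28x28 MNIST image
sequence = image.reshape(28, 28)     # 28 time steps, each a 28-dimensional feature vector
print(sequence[0].shape)             # (28,) -- the input fed to the RNN at the first time step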

TensorFlow provides a convenient RNN interface. The basic idea is:
1. Build the basic cell of the RNN network; TensorFlow provides many cell types, such as BasicRNNCell, BasicLSTMCell, and LSTMCell.
2. Connect the cells into an RNN network by calling rnn.static_rnn or rnn.static_bidirectional_rnn (the exact API varies between versions). This example uses rnn.static_bidirectional_rnn; a minimal sketch of the unidirectional variant follows below.
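For reference, a minimal sketch of these two steps using the unidirectional rnn.static_rnn (the full example below uses the bidirectional form; shapes follow the MNIST setup):

import tensorflow as tf
from tensorflow.contrib import rnn

n_input, n_steps, n_hidden = 28, 28, 128

x = tf.placeholder(tf.float32, [None, n_steps, n_input])
inputs = tf.unstack(x, n_steps, 1)   # list of n_steps tensors of shape (batch_size, n_input)

cell = rnn.BasicLSTMCell(n_hidden)                                 # step 1: build the basic cell
outputs, states = rnn.static_rnn(cell, inputs, dtype=tf.float32)   # step 2: connect the cells into an RNN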

LSTM training and testing

'''
A Bidirectional Recurrent Neural Network (LSTM) implementation example using TensorFlow library.
This example is using the MNIST database of handwritten digits (http://yann.lecun.com/exdb/mnist/)
Long Short Term Memory paper: http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf

Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''

from __future__ import print_function

import tensorflow as tf
from tensorflow.contrib import rnn

# Import the user's data converter
import os
from convert_data import convert_datas

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/data/", one_hot=True)

'''
To classify images using a bidirectional recurrent neural network, we consider
every image row as a sequence of pixels. Because MNIST image shape is 28*28px,
we will then handle 28 sequences of 28 steps for every sample.
'''

# Parameters
learning_rate = 0.001

# Number of training iterations (counted in samples; the loop runs while step * batch_size < training_iters)
training_iters = 100000

# Number of samples in each training batch
batch_size = 128

# How often (in steps) to print training progress
display_step = 10

# Network Parameters
# n_steps * n_input is just the 28x28 image, with one row fed at each time step
n_input = 28 # MNIST data input (img shape: 28*28)
n_steps = 28 # timesteps


# Size of the hidden layer
n_hidden = 128 # hidden layer num of features
n_classes = 10 # MNIST total classes (0-9 digits)

# tf Graph input
# In [None, n_steps, n_input], None means the size of this dimension (the number of samples) is not fixed
x = tf.placeholder("float", [None, n_steps, n_input], name="input_x")
y = tf.placeholder("float", [None, n_classes], name="input_y")

# Define weights and biases
weights = tf.Variable(tf.random_normal([2*n_hidden, n_classes]), name="weights")
biases = tf.Variable(tf.random_normal([n_classes]), name="biases")

def BiRNN(x, weights, biases):
    # Prepare data shape to match `bidirectional_rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: 'n_steps' tensors list of shape (batch_size, n_input)

    # Unstack to get a list of 'n_steps' tensors of shape (batch_size, n_input)
    # Becomes a list of n_steps tensors, each of shape (batch_size, n_input)
    x = tf.unstack(x, n_steps, 1)

    # Define lstm cells with tensorflow
    # Forward direction cell
    lstm_fw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
    # Backward direction cell
    lstm_bw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)

    # Get lstm cell output
    try:
        outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x, dtype=tf.float32)
    except Exception: # Old TensorFlow version only returns outputs not states
        outputs = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x, dtype=tf.float32)

    # Linear activation, using rnn inner loop last output

    # return tf.matmul(outputs[-1], weights['out']) + biases['out']
    # return tf.matmul(outputs[-1], weights) + biases

    return tf.add(tf.matmul(outputs[-1], weights), biases)

pred = BiRNN(x, weights, biases)

# Define loss and optimizer
# softmax_cross_entropy_with_logits:Measures the probability error in discrete classification tasks in which the classes are mutually exclusive
# return a 1-D Tensor of length batch_size of the same type as logits with the softmax cross entropy loss.
# reduce_mean averages over all values (no axis is specified here)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Reshape data to get 28 seq of 28 elements
        batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + "{:.6f}".format(loss) + \
                  ", Training Accuracy= " + "{:.5f}".format(acc))
        step += 1
    print("Optimization Finished!")

    # Calculate accuracy for 128 mnist test images
    # test_len = 128
    # test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
    # test_label = mnist.test.labels[:test_len]

    ## (For the author's own dataset the input was batch_size*30*17; here the full MNIST test set is used.)
    ## For actual testing, the input must match the shape required by the TensorFlow input placeholder.
    test_data = mnist.test.images.reshape((-1, n_steps, n_input))
    test_label = mnist.test.labels

    print("Testing Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: test_label}))

Save the trained model

Immediately after the test-accuracy output above, add the following code and run the script again:

    saver = tf.train.Saver()
    model_path = "./model/my_model"
    save_path = saver.save(sess, model_path)
    print("Model saved in file: %s" % save_path)

This is just one of several ways to save, and it saves the entire network structure. In model_path, model is the folder where the model is saved and my_model is the prefix of the saved files, which can be understood as the name of the model.

After running, a new folder named "model" is created in the current directory containing four files: checkpoint, my_model.data-00000-of-00001, my_model.index and my_model.meta. There are many explanations of these four files online.
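To confirm what was written, a small sketch (assuming the ./model directory above) that reads the checkpoint back with TensorFlow's checkpoint reader:

import tensorflow as tf

# The checkpoint file records the most recent save prefix, e.g. "./model/my_model"
latest = tf.train.latest_checkpoint("./model")
print(latest)

# List the saved variables and their shapes (weights, biases and the LSTM kernels)
reader = tf.train.NewCheckpointReader(latest)
print(reader.get_variable_to_shape_map())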

Note that this only saves the model. The purpose of saving is to be able to load the model and test new input data without rebuilding the entire network. Therefore, the computation nodes needed at recognition time must also be saved, so that they can be used to compute the output in the recognition phase. One prediction node needs to be added here, right after pred = BiRNN(x, weights, biases):

    tf.add_to_collection('predict', pred)

Binding the computation of pred to the collection name "predict" makes it possible to retrieve this computation node by that name after loading the model.


Load the trained model and recognize

Loading the model is very simple; the main code is as follows:

with tf.Session() as sess:
    new_saver = tf.train.import_meta_graph('./model/my_model.meta')
    new_saver.restore(sess, './model/my_model')

Note that the path passed to restore() must be the same save prefix that was used when saving.

Next, read the required nodes from the loaded model: first the pred operation stored in the "predict" collection, and second the input x that pred requires, i.e. the placeholder named "input_x" in the training code. Continue adding the following code:

    graph = tf.get_default_graph()    
    predict = tf.get_collection('predict')[0]
    input_x = graph.get_operation_by_name("input_x").outputs[0]

Finally, feed in one image and classify it:

    x = mnist.test.images[0].reshape((1, n_steps, n_input))
    res = sess.run(predict, feed_dict={input_x: x})

Here the first image of the test set is used. The process is similar to the test phase, except that no label is fed as a second input. The class index can be obtained from the returned result with tf.argmax.
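Since res here has shape (1, n_classes), the class index is taken along dimension 1; a small NumPy sketch of the same idea (the logit values below are made up):

import numpy as np

res = np.array([[0.1, 2.3, -0.5, 8.7, 0.0, 1.2, -1.1, 0.4, 3.3, 0.2]])  # shape (1, 10)
print(np.argmax(res, axis=1))  # -> [3], i.e. the predicted digit is 3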

When using argmax, first confirm the shape of the data and then choose the dimension along which to compute. The complete code for this part is as follows:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/data/", one_hot=True)

n_input = 28
n_steps = 28
n_classes = 10

with tf.Session() as sess:
    new_saver = tf.train.import_meta_graph('./model/my_model.meta')
    new_saver.restore(sess, './model/my_model')

    graph = tf.get_default_graph()
    predict = tf.get_collection('predict')[0]
    input_x = graph.get_operation_by_name("input_x").outputs[0]

    x = mnist.test.images[0].reshape((1, n_steps, n_input))
    y = mnist.test.labels[0].reshape(-1, n_classes)  # reshape the one-hot label to (1, n_classes)

    res = sess.run(predict, feed_dict={input_x: x})

    print("Actual class: ", str(sess.run(tf.argmax(y, 1))),
          ", predicted class: ", str(sess.run(tf.argmax(res, 1))),
          ", correct: ", str(sess.run(tf.equal(tf.argmax(y, 1), tf.argmax(res, 1)))))
