TensorFlow-Handwritten Digit Recognition (3)

This article builds on the previous TensorFlow handwritten digit recognition (2) and replaces the fully connected network with the LeNet-5 convolutional neural network to recognize handwritten digits.

1 Introduction

Fully connected network: every neuron is connected to every neuron in the adjacent layers before and after it. The input is the features and the output is the prediction.

Number of parameters: Σ (front layer x rear layer + rear layer)

For example, the 3-layer fully connected network used for handwritten digit recognition has 784 nodes in the input layer, 500 nodes in the hidden layer, and 10 nodes in the output layer. Then:

  • Hidden layer parameters: 784*500+500

  • Output layer parameters: 500*10+10

  • Total: 397510≈400,000

Note: The parameters of a certain layer mentioned here refer to the parameters between this layer and the previous layer, so the input layer has no parameters.
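
The counting rule above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name count_fc_parameters is made up for this example and is not part of the original code.

```python
def count_fc_parameters(layer_sizes):
    """Count weights plus biases of a fully connected network.

    layer_sizes lists the node count of every layer, input layer first.
    """
    total = 0
    for front, rear in zip(layer_sizes, layer_sizes[1:]):
        # front * rear weights, plus one bias per node of the rear layer
        total += front * rear + rear
    return total


hidden = 784 * 500 + 500   # parameters between input and hidden layer
output = 500 * 10 + 10     # parameters between hidden and output layer
total = count_fc_parameters([784, 500, 10])
print(hidden, output, total)
```

The input layer contributes nothing by itself, which matches the note that parameters belong to the connection between a layer and the one before it.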

Therefore, even a black-and-white image with a resolution of only 28x28 requires nearly 400,000 parameters to be optimized.
High-resolution color images in real life have many more pixels and carry three channels of information: red, green and blue.

Too many parameters to optimize can easily lead to overfitting.
To avoid this, the original pictures are generally not fed directly to the fully connected network in practice.
Instead, features are first extracted from the original image and fed to the fully connected network, which then computes the classification scores.

2 CNN basics

2.1 Convolution

Convolution is an effective method for extracting image features.
Generally, a square convolution kernel is slid over every pixel of the picture.
Each pixel value in the area where the picture and the convolution kernel overlap is multiplied by the weight at the corresponding point of the kernel; the products are summed and the bias is added,
which gives one pixel value in the output picture.
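
The multiply, sum and add-bias procedure can be sketched in plain Python for a single-channel picture. The function name and the small 4x4 example are assumptions made for illustration, with no padding; TensorFlow does the same work far more efficiently.

```python
def convolve2d_valid(image, kernel, bias=0.0, stride=1):
    """Slide a square kernel over a 2-D image without padding."""
    k = len(kernel)
    out_rows = (len(image) - k) // stride + 1
    out_cols = (len(image[0]) - k) // stride + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            acc = bias  # start from the bias, then add every product
            for i in range(k):
                for j in range(k):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


image = [[1, 2, 3, 0],
         [0, 1, 2, 3],
         [3, 0, 1, 2],
         [2, 3, 0, 1]]
kernel = [[1, 0],
          [0, 1]]
result = convolve2d_valid(image, kernel, bias=1.0)
print(result)
```

Each output value here is the sum of the two diagonal pixels under the kernel plus the bias of 1.0.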

2.2 All-zero padding (Padding)

Sometimes all zero padding is performed around the input image to ensure that the size of the output image is consistent with the input image.

The size of the output data = (w+2p-k)/s+1

  • w: input size

  • p: padding size

  • k: Convolution kernel size (sometimes also denoted by f)

  • s: sliding stride of the convolution kernel

For example: the input is 32x32x3 and the kernel is 5x5x3. Without all-zero padding, the output size is (32+0-5)/1+1=28.

If you want the output to stay at 32x32, you can calculate from the formula how many layers of zeros need to be added.
Solving 32=(32+2p-5)/1+1 gives p=2, that is, 2 layers (rings) of zeros need to be added.
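
The formula and the worked example can be verified with a short sketch. The function names are hypothetical, and padding_for_same_size assumes stride 1 and an odd kernel size.

```python
def conv_output_size(w, k, s=1, p=0):
    """Output size for input size w, kernel size k, stride s, padding p."""
    return (w + 2 * p - k) // s + 1


def padding_for_same_size(k):
    """Layers of zeros that keep output size equal to input size.

    Assumes stride 1 and an odd kernel size k.
    """
    return (k - 1) // 2


no_padding = conv_output_size(32, 5, s=1, p=0)
p = padding_for_same_size(5)
with_padding = conv_output_size(32, 5, s=1, p=p)
print(no_padding, p, with_padding)
```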

2.3 Convolution calculation function in TensorFlow

tf.nn.conv2d(input, convolution kernel, strides, padding='VALID')

  • Input: e.g. [batch, 5, 5, 1]

      • batch gives the number of pictures fed at one time

      • The resolution of each picture, here 5 rows and 5 columns

      • The number of channels in each picture (1 for a single-channel grayscale image, 3 for a red, green and blue color image)

  • Convolution kernel: e.g. [3, 3, 1, 16]

      • The resolution of the convolution kernel is 3 rows and 3 columns

      • The convolution kernel has 1 channel. The number of channels is determined by the input picture and must equal its number of channels, so it is also 1.

      • There are 16 such convolution kernels in total, so the output after the convolution has a depth of 16, that is, 16 channels.

  • Strides: e.g. [1, 1, 1, 1]

      • The second element is the stride along the height (row) dimension

      • The third element is the stride along the width (column) dimension

      • The first and last elements are fixed at 1. Here both sliding strides are also 1.

  • padding: 'SAME' uses all-zero padding and 'VALID' uses none. Note that the value is passed as a string.
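
To make the four arguments concrete, the sketch below computes the output shape that these arguments produce. It does not call TensorFlow; conv2d_output_shape is a hypothetical helper that applies the usual size rules for 'SAME' (ceil(in / stride)) and 'VALID' (ceil((in - k + 1) / stride)) padding.

```python
import math


def conv2d_output_shape(input_shape, filter_shape, strides, padding):
    """Shape of the conv2d result for NHWC input and HWIO filter shapes."""
    batch, in_h, in_w, in_c = input_shape
    k_h, k_w, k_c, out_c = filter_shape
    if k_c != in_c:
        raise ValueError("kernel channels must match input channels")
    _, s_h, s_w, _ = strides
    if padding == 'SAME':
        out_h = math.ceil(in_h / s_h)
        out_w = math.ceil(in_w / s_w)
    elif padding == 'VALID':
        out_h = math.ceil((in_h - k_h + 1) / s_h)
        out_w = math.ceil((in_w - k_w + 1) / s_w)
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")
    return [batch, out_h, out_w, out_c]


valid_shape = conv2d_output_shape([50, 5, 5, 1], [3, 3, 1, 16], [1, 1, 1, 1], 'VALID')
same_shape = conv2d_output_shape([50, 5, 5, 1], [3, 3, 1, 16], [1, 1, 1, 1], 'SAME')
print(valid_shape, same_shape)
```

The batch size of 50 is only an example value; the last dimension of the result always equals the number of kernels.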

2.4 Multi-channel image convolution

In most cases, the input picture is an RGB color picture, containing three layers of red, green and blue data.
The depth of the convolution kernel must equal the number of channels of the input picture, so a 3x3x3 convolution kernel is used.
The last 3 matches the 3 channels of the input image, so the kernel has three layers.
Each layer randomly generates 9 parameters to be optimized, giving 27 weights w in total plus one bias b.

The calculation is similar to that of a single-layer convolution kernel. To match the three RGB channels, the three-layer kernel is placed over the three-layer color picture, and the
27 overlapping pixels are multiplied by the corresponding weights and summed. Adding the bias b to that sum gives one value in the output picture.

For example, a 5x5x3 input image is padded with zeros and convolved with a 3x3x3 kernel. All 27 points
are multiplied by the corresponding parameters to be optimized, and the sum of the products is added to the bias b to obtain one value in the output image.
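
As a rough illustration of one such output value, the sketch below multiplies a 3x3x3 patch by a 3x3x3 kernel and adds the bias. The patch and kernel values are invented for the example; in a real network the weights start random and are learned.

```python
def conv_at_position(patch, kernel, bias):
    """One output value: sum of patch * kernel over rows, columns, channels, plus bias."""
    total = bias
    for patch_row, kernel_row in zip(patch, kernel):
        for patch_pixel, kernel_pixel in zip(patch_row, kernel_row):
            for value, weight in zip(patch_pixel, kernel_pixel):
                total += value * weight
    return total


# A 3x3 patch where every pixel has the RGB values (1, 2, 3)
patch = [[[1, 2, 3] for _ in range(3)] for _ in range(3)]
# A 3x3x3 kernel: weight 1 on red, 0 on green, -1 on blue at every position
kernel = [[[1, 0, -1] for _ in range(3)] for _ in range(3)]

num_weights = 3 * 3 * 3
value = conv_at_position(patch, kernel, bias=0.5)
print(num_weights, value)
```

Every pixel contributes 1*1 + 2*0 + 3*(-1) = -2, so nine pixels give -18, and the bias brings the result to -17.5.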

2.5 Pooling

TensorFlow gives a function for calculating pooling.
The maximum pooling uses the tf.nn.max_pool function, and the average pooling uses the tf.nn.avg_pool function.

pool = tf.nn.max_pool(input, pooling kernel, kernel strides, padding='SAME')

  • Input: e.g. [batch, 28, 28, 6], giving the number of pictures in a batch, the row and column resolution, and the number of input channels.

  • Pooling kernel: e.g. [1, 2, 2, 1]; only the row and column resolution are described, and the first and last elements are fixed at 1.

  • Kernel strides: e.g. [1, 2, 2, 1], the sliding stride of the pooling kernel; only the two spatial strides are described, and the first and last elements are fixed at 1.

  • padding: whether to use zero padding; 'SAME' pads and 'VALID' does not.
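
A minimal sketch of 2x2 max pooling with stride 2 on one channel is shown below. It is written in plain Python for illustration; the function name is hypothetical, and windows at the border are simply clipped, which is assumed here to match the effect of 'SAME' padding for max pooling.

```python
def max_pool_2x2_sketch(feature_map):
    """2x2 max pooling with stride 2 on a single-channel 2-D list."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows, 2):
        pooled_row = []
        for c in range(0, cols, 2):
            # Collect the values covered by the window, clipped at the border
            window = [feature_map[i][j]
                      for i in range(r, min(r + 2, rows))
                      for j in range(c, min(c + 2, cols))]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [[1, 3, 2, 4],
               [5, 6, 7, 8],
               [3, 2, 1, 0],
               [1, 2, 3, 4]]
pooled = max_pool_2x2_sketch(feature_map)
print(pooled)
```

Pooling halves the row and column resolution and leaves the number of channels unchanged.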

2.6 Dropout

During neural network training, the dropout method is often used to counter the overfitting caused by too many parameters: some neurons are dropped from the network with a certain probability.
The dropping is temporary and applies only during training;
when the network is used for prediction, all neurons are restored.

In practice, dropout is often added when building the forward propagation of a network, to reduce overfitting and speed up model training.
Dropout is generally placed in the fully connected layers.

The dropout function provided by TensorFlow: tf.nn.dropout(previous layer output, probability)

In TensorFlow 1.x, which this article uses, the second positional argument is keep_prob, the probability that a neuron is kept. In TensorFlow 2.x it is rate, the probability that a neuron is dropped.

For example, during training, output = tf.nn.dropout(previous layer output, keep_prob)
randomly zeroes neurons so that each one is kept with probability keep_prob, and the zeroed neurons do not take part in the current round of parameter optimization.
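
The behavior can be sketched in plain Python. This is an illustration under the TensorFlow 1.x keep_prob convention, where kept values are scaled by 1/keep_prob so the expected sum stays the same; dropout_sketch and its rng parameter are names invented for this example.

```python
import random


def dropout_sketch(values, keep_prob, training, rng=None):
    """Zero each value with probability 1 - keep_prob during training."""
    if not training:
        # At test time every neuron is used and nothing is rescaled
        return list(values)
    rng = rng or random.Random()
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]


activations = [1.0] * 1000
train_out = dropout_sketch(activations, 0.5, True, random.Random(0))
test_out = dropout_sketch([1.0, 2.0], 0.5, False)
print(sum(1 for v in train_out if v == 0.0), test_out)
```

With keep_prob = 0.5, roughly half of the outputs are zero and the rest are doubled.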

2.7 Convolutional Neural Network (CNN)

A convolutional neural network can be seen as two parts: one extracts features from the input image, and the other is a fully connected network.
What is fed into the fully connected network is no longer the original image, but the feature information obtained after several rounds of convolution, activation, and pooling.

Since the birth of convolutional neural networks, many classic network structures have emerged, such as LeNet-5, AlexNet, VGGNet, GoogLeNet, and ResNet.
Each of these structures is built on the four operations of convolution, activation, pooling, and full connection.

LeNet-5 is one of the earliest convolutional neural networks, and it effectively solved the problem of recognizing handwritten digits.

3 LeNet-5 network analysis

The LeNet-5 neural network was proposed by Yann LeCun et al. in 1998. The neural network fully considers the correlation of images.

3.1 LeNet-5 neural network structure

  • The input is a 32x32x1 image, i.e. single-channel input;

  • Convolution: kernel size 5x5x1, 6 kernels, stride 1, no zero padding;

  • Pass the convolution result through a nonlinear activation function;

  • Pooling: pooling size 2x2, stride 2, all-zero padding;

  • Convolution: kernel size 5x5x6, 16 kernels, stride 1, no zero padding;

  • Pass the convolution result through a nonlinear activation function;

  • Pooling: pooling size 2x2, stride 2, all-zero padding;

  • Fully connected layer for 10 categories

Analysis:

The input of LeNet-5 is 32x32x1. After convolution with six 5x5x1 kernels, no zero padding and stride 1:
output = (32+0-5)/1+1 = 28, so the output after convolution is 28x28x6.

After the first pooling layer, with pooling size 2x2, all-zero padding and stride 2:
output = input/stride = 28/2 = 14. The pooling layer does not change the depth, which is still 6.

By the same calculation, the output of the second pooling layer is 5x5x16.
The output of the second pooling layer is flattened and sent to the fully connected layer.
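
The shape analysis can be reproduced step by step. The sketch below is only a check of the arithmetic; the helper names are hypothetical, and pooling is assumed to use a 2x2 window with stride 2 as in the analysis.

```python
import math


def conv_valid(size, kernel, stride=1):
    """Output size of a convolution without zero padding."""
    return (size - kernel) // stride + 1


def pool_same(size, stride=2):
    """Output size of pooling with all-zero padding."""
    return math.ceil(size / stride)


def lenet5_shapes(input_size=32):
    """Return the (rows, cols, depth) after each stage of LeNet-5."""
    shapes = [(input_size, input_size, 1)]
    size = conv_valid(input_size, 5)      # first convolution, 6 kernels
    shapes.append((size, size, 6))
    size = pool_same(size)                # first pooling keeps the depth
    shapes.append((size, size, 6))
    size = conv_valid(size, 5)            # second convolution, 16 kernels
    shapes.append((size, size, 16))
    size = pool_same(size)                # second pooling keeps the depth
    shapes.append((size, size, 16))
    return shapes


shapes = lenet5_shapes()
rows, cols, depth = shapes[-1]
flattened = rows * cols * depth
print(shapes, flattened)
```

The flattened length of 400 is what the first fully connected layer receives.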

Features:

  • Convolution (Conv), pooling (ave-pooling), and nonlinear activation function (sigmoid) alternate with each other;

  • Sparse connections between layers to reduce computational complexity

3.2 Fine-tune LeNet-5 structure to adapt to MNIST data

Since the MNIST images are 28x28x1 grayscale images while the input of LeNet-5 is 32x32x1, the structure of LeNet-5 needs to be fine-tuned.

Structure after adjustment (original LeNet-5 values in parentheses):

  • The input is a 28x28x1 image (was 32x32x1), i.e. single-channel input;

  • Convolution: kernel size 5x5x1, 32 kernels (was 6), stride 1, all-zero padding (was no zero padding);

  • Pass the convolution result through a nonlinear activation function;

  • Pooling: pooling size 2x2, stride 2, all-zero padding;

  • Convolution: kernel size 5x5x32 (was 5x5x6), 64 kernels (was 16), stride 1, all-zero padding (was no zero padding);

  • Pass the convolution result through a nonlinear activation function;

  • Pooling: pooling size 2x2, stride 2, all-zero padding;

  • Fully connected layer for 10 categories
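
For the adjusted structure, the sizes and the number of trainable parameters can be estimated as follows. This is a sketch of the arithmetic only: the constants mirror those in mnist_lenet5_forward.py (including the 512-node fully connected layer defined there), and the helper names are made up for this example.

```python
import math

IMAGE_SIZE, NUM_CHANNELS = 28, 1
CONV1_SIZE, CONV1_KERNEL_NUM = 5, 32
CONV2_SIZE, CONV2_KERNEL_NUM = 5, 64
FC_SIZE, OUTPUT_NODE = 512, 10


def same_output_size(size, stride):
    """With all-zero ('SAME') padding the output size is ceil(size / stride)."""
    return math.ceil(size / stride)


def conv_params(kernel_size, in_channels, out_channels):
    """Weights of all kernels plus one bias per kernel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def fc_params(front, rear):
    """Weights plus one bias per node of the rear layer."""
    return front * rear + rear


size = same_output_size(IMAGE_SIZE, 1)    # conv1 keeps 28
size = same_output_size(size, 2)          # pool1 gives 14
size = same_output_size(size, 1)          # conv2 keeps 14
size = same_output_size(size, 2)          # pool2 gives 7
nodes = size * size * CONV2_KERNEL_NUM    # length after flattening

total_params = (conv_params(CONV1_SIZE, NUM_CHANNELS, CONV1_KERNEL_NUM)
                + conv_params(CONV2_SIZE, CONV1_KERNEL_NUM, CONV2_KERNEL_NUM)
                + fc_params(nodes, FC_SIZE)
                + fc_params(FC_SIZE, OUTPUT_NODE))
print(size, nodes, total_params)
```

Most of the parameters sit in the first fully connected layer, which connects the 3136 flattened features to 512 nodes.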

4 Code to implement LeNet-5

The implementation of LeNet-5 neural network on the MNIST data set is mainly divided into three parts:

  • Forward propagation process (mnist_lenet5_forward.py)

  • Backward propagation process (mnist_lenet5_backward.py)

  • Test process (mnist_lenet5_test.py)

4.1 Forward propagation process (mnist_lenet5_forward.py)

This file initializes the weights and biases of the network, defines the convolution and pooling structures, and defines the forward propagation process.

The specific code is as follows:

  • Define the parameters commonly used in the forward propagation process

import tensorflow as tf

# Size and number of channels of the input picture
IMAGE_SIZE = 28
NUM_CHANNELS = 1

# Size and number of the convolution kernels in the first layer
CONV1_SIZE = 5
CONV1_KERNEL_NUM = 32

# Size and number of the convolution kernels in the second layer
CONV2_SIZE = 5
CONV2_KERNEL_NUM = 64

# Number of neurons in the third layer (fully connected)
FC_SIZE = 512

# Number of neurons in the fourth layer (fully connected output)
OUTPUT_NODE = 10

The functions that generate the weights w and the biases b are the same as in the previous article:


def get_weight(shape, regularizer):  # shape of the tensor, weight of the regularization term
    # tf.truncated_normal: normally distributed random values with large deviations cut off; stddev is the standard deviation
    w = tf.Variable(tf.truncated_normal(shape, stddev=0.1))
    # Add L2 regularization for the weights
    if regularizer is not None:
        tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    b = tf.Variable(tf.zeros(shape))
    return b

The convolution and pooling functions are as follows:


def conv2d(x, w):  # x: one input batch, w: convolution weights; 'SAME' uses all-zero padding, 'VALID' uses none
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):  # ksize: the pooling filter is 2x2; strides: the filter moves 2 at a time; 'SAME' uses all-zero padding
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

Define the forward propagation process:

def forward(x, train, regularizer):
    # [Convolution + pooling] first layer
    conv1_w = get_weight([CONV1_SIZE, CONV1_SIZE, NUM_CHANNELS, CONV1_KERNEL_NUM], regularizer)  # initialize the convolution kernels
    conv1_b = get_bias([CONV1_KERNEL_NUM])  # initialize the biases
    conv1 = conv2d(x, conv1_w)
    relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_b))  # nonlinear activation; compared with sigmoid and tanh, relu converges quickly
    pool1 = max_pool_2x2(relu1)

    # [Convolution + pooling] second layer
    conv2_w = get_weight([CONV2_SIZE, CONV2_SIZE, CONV1_KERNEL_NUM, CONV2_KERNEL_NUM], regularizer)
    conv2_b = get_bias([CONV2_KERNEL_NUM])
    conv2 = conv2d(pool1, conv2_w)
    relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_b))
    pool2 = max_pool_2x2(relu2)

    # Convert the output of the last pooling layer, pool2 (a matrix), into the input format of the fully connected layer (a vector)
    pool_shape = pool2.get_shape().as_list()  # dimensions of pool2 as a list; note that pool_shape[0] is the batch size
    nodes = pool_shape[1] * pool_shape[2] * pool_shape[3]  # rows * columns * depth gives the length after flattening
    reshaped = tf.reshape(pool2, [pool_shape[0], nodes])  # turn pool2 into a batch of vectors for the fully connected layers

    # [Fully connected] third layer
    fc1_w = get_weight([nodes, FC_SIZE], regularizer)
    fc1_b = get_bias([FC_SIZE])
    fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_w) + fc1_b)
    if train:
        fc1 = tf.nn.dropout(fc1, 0.5)  # during training, apply dropout and randomly disable half of this layer's outputs

    # [Fully connected] output layer
    fc2_w = get_weight([FC_SIZE, OUTPUT_NODE], regularizer)
    fc2_b = get_bias([OUTPUT_NODE])
    y = tf.matmul(fc1, fc2_w) + fc2_b
    return y

4.2 Backward propagation process (mnist_lenet5_backward.py)

This file trains the parameters of the neural network.

  • Define hyperparameters during training

#coding:utf-8
# Imports needed by the code in this file
import os
import numpy as np
import tensorflow as tf
import mnist_lenet5_forward

BATCH_SIZE = 50  # batch size (previously 100)
LEARNING_RATE_BASE = 0.005  # base learning rate
LEARNING_RATE_DECAY = 0.99  # learning rate decay rate
REGULARIZER = 0.0001  # weight of the regularization term
STEPS = 50000  # number of iterations
MOVING_AVERAGE_DECAY = 0.99  # moving average decay rate
MODEL_SAVE_PATH = "./model/"  # path where the model is saved
MODEL_NAME = "mnist_model"  # name of the model


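  • Complete the backward propagation process

  • Create placeholders for x and y_

  • Call the forward propagation process

  • Compute the loss including regularization

  • Implement the exponentially decaying learning rate

  • Implement the moving average model

  • Bind the two training operations train_step and ema_op to train_op

  • Instantiate a saver for saving and restoring variables, and create a session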

def backward(mnist):
    # Placeholders for x and y_
    x = tf.placeholder(tf.float32, [
        BATCH_SIZE,
        mnist_lenet5_forward.IMAGE_SIZE,
        mnist_lenet5_forward.IMAGE_SIZE,
        mnist_lenet5_forward.NUM_CHANNELS])
    y_ = tf.placeholder(tf.float32, [None, mnist_lenet5_forward.OUTPUT_NODE])

    # Forward propagation
    y = mnist_lenet5_forward.forward(x, True, REGULARIZER)

    # Declare a global step counter, initialized to 0
    global_step = tf.Variable(0, trainable=False)

    # Apply softmax to the output y of the last layer, then compute the cross entropy against the true labels
    ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))

    # The mean of the resulting vector is the loss
    cem = tf.reduce_mean(ce)

    # Add the regularization terms collected in 'losses'
    loss = cem + tf.add_n(tf.get_collection('losses'))

    # Exponentially decaying learning rate
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,
        mnist.train.num_examples / BATCH_SIZE,
        LEARNING_RATE_DECAY,
        staircase=True)

    # Build a gradient descent optimizer with this learning rate; minimize updates the trainable variables to reduce the loss
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)

    # Moving average model; MOVING_AVERAGE_DECAY controls how fast the averages are updated
    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    ema_op = ema.apply(tf.trainable_variables())

    # Bind the two training operations train_step and ema_op to train_op
    with tf.control_dependencies([train_step, ema_op]):
        train_op = tf.no_op(name='train')

    # Instantiate a saver for saving and restoring variables
    saver = tf.train.Saver()

    # Create a session, managed by a Python context manager
    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        # Locate the most recently saved model through the checkpoint file
        ckpt = tf.train.get_checkpoint_state(MODEL_SAVE_PATH)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)

        for i in range(STEPS):
            # Read one batch of data
            xs, ys = mnist.train.next_batch(BATCH_SIZE)
            # Reshape the input xs to match the shape of the network input
            reshaped_xs = np.reshape(xs, (
                BATCH_SIZE,
                mnist_lenet5_forward.IMAGE_SIZE,
                mnist_lenet5_forward.IMAGE_SIZE,
                mnist_lenet5_forward.NUM_CHANNELS))
            # Feed the training images and labels and run one training step
            _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: reshaped_xs, y_: ys})
            if i % 100 == 0:
                print("After %d training step(s), loss on training batch is %g." % (step, loss_value))
                saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)

Run result:

RESTART: G:...\lenet5\mnist_lenet5_backward.py
Extracting ./data/train-images-idx3-ubyte.gz
Extracting ./data/train-labels-idx1-ubyte.gz
Extracting ./data/t10k-images-idx3-ubyte.gz
Extracting ./data/t10k-labels-idx1-ubyte.gz
After 19312 training step(s), loss on training batch is 0.650531.
After 19412 training step(s), loss on training batch is 0.699633.
After 19512 training step(s), loss on training batch is 0.686086.
After 19612 training step(s), loss on training batch is 0.725393.
After 19712 training step(s), loss on training batch is 0.788735.
After 19812 training step(s), loss on training batch is 0.697031.
After 19912 training step(s), loss on training batch is 0.712534.
After 20012 training step(s), loss on training batch is 0.746723.
After 20112 training step(s), loss on training batch is 0.776782.
After 20212 training step(s), loss on training batch is 0.791459.
After 20312 training step(s), loss on training batch is 0.731853.
After 20412 training step(s), loss on training batch is 0.666092.


The loss hovers around 0.7. Further tuning of the training parameters should give better results.
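
The two schedules used during training, the exponentially decaying learning rate and the moving average of the variables, follow simple formulas that can be sketched in plain Python. The sketch assumes 55,000 training examples, so with a batch size of 50 the rate decays once every 1100 steps; the function names are invented for this illustration and are not TensorFlow APIs.

```python
import math


def exponential_decay(base_rate, global_step, decay_steps, decay_rate, staircase=True):
    """learning_rate = base_rate * decay_rate ^ (global_step / decay_steps)."""
    exponent = global_step / decay_steps
    if staircase:
        # With staircase=True the exponent only changes every decay_steps steps
        exponent = math.floor(exponent)
    return base_rate * decay_rate ** exponent


def ema_update(shadow, variable, decay, num_updates=None):
    """One moving average step: shadow = decay * shadow + (1 - decay) * variable."""
    if num_updates is not None:
        # A smaller decay early in training lets the average move faster
        decay = min(decay, (1 + num_updates) / (10 + num_updates))
    return decay * shadow + (1 - decay) * variable


decay_steps = 55000 / 50  # assumed number of batches per pass over the training set
rate_start = exponential_decay(0.005, 0, decay_steps, 0.99)
rate_before = exponential_decay(0.005, 1099, decay_steps, 0.99)
rate_after = exponential_decay(0.005, 1100, decay_steps, 0.99)
first_shadow = ema_update(0.0, 1.0, 0.99, num_updates=0)
print(rate_start, rate_before, rate_after, first_shadow)
```

With these values the learning rate drops by only 1% per pass over the data, which may be one reason the loss decreases slowly.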

4.3 Test process (mnist_lenet5_test.py)

Predict on the test data of the MNIST data set and measure the accuracy of the model.

import time
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import mnist_lenet5_forward
import mnist_lenet5_backward
import numpy as np

TEST_INTERVAL_SECS = 5
# Modified: size of each batch read from the test set
BATCH_SIZE = 500
STEPS = 2

def test(mnist):
    with tf.Graph().as_default() as g:
        x = tf.placeholder(tf.float32, [
            BATCH_SIZE,  # previously mnist.test.num_examples
            mnist_lenet5_forward.IMAGE_SIZE,
            mnist_lenet5_forward.IMAGE_SIZE,
            mnist_lenet5_forward.NUM_CHANNELS])
        y_ = tf.placeholder(tf.float32, [None, mnist_lenet5_forward.OUTPUT_NODE])
        y = mnist_lenet5_forward.forward(x, False, None)

        ema = tf.train.ExponentialMovingAverage(mnist_lenet5_backward.MOVING_AVERAGE_DECAY)
        ema_restore = ema.variables_to_restore()
        saver = tf.train.Saver(ema_restore)

        # Check whether the predicted value equals the actual value
        correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
        # The mean gives the accuracy
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

        while True:
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(mnist_lenet5_backward.MODEL_SAVE_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    saver.restore(sess, ckpt.model_checkpoint_path)

                    # Extract from the checkpoint name the number of iterations at which the model was saved
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]

                    for i in range(STEPS):
                        # Read one batch of data
                        xs, ys = mnist.test.next_batch(BATCH_SIZE)
                        reshaped_x = np.reshape(xs, (
                            BATCH_SIZE,  # previously mnist.test.num_examples
                            mnist_lenet5_forward.IMAGE_SIZE,
                            mnist_lenet5_forward.IMAGE_SIZE,
                            mnist_lenet5_forward.NUM_CHANNELS))
                        # Compute the accuracy on this batch of the test set
                        accuracy_score = sess.run(accuracy, feed_dict={x: reshaped_x, y_: ys})
                        print("After %s training step(s), test accuracy = %g" % (global_step, accuracy_score))
                else:
                    print('No checkpoint file found')
                    return
            # Check for a newer model every 5 seconds
            time.sleep(TEST_INTERVAL_SECS)

def main():
    mnist = input_data.read_data_sets("./data/", one_hot=True)
    test(mnist)

if __name__ == '__main__':
    main()


Run result:

RESTART: G:\TestProject\python\tensorflow\peking_caojian\7CNNbase\lenet5\mnist_lenet5_test2.py
Extracting ./data/train-images-idx3-ubyte.gz
Extracting ./data/train-labels-idx1-ubyte.gz
Extracting ./data/t10k-images-idx3-ubyte.gz
Extracting ./data/t10k-labels-idx1-ubyte.gz
After 21512 training step(s), test accuracy = 0.9842
After 21512 training step(s), test accuracy = 0.9802


The accuracy on the test set is about 98%.
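
The accuracy reported by the test script is the fraction of samples whose largest output matches the position of the 1 in the one-hot label. A minimal plain-Python sketch of that computation follows, with made-up outputs and labels for four samples and three classes.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(outputs, one_hot_labels):
    """Fraction of samples whose predicted class equals the labeled class."""
    correct = sum(1 for y, y_ in zip(outputs, one_hot_labels)
                  if argmax(y) == argmax(y_))
    return correct / len(outputs)


# Hypothetical network outputs and labels
outputs = [[0.1, 2.0, 0.3],
           [1.5, 0.2, 0.1],
           [0.0, 0.1, 3.0],
           [0.9, 0.8, 0.1]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [1, 0, 0]]
score = accuracy(outputs, labels)
print(score)
```

The third sample is predicted as class 2 but labeled as class 1, so three of four predictions are correct.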

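4.4 Testing real image data

Modify the earlier mnist_app.py file. There are two main changes:

  • Image format: previously [1,784], now [1,28,28,1].

  • How images are read: previously the file name was entered manually; now all images in a folder are read automatically.

The images are named in a custom format, so the script can also determine directly whether each prediction is correct and report the overall accuracy.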

import tensorflow as tf
import numpy as np
from PIL import Image
import mnist_lenet5_backward as mnist_backward
import mnist_lenet5_forward as mnist_forward

import os
filedir = os.path.join(os.getcwd(), 'pic')  # folder containing the test pictures
m = 0  # number of correct predictions
n = 0  # number of pictures tested

def restore_model(testPicArr):
    with tf.Graph().as_default() as tg:
        x = tf.placeholder(tf.float32, [
            1,
            mnist_forward.IMAGE_SIZE,
            mnist_forward.IMAGE_SIZE,
            mnist_forward.NUM_CHANNELS])
        y = mnist_forward.forward(x, False, None)
        preValue = tf.argmax(y, 1)

        variable_averages = tf.train.ExponentialMovingAverage(mnist_backward.MOVING_AVERAGE_DECAY)
        variables_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variables_to_restore)

        with tf.Session() as sess:
            ckpt = tf.train.get_checkpoint_state(mnist_backward.MODEL_SAVE_PATH)
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
                preValue = sess.run(preValue, feed_dict={x: testPicArr})
                return preValue
            else:
                print("No checkpoint file found")
                return -1

def pre_pic(picName):
    img = Image.open(picName)  # load the picture to be tested (white background)
    reIm = img.resize((28, 28), Image.LANCZOS)  # resize to 28x28 (LANCZOS is the same filter as the older ANTIALIAS name)
    im_arr = np.array(reIm.convert('L'))
    threshold = 50  # binarization threshold
    for i in range(28):
        for j in range(28):
            im_arr[i][j] = 255 - im_arr[i][j]  # invert the colors (black background)
            if im_arr[i][j] < threshold:  # white digit on black background
                im_arr[i][j] = 0
            else:
                im_arr[i][j] = 255

    # nm_arr = im_arr.reshape([1, 784])  # previously: flatten the picture into one row
    nm_arr = np.reshape(im_arr, (
        1,
        mnist_forward.IMAGE_SIZE,
        mnist_forward.IMAGE_SIZE,
        mnist_forward.NUM_CHANNELS))
    nm_arr = nm_arr.astype(np.float32)
    img_ready = np.multiply(nm_arr, 1.0 / 255.0)  # scale the values into the range 0~1

    return img_ready

'''
def application():
    testNum = input("input the number of test pictures:")
    for i in range(int(testNum)):
        testPic = input("the path of test picture:")
        testPicArr = pre_pic(testPic)
        preValue = restore_model(testPicArr)
        print ("The prediction number is:", preValue)
'''

def application():
    global m, n
    for root, dirs, files in os.walk(filedir):
        for file in files:
            if os.path.splitext(file)[1] == '.png':
                n = n + 1
                imagename = os.path.splitext(file)[0] + '.png'
                testPic = os.path.join(root, file)
                testPicArr = pre_pic(testPic)
                preValue = restore_model(testPicArr)
                print("The %d image name is %s:" % (n, imagename))
                print("The prediction number is:", preValue)
                # The first character of the file name is the true digit
                if int(imagename[0]) == preValue:
                    m = m + 1
                    print("TRUE")
                else:
                    print("FALSE!!!!!!!!!!!!!!!!!!!!")
    print("m = %d,n = %d" % (m, n))
    print("test accuracy = %d%%" % (m / n * 100))

def main():
    application()

if __name__ == '__main__':
    main()

Run result:

RESTART: G:...\lenet5\mnist_app.py
The 1 image name is 0.png:
The prediction number is: [0]
TRUE
The 2 image name is 1.png:
The prediction number is: [1]
TRUE
The 3 image name is 2.png:
The prediction number is: [2]
TRUE
The 4 image name is 3.png:
The prediction number is: [3]
TRUE
The 5 image name is 4.png:
The prediction number is: [4]
TRUE
The 6 image name is 5.png:
The prediction number is: [5]
TRUE
The 7 image name is 6.png:
The prediction number is: [6]
TRUE
The 8 image name is 7.png:
The prediction number is: [7]
TRUE
The 9 image name is 8.png:
The prediction number is: [8]
TRUE
The 10 image name is 9.png:
The prediction number is: [9]
TRUE
m = 10,n = 10
test accuracy = 100%



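This result uses the pictures from the tutorial referenced below (first row of the figure below). With my own handwritten digits (second row of the figure below), the accuracy is 80% (the fully connected network in the previous article reached only 50%).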

![](https://s4.51cto.com/images/blog/202102/27/6c58937b434838954fb797686cc22603.png?x-oss-process=image/watermark,size_16,text_QDUxQ1RP5Y2a5a6i,color_FFFFFF,t_100,g_se,x_10,y_10,shadow_90,type_ZmFuZ3poZW5naGVpdGk=)
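
The pixel handling in pre_pic (invert, binarize with a threshold, then scale to the 0~1 range) can be illustrated without PIL or NumPy. The sketch below works on a nested list of grayscale values; the function name and the sample pixels are invented for the example.

```python
def binarize_inverted(gray_pixels, threshold=50):
    """Invert a white-background picture and map each pixel to 0.0 or 1.0."""
    result = []
    for row in gray_pixels:
        new_row = []
        for p in row:
            inverted = 255 - p          # white background becomes black
            if inverted < threshold:    # faint pixels are treated as background
                new_row.append(0.0)
            else:                       # everything else becomes a full-strength stroke
                new_row.append(1.0)
        result.append(new_row)
    return result


gray = [[255, 250, 0],
        [210, 205, 128]]
ready = binarize_inverted(gray)
print(ready)
```

Pictures prepared this way have white digits on a black background, the same convention as the MNIST training data, which may partly explain why cleanly written digits are recognized better than faint or thin ones.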

Reference: 人工智能实践:Tensorflow笔记 (Artificial Intelligence Practice: TensorFlow Notes)


Origin blog.51cto.com/15060517/2641122