"Introduction to Deep Learning" Chapter 3 Hands-On Practice: Handwritten Digit Recognition


Foreword

I recently read Chapter 3 of the book "Introduction to Deep Learning: Python-Based Theory and Implementation". The chapter ends with a hands-on exercise on handwritten digit recognition, so I wrote a program following the book and ran it. These are my notes.



1. A brief introduction

This small handwritten digit recognition example is implemented with a 3-layer neural network. Note that in this example the network does no "learning" of its own. Before making predictions, we read the file sample_weight.pkl, which stores the weight and bias parameters of an already trained network.

The structure of this neural network is roughly as follows:

Input layer: 784 neurons
Hidden layer 1: 50 neurons
Hidden layer 2: 100 neurons
Output layer: 10 neurons

**Why does the input layer have 784 neurons?** Each input image is 28*28 = 784 pixels, and the image is expanded into a one-dimensional array with exactly 784 elements.
**Why does the output layer have 10 neurons?** Handwritten digit recognition has 10 possible results, the digits 0-9, so there is one neuron per digit. The output layer uses softmax as its activation function, so if the 0th output neuron has the value 0.632, the network assigns a probability of 63.2% to the image being the digit 0.
In general, the number of neurons in the hidden layers is a free design choice. In this example it is fixed at 50 and 100, because the parameters loaded from sample_weight.pkl have those shapes.
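As a rough sanity check on the structure above, the layer sizes determine the shape of every weight matrix and bias vector. The sketch below only does the arithmetic; the names layer_sizes and param_shapes are illustrative and do not come from the book.

```python
# Layer sizes as listed above: input, hidden 1, hidden 2, output
layer_sizes = [784, 50, 100, 10]

# Each layer maps n_in values to n_out values, so its weight matrix has
# shape (n_in, n_out) and its bias vector has n_out entries.
param_shapes = [
    {"W": (n_in, n_out), "b": (n_out,)}
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

# Number of trainable values: all weights plus all biases
total_params = sum(s["W"][0] * s["W"][1] + s["b"][0] for s in param_shapes)

for idx, shapes in enumerate(param_shapes, start=1):
    print(f"layer {idx}: W{idx} {shapes['W']}, b{idx} {shapes['b']}")
print("image pixels:", 28 * 28)
print("total parameters:", total_params)
```

The first weight matrix has 784 rows because it multiplies the flattened 28*28 image, and the last one has 10 columns because there is one output per digit.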

The following get_data() function loads the data. Setting flatten=True expands each input image into a one-dimensional array (784 elements); setting normalize=True scales the pixel values to the range 0-1.

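def get_data():
    # normalize=True scales pixel values to 0-1, as in the book's version of this function
    (x_train, t_train), (x_test, t_test) = load_mnist(flatten=True, normalize=True)
    # Only the test set is needed, because the network is already trained
    return x_test, t_test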
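To make the two options concrete, here is a small sketch of what flattening and normalizing do to one image. It uses a made-up 28*28 array rather than real MNIST data, and it mimics the effect of the two flags with plain NumPy instead of calling load_mnist.

```python
import numpy as np

# Stand-in for one grayscale image: 28 x 28 pixels with values 0-255
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Effect of flatten=True: the 2-D image becomes a 1-D array of 784 values
flat = img.reshape(-1)

# Effect of normalize=True: pixel values are scaled from 0-255 to 0.0-1.0
normalized = flat.astype(np.float32) / 255.0

print(img.shape, "->", flat.shape)
print(normalized.min() >= 0.0, normalized.max() <= 1.0)
```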

The following function init_network(), as the name suggests, initializes the network. It reads the parameters saved in the sample_weight.pkl file and returns them.

def init_network():
    with open("sample_weight.pkl", 'rb') as f:
        network = pickle.load(f)
    return network
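For readers unfamiliar with pickle, the sketch below shows the round trip that init_network() relies on: a dictionary of arrays is written to a file and read back unchanged. The file name demo_weight.pkl and the zero-filled arrays are placeholders. The key names follow the ones predict() reads, but the real contents of sample_weight.pkl are not reproduced here.

```python
import os
import pickle
import tempfile

import numpy as np

# Placeholder parameters with the shapes implied by the layer sizes
params = {
    "W1": np.zeros((784, 50)), "b1": np.zeros(50),
    "W2": np.zeros((50, 100)), "b2": np.zeros(100),
    "W3": np.zeros((100, 10)), "b3": np.zeros(10),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_weight.pkl")
    with open(path, "wb") as f:   # binary mode for writing
        pickle.dump(params, f)
    with open(path, "rb") as f:   # binary mode for reading, as in init_network()
        loaded = pickle.load(f)

for name in sorted(loaded):
    print(name, loaded[name].shape)
```

Because pickle.load() can execute arbitrary code while unpickling, it should only be used on files from a trusted source.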

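The following two functions, sigmoid() and softmax(), are the activation functions: sigmoid is used in the hidden layers and softmax in the output layer.

def sigmoid(a):
    # Squashes each element into the range (0, 1)
    return 1 / (1 + np.exp(-a))


def softmax(a):
    # Subtract the row maximum before exponentiating to avoid overflow
    exp_a = np.exp(a - np.max(a, axis=-1, keepdims=True))
    # Normalize along the last axis so that each sample's outputs sum to 1,
    # including when a is a batch of shape (batch_size, 10)
    return exp_a / np.sum(exp_a, axis=-1, keepdims=True)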
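A quick way to build intuition for these two functions is to evaluate them on small inputs. The sketch below redefines both so that it runs on its own. The softmax here subtracts the row maximum and normalizes along the last axis, a common variant chosen for this example, so that large inputs do not overflow and each row of a batch sums to 1.

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def softmax(a):
    # Shifting by the row maximum leaves the result unchanged but keeps exp() finite
    exp_a = np.exp(a - np.max(a, axis=-1, keepdims=True))
    return exp_a / np.sum(exp_a, axis=-1, keepdims=True)

print(sigmoid(np.array([-2.0, 0.0, 2.0])))   # values below, at, and above 0.5

scores = np.array([[0.3, 2.9, 4.0],
                   [1010.0, 1000.0, 990.0]])  # second row would overflow a naive exp()
probs = softmax(scores)
print(probs)
print(probs.sum(axis=1))                      # each row sums to 1
```

The largest score in each row still gets the largest probability, which is why taking the argmax of the softmax output gives the predicted class.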

The following function predict(network, x) performs the prediction. The argument network holds the parameters of the neural network, and x is the handwritten digit image data. The output y contains the probability for each label.

def predict(network, x):
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']
    a1 = np.dot(x, W1) + b1
    z1 = sigmoid(a1)
    a2 = np.dot(z1, W2) + b2
    z2 = sigmoid(a2)
    a3 = np.dot(z2, W3) + b3
    y = softmax(a3)
    return y
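The shapes flowing through predict() can be traced with random stand-in weights. The sketch below is not the trained network: make_random_network is a hypothetical helper, and its outputs are meaningless as predictions. It only shows that one flattened image yields 10 values and a batch of 100 images yields a 100*10 array.

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def softmax(a):
    exp_a = np.exp(a - np.max(a, axis=-1, keepdims=True))
    return exp_a / np.sum(exp_a, axis=-1, keepdims=True)

def make_random_network(seed=0):
    # Hypothetical helper: random weights with the 784-50-100-10 layout
    rng = np.random.default_rng(seed)
    sizes = [784, 50, 100, 10]
    network = {}
    for k, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:]), start=1):
        network[f"W{k}"] = rng.normal(0.0, 0.1, size=(n_in, n_out))
        network[f"b{k}"] = np.zeros(n_out)
    return network

def predict(network, x):
    z = x
    for k in (1, 2):                      # two hidden layers with sigmoid
        z = sigmoid(np.dot(z, network[f"W{k}"]) + network[f"b{k}"])
    return softmax(np.dot(z, network["W3"]) + network["b3"])

network = make_random_network()
one_image = np.random.default_rng(1).random(784)
batch = np.random.default_rng(2).random((100, 784))
print(predict(network, one_image).shape)   # (10,)
print(predict(network, batch).shape)       # (100, 10)
```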

The following code does four things: ① reads in the data; ② initializes the network; ③ makes predictions; ④ evaluates the accuracy of the predictions.
It also uses "batch processing": several inputs are packed together and passed through the network in one call, which makes the computation more efficient. The batch size in the code below is 100, meaning that 100 images are sent to the network for prediction at a time.

x, t = get_data()  # x: test images, t: test labels
network = init_network()
batch_size = 100  # number of images per batch
accuracy_cnt = 0
for i in range(0, len(x), batch_size):
    x_batch = x[i: i + batch_size]
    y_batch = predict(network, x_batch)  # predict
    # Index of the largest value in each row of y_batch
    p = np.argmax(y_batch, axis=1)
    accuracy_cnt += np.sum(p == t[i:i+batch_size])

print("Accuracy:" + str(float(accuracy_cnt) / len(x)))
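The batching and accuracy bookkeeping can be checked by hand on a tiny made-up example. The arrays below are invented for illustration (6 samples, 3 classes, batch size 4), so the last batch is smaller than the first, which the slicing handles without extra code.

```python
import numpy as np

# Invented "network outputs": one row of class scores per sample
y = np.array([[0.1, 0.8, 0.1],
              [0.7, 0.2, 0.1],
              [0.2, 0.2, 0.6],
              [0.3, 0.4, 0.3],
              [0.9, 0.05, 0.05],
              [0.1, 0.3, 0.6]])
t = np.array([1, 0, 2, 0, 0, 1])   # invented true labels

batch_size = 4
accuracy_cnt = 0
for i in range(0, len(y), batch_size):
    y_batch = y[i:i + batch_size]          # rows i .. i+batch_size-1 (fewer at the end)
    p = np.argmax(y_batch, axis=1)         # predicted class for each row
    accuracy_cnt += np.sum(p == t[i:i + batch_size])

print("Accuracy:", accuracy_cnt / len(y))
```

With axis=1, argmax looks along each row, so it returns one class index per sample. Comparing that array with the matching slice of labels gives a boolean array, and summing it counts the correct predictions in the batch.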

2. Complete code

import pickle
import sys, os
sys.path.append(os.pardir)  # setting that allows importing files from the parent directory
from dataset.mnist import load_mnist
from PIL import Image
import numpy as np


# Define a function for displaying an image
def img_show(img):
    pil_img = Image.fromarray(np.uint8(img))
    pil_img.show()


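# # Load the data. flatten=True expands each input image into a one-dimensional array (784 elements); normalize=True would scale pixel values to 0-1, so it stays False here to keep the 0-255 values for display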
# (x_train, t_train), (x_test, t_test) = load_mnist(flatten=True, normalize=False)
# img = x_train[0]
# label = t_train[0]
# print(label)
# print(img.shape)
# img = img.reshape(28, 28)
# print(img.shape)
#
# img_show(img)


def get_data():
    # Load the data. flatten=True expands each input image into a one-dimensional array (784 elements); normalize=True scales pixel values to 0-1
    (x_train, t_train), (x_test, t_test) = load_mnist(flatten=True, normalize=True)
    return x_test, t_test


def init_network():
    with open("sample_weight.pkl", 'rb') as f:
        network = pickle.load(f)
    return network


def sigmoid(a):
    return 1 / (1 + np.exp(-a))


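def softmax(a):
    # Subtract the row maximum before exponentiating to avoid overflow
    exp_a = np.exp(a - np.max(a, axis=-1, keepdims=True))
    # Normalize along the last axis so that each sample's outputs sum to 1
    return exp_a / np.sum(exp_a, axis=-1, keepdims=True)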


def predict(network, x):
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']
    a1 = np.dot(x, W1) + b1
    z1 = sigmoid(a1)
    a2 = np.dot(z1, W2) + b2
    z2 = sigmoid(a2)
    a3 = np.dot(z2, W3) + b3
    y = softmax(a3)
    return y


x, t = get_data()  # x: test images, t: test labels
network = init_network()
batch_size = 100  # number of images per batch
accuracy_cnt = 0
for i in range(0, len(x), batch_size):
    x_batch = x[i: i + batch_size]
    y_batch = predict(network, x_batch)  # predict
    # Index of the largest value in each row of y_batch
    p = np.argmax(y_batch, axis=1)
    accuracy_cnt += np.sum(p == t[i:i+batch_size])

print("Accuracy:" + str(float(accuracy_cnt) / len(x)))

Program running result:
[Screenshot of the program output; image not available]

3. A small problem with importing the dataset

Running this program requires the MNIST dataset. At first I typed the following line exactly as in the book, but the import failed and the program reported an error:

from dataset.mnist import load_mnist

After searching online for a solution, I found that the dataset package has to be downloaded from the official website of the book. Official website link: http://www.ituring.com.cn/book/1921
On that page, first click "Download with book" on the right, and then click the second "Download".
After downloading, find the mnist.py file in the dataset folder and copy it to the venv\Lib\site-packages\dataset path of the project. Run the program again and the error no longer appears.
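An alternative to copying files into site-packages is to add the folder that contains the dataset package to sys.path before importing, which is the purpose of the sys.path.append(os.pardir) line in the complete code. The sketch below illustrates the mechanism with a throwaway stand-in package. The helper names and the package name demo_dataset_pkg are hypothetical; in practice the argument would be the folder of the downloaded source code that contains dataset.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

def is_importable(module_name):
    """Return True if Python can locate module_name on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:   # raised when the parent package is missing
        return False

def make_importable(source_root):
    """Put source_root at the front of sys.path unless it is already listed."""
    root = str(Path(source_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    importlib.invalidate_caches()
    return root

# Build a stand-in for the downloaded folder: <tmp>/demo_dataset_pkg/mnist.py
# (the temporary folder is left on disk; delete it when done)
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_dataset_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "mnist.py").write_text("def load_mnist():\n    return 'stub'\n")

before = is_importable("demo_dataset_pkg.mnist")
make_importable(tmp_root)
after = is_importable("demo_dataset_pkg.mnist")
print(before, after)
```

Checking with is_importable("dataset.mnist") before running the main program shows whether the import will succeed, regardless of which of the two approaches is used.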

Origin blog.csdn.net/rellvera/article/details/127903549