Detection and recognition of the MNIST dataset with a BP neural network (NumPy version)

1. About the authors

Wang Kai, male, School of Electronic Information, Xi'an Polytechnic University, graduate student (class of 2022).
Research direction: machine vision and artificial intelligence.
Email: [email protected]

Zhang Siyi, female, School of Electronic Information, Xi'an Polytechnic University, graduate student (class of 2022), Zhang Hongwei Artificial Intelligence Research Group.
Research direction: machine vision and artificial intelligence.
Email: [email protected]

2. Introduction to BP neural network

2.1 BP neural network

We build a neural network with two layers (two weight matrices and one hidden layer). The numbers of input and output nodes are fixed by the task: 784 and 10 respectively. There is no hard requirement on the number of hidden nodes, so we take 50 here. With the structure of the network fixed, let's look at what happens inside. Below is the calculation process for a single piece of data:
(figure: computation graph of the two-layer network)
Mathematical formulas (matching the forward pass implemented in section 3.2):

a1 = x·W1 + b1
z1 = sigmoid(a1)
a2 = z1·W2 + b2
y = softmax(a2)

3. BP neural network detection experiment on the MNIST data set

3.1 Read the dataset

Install NumPy: pip install numpy
Install Matplotlib: pip install matplotlib
MNIST is a data set of handwritten digit images: it contains 60000 training samples and 10000 test samples, that is, 60000 train_img with the corresponding train_label, and 10000 test_img with the corresponding test_label.
(figure: sample images from the MNIST dataset)
Among them, train_img and test_img are images of this form: train_img is the training data used to train the neural network, and test_img is the test data used to evaluate it. Each image is 28×28 and is flattened into 28×28 = 784 pixels; each pixel takes a value from 0 to 255, its size representing the gray level. The flattened image thus forms a 1×784 matrix that serves as the input of the neural network. The output of the network is a 1×10 matrix, e.g. [0.01, 0.01, 0.01, 0.04, 0.8, 0.01, 0.1, 0.01, 0.01, 0.01], where each number is the probability the network assigns to the corresponding digit; here 0.8 is the probability assigned to the digit 4 (the element at index 4).
Among them, train_label and test_label are the labels corresponding to the training and test data. Each label can be understood as a 1×10 matrix in one-hot representation (only the correct class is 1). When one_hot_label is True, the labels are returned as one-hot arrays, e.g. [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]: the 1 at index 4 means this label represents the digit 4.
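A tiny check of how the output vector and the one-hot label line up, using the example vectors from above:

import numpy as np

y = np.array([0.01, 0.01, 0.01, 0.04, 0.8, 0.01, 0.1, 0.01, 0.01, 0.01])  # network output
t = np.array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])                              # one-hot label

print(np.argmax(y))  # 4 -> predicted digit
print(np.argmax(t))  # 4 -> true digit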
Data set reading:
load_mnist(normalize=True, flatten=True, one_hot_label=False), where:
normalize: whether to normalize the image pixel values to 0.0~1.0 (normalizing the pixel values helps improve accuracy).
flatten: whether to flatten each image into a one-dimensional array.
one_hot_label: whether to return the labels in one-hot representation.
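For example, a quick shape check after loading (with one_hot_label=True, as in the training script below; the mnist module comes from the repository linked underneath):

from mnist import load_mnist

(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, flatten=True, one_hot_label=True)
print(x_train.shape)  # (60000, 784)
print(t_train.shape)  # (60000, 10)
print(x_test.shape)   # (10000, 784)
print(t_test.shape)   # (10000, 10)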
Complete code and data set download: https://gitee.com/wang-kai-ya/bp.git

3.2 Forward Propagation

When propagating forward, we can construct a function that takes in data and outputs predictions.

def predict(self, x):
    w1, w2 = self.dict['w1'], self.dict['w2']
    b1, b2 = self.dict['b1'], self.dict['b2']

    a1 = np.dot(x, w1) + b1   # input layer -> hidden layer (affine)
    z1 = sigmoid(a1)          # hidden layer activation
    a2 = np.dot(z1, w2) + b2  # hidden layer -> output layer (affine)
    y = softmax(a2)           # class probabilities

    return y
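predict relies on sigmoid and softmax from a functions module (imported by the scripts below). That module is in the linked repository; a minimal sketch of what the two functions might look like, with the usual max-subtraction overflow guard in softmax, is:

import numpy as np

def sigmoid(x):
    # logistic function, mapping activations to (0, 1)
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # subtract the row-wise max for numerical stability before exponentiating
    x = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)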

3.3 Loss function

The network's predicted value for a piece of data is a 1×10 matrix; we measure how far it is from the label with the cross-entropy error:

E = -Σk Tk·log(Yk)

Here Yk is the predicted value of the k-th node and Tk is the one-hot value of the k-th node of the label. Using the earlier example (the prediction for a handwritten 4 and its label):

Yk = [0.01, 0.01, 0.01, 0.04, 0.8, 0.01, 0.1, 0.01, 0.01, 0.01]
Tk = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

It is worth noting that in the cross-entropy error, Tk contains a single 1 and the rest are 0, so the cross-entropy error for this piece of data reduces to E = -log(0.8) ≈ 0.223.
Here, the cross-entropy error is selected as the loss function, and the code is implemented as follows:

def loss(self, y, t):
    t = t.argmax(axis=1)      # one-hot labels -> class indices
    num = y.shape[0]          # batch size
    s = y[np.arange(num), t]  # predicted probability of the true class for each sample

    # the small constant inside the log avoids log(0)
    return -np.sum(np.log(s + 1e-7)) / num
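Checking this against the example above (a batch of one):

import numpy as np

y = np.array([[0.01, 0.01, 0.01, 0.04, 0.8, 0.01, 0.1, 0.01, 0.01, 0.01]])
t = np.array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]])

idx = t.argmax(axis=1)                # [4]
s = y[np.arange(1), idx]              # [0.8]
print(-np.sum(np.log(s + 1e-7)) / 1)  # ≈ 0.223, i.e. -log(0.8)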

3.4 Building a Neural Network

Earlier we defined the prediction method predict and the loss function loss. Together with a recognition-accuracy method accuracy and a gradient method gradient, these are collected into a neural network class, sketched below.
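The full TwoLayerNet is in the repository linked above. A minimal sketch consistent with how the class is used in this article might look like the following; the gradient method here is the standard backpropagation derivation for a sigmoid hidden layer with a softmax/cross-entropy output, reconstructed from how net.gradient is called, not necessarily the author's exact code:

import numpy as np
from functions import sigmoid, softmax  # see the sketch in section 3.2

class TwoLayerNet:
    def __init__(self, input_size, hidden_size, output_size, weight_init_std=0.01):
        # small Gaussian-initialized weights, zero biases
        self.dict = {'w1': weight_init_std * np.random.randn(input_size, hidden_size),
                     'b1': np.zeros(hidden_size),
                     'w2': weight_init_std * np.random.randn(hidden_size, output_size),
                     'b2': np.zeros(output_size)}

    def predict(self, x):
        w1, w2 = self.dict['w1'], self.dict['w2']
        b1, b2 = self.dict['b1'], self.dict['b2']
        a1 = np.dot(x, w1) + b1
        z1 = sigmoid(a1)
        a2 = np.dot(z1, w2) + b2
        return softmax(a2)

    def loss(self, y, t):
        t = t.argmax(axis=1)
        num = y.shape[0]
        s = y[np.arange(num), t]
        return -np.sum(np.log(s + 1e-7)) / num

    def accuracy(self, x, t):
        # fraction of samples whose highest-probability class matches the label
        y = self.predict(x)
        return np.mean(np.argmax(y, axis=1) == np.argmax(t, axis=1))

    def gradient(self, x, t):
        num = x.shape[0]
        # forward pass, keeping the intermediates needed for backprop
        a1 = np.dot(x, self.dict['w1']) + self.dict['b1']
        z1 = sigmoid(a1)
        y = softmax(np.dot(z1, self.dict['w2']) + self.dict['b2'])

        # backward pass
        dy = (y - t) / num                     # softmax + cross-entropy gradient
        grads = {'w2': np.dot(z1.T, dy), 'b2': np.sum(dy, axis=0)}
        dz1 = np.dot(dy, self.dict['w2'].T)
        da1 = dz1 * z1 * (1.0 - z1)            # sigmoid derivative
        grads['w1'] = np.dot(x.T, da1)
        grads['b1'] = np.sum(da1, axis=0)
        return grads

Because softmax is combined with cross-entropy, the output-layer gradient simplifies to (y - t) / batch_size, so no explicit softmax derivative is needed. The training loop that uses these methods is: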

for i in range(epoch):
    batch_mask = np.random.choice(train_size, batch_size)  # draw 100 random indices from 0..59999
    x_batch = x_train[batch_mask]
    y_batch = net.predict(x_batch)
    t_batch = t_train[batch_mask]
    grad = net.gradient(x_batch, t_batch)

    # gradient-descent update of all weights and biases
    for key in ('w1', 'b1', 'w2', 'b2'):
        net.dict[key] -= lr * grad[key]
    loss = net.loss(y_batch, t_batch)  # loss of this batch (predictions from before the update)
    train_loss_list.append(loss)

    # once per epoch, record the accuracy and the current loss
    if i % iter_per_epoch == 0:
        train_acc = net.accuracy(x_train, t_train)
        test_acc = net.accuracy(x_test, t_test)
        train_acc_list.append(train_acc)
        test_acc_list.append(test_acc)
        print('epoch ' + str(int(i / iter_per_epoch)) + ' | train_acc, test_acc, loss: ' +
              str(train_acc) + ', ' + str(test_acc) + ', ' + str(loss))

3.5 Training

import numpy as np
import matplotlib.pyplot as plt
from TwoLayerNet import TwoLayerNet
from mnist import load_mnist

(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)
net = TwoLayerNet(input_size=784, hidden_size=50, output_size=10, weight_init_std=0.01)

epoch = 20400
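# note: despite its name, "epoch" counts mini-batch iterations here; at 600
# iterations per pass (see iter_per_epoch below), 20400 iterations ≈ 34 passes over the data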
batch_size = 100
lr = 0.1

train_size = x_train.shape[0]  # 60000
iter_per_epoch = max(train_size / batch_size, 1)  # 600

train_loss_list = []
train_acc_list = []
test_acc_list = []

Save weights:

np.save('w1.npy', net.dict['w1'])
np.save('b1.npy', net.dict['b1'])
np.save('w2.npy', net.dict['w2'])
np.save('b2.npy', net.dict['b2'])

Result visualization:
(figures: train/test accuracy curves over epochs, and the training loss)
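The script in section 4 plots the accuracy curves; the recorded loss values can be plotted the same way (a small addition, not in the original code):

plt.plot(np.arange(len(train_loss_list)), train_loss_list)
plt.xlabel("iterations")
plt.ylabel("loss")
plt.show()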

3.6 Model Inference

import numpy as np
from mnist import load_mnist
from functions import sigmoid, softmax
import cv2

###################################### data preprocessing
(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)
batch_mask = np.random.choice(100, 1)  # pick 1 random index from 0..99
#print(batch_mask)
x_batch = x_train[batch_mask]

##################################### convert back to an image
arr = x_batch.reshape(28, 28)
cv2.imshow('wk', arr)
key = cv2.waitKey(10000)  # show the image for up to 10 seconds
#np.savetxt('batch_mask.txt', arr)
#print(x_batch)
#train_size = x_batch.shape[0]
#print(train_size)

######################################## model prediction
w1 = np.load('w1.npy')
b1 = np.load('b1.npy')
w2 = np.load('w2.npy')
b2 = np.load('b2.npy')

a1 = np.dot(x_batch, w1) + b1
z1 = sigmoid(a1)
a2 = np.dot(z1, w2) + b2
y = softmax(a2)
p = np.argmax(y, axis=1)  # index of the largest probability = predicted digit

print(p)
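To check the prediction against the ground truth, the corresponding label can be printed as well (a small addition, not in the original script):

print(np.argmax(t_train[batch_mask], axis=1))  # true digit for the sampled image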

Run python reasoning.py to see that the model has a high accuracy rate.

4. Full code

Training script:

import numpy as np
import matplotlib.pyplot as plt
from TwoLayerNet import TwoLayerNet
from mnist import load_mnist

(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)
net = TwoLayerNet(input_size=784, hidden_size=50, output_size=10, weight_init_std=0.01)

epoch = 20400
batch_size = 100
lr = 0.1

train_size = x_train.shape[0]  # 60000
iter_per_epoch = max(train_size / batch_size, 1)  # 600

train_loss_list = []
train_acc_list = []
test_acc_list = []

for i in range(epoch):
    batch_mask = np.random.choice(train_size, batch_size)  # draw 100 random indices from 0..59999
    x_batch = x_train[batch_mask]
    y_batch = net.predict(x_batch)
    t_batch = t_train[batch_mask]
    grad = net.gradient(x_batch, t_batch)

    for key in ('w1', 'b1', 'w2', 'b2'):
        net.dict[key] -= lr * grad[key]
    loss = net.loss(y_batch, t_batch)
    train_loss_list.append(loss)

    # once per epoch, record the accuracy and the current loss
    if i % iter_per_epoch == 0:
        train_acc = net.accuracy(x_train, t_train)
        test_acc = net.accuracy(x_test, t_test)
        train_acc_list.append(train_acc)
        test_acc_list.append(test_acc)
        print('epoch ' + str(int(i / iter_per_epoch)) + ' | train_acc, test_acc, loss: ' +
              str(train_acc) + ', ' + str(test_acc) + ', ' + str(loss))

np.save('w1.npy', net.dict['w1'])
np.save('b1.npy', net.dict['b1'])
np.save('w2.npy', net.dict['w2'])
np.save('b2.npy', net.dict['b2'])

markers = {'train': 'o', 'test': 's'}
x = np.arange(len(train_acc_list))
plt.plot(x, train_acc_list, label='train acc')
plt.plot(x, test_acc_list, label='test acc', linestyle='--')
plt.xlabel("epochs")
plt.ylabel("accuracy")
plt.ylim(0, 1.0)
plt.legend(loc='lower right')
plt.show()

Inference script (reasoning.py):

import numpy as np
from mnist import load_mnist
from functions import sigmoid, softmax
import cv2

###################################### data preprocessing
(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)
batch_mask = np.random.choice(100, 1)  # pick 1 random index from 0..99
#print(batch_mask)
x_batch = x_train[batch_mask]

##################################### convert back to an image
arr = x_batch.reshape(28, 28)
cv2.imshow('wk', arr)
key = cv2.waitKey(10000)  # show the image for up to 10 seconds
#np.savetxt('batch_mask.txt', arr)
#print(x_batch)
#train_size = x_batch.shape[0]
#print(train_size)

######################################## model prediction
w1 = np.load('w1.npy')
b1 = np.load('b1.npy')
w2 = np.load('w2.npy')
b2 = np.load('b2.npy')

a1 = np.dot(x_batch, w1) + b1
z1 = sigmoid(a1)
a2 = np.dot(z1, w2) + b2
y = softmax(a2)
p = np.argmax(y, axis=1)  # index of the largest probability = predicted digit

print(p)

Origin: blog.csdn.net/m0_37758063/article/details/131087214