PaddlePaddle Introductory Practice: Handwritten Digit Recognition

Task requirements

  The goal is to recognize images of handwritten digits 0-9. Specifically, grayscale images of handwritten digits (28 pixels x 28 pixels) are classified into 10 categories (0-9), and the model must be implemented with the PaddlePaddle framework.

Data set and environment

  • Dataset source: MNIST, a classic dataset in the ML field, containing 60,000 training images and 10,000 test images
  • Data description: the data consists of images and labels. Each image is a 28*28 pixel matrix, and each label is one of the 10 digits 0-9
  • Operating environment: PaddlePaddle 2.0 + CUDA 11.1 + PyCharm

  Tips: PaddlePaddle 2.0 ships with new high-level APIs that simplify the model-building process and make it easy to get hands-on quickly!

Model building process

  Next, we conduct the experiment around this process, as shown in the figure:
[Figure: the general process of deep learning]
Further, during model training, we mainly perform the tasks shown in the following figure:

[Figure: the main tasks in model training]
  Tips: for the overall process framework (why we follow these steps), see Hung-yi Lee's lecture on regression (the Pokémon example). For BP neural networks (especially backpropagation and the chain rule), see the Watermelon Book (summary treatment) and the "flower book" Deep Learning (detailed treatment). For gradient optimization, see Li Hang's Statistical Learning Methods and Hung-yi Lee's gradient descent lectures. I will write a summary later and will not expand on it here, so please forgive me~

Data preprocessing

  PaddlePaddle has the MNIST dataset built in, so we can simply call it. Define the training set train_dataset and the evaluation set eval_dataset, then use the Normalize interface to normalize the images.

import paddle
import numpy as np
import matplotlib.pyplot as plt


import paddle.vision.transforms as T

# Data loading and preprocessing
transform = T.Normalize(mean=[127.5], std=[127.5])

# Training dataset
train_dataset = paddle.vision.datasets.MNIST(mode='train', transform=transform)

# Evaluation dataset
eval_dataset = paddle.vision.datasets.MNIST(mode='test', transform=transform)

print('Training set size: {}, validation set size: {}'.format(len(train_dataset), len(eval_dataset)))


  Why do we need to normalize? Here is a preprocessed image for illustration.

print('Image:')
print(type(train_dataset[0][0]))
print(train_dataset[0][0])
print('Label:')
print(type(train_dataset[0][1]))
print(train_dataset[0][1])

# Visualization
plt.figure()
plt.imshow(train_dataset[0][0].reshape([28,28]), cmap=plt.cm.binary)
plt.show()

As shown in the figure, after normalization the pixel values no longer range from 0 to 255 but are compressed to -1 ~ 1, which is convenient for later computation. For the normalization method, we apply the same mean and standard deviation to each channel of the image.

[Figure: normalized pixel matrix and the rendered sample image]

  Next, let's look at what the Normalize interface can do:

class paddle.vision.Normalize(mean=0.0, std=1.0, data_format='CHW', to_rgb=False, keys=None)

We have just mentioned the image normalization method; within this interface, the calculation is:

output[channel] = (input[channel] - mean[channel]) / std[channel]
The parameters used this time are defined as follows:

  • mean: the mean used to normalize each channel
  • std: the standard deviation used to normalize each channel
  • data_format (str, optional): the format of the data; must be 'HWC' or 'CHW'. Default: 'CHW'

This method returns the normalized image as a numpy.ndarray (a NumPy n-dimensional array).
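
As a quick sanity check, here is a minimal NumPy sketch (the pixel values are illustrative) showing how mean=127.5 and std=127.5 compress the original 0 ~ 255 range into -1 ~ 1:

import numpy as np

# Illustrative pixel values: black, mid-gray, white
pixels = np.array([0.0, 127.5, 255.0])

# output[channel] = (input[channel] - mean[channel]) / std[channel]
mean, std = 127.5, 127.5
print((pixels - mean) / std)   # prints [-1.  0.  1.]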

Building the model network

  Now we design the neural network: a fully connected network with a single hidden layer. The input layer has 784 neurons (28 pixels * 28 pixels), the hidden layer has 512 neurons (this can be customized freely), and the output layer has 10 neurons (obviously a multi-class task, with one class per digit 0-9).
The model building code is as follows:

# Build the network structure
network = paddle.nn.Sequential(
    paddle.nn.Flatten(),           # flatten (28, 28) => (784,)
    paddle.nn.Linear(784, 512),    # hidden layer: linear transformation
    paddle.nn.ReLU(),              # activation function
    paddle.nn.Linear(512, 10)      # output layer
)

# Wrap the network in a Model
model = paddle.Model(network)

# Visualize the model structure
model.summary((1, 28, 28))

  Here we define the neural network with Sequential. Note: the Sequential interface is the sequential container provided by PaddlePaddle. Within it:

1. The Flatten interface flattens a Tensor of contiguous dimensions into a one-dimensional Tensor; in short, it flattens the 28*28 pixels.
2. The Linear interface implements the hidden layer and the output layer as linear transformation layers, that is: Out = XW + b
3. The ReLU interface applies the ReLU activation function to the result of each neuron's linear transformation before passing it on to the next layer: ReLU(x) = max(0, x)

After that, the network is wrapped in a Model and summarized to confirm that the model was built successfully.
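
To make the layer shapes concrete, here is a small sketch that traces one random tensor (a stand-in for an image) through freshly constructed copies of the layers above; it only demonstrates shapes, not the trained weights. For reference, the summary should report 784*512 + 512 = 401,920 parameters for the hidden layer and 512*10 + 10 = 5,130 for the output layer, 407,050 in total.

import paddle

# Trace one fake 28x28 "image" through the layer shapes used above
x = paddle.randn([1, 28, 28])
x = paddle.nn.Flatten()(x)           # -> [1, 784]
x = paddle.nn.Linear(784, 512)(x)    # -> [1, 512]
x = paddle.nn.ReLU()(x)              # -> [1, 512], negatives zeroed
x = paddle.nn.Linear(512, 10)(x)     # -> [1, 10], one raw score per digit
print(x.shape)                       # [1, 10]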

Training the model

  Now we configure the loss function, the optimizer, and the evaluation metric. We use gradient descent to optimize the parameters of the neural network; specifically, the Adam optimizer, which dynamically adjusts the learning rate of each parameter. PaddlePaddle provides the corresponding interface; it is worth reading the interface documentation as well as the Adam paper.
  Then we start training the model.

# Configure the optimizer, loss function, and evaluation metric
model.prepare(paddle.optimizer.Adam(learning_rate=0.001, parameters=network.parameters()),
              paddle.nn.CrossEntropyLoss(),
              paddle.metric.Accuracy())

# Start the full training process
model.fit(train_dataset,  # training dataset
          eval_dataset,   # evaluation dataset
          epochs=5,       # total number of training epochs
          batch_size=64,  # batch size used in training
          verbose=1)      # logging verbosity
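
As a side note, fit can also write checkpoints during training via its save_dir and save_freq parameters — a hedged sketch (the 'checkpoints' directory name is arbitrary):

# Same training run, additionally saving a checkpoint every epoch
model.fit(train_dataset,
          eval_dataset,
          epochs=5,
          batch_size=64,
          save_dir='checkpoints',  # writes checkpoints/{epoch} and checkpoints/final
          save_freq=1,
          verbose=1)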


Evaluating the model

  Evaluate the model to obtain its accuracy.


# Evaluate the model; returns the loss and metric configured via the prepare interface
result = model.evaluate(eval_dataset, verbose=1)

print(result)
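
The returned result is a dict keyed by the loss and the metric configured in prepare; assuming the configuration above, the accuracy can be read out like this:

# 'acc' is the name of the paddle.metric.Accuracy metric configured earlier
print('eval accuracy: {}'.format(result['acc']))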


Model prediction

Batch prediction

  Use predict for batch prediction.
  Quoting the official documentation: the high-level API provides a predict interface so that users can conveniently run prediction and verification on a trained model. Simply feed the data to be predicted into the interface; it runs the trained model on that data and returns the computed predictions.
  The return format is a list whose number of elements corresponds to the number of model outputs:

  • Single-output model: [(numpy_ndarray_1, numpy_ndarray_2, …, numpy_ndarray_n)]
  • Multi-output model: [(numpy_ndarray_1, numpy_ndarray_2, …, numpy_ndarray_n), (numpy_ndarray_1, numpy_ndarray_2, …, numpy_ndarray_n), …]
  • Note: numpy_ndarray_n is the prediction obtained by running the corresponding original sample through the model; the count corresponds to the size of the prediction dataset.

# Run batch prediction
result = model.predict(eval_dataset)

# Define a plotting helper
def show_img(img, predict):
    plt.figure()
    plt.title('predict: {}'.format(predict))
    plt.imshow(img.reshape([28, 28]), cmap=plt.cm.binary)
    plt.show()

# Show a few sampled predictions
indexs = [2, 15, 38, 211]

for idx in indexs:
    show_img(eval_dataset[idx][0], np.argmax(result[0][idx]))
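
Beyond spot-checking a few samples, the raw predict output can be reduced to labels for the whole evaluation set. A small sketch, assuming the default predict batch size of 1, so that result[0] holds one score array per sample:

# Convert raw scores into predicted digit labels for every sample
pred_labels = [int(np.argmax(scores)) for scores in result[0]]
# Labels come back as size-1 arrays, hence the int() conversion
true_labels = [int(eval_dataset[i][1]) for i in range(len(eval_dataset))]

# This agreement rate should roughly match the accuracy from evaluate()
acc = np.mean(np.array(pred_labels) == np.array(true_labels))
print('manual accuracy: {:.4f}'.format(acc))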


Single image prediction

  Use model.predict_batch to predict a single image or a small batch of images.


# Read a single image
image = eval_dataset[501][0]

# Predict on a single image
result = model.predict_batch([image])

# Visualize the result
show_img(image, np.argmax(result))
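
Note that the network outputs raw scores (logits), since CrossEntropyLoss applies softmax internally during training. If probabilities are wanted at prediction time, softmax can be applied by hand — a sketch reusing the result from the call above:

import paddle.nn.functional as F

# Turn the raw scores of the single image into per-digit probabilities
logits = paddle.to_tensor(result[0])
probs = F.softmax(logits, axis=-1)
print(probs.numpy().round(3))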


Deployment

Save model

# Save the model for later fine-tuning
model.save('finetuning/mnist')


Resume training for fine-tuning

from paddle.static import InputSpec


# Wrap the network in a Model; the inputs argument is passed here so that the inference model can be saved later
model_2 = paddle.Model(network, inputs=[InputSpec(shape=[-1, 28, 28], dtype='float32', name='image')])

# Load the previously saved checkpoint
model_2.load('finetuning/mnist')

# Model configuration
model_2.prepare(paddle.optimizer.Adam(learning_rate=0.001, parameters=network.parameters()),
                paddle.nn.CrossEntropyLoss(),
                paddle.metric.Accuracy())

# Full training process
model_2.fit(train_dataset, 
            eval_dataset,
            epochs=2,
            batch_size=64,
            verbose=1)


Save the inference model

# Save the model for inference deployment
model_2.save('infer/mnist', training=False)
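
The files saved this way (typically infer/mnist.pdmodel and infer/mnist.pdiparams) can be reloaded for deployment without the original network definition, for example with paddle.jit.load — a minimal sketch reusing eval_dataset from earlier:

# Reload the inference model and run one sample through it
loaded = paddle.jit.load('infer/mnist')
loaded.eval()

sample = paddle.to_tensor(eval_dataset[501][0].reshape([1, 28, 28]).astype('float32'))
print('predicted digit:', np.argmax(loaded(sample).numpy()))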



Source: blog.csdn.net/weixin_40807714/article/details/113625151