MNIST Handwritten Digit Recognition: Quickly Building a Perceptron with MindSpore for Ten-Class Classification

Quickly building a perceptron with a deep learning framework for binary classification of handwritten digits

        This section shows how to use the MindSpore deep learning framework to quickly define a model structure, define a loss function, and run gradient descent, without relying on manual mathematical derivation.

        In the previous section, the perceptron's model structure was implemented from scratch, the loss function and evaluation function were defined, and the gradient descent formula was derived by hand. After 3000 epochs of training, that perceptron reached an accuracy above 0.9 on the binary classification task of distinguishing the handwritten digits 0 and 1.

        Implementing the whole machine learning process from scratch in this way is laborious, especially the manual derivation of the gradient descent formula, which requires some mathematical background. If the loss function is replaced, a new gradient descent formula has to be derived all over again, which makes adjusting the model tedious.

        Fortunately, there are now many deep learning frameworks that provide friendly encapsulations of model structure definition, loss function definition, gradient descent, and so on. A few simple function calls are enough to complete the training process, with no need to worry about how gradient descent is implemented underneath, which greatly improves model development efficiency.
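
        As a minimal illustration of this encapsulation (an addition for clarity, not part of the original article), the sketch below uses MindSpore's ops.GradOperation to obtain a gradient automatically, so no formula has to be derived by hand:

import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor

class Square(nn.Cell):
    def construct(self, x):
        return x * x  # a simple function whose derivative is 2 * x

grad_fn = ops.GradOperation()(Square())  # builds a function that returns d(x*x)/dx
x = Tensor(3.0, mindspore.float32)
print(grad_fn(x))                        # 6.0, computed by the framework rather than derived manually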

Next, we use the MindSpore framework to quickly implement the perceptron model and perform binary classification of handwritten digits.

1. Load the dataset 

Since the load_data_zeros_ones function was already defined in the previous section, it can be called directly:

import os
import sys

import moxing as mox  # moxing is the ModelArts SDK used to copy the dataset from OBS

datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)

# download and unzip the MNIST data package if it is not already present locally
if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip', 
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))

# make load_data_zeros_ones importable from the extracted data directory
sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from load_data_zeros_ones import load_data_zeros_ones

train_images, train_labels, test_images, test_labels = load_data_zeros_ones(datasets_dir)

Number 0, training set size: 5923, test set size: 980

Number 1, training set size: 6742, test set size: 1135

The load_data_zeros_ones function returns data as np.ndarray, while MindSpore requires the Tensor format, so the following code converts the data:

import mindspore
from mindspore import Tensor

# reshape the datasets
train_images = train_images.reshape((-1, 1, 28, 28))
train_labels = train_labels.flatten()

test_images = test_images.reshape((-1, 1, 28, 28))
test_labels = test_labels.flatten()

train_size = len(train_labels)
test_size = len(test_labels)

# convert to the Tensor format supported by MindSpore
train_images = Tensor(train_images, mindspore.float32)
train_labels = Tensor(train_labels, mindspore.int32)
test_images = Tensor(test_images, mindspore.float32)
test_labels = Tensor(test_labels, mindspore.int32)

 2. Define the network structure and evaluation function

        Implementing the perceptron with MindSpore is very simple: just call nn.Dense to define a fully connected layer and add a Sigmoid unit; nn.Dense automatically initializes the weights w and the bias b. The code is as follows:

import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal

class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=2)  # define a fully connected layer
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()

    def construct(self, x):  # the weighted-sum unit and the nonlinear unit are realized by defining the computation
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y

# evaluation function: count how many predictions match the true labels
def evaluate(pred_y, true_y): 
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num
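
        To make the evaluation step concrete, here is a tiny illustration with made-up values (an illustrative addition): ops.Argmax picks the index of the larger of the two outputs for each sample, and that index is compared with the true label.

pred_y_demo = Tensor([[0.2, 0.8], [0.9, 0.1]], mindspore.float32)  # predictions for two samples
true_y_demo = Tensor([1, 1], mindspore.int32)                      # their ground-truth labels
print(ops.Argmax(output_type=mindspore.int32)(pred_y_demo))        # [1 0]
print(evaluate(pred_y_demo, true_y_demo))                          # 1, only the first sample is predicted correctly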

 3. Define the cross-entropy loss function and optimizer

        To train a neural network model, a loss function and an optimizer need to be defined.

        The loss functions supported by MindSpore include SoftmaxCrossEntropyWithLogits, L1Loss, MSELoss, etc. Here the cross entropy loss function SoftmaxCrossEntropyWithLogits is used.

        The optimizers supported by MindSpore include Adam, AdamWeightDecay, SGD, Momentum, etc. Here we use the Momentum optimizer as an example.
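
        As a quick reminder of what the cross-entropy loss computes, the sketch below works through a single sample with made-up logits (an illustrative addition): with sparse=True the label is a class index, and the loss is the negative log of the softmax probability assigned to that class.

import numpy as np

logits = np.array([2.0, 1.0], dtype=np.float32)  # raw network outputs for one sample
label = 0                                        # ground-truth class index
probs = np.exp(logits) / np.exp(logits).sum()    # softmax turns the logits into probabilities
loss = -np.log(probs[label])                     # cross-entropy for the true class
print(round(float(loss), 4))                     # ~0.3133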

# loss function
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')

# create the network
network = Network(28*28)
lr = 0.01
momentum = 0.9

# optimizer
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)

4. Implement the training function

def train(network, max_epochs=50):
    net = WithLossCell(network, net_loss)           # combine the network with the loss function
    train_network = TrainOneStepCell(net, net_opt)  # one call runs forward, backward and a parameter update
    train_network.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0
        output = train_network(train_images, train_labels)  # one training step, returns the loss
        pred_train_labels = network(train_images)  # forward pass
        train_correct_num = evaluate(pred_train_labels, train_labels)
        train_acc = float(train_correct_num) / train_size

        if (epoch == 1) or (epoch % 10 == 0):
            pred_test_labels = network(test_images)
            test_correct_num = evaluate(pred_test_labels, test_labels)
            test_acc = test_correct_num / test_size
            print("epoch: {0}/{1}, train_losses: {2:.4f}, train_acc: {3:.4f}, test_acc: {4:.4f}" \
                  .format(epoch, max_epochs, output.asnumpy(), train_acc, test_acc), flush=True)

5. Configure running information

        Before the formal training starts, use context.set_context to configure the information required for running, such as the execution mode, the backend, and the target hardware.

from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target can be CPU or GPU; when GPU is selected, the MindSpore flavor of the environment also needs to be switched to GPU

 6. Start training

import time
from mindspore.nn import WithLossCell, TrainOneStepCell

max_epochs = 50
start_time = time.time()
print("*"*10 + "开始训练" + "*"*10)
train(network, max_epochs= max_epochs)
print("*"*10 + "训练完成" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("训练总耗时: %.1f s" % cost_time)
**********Start training**********

epoch: 1/50, train_losses: 0.7050, train_acc: 0.3516, test_acc: 0.3759

epoch: 10/50, train_losses: 0.5338, train_acc: 0.9901, test_acc: 0.9943

epoch: 20/50, train_losses: 0.3990, train_acc: 0.9949, test_acc: 0.9981

epoch: 30/50, train_losses: 0.3593, train_acc: 0.9935, test_acc: 0.9972

epoch: 40/50, train_losses: 0.3468, train_acc: 0.9934, test_acc: 0.9967

epoch: 50/50, train_losses: 0.3410, train_acc: 0.9938, test_acc: 0.9967

**********Training complete**********

Total training time: 9.1 s

        From the results above, the perceptron implemented with MindSpore reached an accuracy of 0.9967 after 50 epochs of training in roughly 9 seconds. Compared with the from-scratch implementation in the previous section, training is both faster and more accurate. In other words, developing models with MindSpore is not only more efficient but also yields better results, which is exactly the advantage of using a deep learning framework.

Extending from binary classification to ten classes

        With the MindSpore deep learning framework and its friendly encapsulation of model structure definition, loss function definition, gradient descent, and so on, model training can be done with a few simple function calls, which greatly improves development efficiency.

 1. Load the dataset

 Load the full ten-class dataset:

import os
import numpy as np

import moxing as mox
import mindspore.dataset as ds

datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)
    
if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip', 
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
    
# load the full training and test sets
mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
train_len = mnist_ds_train.get_dataset_size()
test_len = mnist_ds_test.get_dataset_size()
print('Training set size:', train_len, ', test set size:', test_len)

Training set size: 60000, test set size: 10000

 View 10 samples:

from PIL import Image
items_train = mnist_ds_train.create_dict_iterator(output_numpy=True)

train_data = np.array([i for i in items_train])
images_train = np.array([i["image"] for i in train_data])
labels_train = np.array([i["label"] for i in train_data])

batch_size = 10  # view 10 samples
batch_label = [lab for lab in labels_train[:10]]
print(batch_label)
batch_img = images_train[0].reshape(28, 28)
for i in range(1, batch_size):
    batch_img = np.hstack((batch_img, images_train[i].reshape(28, 28)))  # stack the images horizontally so they can be displayed side by side
Image.fromarray(batch_img)
[0, 2, 2, 7, 8, 4, 9, 1, 8, 8]

 2. Process the dataset

        Datasets are very important for training: a good dataset can effectively improve training accuracy and efficiency. Before a dataset is used, some processing is usually performed on it.

         Perform data augmentation operations

import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore.dataset.vision import Inter
from mindspore import dtype as mstype

num_parallel_workers = 1
resize_height, resize_width = 28, 28

# according to the parameters, generate the corresponding data enhancement method
resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)  # resize the image to the target height and width
type_cast_op = C.TypeCast(mstype.int32)  # cast the label data type to int32
hwc2chw_op = CV.HWC2CHW()  # transpose the image tensor from height x width x channel (HWC) to channel x height x width (CHW), which is convenient for training

# using map to apply operations to a dataset
mnist_ds_train = mnist_ds_train.map(operations=resize_op, input_columns="image", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=hwc2chw_op, input_columns="image", num_parallel_workers=num_parallel_workers)

buffer_size = 10000
mnist_ds_train = mnist_ds_train.shuffle(buffer_size=buffer_size)  # shuffle the order of the training set
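
To see what the HWC2CHW transform does concretely, here is a quick check with a dummy array (an illustrative addition; it assumes the vision transform can be called eagerly on a NumPy array):

dummy_img = np.zeros((28, 28, 1), dtype=np.float32)  # height x width x channel
print(CV.HWC2CHW()(dummy_img).shape)                 # (1, 28, 28): channel x height x width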

        Perform data normalization

        Rescale the image pixels to the range [0, 1] and then normalize them with the MNIST mean (0.1307) and standard deviation (0.3081), which improves training efficiency.

rescale = 1.0 / 255.0
shift = 0.0

rescale_nml = 1 / 0.3081
shift_nml = -1 * 0.1307 / 0.3081

rescale_op = CV.Rescale(rescale, shift)  # scale pixel values from [0, 255] to [0, 1]
mnist_ds_train = mnist_ds_train.map(operations=rescale_op, input_columns="image", num_parallel_workers=num_parallel_workers)

rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)  # normalize with the MNIST mean (0.1307) and std (0.3081)
mnist_ds_train = mnist_ds_train.map(operations=rescale_nml_op, input_columns="image", num_parallel_workers=num_parallel_workers)

mnist_ds_train = mnist_ds_train.batch(60000, drop_remainder=True)  # batch the dataset; here the full training set is loaded as a single batch
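
        A quick sanity check with one made-up pixel value (an illustrative addition, not from the original) shows that the two Rescale steps compose to the familiar (x / 255 - mean) / std normalization:

x = 255.0                                # a white pixel
step1 = x * rescale + shift              # first Rescale: scales 255 down to 1.0
step2 = step1 * rescale_nml + shift_nml  # second Rescale: normalizes with mean 0.1307 and std 0.3081
print(step2)                             # ~2.821, the same as (1.0 - 0.1307) / 0.3081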

3. Encapsulate into functions

At this point the preparation of the training data is complete. The operations above can be encapsulated into a load_data_all function and a process_dataset function so that they can be reused later.

        Define data processing operations

   Define a function process_dataset that performs the data augmentation and processing operations:

  • Define the parameters needed for data augmentation and processing.

  • Generate the corresponding data augmentation operations from those parameters.

  • Apply the operations to the dataset with the map function.

  • Shuffle, batch, and repeat the generated dataset.

%%writefile ../datasets/MNIST_Data/process_dataset.py
def process_dataset(mnist_ds, batch_size=32, resize=28, repeat_size=1,
                    num_parallel_workers=1):
    """
    process_dataset for train or test

    Args:
        mnist_ds (Dataset): the MNIST dataset to be processed
        batch_size (int): The number of data records in each group
        resize (int): Scale image data pixels
        repeat_size (int): The number of replicated data records
        num_parallel_workers (int): The number of parallel workers
    """

    import mindspore.dataset.vision.c_transforms as CV
    import mindspore.dataset.transforms.c_transforms as C
    from mindspore.dataset.vision import Inter
    from mindspore import dtype as mstype
    
    # define some parameters needed for data augmentation and processing
    resize_height, resize_width = resize, resize
    rescale = 1.0 / 255.0
    shift = 0.0
    rescale_nml = 1 / 0.3081
    shift_nml = -1 * 0.1307 / 0.3081

    # according to the parameters, generate the corresponding data enhancement method
    resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)
    rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)
    rescale_op = CV.Rescale(rescale, shift)
    hwc2chw_op = CV.HWC2CHW()
    type_cast_op = C.TypeCast(mstype.int32)
    c_trans = [resize_op, rescale_op, rescale_nml_op, hwc2chw_op]

    # using map to apply operations to a dataset
    mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(operations=c_trans, input_columns="image", num_parallel_workers=num_parallel_workers)

    # process the generated dataset
    buffer_size = 10000
    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)
    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)
    mnist_ds = mnist_ds.repeat(repeat_size)

    return mnist_ds

         Define the data loading function

%%writefile ../datasets/MNIST_Data/load_data_all.py
def load_data_all(datasets_dir):
    import os
    if not os.path.exists(datasets_dir):
        os.makedirs(datasets_dir)
    import moxing as mox
    if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
        mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip', 
                      os.path.join(datasets_dir, 'MNIST_Data.zip'))
        os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
    
    # load the full training and test sets
    import mindspore.dataset as ds
    mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
    mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
    train_len = mnist_ds_train.get_dataset_size()
    test_len = mnist_ds_test.get_dataset_size()
    print('Training set size:', train_len, ', test set size:', test_len)
    
    return mnist_ds_train, mnist_ds_test, train_len, test_len
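
        For reference, here is a sketch (not executed in this notebook) of how the two helper files written above could be reused later, assuming ../datasets/MNIST_Data has been added to sys.path:

from load_data_all import load_data_all
from process_dataset import process_dataset

mnist_ds_train, mnist_ds_test, train_len, test_len = load_data_all('../datasets')
mnist_ds_train = process_dataset(mnist_ds_train, batch_size=32)    # mini-batches for training
mnist_ds_test = process_dataset(mnist_ds_test, batch_size=10000)   # the whole test set as one batch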

4. Load the processed test set

import os, sys 
sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from process_dataset import process_dataset
mnist_ds_test = process_dataset(mnist_ds_test, batch_size= 10000)

5. Define the network structure and evaluation function

import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal

class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=10, weight_init=Normal(0.02))  # define a fully connected layer with 10 outputs
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()

    def construct(self, x):  # the weighted-sum unit and the nonlinear unit are realized by defining the computation
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y
    
def evaluate(pred_y, true_y): 
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num

6. Define the cross-entropy loss function and optimizer

# loss function
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')

# create the network
network = Network(28*28)
lr = 0.01
momentum = 0.9

# optimizer
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)

7. Implement the training function

def train(network, mnist_ds_train, max_epochs=50):
    net = WithLossCell(network, net_loss)  # combine the network with the loss function
    net = TrainOneStepCell(net, net_opt)   # one call runs forward, backward and a parameter update
    net.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0
        for inputs_train in mnist_ds_train:
            output = net(*inputs_train)      # one training step on the batch, returns the loss
            train_x = inputs_train[0]
            train_y = inputs_train[1]
            pred_y_train = network(train_x)  # forward pass
            train_correct_num += evaluate(pred_y_train, train_y)
        train_acc = float(train_correct_num) / train_len

        for inputs_test in mnist_ds_test:
            test_x = inputs_test[0]
            test_y = inputs_test[1]
            pred_y_test = network(test_x)
            test_correct_num += evaluate(pred_y_test, test_y)
        test_acc = float(test_correct_num) / test_len
        if (epoch == 1) or (epoch % 10 == 0):
            print("epoch: {0}/{1}, train_losses: {2:.4f}, train_acc: {3:.4f}, test_acc: {4:.4f}" \
                  .format(epoch, max_epochs, output.asnumpy(), train_acc, test_acc), flush=True)

 8. Configure running information

        Before the formal training starts, use context.set_context to configure the information required for running, such as the execution mode, the backend, and the target hardware.

from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target can be CPU or GPU; when GPU is selected, the MindSpore flavor of the environment also needs to be switched to GPU

9. Start training

import time
from mindspore.nn import WithLossCell, TrainOneStepCell

max_epochs = 100
start_time = time.time()
print("*"*10 + "开始训练" + "*"*10)
train(network, mnist_ds_train, max_epochs= max_epochs)
print("*"*10 + "训练完成" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("训练总耗时: %.1f s" % cost_time)

**********Start training**********

epoch: 1/100, train_losses: 2.2832, train_acc: 0.1698, test_acc: 0.1626

epoch: 10/100, train_losses: 2.0465, train_acc: 0.6343, test_acc: 0.6017

epoch: 20/100, train_losses: 1.8368, train_acc: 0.7918, test_acc: 0.7812

epoch: 30/100, train_losses: 1.7602, train_acc: 0.8138, test_acc: 0.8017

epoch: 40/100, train_losses: 1.7245, train_acc: 0.8238, test_acc: 0.7972

epoch: 50/100, train_losses: 1.7051, train_acc: 0.8337, test_acc: 0.8044

epoch: 60/100, train_losses: 1.6922, train_acc: 0.8403, test_acc: 0.8047

epoch: 70/100, train_losses: 1.6827, train_acc: 0.8454, test_acc: 0.8033

epoch: 80/100, train_losses: 1.6752, train_acc: 0.8501, test_acc: 0.8051

epoch: 90/100, train_losses: 1.6689, train_acc: 0.8536, test_acc: 0.8049

epoch: 100/100, train_losses: 1.6635, train_acc: 0.8569, test_acc: 0.8037

**********Training complete**********

Total training time: 430.7 s

So far, with only a small amount of modification to the binary classification code, ten-class handwritten digit recognition has been implemented quickly.

The modification is very simple, but as the results above show, the model only reached about 80% accuracy on the ten-class task after 100 epochs of training, whereas the binary classification model above reached 99% accuracy after just 50 epochs. This indicates that, for a simple model such as the perceptron, ten-class handwritten digit recognition is considerably harder than binary classification.
