Quickly building a perceptron with a deep learning framework for binary classification of handwritten digits
Use the MindSpore deep learning framework to define the model structure, define the loss function, and run gradient descent quickly, without deriving any of the mathematics by hand.
In the previous section, the perceptron's model structure was implemented from scratch, the loss function and evaluation function were defined, and the gradient descent formula was derived by hand. After 3000 epochs of training, the perceptron reached an accuracy above 0.9 on the binary classification of the handwritten digits 0 and 1.
Implementing the entire machine learning process from scratch is laborious, especially the manual derivation of the gradient descent formula, which requires some mathematical knowledge. Whenever the loss function is replaced, a new gradient descent formula has to be derived, which makes adjusting the model tedious.
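As a rough illustration of why a new loss forces a new derivation, the sketch below writes out the hand-derived weight gradient of a single sigmoid neuron for two losses, squared error and binary cross-entropy, and checks each against a finite-difference estimate. The function names and sample values are made up for this illustration; they are not taken from the previous section's code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Single neuron: weighted sum followed by a sigmoid."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def mse_loss(w, b, x, y):
    return 0.5 * (predict(w, b, x) - y) ** 2

def bce_loss(w, b, x, y):
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def mse_grad_w(w, b, x, y):
    # Hand-derived: d/dw [0.5 * (p - y)^2] = (p - y) * p * (1 - p) * x
    p = predict(w, b, x)
    return [(p - y) * p * (1 - p) * xi for xi in x]

def bce_grad_w(w, b, x, y):
    # Hand-derived: d/dw [cross-entropy] = (p - y) * x
    # (the p * (1 - p) factor cancels, so the formula differs from the MSE one)
    p = predict(w, b, x)
    return [(p - y) * xi for xi in x]

def numeric_grad_w(loss, w, b, x, y, eps=1e-6):
    """Central finite differences: a generic check that needs no derivation."""
    grads = []
    for i in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grads.append((loss(w_plus, b, x, y) - loss(w_minus, b, x, y)) / (2 * eps))
    return grads

# Hypothetical sample: three features, target label 1.
w, b, x, y = [0.2, -0.4, 0.1], 0.05, [1.0, 0.5, -1.5], 1.0
mse_err = max(abs(a - n) for a, n in
              zip(mse_grad_w(w, b, x, y), numeric_grad_w(mse_loss, w, b, x, y)))
bce_err = max(abs(a - n) for a, n in
              zip(bce_grad_w(w, b, x, y), numeric_grad_w(bce_loss, w, b, x, y)))
print(mse_err, bce_err)  # both errors should be tiny
```

Each loss needs its own hand-written gradient function, which is exactly the step a framework with automatic differentiation removes.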
Fortunately, many deep learning frameworks are available today. They wrap model structure definition, loss function definition, and the gradient descent implementation behind friendly interfaces, so a few simple function calls are enough to complete the training process. There is no need to care how gradient descent is implemented underneath, which greatly improves the efficiency of model development.
Next, we use the MindSpore framework to quickly implement the perceptron model and perform binary classification of handwritten digits.
1. Load the dataset
Since the load_data_zeros_ones function was defined in the previous section, it can be called directly:
import os
import sys
import moxing as mox

datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)

if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip',
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))

sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from load_data_zeros_ones import load_data_zeros_ones
train_images, train_labels, test_images, test_labels = load_data_zeros_ones(datasets_dir)
Number 0, training set size: 5923, test set size: 980
Number 1, training set size: 6742, test set size: 1135
The load_data_zeros_ones function returns its data as np.ndarray, but MindSpore requires Tensor, so run the following code to convert the data:
import mindspore
from mindspore import Tensor

# reshape the dataset
train_images = train_images.reshape((-1, 1, 28, 28))
train_labels = train_labels.flatten()
test_images = test_images.reshape((-1, 1, 28, 28))
test_labels = test_labels.flatten()
train_size = len(train_labels)
test_size = len(test_labels)

# convert to the Tensor format supported by MindSpore
train_images = Tensor(train_images, mindspore.float32)
train_labels = Tensor(train_labels, mindspore.int32)
test_images = Tensor(test_images, mindspore.float32)
test_labels = Tensor(test_labels, mindspore.int32)
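To make the `-1` in `reshape((-1, 1, 28, 28))` concrete, here is a small pure-Python sketch of how that dimension is inferred. The helper `infer_shape` is hypothetical, and the sketch assumes that every image holds 28 x 28 values and that the loader returns all of the samples counted in the output above.

```python
def infer_shape(num_elements, shape):
    """Resolve a single -1 in `shape` the way a reshape call does."""
    if list(shape).count(-1) != 1:
        raise ValueError("exactly one dimension must be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if num_elements % known != 0:
        raise ValueError("element count does not fit the requested shape")
    return tuple(num_elements // known if dim == -1 else dim for dim in shape)

num_train = 5923 + 6742  # zeros + ones in the training set
num_test = 980 + 1135    # zeros + ones in the test set

train_shape = infer_shape(num_train * 28 * 28, (-1, 1, 28, 28))
test_shape = infer_shape(num_test * 28 * 28, (-1, 1, 28, 28))
print(train_shape)  # (12665, 1, 28, 28)
print(test_shape)   # (2115, 1, 28, 28)
```

The leading dimension is the number of samples, and the inserted `1` is the single grayscale channel.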
2. Define the network structure and evaluation function
Implementing the perceptron model with MindSpore is very simple. You only need to call nn.Dense to define a fully connected layer and add a Sigmoid unit; nn.Dense automatically initializes the weight w and the bias b. The code is as follows:
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal

class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=2)  # define a fully connected layer
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()

    def construct(self, x):  # the weighted-sum unit and the nonlinear unit are realized by defining the computation
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y

# evaluation function: number of correctly classified samples
def evaluate(pred_y, true_y):
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num
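For readers who want to see what the layer and the evaluation function compute, the following is a pure-Python sketch of the same arithmetic on toy numbers, not MindSpore code. It assumes the weight is laid out as (out_channels, in_channels); all values and helper names are invented for the example.

```python
import math

def dense(x, weight, bias):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

def count_correct(pred_batch, labels):
    """Same idea as evaluate(): count samples whose largest output matches the label."""
    return sum(1 for pred, label in zip(pred_batch, labels)
               if argmax(pred) == label)

# Toy setup: 3 input features instead of 784, and 2 output units.
weight = [[0.5, -1.0, 0.25],
          [-0.5, 1.0, 0.0]]
bias = [0.0, 0.1]
batch = [[1.0, 0.0, 2.0],
         [0.0, 1.0, 0.0]]
labels = [0, 1]

preds = [sigmoid(dense(x, weight, bias)) for x in batch]
correct = count_correct(preds, labels)
print(correct)  # 2
```

Because the sigmoid is monotonic, taking the argmax of the sigmoid outputs picks the same class as taking the argmax of the raw weighted sums.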
3. Define the cross-entropy loss function and optimizer
To train a neural network model, a loss function and an optimizer need to be defined.
The loss functions supported by MindSpore include SoftmaxCrossEntropyWithLogits, L1Loss, MSELoss, etc. Here the cross entropy loss function SoftmaxCrossEntropyWithLogits is used.
The optimizers supported by MindSpore include Adam, AdamWeightDecay, SGD, Momentum, etc. Here we use the Momentum optimizer as an example.
# loss function
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
# create the network
network = Network(28*28)
lr = 0.01
momentum = 0.9
# optimizer
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)
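The two objects configured above can be sketched in plain Python. This is a minimal illustration under two assumptions: that `sparse=True` means labels are class indices and `reduction='mean'` averages over the batch, and that the optimizer uses the standard non-Nesterov momentum rule. The function names are hypothetical and this is not the MindSpore implementation.

```python
import math

def sparse_softmax_cross_entropy(logits_batch, labels):
    """Mean over the batch of -log(softmax(logits)[label])."""
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_sum_exp - logits[label]
    return total / len(labels)

def momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One update: v <- momentum * v + grad, then p <- p - lr * v."""
    velocity = momentum * velocity + grad
    return param - lr * velocity, velocity

# Two equal logits give probability 0.5 to each class, so the loss is ln 2.
loss = sparse_softmax_cross_entropy([[0.0, 0.0]], [0])

# Two steps with the same gradient: the second step is larger than the first.
p1, v1 = momentum_step(1.0, 2.0, 0.0)
p2, v2 = momentum_step(p1, 2.0, v1)
print(loss, p1, p2)
```

With a constant gradient the velocity keeps growing, which is why momentum tends to speed up progress along a consistent descent direction.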
4. Implement the training function
def train(network, max_epochs=50):
    # WithLossCell and TrainOneStepCell are imported in step 6, before train() is called
    net = WithLossCell(network, net_loss)
    train_network = TrainOneStepCell(net, net_opt)
    train_network.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0

        # one full-batch training step; returns the loss
        output = train_network(train_images, train_labels)
        pred_train_labels = network.construct(train_images)  # forward pass
        train_correct_num = evaluate(pred_train_labels, train_labels)
        train_acc = float(train_correct_num) / train_size

        if (epoch == 1) or (epoch % 10 == 0):
            pred_test_labels = network.construct(test_images)
            test_correct_num = evaluate(pred_test_labels, test_labels)
            test_acc = test_correct_num / test_size
            print("epoch: {0}/{1}, train_losses: {2:.4f}, tain_acc: {3:.4f}, test_acc: {4:.4f}"
                  .format(epoch, max_epochs, float(output.asnumpy()), train_acc, test_acc), flush=True)
5. Configure running information
Before training, use context.set_context to configure the runtime settings, such as the execution mode, the backend, and the hardware target.
from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target can be CPU or GPU; when GPU is chosen, the MindSpore environment must also be switched to a GPU one
6. Start training
import time
from mindspore.nn import WithLossCell, TrainOneStepCell

max_epochs = 50
start_time = time.time()
print("*"*10 + "Start training" + "*"*10)
train(network, max_epochs=max_epochs)
print("*"*10 + "Training complete" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("Total training time: %.1f s" % cost_time)
**********Start training**********
epoch: 1/50, train_losses: 0.7050, tain_acc: 0.3516, test_acc: 0.3759
epoch: 10/50, train_losses: 0.5338, tain_acc: 0.9901, test_acc: 0.9943
epoch: 20/50, train_losses: 0.3990, tain_acc: 0.9949, test_acc: 0.9981
epoch: 30/50, train_losses: 0.3593, tain_acc: 0.9935, test_acc: 0.9972
epoch: 40/50, train_losses: 0.3468, tain_acc: 0.9934, test_acc: 0.9967
epoch: 50/50, train_losses: 0.3410, tain_acc: 0.9938, test_acc: 0.9967
**********Training complete**********
Total training time: 9.1 s
From the results above, the perceptron implemented with MindSpore reaches a test accuracy of 0.9967 after 50 epochs, trained in about 9 seconds. Compared with the from-scratch implementation in the previous section, which needed 3000 epochs to exceed 0.9, training is faster and the accuracy is higher. Using MindSpore therefore not only makes development more efficient but also gave better results here. This is the advantage of using a deep learning framework such as MindSpore.
Extending from binary to ten-class classification
With the MindSpore deep learning framework, model structure definition, loss function definition, gradient descent and the other steps are available as friendly packaged modules, so model training takes only simple function calls, which greatly improves the efficiency of model development.
1. Load the dataset
Load the full, ten-category dataset
import os
import numpy as np
import moxing as mox
import mindspore.dataset as ds

datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)

if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip',
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))

# read the full training and test samples
mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
train_len = mnist_ds_train.get_dataset_size()
test_len = mnist_ds_test.get_dataset_size()
print('Training set size: {}, test set size: {}'.format(train_len, test_len))
Training set size: 60000, test set size: 10000
View 10 samples:
from PIL import Image

items_train = mnist_ds_train.create_dict_iterator(output_numpy=True)
train_data = list(items_train)
images_train = np.array([i["image"] for i in train_data])
labels_train = np.array([i["label"] for i in train_data])

batch_size = 10  # view 10 samples
batch_label = [lab for lab in labels_train[:batch_size]]
print(batch_label)
batch_img = images_train[0].reshape(28, 28)
for i in range(1, batch_size):
    batch_img = np.hstack((batch_img, images_train[i].reshape(28, 28)))  # concatenate the images horizontally so they can be displayed together
Image.fromarray(batch_img)
[0, 2, 2, 7, 8, 4, 9, 1, 8, 8]
2. Process the dataset
The dataset is very important for training: a good dataset can effectively improve training accuracy and efficiency. Before a dataset is used, it usually goes through some processing.
Perform data augmentation operations
import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore.dataset.vision import Inter
from mindspore import dtype as mstype

num_parallel_workers = 1
resize_height, resize_width = 28, 28

# according to the parameters, generate the corresponding data augmentation operations
resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)  # resize the image
type_cast_op = C.TypeCast(mstype.int32)  # cast the data type to int32
hwc2chw_op = CV.HWC2CHW()  # change the image tensor layout from height x width x channel (HWC) to channel x height x width (CHW) for training

# use map to apply the operations to the dataset
mnist_ds_train = mnist_ds_train.map(operations=resize_op, input_columns="image", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=hwc2chw_op, input_columns="image", num_parallel_workers=num_parallel_workers)

buffer_size = 10000
mnist_ds_train = mnist_ds_train.shuffle(buffer_size=buffer_size)  # shuffle the order of the training set
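The HWC to CHW layout change can be pictured with nested lists. The sketch below is a pure-Python illustration on a tiny 2 x 3 single-channel "image", not the MindSpore operator; `hwc_to_chw` and `shape` are hypothetical helper names.

```python
def hwc_to_chw(image):
    """Rearrange a nested list indexed [row][col][channel] into [channel][row][col]."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [[[image[h][w][c] for w in range(width)]
             for h in range(height)]
            for c in range(channels)]

def shape(nested):
    """Shape of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

# A 2 x 3 image with 1 channel, standing in for a 28 x 28 x 1 MNIST image.
image = [[[1], [2], [3]],
         [[4], [5], [6]]]
chw = hwc_to_chw(image)
print(shape(image), shape(chw))  # (2, 3, 1) (1, 2, 3)
```

The pixel values are unchanged; only the axis order differs, which matches the (1, 28, 28) per-image layout used for the binary task earlier.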
Perform data normalization
First rescale the image data so that each pixel value lies in [0, 1], then standardize it by subtracting 0.1307 and dividing by 0.3081 (the commonly used MNIST mean and standard deviation). This can improve training efficiency.
rescale = 1.0 / 255.0
shift = 0.0
rescale_nml = 1 / 0.3081
shift_nml = -1 * 0.1307 / 0.3081
rescale_op = CV.Rescale(rescale, shift)
mnist_ds_train = mnist_ds_train.map(operations=rescale_op, input_columns="image", num_parallel_workers=num_parallel_workers)
rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)
mnist_ds_train = mnist_ds_train.map(operations=rescale_nml_op, input_columns="image", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.batch(60000, drop_remainder=True)  # batch the dataset; here the full training set is loaded as a single batch
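As a quick check of the arithmetic in the two rescaling steps, the sketch below assumes that a Rescale(rescale, shift) operation computes `pixel * rescale + shift`, and shows that applying both steps equals `(pixel / 255 - 0.1307) / 0.3081`. The helper names are made up for this example.

```python
MEAN, STD = 0.1307, 0.3081

def rescale(pixel, scale, shift):
    """Assumed per-pixel behaviour of a Rescale(scale, shift) step."""
    return pixel * scale + shift

def two_step(pixel):
    unit = rescale(pixel, 1.0 / 255.0, 0.0)          # map 0..255 to 0..1
    return rescale(unit, 1 / STD, -1 * MEAN / STD)   # then (unit - MEAN) / STD

def one_step(pixel):
    return (pixel / 255.0 - MEAN) / STD

lowest, highest = two_step(0), two_step(255)
print(round(lowest, 4), round(highest, 4))
```

After the second step the values are no longer confined to [0, 1]: a black pixel maps to roughly -0.42 and a white pixel to roughly 2.82.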
3. Encapsulate into a function
At this point, the preparation of the training data is completed, and the above operations can be encapsulated into the load_data_all function and the process_dataset function, so that they can be used again later.
Define data processing operations
Define a function process_dataset to perform data enhancement and processing operations:
- Define some parameters needed for data augmentation and processing.
- According to the parameters, generate the corresponding data augmentation operations.
- Apply the data operations to the dataset using the map function.
- Process the generated dataset.
%%writefile ../datasets/MNIST_Data/process_dataset.py
import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore.dataset.vision import Inter
from mindspore import dtype as mstype

def process_dataset(mnist_ds, batch_size=32, resize=28, repeat_size=1,
                    num_parallel_workers=1):
    """
    process_dataset for train or test

    Args:
        mnist_ds (Dataset): MNIST dataset object, e.g. created with ds.MnistDataset
        batch_size (int): The number of data records in each batch
        resize (int): Target height and width of the resized image
        repeat_size (int): The number of times the dataset is repeated
        num_parallel_workers (int): The number of parallel workers
    """
    # define the parameters needed for data augmentation and normalization
    resize_height, resize_width = resize, resize
    rescale = 1.0 / 255.0
    shift = 0.0
    rescale_nml = 1 / 0.3081
    shift_nml = -1 * 0.1307 / 0.3081

    # according to the parameters, generate the corresponding data augmentation operations
    resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)
    rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)
    rescale_op = CV.Rescale(rescale, shift)
    hwc2chw_op = CV.HWC2CHW()
    type_cast_op = C.TypeCast(mstype.int32)
    c_trans = [resize_op, rescale_op, rescale_nml_op, hwc2chw_op]

    # use map to apply the operations to the dataset
    mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(operations=c_trans, input_columns="image", num_parallel_workers=num_parallel_workers)

    # process the generated dataset: shuffle, batch, repeat
    buffer_size = 10000
    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)
    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)
    mnist_ds = mnist_ds.repeat(repeat_size)
    return mnist_ds
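To see what the batch and repeat settings imply, here is a small pure-Python sketch that counts batches and dropped samples. It assumes that `drop_remainder=True` discards the final incomplete batch and that repeating multiplies the number of batches; `batching_summary` is a hypothetical helper, not part of MindSpore.

```python
def batching_summary(num_samples, batch_size, repeat_size=1, drop_remainder=True):
    """Count the batches the pipeline yields and the samples lost per repeat."""
    full_batches, leftover = divmod(num_samples, batch_size)
    if drop_remainder or leftover == 0:
        batches, dropped = full_batches, leftover
    else:
        batches, dropped = full_batches + 1, 0
    return {"batches": batches * repeat_size, "dropped_per_repeat": dropped}

print(batching_summary(60000, 32))     # default batch size on the training set
print(batching_summary(10000, 32))     # default batch size on the test set
print(batching_summary(10000, 10000))  # whole test set as one batch
```

With the default batch size of 32, the 10000-sample test set would lose 16 samples per pass, so an accuracy computed by dividing by the full dataset length would be slightly underestimated. Batch sizes of 60000 and 10000, as used in this document, drop nothing.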
Define the data loading function
%%writefile ../datasets/MNIST_Data/load_data_all.py
import os
import moxing as mox
import mindspore.dataset as ds

def load_data_all(datasets_dir):
    if not os.path.exists(datasets_dir):
        os.makedirs(datasets_dir)

    if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
        mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip',
                      os.path.join(datasets_dir, 'MNIST_Data.zip'))
        os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))

    # read the full training and test samples
    mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
    mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
    train_len = mnist_ds_train.get_dataset_size()
    test_len = mnist_ds_test.get_dataset_size()
    print('Training set size: {}, test set size: {}'.format(train_len, test_len))
    return mnist_ds_train, mnist_ds_test, train_len, test_len
4. Load the processed test set
import os, sys
sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from process_dataset import process_dataset
mnist_ds_test = process_dataset(mnist_ds_test, batch_size= 10000)
5. Define the network structure and evaluation function
import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal

class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=10, weight_init=Normal(0.02))  # define a fully connected layer
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()

    def construct(self, x):  # the weighted-sum unit and the nonlinear unit are realized by defining the computation
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y

# evaluation function: number of correctly classified samples
def evaluate(pred_y, true_y):
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num
6. Define the cross-entropy loss function and optimizer
# loss function
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
# create the network
network = Network(28*28)
lr = 0.01
momentum = 0.9
# optimizer
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)
7. Implement the training function
def train(network, mnist_ds_train, max_epochs=50):
    # WithLossCell and TrainOneStepCell are imported in step 9, before train() is called
    net = WithLossCell(network, net_loss)
    net = TrainOneStepCell(net, net_opt)
    network.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0

        for inputs_train in mnist_ds_train:
            output = net(*inputs_train)  # one training step; returns the loss
            train_x = inputs_train[0]
            train_y = inputs_train[1]
            pred_y_train = network.construct(train_x)  # forward pass
            train_correct_num += evaluate(pred_y_train, train_y)
        train_acc = float(train_correct_num) / train_len

        for inputs_test in mnist_ds_test:
            test_x = inputs_test[0]
            test_y = inputs_test[1]
            pred_y_test = network.construct(test_x)
            test_correct_num += evaluate(pred_y_test, test_y)
        test_acc = float(test_correct_num) / test_len

        if (epoch == 1) or (epoch % 10 == 0):
            print("epoch: {0}/{1}, train_losses: {2:.4f}, tain_acc: {3:.4f}, test_acc: {4:.4f}"
                  .format(epoch, max_epochs, float(output.asnumpy()), train_acc, test_acc), flush=True)
8. Configure running information
Before training, use context.set_context to configure the runtime settings, such as the execution mode, the backend, and the hardware target.
from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target can be CPU or GPU; when GPU is chosen, the MindSpore environment must also be switched to a GPU one
9. Start training
import time
from mindspore.nn import WithLossCell, TrainOneStepCell

max_epochs = 100
start_time = time.time()
print("*"*10 + "Start training" + "*"*10)
train(network, mnist_ds_train, max_epochs=max_epochs)
print("*"*10 + "Training complete" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("Total training time: %.1f s" % cost_time)
**********Start training**********
epoch: 1/100, train_losses: 2.2832, tain_acc: 0.1698, test_acc: 0.1626
epoch: 10/100, train_losses: 2.0465, tain_acc: 0.6343, test_acc: 0.6017
epoch: 20/100, train_losses: 1.8368, tain_acc: 0.7918, test_acc: 0.7812
epoch: 30/100, train_losses: 1.7602, tain_acc: 0.8138, test_acc: 0.8017
epoch: 40/100, train_losses: 1.7245, tain_acc: 0.8238, test_acc: 0.7972
epoch: 50/100, train_losses: 1.7051, tain_acc: 0.8337, test_acc: 0.8044
epoch: 60/100, train_losses: 1.6922, tain_acc: 0.8403, test_acc: 0.8047
epoch: 70/100, train_losses: 1.6827, tain_acc: 0.8454, test_acc: 0.8033
epoch: 80/100, train_losses: 1.6752, tain_acc: 0.8501, test_acc: 0.8051
epoch: 90/100, train_losses: 1.6689, tain_acc: 0.8536, test_acc: 0.8049
epoch: 100/100, train_losses: 1.6635, tain_acc: 0.8569, test_acc: 0.8037
**********Training complete**********
Total training time: 430.7 s
So far, with a few modifications to the binary classification code, ten-class handwritten digit recognition has been implemented quickly.
The modification is simple, but the results above show that after 100 epochs the model reaches only about 80% test accuracy on ten-class handwritten digit recognition, whereas on the earlier two-class task the model reached more than 99% after 50 epochs. This indicates that, for a model as simple as a perceptron, ten-class handwritten digit recognition is harder than binary classification.
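One side observation on the logged loss values, offered as a possible contributing factor rather than as the author's explanation: the network applies a Sigmoid before SoftmaxCrossEntropyWithLogits, so the values entering the softmax are confined to (0, 1). Under that assumption the loss cannot fall below the value reached when the correct unit outputs 1 and every other unit outputs 0. The sketch below computes that bound; `loss_floor` is a hypothetical helper.

```python
import math

def loss_floor(num_classes):
    """Lower bound of softmax cross-entropy when the softmax inputs lie in [0, 1]:
    -log(e^1 / (e^1 + (num_classes - 1) * e^0)) = log(1 + (num_classes - 1) / e)."""
    return math.log(1.0 + (num_classes - 1) / math.e)

floor_2 = loss_floor(2)
floor_10 = loss_floor(10)
print(round(floor_2, 4), round(floor_10, 4))
```

The final logged losses, 0.3410 for two classes and 1.6635 for ten classes, both sit above their respective bounds of about 0.313 and 1.461, so the absolute loss values of the two tasks are not directly comparable.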