Introductory example of Matlab deep learning: building a convolutional neural network (CNN) from scratch (with complete code)

There are already plenty of explanations of convolutional neural networks online, so this article will not repeat them. It is aimed at readers who already understand the basic structure and principles of a CNN. Here we build a simple convolutional network through an example, as a first formal step into deep learning.

We use the most classic case in deep learning, handwritten digit recognition, together with a classic CNN, LeNet, for this study.

Matlab is very powerful: its built-in Deep Learning Toolbox saves us from writing the underlying algorithms and lets us build a convolutional neural network quickly. It also ships with handwritten digit images for practice. The version used here is Matlab 2022a.

Copy the DigitDataset folder into the folder where the code lives, and delete the two Excel files it contains, giving the layout shown below.

The first step is to load the handwritten digit samples. The code is as follows:

clear
clc

% Step 1: load the handwritten digit samples
imds = imageDatastore( ...
    'DigitDataset', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

'IncludeSubfolders',true: include all files in each folder and its subfolders;

'LabelSource','foldernames': assign labels based on folder names and store them in the Labels property.
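As a quick sanity check, the datastore can be inspected before training. A minimal sketch, assuming the DigitDataset folder described above is in the current folder:

```matlab
% Inspect the datastore: label counts and the size of one image
imds = imageDatastore('DigitDataset', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

countEachLabel(imds)      % one row per label with its image count
img = readimage(imds,1);  % read the first image
size(img)                 % should match the input layer size
```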

The second step divides the samples into a training set and a validation set and counts the number of classes. The code is as follows:

% Step 2:
% split the samples into training and validation sets
[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7);

% count the number of class labels in the training set
numClasses = numel(categories(imdsTrain.Labels));

imdsTrain holds the training samples, imdsValidation the validation samples, and 0.7 is the fraction assigned to training.
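Note that splitEachLabel takes the first 70% of the files in each label folder by default; if the file order is not random, passing 'randomized' draws the training files at random instead:

```matlab
% Optional variant: randomized 70/30 split per label
[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7,'randomized');
```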

The third step builds LeNet and analyzes it visually. The code is as follows:

% Step 3: build the LeNet network and analyze it
% build the LeNet network
LeNET = [
    imageInputLayer([60 20 1],'Name','input','Normalization','zscore')
    convolution2dLayer([5 5],6,'Padding','same','Name','Conv1')
    maxPooling2dLayer(2,'Stride',2,'Name','Pool1')
    convolution2dLayer([5 5],16,'Padding','same','Name','Conv2')
    maxPooling2dLayer(2,'Stride',2,'Name','Pool2')
    convolution2dLayer([5 5],120,'Padding','same','Name','Conv3')
    fullyConnectedLayer(84,'Name','fc1')
    fullyConnectedLayer(numClasses,'Name','fc2')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','output')
    ];

% visualize and analyze the constructed network
lgraph = layerGraph(LeNET);
analyzeNetwork(lgraph)

Since the handwritten digit images are 60*20*1, the input layer size must be adjusted to match.

The LeNet structure is as follows:

The first convolutional layer: kernel size 5, 6 filters, 'same' (zero) padding;

The first pooling layer: 2-D max pooling, pool size 2, stride 2;

The second convolutional layer: kernel size 5, 16 filters, 'same' (zero) padding;

The second pooling layer: 2-D max pooling, pool size 2, stride 2;

The third convolutional layer: kernel size 5, 120 filters, 'same' (zero) padding;

The first fully connected layer: output size 84;

The second fully connected layer: output size numClasses;

Softmax layer: converts the fully connected outputs into class probabilities;

Classification layer: assigns the class with the highest probability.
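For the 60*20*1 input, the feature-map sizes can be traced by hand: 'same' padding preserves height and width, and each 2x2, stride-2 pooling halves them:

```matlab
% Feature-map sizes through the network (worked by hand):
% input : 60 x 20 x 1
% Conv1 : 60 x 20 x 6     ('same' padding preserves 60 x 20)
% Pool1 : 30 x 10 x 6     (2 x 2 pooling, stride 2, halves each side)
% Conv2 : 30 x 10 x 16
% Pool2 : 15 x  5 x 16
% Conv3 : 15 x  5 x 120
% fc1   : 84
% fc2   : numClasses
```

analyzeNetwork reports the same sizes layer by layer, so this table can be checked against its output.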

analyzeNetwork lets us inspect the network visually. Running the code produces the following:

The fourth step resizes the images in the training and validation sets to match the LeNet input layer. The code is as follows:

% Step 4: resize the training and validation images to the LeNet input size
inputSize = [60 20 1];
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsValidation);

This step could be omitted in this example, since the images already match the input size.
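Although the resize is a no-op here, augmentedImageDatastore can also apply on-the-fly augmentation during training. An illustrative variant (the translation range is an arbitrary choice for illustration, not part of the original example):

```matlab
% Optional: random translations of up to 3 pixels during training
augmenter = imageDataAugmenter( ...
    'RandXTranslation',[-3 3], ...
    'RandYTranslation',[-3 3]);
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain, ...
    'DataAugmentation',augmenter);
```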

The fifth step configures the training options and trains the network. The code is as follows:

% Step 5: configure training options and train the network
% configure training options
options = trainingOptions('sgdm', ...
    'InitialLearnRate',0.001, ...    
    'MaxEpochs',3, ...               
    'Shuffle','every-epoch', ...
    'ValidationData',augimdsValidation, ...
    'ValidationFrequency',30, ...
    'Verbose',true, ...
    'Plots','training-progress');                
 
% train the network
net = trainNetwork(augimdsTrain,LeNET,options); 

The training options are as follows:

The solver is sgdm (stochastic gradient descent with momentum);

The initial learning rate is 0.001;

The maximum number of epochs is 3;

'Shuffle','every-epoch': shuffle the data before each training epoch;

The validation data is augimdsValidation;

Validation runs every 30 iterations;

'Verbose',true: print training progress in the command window;

'Plots','training-progress': open the training-progress plot.

We can see the training progress as shown below:

The sixth step uses the trained network to classify new input images and computes the accuracy:

% Step 6: use the trained network to classify new input images and compute the accuracy

YPred = classify(net,augimdsValidation);
YValidation = imdsValidation.Labels;
accuracy = sum(YPred == YValidation)/numel(YValidation)

figure
confusionchart(YValidation,YPred)

confusionchart generates a confusion matrix chart so that we can see the LeNet validation results more intuitively.
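Beyond the overall accuracy, per-class recall can be read off the raw confusion counts. A sketch (confusionmat requires the Statistics and Machine Learning Toolbox):

```matlab
% Per-class recall from the confusion matrix
cm = confusionmat(YValidation,YPred);   % rows: true class, columns: predicted
perClassRecall = diag(cm) ./ sum(cm,2)  % fraction of each digit classified correctly
```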

As can be seen, the prediction accuracy is relatively low. LeNet can be improved by adding convolution and pooling layers, or by training a more advanced network such as AlexNet. The complete code for this example is as follows:

clear
clc

% Step 1: load the handwritten digit samples
imds = imageDatastore( ...
    'DigitDataset', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

% Step 2:
% split the samples into training and validation sets
[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7);

% count the number of class labels in the training set
numClasses = numel(categories(imdsTrain.Labels));

% Step 3: build the LeNet network and analyze it
% build the LeNet network
LeNET = [
    imageInputLayer([60 20 1],'Name','input','Normalization','zscore')
    convolution2dLayer([5 5],6,'Padding','same','Name','Conv1')
    maxPooling2dLayer(2,'Stride',2,'Name','Pool1')
    convolution2dLayer([5 5],16,'Padding','same','Name','Conv2')
    maxPooling2dLayer(2,'Stride',2,'Name','Pool2')
    convolution2dLayer([5 5],120,'Padding','same','Name','Conv3')
    fullyConnectedLayer(84,'Name','fc1')
    fullyConnectedLayer(numClasses,'Name','fc2')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','output')
    ];

% visualize and analyze the constructed network
lgraph = layerGraph(LeNET);
analyzeNetwork(lgraph)

% Step 4: resize the training and validation images to the LeNet input size
inputSize = [60 20 1];
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsValidation);
     
% Step 5: configure training options and train the network
% configure training options
options = trainingOptions( ...
    'sgdm', ...
    'InitialLearnRate',0.001, ...    
    'MaxEpochs',3, ...               
    'Shuffle','every-epoch', ...
    'ValidationData',augimdsValidation, ...
    'ValidationFrequency',30, ...
    'Verbose',true, ...
    'Plots','training-progress');                
 
% train the network
net = trainNetwork(augimdsTrain,LeNET,options); 
    
% Step 6: use the trained network to classify new input images and compute the accuracy
YPred = classify(net,augimdsValidation);
YValidation = imdsValidation.Labels;
accuracy = sum(YPred == YValidation)/numel(YValidation)
   
figure
confusionchart(YValidation,YPred)


Origin blog.csdn.net/qq_51942551/article/details/127395585