[Graduation project] Design and implementation of convolutional neural network classifier [source code + paper]

Design and Implementation of Convolutional Neural Network Classifier

Contents
1. Purpose of the experiment
2. Experimental method
3. Experimental environment
4. Dataset introduction
5. Experimental steps
6. Experimental results
7. Classifier evaluation
8. Experimental summary

1. Purpose of the experiment
Classifier design is a major research direction in machine learning and is widely used in big data and artificial intelligence. A classification model maps each record in a database to one of a set of categories and can then be applied to predict the class of new data. As research has progressed, new classifier methods have continually emerged in machine learning, from early decision trees and nearest-neighbor methods to support vector machines, neural networks, and others, all of which have been used effectively in specific fields.
Designing and using a classifier involves three main steps. First, a sample data set is selected and split into two parts, training samples and test samples. Next, a classifier is chosen and its learning algorithm is run on the training samples to produce a classification model. Finally, the test samples are used to evaluate the model's predictions and measure the classifier's performance.
Two main criteria measure the quality of a classifier: classification accuracy and generalization ability. The support vector machine (SVM) is a typical traditional classifier, first used in recognition tasks such as face recognition and handwritten character recognition. However, methods of this type require laborious feature engineering, and manually extracted features often weaken the model's generalization ability, which greatly limits their development. Meanwhile, driven by big data and growing hardware computing power, classifiers based on convolutional neural networks have attracted increasing attention.
A convolutional neural network (CNN) is a feed-forward neural network with a deep structure that includes convolution operations. It is a machine learning method that captures the inherent patterns of sample data by abstracting its underlying features. A CNN uses convolution instead of general matrix multiplication to extract features, and it has two key properties, local connectivity and parameter sharing, which match the locality and repeatability of image features well. As a result, CNNs perform excellently in image classification and recognition.
Therefore, this paper chooses the convolutional neural network as the classifier design method and implements the classifier by classifying and recognizing handwritten digit images.
2. Experimental method
1. Convolutional neural network classifier structure
Figure 1 shows the general structure of a convolutional neural network. After the LeNet-5 model was proposed, the basic composition of a CNN became largely fixed: the network generally consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer.

Figure 1 The general structure of convolutional neural network
The input layer and the output layer come first and last, handling image input and result output respectively. The middle of the network usually alternates several groups of convolutional and pooling layers for feature extraction, and the last few layers before the output layer are usually fully connected. The fully connected layers fit the extracted features nonlinearly, mapping the feature space to the label space, and the final result is passed to the output layer. The specific components of the convolutional neural network classifier are briefly introduced below.
① Input layer
This layer preprocesses the raw image data. The preprocessing includes mean removal, i.e., centering every dimension of the input data at 0 so that the samples are moved back to the origin of the coordinate system, and normalization, i.e., scaling the amplitudes of all dimensions to the same range to reduce the interference caused by their different value ranges.
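As a small illustration, the two preprocessing steps can be written in MATLAB as follows. This is only a sketch and is not part of the experiment's code; the file name simply follows the naming pattern of the extraction program in Section 4.

% A minimal sketch of the preprocessing described above, assuming X is one
% 28x28 grayscale image with values in [0,255] (illustration only).
X = double(imread('TrainImage_00001.bmp'));   % one sample image (name pattern from Section 4)
X = X - mean(X(:));                           % mean removal: centre the data at 0
X = X / max(abs(X(:)));                       % normalization: scale amplitudes to [-1,1]
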
② Convolutional layer
Each convolutional layer in a convolutional neural network consists of several convolution units. The convolution operation treats the input image as an n-dimensional matrix and slides an m×m convolution kernel over it from left to right and from top to bottom. At each window position, the kernel is convolved with the corresponding window, i.e., corresponding elements are multiplied and the products are summed. The calculation process is shown in the figure below.

Figure 2 Convolution calculation process
The purpose of the convolution operation is to extract various features of the input. The first convolutional layer may only extract low-level features such as edges, lines, and corners; deeper layers can iteratively extract more complex features from these lower-level ones. The result of each convolution is then passed through a nonlinear activation function, which gives the classifier its nonlinear mapping capability.
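The following minimal MATLAB sketch illustrates the window-by-window calculation just described (corresponding elements multiplied and then summed); the 5×5 input and 3×3 kernel are illustrative values only and are not part of the experiment's code.

% Minimal sketch of the sliding-window convolution (cross-correlation form)
in  = magic(5);                    % example 5x5 input "image"
k   = [1 0 -1; 1 0 -1; 1 0 -1];    % example 3x3 kernel
m   = size(k,1);
out = zeros(size(in,1)-m+1, size(in,2)-m+1);
for r = 1:size(out,1)
    for c = 1:size(out,2)
        window   = in(r:r+m-1, c:c+m-1);    % current m-by-m window
        out(r,c) = sum(window(:).*k(:));    % multiply corresponding elements, then sum
    end
end
disp(out)
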
③ Pooling layer
The pooling layer is sandwiched between consecutive convolutional layers to compress the amount of data and parameters and reduce overfitting; its main function is to downsample the feature maps. The two common pooling methods are max pooling and mean pooling.
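The sketch below shows 2×2 max pooling with stride 2 on a small feature map; the values are illustrative only, and the experiment itself uses the maxPooling2dLayer of the toolbox.

% Minimal sketch of 2x2 max pooling with stride 2 (illustration only)
fm = [1 3 2 4; 5 6 7 8; 9 2 3 1; 4 5 6 7];  % example 4x4 feature map
pooled = zeros(2,2);
for r = 1:2
    for c = 1:2
        block = fm(2*r-1:2*r, 2*c-1:2*c);   % current 2x2 window
        pooled(r,c) = max(block(:));        % max pooling; mean(block(:)) gives mean pooling
    end
end
disp(pooled)                                % result: [6 8; 9 7]
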
④ Fully connected layer
This layer is equivalent to a multilayer perceptron and performs the classification in the overall convolutional neural network. After the preceding "convolution-activation-pooling" stages, the features of the data have been refined repeatedly, so the input to the fully connected layer is much purer and the quality of the output classification is correspondingly higher.
2. Convolutional neural network classifier training method
A CNN is essentially an input-to-output mapping: it can learn a large number of mapping relationships between inputs and outputs without any precise mathematical expression linking them. As long as the convolutional network is trained with known patterns, it acquires the ability to map between input-output pairs. The specific training process is as follows (a minimal sketch of the adjustment in steps ⑥ and ⑦ is given after the list):
① Select the training group by randomly taking N samples from the sample set;
② Set each weight and threshold to a small random value close to 0, and initialize the precision-control parameter and the learning rate;
③ Take one input pattern from the training group, feed it to the network, and specify its target output vector;
④ Compute the output vectors of the intermediate layers and the actual output vector of the network;
⑤ Compare the elements of the actual output vector with those of the target vector to compute the output error; the errors of the hidden units in the intermediate layers must also be computed;
⑥ Compute in turn the adjustment of each weight and the adjustment of each threshold;
⑦ Adjust the weights and the thresholds;
⑧ After M iterations, check whether the error meets the accuracy requirement; if not, return to ③ and continue iterating; if it does, go to the next step;
⑨ When training ends, save the weights and thresholds to a file. At this point the weights can be considered stable and the classifier has been formed. For subsequent training, the weights and thresholds can be loaded directly from the file without re-initialization.
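As a minimal sketch of the adjustment in steps ⑥ and ⑦ (variable names, sizes, and the placeholder gradients are illustrative only; the actual training in this paper is done later by trainNetwork), the update amounts to gradient descent on the weights and thresholds:

lr = 0.01;                 % learning rate initialized in step ②
W  = 0.01*randn(10,84);    % small random weights close to 0, step ②
b  = zeros(10,1);          % thresholds (biases), step ②
dW = randn(10,84);         % placeholder for the error gradient w.r.t. W from step ⑥
db = randn(10,1);          % placeholder for the error gradient w.r.t. b from step ⑥
W  = W - lr*dW;            % step ⑦: adjust the weights
b  = b - lr*db;            % step ⑦: adjust the thresholds
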
According to the structural principles above, the convolutional neural network classifier designed in this paper consists of an input layer, three convolutional blocks, and an output layer, as shown in the following figure:

Figure 3 The structure of the convolutional neural network classifier designed in this paper
3. Experimental environment
Since convolutional neural networks are well supported in MATLAB, this paper uses MATLAB 2019 as the development environment, the operating system is Windows 10, and training is performed on the GPU.
MATLAB is commercial mathematical software produced by MathWorks in the United States and is used in data analysis, deep learning, image processing and computer vision, signal processing, and other fields. Using MATLAB as the development environment allows existing library functions to be called directly, keeps the programming relatively simple, and makes the logic easy to read. MATLAB's neural network toolbox provides a large number of functions for building neural networks, training them, and displaying results; entering help nnet in the command window shows the version and functions of the toolbox. The toolbox also provides many GUI tools, such as nctool, nftool, and nprtool, which make graphical display convenient.
4. Dataset introduction
The classifier designed in this paper is applied to a handwritten digit image data set. The most commonly used standard data set of this kind is the MNIST data set.
The data set was initiated by the National Institute of Standards and Technology (NIST). It collects handwritten digit images from 250 different people, 50% of whom were high school students and 50% staff of the Census Bureau. The purpose of collecting the data set was to enable algorithms to classify and recognize handwritten digits. Since 1998, the data set has been widely used in machine learning and deep learning to test algorithms such as linear classifiers, k-nearest neighbors, support vector machines, neural networks, and convolutional neural networks. Yann LeCun and others published the paper "Gradient-Based Learning Applied to Document Recognition", which proposed the LeNet-5 convolutional neural network for the first time and used this data set to recognize handwritten digits.
The official website of the MNIST dataset is http://yann.lecun.com/exdb/mnist/. It provides the data set for download as four files, whose names and contents are as follows:

Figure 4 MNIST data set files
Of these files, the training set contains 60,000 images and labels and the test set contains 10,000 images and labels; every image is a 28×28×1 black-and-white image. The first 5,000 images of the test set come from the training set of the original NIST project, and the last 5,000 come from its test set. The first 5,000 are more regular than the last 5,000 because they were written by U.S. Census Bureau employees, while the last 5,000 were written by students.
After downloading and decompressing the four files, you will find that they are not a series of pictures but files in .idx1-ubyte and .idx3-ubyte format. This is the IDX data format, whose basic layout is as follows:

Figure 5 MNIST data set format
In this format the magic number occupies 4 bytes: the first 2 bytes are always 0, the 3rd byte indicates the data type, and the 4th byte indicates the number of dimensions. A separate program is therefore needed to extract the data set. The main MATLAB code used in this paper to read the training images and their labels is as follows:

%% Read training image data file
[FileName,PathName] = uigetfile('*.*','Select training image data file train-images.idx3-ubyte');
TrainFile = fullfile(PathName,FileName);
fid = fopen(TrainFile,'r');
a = fread(fid,16,'uint8');                        % 16-byte IDX header
MagicNum = ((a(1)*256+a(2))*256+a(3))*256+a(4);
ImageNum = ((a(5)*256+a(6))*256+a(7))*256+a(8);
ImageRow = ((a(9)*256+a(10))*256+a(11))*256+a(12);
ImageCol = ((a(13)*256+a(14))*256+a(15))*256+a(16);
savedirectory = uigetdir('','Choose the path to save the training images:');
h_w = waitbar(0,'Please wait, processing...');
for i = 1:1000                                    % only the first 1000 images are extracted
    b = fread(fid,ImageRow*ImageCol,'uint8');
    c = reshape(b,[ImageRow ImageCol]);
    d = c';                                       % IDX stores pixels row-wise, so transpose
    e = 255-d;                                    % invert so the digits appear dark on white
    e = uint8(e);
    savepath = fullfile(savedirectory,['TrainImage_' num2str(i,'%05d') '.bmp']);
    imwrite(e,savepath,'bmp');
    waitbar(i/ImageNum);
end
close(h_w);
fclose(fid);

%% Read training image label file
clc; clear all;
filename = './train-labels-idx1-ubyte/train-labels.idx1-ubyte';
fp = fopen(filename, 'rb');
assert(fp ~= -1, ['Could not open ', filename]);
magic = fread(fp, 1, 'int32', 0, 'ieee-be');
assert(magic == 2049, ['Bad magic number in ', filename]);
numLabels = fread(fp, 1, 'int32', 0, 'ieee-be');
Trainlabels = fread(fp, inf, 'unsigned char');
fclose(fp);

The data set read by the above program is shown in the figure below:

Figure 5 MNIST training set display

Figure 6 MNIST test set display
Since the MNIST data set contains tens of thousands of samples, only part of it is used as the experimental data in this paper. The training set consists of the 1,000 images TrainImage_00001-TrainImage_01000 extracted from the MNIST training set, and the test set consists of the 1,000 images TestImage_00001-TestImage_01000.
5. Experimental steps
1. Import the data set.
First, convert the original data files into images using the MNIST reading method from Section 4, then read the images and convert them into the format required for training a neural network classifier in MATLAB. The main program is as follows:

%% Read training data
load('Trainlabels.mat')                  % training labels
mineSetTrain = imageDatastore('./MinistTrain1000/', 'FileExtensions','.bmp',...
    'IncludeSubfolders',false);
mLabelsTrain = cell(1000,1);             % 1000 training samples in total
for i = 1:1000
    mLabelsTrain{i,1} = num2str(Trainlabels(i));
end
mLabelsTrain1 = categorical(mLabelsTrain);
mineSetTrain.Labels = mLabelsTrain1;
disp(countEachLabel(mineSetTrain))

%% Read test data
load('Testlabels.mat')                   % test labels
mineSetTest = imageDatastore('./MinistTest1000/', 'FileExtensions','.bmp',...
    'IncludeSubfolders',false);
mLabelsTest = cell(1000,1);              % 1000 test samples in total
for i = 1:1000
    mLabelsTest{i,1} = num2str(Testlabels(i));
end
mLabelsTest1 = categorical(mLabelsTest);
mineSetTest.Labels = mLabelsTest1;
disp(countEachLabel(mineSetTest))

Through the above process, the 1,000 training images and 1,000 test images are converted into mineSetTrain and mineSetTest respectively; each datastore contains 1,000 28×28×1 images and their corresponding label information.
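To verify the import, the contents of the datastores can be previewed; the snippet below is only one possible way to do this and is not part of the original experimental steps.

% Optional preview of the first 20 imported training images and their labels
figure;
for k = 1:20
    subplot(4,5,k);
    imshow(readimage(mineSetTrain, k));
    title(char(mineSetTrain.Labels(k)));
end
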
2. Define the convolutional neural network structure
Set up the hierarchy of the neural network according to the requirements. The specific structure was given in the experimental method in Section 2: the network consists of an input layer, three convolutional blocks, and an output layer. A batch normalization layer is added to each convolutional block to reduce overfitting, the activation function is reluLayer, the pooling layers use max pooling, and the output layer performs classification through the softmax function.
The main procedure for defining the convolutional neural network structure is as follows:
%% Define the convolutional neural network structure
layers = [
    % input layer
    imageInputLayer([28 28 1], 'Name','input')
    % convolutional block 1
    convolution2dLayer(5, 6,'Padding',2, 'Name','conv1')   % convolution
    batchNormalizationLayer('Name','batchNormal1')         % batch normalization
    reluLayer('Name','reluLayer1')                         % activation
    maxPooling2dLayer(2, 'Stride',2, 'Name','maxPool1')    % pooling
    % convolutional block 2
    convolution2dLayer(5,16, 'Name','conv2')
    batchNormalizationLayer('Name','batchNormal2')
    reluLayer('Name','reluLayer2')
    maxPooling2dLayer(2, 'Stride',2, 'Name','maxPool2')
    % convolutional block 3
    convolution2dLayer(5,120, 'Name','conv3')
    batchNormalizationLayer('Name','batchNormal3')
    reluLayer('Name','reluLayer3')
    % output layers
    fullyConnectedLayer(10, 'Name','full')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classOutput')
    ];
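
Before training, the layer array can optionally be checked for consistency; the following sketch assumes the Deep Learning Toolbox functions layerGraph and analyzeNetwork are available and is not part of the original steps.

% Optional check of the defined layer structure before training
lgraph = layerGraph(layers);   % build a layer graph from the array above
figure; plot(lgraph);          % quick visual check of the layer order
% analyzeNetwork(layers)       % interactive report of layer sizes and parameters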

3. Set the convolutional network training parameters and train.
The main training parameters are the choice of optimizer and the maximum number of epochs; the other parameters use the defaults given by MATLAB, and the options are set to display training information and progress. The network is trained with trainNetwork, whose inputs are the training datastore mineSetTrain, the network structure layers, and the training options. The main program is as follows:

%% Set convolutional network training parameters
options = trainingOptions('sgdm',...          % sgdm optimizer
    'MaxEpochs',30,...                        % maximum of 30 epochs
    'Verbose',true,...                        % display training information
    'Plots','training-progress');             % show the training progress plot
%% Train the neural network
net = trainNetwork(mineSetTrain, layers, options);
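
Since Section 3 states that training is performed on the GPU, the execution environment can also be set explicitly in trainingOptions; this is optional (the default 'auto' already uses a supported GPU when one is present) and is shown only as a sketch.

% Optional: force GPU execution explicitly
options = trainingOptions('sgdm',...
    'MaxEpochs',30,...
    'ExecutionEnvironment','gpu',...   % requires a supported GPU and Parallel Computing Toolbox
    'Verbose',true,...
    'Plots','training-progress');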

4. Test the classification effect of the data set
After training, the resulting classifier model net is tested with the classify function, which takes the network model net and the test datastore mineSetTest as input. The performance of the classifier is measured by comparing its predictions YPred with the true labels YValidation. The main program is as follows:

%% Test the classification effect on the data set
YPred = classify(net, mineSetTest);
YValidation = mineSetTest.Labels;
% Calculate the accuracy rate (percentage of correctly classified test samples)
accuracy = sum(YPred == YValidation)/numel(YValidation)*100;
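
Besides the overall accuracy, the errors can optionally be broken down by digit class; the sketch below assumes the confusionchart function (available in MATLAB R2018b and later) and is not part of the original steps.

% Optional per-class error inspection
figure;
confusionchart(YValidation, YPred);   % rows: true digit, columns: predicted digit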

6. Experimental results
1. Training process
With the above methods, data set, and steps, the information printed during classifier training is as follows:

Figure 7 Print information during training

Figure 8 Training progress
Figures 7 and 8 show that as the number of iterations increases, the mini-batch accuracy on the training set rises markedly from 16% to 100% and the mini-batch loss falls to 0.01, indicating an excellent fit on the training set.
2. Classification effect on the test set
The test results show that 925 of the 1,000 test images are classified correctly, an overall recognition rate of 92.5%. Figure 9 shows the classification results on the test set; the overall effect is good, and misclassified labels account for only a small proportion.

Figure 9 Classification effect of the test set
Figure 10 shows the classification results for 10 test samples, of which the second is recognized incorrectly: a 5 is recognized as an 8. As the figure shows, the '5' in the misclassified image is written irregularly and to some extent resembles an '8', which is why the classifier recognizes it as 8.

Figure 10 Test set sample display
7. Classifier evaluation
According to the above experimental results, the convolutional neural network classifier designed in this paper achieves good results in classifying and recognizing handwritten digit images, reaching a recognition rate of 92.5% with a training set of only 1,000 samples. In addition, when trained with 5,000 samples, the resulting classifier reaches an accuracy of 95% on the test set, with reliability somewhat higher than manual recognition.
This shows that the convolutional neural network classifier is an efficient classifier that requires no manual feature extraction. It is remarkably effective at classifying and recognizing images, and the recognition rate can be improved further by enlarging the training set, which makes classification in the era of big data much more convenient.
8. Experimental summary
In this classifier design and implementation experiment, I implemented a classification system, completed the recognition of handwritten digit images, and achieved good results. Through studying and analyzing convolutional neural networks during the experiment, I came to appreciate the appeal and power of neural networks and became interested in neural network research. At the same time, implementing the required functions, especially the data set processing and the classifier design, improved my MATLAB programming ability; through continuous trial and error the desired effect was gradually achieved, which will also be of great help to follow-up study and work.