MATLAB simulation of face identity recognition based on the GoogLeNet deep learning network

Table of contents

1. Preview of algorithm operation renderings

2. Algorithm running software version

3. Some core programs

4. Overview of algorithm theory

5. Algorithm complete program engineering


1. Preview of algorithm operation renderings

2. Algorithm running software version

MATLAB 2022a

3. Some core programs

.....................................................................
% Define the ranges for random translation (in pixels) and random scaling
Pixel_Range = [-30 30];
Scale_Range = [0.9 1.1];

% Configure the random augmentation applied to the training images
Image_Augmenter = imageDataAugmenter(...
    'RandXReflection', true, ...
    'RandXTranslation', Pixel_Range, ...
    'RandYTranslation', Pixel_Range, ...
    'RandXScale', Scale_Range, ...
    'RandYScale', Scale_Range);

% Resize the images to match GoogLeNet's input layer
Augmented_Training_Image = augmentedImageDatastore(Input_Layer_Size(1:2), Training_Dataset, ...
    'DataAugmentation', Image_Augmenter);

Augmented_Validation_Image = augmentedImageDatastore(Input_Layer_Size(1:2), Validation_Dataset);



% Specify the training options
Size_of_Minibatch = 5;
% Validate roughly once per epoch
Validation_Frequency = floor(numel(Augmented_Training_Image.Files)/Size_of_Minibatch);
Training_Options = trainingOptions('sgdm', ...
    'MiniBatchSize', Size_of_Minibatch, ...
    'MaxEpochs', 10, ...
    'InitialLearnRate', 3e-4, ...
    'Shuffle', 'every-epoch', ...
    'ValidationData', Augmented_Validation_Image, ...
    'ValidationFrequency', Validation_Frequency, ...
    'Verbose', false, ...
    'Plots', 'training-progress');

% Start training
net = trainNetwork(Augmented_Training_Image, Layer_Graph, Training_Options);
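
The ValidationFrequency value in the listing is chosen so that validation runs about once per epoch. The arithmetic can be sketched as follows, assuming a hypothetical training set of 203 images (the real image count is not given in the source):

```matlab
% Hypothetical image count; the actual dataset size is not stated in the source.
Number_of_Training_Images = 203;
Size_of_Minibatch = 5;
Max_Epochs = 10;                       % same value as 'MaxEpochs' in the listing

% trainNetwork discards the final incomplete mini-batch of each epoch,
% so one epoch contains floor(N / batch size) iterations.
Iterations_per_Epoch = floor(Number_of_Training_Images / Size_of_Minibatch);

% Validating every Iterations_per_Epoch iterations means once per epoch.
Validation_Frequency = Iterations_per_Epoch;
Total_Iterations = Iterations_per_Epoch * Max_Epochs;
Scheduled_Validation_Runs = floor(Total_Iterations / Validation_Frequency);

fprintf('%d iterations per epoch, %d scheduled validation runs\n', ...
    Iterations_per_Epoch, Scheduled_Validation_Runs);
```

If the training set held fewer images than one mini-batch, this formula would give 0, so a guard such as max(1, ...) may be worth adding for very small datasets.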

4. Overview of algorithm theory

      VGG was proposed in 2014 by the Visual Geometry Group (VGG) at the University of Oxford. In that year's ImageNet competition (ILSVRC 2014) it won first place in the localization task and second place in the classification task. First place in the classification task went to GoogLeNet, a deep network architecture developed by Google; the name "GoogLeNet" is a tribute to LeNet. Face identity recognition is one of the important applications of computer vision and deep learning, and in recent years deep learning networks have achieved remarkable results on it.

1. Principle
1.1 Deep Learning and Convolutional Neural Network (CNN)
      Deep learning is a machine learning technique that builds multi-layer neural networks, loosely modeled on the connections between neurons in the human brain, to learn from data and extract features. The convolutional neural network (CNN) is an important deep learning architecture that is especially well suited to image recognition. It extracts and learns image features layer by layer through convolutional layers, pooling layers, and fully connected layers.
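
The three layer types can be illustrated on a tiny example. The following is only a sketch in base MATLAB, with a toy 6x6 input, a hand-picked kernel, and hypothetical fully connected weights; a real CNN learns its kernels and weights from data:

```matlab
% Toy 6x6 grayscale "image" (a magic square used only as sample data).
Toy_Image = magic(6);

% 3x3 vertical-edge kernel. CNN layers compute cross-correlation, while
% conv2 flips the kernel, so the kernel is rotated by 180 degrees first.
Kernel = [1 0 -1; 1 0 -1; 1 0 -1];
Feature_Map = conv2(Toy_Image, rot90(Kernel, 2), 'valid');   % 4x4 result

% ReLU activation: negative responses are set to zero.
Activated = max(Feature_Map, 0);

% 2x2 max pooling with stride 2 reduces the 4x4 map to 2x2.
Pooled = zeros(2, 2);
for Row = 1:2
    for Col = 1:2
        Block = Activated(2*Row-1:2*Row, 2*Col-1:2*Col);
        Pooled(Row, Col) = max(Block(:));
    end
end

% A fully connected layer is a matrix product on the flattened features.
FC_Weights = ones(3, numel(Pooled)) / numel(Pooled);   % hypothetical 3-class weights
Scores = FC_Weights * Pooled(:);
disp(size(Scores));
```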

1.2 GoogLeNet
      GoogLeNet is a deep convolutional neural network proposed by Google in 2014. It introduces the Inception module to address the large parameter counts and heavy computation of deep networks. An Inception module applies convolution kernels of different sizes and a pooling operation in parallel, then concatenates their outputs along the channel dimension to obtain a richer feature representation.
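
The concatenation step can be sketched with placeholder arrays. The channel counts below follow the inception (3a) row of the GoogLeNet paper; the zero arrays merely stand in for real branch activations:

```matlab
% Branch outputs of one Inception module on a 28x28 feature map.
Branch_1x1  = zeros(28, 28, 64);    % 1x1 convolution branch
Branch_3x3  = zeros(28, 28, 128);   % 1x1 reduction followed by 3x3 convolution
Branch_5x5  = zeros(28, 28, 32);    % 1x1 reduction followed by 5x5 convolution
Branch_Pool = zeros(28, 28, 32);    % 3x3 max pooling followed by 1x1 projection

% All branches keep the same spatial size, so they can be concatenated
% along the third (channel) dimension.
Module_Output = cat(3, Branch_1x1, Branch_3x3, Branch_5x5, Branch_Pool);
disp(size(Module_Output));
```

The spatial size is unchanged and only the channel count grows, which is why every branch must pad its convolution or pooling to preserve the 28x28 extent.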

Highlights of the GoogLeNet network
1. Introduces the Inception structure, which fuses feature information at different scales
2. Uses 1x1 convolution kernels for dimensionality reduction and channel mapping
3. Adds two auxiliary classifiers to help training
4. Replaces the large fully connected layers with an average pooling layer, keeping only the final classification layer (dramatically reduces model parameters)
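
The effect of the 1x1 convolution in highlight 2 can be checked with simple arithmetic. This sketch counts the weights (biases ignored) of the 3x3 branch of inception (3a), using the channel counts from the GoogLeNet paper: 192 input channels, a 1x1 reduction to 96 channels, and 128 output channels.

```matlab
Input_Channels   = 192;   % channels entering the module
Reduced_Channels = 96;    % channels after the 1x1 reduction
Output_Channels  = 128;   % channels produced by the 3x3 convolution

% 3x3 convolution applied directly to all input channels.
Direct_Weights = 3 * 3 * Input_Channels * Output_Channels;

% 1x1 reduction first, then the 3x3 convolution on fewer channels.
Reduced_Weights = 1 * 1 * Input_Channels * Reduced_Channels ...
                + 3 * 3 * Reduced_Channels * Output_Channels;

fprintf('direct: %d weights, with 1x1 reduction: %d weights (%.1f%% of direct)\n', ...
    Direct_Weights, Reduced_Weights, 100 * Reduced_Weights / Direct_Weights);
```

The saving is larger for the 5x5 branch, where the reduction goes from 192 channels down to 16 before the expensive convolution.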

2. Implementation process
2.1 Data preprocessing
      In the face identity recognition task, the first step is to prepare a labeled dataset containing face images of the different identities. The images are then preprocessed, including resizing and normalization, so that they can be fed into the deep learning network.
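
In the core program above, resizing is handled by augmentedImageDatastore. The same two steps can be sketched by hand on a single image. This example uses a synthetic random image of hypothetical size in place of a real face photo, and scaling to [0, 1] is only one illustrative normalization; the normalization a given network expects depends on how it was trained.

```matlab
% Synthetic RGB image standing in for a face photo (hypothetical size).
Raw_Image = uint8(randi([0 255], 300, 400, 3));

% GoogLeNet expects 224x224x3 inputs.
Target_Size = [224 224];
Resized_Image = imresize(Raw_Image, Target_Size);

% Scale pixel values from [0, 255] to [0, 1].
Normalized_Image = double(Resized_Image) / 255;

disp(size(Normalized_Image));
```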

2.2 Building a network model
      The GoogLeNet model can be built with deep learning frameworks such as TensorFlow or PyTorch, or with MATLAB's Deep Learning Toolbox as in this simulation. The basic structure of the model includes convolutional layers, pooling layers, Inception modules, and a final fully connected layer. The network can be modified and customized for the specific task.
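
One common way to customize the network in MATLAB is the usual transfer-learning pattern: load the pretrained GoogLeNet and replace its last layers. The sketch below is an assumption about how this might be done, not the omitted part of the core program, which may differ. It requires the Deep Learning Toolbox Model for GoogLeNet Network support package; Number_of_Classes and the new layer names are hypothetical.

```matlab
% Hypothetical number of identities in the face dataset.
Number_of_Classes = 10;

% Load the pretrained network (requires the GoogLeNet support package).
Pretrained_Net = googlenet;
Input_Layer_Size = Pretrained_Net.Layers(1).InputSize;
Layer_Graph = layerGraph(Pretrained_Net);

% Replace the 1000-class ImageNet classifier with one sized for this task.
% The larger learn rate factors let the new layer train faster than the
% pretrained layers.
New_FC_Layer = fullyConnectedLayer(Number_of_Classes, ...
    'Name', 'Face_FC', ...
    'WeightLearnRateFactor', 10, ...
    'BiasLearnRateFactor', 10);
Layer_Graph = replaceLayer(Layer_Graph, 'loss3-classifier', New_FC_Layer);

% Replace the output layer so that class names come from the new data.
New_Output_Layer = classificationLayer('Name', 'Face_Output');
Layer_Graph = replaceLayer(Layer_Graph, 'output', New_Output_Layer);
```

The layer names 'loss3-classifier' and 'output' are those used by the pretrained GoogLeNet shipped with MATLAB; a network from another source may name them differently.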

2.3 Data input and training
      The preprocessed images are fed into the network, and its output is obtained by forward propagation. The output is compared with the labels to compute the loss function, and the loss is backpropagated to update the network's weights. Over many training iterations, the network gradually learns discriminative features and its recognition accuracy improves.
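
The loop of forward propagation, loss computation, backpropagation, and weight update can be sketched on a toy problem. This example trains a single softmax layer on synthetic data with SGD plus momentum, the 'sgdm' solver family named in the training options. It is not GoogLeNet, it uses the full batch instead of mini-batches, and all sizes and hyperparameters are hypothetical.

```matlab
% Synthetic data: three well-separated clusters stand in for the features
% of three identities.
Number_of_Classes = 3;
Number_of_Features = 4;
Labels = repmat(1:Number_of_Classes, 1, 10);              % 30 samples
Centers = 3 * eye(Number_of_Features, Number_of_Classes);
Features = Centers(:, Labels) + 0.5 * randn(Number_of_Features, numel(Labels));

% One-hot targets, one column per sample.
Targets = zeros(Number_of_Classes, numel(Labels));
Targets(sub2ind(size(Targets), Labels, 1:numel(Labels))) = 1;

Weights = zeros(Number_of_Classes, Number_of_Features);
Velocity = zeros(size(Weights));      % momentum buffer
Learn_Rate = 0.1;                     % illustrative, not the 3e-4 used above
Momentum = 0.9;
Number_of_Iterations = 50;
Loss_History = zeros(1, Number_of_Iterations);

for Iteration = 1:Number_of_Iterations
    % Forward propagation: class scores, then a numerically stable softmax.
    Scores = Weights * Features;
    Scores = Scores - max(Scores, [], 1);
    Probabilities = exp(Scores) ./ sum(exp(Scores), 1);

    % Cross-entropy loss against the labels, averaged over the samples.
    Loss_History(Iteration) = -mean(log(sum(Probabilities .* Targets, 1)));

    % Backpropagation: gradient of the loss with respect to the weights.
    Gradient = (Probabilities - Targets) * Features' / numel(Labels);

    % SGD with momentum update.
    Velocity = Momentum * Velocity - Learn_Rate * Gradient;
    Weights = Weights + Velocity;
end

fprintf('loss: %.4f at the first iteration, %.4f at the last\n', ...
    Loss_History(1), Loss_History(end));
```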

2.4 Model evaluation and tuning
      Before training, the dataset is divided into a training set, a validation set, and a test set. During training, the validation set is used to monitor the model's performance and to guide tuning. Finally, the model is evaluated on the test set to obtain its recognition accuracy on unseen data.
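
A minimal sketch of the split and of the accuracy metric follows, assuming a hypothetical dataset of 100 images and a 70/15/15 split; neither the ratios nor the toy labels come from the source.

```matlab
% Hypothetical dataset size and split ratios.
Number_of_Images = 100;
Train_Ratio = 0.7;
Validation_Ratio = 0.15;

% Shuffle the sample indices, then cut them into three disjoint parts.
Shuffled_Index = randperm(Number_of_Images);
Number_of_Train = round(Train_Ratio * Number_of_Images);
Number_of_Validation = round(Validation_Ratio * Number_of_Images);

Train_Index = Shuffled_Index(1:Number_of_Train);
Validation_Index = Shuffled_Index(Number_of_Train + 1 : Number_of_Train + Number_of_Validation);
Test_Index = Shuffled_Index(Number_of_Train + Number_of_Validation + 1 : end);

% Accuracy is the fraction of predictions that match the true labels.
True_Labels      = [1 2 3 1 2 3 1 2];   % toy ground truth
Predicted_Labels = [1 2 3 1 2 1 1 3];   % toy predictions
Accuracy = mean(Predicted_Labels == True_Labels);

fprintf('train/validation/test sizes: %d/%d/%d, toy accuracy: %.2f\n', ...
    numel(Train_Index), numel(Validation_Index), numel(Test_Index), Accuracy);
```

For image datastores, splitEachLabel performs a comparable split per class, which keeps every identity represented in each subset.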

5. Algorithm complete program engineering



Origin blog.csdn.net/aycd1234/article/details/132587726