Computer Vision (5) - Image Classification

Table of contents

5. Image classification

5.1 AlexNet

5.2 VGG

5.3 GoogLeNet / Inception

5.3.1 Inception V1 

5.3.2 Inception V2

5.3.3 Inception V3 

5.3.4 Inception V4 

5.4 ResNet Residual Network

5.4.1 ResNet

5.4.2 ResNeXt

5.5 CNN design guidelines

5. Image classification

5.1 AlexNet


5.2 VGG


5.3 GoogLeNet / Inception

5.3.1 Inception V1 

The fully connected layers at the end of the network contain far too many parameters, so Inception V1 improves on this:

GAP: global average pooling replaces the fully connected layers by averaging each feature map down to a single value.
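A minimal PyTorch sketch of the idea (the 1024 channels and 1000 classes are assumptions matching GoogLeNet's setup): GAP collapses each feature map to one number, so the classifier head needs only a single small linear layer instead of several huge fully connected layers.

```python
import torch
import torch.nn as nn

# GAP head (GoogLeNet-style): average each spatial map to a scalar first,
# so the only remaining dense weights are a single 1024 -> 1000 layer.
gap_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),   # (N, 1024, 7, 7) -> (N, 1024, 1, 1)
    nn.Flatten(),              # (N, 1024)
    nn.Linear(1024, 1000),     # ~1M parameters instead of tens of millions
)

x = torch.randn(2, 1024, 7, 7)  # dummy batch of final feature maps
print(gap_head(x).shape)        # torch.Size([2, 1000])
```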

Where gradients vanish in the deep network, auxiliary classifiers attached to intermediate layers pass the gradient signal back in during training (they are discarded at inference time).
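A hedged sketch of how the auxiliary losses are typically combined during training (`logits_main`, `logits_aux1`, `logits_aux2` are assumed model outputs; 0.3 is the weight reported in the GoogLeNet paper):

```python
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

def googlenet_loss(logits_main, logits_aux1, logits_aux2, targets):
    # The auxiliary losses inject gradient at intermediate layers,
    # counteracting vanishing gradients in the deep stack.
    loss_main = criterion(logits_main, targets)
    loss_aux = criterion(logits_aux1, targets) + criterion(logits_aux2, targets)
    return loss_main + 0.3 * loss_aux  # 0.3: auxiliary weight from the paper
```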

5.3.2 Inception V2

Inception V2 introduces Batch Normalization (BN). Usage notes (see the sketch after this list):
(1) Set BN's training parameter to True during training and to False during verification. In PyTorch this is controlled through the model.train() and model.eval() methods of the created model.
(2) Set the batch size as large as possible. With a small batch size performance can be very poor; the larger the batch, the closer the batch mean and variance are to the mean and variance of the entire training set.
(3) The BN layer is generally placed between the convolutional layer (Conv) and the activation layer (such as ReLU), and the convolutional layer then omits its bias, since BN's learned shift makes it redundant.
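A minimal sketch illustrating points (1) and (3), assuming a plain Conv-BN-ReLU building block:

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, kernel_size=3):
    # BN sits between the convolution and the activation (point 3);
    # bias=False because BN's learned shift makes the conv bias redundant.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

model = nn.Sequential(conv_bn_relu(3, 64), conv_bn_relu(64, 128))

model.train()  # point 1: BN uses per-batch statistics during training
model.eval()   # ...and running (dataset-level) statistics at verification
```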

5.3.3 Inception V3 


5.3.4 Inception V4 

Most models use VGG as their backbone; Google's Inception architectures, by comparison, scale and transfer to other tasks relatively poorly.

5.4 ResNet Residual Network

5.4.1 ResNet

Optimization (bottleneck block): use a 1×1 convolution to first reduce the channel dimension and cut the amount of computation, apply the 3×3 convolution in the reduced space, then use a second 1×1 convolution to restore the dimension, and add the result to the block's input through the shortcut connection.

FLOPs (floating-point operations) here measures the amount of computation a layer requires.
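A minimal sketch of such a bottleneck block, assuming the ResNet-50 setting of 256 channels reduced to 64 (identity shortcut only; strided and projection variants omitted):

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """1x1 reduce -> 3x3 -> 1x1 expand, plus identity shortcut."""
    def __init__(self, channels=256, reduced=64):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, reduced, 1, bias=False),  # reduce: fewer channels -> fewer FLOPs
            nn.BatchNorm2d(reduced), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, reduced, 3, padding=1, bias=False),  # 3x3 in the cheap reduced space
            nn.BatchNorm2d(reduced), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, channels, 1, bias=False),  # expand back to the input width
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Residual connection: add the shortcut, then activate.
        return self.relu(self.block(x) + x)

x = torch.randn(1, 256, 56, 56)
print(Bottleneck()(x).shape)  # torch.Size([1, 256, 56, 56])
```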

5.4.2 ResNeXt


5.5 CNN design guidelines

Note that grouped convolution is not necessarily stronger than an ordinary full convolution: it saves parameters and computation, but each group can only mix information within its own subset of channels.
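A small sketch comparing the two, assuming 256-in/256-out 3×3 convolutions and 32 groups (the ResNeXt-style setting): grouping divides the parameter count by the number of groups, but each group sees only its own slice of the input channels.

```python
import torch.nn as nn

full = nn.Conv2d(256, 256, 3, padding=1, bias=False)                 # every output sees all 256 inputs
grouped = nn.Conv2d(256, 256, 3, padding=1, groups=32, bias=False)   # 32 groups of 8 channels each

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(full))     # 589824 = 256 * 256 * 3 * 3
print(count(grouped))  # 18432  = 589824 / 32: cheaper, but no cross-group mixing
```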

Origin: blog.csdn.net/qq_47941078/article/details/130501358