Introduction to Deep Learning: the Inception Module

This article briefly introduces the evolution of the Inception module, covering Inception v1, v2, v3, and v4. Related blog posts: "Explaining the Inception structure in detail: from Inception v1 to Xception" and "Inception module".
1. The development of the Inception module
First, an overview picture:
[Figure: overview of the development of the Inception module]
Since AlexNet's historic breakthrough in 2012, and up until GoogLeNet appeared, mainstream networks improved mostly by going deeper (more layers) and wider (more neurons). This led people to joke that deep learning is just "deep parameter tuning". But simply enlarging the network has drawbacks:

1. Too many parameters: if the training data set is limited, the model overfits easily.

2. The larger the network, the greater the computational cost, which makes it hard to apply in practice.

3. The deeper the network, the more easily gradients vanish (gradient dispersion), which makes the model hard to optimize.

The way to solve these problems is, of course, to increase the depth and width of the network while reducing the number of parameters. Inception was born under exactly these circumstances.
2. Inception v1 model
The figure below shows the original (naive) Inception structure and the Inception v1 structure used in GoogLeNet. GoogLeNet, built from Inception v1 modules, is not only deeper than AlexNet but also has about 12 times fewer parameters, and its network size is roughly 1/20 that of VGGNet.
The left picture is the naive version of Inception; the right picture is the improved version. The key improvement is the 1×1 convolution, which cuts down the parameter count, so the network gains depth and width while the number of parameters shrinks.
[Figure: naive Inception module (left) and the Inception v1 module with 1×1 convolutions (right)]
Features:
1. Depth concatenation: the outputs of the parallel branches are concatenated along the channel dimension, reducing parameters.
2. 1×1 convolution: further reduces the number of parameters.
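The parameter savings from the 1×1 bottleneck can be checked with simple arithmetic. Below is a minimal sketch; the channel counts (192 input channels, a 16-channel bottleneck, 32 output channels) are illustrative values in the spirit of a GoogLeNet 5×5 branch, not an exact reproduction of the network:

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out

c_in, c_out = 192, 32

# Naive branch: one 5x5 convolution applied directly to the 192-channel input
naive = conv_params(5, c_in, c_out)

# Improved branch: a 1x1 "bottleneck" down to 16 channels, then the 5x5
bottleneck = 16
improved = conv_params(1, c_in, bottleneck) + conv_params(5, bottleneck, c_out)

print(naive, improved)  # 153600 vs 15872: roughly 10x fewer parameters
```

The same trick applies to the 3×3 branch, which is why the v1 module can be both wider and cheaper than the naive version.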
3. Inception v2
Inception v2 further improves on v1: it introduces a BN (Batch Normalization) layer to normalize the output of each layer, and it replaces each 5×5 convolution with two stacked 3×3 convolutions, which again reduces the number of parameters and speeds up computation.
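As a rough sketch of what the BN layer computes, here is batch normalization over a batch of scalar activations. This is a simplification: real BN normalizes each channel over the batch, learns gamma and beta, and keeps running statistics for inference.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch to zero mean / unit variance,
    then apply the learnable scale (gamma) and shift (beta)."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
# The normalized batch has (approximately) zero mean and unit variance.
```

Keeping every layer's inputs in a stable range is what lets the network train with higher learning rates and converge faster.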
[Figure: Inception v2 module, with two stacked 3×3 convolutions replacing the 5×5 convolution]
Features:
1. Batch Normalization, which speeds up the convergence of the network.
2. Small convolution kernels replace the large kernel. This keeps the receptive field unchanged while reducing the number of parameters, helps avoid representational bottlenecks, and deepens the non-linear expressive capacity.
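The saving from replacing one 5×5 kernel with two stacked 3×3 kernels (same 5×5 receptive field) is easy to quantify. A sketch with a hypothetical channel count of 128 in and out:

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out

c = 128  # assumed equal input/output channels, for illustration

one_5x5 = conv_params(5, c, c)                         # receptive field 5x5
two_3x3 = conv_params(3, c, c) + conv_params(3, c, c)  # stacked 3x3s, same 5x5 receptive field

print(one_5x5, two_3x3)  # 409600 vs 294912: about 28% fewer parameters
```

On top of the parameter saving, the stacked version inserts an extra activation between the two 3×3 layers, which is where the deeper non-linear expressive capacity comes from.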
4. Inception v3
Convolution factorization: an n×n convolution is replaced by a 1×n convolution followed by an n×1 convolution.
[Figure: asymmetric factorization of an n×n convolution into 1×n and n×1 convolutions]
Features:
1. Asymmetric convolution factorization
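The asymmetric factorization saves even more than the symmetric one. A sketch comparing a 7×7 convolution with a 1×7 followed by a 7×1; the channel count of 128 is hypothetical:

```python
def conv_params(kh, kw, c_in, c_out):
    """Weight count of a kh x kw convolution (bias terms ignored)."""
    return kh * kw * c_in * c_out

c, n = 128, 7  # illustrative channel count and kernel size

square = conv_params(n, n, c, c)                                # one 7x7 conv
asymmetric = conv_params(1, n, c, c) + conv_params(n, 1, c, c)  # 1x7 then 7x1

print(square, asymmetric)  # 802816 vs 229376: a 3.5x reduction
```

In general the pair costs 2n parameters per position instead of n², so the saving grows with the kernel size n.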
5. Inception v4
Inception v4 improves on the basis of residual convolutions (ResNet), incorporating the Inception v3 structure.
[Figure: Inception-ResNet module combining an Inception structure with a residual connection]
Replacing the convolution stack inside a residual module with an Inception structure yields the Inception-ResNet block (the right-hand structure in the figure above). The authors stacked about 20 such modules to form the final network structure.
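The residual combination itself is just an elementwise sum of the block's input and the output of the Inception-style branch. A toy sketch, with flat lists standing in for feature maps and a simple scaling function standing in for the real branch (both are illustrative stand-ins, not the actual module):

```python
def inception_residual_block(x, inception_branch):
    """Residual connection: output = input + branch(input).
    `inception_branch` stands in for the concatenated Inception branches."""
    return [xi + fi for xi, fi in zip(x, inception_branch(x))]

# Toy "branch": scale every activation by 0.1
# (a real branch would be 1x1/3x3 convolutions plus concatenation).
out = inception_residual_block([1.0, 2.0, 3.0], lambda x: [0.1 * xi for xi in x])
# each output is approximately input + 0.1 * input
```

Because the input is added back unchanged, gradients can flow through the identity path, which is what lets these very deep stacks train at all.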
6. Summary
(1) General design principles of deep networks

1. Avoid representational bottlenecks. Feature maps should shrink gradually from input to output; shrinking them too aggressively loses a large amount of information.
2. Higher-dimensional representations are easier to process. High-dimensional features are easier to disentangle, which speeds up training.
3. Aggregate in low dimensions. Spatial aggregation can be done over low-dimensional embeddings in parallel branches, whose outputs are then merged by depth concatenation.
4. Reduce dimensions as resolution drops. As the feature map shrinks, the channel count can also be reduced; using 1×1 convolutions for this does not hurt model accuracy and speeds up convergence.
These principles cannot be applied mechanically to improve network quality; they serve only as general guidance.

(2) Factorizing Convolutions

Symmetric factorization: replace one large convolution with several stacked small convolutions.
Asymmetric factorization: replace one n×n convolution with a 1×n convolution followed by an n×1 convolution.

Related papers for the models covered above:

[v1] Going Deeper with Convolutions (6.67% test error)

[v2] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (4.8% test error): http://arxiv.org/abs/1502.03167

[v3] Rethinking the Inception Architecture for Computer Vision (3.5% test error)

[v4] Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning (3.08% test error)

Origin blog.csdn.net/self_Name_/article/details/126447229