Network 4: VGGNet

First, the characteristics

1. Improvements over AlexNet: the first convolution layer uses a smaller convolution kernel and a smaller stride;
2. Multi-scale training: during both training and testing, the whole image is rescaled to different scales.
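To make the multi-scale point concrete, each training scale S rescales the image isotropically so its shortest side equals S. A minimal sketch (the 640×480 input size and the particular scale values are illustrative examples, not taken from the text):

```python
def rescaled_size(width, height, shortest_side):
    """Isotropically rescale (width, height) so the shortest side equals `shortest_side`."""
    scale = shortest_side / min(width, height)
    return round(width * scale), round(height * scale)

# Multi-scale training: the same image is rescaled to several sizes S.
for s in (256, 384, 512):
    print(s, rescaled_size(640, 480, s))
```

Because the scaling is isotropic, the aspect ratio is preserved and only the shortest side is pinned to S.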

As a result, VGG has a simple structure, a strong ability to extract features, and a wide range of application scenarios.

Comparison of test results at a single scale: [figure]

Network architecture: [figure]

Second, the comparison of different structures

VGG provides six versions of the network in total, in order to compare the effect of different structures.
A brief analysis of the configuration details of each version follows:

Structure A: similar to AlexNet, with 5 convolution stages and 3 fully connected layers; the difference is that the convolution layers use 3×3 kernels;
Structure A-LRN: keeps the LRN operation from AlexNet; otherwise identical to A;
Structure B: adds one 3×3 convolution layer to each of stage1 and stage2, for 10 convolution layers in total;
Structure C: on the basis of B, adds one 1×1 convolution layer to each of stage3, stage4, and stage5, giving 13 convolution layers and 16 weighted layers in total;
Structure D: on the basis of C, replaces the 1×1 convolutions of stage3, stage4, and stage5 with 3×3 convolutions, still 13 convolution layers and 16 weighted layers in total;
Structure E: on the basis of D, adds one more 3×3 convolution layer to each of stage3, stage4, and stage5, giving 16 convolution layers and 19 weighted layers in total.
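As a sketch, the configurations above can be written out as layer lists and tallied in code, in the style of common VGG implementations (channel counts per stage are 64/128/256/512 as in the VGG paper; A-LRN and C are omitted here because A-LRN matches A and C mixes in 1×1 convolutions):

```python
# Convolution-stage configurations; numbers are output channels of a 3x3
# convolution layer, and "M" marks a max-pooling layer between stages.
VGG_CFGS = {
    "A": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "B": [64, 64, "M", 128, 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "D": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
          512, 512, 512, "M", 512, 512, 512, "M"],
    "E": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
          512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}

def total_layers(cfg):
    """Return (convolution layers, total weighted layers incl. 3 fully connected)."""
    convs = sum(1 for v in cfg if v != "M")
    return convs, convs + 3

for name in VGG_CFGS:
    print(name, total_layers(VGG_CFGS[name]))
```

The tallies reproduce the counts in the text: A has 8 convolution layers (11 total), B has 10 (13 total), D has 13 (16 total, VGG-16), and E has 16 (19 total, VGG-19).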

[Figure: comparison of the results of each structure]

Comparing the results of each structure:

A vs. A-LRN: A-LRN is no better than A, which shows that LRN has little effect;
A vs. B, C, D, E: A has the fewest layers and the worst results, which shows that deeper networks perform better;
B vs. C: the added 1×1 convolution kernels introduce extra non-linearity and improve performance;
C vs. D: 3×3 convolution kernels (structure D) perform better than 1×1 kernels (structure C). (Note!)
C vs. D vs. E: multi-scale training improves accuracy.

Third, discussion of the advantages of small convolution kernels

1. Why use 3 × 3 convolution kernels?
(1) Three stacked 3 × 3 convolution kernels have the same receptive field as one 7 × 7 convolution kernel, but add intermediate activation functions; compared with a single 7 × 7 kernel, the network is deeper and more non-linear.
(2) They reduce the number of parameters:
3 × (3 × 3 × C × C) = 27C^2
7 × 7 × C × C = 49C^2
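Both claims can be checked numerically. A minimal sketch (the channel count C = 64 is an arbitrary example, not from the text):

```python
def stacked_receptive_field(kernel, n_layers):
    """Receptive field of n stacked stride-1 conv layers with the same kernel size."""
    rf = 1
    for _ in range(n_layers):
        rf += kernel - 1
    return rf

def conv_params(kernel, channels, n_layers):
    """Weights of n stacked k x k conv layers with C input and C output channels (no bias)."""
    return n_layers * kernel * kernel * channels * channels

C = 64
print(stacked_receptive_field(3, 3))  # three 3x3 layers -> receptive field 7
print(stacked_receptive_field(7, 1))  # one 7x7 layer    -> receptive field 7
print(conv_params(3, C, 3), conv_params(7, C, 1))  # 27*C^2 vs 49*C^2
```

With C = 64 this gives 110592 vs. 200704 weights: the same receptive field at roughly 55% of the parameter cost.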
2. What is the role of the 1 × 1 convolution kernel? (Other convolution kernels can achieve both of these functions as well, but with more parameters.)
(1) It adds non-linearity;
(2) It increases or reduces dimensionality (the number of channels).
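To make the dimensionality point concrete: a 1 × 1 convolution is simply a per-pixel linear map across channels, so it changes the channel count while leaving the spatial size untouched. A pure-Python sketch with made-up shapes and weights:

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution: a per-pixel linear map across channels.
    feature_map: nested lists [C_in][H][W]; weights: [C_out][C_in]."""
    c_in, h, w = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(len(weights))]
    for o in range(len(weights)):
        for i in range(c_in):
            for y in range(h):
                for x in range(w):
                    out[o][y][x] += weights[o][i] * feature_map[i][y][x]
    return out

# Dimensionality reduction: 4 channels -> 2 channels on a 2x2 map.
fmap = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(4)]
w = [[0.25] * 4,              # output channel 0: average of the 4 inputs
     [1.0, 0.0, 0.0, 0.0]]    # output channel 1: copy of input channel 0
out = conv1x1(fmap, w)
print(len(out), len(out[0]), len(out[0][0]))  # 2 2 2: fewer channels, same H x W
```

Swapping the weight matrix for one with more rows than input channels would instead increase dimensionality.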

Fourth, the training data preprocessing

Step 1: isotropically rescale the image so that its shortest side is 256;
Step 2: randomly crop a 224 × 224 image patch;
Step 3: apply a random horizontal flip and an RGB color shift to the cropped patch.
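The cropping and flipping steps can be sketched as follows (a toy illustration, not an actual training pipeline; the RGB color-shift step is omitted):

```python
import random

def random_crop_box(width, height, crop=224):
    """Pick a random crop x crop box inside a width x height image."""
    x = random.randint(0, width - crop)
    y = random.randint(0, height - crop)
    return x, y, x + crop, y + crop

def augment(width, height, crop=224):
    """One training sample: a random 224x224 crop plus a coin-flip horizontal flip."""
    box = random_crop_box(width, height, crop)
    flip = random.random() < 0.5
    return box, flip

random.seed(0)
box, flip = augment(341, 256)  # an image already rescaled so its shortest side is 256
print(box, flip)
```

Because the shortest side was pinned to 256 in step 1, a 224 × 224 crop always fits, and the random offsets give the network a slightly different view of the image on every epoch.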

Addendum: you can also use dense evaluation, feeding the uncropped image directly into the network, with the fully connected layers at the back replaced by convolution layers.


Origin www.cnblogs.com/liuboblog/p/11622132.html