AlexNet uses three kernel sizes, $11\times11$, $5\times5$, and $3\times3$, where the larger kernels serve to enlarge the receptive field. VGG16 shows that two stacked $3\times3$ convolution kernels have the same receptive field as a single $5\times5$ kernel, so two $3\times3$ kernels can be used in place of one $5\times5$ kernel. Similarly, three $3\times3$ kernels can replace one $7\times7$ kernel, four $3\times3$ kernels can replace one $9\times9$ kernel, and so on.
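As a quick sanity check of this generalization, here is a minimal Python sketch (the helper name `stacked_receptive_field` is illustrative, not from any library) that computes the receptive field of a stack of stride-1 convolutions, where each additional $k\times k$ layer grows the receptive field by $k-1$:

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions.

    Each additional k x k layer grows the receptive field by (k - 1).
    """
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

print(stacked_receptive_field([5]))        # 5 -> one 5x5 kernel
print(stacked_receptive_field([3, 3]))     # 5 -> two 3x3 kernels, same as one 5x5
print(stacked_receptive_field([3, 3, 3]))  # 7 -> matches one 7x7 kernel
print(stacked_receptive_field([3] * 4))    # 9 -> matches one 9x9 kernel
```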
1 Why two $3\times3$ convolution kernels and one $5\times5$ convolution kernel have the same receptive field
1.1 Illustration with a picture
As shown below:
1.2 Verification by calculation
Suppose the input feature map is $28\times28$, and assume the convolution stride is $1$ and the padding is $0$:
- Using a single layer of $5\times5$ convolution: from $(28-5)/1+1=24$, the output feature map is $24\times24$.
- Using two layers of $3\times3$ convolution:
  - First layer: from $(28-3)/1+1=26$, the output feature map is $26\times26$.
  - Second layer: from $(26-3)/1+1=24$, the output feature map is $24\times24$.
You can see that the final output size is the same in both cases; the short check below reproduces these numbers.
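Assuming PyTorch is available, a sketch like the following runs a dummy $28\times28$ input through both configurations (the single-channel setup is arbitrary and only for illustration):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)  # batch of one single-channel 28x28 feature map

# One 5x5 convolution, stride 1, no padding
single_5x5 = nn.Conv2d(1, 1, kernel_size=5, stride=1, padding=0)

# Two stacked 3x3 convolutions, stride 1, no padding
double_3x3 = nn.Sequential(
    nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=0),
    nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=0),
)

print(single_5x5(x).shape)  # torch.Size([1, 1, 24, 24])
print(double_3x3(x).shape)  # torch.Size([1, 1, 24, 24])
```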
2 The benefits of using small convolution kernels instead of large ones
- Under the same receptive field, the depth of the network increases, which improves the network's performance to a certain extent (this point also illustrates the great role of depth, as in ResNet).
- Under the same receptive field, the amount of computation and the number of parameters are reduced, as shown in the sketch after this list.
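As a rough illustration of the second point, the sketch below counts weights for layers with $C$ input and $C$ output channels ($C = 64$ is an arbitrary choice, and biases are ignored for simplicity): one $5\times5$ layer needs $25C^2$ weights, while two stacked $3\times3$ layers need only $2\cdot9C^2 = 18C^2$.

```python
C = 64  # number of input/output channels (arbitrary example)

params_5x5 = 5 * 5 * C * C           # one 5x5 layer:        25 * C^2
params_3x3_x2 = 2 * (3 * 3 * C * C)  # two stacked 3x3 layers: 18 * C^2

print(params_5x5)                  # 102400
print(params_3x3_x2)               # 73728
print(params_3x3_x2 / params_5x5)  # 0.72 -> about 28% fewer weights
```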