Convolution layer parameter calculation and setting

Convolution layer dimension calculation and setting

Convolution structure

CNN structure:

Input (input layer) ----> Conv (convolution) ----> ReLU (activation) ----> Pool (pooling) ----> FC (fully connected)

Introduction to input layer parameters:

  1. batch_size: the number of samples processed in one training step
  2. width/height: image width and height
  3. channels: number of image channels; 1 is grayscale, 3 is RGB

Introduction to convolutional layer parameters:

  1. filter = convolution kernel (1x1, 3x3, 5x5)
  2. feature map = output after convolution
  3. width/height: convolution kernel size
  4. in_channel: equal to the number of channels of the input image or input feature map
  5. out_channel: equal to the number of output channels, i.e. the number of convolution kernels (this can be set according to your own needs)
  6. padding: the number of rows and columns added to each side of the input feature map; it is often chosen so that the width and height of the output feature map equal those of the input feature map
  7. stride: the sampling interval of the convolution kernel as it moves across the input feature map
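As a rough illustration of how these settings determine the number of learnable parameters, here is a minimal sketch. The helper name conv_param_count and its bias flag are hypothetical (not from the original post), and it assumes a standard 2D convolution where every output channel has one kernel per input channel plus an optional bias.

```python
def conv_param_count(in_channel, out_channel, kernel_h, kernel_w, bias=True):
    """Learnable parameters in a standard 2D convolution layer (assumed layout)."""
    # One kernel_h x kernel_w kernel for every (input channel, output channel) pair
    weights = out_channel * in_channel * kernel_h * kernel_w
    # One bias value per output channel, if bias is used
    biases = out_channel if bias else 0
    return weights + biases


# RGB input (3 channels), 16 filters of size 3x3
print(conv_param_count(3, 16, 3, 3))  # 448
```

Padding and stride do not appear in the count: they change the size of the output feature map, not the number of weights.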

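Convolution calculation formula:
N (output size) = (W (input size) − Filter + 2 × Padding) / Stride + 1
Deconvolution (transposed convolution) calculation formula:
N (output size) = (W (input size) − 1) × Stride + Filter − 2 × Padding
Note: when the division is not exact, convolution rounds down. Pooling is rounded up in some frameworks (Caffe, for example), while others round down by default (PyTorch's pooling layers do so unless ceil_mode=True).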
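The two formulas can be sketched in a few lines. The function names are hypothetical, and the sketch assumes square inputs and kernels, no dilation, and no extra output padding for the deconvolution.

```python
def conv_output_size(w, f, p=0, s=1):
    """Convolution: (W - Filter + 2*Padding) / Stride + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


def deconv_output_size(w, f, p=0, s=1):
    """Deconvolution: (W - 1) * Stride + Filter - 2*Padding."""
    return (w - 1) * s + f - 2 * p


print(conv_output_size(28, 5))           # 24
print(conv_output_size(8, 3, 0, 2))      # 3  (5 / 2 is rounded down to 2)
print(deconv_output_size(3, 3, 0, 2))    # 7
```

With the same kernel, padding, and stride, the deconvolution maps a size of 3 back to 7, which is one of the input sizes the convolution reduces to 3.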

Introduction to pooling layer parameters:

  1. Filter: pooling window size
  2. stride: the sampling interval of the pooling window as it moves across the input feature map

Pooling calculation formula:
output size = (input size − Filter)/Stride+1
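A small sketch of the pooling formula, including the rounding choice. The helper pool_output_size is hypothetical; its ceil_mode flag is named after the option of the same name on PyTorch's pooling layers, and the sketch assumes no padding.

```python
import math


def pool_output_size(w, f, s, ceil_mode=False):
    """Pooling: (input size - Filter) / Stride + 1, rounded down or up."""
    span = (w - f) / s
    steps = math.ceil(span) if ceil_mode else math.floor(span)
    return steps + 1


print(pool_output_size(28, 2, 2))                  # 14
print(pool_output_size(7, 2, 2))                   # 3  (rounded down)
print(pool_output_size(7, 2, 2, ceil_mode=True))   # 4  (rounded up)
```

The two rounding modes only differ when the window does not tile the input exactly, as with the 7-wide input here.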

Function:
Max pooling has local invariance: small shifts of a feature inside the pooling window do not change the output, so the significant features are kept and insignificant information is discarded.
A pooling layer has no learnable parameters of its own, but by shrinking the feature map it reduces the computation and the number of parameters in the layers that follow (for example the fully connected layer), which can alleviate overfitting to a certain extent.
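To make the effect concrete, here is a minimal pure-Python sketch of max pooling on a single-channel feature map. The function max_pool_2d and the sample values are made up for illustration; it assumes no padding and drops any incomplete window at the border.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Max pooling over a 2D list of numbers (no padding, border windows dropped)."""
    rows = len(feature_map)
    cols = len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the values inside the current window and keep the largest
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
print(max_pool_2d(fmap))  # [[4, 2], [2, 8]]
```

The 16 input values become 4 outputs, and moving the 4 to another position inside the top-left window would leave the result unchanged.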

How to choose the size, number, and number of layers of the convolution kernel?

1. Convolution kernel size:

Theoretically, the size of the convolution kernel can be arbitrary, but most of the convolution kernels used in CNNs are odd-sized squares, such as 1x1, 3x3, 5x5, 7x7, 11x11, etc. There are also rectangular convolution kernels: in Inception v3, for example, a 3x3 convolution is factorized into a 1x3 and a 3x1 convolution.
Why are the convolution kernels in CNN generally square and not rectangular?

The size selection is generally 3x3, the smaller the better, which reduces the number of parameters and reduces complexity. Multiple small convolution kernels are better than one large convolution kernel for two reasons:

1. The number of parameters is reduced.
2. The receptive field is preserved: 3 stacked 3x3 convolution kernels = 1 7x7 convolution kernel, and 2 stacked 3x3 convolution kernels = 1 5x5 convolution kernel.

Take MNIST as an example: the image is 28x28. Convolving it with one 5x5 kernel at stride=1 gives (28-5)/1+1=24.
Using two 3x3 convolution kernels in sequence gives the same result:
(28-3)/1+1=26
(26-3)/1+1=24
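The arithmetic above, together with the parameter saving, can be checked with a short sketch. The helper names are hypothetical; the weight counts are per input/output channel pair, ignore biases, and assume every stacked layer has the same number of channels and stride 1.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size = (W - Filter + 2*Padding) / Stride + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions of size `kernel`."""
    return 1 + layers * (kernel - 1)


def stacked_weights(kernel, layers):
    """Weights per channel pair for `layers` stacked kernel x kernel convolutions."""
    return layers * kernel * kernel


size = 28
for _ in range(2):                      # two 3x3 convolutions in sequence
    size = conv_output_size(size, 3)
print(size, conv_output_size(28, 5))    # 24 24

print(stacked_receptive_field(3, 2), stacked_weights(3, 2), 5 * 5)  # 5 18 25
print(stacked_receptive_field(3, 3), stacked_weights(3, 3), 7 * 7)  # 7 27 49
```

Two 3x3 kernels cover a 5x5 region with 18 weights instead of 25, and three cover a 7x7 region with 27 instead of 49.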

2. Number of convolution kernels

It is equal to the number of channels of the output feature map. The greater the number of convolution kernels, the more types of features are extracted. It is usually a power of two (2^n), often a multiple of 16, such as 16, 32, 64, or 128.

3. Number of convolutional layers

To set the number of convolution layers, take the best-performing model as a reference and use the number of layers it has. The right depth is influenced by many factors, such as the training data, the activation function, and the gradient update algorithm, so it is not something that can be found by simple trial and error.

4. Padding selection

Referring to a set of slides, I found the following settings, which keep the output the same size as the input at stride 1 (padding = (kernel size − 1) / 2):

When the convolution kernel is 3, select padding 1.
When the convolution kernel is 5, select padding 2.
When the convolution kernel is 7, select padding 3.
The above are only settings in principle. In practice it is better to follow the parameters of an established model, since those have already been tuned.
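A minimal sketch of this rule, assuming the goal is an output the same size as the input at stride 1, in which case the padding works out to (kernel − 1) / 2 (consistent with the value 3 given for a kernel of 7). The helper names are hypothetical.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size = (W - Filter + 2*Padding) / Stride + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


def same_padding(kernel):
    """Padding that keeps the size unchanged at stride 1 (odd kernels only)."""
    if kernel % 2 == 0:
        raise ValueError("symmetric same-size padding needs an odd kernel size")
    return (kernel - 1) // 2


for k in (3, 5, 7):
    p = same_padding(k)
    print(k, p, conv_output_size(28, k, p, 1))
# 3 1 28
# 5 2 28
# 7 3 28
```

This is also one practical reason for preferring odd kernel sizes: an even kernel cannot be padded symmetrically to preserve the size.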

5. Stride selection

The stride usually does not exceed the width or height of the convolution kernel. When the stride is greater than 1, there is a downsampling effect. For example, when the stride is 2, the width and height of the feature map are reduced to about half.
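The downsampling effect can be seen by applying the output-size formula with different strides. This is a sketch with a hypothetical helper, using a 28x28 input, a 3x3 kernel, and padding 1 as example values.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size = (W - Filter + 2*Padding) / Stride + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


for stride in (1, 2, 3):
    print(stride, conv_output_size(28, 3, 1, stride))
# 1 28
# 2 14
# 3 10
```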

LeNet parameter examples

[Figure: LeNet layer parameters]
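Since the figure is not reproduced here, the following sketch traces feature map sizes through the convolution and pooling layers using the commonly cited LeNet-5 configuration (32x32 input, 5x5 convolutions, 2x2 pooling with stride 2). These values are an assumption and may differ from the table the figure showed; the helper names are hypothetical.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size = (W - Filter + 2*Padding) / Stride + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


# (name, kernel, padding, stride, out_channels); assumed LeNet-5 values
LENET_LAYERS = [
    ("C1 conv", 5, 0, 1, 6),
    ("S2 pool", 2, 0, 2, 6),
    ("C3 conv", 5, 0, 1, 16),
    ("S4 pool", 2, 0, 2, 16),
]


def trace(input_size, layers):
    """Return (name, channels, size) after each layer."""
    size = input_size
    shapes = []
    for name, kernel, padding, stride, channels in layers:
        size = conv_output_size(size, kernel, padding, stride)
        shapes.append((name, channels, size))
    return shapes


lenet_shapes = trace(32, LENET_LAYERS)
for name, channels, size in lenet_shapes:
    print(f"{name}: {channels} x {size} x {size}")
# C1 conv: 6 x 28 x 28
# S2 pool: 6 x 14 x 14
# C3 conv: 16 x 10 x 10
# S4 pool: 16 x 5 x 5
```

The final 16 x 5 x 5 feature map flattens to 400 values, which is the input to the fully connected part.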

Examples of AlexNet parameters

[Figure: AlexNet layer parameters]
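As a stand-in for the missing figure, this sketch applies the output-size formula to the commonly cited AlexNet settings. It assumes a 227x227 input (the size for which the arithmetic works out exactly) and 3x3 max pooling with stride 2; the values may differ from the figure, and the helper names are hypothetical.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size = (W - Filter + 2*Padding) / Stride + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


# (name, kernel, padding, stride, out_channels); assumed AlexNet values
ALEXNET_LAYERS = [
    ("conv1", 11, 0, 4, 96),
    ("pool1", 3, 0, 2, 96),
    ("conv2", 5, 2, 1, 256),
    ("pool2", 3, 0, 2, 256),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 0, 2, 256),
]

size = 227
alexnet_shapes = []
for name, kernel, padding, stride, channels in ALEXNET_LAYERS:
    size = conv_output_size(size, kernel, padding, stride)
    alexnet_shapes.append((name, channels, size))
    print(f"{name}: {channels} x {size} x {size}")
# conv1: 96 x 55 x 55
# pool1: 96 x 27 x 27
# conv2: 256 x 27 x 27
# pool2: 256 x 13 x 13
# conv3: 384 x 13 x 13
# conv4: 384 x 13 x 13
# conv5: 256 x 13 x 13
# pool5: 256 x 6 x 6
```

Only conv1 and the pooling layers shrink the feature map; the 5x5 and 3x3 convolutions use padding 2 and 1, which keeps the size unchanged.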

VGG parameter examples

[Figure: VGG layer parameters]
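In place of the missing figure, here is a sketch of how feature map sizes evolve in the commonly cited VGG16 configuration: a 224x224 input, 3x3 convolutions with padding 1 and stride 1, and a 2x2 max pooling with stride 2 after each block. These values are an assumption and may differ from the figure; the names are hypothetical.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size = (W - Filter + 2*Padding) / Stride + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


# (number of 3x3 convolutions, out_channels) per block; assumed VGG16 values
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

size = 224
vgg_sizes = []
for n_convs, channels in VGG16_BLOCKS:
    for _ in range(n_convs):
        size = conv_output_size(size, 3, 1, 1)   # 3x3, padding 1: size unchanged
    size = conv_output_size(size, 2, 0, 2)       # 2x2 pooling, stride 2: halved
    vgg_sizes.append((channels, size))

print(vgg_sizes)
# [(64, 112), (128, 56), (256, 28), (512, 14), (512, 7)]
```

Every convolution preserves the size, so all downsampling comes from the five pooling layers, ending at 512 x 7 x 7.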



Origin blog.csdn.net/qq_42740834/article/details/123757816