Convolution layer dimension calculation and setting
Convolution structure
CNN structure:
Input (input layer) -> Conv (convolution) -> ReLU (activation) -> Pool (pooling) -> FC (fully connected)
Introduction to input layer parameters:
- batch_size: the number of samples processed in one training step
- width/height: image width and height
- channels: number of image channels; 1 for grayscale, 3 for RGB
Introduction to convolutional layer parameters:
- filter = convolution kernel (1x1, 3x3, 5x5)
- feature map = the output of a convolution
- width/height: convolution kernel size
- in_channel: the number of input channels; it must equal the number of channels of the input image or of the previous layer's feature map
- out_channel: the number of output channels, which equals the number of convolution kernels (this can be set according to your own needs)
- padding: the number of rows and columns added to each side of the input feature map; with a suitable value, the output feature map keeps the same height and width as the input feature map
- stride: the step size, i.e. the sampling interval of the convolution kernel as it slides over the input feature map
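To make the relationship between these parameters concrete, here is a minimal sketch that counts the learnable parameters of a standard 2D convolution layer. The function name `conv2d_param_count` is made up for illustration, and it assumes an ordinary (non-grouped) convolution with one bias per output channel.

```python
def conv2d_param_count(in_channels, out_channels, kernel_h, kernel_w, bias=True):
    """Count learnable parameters of a standard (non-grouped) 2D convolution."""
    # Each output channel owns one kernel of shape (in_channels, kernel_h, kernel_w).
    weights = out_channels * in_channels * kernel_h * kernel_w
    # One bias term per output channel, if enabled.
    biases = out_channels if bias else 0
    return weights + biases


# RGB input (3 channels), 16 kernels of size 3x3.
print(conv2d_param_count(3, 16, 3, 3))  # 3*16*3*3 + 16 = 448
```

Note that padding and stride do not appear here: they change the output size but not the number of parameters.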
Convolution calculation formula:
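N (output size) = floor((W − F + 2P) / S) + 1, where W is the input size, F the filter size, P the padding, and S the stride
Deconvolution calculation formula:
N (output size) = (W − 1) × S + F − 2P
Note: when the division is not exact, convolution rounds the result down, while pooling rounds it up. This is the Caffe convention; some frameworks, such as PyTorch, round pooling down by default.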
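The two formulas above can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, the input is assumed to be square, and the deconvolution formula assumes no extra output padding.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size of a convolution; integer division rounds down."""
    return (w - f + 2 * p) // s + 1


def deconv_output_size(w, f, p=0, s=1):
    """Output size of a deconvolution (transposed convolution), no output padding."""
    return (w - 1) * s + f - 2 * p


print(conv_output_size(28, 5))               # 24
print(conv_output_size(32, 3, p=1, s=2))     # 16
print(deconv_output_size(16, 4, p=1, s=2))   # 32
```

The last two calls show that a deconvolution with stride 2 can undo the size reduction of a stride-2 convolution, although the kernel and padding have to be chosen to match.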
Introduction to pooling layer parameters:
- Filter: pooling window size
- stride: the step size, i.e. the sampling interval of the pooling window as it slides over the input feature map
Pooling calculation formula:
output size = (input size − Filter)/Stride+1
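Rounding only matters when the division in the pooling formula is not exact. The sketch below computes the output size both ways; the `ceil_mode` flag is an illustrative parameter of this hypothetical helper, chosen because frameworks differ in which rounding they use by default.

```python
import math


def pool_output_size(w, f, s, ceil_mode=False):
    """Output size of a pooling layer without padding."""
    span = (w - f) / s
    # Round up or down depending on the convention of the framework in use.
    steps = math.ceil(span) if ceil_mode else math.floor(span)
    return steps + 1


print(pool_output_size(28, 2, 2))                  # 14
print(pool_output_size(7, 2, 2))                   # 3
print(pool_output_size(7, 2, 2, ceil_mode=True))   # 4
```

With rounding up, the last window extends past the edge of the input, so the framework has to handle the missing values.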
Function:
Max pooling has local invariance: a small shift of a feature inside the pooling window does not change the output. It keeps the most significant features and discards insignificant information, which shrinks the feature maps and therefore the number of parameters in the layers that follow (pooling itself has no learnable parameters). This can alleviate overfitting to a certain extent.
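As a small illustration of local invariance, the following pure-Python sketch applies 2x2 max pooling with stride 2 to two hand-made 4x4 feature maps. The helper `max_pool2d` and the sample values are invented for this example; the second map has its values rearranged inside each 2x2 window.

```python
def max_pool2d(image, size=2, stride=2):
    """Max pooling over a 2D list of numbers, no padding, rounding down."""
    rows = (len(image) - size) // stride + 1
    cols = (len(image[0]) - size) // stride + 1
    return [
        [
            max(
                image[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 9, 1, 1],
    [0, 2, 7, 5],
    [1, 0, 6, 8],
]
# Same values, moved around inside each 2x2 window.
shifted = [
    [9, 4, 1, 1],
    [3, 1, 0, 2],
    [2, 0, 8, 6],
    [0, 1, 5, 7],
]

print(max_pool2d(feature_map))  # [[9, 2], [2, 8]]
print(max_pool2d(shifted))      # [[9, 2], [2, 8]]
```

Both inputs produce the same 2x2 output, and the 16 input values are reduced to 4.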
How to choose the size, number, and number of layers of the convolution kernel?
1. Convolution kernel size:
In theory, the convolution kernel can be any size, but most kernels used in CNNs are odd-sized squares such as 1x1, 3x3, 5x5, 7x7, and 11x11. Rectangular kernels also exist: for example, Inception v3 factorizes a 3x3 convolution into 1x3 and 3x1 convolutions.
Why are the convolution kernels in CNN generally square and not rectangular?
The usual choice is 3x3. Smaller kernels are preferred because they reduce the number of parameters and the computational complexity. Stacking several small convolution kernels is better than using one large convolution kernel for two reasons:
1. The number of parameters is reduced.
2. The receptive field is preserved: two stacked 3x3 convolution kernels cover the same area as one 5x5 convolution kernel, and three stacked 3x3 convolution kernels cover the same area as one 7x7 convolution kernel.
Take MNIST as an example. The image is 28x28. Convolving it with one 5x5 kernel at stride=1 gives (28-5)/1+1 = 24.
Using two 3x3 convolution kernels instead:
(28-3)/1+1 = 26
(26-3)/1+1 = 24
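The comparison between stacked small kernels and one large kernel can be sketched in code. All helper names here are hypothetical. The weight count assumes a single input channel and a single output channel per layer and ignores biases, and the receptive field calculation assumes stride 1 throughout.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size of a convolution; integer division rounds down."""
    return (w - f + 2 * p) // s + 1


def stacked_receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions."""
    field = 1
    for k in kernel_sizes:
        # Each extra layer widens the field by (k - 1).
        field += k - 1
    return field


def weight_count(kernel_sizes):
    """Weights for one input/output channel pair per layer, biases ignored."""
    return sum(k * k for k in kernel_sizes)


# Output size on a 28x28 MNIST image after two 3x3 convolutions.
size = 28
for k in (3, 3):
    size = conv_output_size(size, k)
print(size)                                       # 24, same as one 5x5 kernel

print(stacked_receptive_field([3, 3]))            # 5
print(stacked_receptive_field([3, 3, 3]))         # 7
print(weight_count([3, 3]), weight_count([5]))    # 18 25
print(weight_count([3, 3, 3]), weight_count([7])) # 27 49
```

Under these assumptions the stacked version needs 18 weights instead of 25, or 27 instead of 49, while covering the same input area.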
2. Number of convolution kernels
It is equal to the number of channels of the output feature map. The more convolution kernels there are, the more types of features are extracted. The number is usually a power of two, commonly a multiple of 16 (for example 16, 32, 64, 128).
3. Number of convolutional layers
To set the number of convolutional layers, use the depth of the model that performs best. The best depth is influenced by many factors, such as the training data, the activation function, and the gradient update algorithm, so it cannot be settled by a few simple trials.
4. Padding selection
According to a set of slides I consulted:
When the convolution kernel is 3, select padding 1.
When the convolution kernel is 5, select padding 2.
When the convolution kernel is 7, select padding 3.
These are only rules of thumb. In practice it is better to follow the settings of the model you are using, since its parameters have already been tuned.
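For odd kernel sizes at stride 1, the padding that keeps the output the same size as the input is (kernel size − 1) / 2, which follows from the convolution formula. A minimal sketch, with hypothetical function names and an arbitrary 28x28 input:

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size of a convolution; integer division rounds down."""
    return (w - f + 2 * p) // s + 1


def same_padding(kernel_size):
    """Padding that preserves the input size at stride 1 (odd kernels only)."""
    if kernel_size % 2 == 0:
        raise ValueError("symmetric size-preserving padding needs an odd kernel size")
    return (kernel_size - 1) // 2


for k in (3, 5, 7):
    p = same_padding(k)
    # Prints the kernel size, its padding, and the resulting output size.
    print(k, p, conv_output_size(28, k, p=p, s=1))
# 3 1 28
# 5 2 28
# 7 3 28
```

An even kernel size has no symmetric padding that preserves the size, which is one practical reason odd sizes are common.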
5. Stride selection
The stride usually does not exceed the width or height of the convolution kernel. When the stride is greater than 1, the layer downsamples its input. For example, with a stride of 2 the width and height of the feature map are each reduced to about half.
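The downsampling effect is easy to see by chaining several stride-2 layers. The sketch below assumes 3x3 kernels with padding 1 and a 224x224 input, both chosen only for illustration; `downsample_chain` is a hypothetical helper.

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size of a convolution; integer division rounds down."""
    return (w - f + 2 * p) // s + 1


def downsample_chain(size, num_layers, kernel=3, padding=1, stride=2):
    """Feature map sizes after each of num_layers identical strided layers."""
    sizes = []
    for _ in range(num_layers):
        size = conv_output_size(size, kernel, p=padding, s=stride)
        sizes.append(size)
    return sizes


print(downsample_chain(224, 5))  # [112, 56, 28, 14, 7]
```

Each layer halves the width and height, so the number of spatial positions drops by a factor of four per layer.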
LeNet parameter examples
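As an illustration, the sketch below traces feature map shapes through the convolution and pooling layers of the classic LeNet-5 as it is commonly described (32x32 grayscale input, 5x5 convolutions without padding, 2x2 pooling with stride 2). The layer table is an assumption based on that common description, not a reproduction of a figure from this article, and the helper names are hypothetical.

```python
def output_size(w, f, p=0, s=1):
    """Output size of a convolution or pooling layer, rounding down."""
    return (w - f + 2 * p) // s + 1


# (name, kind, kernel size, stride, output channels); pooling keeps the channel count.
LENET5_LAYERS = [
    ("C1", "conv", 5, 1, 6),
    ("S2", "pool", 2, 2, None),
    ("C3", "conv", 5, 1, 16),
    ("S4", "pool", 2, 2, None),
    ("C5", "conv", 5, 1, 120),
]


def trace_shapes(size, channels, layers):
    """Return (name, channels, height, width) after every layer."""
    shapes = []
    for name, kind, kernel, stride, out_channels in layers:
        size = output_size(size, kernel, s=stride)
        if kind == "conv":
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes


for shape in trace_shapes(32, 1, LENET5_LAYERS):
    print(shape)
# ('C1', 6, 28, 28)
# ('S2', 6, 14, 14)
# ('C3', 16, 10, 10)
# ('S4', 16, 5, 5)
# ('C5', 120, 1, 1)
```

After C5 the feature map is 1x1, so the remaining layers of the network are fully connected.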
AlexNet parameter examples
VGG parameter examples