35. Summary of convolution algorithm

I spent about ten sections introducing the convolution algorithm, and I think that investment was worthwhile.

Convolution is the absolute core of a CNN: many algorithms derive from it, and its central idea reappears in other algorithms such as matrix multiplication and the fully connected layer. This section summarizes the algorithm.

Parameters of the convolution algorithm

A convolution operation has two basic inputs: the input image and the convolution kernel. It has one basic output: the feature map.

In some networks you may also see a third input, the bias, which is easy to understand: it adds an offset term to the final output as a whole.

Like the convolution kernel (the weights), the bias is a learnable parameter used to improve the model's accuracy during training. We can set it aside for now, because it is simply an addition applied to the convolution output.
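A minimal sketch of how the bias is applied, assuming the common layout of one learnable bias per output channel (the array shapes here are made up for illustration):

```python
import numpy as np

# Hypothetical convolution output with 4 output channels:
# layout is (batch, channels, height, width).
conv_out = np.zeros((1, 4, 5, 5))

# One learnable bias value per output channel.
bias = np.array([0.1, 0.2, 0.3, 0.4])

# The bias is broadcast over the spatial dimensions and added element-wise.
out = conv_out + bias.reshape(1, -1, 1, 1)

print(out[0, 2, 0, 0])  # every element of channel 2 is now 0.3
```

Because the addition broadcasts, every pixel of a given output channel receives that channel's bias.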

The most basic parameters of a convolution operation are the following three:

padding: the number of pixels used to pad around the input image

stride: the step size of each slide of the convolution kernel as it scans the input image

dilation: the expansion rate of the convolution kernel (the spacing between kernel taps)

These are all the parameters, inputs, and outputs of the convolution algorithm.
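The three parameters can be made concrete with a naive single-channel implementation (a sketch for clarity, not an optimized version; real frameworks compute this far more efficiently):

```python
import numpy as np

def conv2d(x, k, padding=0, stride=1, dilation=1):
    """Naive single-channel 2-D convolution (cross-correlation, as in CNNs)."""
    x = np.pad(x, padding)  # pad pixel values around the input image
    kh, kw = k.shape
    # Effective kernel size after dilation: gaps are inserted between taps.
    eh = dilation * (kh - 1) + 1
    ew = dilation * (kw - 1) + 1
    oh = (x.shape[0] - eh) // stride + 1
    ow = (x.shape[1] - ew) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Multiply-accumulate over one receptive field.
            patch = x[i * stride : i * stride + eh : dilation,
                      j * stride : j * stride + ew : dilation]
            out[i, j] = np.sum(patch * k)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))
print(conv2d(x, k, padding=1, stride=1).shape)  # (4, 4): "same" output size
```

Note how each parameter shows up in the output-size formula: padding grows the input, stride divides the number of sliding positions, and dilation enlarges the effective kernel.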

The role of convolution

The range of pixels on the input image seen through the convolution kernel is called the "receptive field". The features a convolution learns differ with the size of its receptive field: a large receptive field tends to capture more macroscopic features such as contours, while a small receptive field tends to capture more local, fine-grained features.

Convolution extracts features from the pixels within the receptive field by performing a multiply-accumulate operation over them.

By stacking multiple convolution layers in series, with an activation function between them, the features extracted by each layer can be fused further, finally yielding a feature representation of the original image.
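The conv-activation-conv pattern can be sketched as follows (single channel, random weights, all shapes chosen for illustration):

```python
import numpy as np

def relu(x):
    """A common activation function: zero out negative values."""
    return np.maximum(x, 0.0)

def conv2d(x, k):
    """Minimal valid convolution for the sketch (single channel, stride 1)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.random((8, 8))    # stand-in for an input image
k1 = rng.random((3, 3))   # first-layer kernel
k2 = rng.random((3, 3))   # second-layer kernel

# conv -> activation -> conv: each layer fuses features from the previous one;
# the nonlinearity keeps the stack from collapsing into a single convolution.
features = conv2d(relu(conv2d(x, k1)), k2)
print(features.shape)  # (4, 4)
```

Without the activation in the middle, two stacked convolutions would be equivalent to one larger linear convolution, which is why the nonlinearity matters.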


Source: blog.csdn.net/dongtuoc/article/details/135005001