Convolutional Neural Network Study Notes and Experience (3) Convolution

A digital image is a two-dimensional discrete signal. To perform a convolution operation on a digital image is to use a convolution kernel (convolution template) to slide on the image, and compare the pixel gray value on the image point with the corresponding convolution kernel. The values ​​are multiplied, and then all the multiplied values ​​are added together as the gray value of the pixel on the image corresponding to the middle pixel of the convolution kernel.

From the effect of convolution, when convolution is performed on a two-dimensional image, the convolution kernel assigns more weights to pixels that meet certain conditions in the region, and other pixels assign less weight, which can be seen as A filtering behavior, so the convolution kernel of a convolutional neural network is sometimes called a filter, and the area where the convolution kernel is located is called the local receptive field. If there are pixels in the local perceptual domain that meet the conditions of increasing the weight, these pixels are said to have a certain feature, or the convolution kernel has captured a certain feature.

write picture description here
The figure above shows the result of convolving a 5*5 image with a 3*3 convolution kernel. The original image is a "cross" shape, the middle column of the convolution kernel is 1, and the rest of the elements are 0. The results show that the middle column of the original image has been strengthened. Intuitively, the center of the original image is in the vertical direction. features with consecutive pixels, and the convolution kernel captures this feature.

If the middle column of the original image is shifted, the convolution kernel can still capture this feature (may need to fill the periphery of the original image), or the result of the convolution operation has nothing to do with the position of the feature in the image. It is the basis for the resistance of convolutional neural networks to image translation. When using larger size images and convolution kernels, convolutional neural networks can also maintain insensitivity to rotation and perspective transformation to a certain extent.

Before convolutional neural networks became popular, there was a recognition model using gabor filter + SVM, and gabor filtering on two-dimensional images was also implemented by convolution. Unlike convolutional neural networks, gabor + svm model used svm as Classifiers, convolutional neural networks use traditional neural networks as classifiers. The most important point is that the gabor kernel needs to be designed manually, while the convolutional neural network convolution kernels are learned through backpropagation, which makes the convolutional neural network possible. Learning features that cannot be designed by humans, combined with activation functions, can bring the ability to express nonlinear features to the model and further improve the performance of the model.

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=325922603&siteId=291194637