The region of the input covered by the convolution kernel is called the receptive field.
The inner product of the convolution kernel with a local patch of the image gives one pixel of the next layer's feature map. That pixel therefore "sees" only part of the previous image.
--------------------------------------------------------------------------------------------------------------------------------
Because there are multiple convolution kernels, the 5 blue neurons in the figure below share the same receptive field but play different roles, since each kernel has different parameters.
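A small sketch of this idea (random values stand in for learned parameters): five kernels read the same input patch, and because their parameters differ, each produces a different output, giving one position across five output channels.

```python
import numpy as np

rng = np.random.default_rng(0)
patch = rng.standard_normal((3, 3))       # one receptive field in the input

# Five kernels with different parameters: same patch, five different outputs.
kernels = rng.standard_normal((5, 3, 3))
outputs = np.array([np.sum(patch * k) for k in kernels])

# The five values are one spatial position across five output channels.
```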
--------------------------------------------------------------------------------------------------------------------------------
The pooling layer is a downsampling step that reduces the spatial size (and hence the number of weight parameters in later layers).
Pooling does not change the image depth (number of channels).
When pooling, the stride should match the window size so that the windows do not overlap; this partitions the image into regions and downsamples each one.
A 2*2 filter with stride=2 halves the spatial size; this is the most common setting.
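A sketch of 2*2 max pooling with stride 2 (the function name and reshape trick are my own, not from the notes): the spatial size is halved while the channel depth is untouched.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling, stride 2, on an (H, W, C) array; H and W must be even."""
    h, w, c = x.shape
    # Split each spatial axis into (blocks, 2), then take the max of each 2x2 block.
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

x = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
y = max_pool_2x2(x)
# Height and width are halved (4 -> 2); depth (3 channels) is unchanged.
```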
The convolution kernel's weight parameters act as a template: kernels in the initial layers are templates for edges, and kernels in later layers are templates for higher-order features.
Each layer utilizes the features extracted by the previous layer.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html
The demo visualizes the activations and weights of a ConvNet's internal layers.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Development trends:
1 Use smaller convolution kernels and deeper networks.
2 Avoid pooling and fully connected layers.
Anything pooling does can also be achieved by a convolution with stride > 1.
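A sketch of this replacement (the `conv2d` helper is illustrative; a real network would learn the kernel): a 2*2 convolution with stride 2 halves the spatial size just like 2*2 pooling, but with learnable weights.

```python
import numpy as np

def conv2d(image, kernel, stride):
    """Valid convolution (cross-correlation) of a 2D image with a square kernel."""
    k = kernel.shape[0]
    h = (image.shape[0] - k) // stride + 1
    w = (image.shape[1] - k) // stride + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r+k, c:c+k] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((2, 2)) / 4.0   # learnable in a real network

# stride=2 halves the spatial size (6 -> 3), like 2x2 pooling with stride 2.
out = conv2d(image, kernel, stride=2)
```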