The difference between a convolution kernel and a filter

  1. A convolution kernel is specified by its height and width; it is a two-dimensional concept.
  2. A filter is specified by its height, width and depth; it is a three-dimensional concept.
  3. A filter can be seen as a stack of convolution kernels, one per input channel.
  4. A filter therefore has one more dimension than a convolution kernel: depth.

A multi-channel example makes this easy to see:
Figure 1
Figure 1 shows a convolution over a 3-channel image. Each convolution kernel is 3×3, and there are 3 of them, one per channel. The filter is the set of these three kernels, so its dimensions are 3×3×3: the first 3×3 is the height (H) and width (W) of each kernel, and the last 3 is the number of kernels, which equals the number of input channels.

The operation above convolves each of the three channels with its own kernel, sums the three results element by element, and outputs a single feature map.

That is, one filter corresponds to one feature map.
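The per-channel convolution and summation described above can be sketched in plain Python. This is only an illustration under a few assumptions: no padding, stride 1, and "convolution" computed without flipping the kernel (cross-correlation, as most deep learning libraries do). The names `conv2d_single` and `apply_filter`, the channels-first layout, and the toy values are made up for this example.

```python
def conv2d_single(channel, kernel):
    """Slide one 2-D kernel over one 2-D channel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def apply_filter(image, filt):
    """Apply one filter (a stack of kernels, one per channel) to a multi-channel image."""
    assert len(image) == len(filt), "filter depth must match the number of channels"
    # Step 1: convolve each channel with its own kernel.
    per_channel = [conv2d_single(ch, k) for ch, k in zip(image, filt)]
    # Step 2: add the per-channel results element by element.
    out_h, out_w = len(per_channel[0]), len(per_channel[0][0])
    return [
        [sum(m[i][j] for m in per_channel) for j in range(out_w)]
        for i in range(out_h)
    ]


# Toy 3-channel 4x4 image: channel c is filled with the value c + 1.
image = [[[c + 1] * 4 for _ in range(4)] for c in range(3)]
# One 3x3x3 filter: three 3x3 kernels of ones.
filt = [[[1] * 3 for _ in range(3)] for _ in range(3)]

feature_map = apply_filter(image, filt)
print(feature_map)  # a single 2x2 feature map
```

Three kernels go in, but only one feature map comes out, which is the one-filter-to-one-feature-map correspondence stated above.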

Note this one-to-one correspondence. Would it still hold if we said "convolution kernel" instead of "filter"?

In general, the concept of a filter belongs to the multi-channel case. There, a single convolution kernel does not produce a feature map on its own, so the term "filter" is used for the group of kernels that does. (This is just my guess, hahahaha)
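A quick count makes the point about correspondence concrete. The sketch below is an assumption-laden toy (no padding, stride 1, no kernel flipping, channels-first nested lists, and a hypothetical helper named `apply_filter`): it applies two filters to a 3-channel image and compares the number of 2-D kernels with the number of feature maps.

```python
def apply_filter(image, filt):
    """Apply one C-deep filter to a C-channel image (no padding, stride 1)."""
    depth, kh, kw = len(filt), len(filt[0]), len(filt[0][0])
    out_h = len(image[0]) - kh + 1
    out_w = len(image[0][0]) - kw + 1
    return [
        [
            # Sum over every channel and every position inside the window.
            sum(image[c][i + a][j + b] * filt[c][a][b]
                for c in range(depth) for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 3-channel 3x3 image: channel c is filled with the value c + 1.
image = [[[c + 1] * 3 for _ in range(3)] for c in range(3)]

# Two made-up filters, each a stack of three 3x3 kernels.
ones = [[[1] * 3 for _ in range(3)] for _ in range(3)]
twos = [[[2] * 3 for _ in range(3)] for _ in range(3)]
filters = [ones, twos]

feature_maps = [apply_filter(image, f) for f in filters]

num_kernels = sum(len(f) for f in filters)  # 2-D kernels across all filters
num_feature_maps = len(feature_maps)        # one feature map per filter
print(num_kernels, num_feature_maps)
```

With these toy values there are six 2-D kernels but only two feature maps, so in the multi-channel case the count of feature maps follows the filters, not the individual kernels.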

Let's look at a single-channel example:
Figure 2
Figure 2 shows a convolution over a single-channel image. The convolution kernel is 3×3 and there is only one of them, so the result is a single feature map.

In the single-channel case, the filter and the convolution kernel are effectively the same thing, that is, filter = kernel.

To get multiple feature maps, you only need to add more convolution kernels.

In this case, one convolution kernel corresponds to one feature map.
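For the single-channel case, a minimal sketch looks like this. It assumes no padding, stride 1, and no kernel flipping, and the helper name `conv2d_single` and the two example kernels are chosen only for illustration.

```python
def conv2d_single(channel, kernel):
    """Slide one 2-D kernel over one 2-D channel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy single-channel 4x4 image.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

# Two made-up 3x3 kernels; with one channel, each kernel is also a filter.
kernels = [
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],  # keeps the centre pixel of each window
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # sums all nine pixels of each window
]

# One feature map per kernel.
feature_maps = [conv2d_single(image, k) for k in kernels]
for fm in feature_maps:
    print(fm)
```

Adding a third kernel to `kernels` would yield a third feature map, with no other change needed.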

The distinction between convolution kernel and filter described above is the view I personally agree with. But I have also seen many blogs that use "convolution kernel" to mean the filter itself. In that usage, a convolution kernel in the multi-channel case is three-dimensional: in addition to width and height it has a depth, which is the number of two-dimensional kernels in the terminology used here.

It is hard to say that either usage is absolutely right or wrong, because both are common. It is enough to understand what the author means when you come across them. When explaining to others, though, I prefer to use "filter".
Reprinted: https://blog.csdn.net/weixin_38481963/article/details/109906338

Origin blog.csdn.net/Adam897/article/details/126388684