An easy-to-understand explanation of convolution

Textbooks generally define the convolution $f * g$ of two functions $f$ and $g$ as follows.

Continuous form:

$$(f * g)(n) = \int_{-\infty}^{\infty} f(\tau)\, g(n - \tau)\, d\tau$$

Discrete form:

$$(f * g)(n) = \sum_{\tau = -\infty}^{\infty} f(\tau)\, g(n - \tau)$$

They then explain that the function g is first flipped, which amounts to folding it over from right to left about the vertical axis; this is the origin of the "roll" in convolution (the Chinese term, 卷积, literally reads "roll product").

Then shift the g function to n, multiply the corresponding points of the two functions at this position, and then add them. This process is the "product" process of convolution.
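The discrete definition can be tried out directly. The following is a minimal sketch, assuming both functions are finite Python lists indexed from 0 and equal to 0 everywhere else; the name `convolve` is chosen here for illustration.

```python
def convolve(f, g):
    """Discrete convolution of two finite sequences indexed from 0.

    Values outside each list are treated as 0, so the infinite sum in the
    definition reduces to the terms where both indices are in range.
    """
    out = [0] * (len(f) + len(g) - 1)
    for n in range(len(out)):
        for tau in range(len(f)):
            if 0 <= n - tau < len(g):
                # One term f(tau) * g(n - tau) of the sum for this n.
                out[n] += f[tau] * g[n - tau]
    return out


result = convolve([1, 2, 3], [1, 1])
print(result)  # [1, 3, 5, 3]
```

Each output value collects the products whose two indices add up to n, which is the constraint the rest of the article keeps returning to.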

This article aims to answer two questions:

  1. How is the term convolution explained? What does "roll" mean? What does "product" mean?
  2. What is the meaning behind convolution and how to explain it?

## Application scenarios

In order to better understand these issues, we first give two typical application scenarios:

  1. Signal analysis: after an input signal f(t) passes through a linear system (whose characteristics can be described by the unit impulse response function g(t)), what is the output signal? It can be obtained by a convolution operation.

  2. Image processing: after an input image f(x, y) is convolved with a specially designed convolution kernel g(x, y), the output image shows effects such as blurring or edge enhancement.
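The signal-analysis scenario can be sketched in a few lines. This example assumes a discrete, time-invariant linear system; the impulse response `g` and input `f` are made-up values, and `system_output` is a name chosen here. Each input sample launches a scaled, delayed copy of the impulse response, and the output is the sum of all those copies.

```python
def system_output(signal, impulse_response):
    """Output of a discrete linear time-invariant system, built by superposition."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for t, x in enumerate(signal):
        # An input of size x at time t contributes x * impulse_response,
        # delayed by t steps.
        for k, h in enumerate(impulse_response):
            out[t + k] += x * h
    return out


g = [1.0, 0.5, 0.25]   # hypothetical unit impulse response
f = [1.0, 0.0, 2.0]    # hypothetical input signal
output = system_output(f, g)
print(output)  # [1.0, 0.5, 2.25, 1.0, 0.5]
```

The value at time 2 (2.25) mixes the fresh input of 2.0 with the tail (0.25) still ringing from the input at time 0, so the output depends on past inputs as well as the current one.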

## Understanding convolution

Understanding the term "convolution": the convolution of two functions essentially means flipping one function and then performing a sliding superposition.

In the continuous case, superposition means integrating the product of the two functions; in the discrete case, it means a weighted summation. For simplicity, both are called superposition here.

The whole process looks like this:

Flip → Slide → Superpose → Slide → Superpose → Slide → Superpose ... The series of superposition values obtained over the successive slides makes up the convolution function.
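The flip, slide, superpose sequence can be written out literally. This is a rough sketch for finite lists, assuming values outside the lists are 0 (hence the zero padding); `convolve_by_sliding` is a name chosen here.

```python
def convolve_by_sliding(f, g):
    """Convolve finite sequences by flipping g once and sliding it across f."""
    flipped = g[::-1]                            # flip: g(t) -> g(-t)
    pad = len(g) - 1
    padded = [0] * pad + list(f) + [0] * pad     # zeros where f is undefined
    out = []
    for shift in range(len(f) + len(g) - 1):     # slide one step at a time
        window = padded[shift:shift + len(g)]
        # superpose: multiply the aligned values and add them up
        out.append(sum(a * b for a, b in zip(window, flipped)))
    return out


values = convolve_by_sliding([1, 2, 3], [1, 2])
print(values)  # [1, 4, 7, 6]
```

Each pass through the loop is one "slide" followed by one "superpose"; the list of results is the convolution function sampled at successive positions.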

The "roll" in convolution refers to flipping the function, that is, turning g(t) into g(-t); "roll" also carries the sense of sliding. If convolution were instead translated as "fold product", the word "fold" would convey only the flipping.

The "product" in convolution refers to integration or weighted summation.

Some articles stress only the sliding superposition and summation without mentioning that the function is flipped, which I think is incomplete; others even interpret the "roll" as the "product", which I think confuses one for the other.

Understanding the meaning of convolution:

  1. From the "product" step, we can see that the superposition value obtained is a global quantity. In signal analysis, the result of convolution depends not only on the input signal's response at the current time but also on its response at all past times; it accumulates the effects of all past inputs. In image processing, the result takes into account the pixels around each pixel, or even the pixels of the entire image, and applies a weighting to the current pixel. The "product" is therefore a global concept, a kind of "mixing" of the two functions over time or space.

  2. Why "roll" at all? Why not simply multiply directly? As I understand it, the purpose of the "roll" (flip) is to impose a constraint that specifies the reference point for the "product". In signal analysis, it specifies the point in time before and after which the "product" is taken; in spatial analysis, it specifies the location around which the accumulation is performed.

Here are a few examples that explain why the flip is needed and what the superposition and summation mean.

Example 2: Throwing the dice
Under the Zhihu question "How can convolution be explained in a simple way?", the top-ranked answer, by classmate Ma, gives a good example (some of the pictures below are taken from classmate Ma's article, with thanks): using dice to illustrate convolution.

The problem to solve: two dice are thrown together; what is the probability that their points add up to 4?

Let $f(k)$ be the probability that the first die shows $k$ and $g(k)$ the probability that the second die shows $k$. The points add up to 4 in three cases: 1 + 3 = 4, 2 + 2 = 4, 3 + 1 = 4.

Therefore, the probability that the points of the two dice add up to 4 is:

$$f(1)\,g(3) + f(2)\,g(2) + f(3)\,g(1)$$

Written as a convolution:

$$(f * g)(4) = \sum_{m=1}^{3} f(m)\, g(4 - m)$$
Here I want to further explain with the logic of flipping and sliding superposition above.

Because the points of the two dice must sum to 4, we again flip the function g to satisfy this constraint, multiply the numbers that line up above and below each other in the shaded area, and add the products. This is equivalent to evaluating the convolution at the argument 4, as shown in the following figure:
(Figure omitted: f above the flipped g, with the overlapping region shaded.)
After this flip, the approach generalizes easily to the probability that the two dice sum to n, which is the convolution (f * g)(n) of f and g, as shown in the following figure:
(Figure omitted: the flipped g slid along f to the position for sum n.)
As the figure shows, sliding the function g increases the sum of the points. The constraint on f and g in this example is the sum of the points, which is also the argument of the convolution function. If you are interested, you can verify that when every face of each die is equally likely, the sum n = 7 has the highest probability.
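The dice calculation is small enough to check by machine. The sketch below assumes two fair six-sided dice and uses exact fractions; `sum_probability` and `distribution` are names chosen here for illustration.

```python
from fractions import Fraction

# f[k] and g[k]: probability that die 1 / die 2 shows k (fair dice assumed).
faces = range(1, 7)
f = {k: Fraction(1, 6) for k in faces}
g = {k: Fraction(1, 6) for k in faces}


def sum_probability(n):
    """(f * g)(n): sum of f(m) * g(n - m) over the faces m where n - m is a face."""
    return sum(f[m] * g[n - m] for m in faces if (n - m) in g)


distribution = {n: sum_probability(n) for n in range(2, 13)}
print(distribution[4])                           # 1/12
print(max(distribution, key=distribution.get))   # 7
```

The value for 4 is the three-term sum from the text (3/36 = 1/12), and scanning all sums from 2 to 12 confirms that 7 is the most likely total.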

Example 3: Image processing

This example also comes from classmate Ma's answer to the question "How can convolution be explained in a simple and understandable way?". An image can be represented as a matrix (the pictures below are taken from classmate Ma's article):

$$f = \begin{bmatrix} a_{0,0} & a_{0,1} & a_{0,2} & \cdots \\ a_{1,0} & a_{1,1} & a_{1,2} & \cdots \\ a_{2,0} & a_{2,1} & a_{2,2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$

The image-processing function (such as smoothing or edge extraction) can also be represented by a matrix g:

$$g = \begin{bmatrix} b_{-1,-1} & b_{-1,0} & b_{-1,1} \\ b_{0,-1} & b_{0,0} & b_{0,1} \\ b_{1,-1} & b_{1,0} & b_{1,1} \end{bmatrix}$$

Both are two-dimensional functions, equivalent to:

$$f(x, y) = a_{x,y}, \qquad g(x, y) = b_{x,y}$$

How, then, is the convolution $(f * g)(u, v)$ of the functions f and g at (u, v) calculated?

According to the definition of convolution, the two-dimensional discrete form of the convolution formula is:

$$(f * g)(u, v) = \sum_{i} \sum_{j} f(i, j)\, g(u - i, v - j) = \sum_{i} \sum_{j} a_{i,j}\, b_{u-i,\,v-j}$$

By this definition, the sum runs over both the x and y directions (corresponding to the two subscripts i and j in the discrete formula above) and is unbounded, from negative infinity to positive infinity. However, the real world is bounded. For example, the image-processing function g above is a 3x3 matrix, which means its value is 0 everywhere except near the origin. With this in mind, the formula reduces to a sum over the points near the coordinates (u, v), so the actual calculation is:

$$(f * g)(u, v) = \sum_{i=u-1}^{u+1} \sum_{j=v-1}^{v+1} a_{i,j}\, b_{u-i,\,v-j}$$
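First, we take the 3x3 matrix centered on (u, v) out of the original image matrix, where $a_{i,j} = f(i, j)$:

$$\begin{bmatrix} a_{u-1,v-1} & a_{u-1,v} & a_{u-1,v+1} \\ a_{u,v-1} & a_{u,v} & a_{u,v+1} \\ a_{u+1,v-1} & a_{u+1,v} & a_{u+1,v+1} \end{bmatrix}$$

Then we flip the image-processing matrix. This flip is interesting: it can be understood in several ways that have the same effect: (1) flip along the x axis first, then along the y axis; (2) flip along the y axis first, then along the x axis.

Original matrix, where $b_{i,j} = g(i, j)$:

$$g = \begin{bmatrix} b_{-1,-1} & b_{-1,0} & b_{-1,1} \\ b_{0,-1} & b_{0,0} & b_{0,1} \\ b_{1,-1} & b_{1,0} & b_{1,1} \end{bmatrix}$$

Matrix after flipping:

$$g' = \begin{bmatrix} b_{1,1} & b_{1,0} & b_{1,-1} \\ b_{0,1} & b_{0,0} & b_{0,-1} \\ b_{-1,1} & b_{-1,0} & b_{-1,-1} \end{bmatrix}$$

Both orders, (1) and (2), produce this same flipped matrix.

When calculating the convolution, you can take the inner product of the 3x3 image matrix and the flipped matrix $g'$, multiplying the entries in matching positions and adding the products:

$$\begin{aligned} (f * g)(u, v) ={} & a_{u-1,v-1}\, b_{1,1} + a_{u-1,v}\, b_{1,0} + a_{u-1,v+1}\, b_{1,-1} \\ & + a_{u,v-1}\, b_{0,1} + a_{u,v}\, b_{0,0} + a_{u,v+1}\, b_{0,-1} \\ & + a_{u+1,v-1}\, b_{-1,1} + a_{u+1,v}\, b_{-1,0} + a_{u+1,v+1}\, b_{-1,-1} \end{aligned}$$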
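Both claims in this step, that the two flip orders agree and that the inner product with the flipped matrix equals the subscript formula, can be checked numerically. The sketch below uses made-up values for the image `a` and kernel `b`, with nested lists standing in for matrices; `flip_rows` and `flip_cols` are helper names chosen here.

```python
def flip_rows(m):
    """Reverse the order of the rows (top-to-bottom flip)."""
    return m[::-1]


def flip_cols(m):
    """Reverse the order of the columns (left-to-right flip)."""
    return [row[::-1] for row in m]


# Hypothetical 3x3 kernel; b[p + 1][q + 1] stores b_{p,q} for p, q in {-1, 0, 1}.
b = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

flipped = flip_cols(flip_rows(b))
same_either_order = flipped == flip_rows(flip_cols(b))

# Hypothetical 4x4 image; a[i][j] stores a_{i,j}.
a = [[1, 0, 2, 1],
     [3, 1, 0, 2],
     [0, 2, 1, 1],
     [1, 1, 3, 0]]
u, v = 1, 2

# Definition: sum of a_{i,j} * b_{u-i, v-j}; each pair of subscripts sums to (u, v).
by_definition = sum(a[i][j] * b[u - i + 1][v - j + 1]
                    for i in range(u - 1, u + 2)
                    for j in range(v - 1, v + 2))

# Practical route: 3x3 window centered on (u, v), times the flipped kernel, summed.
window = [row[v - 1:v + 2] for row in a[u - 1:u + 2]]
by_inner_product = sum(w * k
                       for w_row, k_row in zip(window, flipped)
                       for w, k in zip(w_row, k_row))

print(same_either_order, by_definition == by_inner_product)  # True True
```

Flipping once up front turns the awkward index arithmetic of the definition into a plain position-by-position multiply and add.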
author: palet
link: https://www.zhihu.com/question/22298352/answer/637156871
source: Zhihu
copyrighted by the author. For commercial reproduction, please contact the author for authorization, and for non-commercial reproduction, please indicate the source.

Note a characteristic of the formula above: in every product, the subscripts of the two factors a and b sum to (u, v). This constraint governs the weighted summation, and it is why the matrix g must be flipped. The matrix subscripts are written out and flipped explicitly above so that the connection with convolution is easier to see; the advantage is that the idea is easy to generalize and its physical meaning is easy to understand. In actual calculations, the flipped matrix is used and the inner product of the matrices is computed directly.

The calculation above gives the convolution at (u, v). Sliding along the x axis or y axis gives the convolution at every position in the image, and the output is the processed image (for example, after smoothing or edge extraction).
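Sliding the same calculation over every position can be sketched as follows. This version is an illustration under stated assumptions: the kernel is square with odd size, and only positions where the kernel fits entirely inside the image are computed, so the output is smaller than the input. The function name `convolve2d` and the sample data are made up for the example.

```python
def convolve2d(image, kernel):
    """2D convolution over the positions where the kernel fits entirely."""
    k = len(kernel)                        # kernel assumed square with odd size
    flipped = [row[::-1] for row in kernel[::-1]]
    rows, cols = len(image), len(image[0])
    out = []
    for top in range(rows - k + 1):        # slide along one axis
        out_row = []
        for left in range(cols - k + 1):   # slide along the other axis
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[top + di][left + dj] * flipped[di][dj]
            out_row.append(total)
        out.append(out_row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
# Only b_{0,1} is nonzero, so the output at (u, v) is a_{u, v-1}.
kernel = [[0, 0, 0],
          [0, 0, 1],
          [0, 0, 0]]
shifted = convolve2d(image, kernel)
print(shifted)  # [[5, 6], [9, 10]]
```

The kernel has its single 1 to the right of center, yet each output pixel picks up its left neighbor: a direct consequence of the flip, since the subscripts must sum to (u, v).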

Think about it further. When calculating the image convolution, we took the matrix at (u, v) directly from the original image matrix. Why this position? In essence, to satisfy the constraint above. We want the convolution at (u, v), g is a 3x3 matrix, and each image subscript must sum with a subscript of this 3x3 matrix to (u, v), so we can only take the 3x3 matrix centered on (u, v) in the original image, which is the matrix of the shaded area in the figure.

By extension, if g is 7x7 rather than 3x3, we must take the 7x7 matrix centered on (u, v) in the original image for the calculation. This kind of convolution therefore considers the neighboring pixels of the original image and mixes them. The extent of the neighborhood depends on the size of g: the larger it is, the more surrounding pixels are involved. The design of the matrix determines whether the mixed output image is blurrier or sharper than the original.

For example, the following image-processing matrix makes the image smoother and blurrier, because it averages each pixel with its surrounding pixels:
(Figure omitted: smoothing matrix.)
The following image-processing matrix makes changes in pixel value more pronounced and strengthens edges while leaving regions where values change gently unaffected, which achieves edge extraction:
(Figure omitted: edge-enhancement matrix.)
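To see both effects on numbers, here is a small sketch. The two kernels are assumptions, not necessarily the matrices from the original figures: a 3x3 mean filter (all entries 1/9) for smoothing and a commonly used sharpening kernel (center 9, neighbors -1). The image is a made-up strip with one vertical step edge, and `convolve2d` is a helper written for this example.

```python
from fractions import Fraction


def convolve2d(image, kernel):
    """2D convolution over the positions where the square kernel fits entirely."""
    k = len(kernel)
    flipped = [row[::-1] for row in kernel[::-1]]
    return [[sum(image[top + di][left + dj] * flipped[di][dj]
                 for di in range(k) for dj in range(k))
             for left in range(len(image[0]) - k + 1)]
            for top in range(len(image) - k + 1)]


# Assumed example kernels (the original figures are not available).
blur = [[Fraction(1, 9)] * 3 for _ in range(3)]   # mean of the 3x3 neighborhood
sharpen = [[-1, -1, -1],
           [-1, 9, -1],
           [-1, -1, -1]]                           # weights sum to 1

# Hypothetical image: dark on the left, bright on the right, one vertical edge.
image = [[10, 10, 10, 40, 40, 40] for _ in range(3)]

blurred = convolve2d(image, blur)
sharpened = convolve2d(image, sharpen)
print([int(x) for x in blurred[0]])   # [10, 20, 30, 40]
print(sharpened[0])                   # [10, -80, 130, 40]
```

The mean filter turns the abrupt jump from 10 to 40 into a ramp, while the sharpening kernel leaves the flat regions at 10 and 40 and exaggerates the values on either side of the edge. A real pipeline would typically clip the results back into the valid pixel range.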


Origin blog.csdn.net/zan1763921822/article/details/104512607