OpenGL.Shader: Zhige teaches you to write a live filter client (8). Visual filters: what is convolution? Image sharpening


 

1. What is convolution

This chapter records some basic knowledge of image processing: convolution. Here is a brief introduction to what convolution is. How should you understand it? You can think of convolution as a way of mixing information. Imagine two buckets full of information; we pour them into one bucket and stir them according to a certain rule. In other words, convolution is a process of mixing two kinds of information. In essence, convolution is simply a mathematical operation, no different in nature from addition, subtraction, multiplication, or division.

When we apply convolution to an image, we perform it in two dimensions, horizontal and vertical. We mix two buckets of information: the first bucket is the input image, composed of three matrices (the R, G and B channels), where each element is an integer between 0 and 255. The second bucket is the convolution kernel, a single matrix of floating-point numbers. Think of the kernel's size and values as the recipe for mixing the image. The output of the convolution is a modified image, often called a feature map in deep learning; there is one feature map per color channel.

How is this done in practice? One method is to take a block of the same size as the convolution kernel out of the input image. Suppose the image is 100×100 and the kernel is 3×3; then the block we take out is also 3×3. We then multiply each pair of elements at the same position and sum the products (unlike matrix multiplication, this is more like a vector inner product: an element-wise "dot product" of two equally sized matrices). The sum of those products produces one pixel of the feature map. Once that pixel is computed, we move over by one pixel, take the next block, and repeat the same operation. The feature map is finished when we can no longer move to obtain a new block.
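To make the sliding-window idea concrete, here is a minimal CPU-side sketch of a single-channel 3×3 convolution in C++. The function name convolve3x3 is my own for illustration, not something from the project's code, and for simplicity it skips the one-pixel border instead of padding. For an RGB image you would simply run it once per channel.

#include <vector>

// Naive single-channel 3x3 convolution: for every interior pixel, multiply the
// 3x3 neighborhood element-wise with the kernel and sum the products.
// The one-pixel border is left as-is (no padding) to keep the sketch short.
std::vector<float> convolve3x3(const std::vector<float>& src,
                               int width, int height,
                               const float kernel[3][3])
{
    std::vector<float> dst(src);               // copy, so the border keeps its value
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            float sum = 0.0f;
            for (int ky = -1; ky <= 1; ++ky)
                for (int kx = -1; kx <= 1; ++kx)
                    sum += src[(y + ky) * width + (x + kx)] * kernel[ky + 1][kx + 1];
            dst[y * width + x] = sum;          // one pixel of the "feature map"
        }
    }
    return dst;
}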

The above is only a first-pass understanding of convolution. Of course, there is more to it than this, such as other variants of the operation, and there will be opportunities to learn more later. There is also a translated foreign article, a from-zero introduction to convolution and deep learning, which interested readers can study in more detail.

To summarize the theory: convolution is a mathematical operation, just like addition, subtraction, multiplication and division. In the image domain, the input is the value of each pixel of the image; the convolution operation requires an operator called the convolution kernel, also known as the filter factor. Different algorithms use different operators, and the resulting effects differ greatly. The output again corresponds to a pixel value.
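A tiny worked example (the numbers are my own, just to make one output pixel concrete): take the 3×3 block of pixel values below and an averaging kernel whose nine entries are all 1/9.

10 20 30
40 50 60
70 80 90

Multiplying each value by 1/9 and summing gives (10 + 20 + ... + 90) / 9 = 450 / 9 = 50, so the corresponding output pixel is 50, the average of the block. With a different kernel, such as a Laplacian, the very same dot-product procedure produces an edge-enhancing response instead of an average.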

 

2. Application of convolution: sharpening

First, the reference code address: https://github.com/MrZhaozhirong/NativeCppApp/blob/master/app/src/main/cpp/gpufilter/filter/GpuSharpenFilter.hpp

So how is convolution applied in image processing? The most classic examples are image sharpening and blurring. On mobile devices, GPU-based image sharpening generally uses a spatial filter to perform template convolution on the image. Among such operators, the Laplacian is well suited to correcting image blur and is one of the most commonly used edge-enhancement operators.

The sharpening filter in GPUImage is a form of Laplacian sharpening. Let's take a look at how this Laplacian convolution is used in GL.

First comes the vertex shader, SHARPEN_VERTEX_SHADER:

attribute vec4 position;
attribute vec4 inputTextureCoordinate;

uniform float imageWidthFactor; // screen width step factor
uniform float imageHeightFactor; // screen height step factor
uniform float sharpness; // sharpening core value, input by external users

varying vec2 textureCoordinate; // current texture coordinate
varying vec2 leftTextureCoordinate;
varying vec2 rightTextureCoordinate;
varying vec2 topTextureCoordinate;
varying vec2 bottomTextureCoordinate;
varying float centerMultiplier; // center value of the Laplacian operator
varying float edgeMultiplier; // edge value of the Laplacian operator
void main()
{
    gl_Position = position;
    vec2 widthStep = vec2(imageWidthFactor, 0.0);
    vec2 heightStep = vec2(0.0, imageHeightFactor);
    textureCoordinate = inputTextureCoordinate.xy;
    leftTextureCoordinate = inputTextureCoordinate.xy - widthStep;
    rightTextureCoordinate = inputTextureCoordinate.xy + widthStep;
    topTextureCoordinate = inputTextureCoordinate.xy + heightStep;
    bottomTextureCoordinate = inputTextureCoordinate.xy - heightStep;
    centerMultiplier = 1.0 + 4.0 * sharpness;
    edgeMultiplier = sharpness;
}

At first glance the vertex shader is confusing. How should we understand the four texture coordinates left/top/right/bottom? And what are those two screen step factors? Let's look at how the upper-layer host code feeds in these values:

void init() {
    GpuBaseFilter::init(SHARPEN_VERTEX_SHADER.c_str(), SHARPEN_FRAGMENT_SHADER.c_str());
    mSharpnessLocation = glGetUniformLocation(mGLProgId, "sharpness");
    mImageWidthFactorLocation = glGetUniformLocation(mGLProgId, "imageWidthFactor");
    mImageHeightFactorLocation = glGetUniformLocation(mGLProgId, "imageHeightFactor");
    mSharpness = 0.0f;
}

void onOutputSizeChanged(int width, int height) {
    GpuBaseFilter::onOutputSizeChanged(width, height);
    glUniform1f(mImageWidthFactorLocation, 1.0f / width);
    glUniform1f(mImageHeightFactorLocation, 1.0f / height);
}

void setAdjustEffect(float percent) {
    mSharpness = range(percent * 100.0f, -2.0f, 5.0f);
    // mapped into [-2.0, 5.0]; a sharpness of 0.0 means no sharpening
}
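The range() helper is not shown in this snippet. In GPUImage-style filter code it is usually a simple linear mapping of a 0-100 percentage onto [start, end]; the sketch below assumes that convention (the actual implementation lives in the project's base filter, so treat this as illustrative):

// Hypothetical sketch of the range() helper used above, following the GPUImage
// convention: map a percentage in [0, 100] linearly onto [start, end].
static float range(float percentage, float start, float end) {
    return (end - start) * percentage / 100.0f + start;
}

Under that assumption, setAdjustEffect(0.0f) yields a sharpness of -2.0 and setAdjustEffect(1.0f) yields 5.0.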

The screen step factors are simply the reciprocals of the current output width and height, i.e. one step corresponds to one pixel of the current surface. The current texture coordinate corresponds to the center of the Laplacian kernel; offsetting it left and right by 1/width (one pixel) and up and down by 1/height gives the surrounding texels of the 3×3 Laplacian neighborhood (its layout is sketched below).

Because a fast Laplacian variant is used here, the corner positions LT, RT, LB and RB are not computed at all: their Laplacian coefficients are 0, so their products would be 0 anyway, and there is no need to waste varying outputs of the vertex shader on them.
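Putting centerMultiplier = 1.0 + 4.0 * sharpness and edgeMultiplier = sharpness together, the effective convolution kernel applied by this filter is the following (the corners are zero, which is exactly why they are skipped):

0                 -sharpness            0
-sharpness    1 + 4*sharpness    -sharpness
0                 -sharpness            0

In other words, it is the identity kernel plus sharpness times a 4-neighbor Laplacian: a sharpness of 0 leaves the image unchanged, and larger values increasingly boost the difference between each pixel and its neighbors.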

 

Next, look at the fragment shader, SHARPEN_FRAGMENT_SHADER:

varying highp vec2 textureCoordinate;
varying highp vec2 leftTextureCoordinate;
varying highp vec2 rightTextureCoordinate;
varying highp vec2 topTextureCoordinate;
varying highp vec2 bottomTextureCoordinate;
varying highp float centerMultiplier;
varying highp float edgeMultiplier;

uniform sampler2D SamplerY;
uniform sampler2D SamplerU;
uniform sampler2D SamplerV;
mat3 colorConversionMatrix = mat3(
                   1.0, 1.0, 1.0,
                   0.0, -0.39465, 2.03211,
                   1.13983, -0.58060, 0.0);
vec3 yuv2rgb(vec2 pos)
{
   vec3 yuv;
   yuv.x = texture2D(SamplerY, pos).r;
   yuv.y = texture2D(SamplerU, pos).r - 0.5;
   yuv.z = texture2D(SamplerV, pos).r - 0.5;
   return colorConversionMatrix * yuv;
}
void main()
{
    mediump vec3 textureColor = yuv2rgb(textureCoordinate);
    mediump vec3 leftTextureColor = yuv2rgb(leftTextureCoordinate);
    mediump vec3 rightTextureColor = yuv2rgb(rightTextureCoordinate);
    mediump vec3 topTextureColor = yuv2rgb(topTextureCoordinate);
    mediump vec3 bottomTextureColor = yuv2rgb(bottomTextureCoordinate);
    gl_FragColor = vec4((textureColor * centerMultiplier -
        (leftTextureColor * edgeMultiplier + rightTextureColor * edgeMultiplier +
         topTextureColor * edgeMultiplier + bottomTextureColor * edgeMultiplier)), 1.0);
}

The fragment shader is relatively simple. The color at each sampled position is read from the YUV input and converted to RGB, the Laplacian operator serves as the convolution kernel, and the convolution is carried out exactly as described in the concept section above. Finally the result is packed into a vec4 with the alpha channel set to 1.0 and written to gl_FragColor for rasterization.
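One detail worth spelling out: GLSL mat3 constructors are filled column by column, so the colorConversionMatrix above expands to

R = Y + 1.13983 * V
G = Y - 0.39465 * U - 0.58060 * V
B = Y + 2.03211 * U

which is the usual BT.601 YUV-to-RGB conversion (U and V are re-centered by subtracting 0.5 before the matrix multiply).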

 

To sum up:

Other sharpening algorithms are similar; the main difference lies in the convolution kernel. Using an OpenGL shader to perform the convolution is somewhat similar in spirit to CUDA-style parallel computation, rather than the traditional OpenCV approach of looping over rows of pixel values. And in GL there is no need to worry about out-of-bounds access or missing values at the image edges, because texture sampling offers clamp-to-edge and repeat wrap modes.
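For reference, here is a minimal sketch of how those wrap modes are set when a texture is configured. These are standard OpenGL ES calls; the helper name setWrapMode is my own, and the project's actual texture setup may differ:

#include <GLES2/gl2.h>

// Controls what the sampler returns when a convolution tap falls outside [0, 1]:
// GL_CLAMP_TO_EDGE extends the border pixel, GL_REPEAT wraps around to the far side.
void setWrapMode(GLuint textureId, GLint wrapMode /* GL_CLAMP_TO_EDGE or GL_REPEAT */) {
    glBindTexture(GL_TEXTURE_2D, textureId);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, wrapMode);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, wrapMode);
}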

The next chapter introduces the concepts of image quality and noise, and studies image blurring algorithms (Gaussian blur / box blur / bilateral blur).

Project address: https://github.com/MrZhaozhirong/NativeCppApp (Laplacian sharpening filter: cpp/gpufilter/filter/GpuSharpenFilter.hpp)

That is all.

Interest discussion group: 703531738. Code: Zhige 13567

 


Origin: blog.csdn.net/a360940265a/article/details/106615865