convolution

Convolution definition

Simply put, a kernel performs a weighted summation on the image, which can be expressed as

$H(x,y) = \sum _{i = 0} ^{M_i-1} \sum _{j = 0} ^{M_j-1}I(x+i-a_i,y+j-a_j)K(i,j)$

The anchor point of the convolution kernel refers to the update output point of the convolution kernel, which is generally the center point of the convolution kernel;

The step size of the convolution is the distance that the convolution kernel moves once on the picture.

Convolution boundary problem

When the convolution kernel moves to the boundary of the image, part of the convolution kernel will exceed the boundary of the original image. At this time, the operation cannot be performed, and the boundary of the original image must be expanded before the operation can be performed.

Expansion method

BORDER_DEFAULT: Fill with known edge mirroring

BORDER_CONSTANTP: Fill the edges with specified pixels

BORDER_REPLICATE: Fill the edges with the most edge pixels

BORDER_WRAP: Fill the edge with pixels on the other side

BORDER_ISOLATED: Fill the edges with 0

Convolution operator

First-order operator

Rebert

Roberts operator is also called cross differential operator. It is a gradient algorithm based on cross difference, which detects edge lines through local difference calculation. It is commonly used to process steep and low-noise images. When the edge of the image is close to plus or minus 45 degrees, the processing effect of this algorithm is more ideal. The disadvantage is that the positioning of the edge is not accurate, and the extracted edge line is thick.

$G_x = \begin{bmatrix} +1 & 0 \\ 0 & -1 \end{bmatrix}$ $G_y=\begin{bmatrix} 0 & +1\\ -1 & 0 \end{bmatrix}$

prewitt

Prewitt is a differential operator for image edge detection. Its principle is to use the difference generated by the pixel gray value in a specific area to achieve edge detection. Since the Prewitt operator uses a 3*3 template to calculate the pixel values in the area, and the Robert operator’s template is 2*2, the edge detection results of the Prewitt operator are more obvious in the horizontal and vertical directions than the Robert operator , Prewitt operator is suitable for identifying images with more noise and gradual gray scale.

$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1\end{bmatrix}$ $G_y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1\end{bmatrix}$

The classic Prewitt operator believes that: any pixel with a new gray value greater than or equal to the threshold is an edge point. That is, select an appropriate threshold T, if P(i,j)≥T, then (i,j) is the edge point, and P(i,j) is the edge image. This kind of judgment is unreasonable and will cause misjudgment of edge points, because the gray value of many noise points is also very large, and for edge points with small amplitude, the edges are lost instead.

　　Prewitt operator has an effect on noise suppression. The principle of suppressing noise is through pixel averaging, but pixel averaging is equivalent to low-pass filtering of the image, so Prewitt operator is not as good as Roberts operator for edge positioning.

Sobel

Sobel operator is a discrete differential operator used for edge detection, which combines Gaussian smoothing and differential derivation. This operator is used to calculate the approximate value of image brightness. According to the brightness of the edge of the image, the specific points exceeding a certain number in the area are recorded as edges. The Sobel operator adds the concept of weight on the basis of the Prewitt operator. It is believed that the distance between adjacent points has a different impact on the current pixel. The closer the pixel is, the greater the impact of the current pixel, so as to realize the image Sharpen and highlight edge contours.

The edge positioning of the Sobel operator is more accurate, and it is often used for images with more noise and gradual grayscale.

$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1\end{bmatrix}$ $G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1\end{bmatrix}$

The Sobel operator detects the edge based on the gray-weight difference of the pixel point up and down, the left and right adjacent points, and reaches the extreme value at the edge. It has a smoothing effect on noise and provides more accurate edge direction information. Because the Sobel operator combines Gaussian smoothing and differential derivation (differentiation), the result will have more noise resistance. When the accuracy requirements are not very high, Sobel operator is a more commonly used edge detection method.

The approximate value of the horizontal and vertical gradient of each pixel of the image can be combined with the following formula to calculate the magnitude of the gradient.

$G = \sqrt{G_x^2+Gy^2}$

However, in actual engineering, the above formula is often replaced with the following formula to simplify calculations:

$G=\left | G_x \right | + \left | G_y \right |$

Isotropic Sobel算子

　　Another form of the Sobel operator is the (Isotropic Sobel) operator, a weighted average operator. The weight is inversely proportional to the distance between the zero point and the center store. When the edges are detected in different directions, the gradient amplitude is the same, which is commonly referred to as isotropic Sobel (Isotropic Sobel) operator. There are also two templates, one for detecting horizontal edges, and the other for detecting vertical flat edges. Compared with ordinary Sobel operators, the isotropic Sobel operator has more accurate position weighting coefficients, and the magnitude of the gradient is the same when detecting edges in different directions.

$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -\sqrt{2} & 0 & \sqrt{2}\\ -1 & 0 & 1\end{bmatrix}$ $G_y = \begin{bmatrix} -1 & -\sqrt{2} & -1 \\ 0 & 0 & 0 \\ 1 & \sqrt{2} & 1\end{bmatrix}$

Scharr operator

The difference between Scharr operator and Sobel operator is in the smoothing part. The smoothing operator used here is 1/16 *[3, 10, 3], compared to 1/4*[1, 2, 1], the center The weight of the element is heavier. This may be relative to the image, which is a more random signal, and the domain is not relevant. Therefore, the neighborhood smoothing should use a Gaussian function with a relatively small standard deviation, which is thinner and taller. template.

　　Because the Sobel operator calculates relatively small kernels, the accuracy of the approximate derivative is relatively low. For example, a 3*3 Sobel operator, when the gradient angle is close to the horizontal or vertical direction, its inaccuracy becomes more obvious. . The Scharr operator is as fast as the Sobel operator, but the accuracy is higher, especially for scenarios with smaller cores. Therefore, the use of 3*3 filters to achieve image edge extraction is more recommended to use Scharr operator.

The Sobel operator is a form of filter operator to extract edges. Each of the X and Y directions uses a template, and the two templates are combined to form a gradient operator. The X-direction template has the greatest impact on the vertical edge, and the Y-direction template has the greatest impact on the horizontal edge.
Robert operator is a kind of gradient operator. It expresses the gradient by cross-checking. It is an operator that uses the local difference operator to find the edge. It has the best effect on images with steep and low noise.
The prewitt operator is a weighted average operator, which has the effect of suppressing noise, but the pixel average is equivalent to the same filtering of the image, so the prewitt operator is not as good as the robert operator in positioning the edge.

Second-order operator

Laplace

Laplacian operator is a second-order differential operator in n-dimensional Euclidean space, which is often used in the field of image enhancement and edge extraction. It calculates the pixels in the neighborhood by gray difference. The basic process is to determine the gray value of the center pixel of the image and the gray value of other pixels around it. If the gray value of the center pixel is higher, the gray value of the center pixel is increased; Reduce the gray level of the center pixel, so as to achieve image sharpening operation. In the algorithm implementation process, the Laplacian operator calculates the gradients in the four or eight directions of the center pixel in the neighborhood, and then adds the gradients to determine the relationship between the gray level of the center pixel and the gray levels of other pixels in the neighborhood, and finally through the gradient operation As a result, the pixel grayscale is adjusted.

$G_4 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0\end{bmatrix}$ $G_8= \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1\end{bmatrix}$

The basic process of the Laplacian operator is to determine the gray value of the central pixel of the image and the gray value of other pixels around it. If the gray of the center pixel is higher, then the gray of the center pixel is increased; otherwise, the gray of the center pixel is reduced. So as to realize the image sharpening operation. In the algorithm implementation process, the Laplacian operator calculates the gradients in the four or eight directions of the center pixel in the neighborhood, and then adds the gradients to determine the relationship between the gray level of the center pixel and the gray levels of other pixels in the neighborhood, and finally through the gradient operation As a result, the pixel grayscale is adjusted.

Laplace operator is an isotropic operator, second-order differential operator, with rotation invariance. It is more appropriate when you only care about the position of the edge without considering the grayscale difference of the surrounding pixels. The response of Laplace operator to isolated pixels is stronger than the response to edges or lines, so it is only suitable for noise-free images. In the presence of noise, low-pass filtering is required before edge detection using the Laplacian operator. Therefore, the usual segmentation algorithm combines the Laplacian operator and the smoothing operator to generate a new template.

LOG（Laplacian of Gaussian）

The LOG operator first performs Gaussian filtering on the image, and then finds the Laplacian second derivative, and detects the boundary of the image based on the pot zero point of the second derivative, that is, by detecting the zero crossings of the filtering result. To get the edge of the image or object.

　　The LOG operator comprehensively considers the two directions of noise suppression and edge detection, and combines the Gauss smoothing filter and the Laplacian sharpening filter to smooth out the noise first, and then perform edge detection, so the effect will be better . This operator is similar to the mathematical model in visual physiology, so it has been widely used in the field of image processing. It has strong anti-interference ability, high boundary positioning accuracy, good edge continuity, and can effectively extract boundaries with weak contrast.

API introduction

Fill the edges

copyMakeBorder(src,dst,top,bottom,left,right,borderType,color)
// src 原图   dst 目标图
// top,bottom,left,right 上下左右填充宽度
//borderType 填充类型 
//color 当使用BORDERTPYPE_CONSTANTP时，才用

Sobel (Scharr) operator

Sobel(  //Scharr相同
InputArray Src,   //输入图像
QutputArray dst,  //输出图像
int depth,        //输出位深度 设为"-1"会自适应 选择合适的
int dx,    //x方向，几阶导数
int dy,    //y方向，几阶导数      
int ksize,        //kernel大小，必须是奇数
double scale = 1, //输出是否缩放，即乘因子
double delta = 0, //输出是否加偏差，即加因子
int borderType = BORDER_DEFAULT  //
)

Laplacian算子

Laplacian( 
src_gray,     //src_gray: 输入图像
dst,          //dst: 输出图像
ddepth,       //ddepth: 输出图像的深度因为输入图像的深度是 CV_8U ，这里我们必须定义 ddepth = CV_16S 以避免外溢。
kernel_size,  //kernel_size: 内部调用的 Sobel算子的内核大小，此例中设置为3。
scale, 
delta, 
BORDER_DEFAULT 
);

Code practice

Edge supplement

#include <iostream>
#include <math.h>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;

int main(int argc, char* argv[])
{
	//src = imread("src.jpg");
	Mat src = imread("cat.png"),dst;
	if (!src.data)
	{
		cout << "cannot open image" << endl;
		return -1;
	}
	namedWindow("input image", WINDOW_AUTOSIZE);
	imshow("input image", src);
	Mat d1, d2, d3, d4,d5,d6,d7,d8;
	int top = (int)(0.10 * src.rows);
	int bottom = (int)(0.10 * src.rows);
	int left = (int)(0.10 * src.cols);
	int right = (int)(0.10 * src.cols);

	copyMakeBorder(src, d1, top, bottom, left, right, BORDER_CONSTANT, Scalar(255, 255, 255));
	copyMakeBorder(src, d2, top, bottom, left, right, BORDER_DEFAULT);
	copyMakeBorder(src, d3, top, bottom, left, right, BORDER_REFLECT);
	copyMakeBorder(src, d4, top, bottom, left, right, BORDER_REPLICATE);
	copyMakeBorder(src, d5, top, bottom, left, right, BORDER_WRAP);
	copyMakeBorder(src, d6, top, bottom, left, right, BORDER_ISOLATED);
	copyMakeBorder(src, d7, top, bottom, left, right, BORDER_REFLECT101);

	imshow("BORDER_CONSTANT", d1);
	imshow("BORDER_DEFAULT", d2);
	imshow("BORDER_REFLECT", d3);
	imshow("BORDER_REPLICATE", d4);
	imshow("BORDER_WRAP", d5);
	imshow("BORDER_ISOLATED", d6);
	imshow("BORDER_REFLECT101", d7);

	waitKey(0);
	return 0;
}

Operator extract edge

#include <iostream>
#include <math.h>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;

int main(int argc, char* argv[])
{
	//src = imread("src.jpg");
	Mat src = imread("src.jpg"),gray_src;
	if (!src.data)
	{
		cout << "cannot open image" << endl;
		return -1;
	}
	namedWindow("input image", WINDOW_AUTOSIZE);
	imshow("input image", src);
	GaussianBlur(src, src, Size(3, 3), 0, 0);
	cvtColor(src,src,COLOR_BGR2GRAY);

	//Robert算子
	//X
	Mat dst_rx, dst_ry;
	Mat kernel_x1 = (Mat_<int>(2, 2) << 1, 0, 0, -1);
	filter2D(src, dst_rx, -1, kernel_x1, Point(-1, -1), 0.0);
	
	//Y
	Mat kernel_y1 = (Mat_<int>(2, 2) << 0 ,1, -1, 0);
	filter2D(src, dst_ry, -1, kernel_y1, Point(-1, -1), 0.0);

	//拼接两个方向的梯度
	Mat xygrad = Mat(src.size(), src.type());
	int width = xygrad.cols;
	int height = xygrad.rows;
	for (int row = 0; row < height; row++)
	{
		for (int col = 0; col < width; col++)
		{
			int xg = dst_rx.at<uchar>(row, col);
			int yg = dst_ry.at<uchar>(row, col);
			int xy = xg + yg;
			xygrad.at<uchar>(row, col) = saturate_cast<uchar>(xy);
		}
	}
	imshow("Rebort", xygrad);


	// prewitt算子
	Mat dst_sx, dst_sy;
	//X
	Mat kernel_x2 = (Mat_<int>(3, 3) << -1, 0, 1, -1, 0, 1, -1, 0, 1);
	filter2D(src, dst_sx, -1, kernel_x2, Point(-1, -1), 0.0);
	
	//Y
	Mat kernel_y2 = (Mat_<int>(3, 3) << -1, -2, -1, 0, 0, 0, 1, 2, 1);
	filter2D(src, dst_sy, -1, kernel_y2, Point(-1, -1), 0.0);
	
	for (int row = 0; row < height; row++)
	{
		for (int col = 0; col < width; col++)
		{
			int xg = dst_sx.at<uchar>(row, col);
			int yg = dst_sy.at<uchar>(row, col);
			int xy = xg + yg;
			xygrad.at<uchar>(row, col) = saturate_cast<uchar>(xy);
		}
	}
	imshow("Prewitt", xygrad);

	//Sobel 算子
	Sobel(src, dst_rx, CV_16S, 1, 0, 3);
	convertScaleAbs(dst_rx, dst_rx);
	Sobel(src, dst_ry, CV_16S, 0, 1, 3);
	convertScaleAbs(dst_ry, dst_ry);
	for (int row = 0; row < height; row++)
	{
		for (int col = 0; col < width; col++)
		{
			int xg = dst_rx.at<uchar>(row, col);
			int yg = dst_ry.at<uchar>(row, col);
			int xy = xg + yg;
			xygrad.at<uchar>(row, col) = saturate_cast<uchar>(xy);
		}
	}
	imshow("Sobel", xygrad);

	//Scharr 算子
	Scharr(src, dst_rx, CV_16S, 1, 0);
	convertScaleAbs(dst_rx, dst_rx);
	Scharr(src, dst_ry, CV_16S, 0, 1);
	convertScaleAbs(dst_ry, dst_ry);
	for (int row = 0; row < height; row++)
	{
		for (int col = 0; col < width; col++)
		{
			int xg = dst_rx.at<uchar>(row, col);
			int yg = dst_ry.at<uchar>(row, col);
			int xy = xg + yg;
			xygrad.at<uchar>(row, col) = saturate_cast<uchar>(xy);
		}
	}
	imshow("Scharr", xygrad);

	//Laplacian 算子
	Mat edge_image;
	Laplacian(src, edge_image, CV_16S, 3);
	convertScaleAbs(edge_image, edge_image);
	imshow("Laplacian", edge_image);
	

	waitKey(0);
	return 0;
}