OpenCV4 quick start (full study notes)

Chapter 1 Basics

1.1 Introduction to the basic structure

Author blog https://blog.csdn.net/shuiyixin?type=blog

https://blog.csdn.net/shuiyixin/article/details/106046827

1.1.1 Mat class

https://blog.csdn.net/shuiyixin/article/details/106014341

	Mat src, src_roi;
	src = imread("./image/cat.jpg");
	
	if (!src.data)
	{
		cout << "ERROR : could not load image.\n";
		waitKey(0);
		return -1;
	}
 
	imshow("input image", src);
	src_roi = Mat(src, Rect(100, 100, 300, 200)); // a very commonly used way to construct a Mat
	imshow("roi image", src_roi);

1.1.2 Rect structure

https://blog.csdn.net/shuiyixin/article/details/106085233

1.1.3 Scalar structure

Scalar is a template class for a four-element vector, derived from Vec:

    Scalar_(_Tp v0, _Tp v1, _Tp v2=0, _Tp v3=0);

    Scalar(0, 0, 255); // red

    Scalar(0, 255, 0); // green

    Scalar(255, 0, 0); // blue

In general, we only assign the first three values. The first parameter represents blue, the second green, and the third red, i.e. BGR order:
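A minimal sketch of Scalar in practice, filling solid-color images in BGR order (nothing beyond core OpenCV is assumed):

```cpp
#include <opencv2/opencv.hpp>

int main()
{
	cv::Mat red(100, 100, CV_8UC3, cv::Scalar(0, 0, 255));   // solid red
	cv::Mat green(100, 100, CV_8UC3, cv::Scalar(0, 255, 0)); // solid green
	cv::imshow("red", red);
	cv::imshow("green", green);
	cv::waitKey(0);
	return 0;
}
```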

https://blog.csdn.net/shuiyixin/article/details/106111460

1.1.4 RNG class

https://blog.csdn.net/weixin_43588171/article/details/104810295

1.2 Basic functions

1.2.1 saturate_cast()

https://blog.csdn.net/qq_27278957/article/details/88648737

In OpenCV, saturate_cast() prevents data overflow by clamping. For 8-bit data its behavior can be understood as the following code:

if(data < 0)
     data = 0;
else if(data > 255)
	 data = 255;
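A minimal sketch of the real saturate_cast<> doing the same clamping:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
	int sum = 200 + 100; // raw result 300 exceeds the CV_8U range
	std::cout << (int)cv::saturate_cast<uchar>(sum) << std::endl; // prints 255
	std::cout << (int)cv::saturate_cast<uchar>(-20) << std::endl; // prints 0
	return 0;
}
```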

1.2.2 reshape()

C++: Mat Mat::reshape(int cn, int rows=0) const

cn: the number of channels; 0 keeps the channel count unchanged, otherwise it becomes the given value.

rows: the number of matrix rows; 0 keeps the original row count unchanged, otherwise it becomes the given value.
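A minimal sketch of both parameters; the printed shapes follow from the 4x6 three-channel input:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
	cv::Mat color(4, 6, CV_8UC3, cv::Scalar(1, 2, 3));
	cv::Mat flat = color.reshape(1, 0);  // cn = 1: becomes 4x18 single-channel, rows unchanged
	cv::Mat tall = color.reshape(0, 12); // rows = 12: becomes 12x2, channels unchanged
	std::cout << flat.rows << "x" << flat.cols << " ch=" << flat.channels() << std::endl; // 4x18 ch=1
	std::cout << tall.rows << "x" << tall.cols << " ch=" << tall.channels() << std::endl; // 12x2 ch=3
	return 0;
}
```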

1.2.3 Create()

Function prototypes:
        inline void Mat::create(int _rows, int _cols, int _type)
		inline void Mat::create(Size _sz, int _type)
		void Mat::create(int ndims, const int* sizes, int type)
Purpose:
        1) allocates new array data if needed
		2) creates the matrix body of an image matrix

1.3 Procedure

1.3.1 Exiting the program

	char key = (char)waitKey();
	if (key == 27 || key == 'q' || key == 'Q')
	{
		break;
	}

Chapter 2 Data Loading, Displaying and Saving

2.1 Image storage container

Digital images are stored in the form of a matrix in the computer, and each element in the matrix describes certain image information, such as brightness, color, etc.

OpenCV uses the Mat class to store data, with automatic memory management: when a variable is no longer needed, its memory is released.


CV_8U is unsigned 8 bits/pixel, i.e. each pixel takes a value in 0-255, the normal range for most image and video formats.

CV_32F stores a float per pixel, typically any value in 0-1.0, which is useful for computing with some datasets, but it must be converted back to 8 bits (multiplying each pixel by 255) for saving or display.

2.1.2 Mat class construction and assignment

1. Construction of the Mat class

cv::Mat a(3, 3, CV_8S);             // corresponds to code listing 2-5
cv::Mat b(cv::Size(3, 3), CV_8UC1); // corresponds to code listing 2-6

	cv::Mat A = cv::Mat_<double>(3, 3);
	cv::Mat f(3, 3, CV_8S);
	cv::Mat g(cv::Size(3, 3), CV_8UC1);
	cv::Mat h(cv::Size(3, 3), CV_8UC1);
	cv::Mat i = g;
	cv::Mat j(g); // modifying j modifies g (shallow copy)
	cv::Mat k(g, cv::Range(2, 5), cv::Range::all()); // Range holds 0-based indices: this takes rows
	                                                 // 2, 3 and 4 (the end index 5 is excluded);
	                                                 // modifying k modifies g

2. Assignment of Mat class

	cv::Mat g = cv::Mat(cv::Size(3, 3), CV_8UC1, cv::Scalar(255)); // fill every element via Scalar
	cv::Mat g2 = (cv::Mat_<int>(3, 3) << 1, 2, 3, 4, 5, 6, 7, 8, 9); // enumerate the elements

	cv::Mat c = cv::Mat_<int>(3, 3);
	for (int i = 0; i < c.rows; i++)
	{
		for (int j = 0; j < c.cols; j++)
		{
			c.at<int>(i, j) = i + j; // the type in at<>() must match the element type of cv::Mat_<int>(3, 3)
		}
	}

	// identity matrix
	cv::Mat a(cv::Mat::eye(3, 3, CV_8UC1));
	// diagonal matrix built from a vector
	cv::Mat b = (cv::Mat_<int>(1, 3) << 1, 2, 3);
	cv::Mat c = cv::Mat::diag(b);
	// all-ones and all-zeros matrices
	cv::Mat d = cv::Mat::ones(3, 3, CV_8UC1);
	cv::Mat e = cv::Mat::zeros(3, 3, CV_8UC3);

	// Matrix arithmetic
	cv::Mat a = (cv::Mat_<int>(3, 3) << 1, 2, 3, 4, 5, 6, 7, 8, 9);
	cv::Mat b = (cv::Mat_<int>(3, 3) << 1, 2, 3, 4, 5, 6, 7, 8, 9);
	cv::Mat c = (cv::Mat_<double>(3, 3) << 1.0, 2.1, 3.2, 4.0, 5.1, 6.2, 2, 2, 2);
	cv::Mat d = (cv::Mat_<double>(3, 3) << 1.0, 2.1, 3.2, 4.0, 5.1, 6.2, 2, 2, 2);
	cv::Mat e, f, g, h, i;
	e = a + b; // arithmetic saturates: results above the type's maximum are clamped to it
	f = c - d;
	g = 2 * a;
	h = d / 2.0;
	i = a - 1;

	cv::Mat j, m; /* for matrix multiplication (*), the element type must be one of
	CV_32FC1, CV_64FC1, CV_32FC2 or CV_64FC2; that is, a two-dimensional Mat
	must hold float or double data */
	double k;
	j = c * d;
	k = a.dot(b); // equivalent to the dot product of vectors, also called the inner or scalar product
	              // dot() extends the 1-D vector dot product: the whole Mat is flattened into a row
	              // (column) vector and a vector dot product is performed; the two Mats must still have
	              // exactly the same number of rows and columns.
	              // dot() places no restriction on the data types of A and B.
	              // If the two Mats are multi-channel, the dot product of each channel is computed
	              // separately and the results are summed; the result is still a double.
	m = a.mul(b); // element-wise multiplication, each product stays in place
	              // no restriction on the data type, but A and B must match
	              // the output type defaults to that of A and B

/* In image processing the common data type is CV_8U, with range 0-255. When two
large integers are multiplied the result overflows and is output as 255, so when
using the mul() method you need to guard against data overflow. */

2.1.4 Reading of Mat elements


	cv::Mat b(3, 4, CV_8UC3, cv::Scalar(0, 0, 1));
	cv::Vec3b vc3 = b.at<cv::Vec3b>(0, 0); // cv::Vec3x for 3-channel data, cv::Vec2x for 2-channel, cv::Vec4x for 4-channel
	int first = (int)vc3.val[0];
	int second = (int)vc3.val[1];
	int third = (int)vc3.val[2];
	std::cout << first << " " << second << " " << third << std::endl;

	for (int i = 0; i < b.rows; i++)
	{
		uchar* ptr = b.ptr<uchar>(i);
		for (int j = 0; j < b.cols*b.channels(); j++)
		{
			std::cout << (int)ptr[j] << " ";
		}
		std::cout << std::endl;
	}
	// To read the 3rd value in the 2nd row, b.ptr<uchar>(1)[2] accesses it directly.

	cv::MatIterator_<cv::Vec3b> it = b.begin<cv::Vec3b>(); // the iterator type must match the Mat's element type, otherwise an error is raised
	cv::MatIterator_<cv::Vec3b> it_end = b.end<cv::Vec3b>();
	for (int i = 0; it != it_end; it++)
	{
		std::cout << *it << " "; // *it prints the Vec3b value directly
		if ((++i % b.cols) == 0)
		{
			std::cout << std::endl;
		}
	}

	std::cout << (int)(*(b.data + b.step[0] * 1 + b.step[1] * 1 + 1)) << std::endl; // the usage of data and step is explained at https://blog.csdn.net/baoxiao7872/article/details/80210021
image.at<uchar>(i, j)

Note that i corresponds to the point's y coordinate and j to its x coordinate, not the (x, y) order we are used to.


2.2 Image reading and display

	cv::Mat cv::imread(const String & filename, int flags = IMREAD_COLOR)
	// Several flag parameters can be combined as long as their functions do not conflict, separated by "|".
	// By default the number of pixels in the image to read must be less than 2^30; the system
	// variable OPENCV_IO_MAX_IMAGE_PIXELS can be adjusted to raise this limit.
	// Use empty() to check whether the image was read successfully: on failure the data attribute
	// is 0 and empty() returns true.

	void cv::namedWindow(const String & winname, int flags = WINDOW_AUTOSIZE) // window property flags for namedWindow()

	void cv::imshow(const String & winname, InputArray mat)
	// If the program exits right after imshow(), the displayed image may flash and disappear, so in
	// programs that display images imshow() is usually followed by cv::waitKey() to pause for a while.
	// cv::waitKey() waits for the given number of milliseconds; with the default value or "0" it
	// waits until the user presses a key.


2.3 Video loading and camera calling

// Opening a video file
	cv::VideoCapture::VideoCapture();
	cv::VideoCapture::VideoCapture(const String & filename, int apiPreference = CAP_ANY);
// Check the result with isOpened(): it returns true if the video was opened successfully, false otherwise.

// Opening a camera
	cv::VideoCapture::VideoCapture(int index, int apiPreference = CAP_ANY); // camera indices start at 0
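A minimal sketch of a camera read loop built on these prototypes:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
	cv::VideoCapture cap(0); // camera index 0: the default camera
	if (!cap.isOpened())
	{
		std::cout << "failed to open the camera" << std::endl;
		return -1;
	}
	cv::Mat frame;
	while (cap.read(frame)) // read() returns false when no frame is available
	{
		cv::imshow("camera", frame);
		char key = (char)cv::waitKey(30);
		if (key == 27 || key == 'q') break; // Esc or q quits
	}
	return 0;
}
```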


2.4 Data storage

	bool cv::imwrite(const String & filename, InputArray img, const std::vector<int>& params = std::vector<int>());

    cv::VideoWriter::VideoWriter(const String & filename, int fourcc, double fps, Size frameSize, bool isColor=true);

	cv::FileStorage::FileStorage(); // must be followed by open()
	cv::FileStorage::FileStorage(const String & filename, int flags, const String & encoding = String());
	cv::FileStorage::open(const String & filename, int flags, const String & encoding = String());
	cv::FileStorage::write(const String & name, int val); // overloaded for several value types
#include <opencv2/opencv.hpp>
#include <iostream>
#include <string>

using namespace std;
using namespace cv;

int main(int argc, char** argv)
{
	system("color F0");  // change console background and text color (Windows)
	//string fileName = "datas.xml";  // file name
	string fileName = "datas.yaml";  // file name
	// open the file in write mode
	cv::FileStorage fwrite(fileName, cv::FileStorage::WRITE);

	// store a Mat
	Mat mat = Mat::eye(3, 3, CV_8U);
	fwrite.write("mat", mat);  // write data with the write() function
	// store a float under the node name x
	float x = 100;
	fwrite << "x" << x;
	// store a string under the node name str
	String str = "Learn OpenCV 4";
	fwrite << "str" << str;
	// store an array under the node name number_array
	fwrite << "number_array" << "[" << 4 << 5 << 6 << "]";
	// store nested nodes under the parent name multi_nodes
	fwrite << "multi_nodes" << "{" << "month" << 8 << "day" << 28 << "year"
		<< 2019 << "time" << "[" << 0 << 1 << 2 << 3 << "]" << "}";

	// close the file
	fwrite.release();

	// open the file in read mode
	cv::FileStorage fread(fileName, cv::FileStorage::READ);
	// check whether the file was opened successfully
	if (!fread.isOpened())
	{
		cout << "Failed to open the file; please check the file name!" << endl;
		return -1;
	}

	// read data from the file
	float xRead;
	fread["x"] >> xRead;  // read the float
	cout << "x=" << xRead << endl;

	// read the string
	string strRead;
	fread["str"] >> strRead;
	cout << "str=" << strRead << endl;

	// read the number_array node, which holds several values
	FileNode fileNode = fread["number_array"];
	cout << "number_array=[";
	// iterate over each value
	for (FileNodeIterator i = fileNode.begin(); i != fileNode.end(); i++)
	{
		float a;
		*i >> a;
		cout << a << " ";
	}
	cout << "]" << endl;

	// read the Mat
	Mat matRead;
	fread["mat"] >> matRead;
	cout << "mat=" << matRead << endl;

	// read a node with several child nodes, without FileNode iterators
	FileNode fileNode1 = fread["multi_nodes"];
	int month = (int)fileNode1["month"];
	int day = (int)fileNode1["day"];
	int year = (int)fileNode1["year"];
	cout << "multi_nodes:" << endl
		<< "  month=" << month << "  day=" << day << "  year=" << year;
	cout << "  time=[";
	for (int i = 0; i < 4; i++)
	{
		int a = (int)fileNode1["time"][i];
		cout << a << " ";
	}
	cout << "]" << endl;

	cout << "  time=[";
	for (FileNodeIterator i = fileNode1["time"].begin(); i != fileNode1["time"].end(); i++)
	{
		float a;
		*i >> a;
		cout << a << " ";
	}
	cout << "]" << endl; // both reading styles work: one uses an iterator, the other indexes by name

	system("pause");
	// close the file
	fread.release();
	return 0;
}

Chapter 3 Basic Image Operation

3.1 Image color space

3.1.1 Color model and conversion - convertTo()

The RGB model is in the order of BGR in OpenCV.

Adding a transparency channel to the RGB model gives the RGBA model.

HSV、YUV、Lab、GRAY

void cv::Mat::convertTo(OutputArray m, int rtype, double alpha = 1, double beta = 0); // rtype: target data type; alpha: scale factor; beta: offset

alpha and beta declare the linear conversion relationship between the two data types: m(x,y) = saturate_cast<rtype>(alpha*(*this)(x,y) + beta), i.e. the conversion applies a linear transformation to the original data.
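A minimal sketch of that linear mapping, converting CV_8U (0-255) to CV_32F (0-1) and back; the image path is an assumption:

```cpp
cv::Mat img8u = cv::imread("lena.png"); // hypothetical path
cv::Mat img32f, back8u;
img8u.convertTo(img32f, CV_32F, 1.0 / 255.0, 0); // m(x,y) = img8u(x,y) / 255
img32f.convertTo(back8u, CV_8U, 255.0, 0);       // scale back for saving or display
```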

3.1.2 Multi-channel separation and merging - split(), merge()

// Multi-channel separation
void cv::split(const Mat & src, Mat * mvbegin); // the length of mvbegin must be defined in advance
void cv::split(InputArray m, OutputArrayOfArrays mv); // mv is a vector, no need to pre-define the array length

// Multi-channel merging
void cv::merge(const Mat *mv, size_t count, OutputArray dst); // mv: array of images to merge, all with the same size and data type; count: length of the input array
void cv::merge(InputArrayOfArrays mv, OutputArray dst); // mv: vector of images to merge, all with the same size and data type

// Merging can produce more than 4 channels; Image Watch does not show channel values beyond the 4th inside a pixel cell, but the per-channel values are still displayed above it.

The figure showed a 6-channel image built by splitting a Lena image into three channels, putting them into a Mat array of length 6, and merging them with merge(); a sketch follows.

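A minimal sketch of split()/merge(), including the 6-channel case described above; the image path is an assumption:

```cpp
cv::Mat img = cv::imread("lena.png"); // hypothetical path
std::vector<cv::Mat> channels;
cv::split(img, channels);             // channels.size() == 3 for a BGR image

cv::Mat mats[6] = { channels[0], channels[1], channels[2],
                    channels[0], channels[1], channels[2] };
cv::Mat six;
cv::merge(mats, 6, six);              // 6-channel result; same size and depth required

cv::Mat bgr;
cv::merge(channels, bgr);             // vector overload: back to the 3-channel image
```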

3.2 Image pixel operation processing

3.2.1 Image pixel statistics

1. Find the maximum and minimum pixel values - minMaxLoc()

void cv::minMaxLoc(InputArray src, 
                   double * minVal, 
                   double * maxVal = 0, 
                   Point * minLoc = 0, 
                   Point * maxLoc = 0, 
                   InputArray mask = noArray()); // src: the input single-channel matrix

The data type Point represents the pixel coordinates of an image.

The pixel coordinate axes of an image take the upper-left corner as the origin, with the x axis horizontal and the y axis vertical.

Point(x, y) relates to the rows and columns of the image as Point(column index, row index).

OpenCV defines several data types for two- and three-dimensional coordinates. For two-dimensional coordinates: integer coordinates cv::Point2i (or cv::Point), double coordinates cv::Point2d, and floating-point coordinates cv::Point2f. The same types are defined for three-dimensional coordinates, with the "2" replaced by "3". The x, y and z components are accessed through the corresponding members; for example, Point.x reads the x coordinate.

To apply it to a multi-channel matrix, first use cv::Mat::reshape() to convert the channels into a single channel:

cv::Mat MatA_reshape = MatA.reshape(int cn, int rows = 0) // cn: channel count; rows: row count after conversion, 0 (default) keeps it the same as before
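A minimal sketch of minMaxLoc(), including the reshape() step for a multi-channel input:

```cpp
cv::Mat a = (cv::Mat_<int>(3, 4) << 1, 9, 2, 7, 3, 8, 4, 6, 5, 0, 11, 10);
double minVal, maxVal;
cv::Point minLoc, maxLoc;
cv::minMaxLoc(a, &minVal, &maxVal, &minLoc, &maxLoc);
// minVal = 0 at minLoc = Point(1, 2), i.e. Point(column, row)

cv::Mat multi(3, 4, CV_32FC2, cv::Scalar(1, 2));
cv::Mat flat = multi.reshape(1, 0); // to single channel first
cv::minMaxLoc(flat, &minVal, &maxVal, &minLoc, &maxLoc);
```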

Figure 3-7

2. Calculate the mean and standard deviation of the image - mean, meanStdDev

The mean of an image indicates the overall brightness of the image: the larger the mean, the brighter the image overall.

The standard deviation indicates the contrast of light-dark variation in the image: the larger the standard deviation, the more pronounced the light-dark variation.

	// mean
	cv::Scalar cv::mean(InputArray src, InputArray mask = noArray()) // result is received in a Scalar
	// mean and standard deviation
	void cv::meanStdDev(InputArray src, OutputArray mean, OutputArray stddev, InputArray mask = noArray()) // results are received in Mat

3.2.2 Pixel manipulation between images

1. Comparison operation of two images - max(), min()

void cv::max(InputArray src1,InputArray src2,OutputArray dst)
void cv::min(InputArray src1,InputArray src2,OutputArray dst)

This comparison operation is mainly used in the processing of matrix type data. The comparison operation with the mask image can achieve the effect of matting or selecting channels.

	// compare two color images
	Mat img0 = imread("E:/graduate/learn/OpenCV/《OpenCV 4快速入门》数据集/data/lena.png");
	Mat img1 = imread("E:/graduate/learn/OpenCV/《OpenCV 4快速入门》数据集/data/noobcv.jpg");

	if (img0.empty() || img1.empty())
	{
		cout << "Please check whether the image file names are correct" << endl;
		return -1;
	}
	Mat comMin, comMax;
	max(img0, img1, comMax);
	min(img0, img1, comMin);
	imshow("comMin", comMin);
	imshow("comMax", comMax);

	// comparison against a mask
	Mat src1 = Mat::zeros(Size(512, 512), CV_8UC3); // a 512x512 all-zero matrix
	Rect rect(100, 100, 300, 300); // a 300x300 rectangle starting at (100, 100)
	src1(rect) = Scalar(255, 255, 255);  // build a 300x300 low-pass mask: the pixels inside rect are set to (255, 255, 255)
	Mat comsrc1, comsrc2;
	min(img0, src1, comsrc1);
	imshow("comsrc1", comsrc1);

	Mat src2 = Mat(512, 512, CV_8UC3, Scalar(0, 0, 255));  // a low-pass mask that keeps only the red channel
	min(img0, src2, comsrc2);
	imshow("comsrc2", comsrc2);

Figure 3-11

Taking min() of the black-background image (used as a low-pass mask) and the Lena image cuts off everything outside the white rectangle; the red mask works the same way.

2. Logical operations on two images - bitwise_*()

	// logical AND
	void cv::bitwise_and(InputArray src1, InputArray src2, OutputArray dst, InputArray mask = noArray())
	// logical OR
	void cv::bitwise_or(InputArray src1, InputArray src2, OutputArray dst, InputArray mask = noArray())
	// logical XOR
	void cv::bitwise_xor(InputArray src1, InputArray src2, OutputArray dst, InputArray mask = noArray())
	// logical NOT
	void cv::bitwise_not(InputArray src, OutputArray dst, InputArray mask = noArray())


3.2.3 Image binarization——threshold()

	double cv::threshold(InputArray src, 
                         OutputArray dst, 
                         double thresh, 
                         double maxval, 
                         int type
                        )
        // thresh: the binarization threshold
        // maxval: the maximum value used in binarization; only the THRESH_BINARY and THRESH_BINARY_INV methods use it
        // type: the binarization method

The two flags THRESH_OTSU and THRESH_TRIANGLE are methods for obtaining the threshold, not threshold comparison methods. They can be combined with the previous five flags, e.g. "THRESH_BINARY | THRESH_OTSU". The first five flags all require the threshold to be set manually when calling the function; if you do not understand the image and set the threshold unreasonably, the result suffers badly. These two flags select the Otsu method (OTSU) and the triangle method (TRIANGLE) respectively, which obtain the binarization threshold from the image's gray-value distribution, and the threshold is returned as the function's return value. Therefore, if either flag is set in the last parameter, the third parameter thresh is determined automatically by the system; it still cannot be omitted in the call, but the program will not use the supplied value. Note that so far OpenCV 4 only supports input images of type CV_8UC1 for these two flags.

Otsu method - the maximum between-class variance method: https://blog.csdn.net/yxswhy/article/details/77838622?locationNum=10&fps=1

Triangle method - suitable for unimodal histogram: https://blog.csdn.net/qq_45769063/article/details/107102117

The threshold() function uses only one threshold for the whole image. In practice, because of uneven illumination and shadows, a single global threshold causes white regions inside shadow areas to be binarized to black, so the adaptiveThreshold() function provides two locally adaptive threshold binarization methods. The basic idea is to select a block centered on the target pixel, compute the mean or Gaussian-weighted mean of the pixels in the block, use that value as the threshold for the target pixel, and binarize that pixel. Performing this operation on every pixel of the image completes the binarization of the whole image.

void cv::adaptiveThreshold(InputArray src, 
                           OutputArray dst, 
                           double maxValue,
                           int adaptiveMethod,
                           int thresholdType,
                           int blockSize,
                           double C
                          )
        // maxValue: the maximum value used in binarization
        // adaptiveMethod: how the threshold is determined adaptively, either the mean method ADAPTIVE_THRESH_MEAN_C or the Gaussian method ADAPTIVE_THRESH_GAUSSIAN_C
        // thresholdType: the binarization method, restricted to THRESH_BINARY and THRESH_BINARY_INV
        // blockSize: size of the pixel neighborhood used to determine the threshold, an odd number such as 3, 5 or 7
        // C: a constant subtracted from the mean or weighted mean
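A minimal sketch comparing a fixed, an Otsu and an adaptive threshold; the image path is an assumption:

```cpp
cv::Mat gray = cv::imread("lena.png", cv::IMREAD_GRAYSCALE); // hypothetical path
cv::Mat fixedBin, otsuBin, adaptBin;
cv::threshold(gray, fixedBin, 125, 255, cv::THRESH_BINARY);
double t = cv::threshold(gray, otsuBin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
// t is the threshold Otsu computed; the 0 passed in is ignored
cv::adaptiveThreshold(gray, adaptBin, 255, cv::ADAPTIVE_THRESH_GAUSSIAN_C,
                      cv::THRESH_BINARY, 55, 0); // 55x55 neighborhood, C = 0
```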

3.2.4 Lookup Table - LUT()

Lookup table, Look-Up-Table

void cv::LUT(InputArray src, // input type must be CV_8U
			 InputArray lut, // lookup table of 256 gray values, single-channel or with the same channel count as src
			 OutputArray dst // output image, same size as src, data type matching lut
			 )

The data type of the output image matches that of the LUT, not that of the original image.

If the second parameter is single-channel, every channel of the input is mapped through the same LUT; if it is multi-channel, the i-th channel of the input is mapped through the i-th channel of the LUT.
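A minimal sketch of LUT(), mapping gray values into three bands; the image path is an assumption:

```cpp
uchar table[256];
for (int i = 0; i < 256; i++)
	table[i] = (i <= 100) ? 0 : (i <= 200) ? 100 : 255;
cv::Mat lut(1, 256, CV_8UC1, table);  // single-channel table: applied to every channel
cv::Mat src = cv::imread("lena.png"); // hypothetical path
cv::Mat dst;
cv::LUT(src, lut, dst);
```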

3.3 Image transformation

3.3.1 Image concatenation - cv::vconcat, cv::hconcat

	void cv::vconcat(const Mat * src, size_t nsrc, OutputArray dst) // src: array of Mat; nsrc: number of Mats in the array
	void cv::vconcat(InputArray src1, InputArray src2, OutputArray dst)

vconcat connects vertically; **the inputs must have the same width, data type and number of channels**.

	void cv::hconcat(const Mat * src, size_t nsrc, OutputArray dst) // src: array of Mat; nsrc: number of Mats in the array
	void cv::hconcat(InputArray src1, InputArray src2, OutputArray dst)

hconcat connects horizontally; **the inputs must have the same height, data type and number of channels**.

Figure 3-19

3.3.2 Image size transformation - resize()

void cv::resize(InputArray src, 
                OutputArray dst, 
                Size dsize, // size of the output image
                double fx = 0, // scale factor along the horizontal axis
                double fy = 0, // scale factor along the vertical axis
                int interpolation = INTER_LINEAR // interpolation method flag
               )

Both dsize and fx (fy) can set the size of the output image, so in practice only one of the two kinds of parameter needs to be used. When the output sizes computed from the two disagree, the size set with dsize prevails:

dsize = Size(round(fx*src.cols), round(fy*src.rows))

When shrinking an image, the INTER_AREA flag usually gives better results; when enlarging, the INTER_CUBIC and INTER_LINEAR flags usually give better results.
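A minimal sketch of resize() driven by scale factors and by dsize; the image path is an assumption:

```cpp
cv::Mat src = cv::imread("lena.png"); // hypothetical path
cv::Mat small, big;
cv::resize(src, small, cv::Size(), 0.5, 0.5, cv::INTER_AREA);                 // shrink via fx/fy
cv::resize(small, big, cv::Size(src.cols, src.rows), 0, 0, cv::INTER_CUBIC);  // enlarge via dsize
```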

| flag | value | effect |
| --- | --- | --- |
| INTER_NEAREST | 0 | nearest-neighbor interpolation |
| INTER_LINEAR | 1 | bilinear interpolation |
| INTER_CUBIC | 2 | bicubic interpolation |
| INTER_AREA | 3 | resampling using the pixel area relation; preferred for image reduction, similar to INTER_NEAREST when enlarging |
| INTER_LANCZOS4 | 4 | Lanczos interpolation |
| INTER_LINEAR_EXACT | 5 | bit-exact bilinear interpolation |
| INTER_MAX | 7 | mask for the interpolation codes |

Nearest neighbor interpolation, bilinear interpolation: https://www.cnblogs.com/wanghui-garcia/p/11171954.html

Bicubic interpolation: https://blog.csdn.net/u013185349/article/details/84529982?

3.3.3 Image Flip - flip()

void cv::flip(InputArray src, 
              OutputArray dst, 
              int flipCode // > 0: flip around the y axis;
              			   // = 0: flip around the x axis;
              			   // < 0: flip around both the x and y axes
              )

3.3.4 Image Affine Transformation——getRotationMatrix2D(), warpAffine(), getAffineTransform()

OpenCV4 provides the getRotationMatrix2D() function to calculate the rotation matrix, and the warpAffine() function to implement the affine transformation of the image.

Mat cv::getRotationMatrix2D(Point2f center, 
                            double angle, 
                            double scale) // obtain the rotation matrix

Once the rotation matrix is determined, the rotation of the image is carried out by applying the affine transformation with the warpAffine() function.

void cv::warpAffine(InputArray src, 
                    OutputArray dst, 
                    InputArray M, // 2x3 transformation matrix
                    Size dsize, // size of the output image
                    int flags = INTER_LINEAR, // interpolation method
                    int borderMode = BORDER_CONSTANT, // pixel border extrapolation method flag
                    const Scalar & borderValue = Scalar() // value used to fill the border, 0 by default
                   )

Compared with image size transformation, two additional interpolation-related flags are available here, and they can be combined with the other interpolation methods.


Affine transformation is also called three-point transformation.

Mat cv::getAffineTransform(const Point2f src[], const Point2f dst[]); // determines the transformation matrix M from 3 pairs of corresponding points
Figure 3-23
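A minimal sketch of both routes to an affine warp; the image path and the three point pairs are assumptions:

```cpp
cv::Mat src = cv::imread("lena.png"); // hypothetical path
cv::Point2f center(src.cols / 2.0f, src.rows / 2.0f);
cv::Mat M = cv::getRotationMatrix2D(center, 30, 1.0); // rotate 30 degrees, scale 1
cv::Mat rotated;
cv::warpAffine(src, rotated, M, src.size());

// Three-point variant: compute M from 3 corresponding point pairs (values are illustrative).
cv::Point2f srcPts[3] = { cv::Point2f(0, 0), cv::Point2f(0, src.rows - 1.0f),
                          cv::Point2f(src.cols - 1.0f, 0) };
cv::Point2f dstPts[3] = { cv::Point2f(50, 50), cv::Point2f(100, src.rows - 50.0f),
                          cv::Point2f(src.cols - 100.0f, 20) };
cv::Mat M2 = cv::getAffineTransform(srcPts, dstPts);
cv::Mat warped;
cv::warpAffine(src, warped, M2, src.size());
```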

3.3.5 Image perspective transformation——getPerspectiveTransform(), warpPerspective()

Figure 3-24

Perspective transformation is also called four-point transformation.

Mat cv::getPerspectiveTransform(const Point2f src[], // four pixel coordinates
                                const Point2f dst[],
                               	int solveMethod = DECOMP_LU // flag for how the perspective matrix is computed; the default DECOMP_LU is Gaussian elimination with optimal pivot selection
                               	);
void cv::warpPerspective(InputArray src, 
                    	OutputArray dst, 
                    	InputArray M, // 3x3 transformation matrix
                    	Size dsize, // size of the output image
                    	int flags = INTER_LINEAR, // interpolation method
                    	int borderMode = BORDER_CONSTANT, // pixel border extrapolation method flag
                    	const Scalar & borderValue = Scalar() // value used to fill the border, 0 by default
                   		)

3.3.6 Polar coordinate transformation - warpPolar()

Polar coordinate transformation is to transform the image between the rectangular coordinate system and the polar coordinate system.

void cv::warpPolar(InputArray src, 
                   OutputArray dst, 
                   Size dsize, // size of the output image
                   Point2f center,
                   double maxRadius,
                   int flags // interpolation method flag plus polar mapping method flag, joined with "+" or "|"
                   )

3.4 Drawing geometry on an image

The drawing functions (prototypes were shown in the original figures):

- draw a circle - cv::circle()
- draw a straight line - cv::line()
- draw an ellipse - cv::ellipse()
- draw an elliptical polygon - cv::ellipse2Poly()
- draw polygons - cv::fillPoly() / cv::polylines()

A sketch using several of them follows.
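A minimal sketch of the drawing calls above, plus cv::putText() for text; all coordinates and colors are illustrative:

```cpp
cv::Mat canvas = cv::Mat::zeros(cv::Size(512, 512), CV_8UC3);
cv::circle(canvas, cv::Point(256, 256), 100, cv::Scalar(0, 0, 255), 2);    // red circle outline
cv::line(canvas, cv::Point(0, 0), cv::Point(511, 511), cv::Scalar(0, 255, 0), 1);
cv::ellipse(canvas, cv::Point(256, 256), cv::Size(150, 80), 30, 0, 360,
            cv::Scalar(255, 0, 0), 2);                                      // rotated full ellipse
std::vector<cv::Point> poly = { {100, 400}, {200, 350}, {300, 420}, {150, 470} };
cv::polylines(canvas, poly, true, cv::Scalar(255, 255, 0), 2);              // closed polygon outline
cv::putText(canvas, "OpenCV 4", cv::Point(150, 60),
            cv::FONT_HERSHEY_SIMPLEX, 1.0, cv::Scalar(255, 255, 255), 2);
```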

3.5 Region of interest (ROI), deep copy and shallow copy

Rect_(_Tp _x, _Tp _y, _Tp _width, _Tp _height)
cv::Range(int start, int end)

Cropping a region out of img:

img(Rect(p.x, p.y, width, height))
img(Range(rows_start, rows_end), Range(cols_start, cols_end))

Image cropping in OpenCV and assignment through = are both shallow copies.

Deep copies are made with the **copyTo()** function, which OpenCV 4 provides in two forms:

	void cv::Mat::copyTo(OutputArray m) const
	void cv::Mat::copyTo(OutputArray m, InputArray mask) const // the mask must be CV_8U and the same size, but there is no requirement on its channel count
	// where the mask is nonzero, the element at the same position in the source image is copied into the new image
	void cv::copyTo(InputArray src, OutputArray dst, InputArray mask)
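A minimal sketch contrasting shallow and deep copies; the image path is an assumption:

```cpp
cv::Mat img = cv::imread("lena.png");        // hypothetical path
cv::Mat shallow = img;                        // shares the same pixel buffer
cv::Mat roi = img(cv::Rect(0, 0, 100, 100)); // an ROI is also a shallow view
cv::Mat deep;
img.copyTo(deep);                             // independent buffer
img(cv::Rect(0, 0, 50, 50)) = cv::Scalar(0, 0, 0);
// shallow and roi now show the black square; deep is unchanged

cv::Mat mask = cv::Mat::zeros(img.size(), CV_8UC1);
mask(cv::Rect(100, 100, 200, 200)) = 255;     // nonzero: copy these pixels
cv::Mat masked;
img.copyTo(masked, mask);                     // copies only where mask != 0
```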

3.6 Image Pyramid

https://blog.csdn.net/zhu_hongji/article/details/81536820

The bottom of the image "pyramid" is a high-resolution representation of the image to be processed, and the top is a low-resolution representation.

Image pyramids are a basic theory and technique of feature detection.

Gaussian pyramid: used for downsampling; the main image pyramid.

Laplacian pyramid: used to reconstruct an upsampled image from a lower (smaller) level of the pyramid; in digital image processing it is the prediction residual. It restores the image to the greatest possible extent and is used together with the Gaussian pyramid.

Downsampling and upsampling here refer to the image size (opposite to the pyramid direction): **up means the image size doubles, down means it is halved.** If instead we follow the direction of the pyramid shape itself, moving up the pyramid actually shrinks the image, which is exactly the opposite.

3.6.1 Gaussian "Pyramid" - Resolving Scale Uncertainty

Scale, informally, is the content contained in one cell of the image: in a shrunken image one cell holds more information, which is large scale; a cell holding less information is small scale.

Figure 3-30

(1) Down-sampling of images

To obtain pyramid level G_{i+1} from level G_i, we use the following method:

**<1>** convolve image G_i with a Gaussian kernel

**<2>** remove all even-numbered rows and columns

Each successive level halves the image size: usually each dimension is reduced to half of the original.

OpenCV 4 provides the pyrDown() function, dedicated to image downsampling.

convolution:

https://www.cnblogs.com/geeksongs/p/11132692.html - easier to understand

https://www.cnblogs.com/skyofbitbit/p/4471675.html - more detailed

// downsampling
void cv::pyrDown(InputArray src, 
                 OutputArray dst, 
                 const Size & dstsize = Size(), 
                 int borderType = BORDER_DEFAULT
                )

The Gaussian image "pyramid" uses a fixed 5x5 convolution kernel, the outer product of the binomial vector [1 4 6 4 1]/16 with itself:

	1/256 * [ 1  4  6  4  1
	          4 16 24 16  4
	          6 24 36 24  6
	          4 16 24 16  4
	          1  4  6  4  1 ]

Gaussian distribution https://blog.csdn.net/weixin_39124778/article/details/78411314

// upsampling
void cv::pyrUp(InputArray src, 
               OutputArray dst, 
               const Size & dstsize = Size(), 
               int borderType = BORDER_DEFAULT
               )

2. Laplacian Pyramid
The Laplacian pyramid is built by upsampling: it reconstructs the upsampled image of an upper level and serves as the prediction residual in image processing. The resulting image is larger than the input. Upsampling proceeds as follows:
(1) expand the image to twice its size in each direction, filling the new rows and columns with 0
(2) convolve the enlarged image with the same kernel as before (multiplied by 4) to approximate the "additional pixels"

The multiplication by 4 compensates for the zero filling: after zero insertion each original pixel is accompanied by three new zero pixels, so only one of every four positions carries intensity, and the kernel must be scaled by 4 to preserve brightness.
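A minimal sketch of one Gaussian level and its Laplacian residual; the image path is an assumption and the source size should be even:

```cpp
cv::Mat src = cv::imread("lena.png"); // hypothetical path
cv::Mat down, up, lap;
cv::pyrDown(src, down);               // blur + drop even rows/cols: half size
cv::pyrUp(down, up, src.size());      // zero-insert + blur with the 4x kernel: back to full size
cv::subtract(src, up, lap);           // Laplacian level = original - reconstruction
```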

3.7 Window Operation

3.7.1 Window interactive operation

int cv::createTrackbar(const String & trackbarname, // name of the trackbar
                     const String & winname, // name of the window holding the trackbar
                     int * value, // pointer to an integer variable
                     int count, // maximum value
                     TrackbarCallback onChange = 0, // function pointer called every time the slider position changes; if it is a NULL pointer, no callback is invoked and only the value is updated
                     void * userdata = 0 // optional parameter passed to the callback
                     )
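A minimal sketch of createTrackbar() driving image brightness via a callback; the image path is an assumption:

```cpp
#include <opencv2/opencv.hpp>

static void onBrightness(int value, void* userdata)
{
	cv::Mat& src = *(cv::Mat*)userdata;
	cv::Mat dst;
	src.convertTo(dst, -1, 1.0, value - 100); // shift brightness by (value - 100)
	cv::imshow("demo", dst);
}

int main()
{
	cv::Mat img = cv::imread("lena.png"); // hypothetical path
	int value = 100;
	cv::namedWindow("demo");
	cv::createTrackbar("brightness", "demo", &value, 200, onBrightness, &img);
	onBrightness(value, &img);            // draw the initial state
	cv::waitKey(0);
	return 0;
}
```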

What is a callback function:

https://www.zhihu.com/question/19801131

3.7.2 Mouse response

void cv::setMouseCallback(const String & winname,
                         MouseCallback onMouse, // callback for mouse events
                         void * userdata = 0
                         )
    typedef void(* cv::MouseCallback)(int event, // mouse event flag, one of the EVENT_* values
                                     int x,
                                     int y,
                                     int flags,
                                     void *userdata
                                     )
    
    
    if (event == EVENT_MOUSEMOVE && (flags & EVENT_FLAG_LBUTTON))
    // extract the EVENT_FLAG_LBUTTON bit from flags


Chapter 4 Image Histogram and Template Matching

4.1 Drawing of image histogram (one-dimensional)

An object keeps the same gray values in the image whether it is rotated or translated, so the histogram has the advantages of translation invariance and scale invariance.

An image histogram counts how many times each gray value occurs in the image, then plots the statistics with the gray value on the horizontal axis and the count (or proportion) of each gray value on the vertical axis.

void cv::calcHist(const Mat * images, // all images in the array must have the same size and data type, which must be one of CV_8U, CV_16U or CV_32F
                 int nimages,
                 const int * channels,
                 InputArray mask,
                 OutputArray hist,
                 int dims, // dimensionality of the histogram, an integer no greater than CV_MAX_DIMS = 32
                 const int * histSize,
                 const float ** ranges,
                 bool uniform = true,
                 bool accumulate = false
                 )

calcHist() can also generate two-dimensional histograms.
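A minimal sketch of a 256-bin grayscale histogram; the image path is an assumption:

```cpp
cv::Mat gray = cv::imread("lena.png", cv::IMREAD_GRAYSCALE); // hypothetical path
cv::Mat hist;
int channels[] = { 0 };
int histSize[] = { 256 };
float range[] = { 0, 256 };             // the upper bound is exclusive
const float* ranges[] = { range };
cv::calcHist(&gray, 1, channels, cv::Mat(), hist, 1, histSize, ranges);
// hist is a 256x1 CV_32F Mat; hist.at<float>(k) counts the pixels with value k
```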


https://www.bilibili.com/video/BV1i54y1m7tw?p=26


4.2 Histogram operation

4.2.1 Histogram normalization - normalize()

Another commonly used normalization finds the maximum value among the statistics and divides every result by it, normalizing all the data to 0~1.

If normalization is not performed, the statistics of the number of pixels at each gray value will change as the image size becomes larger

void cv::normalize(InputArray src,
                  OutputArray dst,
                  double alpha = 1, // lower bound / norm value for normalization
                  double beta = 0, // upper bound of the normalization range
                  int norm_type = NORM_L2, // flag for the kind of norm used during normalization
                  int dtype = -1, // output type selection flag: negative means the output has the same type as src, otherwise it has the same channel count as src and the given depth
                  InputArray mask = noArray()
                  )

Offset (min-max) normalization: the values of the array are translated or scaled into a specified range, a linear normalization.

4.2.2 Histogram comparison - compareHist()

double cv::compareHist(InputArray H1, InputArray H2, int method) // method: the comparison method flag


The correlation coefficient is calculated by different methods, representing different meanings

4.3 Histogram Application

4.3.1 Histogram equalization - equalizeHist()

Expanding the range of gray values in an image through a mapping relationship, so that the difference between two original gray values grows and the contrast of the image increases, thereby highlighting texture, is called image histogram equalization.

After equalization, the histogram, while not perfectly flat, is much flatter than the original's (each gray level occurs a more nearly equal number of times). The dynamic range, previously narrow, is stretched. Regions that are very dark or very bright, where there is no contrast and everything is bunched together, can be pulled apart by equalization.

https://blog.csdn.net/schwein_van/article/details/84336633 Histogram equalization operation

Using the cumulative distribution function of r as the transformation function produces an image whose gray-level distribution has uniform probability density.

void cv::equalizeHist(InputArray src, OutputArray dst) // but this function cannot specify the distribution of the equalized histogram
Figure 4-6

4.3.2 Histogram matching (histogram specification)

Algorithm for mapping a histogram to a specified distribution

The histogram matching operation can purposefully enhance a specific gray-level interval. Compared with histogram equalization, the algorithm takes one more input, but the transformed result is more flexible.

Histogram specification is done on the basis of histogram equalization (judging from the result, the equalization step of the image is skipped).

https://blog.csdn.net/superjunenaruto/article/details/80037777


Mapping follows the minimum-distance principle on cumulative probabilities. For example, if the distance from the source histogram's cumulative probability 0.94 at gray level 5 to the target's 0.84 at level 6 is greater than its distance to the target's 1 at level 7, then source gray level 5 is mapped to the position of target gray level 7.

4.3.3 Histogram Backprojection——calcBackProject()

void cv::calcBackProject(const Mat * images,
                        int nimages,
                        const int * channels,
                        InputArray hist,
                        OutputArray backProject,
                        const float ** ranges,
                        double scale =1,
                        bool uniform = true
                        )
void cv::applyColorMap(InputArray src, OutputArray dst, int colormap); // colormap: one of the COLORMAP_* flags

4.3.4 Contrast-limited adaptive histogram equalization - CLAHE

Histogram and Enhanced Contrast - Programmer Ade's article - Zhihu https://zhuanlan.zhihu.com/p/98541241

4.4 Template Matching of Images

void cv::matchTemplate(InputArray image, 
                       InputArray templ, 
                       OutputArray result, 
                       int method, // template matching method
                       InputArray mask = noArray())

Chapter 5 Image Filtering

5.1 Image convolution

https://www.cnblogs.com/geeksongs/p/11132692.html

The convolution template used on images is usually scaled so that the sum of its values is 1, which also avoids results exceeding the valid range after convolution.

void cv::filter2D(InputArray src, 
                  OutputArray dst, 
                  int ddepth, // one of the types shown in the table below; -1 by default
                  InputArray kernel, // a CV_32FC1 matrix
                  Point anchor = Point(-1,-1), 
                  double delta = 0, 
                  int borderType = BORDER_DEFAULT)


If you need to use different convolution templates to perform convolution operations on different channels, you need to use the split() function to separate multiple channels of the image and then calculate the convolution operation for each channel separately.

The filter2D() function will not rotate the convolution template. If the convolution template is asymmetrical, you need to rotate the convolution template by 180° and input it to the function

5.2 Types and generation of noise

There are four main types of noise in images, Gaussian noise, salt and pepper noise, Poisson noise and multiplicative noise.

5.2.1 Salt and Pepper Noise

Random-number function prototypes:

int cvflann::rand()
double cvflann::rand_double(double high = 1.0, double low = 0)
int cvflann::rand_int(int high = RAND_MAX, int low = 0) // RAND_MAX is 32767 on this system

There is no function to directly generate salt and pepper noise in OpenCV

Salt-and-pepper noise, also called impulse noise, randomly changes pixel values in an image; it consists of black-and-white bright/dark point noise produced by camera imaging, image transmission, decoding, etc.
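A minimal sketch of hand-made salt-and-pepper noise using the cvflann random functions quoted above; the image path and the noise count are assumptions:

```cpp
cv::Mat img = cv::imread("lena.png"); // hypothetical path
int n = 10000;                        // number of noise points (illustrative)
for (int k = 0; k < n; k++)
{
	int i = cvflann::rand_int(img.cols - 1); // random column
	int j = cvflann::rand_int(img.rows - 1); // random row
	int white = cvflann::rand_int() % 2;     // salt or pepper
	if (img.type() == CV_8UC1)
		img.at<uchar>(j, i) = white ? 255 : 0;
	else if (img.type() == CV_8UC3)
		img.at<cv::Vec3b>(j, i) = white ? cv::Vec3b(255, 255, 255) : cv::Vec3b(0, 0, 0);
}
```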

5.2.2 Gaussian noise

Gaussian noise, as the name implies, obeys a Gaussian (normal) distribution, usually arising as sensor noise caused by poor lighting or high temperature; it is typically most apparent in RGB images.

Unlike salt-and-pepper noise, which appears at random positions, Gaussian noise appears at every point in the image.

void cv::RNG::fill(InputOutputArray mat, 
                   int distType, // supported distributions are uniform (RNG::UNIFORM, 0) and Gaussian (RNG::NORMAL, 1)
                   InputArray a, // distribution parameter: the lower bound for the uniform distribution, or the mean for the Gaussian distribution
                   InputArray b, // distribution parameter: the upper bound for the uniform distribution, or the standard deviation for the Gaussian distribution
                   bool saturateRange = false) // pre-saturation flag, used only for the uniform distribution

fill() in OpenCV 4's RNG class is a non-static member function.

Static functions can be called directly as class::function without an object, while non-static functions must be called through an object, which also involves the memory allocation of instantiating the object.

cv::RNG rng;
rng.fill(mat, RNG::NORMAL, 10, 20);

5.3 Linear filtering

Image filtering removes unimportant content from an image so that the content of interest becomes clearer, for example removing noise or extracting certain information.

Image filtering divides into filtering that eliminates image noise and filtering that extracts certain feature information from the image.

The frequency band that the filter passes determines whether the filtering removes noise (low-pass, high-stop: removes noise in the image) or extracts feature information (high-pass: extracts, enhances and sharpens image edge information).

In low-pass filtering, blurring is equivalent to filtering; image Gaussian blur and image Gaussian low-pass filtering are the same concept.

In image filtering, the convolution template of the convolution operation is called the filter kernel, the filter, or the neighborhood operator.

5.3.1 Mean filtering - blur()

The result is the average of all data inside the kernel window.

void cv::blur(InputArray src, 
              OutputArray dst, 
              Size ksize, // size of the convolution kernel
              Point anchor = Point(-1,-1), 
              int borderType = BORDER_DEFAULT
             )

5.3.2 Box filtering - boxFilter()

Box filtering can optionally skip normalization, using the sum of all pixel values in the window as the result instead of their average.

boxFilter() normalizes by default; with normalization the result is the same as blur() regardless of data type.


void cv::boxFilter(InputArray  src, 
                   OutputArray dst, 
                   int ddepth, 
                   Size ksize, 
                   Point anchor = Point(-1,-1), 
                   bool normalize = true, // normalization on by default
                   int borderType = BORDER_DEFAULT
                  )

The sqrBoxFilter() function sums the squares of the image values inside the filter window, and is mostly used on CV_32F image data. With normalization, the image dims as it blurs.

Normalization here likewise divides by the number of kernel elements.

void cv::sqrBoxFilter(InputArray  src, 
                   OutputArray dst, 
                   int ddepth, 
                   Size ksize, 
                   Point anchor = Point(-1,-1), 
                   bool normalize = true, // normalization on by default
                   int borderType = BORDER_DEFAULT
                  )

5.3.3 Gaussian filter - GaussianBlur()

Gaussian smoothing is also used in the preprocessing stage of computer vision algorithms to enhance the image effect of images at different scales (see scale space representation and scale space implementation)


void cv::GaussianBlur(InputArray src, // any number of channels, but the data type must be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F
                      OutputArray dst, 
                      Size ksize, 
                      double sigmaX, 
                      double sigmaY = 0, 
                      int borderType = BORDER_DEFAULT
                     )

It is recommended to explicitly give the third, fourth, and fifth parameters of the function.

There is a certain mutual conversion relationship between the size and standard deviation of the Gaussian filter.

Mat cv::getGaussianKernel(int ksize, double sigma, int ktype = CV_64F)

This function obtains a Gaussian filter in one direction; multiplying the filters of the two directions yields a k*k two-dimensional filter.
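A minimal sketch of that construction; the image path is an assumption:

```cpp
cv::Mat kx = cv::getGaussianKernel(5, 1.0); // 5x1, CV_64F
cv::Mat kernel2d = kx * kx.t();             // 5x5 separable product
cv::Mat src = cv::imread("lena.png");       // hypothetical path
cv::Mat dst;
cv::filter2D(src, dst, -1, kernel2d);       // equivalent to GaussianBlur(src, dst, Size(5,5), 1)
```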

5.3.4 Separable filtering - sepFilter2D()

Separability means that filtering first in the x (y) direction and then in the y (x) direction gives the same result as filtering once with the combination of the filters of the two directions, where the combined filter is the product of the two directional filters.


void cv::sepFilter2D(InputArray src, 
                     OutputArray dst, 
                     int ddepth, 
                     InputArray kernelX, 
                     InputArray kernelY, 
                     Point anchor = Point(-1,-1), 
                     double delta = 0, 
                     int borderType = BORDER_DEFAULT)

You can input the filters of the two directions separately. sepFilter2D() does not distinguish the filtering direction between x and y, while filter2D() must.


5.4 Nonlinear Filtering

Common nonlinear filters include the median filter and the bilateral filter.

5.4.1 Median filtering - medianBlur()

The median filter is not influenced by values in the window that differ greatly from the typical value, so it handles speckle noise and salt-and-pepper noise particularly well.

void cv::medianBlur(InputArray src, // may be single-, three- or four-channel (two channels, or more than four, are not supported)
                    OutputArray dst, 
                    int ksize)

5.4.2 Bilateral filtering - bilateralFilter()

Bilateral filtering can smooth the high-frequency fluctuating signal, while retaining the signal fluctuation with large value changes, so as to realize the effect of retaining the edge information in the image.

When the filter processes pixels near an edge, pixel values far from the edge do not influence the pixel values on the edge much, which preserves the sharpness of the edge.

Figure 5-24

Bilateral filtering combines two Gaussian filters, one over spatial distance and one over gray-level distance. Spatial distance is the Euclidean distance between the current point and the kernel center: the closer to the center, the larger the weight coefficient. Gray-level distance is the absolute difference between the current point's gray value and the center's: the closer the values, the larger the weight coefficient.

void cv::bilateralFilter(InputArray src, 
                         OutputArray dst, 
                         int d, 
                         double sigmaColor, // standard deviation of the filter in color space; the larger it is, the more colors in the pixel's neighborhood are mixed together, producing larger regions of semi-equal color
                         double sigmaSpace, // standard deviation of the filter in coordinate space; the larger it is, the farther pixels influence each other, so that sufficiently similar colors across larger areas obtain the same color
                         int borderType = BORDER_DEFAULT
                        )

5.5 Image edge detection

5.5.1 Principle of edge detection


A positive result indicates a pixel value jumping from low to high, and a negative result indicates a change from high to low. Both are image edges, so to express both kinds of edge information at once, take the absolute value of the computed result.

void cv::convertScaleAbs(InputArray src, 
                         OutputArray dst, 
                         double alpha = 1, // scale factor; by default only the absolute value is taken, without scaling
                         double beta = 0 // offset
                        )

5.5.2 Sobel operator——Sobel()

https://blog.csdn.net/qq_32811489/article/details/90312421

The Sobel operator incorporates the idea of Gaussian smoothing, enlarging the filter from ksize x 1 to ksize x ksize; this improves the response to edges of smooth regions and works better.


is a central difference that gives higher weight to the middle horizontal and vertical lines


The first formula is the expanded Gaussian smoothing operator for n = 2, i.e. the binomial expansion coefficients; the second formula represents the difference. The final result is the third-order Sobel edge detection operator.

void cv::Sobel(InputArray src, 
               OutputArray dst, 
               int ddepth, 
               int dx, // difference order in the x direction
               int dy, // difference order in the y direction
               int ksize = 3, // Sobel operator size; the difference order in either direction must be smaller than the operator size
               double scale = 1, 
               double delta = 0, // offset
               int borderType = BORDER_DEFAULT
              )
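A minimal sketch: x/y Sobel edges combined via convertScaleAbs(); the image path is an assumption:

```cpp
cv::Mat gray = cv::imread("lena.png", cv::IMREAD_GRAYSCALE); // hypothetical path
cv::Mat gx, gy, absGx, absGy, edges;
cv::Sobel(gray, gx, CV_16S, 1, 0, 3); // first-order difference in x
cv::Sobel(gray, gy, CV_16S, 0, 1, 3); // first-order difference in y
cv::convertScaleAbs(gx, absGx);       // absolute values, back to CV_8U
cv::convertScaleAbs(gy, absGy);
cv::addWeighted(absGx, 0.5, absGy, 0.5, 0, edges);
```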

5.5.3 Scharr operator - Scharr()

The Scharr operator has only the two 3x3 kernels above, one for each direction.

void cv::Scharr(InputArray src, 
                OutputArray dst, 
                int ddepth, 
                int dx, 
                int dy, 
                double scale = 1, 
                double delta = 0, 
                int borderType = BORDER_DEFAULT
               )

5.5.4 Generate edge detection filters - getDerivKernels()

The Scharr() and Sobel() functions obtain the operators they use by calling the getDerivKernels() function:

void cv::getDerivKernels(OutputArray kx, 
                		  OutputArray ky, 
                		  int dx, 
                		  int dy, 
                          int ksize,
                		  bool normalize = false,
                		  int ktype = CV_32F
               		 	 )

5.5.5 Laplacian operator - Laplacian()

Both the Sobel operator and the Scharr operator need to extract the edge in the x direction and the edge in the y direction, and then sum the edges in the two directions to obtain the overall edge of the image.

The Laplacian operator has the characteristic of isotropy.


The Laplacian operator is unacceptably sensitive to noise

void cv::Laplacian(InputArray src, 
                   OutputArray dst, 
                   int ddepth, 
                   int ksize = 1, 
                   double scale = 1, 
                   double delta = 0, 
                   int borderType = BORDER_DEFAULT
                  )


5.5.6 Canny Operator——Canny()

https://blog.csdn.net/likezhaobin/article/details/6892176

https://www.cnblogs.com/king-lps/p/8007134.html

The algorithm is not easily affected by noise, can identify weak and strong edges in the image, and combines the positional relationship of strong and weak edges to comprehensively give the edge information of the image as a whole.

void cv::Canny(InputArray image, // must be a three-channel or single-channel image
                   OutputArray edges, 
                   double threshold1, // first hysteresis threshold
                   double threshold2, // second hysteresis threshold
                   int apertureSize = 3, // aperture size of the Sobel operator
                   int L2gradient = false
                  )

5.5.7 LOG operator


5.5.8 Blob Detection

LOG blob detection: https://www.cnblogs.com/ronny/p/3895883.html

SimpleBlobDetector

Chapter 6 Image Morphological Operations

6.1 Pixel distance and connected domain

6.1.1 Image pixel distance transformation——distanceTransform()

Image morphology operations mainly include image erosion, dilation, opening and closing operations

Image morphology has a wide range of applications in image processing. It is mainly used to extract meaningful image points for expressing and describing the shape of the region from the image, so that the subsequent recognition work can capture the most essential shape characteristics of the object, such as boundaries, connected domain, etc.

Commonly used distances in image processing are Euclidean distance , block distance and chessboard distance


This function implements the distance transform of an image, i.e. it computes, for every pixel, the minimum distance to a zero-valued pixel.

void cv::distanceTransform(InputArray src, 
                           OutputArray dst, 
                           OutputArray labels, // outputs a Voronoi diagram
                           int distanceType, // distance type
                           int maskSize, // size of the distance transform mask, either DIST_MASK_3 (3x3) or DIST_MASK_5 (5x5)
                           int labelType = DIST_LABEL_CCOMP
                          )


Voronoi diagram, also known as Thiessen polygon or Dirichlet diagram, is composed of a set of continuous polygons composed of perpendicular bisectors connecting straight lines between two adjacent points.

void distanceTransform(InputArray src, 
                       OutputArray dst, 
                       int distanceType, 
                       int maskSize, 
                       int dstType = CV_32F
                      )

6.1.2 Image Connected Domain Analysis——connectedComponents()

The connected domain of an image refers to the area composed of pixels with the same pixel value and adjacent positions in the image.

Extracting the different connected domains in an image is a common step in image processing, for example segmenting and recognizing regions of interest in license plate recognition, text recognition and object detection.

int cv::connectedComponents(InputArray image, 
                            OutputArray labels, 
                            int connectivity, // 4- or 8-neighborhood
                            int ltype, // output image type
                            int ccltype // algorithm used to label connected domains
                           )


Commonly used connected-domain analysis methods include the two-pass scanning method and the seed filling method.

https://blog.csdn.net/sy95122/article/details/80757281

// This function computes connected domains and also reports each domain's location and area statistics
int cv::connectedComponentsWithStats(InputArray image, 
                                     OutputArray labels, 
                                     OutputArray stats, // per-domain statistics such as position and area
                                     OutputArray centroids, // coordinates of each domain's centroid
                                     int connectivity, 
                                     int ltype, 
                                     int ccltype)

// Overload with default arguments
int cv::connectedComponentsWithStats(InputArray image, 
                                     OutputArray labels, 
                                     OutputArray stats, 
                                     OutputArray centroids, // coordinates of each domain's centroid
                                     int connectivity = 8, // 4 means 4-neighborhood, 8 means 8-neighborhood
                                     int ltype = CV_32S)
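A minimal sketch: label a binary image and read the per-component statistics; the image path is an assumption:

```cpp
cv::Mat gray = cv::imread("rice.png", cv::IMREAD_GRAYSCALE); // hypothetical path
cv::Mat bin, labels, stats, centroids;
cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
int n = cv::connectedComponentsWithStats(bin, labels, stats, centroids, 8, CV_32S);
for (int i = 1; i < n; i++) // label 0 is the background
{
	int x = stats.at<int>(i, cv::CC_STAT_LEFT);
	int y = stats.at<int>(i, cv::CC_STAT_TOP);
	int w = stats.at<int>(i, cv::CC_STAT_WIDTH);
	int h = stats.at<int>(i, cv::CC_STAT_HEIGHT);
	int area = stats.at<int>(i, cv::CC_STAT_AREA);
	std::cout << "component " << i << ": area " << area
	          << ", box [" << x << ", " << y << ", " << w << "x" << h << "]" << std::endl;
}
```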

6.2 Corrosion and expansion

Erosion and dilation are the basic operations of morphology. Through these basic operations, noise in the image can be removed, independent regions can be segmented, or two connected domains can be connected together.

6.2.1 Image erosion - erode()

Figure 6-12

cv::Mat cv::getStructuringElement(int shape, // element shape: MORPH_RECT, MORPH_CROSS or MORPH_ELLIPSE
                                  Size ksize, 
                                  Point anchor = Point(-1,-1)
                                 )


void cv::erode(InputArray src, 
               OutputArray dst, 
               InputArray kernel, // can be generated with getStructuringElement()
               Point anchor = Point(-1,-1), 
               int iterations = 1, 
               int borderType = BORDER_DEFAULT, 
               const Scalar & borderValue = morphologyDefaultBorderValue()
              )

An image with a white background and an image with a black background behave in opposite ways.

6.2.2 Image dilation - dilate()


void cv::dilate(InputArray src, 
                OutputArray dst, 
                InputArray kernel, 
                Point anchor = Point(-1,-1),
                int iterations = 1, 
                int borderType = BORDER_CONSTANT, 
                const Scalar & borderValue = morphologyDefaultBorderValue()
               )

6.3 Morphological applications

6.3.1 Opening operation——morphologyEx()

The opening operation removes noise in the image, eliminates smaller connected domains while retaining larger ones, separates two objects at a thin junction, and smooths the boundaries of connected domains without significantly changing the area of the larger ones.

Erosion -> Dilation

void cv::morphologyEx(InputArray src, 
                OutputArray dst, 
                int op, // flag for the type of morphological operation
                InputArray kernel, 
                Point anchor = Point(-1,-1),
                int iterations = 1, 
                int borderType = BORDER_CONSTANT, 
                const Scalar & borderValue = morphologyDefaultBorderValue()
                )

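A minimal sketch of morphologyEx() with several op flags; the image path is an assumption:

```cpp
cv::Mat gray = cv::imread("rice.png", cv::IMREAD_GRAYSCALE); // hypothetical path
cv::Mat bin;
cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
cv::Mat opened, closed, gradient;
cv::morphologyEx(bin, opened, cv::MORPH_OPEN, kernel);       // erode then dilate
cv::morphologyEx(bin, closed, cv::MORPH_CLOSE, kernel);      // dilate then erode
cv::morphologyEx(bin, gradient, cv::MORPH_GRADIENT, kernel); // dilated minus eroded
```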

6.3.2 Closing operation

The image closing operation can remove small holes in the connected domain, smooth the outline of the object, and connect two adjacent connected domains .

Dilation -> Erosion

6.3.3 Morphological Gradients

The morphological gradient can describe the boundary of a target. It is computed from the relationship between erosion, dilation and the original image, and comes in three kinds: the basic gradient is the difference between the dilated image and the eroded image; the internal gradient is the difference between the original image and the eroded image; the external gradient is the difference between the dilated image and the original image.

6.3.4 Top-Hat Operation

The top-hat operation is the difference between the original image and the result of the opening operation, and is often used to separate patches that are brighter than their neighborhood.

Erosion -> Dilation -> subtract the result from the original image

image-20220315105657597

6.3.5 Black Hat Operation

The black-hat operation is the difference between the result of the closing operation and the original image; it is often used to separate patches that are darker than their neighborhood.

Dilate -> erode (closing) -> subtract the original image from the result

image-20220315110103348

6.3.6 Hit-miss transformation

The hit-miss transform finds regions in the original image whose structure exactly matches the structuring element.

image-20220315110208300

6.3.7 Image Thinning——thinning()

Image thinning is the process of reducing the lines of an image from multi-pixel width to unit-pixel width; it is sometimes called "skeletonization" or the "medial axis transform".

Thinning algorithms are mainly divided into iterative thinning algorithms and non-iterative thinning algorithms

Iterative thinning algorithm: serial thinning algorithm, parallel thinning algorithm

The Zhang-Suen thinning method is widely used.

void cv::ximgproc::thinning(InputArray src, 
                            OutputArray dst, 
                            int thinningType = THINNING_ZHANGSUEN//还有一种THINNING_GUOHALL(Guo-Hall细化方法)
                           )

Chapter 7 Object Detection

7.1 Shape detection

7.1.1 Line detection - HoughLines()

The Hough transform is an important algorithm in image processing for detecting whether straight lines exist in an image.

image-20220315221030571image-20220315211835953

//标准霍夫变换、多尺度霍夫变换
void cv::HoughLines(InputArray image, //必须是CV_8U的单通道二值图像
                    OutputArray lines, //NX2 的vector<Vec2f>
                    double rho, //常用1
                    double theta, //常用CV_PI/180
                    int threshold, 
                    double srn = 0, 
                    double stn = 0, 
                    double min_theta = 0, 
                    double max_theta = CV_PI
                   )

When using the standard Hough transform and the multi-scale Hough transform function HoughLines() to extract straight lines, the lengths of the lines or line segments in the image cannot be known accurately; the function only reports whether lines meeting the requirements exist and gives the polar-coordinate parameters of those lines.

//渐进概率式霍夫变换函数
void cv::HoughLinesP(InputArray image, //必须是CV_8U的单通道二值图像
                    OutputArray lines, //NX4 的vector<Vec4i>
                    double rho, //常用1
                    double theta, //常用CV_PI/180
                    int threshold, 
                    double minLineLength = 0, //直线的最小长度,当检测直线的长度小于该数值时将被剔除.
                    double maxLineGap = 0//同一直线上相邻的两个点之间的最大距离.
                    )

The first two elements in Vec4i are the x-coordinate and y-coordinate of one end point of the line or line segment respectively, and the last two elements are the x-coordinate and y-coordinate of the other end point of the line or line segment respectively
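
A minimal sketch of progressive probabilistic Hough line detection (the file name and Canny thresholds are assumptions); each detected Vec4i is drawn as a segment between its two end points:

#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
	Mat img = imread("pic/building.png");      //hypothetical file name
	if (img.empty()) return -1;
	Mat gray, edges;
	cvtColor(img, gray, COLOR_BGR2GRAY);
	Canny(gray, edges, 80, 180);               //HoughLinesP needs a binary edge image

	vector<Vec4i> lines;
	HoughLinesP(edges, lines, 1, CV_PI / 180, 100, 50, 10);
	for (size_t i = 0; i < lines.size(); i++)
		line(img, Point(lines[i][0], lines[i][1]),
		     Point(lines[i][2], lines[i][3]), Scalar(0, 0, 255), 2);

	imshow("lines", img);
	waitKey(0);
	return 0;
}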

//检测点集中的直线
void cv::HoughLinesPointSet(InputArray _Point, //必须是CV_32FC2或CV_32SC2的矩阵
                            				   //vector<Point2f>或者vector<Point2i>
                            OutputArray _lines, //在输入点集合中可能存在的直线,每一条直线都具有3个参数:权重、直线距离坐标原点的距离r、坐标原点到直线的垂线与x轴的夹角a. 权重越大表示该直线的可靠性越高.
                            int lines_max,//检测直线的最大数目.如果数目过大, 检测到的直线可能存在权重较小的情况.
                            int threshold,
                            double min_rho, //检测直线长度的最小距离, 以像素为单位.
                            double max_rho, 
                            double rho_step,//以像素为单位的距离分辨率. 即距离r 离散化时的单位长度.
                            double min_theta, //检测直线的最小角度值,以弧度为单位.
                            double max_theta,
                            double theta_step//以弧度为单位的角度分辨率,即夹角θ离散化时的单位角度.
                            )

7.1.2 Straight line fitting - fitLine()

Compared with straight-line detection, the biggest feature of straight-line fitting is that it fits all the data to a single straight line.

It uses the least-squares-based M-estimator method.

void cv::fitLine(InputArray points, //输入待拟合直线的二维或者三维点集.vector<>或者Mat
                 OutputArray line, //二维点集描述参数为Vec4f 类型,三维点集描述参数为Vec6f类型。
                 int distType, //M-estimator算法使用的距离类型标志
                 double param, //某些距离类型的数值参数( C) . 如果数值为0 , 那么自动选择最佳值.
                 double reps, //坐标原点与拟合直线之间的距离精度,数值0表示选择自适应参数, 一般选择0.01 .
                 double aeps//拟合直线的角度精度,数值0表示选择自适应参数,一般选择0.01.
                )
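
A minimal sketch of fitLine() on a small hand-made point set (the points and the DIST_HUBER choice are assumptions); the output Vec4f holds a unit direction vector (vx, vy) and a point (x0, y0) on the line:

#include <opencv2/opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;

int main()
{
	//points roughly on the line y = x (the last one is an outlier)
	vector<Point2f> points = { {0, 0.2f}, {10, 9.8f}, {20, 20.4f},
	                           {30, 30.1f}, {40, 39.5f}, {50, 70} };

	Vec4f l;  //(vx, vy, x0, y0)
	fitLine(points, l, DIST_HUBER, 0, 0.01, 0.01);

	double k = l[1] / l[0];         //slope
	double b = l[3] - k * l[2];     //intercept
	cout << "y = " << k << " * x + " << b << endl;
	return 0;
}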

7.1.3 Circle detection - HoughCircles()

https://blog.csdn.net/weixin_44638957/article/details/105883829

image-20220316114235600image-20220316114246194image-20220316114253382

void cv::HoughCircles(InputArray image, //数据类型必须是CV_8UC1.
                      OutputArray circles, //存放在vector<Vec3f>类型的变量,分别是圆心的坐标和圆的半径.
                      int method, //目前仅支持HOUGH_GRADIENT方法.
                      double dp, //离散化时分辨率与图像分辨率的反比.
                      double minDist, //检测结果中两个圆心之间的最小距离.
                      double param1 = 100, //Canny检测边缘时两个阈值中的较大值,较小阈值默认为较大值的一半.
                      double param2 = 100, //检测圆形的累加器阈值,
                      int minRadius = 0, 
                      int maxRadius = 0
                     )
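
A minimal sketch of HoughCircles() (the file name and threshold values are assumptions); blurring first suppresses false circles:

#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
	Mat img = imread("pic/coins.png");         //hypothetical file name
	if (img.empty()) return -1;
	Mat gray;
	cvtColor(img, gray, COLOR_BGR2GRAY);
	GaussianBlur(gray, gray, Size(9, 9), 2);   //smoothing suppresses false circles

	vector<Vec3f> circles;
	HoughCircles(gray, circles, HOUGH_GRADIENT, 1, gray.rows / 8, 100, 50, 10, 100);
	for (size_t i = 0; i < circles.size(); i++)
	{
		Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
		circle(img, center, cvRound(circles[i][2]), Scalar(0, 255, 0), 2);
	}
	imshow("circles", img);
	waitKey(0);
	return 0;
}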

7.2 Contour detection

7.2.1 Contour discovery and drawing - findContours(), drawContours()

https://blog.csdn.net/dcrmg/article/details/51987348

In order to describe the structural relationship between different contours, the contour level from the outside to the inside is defined to be lower and lower, that is, the higher-level contours surround the lower-level contours

image-20220316150703439

void cv::findContours(InputArray image, //数据类型为CV_8U的单通道灰度图像或者二值化图像.
                      OutputArrayOfArrays contours, //检测到的轮廓,每个轮廓中存放像素的坐标.vector<vector<Point>>
                      OutputArray hierarchy, //轮廓结构关系描述向量.vector<Vec4i>
                      int mode, //轮廓检测模式标志
                      int method, //轮廓逼近方法标志
                      Point offset = Point()//每个轮廓点移动的可选偏移量.这个参数主要用在从ROI图像中找出轮廓并基于整个图像分析轮廓的场景中.
                     )

//不输出轮廓的结构关系,避免内存资源的浪费
void cv::findContours(InputArray image, 
                      OutputArrayOfArrays contours, 
                      int mode,
                      int method,
                      Point offset = Point()
                     ) 

image-20220316151332985

image-20220316151340491

For content connected to the image border (such as the hand in the picture below), the whole hand is not a closed figure, so the outer contour found on the Canny edge image contains a redundant border even when RETR_EXTERNAL is used to find only the outermost contours. In fact, the extra black outline inside is at the same level as the outer outline of the hand.

image-20220316210223580

void cv::drawContours(InputOutputArray image, //绘制轮廓的目标图像.
                      InputArrayOfArrays contours, //所有将要绘制的轮廓.//vector<vector<Point>>
                      int contourIdx, //轮廓的索引编号。若为负值,则绘制所有轮廓。
                      const Scalar & color, 
                      int thickness, 
                      int lineType = LINE_8, //边界线的连接类型
                      InputArray hierarchy = noArray(), //可选的结构关系信息,默认值为noArray().
                      int maxLevel = INT_MAX, //表示绘制轮廓的最大等级, 默认值为INT_MAX .
                      Point offset = Point()//可选的轮廓偏移参数,按指定的移动距离绘制所有的轮廓。
                     )

image-20220316152134343
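
A minimal sketch of the find-and-draw pipeline (the file name and Canny thresholds are assumptions): binarize with Canny, find all contours with their hierarchy, then draw them in one call:

#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
	Mat img = imread("pic/shapes.png");        //hypothetical file name
	if (img.empty()) return -1;
	Mat gray, binary;
	cvtColor(img, gray, COLOR_BGR2GRAY);
	GaussianBlur(gray, gray, Size(5, 5), 2);
	Canny(gray, binary, 80, 160);              //findContours needs a binary image

	vector<vector<Point>> contours;
	vector<Vec4i> hierarchy;
	findContours(binary, contours, hierarchy, RETR_TREE, CHAIN_APPROX_SIMPLE);
	drawContours(img, contours, -1, Scalar(0, 0, 255), 2);  //-1: draw all contours

	imshow("contours", img);
	waitKey(0);
	return 0;
}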

7.2.2 Contour area - contourArea()

The information implied by each contour can be further analyzed through the size of the contour area. For example, objects of different sizes can be distinguished by contour area, different objects can be identified, etc.

double cv::contourArea(InputArray contour, //轮廓的像素点.   vector<Point>或者Mat.
                       bool oriented = false//区域面积是否具有方向的标志. true表示面积具有方向性, false表示面积不具有方向性(默认值),此时输出面积的绝对值.
                      )

When the parameter is true, the computed area is signed: contours whose vertices are given clockwise versus counterclockwise yield areas of opposite sign. When the parameter is false, the area has no direction, and the absolute value of the contour area is output.

7.2.3 Contour length (perimeter) - arcLength()

double cv::arcLength(InputArray curve, //轮廓或者曲线的二维像素点.   vector<Point>或者Mat
                     bool closed//轮廓或者曲线是否闭合的标志, true 表示闭合.
                    )

If the second parameter is true, the perimeter of a "contour" given by the three vertices of a triangle is the sum of all three sides; if false, the path is treated as open and the result is the sum of only two sides.
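
Continuing the findContours() sketch above (add #include <iostream> for cout), area and perimeter can be queried per contour:

	for (size_t i = 0; i < contours.size(); i++)
	{
		double area = contourArea(contours[i]);
		double length = arcLength(contours[i], true);   //true: treat the contour as closed
		cout << "contour " << i << ": area = " << area
		     << ", perimeter = " << length << endl;
	}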

7.2.4 Outline circumscribed polygon - boundingRect(), minAreaRect(), approxPolyDP()

//求取轮廓最大外接矩形
Rect cv::boundingRect(InputArray array)//array表示输入的灰度图像或者二维点集,数据类型为vector<Point>或者Mat
                                       //可以直接输入findContours得到的某一个轮廓(contour)

//求取最小外接矩形
RotatedRect cv::minAreaRect(InputArray points)//point表示输入的二维点集合.

image-20220316160335426

//多边形逼近轮廓
void cv::approxPolyDP(InputArray curve, //输入轮廓像素点.vector<Point>或者Mat
                      OutputArray approxCurve, //多边形逼近结果,以顶点坐标的形式给出.CV_32SC2类型的Nx1的Mat类矩阵,
                      double epsilon, //逼近的精度,即原始曲线和逼近曲线之间的最大距离.
                      bool closed//逼近曲线是否为封闭曲线的标志, true 表示曲线封闭,
                     )
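
Continuing the same sketch, the three circumscribed-polygon functions can be applied to each contour (the 4-pixel approximation tolerance is an assumption):

	for (size_t i = 0; i < contours.size(); i++)
	{
		Rect box = boundingRect(contours[i]);        //axis-aligned bounding rectangle
		rectangle(img, box, Scalar(255, 0, 0), 2);

		RotatedRect rbox = minAreaRect(contours[i]); //minimum-area (rotated) rectangle
		Point2f pts[4];
		rbox.points(pts);
		for (int j = 0; j < 4; j++)
			line(img, pts[j], pts[(j + 1) % 4], Scalar(0, 255, 0), 2);

		vector<Point> approx;                        //polygon approximation, 4-pixel tolerance
		approxPolyDP(contours[i], approx, 4, true);
	}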

7.2.5 Point-to-contour distance——pointPolygonTest()

The distance from the point to the contour plays an important role in calculating the position of the contour in the image, the distance between two contours, and determining whether a point on the image is inside the contour.

double cv::pointPolygonTest(InputArray contour, //输入的轮廓.   vector<Point>或者Mat
                            Point2f pt, //需要计算与轮廓距离的像素点.
                            bool measureDist//计算的距离是否具有方向性的标志.当参数取值为true时,点在轮廓内部时距离为正,点在轮廓外部时距离为负;当参数取值为false时,只检测点是否在轮廓内.
                           )

7.2.6 Convex Hull Detection - convexHull()

The convex polygon formed by connecting the outermost points of a point set on the two-dimensional plane is called the convex hull. The approximation result is always a convex polygon.

void cv::convexHull(InputArray points, //输入的二维点集或轮廓坐标.   vector<Point>或者Mat
                    OutputArray hull, //输出凸包的顶点.    vector<Point>或者vector<int>
                    bool clockwise = false, //当参数取值为true时,凸包顺序为顺时针方向:
                    bool returnPoints = true
                   )

7.3 Calculation of moments

7.3.1 Geometric moments and central moments - moments()

https://www.zhihu.com/question/26803016

Moments are operators that describe image features, and are widely used in image retrieval and identification, image matching, image reconstruction, image compression, and motion image sequence analysis and other fields.

image-20220316221143706

image-20220316221149778

Moments cv::moments(InputArray array, //计算矩的区域二维像素坐标集合或者单通道的CV_8U图像.
                    bool binaryImage = false//是否将所有非零像素值视为1的标志.
                   )

image-20220316223020185

7.3.2 Hu moments - HuMoments()

Hu moments are invariant to rotation, translation, and scaling , so they have broader applications when images have rotation and scaling.

//需要先计算图像的矩,将图像的矩输入到HuMoments()函数中
void cv::HuMoments(const Moments & moments, //输入的图像矩.
                   double hu[7]//输出Hu 矩的7 个值.
                  )
    
void cv::HuMoments(const Moments & m, 
                   OutputArray hu//输出Hu 矩的矩阵.
                  )

    Moments M = moments(imgContours);//注意函数名是小写的moments()
	Mat hu;
	HuMoments(M, hu);

7.3.3 Contour matching based on Hu moment - matchShapes()

Since Hu moments are invariant to rotation, translation and scaling, image contours can be matched by their Hu moments.

double cv::matchShapes(InputArray contour1, //原灰度图像或者轮廓.
                       InputArray contour2, //模板图像或者轮廓.
                       int method, ///匹配方法的标志
                       double parameter//特定于方法的参数( 现在不支持) .可以设置为0
                      )

image-20220316224047818

7.4 Point Set Fitting

Figure 7-27image-20220317094919112

//该函数能够找到包含给定三维点集的最小区域的三角形,,,返回值为double 类型的三角形面积.
double cv::minEnclosingTriangle(InputArray points, //vector<>或者Mat类型的变量,CV_32S或CV_32F;
                                OutputArray triangle//三角形的3个顶点坐标,存放在vector<Point2f>变量中
                               )
//寻找二维点集的最小包围圆形
void cv::minEnclosingCircle(InputArray points, //vector<>或者Mat类型的变量
                            Point2f & center, 
                            float & radius
                           )

7.5 QR code detection

Figure 7-29

Recognizing QR codes usually requires third-party tools, the most common being the zbar decoding library; OpenCV4 provides its own decoding functions.

QR code recognition involves two stages, and OpenCV 4 provides functions for each: the detect() function locates the QR code, the decode() function decodes it according to the positioning result, and the detectAndDecode() function performs positioning and decoding in one step.

//定位不解码
bool cv::QRCodeDetector::detect(InputArray img, 
                                 OutputArray points//二维码的4个顶点坐标. vector<Point>
                                )
//对定位结果进行解码
std::string cv::QRCodeDetector::decode(InputArray img, //含有QR 二维码的图像.
                                       InputArray points, //包含QR二维码的最小区域的四边形的4 个顶点坐标.
                                       OutputArray straight_qrcode = noArray()//经过校正和二值化的QR二维码.
                                      )
//一步完成定位和解码
std::string cv::QRCodeDetector::detectAndDecode(InputArray img, 
                                                OutputArray points = noArray(), 
                                                OutputArray straight_qrcode = noArray()
                                               )
    
//用法示意, img为含有二维码的图像
cv::QRCodeDetector qrcodedetector;
std::vector<cv::Point2f> points;
bool found = qrcodedetector.detect(img, points);
std::string info = qrcodedetector.decode(img, points);
//或者一步完成定位和解码
std::string info2 = qrcodedetector.detectAndDecode(img, points);

Chapter 8 Image Analysis and Restoration

8.1 Fourier transform

https://www.bilibili.com/video/BV1HJ411a7cp?spm_id_from=333.880

Popular explanation: Image Fourier Transform - Mahua Duanzi's article - Zhihu https://zhuanlan.zhihu.com/p/99605178

Any signal can be formed by the superposition of a series of sinusoidal signals: a one-dimensional signal is a superposition of one-dimensional sine waves, and a two-dimensional signal is a superposition of two-dimensional plane waves. Since an image can be regarded as a two-dimensional signal, an image can be Fourier transformed.

Discrete Fourier transform of the image

Discrete Fourier transform is widely used in image denoising , filtering and other convolution fields

Understanding Fourier analysis: https://blog.csdn.net/u013921430/article/details/79683853

Fourier analysis of digital images: https://blog.csdn.net/u013921430/article/details/79934162

8.1.1 Discrete Fourier Transform - dft(), idft()

The result of the discrete Fourier transform of an image is an image containing both real and imaginary parts. In actual use, the result is often split into a real image and an imaginary image, or represented by the magnitude and phase of the complex numbers, giving a magnitude image and a phase image.

The frequency domain corresponding to the area with large pixel fluctuations in the image is the high-frequency area, so the high-frequency area reflects the details, edges, and texture information of the image, while the low-frequency information represents the content information of the image.

//对图像进行傅里叶变换
void cv::dft(InputArray src, //输入的图像或者数组矩阵,可以是实数也可以是复数.CV_32F 或者CV_64F
             OutputArray dst, //存放离散傅里叶变换结果的数组矩阵.
             int flags = 0, //变换类型可选标志
             int nonzeroRows = 0//输入、输出结果的形式,默认值为0.
            )

image-20220317220123488

//对图像进行离散傅里叶逆变换
idft(src, dst, flags)相当于dft(src, dst, flags | DFT_INVERSE)

The discrete Fourier transform algorithm processes matrices of certain specific sizes much faster than matrices of arbitrary size, so when the input is smaller than the nearest optimal size it is often worth enlarging the input matrix to make the function run faster. A common way to adjust the size is to pad rows and columns of pixels around the original matrix; the fourth parameter of the dft() function (nonzeroRows) tells the function how many leading rows are non-zero so the padded part can be skipped.

How many rows and columns to pad is computed with the following function:

//计算最优尺寸
int cv::getOptimalDFTSize(int vecsize);//vecsize是图像的rows,cols

After calculating the optimal size, resize the image by generating a border around it:

void cv::copyMakeBorder(InputArray src, 
                        OutputArray dst, 
                        int top, 
                        int bottom, 
                        int left, 
                        int right, 
                        int borderType, 
                        const Scalar & value = Scalar()
                       )

Since the result of the discrete Fourier transform is a two-channel complex matrix, and in practice we mostly care about the magnitude of the complex values, OpenCV 4 provides the magnitude() function to compute the magnitude matrix of a two-dimensional vector field given as two separate matrices.

void cv::magnitude(InputArray x, 
                   InputArray y, 
                   OutputArray magnitude
                  )

image-20220318111141637

Because F(u,v) decays very quickly, plotting it directly shows almost nothing, so add 1 and then take the logarithm before analyzing the amplitude spectrum.

In this program, first calculate the optimal size for the discrete Fourier transform of the image, then expand the image with the copyMakeBorder() function, perform the discrete Fourier transform, and finally compute the magnitude of the result. To display the magnitudes, the result is normalized. The origin of the transform lies at the 4 corners of the image, so the quadrants are swapped to move the origin to the image center.

Note that normalizing the frequency-domain image is only helpful for viewing the spectrogram. The same holds for the logarithmic transformation: it is done to better observe the energy distribution in the frequency domain. Most of an image's energy is concentrated in the low-frequency region, so the transform result mostly shows bright spots at low frequencies.

Why is the image normalized after Fourier transform? - DBinary's answer - Zhihu https://www.zhihu.com/question/354081645/answer/890215427

#include <opencv2/opencv.hpp>
#include <iostream>

using namespace std;
using namespace cv;

int main()
{
	//对矩阵进行处理,展示正逆变换的关系
	Mat a = (Mat_<float>(5, 5) << 1, 2, 3, 4, 5,
		2, 3, 4, 5, 6,
		3, 4, 5, 6, 7,
		4, 5, 6, 7, 8,
		5, 6, 7, 8, 9);
	Mat b, c, d;
	dft(a, b, DFT_COMPLEX_OUTPUT);  //正变换
	dft(b, c, DFT_INVERSE | DFT_SCALE | DFT_REAL_OUTPUT);  //逆变换只输出实数
	idft(b, d, DFT_SCALE);  //逆变换

	//对图像进行处理
	Mat img = imread("pic/lena.png");
	if (img.empty())
	{
		cout << "请确认图像文件名称是否正确" << endl;
		return -1;
	}
	Mat gray;
	cvtColor(img, gray, COLOR_BGR2GRAY);
	resize(gray, gray, Size(502, 502));
	imshow("原图像", gray);

	//计算合适的离散傅里叶变换尺寸
	int rows = getOptimalDFTSize(gray.rows);
	int cols = getOptimalDFTSize(gray.cols);

	//扩展图像
	Mat appropriate;
	int T = (rows - gray.rows) / 2;  //上方扩展行数
	int B = rows - gray.rows - T;  //下方扩展行数
	int L = (cols - gray.cols) / 2;  //左侧扩展行数
	int R = cols - gray.cols - L;  //右侧扩展行数
	copyMakeBorder(gray, appropriate, T, B, L, R, BORDER_CONSTANT);
	imshow("扩展后的图像", appropriate);

	//构建离散傅里叶变换输入量
	Mat flo[2], complex;
	flo[0] = Mat_<float>(appropriate);  //实数部分
	flo[1] = Mat::zeros(appropriate.size(), CV_32F);  //虚数部分
	merge(flo, 2, complex);  //合成一个多通道矩阵

	//进行离散傅里叶变换
	Mat result;
	dft(complex, result);

	//将复数转化为幅值
	Mat resultC[2];
	split(result, resultC);  //分成实数和虚数
	Mat amplitude;
	magnitude(resultC[0], resultC[1], amplitude);

	//进行对数放缩公式为: M1 = log(1+M),保证所有数都大于0
	amplitude = amplitude + 1;
	log(amplitude, amplitude);//求自然对数

	//与原图像尺寸对应的区域								
	amplitude = amplitude(Rect(L, T, gray.cols, gray.rows));//Rect的前两个参数是x(左偏移L)和y(上偏移T);裁掉扩充的边框,匹配原始图像尺寸
	normalize(amplitude, amplitude, 0, 1, NORM_MINMAX);  //归一化
	imshow("傅里叶变换结果幅值图像", amplitude);  //显示结果

	//重新排列傅里叶图像中的象限,使得原点位于图像中心
	int centerX = amplitude.cols / 2;
	int centerY = amplitude.rows / 2;
	//分解成四个小区域
	Mat Qlt(amplitude, Rect(0, 0, centerX, centerY));//ROI区域的左上
	Mat Qrt(amplitude, Rect(centerX, 0, centerX, centerY));//ROI区域的右上
	Mat Qlb(amplitude, Rect(0, centerY, centerX, centerY));//ROI区域的左下
	Mat Qrb(amplitude, Rect(centerX, centerY, centerX, centerY));//ROI区域的右下

	//交换象限,左上和右下进行交换
	Mat med;
	Qlt.copyTo(med);
	Qrb.copyTo(Qlt);
	med.copyTo(Qrb);
	//交换象限,左下和右上进行交换
	Qrt.copyTo(med);
	Qlb.copyTo(Qrt);
	med.copyTo(Qlb);

	imshow("中心化后的幅值图像", amplitude);
	waitKey(0);
	return 0;
}

8.1.2 Convolution by Fourier transform - mulSpectrums()

The Fourier transform converts the convolution of two matrices into the product of their Fourier transforms, which can greatly speed up convolution. However, the Fourier transform of an image is a conjugate-symmetric complex matrix, and multiplying two spectra means computing the product of the two complex numbers at each position. OpenCV 4 provides the mulSpectrums() function for computing the per-element product of two complex matrices.

void cv::mulSpectrums(InputArray a, 
                      InputArray b, 
                      OutputArray c, 
                      int flags, //DFT_COMPLEX_OUTPUT
                      bool conjB = false//是否对第二个输入矩阵进行共轭变换的标志.当参数为false时,不进行共轭变换;当参数为true时,进行共轭变换.
                     )

When convolving an image via the discrete Fourier transform, expand the convolution kernel to the same size as the image, pad both to the size-optimized dimensions, multiply the two spectra, and then apply the inverse discrete Fourier transform to the product.
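
A minimal sketch of this procedure, assuming A (image) and B (kernel) are CV_32F single-channel matrices; it follows the pad, transform, multiply, invert order described above:

#include <opencv2/opencv.hpp>
#include <cstdlib>
using namespace cv;

void convolveDFT(const Mat& A, const Mat& B, Mat& C)
{
	//output size of the fully overlapping part of the convolution
	C.create(std::abs(A.rows - B.rows) + 1, std::abs(A.cols - B.cols) + 1, A.type());

	//pad both matrices with zeros to the optimal DFT size
	Size dftSize(getOptimalDFTSize(A.cols + B.cols - 1),
	             getOptimalDFTSize(A.rows + B.rows - 1));
	Mat tempA(dftSize, A.type(), Scalar::all(0));
	Mat tempB(dftSize, B.type(), Scalar::all(0));
	A.copyTo(tempA(Rect(0, 0, A.cols, A.rows)));
	B.copyTo(tempB(Rect(0, 0, B.cols, B.rows)));

	//forward transforms (nonzeroRows speeds them up), per-element spectrum
	//product, then the inverse transform of the product
	dft(tempA, tempA, 0, A.rows);
	dft(tempB, tempB, 0, B.rows);
	mulSpectrums(tempA, tempB, tempA, 0);
	dft(tempA, tempA, DFT_INVERSE + DFT_SCALE, C.rows);
	tempA(Rect(0, 0, C.cols, C.rows)).copyTo(C);
}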

8.1.3 Discrete Cosine Transform

https://blog.csdn.net/akadiao/article/details/79778095

Discrete cosine transform is often used in signal processing and image processing, mainly for lossy data compression of signals and images. The discrete cosine transform has the characteristic of "energy concentration": after the signal is transformed, the energy is mainly concentrated in the low-frequency part of the result.

void cv::dct(InputArray src, //处理单通道数据
             OutputArray dst, 
             int flags = 0
            )

Currently, the dct() function only supports even-sized arrays.

//逆变换
void cv::idct(InputArray src, 
              OutputArray dst, 
              int flags = 0
             )

Discrete cosine transform has a strong "energy concentration" characteristic, the upper left is called low-frequency data, and the lower right is called high-frequency data.

Figure 8-6

8.2 Integral image

Used to quickly calculate the average gray level of pixels in certain areas of the image

The integral image is a new image one pixel larger than the original in each dimension. For example, if the size of the original image is NxN, then the size of the integral image is (N+1)x(N+1).

The value of each pixel in the integral image is the sum of all pixel values in the rectangle formed, in the original image, by that pixel and the coordinate origin.

The pixel value of P0, for example, is the sum of all pixel values in the intersection of the first four rows and the first four columns of the original image.

Figure 8-7

Integral images come in three kinds: the standard summation integral image, the square summation integral image, and the tilted summation integral image (only the summation direction is rotated by 45°).

//标准求和积分
void cv::integral(InputArray src, //NxN
                  OutputArray sum, //(N+1)x(N+1)
                  int sdepth = -1
                 )
//平方求和积分
void cv::integral(InputArray src, 
                  OutputArray sum, 
                  OutputArray sqsum, //输出平方求和积分图像
                  int sdepth = -1, 
                  int sqdepth = -1
                 )
//倾斜求和积分
void cv::integral(InputArray src, 
                  OutputArray sum, 
                  OutputArray sqsum, //输出平方求和积分图像
                  OutputArray tilted,//输出倾斜求和积分图像
                  int sdepth = -1, 
                  int sqdepth = -1
                 )
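
A minimal sketch of the classic use of the standard integral image (the block coordinates are assumptions): the sum over any rectangle needs only 4 lookups:

#include <opencv2/opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;

int main()
{
	Mat gray = imread("pic/lena.png", IMREAD_GRAYSCALE);  //file name follows the earlier example
	if (gray.empty()) return -1;

	Mat sum;
	integral(gray, sum, CV_32S);  //sum is one larger in each dimension

	//sum and mean of the block with top-left corner (x, y) and size w x h,
	//obtained with 4 lookups instead of w*h additions
	int x = 10, y = 20, w = 30, h = 40;
	int blockSum = sum.at<int>(y + h, x + w) - sum.at<int>(y, x + w)
	             - sum.at<int>(y + h, x) + sum.at<int>(y, x);
	cout << "mean gray level: " << (double)blockSum / (w * h) << endl;
	return 0;
}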

8.3 Image Segmentation

8.3.1 Flood filling method - floodFill()

int cv::floodFill(InputOutputArray image, //单通道或者三通道
                  InputOutputArray mask, //尺寸比输入图像宽和高各大2 的单通道图像,用于标记漫水填充的区域.
                  Point seedPoint, //种子点
                  Scalar newVal, //归入种子点区域内像素点的新像素值.
                  Rect * rect = 0, //种子点漫水填充区域的最小矩形边界, 默认值为0 ,表示不输出边界
                  Scalar loDiff = Scalar(), //·添加进种子点区域条件的下界差值
                  Scalar upDiff = Scalar(), //添加进种子点区域条件的上界差值
                  int flags = 4//漫水填充法的操作标志,由邻域种类、掩码矩阵中被填充像素点的像素值、填充算法三部分构成
                  //例如 4 | (255<<8) | FLOODFILL_FIXED_RANGE
                 )
int cv::floodFill(InputOutputArray image, 
                  Point seedPoint, 
                  Scalar newVal,
                  Rect * rect = 0, 
                  Scalar loDiff = Scalar(), 
                  Scalar upDiff = Scalar(), 
                  int flags = 4
                 )
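
A minimal sketch of the mask version of floodFill() (the seed point, tolerances and file name are assumptions); note the mask must be 2 pixels larger than the image in each dimension:

#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
	Mat img = imread("pic/lena.png");
	if (img.empty()) return -1;

	Mat mask = Mat::zeros(img.rows + 2, img.cols + 2, CV_8UC1);  //2 larger in each dimension
	Rect border;                                         //bounding rect of the filled region
	int flags = 4 | (255 << 8) | FLOODFILL_FIXED_RANGE;  //4-connectivity, mask fill value 255

	floodFill(img, mask, Point(100, 100), Scalar(0, 0, 255), &border,
	          Scalar(20, 20, 20), Scalar(20, 20, 20), flags);

	imshow("filled", img);
	imshow("mask", mask);
	waitKey(0);
	return 0;
}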

8.3.2 Watershed method - watershed()

The difference is that the flood fill method segments starting from a certain seed pixel and is a local segmentation algorithm, while the watershed method starts from the image as a whole and performs global segmentation.

https://zhuanlan.zhihu.com/p/67741538

void cv::watershed(InputArray image, //是CV_8U数据类型的三通道彩色图像
                   InputOutputArray markers//CV_32S 数据类型的单通道图像的标记结果
                  )

In the marker result, the boundary between two regions is -1, and the other regions take values from 1 to n, where n cannot be greater than the number of contours.

8.3.3 Grabcut method - grabCut()

The Grabcut method is an important image segmentation algorithm. It uses a Gaussian mixture model to estimate the background and foreground of the target area, and solves the energy-function minimization problem iteratively, making the result more reliable.

void cv::grabCut(InputArray img, //CV_8U
                 InputOutputArray mask, //CV_8U
                 Rect rect, //ROI区域, 该参数仅在mode = GC_INIT_WITH_RECT 时使用.
                 InputOutputArray bgdModel, //背景模型的临时数组.
                 InputOutputArray fgdModel, //前景模型的临时数组.
                 int iterCount, //迭代次数.
                 int mode =  GC_EVAL//分割模式标志,
                )

image-20220321094242508

image-20220321094248724

8.3.4 Mean-Shift法——pyrMeanShiftFiltering()

https://blog.csdn.net/ttransposition/article/details/38514127

Baidu Library PPT: https://wenku.baidu.com/link?url=kZt9aLkbV0D74VY9AjCs_aL4cO4eSrbU8rDdzk2pLP0ZbSieurP-xFAJAPGxZSx3bLLxv14bMC5WoSlTkjZ5QXO7UOOO5IKKMqVixxsZst_

meanshift is often used to find the mode point, that is, the point with the highest density

The mean shift method is an image segmentation algorithm based on the distribution in color space. The output of the algorithm is a filtered "color separation" image: the image becomes gradual and fine-grained textures become smooth. Each pixel is represented by a five-dimensional vector (x, y, b, g, r).

void cv::pyrMeanShiftFiltering(InputArray src, //CV_8U
                              OutputArray dst, 
                              double sp, //滑动窗口的半径.
                              double sr, //滑动窗口颜色幅度.
                              int maxLevel = 1, //分割金字塔缩放层数.
                              TermCriteria termcrit = TermCriteria(TermCriteria::MAX_ITER+TermCriteria::EPS,5,1))//迭代算法终止条件.
cv::TermCriteria::TermCriteria(int Type, 
                               int maxCount, //最大迭代次数或者元素数.
                               double epsilon//迭代算法停止时需要满足的精度或者参数变化.
                              )

image-20220321102625440

8.4 Image Restoration

Removes " contaminated " areas of the image. Image restoration can not only remove scratches in images, but also remove watermarks, dates, etc. in images.

void cv::inpaint(InputArray src, //当图像为三通道时, 数据类型必须是CV_8U.
                 InputArray inpaintMask, 
                 OutputArray dst, 
                 double inpaintRadius, //算法考虑的每个像素点的圆形邻域半径.
                 int flags//修复图像方法标志,
                )

image-20220321112421202

Pixels farther away from edge regions are estimated to be less accurate , so if the "contaminated" region is larger, the inpainting will be less effective.

First create a mask (the mask marks the contamination in the image; the contaminated area can be obtained through binarization and similar operations), then optionally dilate the contaminated area of the mask slightly, and finally pass the image to be restored and the contamination mask to the inpaint() function.

Chapter 9 Feature Point Detection and Matching

9.1 Corner detection

image-20220321113432429

9.1.1 Display key points - drawKeypoints()

The key point is a name for the pixel point that contains special information in the image, mainly containing information such as the position and angle of the pixel point.

The drawKeypoints() function is used to draw all key points at once

void cv::drawKeypoints(InputArray image, 
                       const std::vector<KeyPoint> & keypoints, 
                       InputOutputArray outImage, //绘制关键点后的图像
                       const Scalar & color = Scalar::all(-1), //关键点的颜色
                       DrawMatchesFlags flags = DrawMatchesFlags::DEFAULT//绘制功能选择标志
                      )

image-20220321141933948

//KeyPoint类
class KeyPoint{
	float angle		//关键点的角度
	int class_id	//关键点的分类号
	int octave		//特征点来源“金字塔”
	Point2f pt		//关键点坐标
	float response	//最强关键点的响应,可用于进一步分类和二次采样
	float size		//关键点邻域的直径
}

//关键点类型变量的其他属性可以默认,但是坐标属性必须具有数据.

9.1.2 Harris corner detection - cornerHarris()

Harris corners are invariant to rotation and translation, but not to scaling

How is the Harris corner response function expression derived? - Dahei's answer - Zhihu https://www.zhihu.com/question/37871386/answer/2311779754

Corner detection: Harris and Shi-Tomasi - Programmer Ade's article - Zhihu https://zhuanlan.zhihu.com/p/83064609

The Harris corner is one of the most classic corner definitions. It defines corners from the perspective of pixel-value change: a local maximum peak of the pixel-value variation is a Harris corner.

void cv::cornerHarris(InputArray src, //CV_8U 或者CV_32F
                      OutputArray dst, //存放Harris评价系数R的矩阵,数据类型为CV_32F的单通道图像,
                      int blockSize, //邻域大小.          常常取2
                      int ksize, //Sobel算子的半径,用于得到梯度信息.   多使用3或者5
                      double k, //计算Harris 评价系数R 的权重系数.    一般取值为0.02~0.04.
                      int borderType = BORDER_DEFAULT
                     )

The result calculated by this function is the Harris evaluation coefficient R. Because its value range is wide and contains both positive and negative values, it usually needs to be normalized to a specified range with the normalize() function; each pixel is then judged to be a Harris corner or not by threshold comparison.
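
A minimal sketch of that pipeline (the parameters blockSize=2, ksize=3, k=0.04 and the threshold 125 are assumptions):

#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
	Mat img = imread("pic/lena.png");
	if (img.empty()) return -1;
	Mat gray;
	cvtColor(img, gray, COLOR_BGR2GRAY);

	Mat harris, harrisNorm;
	cornerHarris(gray, harris, 2, 3, 0.04);            //blockSize=2, ksize=3, k=0.04
	normalize(harris, harrisNorm, 0, 255, NORM_MINMAX, CV_8UC1);  //rescale R to 0-255

	for (int r = 0; r < harrisNorm.rows; r++)
		for (int c = 0; c < harrisNorm.cols; c++)
			if (harrisNorm.at<uchar>(r, c) > 125)      //threshold chosen by trial
				circle(img, Point(c, r), 2, Scalar(0, 0, 255), 2);

	imshow("Harris corners", img);
	waitKey(0);
	return 0;
}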

9.1.3 Shi-Tomas corner detection - goodFeaturesToTrack()

image-20220321161457165

void cv::goodFeaturesToTrack(InputArray image, 
                             OutputArray corners, //存放在vector<Point2f>向量或者Mat类矩阵中;若存放在Mat类矩阵中,生成的是数据类型为CV_32F的单列矩阵
                             int maxCorners, 
                             double qualityLevel, //角点阈值与最佳角点之间的关系,又称为质量等级,如果参数为0.01. 那么表示角点阀值是最佳角点的0.01倍.
                             double minDistance, //两个角点之间的最小欧氏距离.
                             InputArray mask = noArray(), 
                             int blockSize = 3, //计算梯度协方差矩阵的尺寸,,,,默认3
                             bool useHarrisDetector = false, //是否使用Harris角点检测.
                             double k = 0.04//Harris角点检测过程中的常值权重系数.
                            )

9.1.4 Sub-pixel level corner detection - cornerSubPix()

image-20220321170119101

image-20220321170141477

void cv::cornerSubPix(InputArray image, 
                      InputOutputArray corners, //角点坐标, 既是输入的角点坐标, 又是精细后的角点坐标.
                      Size winSize, //搜索窗口尺寸的一半, 必须是整数.实际的搜索窗口尺寸比该参数的2倍大1.
                      Size zeroZone, //搜索区域中间"死区"大小的一半,即不提取像素点的区域,(-1,-1)表示没"死区"
                      TermCriteria criteria//终止角点优化迭代的条件.
                     )

Sub-pixel corner detection first computes Harris or Shi-Tomasi corners, and then refines them to sub-pixel coordinates.

9.1.5 FAST corner detection - FAST()

https://blog.csdn.net/tostq/article/details/49314017

https://aibotlab.blog.csdn.net/article/details/65662648

1. FAST is very fast;
2. it has neither the scale invariance of SIFT nor rotation invariance;
3. when there are many noise points in the picture its robustness is poor, and the effect of the algorithm also depends on a threshold t.

	FAST(gray, fastPt, 50,true,FastFeatureDetector::TYPE_9_16);

//等价于下面
	Ptr<FastFeatureDetector> fast = FastFeatureDetector::create(50, true, FastFeatureDetector::TYPE_9_16);
	fast->detect(gray, fastPt);

//Ptr<类名>的用法: Ptr是一个智能指针,当相关指针在任何地方都不再使用时自动删除,从而帮助彻底消除内存泄漏和悬空指针的问题。Ptr<类名>是一个模板类,其类型为指定的类。
OpenCV笔记(Ptr)  https://www.cnblogs.com/fireae/p/3684915.html
//Ptr计数指针的技术问题
shared_ptr的引用计数原理 https://blog.csdn.net/qq_29108585/article/details/78027867
深入理解智能指针之shared_ptr https://www.cnblogs.com/mrbendy/p/12701339.html

9.2 Feature point detection

A feature point has the same macroscopic definition as a corner point: both are pixels that can represent local features of the image. The difference is that a feature point also carries a descriptor that uniquely describes the pixel's characteristics, for example "the pixels to the left of the point are larger than the pixels to the right" or "the point is a local minimum".

Usually feature points are composed of key points and descriptors. For example, SIFT feature points, ORB feature points, etc. need to calculate key point coordinates first, and then calculate descriptors

9.2.1 Key points

image-20220321171830700

virtual void cv::Feature2D::detect(InputArray image, 
                                   std::vector<KeyPoint> & keypoints, 
                                   InputArray mask = noArray()
                                  )

This function can only be used through a class that inherits it, i.e. within a concrete feature-point class. For example, in the ORB class of ORB feature points, the key points are computed through the ORB::detect() function; in the SIFT class of SIFT feature points, the key points are computed through the SIFT::detect() function.

9.2.2 Descriptor - detectAndCompute()

A descriptor is a string of numbers used to uniquely describe a key point , which is similar to each person's personal information. Through the descriptor, two different key points can be distinguished , and the same key point can also be found in different images.

//计算描述子
virtual void cv::Feature2D::compute(InputArray image, 
                                    std::vector<KeyPoint> &keypoints, //已经在输入图像中计算得到的关键点.
                                    OutputArray descriptors
                                   )
//直接计算关键点和描述子
virtual void cv::Feature2D::detectAndCompute(InputArray image, 
                                             InputArray mask, //计算关键点时的掩码图像.
                                             std::vector<KeyPoint> & keypoints, //计算得到的关键点.
                                             OutputArray descriptors, //每个关键点对应的描述子.
                                             bool useProvidedKeypoints = false//是否使用己有关键点的标识符。
                                            )

9.2.3 SIFT feature point detection - Ptr<SIFT>

The reason why SIFT feature points are popular is that they still have good stability under disturbances such as illumination, noise, viewing angle, scaling and rotation.

Figure 9-8

Sift operator feature point extraction https://blog.csdn.net/dcrmg/article/details/52577555

Source of 128-dimensional feature items: https://www.cnblogs.com/wangguchangqing/p/4853263.html

The SIFT class variable indicates that the function inherited from the Feature2D class computes SIFT feature points rather than other feature points. The SIFT class is in the xfeatures2d header file and namespace, so the program needs to include the header via "#include <opencv2/xfeatures2d.hpp>".

static Ptr<SIFT> cv::xfeatures2d::SIFT::create(int nfeatures = 0, //计算SIFT特征点数目。
                                               int nOctaveLayers = 3,//金字塔中每组的层数.
                                               double contrastThreshold = 0.04, //过滤较差特征点的阈值,该参数值越大,返回的特征点越少.
                                               double edgeThreshold = 10, //过滤边缘效应的阈值, 该参数值越大,返回的特征点越多.
                                               double sigma = 1.6//"金字塔"第0层图像高斯滤波的系数,即上图的σ0
                                              )

9.2.4 SURF feature point detection - Ptr<SURF>

https://blog.csdn.net/zrz0258/article/details/113176528

The SURF feature points directly use the box filter to approximate the Gaussian difference space

In SURF feature points, the images in different octaves are all the same size, but the box filters used by different octaves gradually increase in size; within the same octave, the layers use filters of the same size, but the blur coefficient of the filter gradually increases.

static Ptr<SURF> cv::xfeatures2d::SURF::create(double hessianThreshold = 100, //SURF关键点检测的阈值.
                                               int nOctaves = 4, //构建"金字塔"的组数.
                                               int nOctaveLayers = 3, //"金字塔"中每组的层数
                                               bool extended = false, //是否拓展64维描述子至128维.
                                               bool upright = false//是否计算关键点方向的标志.
                                              )

image-20220322100603389

9.2.5 ORB feature point detection - Ptr<ORB>

https://www.cnblogs.com/alexme/p/11345701.html

https://blog.csdn.net/yang843061497/article/details/38553765

ORB feature points are known for their fast calculation speed, which can reach 10 times that of SURF feature points and 100 times that of SIFT feature points.

ORB feature points are composed of FAST corner points and BRIEF descriptors. First, FAST corner detection finds pixels that differ clearly from their surrounding pixels and uses them as key points; then the BRIEF descriptor of each key point is computed to uniquely determine the ORB feature point.

Rotation invariant and scale invariant

static Ptr<ORB> cv::ORB::create(int nfeatures = 500, 
                                float scaleFactor = 1.2f, //" 金字塔"尺寸缩小的比例
                                int nlevels = 8, //金字塔"层数
                                int edgeThreshold = 31, //边缘阈值.
                                int firstLevel = 0, //将原图像放入"金字塔"中的等级,例如放入第0 层.
                                int WTA_K = 2, //生成每位描述子时需要用的像索点数目.
                                ORB::ScoreType scoreType = ORB::HARRIS_SCORE, //检测关键点时关键点评价方法.
                                int patchSize = 31, //生成描述子时关键点周围邻域的尺寸.
                                int fastThreshold = 20//计算FAST 角点时像素值差值的阈值.
                               )
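
A minimal sketch of computing and displaying ORB feature points (the file name is an assumption); detectAndCompute() returns key points and descriptors in one call:

#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
	Mat img = imread("pic/lena.png");
	if (img.empty()) return -1;

	Ptr<ORB> orb = ORB::create(500);            //at most 500 feature points
	vector<KeyPoint> keypoints;
	Mat descriptors;
	orb->detectAndCompute(img, noArray(), keypoints, descriptors);

	Mat out;
	drawKeypoints(img, keypoints, out, Scalar::all(-1));
	imshow("ORB keypoints", out);
	waitKey(0);
	return 0;
}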

9.3 Feature point matching

Feature point matching is finding the feature points of the same object across different images.

image-20220322145120459

image-20220322172033698

9.3.1 DescriptorMatcher Class Introduction

//一对一的匹配
void cv::DescriptorMatcher::match(InputArray queryDescriptors,//查询描述子集合.
                                  InputArray trainDescriptors,//训练描述子集合.
                                  std::vector<DMatch> & matches,//两个集合描述子匹配结果.匹配结果数目可能小于描述子的数目
                                  InputArray mask = noArray()//描述子匹配时的掩码矩阵,用于指定匹配哪些描述子.
                                 )

The DMatch type is the type used to store the feature point descriptor matching relationship in OpenCV4, and the index and distance of two descriptors are stored in the type.

class cv::DMatch{
		float distance	//两个描述子之间的距离
        int imgIdx		//训练描述子来自的图像索引
        int queryIdx	//查询描述子集合中的索引
        int trainIdx	//训练描述子集合中的索引
}
//一对多的描述子匹配
void cv::DescriptorMatcher::knnMatch(InputArray queryDescriptors,
                                    InputArray trainDescriptors,
                                    std::vector<std::vector<DMatch>> & matches,//matches[i]中存放的是k个或者更少的与第i个查询描述子匹配的训练描述子.
                                    int k,//每个查询描述子在训练描述子集合中寻找的最优匹配结果的数目.
                                    InputArray mask = noArray(),
                                    bool compactResult = false//输出匹配结果数目是否与查询描述子数目相同的选择标志。
                                    )
//匹配所有满足条件的描述子,即将与查询描述子距离小于阈值的所有训练描述子都作为匹配点输出
void cv::DescriptorMatcher::radiusMatch(InputArray queryDescriptors,
                                    	InputArray trainDescriptors,
                                    	std::vector<std::vector<DMatch>> & matches,
                                    	float maxDistance,
                                    	InputArray mask = noArray(),
                                    	bool compactResult = false
                                    	)

Similar to the feature-point detect() function, the match functions can only be used through a concrete class that inherits the virtual DescriptorMatcher class.

9.3.2 Brute force matching - BFMatcher.match()

image-20220322155705188

cv::BFMatcher::BFMatcher(int normType = NORM_L2, 
                         bool crossCheck = false
                        )

Brute-force matching finds an optimal descriptor for every query descriptor, but this constraint sometimes causes wrong matches: for example, if a feature point appears only in the query image, it has no true match in the other image, yet brute-force matching will still assign it a "matching" feature point there, resulting in a wrong match.

void cv::drawMatches(InputArray img1, 
                     const std::vector<KeyPoint> & keypoints1, 
                     InputArray img2, 
                     const std::vector<KeyPoint> & keypoints2, 
                     const std::vector<DMatch> & matches1to2, 
                     InputOutputArray outImg, 
                     const Scalar & matchColor = Scalar::all(-1), 
                     const Scalar & singlePointColor = Scalar::all(-1), 
                     const std::vector<char> & matchesMask = std::vector<char>(),
                     DrawMatchesFlags flags = DrawMatchesFlags::DEFAULT//绘制功能选择标志,
                    )
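
A minimal sketch of brute-force matching of ORB descriptors (the file names and distance thresholds are assumptions); NORM_HAMMING is used because ORB descriptors are binary:

#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
	Mat img1 = imread("pic/box.png");          //hypothetical file names
	Mat img2 = imread("pic/box_in_scene.png");
	if (img1.empty() || img2.empty()) return -1;

	Ptr<ORB> orb = ORB::create(500);
	vector<KeyPoint> kps1, kps2;
	Mat desc1, desc2;
	orb->detectAndCompute(img1, noArray(), kps1, desc1);
	orb->detectAndCompute(img2, noArray(), kps2, desc2);

	BFMatcher matcher(NORM_HAMMING);           //Hamming distance suits binary descriptors
	vector<DMatch> matches;
	matcher.match(desc1, desc2, matches);

	//keep matches whose distance is close to the best one (thresholds are assumptions)
	double minDist = 1e9;
	for (size_t i = 0; i < matches.size(); i++)
		minDist = min(minDist, (double)matches[i].distance);
	vector<DMatch> good;
	for (size_t i = 0; i < matches.size(); i++)
		if (matches[i].distance <= max(2.0 * minDist, 20.0))
			good.push_back(matches[i]);

	Mat result;
	drawMatches(img1, kps1, img2, kps2, good, result);
	imshow("matches", result);
	waitKey(0);
	return 0;
}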

9.3.4 FLANN matching - FlannBasedMatcher.match()

Although the principle of brute-force matching is simple, the complexity of the algorithm is high; when the number of feature points is large it seriously affects the running time of the program. Therefore, OpenCV 4 provides the Fast Library for Approximate Nearest Neighbors (FLANN) for efficient feature-point matching.

cv::FlannBasedMatcher::FlannBasedMatcher(
    const Ptr<flann::IndexParams> & indexParams = makePtr<flann::KDTreeIndexParams>(), 
    const Ptr<flann::SearchParams> & searchParams = makePtr<flann::SearchParams>()//迭代遍历次数终止条件,一般情况下使用默认参数即可。
)
    
    //使用FLANN方法进行匹配时描述子需要是CV_32F 类型,因此ORB 特征点的描述子变量需要进行类型转换后才可以实现特征点匹配.

image-20220322200630854

kd tree structure: https://blog.csdn.net/u011067360/article/details/23934361

image-20220322205902706

k-means clustering: https://www.cnblogs.com/pinard/p/6164214.html


Hierarchical Clustering: https://blog.csdn.net/zhangyonggang886/article/details/53510767

Hierarchical clustering is a kind of clustering algorithm , which creates a hierarchical nested clustering tree by calculating the similarity between different categories of data points. In a clustering tree, the raw data points of different categories are the lowest level of the tree, and the top level of the tree is the root node of a cluster. There are two methods of creating a clustering tree: bottom-up merging and top-down splitting. This article introduces the merging method.

9.3.5 RANSAC optimized feature matching

In order to improve the matching accuracy of feature points, we can use the RANSAC (RANdom SAmple Consensus) algorithm.

**RANSAC:** https://blog.csdn.net/robinhjwy/article/details/79174914 (a very important algorithm)

Randomly select points from the data set and fit a model to them, then count how many points of the data set fall within a threshold of the model (the inliers); iterate continuously, and keep the model fitted with the largest number of inliers. The value of RANSAC is that it removes the outlier points; the remaining inlier points can then be fed to some fitting algorithm to obtain a more accurate model.

Homography matrix understanding: https://blog.csdn.net/lyhbkz/article/details/82254893

image-20220322212834587

image-20220322212841076

Mat cv::findHomography(InputArray srcPoints, //原始图像中特征点的坐标.   CV_32FC2或者vector<Point2f>
                       InputArray dstPoints, //目标图像中特征点的坐标.
                       int method = 0, //计算单应矩阵方法的标志
                       double ransacReprojThreshold = 3, //重投影的最大误差.选择RANSAC和RHO时有用,
                       OutputArray mask = noArray(), //掩码矩阵,使用RANSAC 算法时表示满足单应矩阵的特征点.
                       const int maxIters = 3000, //RANSAC算法迭代的最大次数.
                       const double confidence = 0.995//置信区间,取值范围为0- 1 .
                      )

image-20220322213348090

To optimize matched feature points with this function, check whether each element of the output mask matrix is 0: a non-zero element means the corresponding point is a successfully matched feature point. Find the matching entry in the vector<DMatch>, and put it into a new vector of DMatch type holding the filtered results.

//RANSAC算法实现
void ransac(vector<DMatch> matches, vector<KeyPoint> queryKeyPoint, vector<KeyPoint> trainKeyPoint, vector<DMatch> &matches_ransac)
{
	//定义保存匹配点对坐标
	vector<Point2f> srcPoints(matches.size()), dstPoints(matches.size());
	//保存从关键点中提取到的匹配点对的坐标
	for (int i = 0; i<matches.size(); i++)
	{
		srcPoints[i] = queryKeyPoint[matches[i].queryIdx].pt;
		dstPoints[i] = trainKeyPoint[matches[i].trainIdx].pt;
	}
	
	//匹配点对进行RANSAC过滤
	vector<int> inliersMask(srcPoints.size());
	//Mat homography;
	//homography = findHomography(srcPoints, dstPoints, RANSAC, 5, inliersMask);
	findHomography(srcPoints, dstPoints, RANSAC, 5, inliersMask);
	//手动的保留RANSAC过滤后的匹配点对
	for (int i = 0; i<inliersMask.size(); i++)
		if (inliersMask[i])
			matches_ransac.push_back(matches[i]);
}

Chapter 10 Stereo Vision

10.1 Monocular Vision

10.1.1 Monocular camera model

Measuring the intrinsic parameters is the first step before using a camera.

The camera's intrinsic matrix is related only to the camera's internal parameters, hence the name intrinsic (internal reference) matrix. Through the intrinsic matrix, any three-dimensional coordinate in the camera coordinate system can be mapped to the pixel coordinate system, constructing the mapping relationship between spatial points and pixel points.

image-20220324095912098

//非齐次坐标转换成齐次坐标
void cv::convertPointsToHomogeneous(InputArray src, //非齐次坐标, vector<Point2f>/vector<Point3f>或者Mat
                                    OutputArray dst//齐次坐标,维数大1
                                   )
//齐次坐标转换为非齐次坐标
void cv::convertPointsFromHomogeneous(InputArray src, 
                                      OutputArray dst
                                     )

10.1.2 Extracting the corner points of the calibration board

https://zhuanlan.zhihu.com/p/94244568

image-20220324112319424

bool cv::findChessboardCorners(InputArray image,
                               Size patternSize, //图像中棋盘内角点行数和列数.
                               OutputArray corners, //检测到的内角点坐标, 存放在vector<Point2f>中
                               int flags = CALIB_CB_ADAPTIVE_THRESH+CALIB_CB_NORMALIZE_IMAGE//检测内角点方式的标志
                              )

image-20220324112527405

The inner-corner coordinates detected by the findChessboardCorners() function are only approximate. To determine them more accurately, you can use the cornerSubPix() function introduced earlier to compute sub-pixel corner coordinates; in addition, OpenCV4 has a dedicated find4QuadCornerSubpix() function for improving the accuracy of the calibration board's inner-corner coordinates.

bool cv::find4QuadCornerSubpix(InputArray img, 
                              InputOutputArray corners, //待优化的内角点坐标
                              Size region_size//优化坐标时考虑的邻域范围。
                             )
//对于圆形标定板
bool cv::findCirclesGrid(InputArray image, 
                         Size patternSize, //图像中每行和每列圆形的数目.
                         OutputArray centers, 
                         int flags = CALIB_CB_SYMMETRIC_GRID, //检测圆心的操作标志
                         const Ptr<FeatureDetector> & blobDetector = SimpleBlobDetector::create()//在浅色背景中寻找黑色圆形斑点的特征探测器.
                        )

image-20220324113342568

//绘制角点位置
void cv::drawChessboardCorners(InputArray image, //需要绘制角点的目标图像,必须是CV_8U类型的彩色图像.
                               Size patternSize, //标定板每行和每列角点的数目.
                               InputArray corners, //检测到的角点坐标数组.
                               bool patternWasFound//绘制角点样式的标志,用于显示是否找到完整的标定板.
                              )
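
A minimal sketch of the corner-extraction pipeline (the file name and the 9x6 board size are assumptions): detect, refine to sub-pixel, then draw:

#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
	Mat img = imread("pic/chessboard.png");    //hypothetical file name
	if (img.empty()) return -1;
	Mat gray;
	cvtColor(img, gray, COLOR_BGR2GRAY);

	Size boardSize(9, 6);                      //inner corners per row/column (assumption)
	vector<Point2f> corners;
	bool found = findChessboardCorners(gray, boardSize, corners);
	if (found)
		find4QuadCornerSubpix(gray, corners, Size(5, 5));  //refine to sub-pixel accuracy
	drawChessboardCorners(img, boardSize, corners, found);

	imshow("corners", img);
	waitKey(0);
	return 0;
}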

10.1.3 Monocular Camera Calibration

Figure 10-6

image-20220324143013796

image-20220324143111599

image-20220324143204656

Camera calibration mainly computes the camera's intrinsic matrix and the 5 distortion coefficients.

double cv::calibrateCamera(InputArrayOfArrays objectPoints,
                           InputArrayOfArrays imagePoints, 
                           Size imageSize, 
                           InputOutputArray cameraMatrix, 
                           InputOutputArray distCoeffs, 
                           OutputArrayOfArrays rvecs, 
                           OutputArrayOfArrays tvecs, 
                           int flags = 0, 
                           TermCriteria criteria =  TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 30, DBL_EPSILON))

image-20220324143647353

10.1.4 Monocular Camera Correction

After obtaining the distortion coefficient matrix of the camera, the distortion in the image can be removed according to the distortion model to generate a theoretically distortion-free image.

Method 1: initUndistortRectifyMap() + remap()

Method 2: undistort()

void cv::initUndistortRectifyMap(InputArray cameraMatrix, 
                                 InputArray distCoeffs, 
                                 InputArray R, 
                                 InputArray newCameraMatrix, 
                                 Size size, 
                                 int m1type, 
                                 OutputArray map1, 
                                 OutputArray map2
                                )

image-20220324152803181

void cv::remap(InputArray src, 
               OutputArray dst,  
               InputArray map1,
               InputArray map2,
               int interpolation,
               int borderMode = BORDER_CONSTANT,	
               const Scalar & borderValue = Scalar()
              )

image-20220324153104182

//直接矫正
void cv::undistort(InputArray src, 
                   OutputArray dst,  
                   InputArray cameraMatrix,
                   InputArray distCoeffs,
                   InputArray newCameraMatrix = noArray()
                  )

image-20220324154913343

10.1.5 Monocular projection

Monocular projection refers to the process of calculating the coordinates of three-dimensional coordinate points in space in the two-dimensional plane of the image according to the imaging model of the camera. The projectPoints() function is provided in OpenCV4 to calculate the projection of three-dimensional points in the world coordinate system into the pixel coordinate system The two-dimensional coordinates of

void cv::projectPoints(InputArray objectPoints, 
                       InputArray rvec, 
                       InputArray tvec, 
                       InputArray cameraMatrix, 
                       InputArray distCoeffs, 
                       OutputArray imagePoints, 
                       OutputArray jacobian = noArray(), 
                       double aspectRatio = 0
                      )

image-20220324160305632

10.1.6 Monocular Pose Estimation

image-20220324164009124

bool cv::solvePnP(InputArray objectPoints, 
                  InputArray imagePoints, 
                  InputArray cameraMatrix, 
                  InputArray distCoeffs, 
                  OutputArray rvec, 
                  OutputArray tvec, 
                  bool useExtrinsicGuess = false, 
                  int flags = SOLVEPNP_ITERATIVE
                 )

image-20220324165041394

image-20220324165047990

image-20220324165131173

image-20220324165159040

image-20220324165209528

image-20220324165845260

Chapter 11 Video Analysis

This chapter focuses on how to detect and track moving objects in video. The main methods are the difference method, the mean shift method and the optical flow method.

11.1 Difference method to detect moving objects

//用于计算两个图像差值的绝对值
void cv::absdiff(InputArray src1, 
                 InputArray src2, 
                 OutputArray dst
                )
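
A minimal sketch of moving-object detection by frame differencing (the video file name and the threshold 25 are assumptions):

#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
	VideoCapture cap("pic/vtest.avi");         //hypothetical video file
	if (!cap.isOpened()) return -1;

	Mat frame, curGray, prevGray, diff, binary;
	cap >> frame;
	if (frame.empty()) return -1;
	cvtColor(frame, prevGray, COLOR_BGR2GRAY);

	while (cap.read(frame))
	{
		cvtColor(frame, curGray, COLOR_BGR2GRAY);
		absdiff(curGray, prevGray, diff);                //|frame_t - frame_{t-1}|
		threshold(diff, binary, 25, 255, THRESH_BINARY); //moving pixels become white
		imshow("moving regions", binary);
		curGray.copyTo(prevGray);
		if (waitKey(30) == 27) break;                    //Esc to quit
	}
	return 0;
}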

11.2 Mean Shift Method Target Tracking

11.2.1 Target tracking achieved by the mean shift method

The mean shift method can be used for target tracking. Its principle: first compute the mean within a given region; if the mean does not satisfy the optimal-value condition, move the region toward the optimum; by iterating continuously, the target region is found.

The mean shift method can also be called the hill climbing algorithm.

Figure 11-3

image-20220325160446902

int cv::meanShift(InputArray probImage, //目标区域的直方图的反向投影
                  Rect & window, //初始搜索窗口和搜索结束时的窗口
                  TermCriteria criteria//迭代停止算法
                 )
//鼠标选取目标区域
Rect cv::selectROI(const String & windowName, 
                   InputArray img, //选择ROI 区域的图像.
                   bool showCrosshair = true, //是否显示选择矩形中心的十字准线的标志.
                   bool fromCenter = false//ROI 矩形区域中心位置标志.当该参数值为true 时,鼠标当前坐标为ROI 矩形的中心,当该参数值为false时,鼠标当前坐标为ROI 矩形区域的左上角.
                  )

11.2.2 Target Tracking by Adaptive Mean Shift

The adaptive mean shift method improves the mean shift method so that the size of the search window automatically adjusts to the size of the tracked object. In addition, the improved method returns not only the position of the tracked target but also its angle information.

RotatedRect cv::CamShift(InputArray probImage, 
                        Rect & window, 
                        TermCriteria criteria
                       )

11.3 Optical Flow Object Tracking

Optical flow is the instantaneous velocity of each pixel projected onto the image plane by a spatially moving object; over a short time interval it can be treated as the displacement of the pixels. Ignoring the influence of illumination changes, optical flow arises mainly from the motion of objects in the scene, the motion of the camera, or both together. Optical flow represents the change of the image; because it contains the target's motion information, an observer can use it to determine the target's motion and then track the target.

image-20220325163304629

The optical flow method assumes that pixels move only a small distance, but sometimes the pixel displacement between consecutive images is large; in that case an image "pyramid" is needed to handle large-scale motion.

According to the number of pixels used to compute the optical flow velocity, optical flow methods can be divided into dense optical flow methods (all pixels are used) and sparse optical flow methods (only some pixels are used).

11.3.1 Farneback Polynomial Expansion Algorithm

void cv::calcOpticalFlowFarneback(InputArray prev, 
                                  InputArray next, 
                                  InputOutputArray flow, 
                                  double pyr_scale, //image scale (< 1) between pyramid layers
                                  int levels, //number of pyramid layers
                                  int winsize, //averaging window size
                                  int iterations, 
                                  int poly_n, //size of the pixel neighborhood used for the polynomial expansion at each pixel
                                  double poly_sigma, //Gaussian standard deviation
                                  int flags //operation flag: OPTFLOW_USE_INITIAL_FLOW uses the input flow as an initial approximation; OPTFLOW_FARNEBACK_GAUSSIAN uses a Gaussian filter instead of a box filter for the optical flow estimation
                                 )
// computes the magnitude (length) and angle (direction) of 2D vectors
void cv::cartToPolar(InputArray x, 
                     InputArray y, 
                     OutputArray magnitude, 
                     OutputArray angle, 
                     bool angleInDegrees = false
                    )
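
A minimal dense optical flow sketch follows; the video path and the Farneback parameter values are assumptions chosen for illustration. The flow field is split into x/y components and converted to magnitude/angle with cartToPolar() for display.

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

int main()
{
	VideoCapture cap("./video/vtest.avi");
	Mat frame, gray, prevGray, flow;
	cap >> frame;
	if (frame.empty()) return -1;
	cvtColor(frame, prevGray, COLOR_BGR2GRAY);

	while (cap.read(frame))
	{
		cvtColor(frame, gray, COLOR_BGR2GRAY);
		calcOpticalFlowFarneback(prevGray, gray, flow,
		                         0.5,  // pyr_scale: each layer is half the previous size
		                         3,    // levels
		                         15,   // winsize
		                         3,    // iterations
		                         5,    // poly_n
		                         1.2,  // poly_sigma
		                         0);   // flags

		// split the 2-channel flow into x/y and convert to magnitude/angle
		std::vector<Mat> xy;
		split(flow, xy);
		Mat magnitude, angle;
		cartToPolar(xy[0], xy[1], magnitude, angle, true);

		normalize(magnitude, magnitude, 0, 255, NORM_MINMAX);
		magnitude.convertTo(magnitude, CV_8U);
		imshow("flow magnitude", magnitude);

		gray.copyTo(prevGray);
		if ((char)waitKey(30) == 27) break;
	}
	return 0;
}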

Dense optical flow is often used for object tracking in video shot by a fixed camera.

11.3.2 Tracking based on LK sparse optical flow method

https://blog.csdn.net/a_31415926/article/details/50515835

// TermCriteria: iteration termination criteria
cv::TermCriteria::TermCriteria(int type, // type of termination condition: stop by count only, by EPS only, or when either of the two is reached
                               int maxCount, // maximum number of iterations
                               double epsilon // desired accuracy (epsilon)
                               ) 
/*
COUNT: stop after the maximum number of iterations
EPS: stop when the result converges within the accuracy epsilon
COUNT + EPS: stop when either the maximum number of iterations or the convergence threshold is reached
*/
void cv::calcOpticalFlowPyrLK(InputArray prevImg, 
                              InputArray nextImg, 
                              InputArray prevPts, 
                              InputOutputArray nextPts, 
                              OutputArray status, 
                              OutputArray err, 
                              Size winSize = Size(21,21), 
                              int maxLevel = 3,
                              TermCriteria criteria = TermCriteria(TermCriteria::COUNT+TermCriteria::EPS,30,0.01),
                              int flags = 0,
                              double minEigThreshold = 1e-4
                             )
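
A minimal sketch of LK tracking, assuming a video at "./video/vtest.avi": corners from goodFeaturesToTrack() seed calcOpticalFlowPyrLK(), and only points with status == 1 survive to the next frame.

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

int main()
{
	VideoCapture cap("./video/vtest.avi");
	Mat frame, gray, prevGray;
	cap >> frame;
	if (frame.empty()) return -1;
	cvtColor(frame, prevGray, COLOR_BGR2GRAY);

	std::vector<Point2f> prevPts, nextPts;
	goodFeaturesToTrack(prevGray, prevPts, 100, 0.01, 10); // initial corners

	TermCriteria criteria(TermCriteria::COUNT + TermCriteria::EPS, 30, 0.01);
	while (cap.read(frame))
	{
		if (prevPts.empty()) break; // nothing left to track
		cvtColor(frame, gray, COLOR_BGR2GRAY);

		std::vector<uchar> status;
		std::vector<float> err;
		calcOpticalFlowPyrLK(prevGray, gray, prevPts, nextPts,
		                     status, err, Size(21, 21), 3, criteria);

		// keep only the points that were tracked successfully
		std::vector<Point2f> tracked;
		for (size_t i = 0; i < nextPts.size(); i++)
		{
			if (!status[i]) continue;
			tracked.push_back(nextPts[i]);
			circle(frame, nextPts[i], 3, Scalar(0, 255, 0), -1);
		}
		imshow("LK tracking", frame);

		gray.copyTo(prevGray);
		prevPts = tracked; // surviving points seed the next frame
		if ((char)waitKey(30) == 27) break;
	}
	return 0;
}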

image-20220326103950521

The calcOpticalFlowPyrLK() function requires the coordinates of the sparse optical flow points to be supplied by hand. Usually, feature points or corner points are detected in the image, and their coordinates are used as the initial sparse optical flow points.

The biggest problem: the number of tracked corner points keeps shrinking, so the number of corners being tracked must be counted at all times, and when it falls below a certain threshold, corners must be detected again to replenish them. If the image contains stationary objects, corners on those objects are detected in every frame, keeping the corner count above the threshold, yet these fixed feature points are not what we need. Therefore we must check whether each corner moves between the two frames, delete the corners that do not move, and then track the moving object; a sketch of this replenish-and-filter step follows.
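
The fragment below is meant to replace the simple status filter in the loop body of the LK example above (the 2-pixel movement test and the 30-point minimum are assumed values for illustration):

std::vector<Point2f> moving;
for (size_t i = 0; i < nextPts.size(); i++)
{
	if (!status[i]) continue;
	double dx = nextPts[i].x - prevPts[i].x;
	double dy = nextPts[i].y - prevPts[i].y;
	if (dx * dx + dy * dy > 2 * 2)   // keep only points that actually moved
		moving.push_back(nextPts[i]);
}
prevPts = moving;
if (prevPts.size() < 30)             // too few corners left: detect again
	goodFeaturesToTrack(gray, prevPts, 100, 0.01, 10);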


Origin blog.csdn.net/weixin_42264818/article/details/127506686