OpenCV image

We covered configuration earlier and introduced Mat, a very important data structure in OpenCV. Mat was designed for image manipulation, so now we put it to work.

Image traversal

How do we traverse a two-dimensional array?
We write nested loops: the outer loop runs over the rows and the inner loop runs over the columns.
The same pattern works for image traversal!
The only extra question is whether the image is stored continuously, that is, whether the last element of one row is immediately followed in memory by the first element of the next row.
We don't need to write a pile of code to check this, because there is a built-in API: isContinuous().
Here is one way to write the traversal:

#include<iostream>
#include<string>
#include<opencv2/opencv.hpp>
#include<opencv2/imgcodecs.hpp>
#include<opencv2/imgproc.hpp>
using namespace std;
using namespace cv;

const string file_path = "D://C++//kingjames//Debug//favorite.jpg";

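int main() {
	Mat img = imread(file_path);
	if (img.empty()) {
		cerr << "Could not read " << file_path << endl;
		return 1;
	}
	// Check whether the image is stored continuously in memory
	cout << boolalpha << img.isContinuous() << endl;
	//Mat copy_img = Mat::zeros(img.size(), CV_8UC3);
	int row = img.rows;
	int col = img.cols;
	int channel = img.channels();
	// With three channels, multiply the column count by 3 so that every value is visited
	col *= channel;
	// If the storage is continuous, treat the image as one long row of row * col * channel values
	if (img.isContinuous()) {
		col *= row;  // multiply before row is reset to 1
		row = 1;
	}
	uchar* p;
	for (int i = 0; i < row; i++) {
		p = img.ptr<uchar>(i);
		for (int j = 0; j < col; j++) {
			// Cast to int, otherwise the uchar is printed as a character
			cout << (int)p[j] << endl;
		}
	}

	return 0;
}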

After reading in a picture, we get the number of rows and columns. The number of columns deserves attention, because we must not forget the number of channels! In the code above we multiply the number of columns by the number of channels and then use two nested loops. That is not the only way: we can also nest three loops and let the innermost one run over the channels. Each time we move to a new row, we first get the address of its first element with ptr.
In img.ptr<uchar>(i), ptr returns a pointer, uchar (unsigned char) is the element type, and i is the row index. The result is the address of the first element of row i.
I won't print the output here because it is too long. When debugging a project we may need to print values to locate a problem; a small picture is usually enough for that. My picture is too big!
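To make the continuity trick concrete without needing an image file, here is a minimal sketch in plain C++ that does not use OpenCV at all. The Image struct, makeImage and sumAll are hypothetical names invented for this illustration; step stands for the number of bytes per row, which may include padding at the end of each row. Padding bytes are filled with 255 so that a traversal that wrongly walks over them would give a visibly different sum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an 8-bit image buffer (not an OpenCV type).
struct Image {
	int rows;
	int cols;
	int channels;
	std::size_t step;                 // bytes per row, including any padding
	std::vector<unsigned char> data;  // rows * step bytes

	// Continuous means a row ends exactly where the next one begins.
	bool isContinuous() const {
		return step == static_cast<std::size_t>(cols) * channels;
	}
	// Address of the first byte of row r, in the spirit of ptr<uchar>(r).
	const unsigned char* ptr(int r) const {
		return data.data() + static_cast<std::size_t>(r) * step;
	}
};

// Builds an image whose channel values all equal `value`.
// Padding bytes are set to 255 so that reading them by mistake changes the sum.
Image makeImage(int rows, int cols, int channels, std::size_t padding, unsigned char value) {
	assert(rows > 0 && cols > 0 && channels > 0);
	Image img{rows, cols, channels, static_cast<std::size_t>(cols) * channels + padding, {}};
	img.data.assign(static_cast<std::size_t>(rows) * img.step, 255);
	for (int r = 0; r < rows; ++r)
		for (int j = 0; j < cols * channels; ++j)
			img.data[static_cast<std::size_t>(r) * img.step + j] = value;
	return img;
}

// Sums every channel value, flattening to a single long row when possible.
long sumAll(const Image& img) {
	int rows = img.rows;
	int cols = img.cols * img.channels;
	if (img.isContinuous()) {
		cols *= rows;  // must happen before rows is overwritten
		rows = 1;
	}
	long total = 0;
	for (int i = 0; i < rows; ++i) {
		const unsigned char* p = img.ptr(i);
		for (int j = 0; j < cols; ++j)
			total += p[j];
	}
	return total;
}
```

With a 2 x 3 image of three channels filled with 1, sumAll gives the same total whether the rows are padded or not, because the padded case falls back to the row-by-row loop.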

Before introducing iterator traversal, we first need another OpenCV data structure, Vec3b.
If a pixel has three channels, a Vec3b can store the value of that pixel, and the index operator "[]" returns the value of a specific channel.
(Note that the three values in Vec3b are all of type uchar.)

Everyone has probably met iterators already when using the containers of the STL. We only need the begin and end positions, and then we advance the iterator like a pointer until it reaches the end. The steps here are exactly the same: declare the iterator, get the begin and the end, and let the iterator run from one to the other:

#include<iostream>
#include<string>
#include<opencv2/opencv.hpp>
#include<opencv2/imgcodecs.hpp>
#include<opencv2/imgproc.hpp>
using namespace std;
using namespace cv;

const string file_path = "D://C++//kingjames//Debug//favorite.jpg";

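int main() {
	Mat img = imread(file_path);
	if (img.empty()) {
		cerr << "Could not read " << file_path << endl;
		return 1;
	}
	MatIterator_<Vec3b> it;  // Declare the iterator
	// begin() and end() give the first and the one-past-the-last pixel directly
	MatIterator_<Vec3b> last = img.end<Vec3b>();
	for (it = img.begin<Vec3b>(); it != last; ++it) {
		// Cast to int so that the channel values print as numbers
		cout << "B:" << (int)(*it)[0] << endl;
		cout << "G:" << (int)(*it)[1] << endl;
		cout << "R:" << (int)(*it)[2] << endl;
	}
	return 0;
}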

Of course, we can also use uchar when declaring the iterator, but that only applies to single-channel images.

Note: printing uchar data directly often produces weird symbols, because the stream treats the value as a character. Casting it to int fixes this!
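The effect is easy to reproduce in plain C++ without OpenCV. The sketch below uses two hypothetical helper functions, printedAsChar and printedAsNumber, that write a byte into a string stream with and without the cast.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Streams the byte as-is: operator<< treats unsigned char as a character.
std::string printedAsChar(unsigned char v) {
	std::ostringstream out;
	out << v;
	return out.str();
}

// Streams the byte after converting it to int, which prints the numeric value.
std::string printedAsNumber(unsigned char v) {
	std::ostringstream out;
	out << static_cast<int>(v);
	return out.str();
}
```

For the value 65, the first function produces the text A and the second produces 65, which is what we want when inspecting pixel values.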

The official tutorial also has a part about lookup tables. I'm not very interested in it and I haven't used it before, so I skip it here.

Filter operation

The recent AI craze remains high, to the point of obsession.
For most people, image filtering brings to mind photo retouching (skin smoothing? highlights?), or the "kernel" they have met in AI. Most of the time we just enjoy the powerful results of filtering or call an API directly, but we should also analyze how it is implemented.
Let's put the code first, which makes the explanation easier:

#include<iostream>
#include<string>
#include<opencv2/opencv.hpp>
#include<opencv2/imgcodecs.hpp>
#include<opencv2/imgproc.hpp>
using namespace std;
using namespace cv;

const string file_path = "D://C++//kingjames//Debug//favorite.jpg";

void Sharpen(Mat& A, Mat& B);

int main() {
	Mat img = imread(file_path);
	if (img.empty()) {
		cerr << "Could not read " << file_path << endl;
		return 1;
	}
	Mat B;
	B.create(img.size(), img.type());
	double t = (double)getTickCount();
	Sharpen(img, B);
	// Ticks divided by ticks per second gives the elapsed time in seconds
	cout << "Time is " << ((double)getTickCount() - t) / getTickFrequency() << " s" << endl;
	imshow("filter", B);
	waitKey(0);
	Mat dst = Mat::zeros(img.size(), img.type());
	Mat kernel = (Mat_<char>(3, 3) << 0, -1, 0, -1, 5, -1, 0, -1, 0);
	t = (double)getTickCount();
	filter2D(img, dst, dst.depth(), kernel);
	cout << "Time is " << ((double)getTickCount() - t) / getTickFrequency() << " s" << endl;
	imshow("Like", dst);
	waitKey(0);
	return 0;
}

void Sharpen(Mat& A, Mat& B) {
	CV_Assert(A.depth() == CV_8U);  // this loop only handles 8-bit images
	int row_num = A.rows;
	int col_num = A.cols;
	int channel = A.channels();
	for (int i = 1; i < row_num - 1; i++) {
		const uchar* up = A.ptr<uchar>(i - 1);  // previous row
		const uchar* pa = A.ptr<uchar>(i);      // current row
		const uchar* un = A.ptr<uchar>(i + 1);  // next row
		uchar* pb = B.ptr<uchar>(i);
		for (int j = channel; j < (col_num - 1) * channel; j++) {
			// Write to index j so that the output stays aligned with the input pixel
			pb[j] = saturate_cast<uchar>(5 * pa[j] - up[j] - un[j] - pa[j + channel] - pa[j - channel]);
		}
	}
	// The border pixels are not filtered, so set them to 0
	B.row(0).setTo(Scalar(0));
	B.row(row_num - 1).setTo(Scalar(0));
	B.col(0).setTo(Scalar(0));
	B.col(col_num - 1).setTo(Scalar(0));
}

What filtering does is change the pixel values of the original image. The core questions are: how do we change them? What effect does the change have? And to achieve a given effect, how do we design the weight matrix of the filter kernel?

Again take the official tutorial as an example.
It gives us the following weight matrix (the same values as the kernel passed to filter2D in the code above):

 0 -1  0
-1  5 -1
 0 -1  0

So how do we change a pixel value according to this matrix?
Keep an eye on the center of the matrix: it represents the pixel we are currently visiting!
The pixel we are visiting gets a weight of 5, and its four neighbors (up, down, left and right) get a weight of -1. The new value of the pixel therefore becomes:

I(i, j) = 5 * I(i, j) - [I(i-1, j) + I(i+1, j) + I(i, j-1) + I(i, j+1)]

This determines the loop bounds: when traversing the image we must go from the second row to the second-to-last row. The columns are more complicated, so take a picture with three channels as an example:

[Figure: one image row with three channels, stored as B G R B G R ...]

Take a closer look: does the column loop really start at 1? No! It starts at the number of channels. In the three-channel picture above, the first value we filter is the Blue of the second column. What is its index? It's 3, which is the number of channels! With that, it is not hard to see why the upper bound of the column loop is < (col - 1) * channel.

Once we understand the filtering process, the code is easy to write: the conditions of both nested loops are already known. We also need three pointers to const data, pointing to the row above, the current row and the row below the current pixel. The neighbors above and below are easy to find, because they sit at the same index j in their rows! The left and right neighbors, however, are not at j - 1 and j + 1 but at j - channel and j + channel. In the three-channel layout, the B of the second column corresponds to the B of the first column, so the offset is still 3.
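The indexing can be checked on a tiny buffer without OpenCV. The following is only a sketch of the same loop on a plain std::vector holding interleaved channel values with no row padding; sharpenBuffer and saturateToByte are hypothetical names, and saturateToByte merely imitates the clamping to the range 0 to 255 that saturate_cast<uchar> performs on an int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Clamp an int to the range of an 8-bit value.
unsigned char saturateToByte(int v) {
	return static_cast<unsigned char>(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Applies the 5 / -1 sharpening weights to an interleaved 8-bit buffer.
// Border pixels are left at 0, as in the listing above.
std::vector<unsigned char> sharpenBuffer(const std::vector<unsigned char>& src,
                                         int rows, int cols, int channels) {
	const std::size_t stride = static_cast<std::size_t>(cols) * channels;
	assert(src.size() == stride * static_cast<std::size_t>(rows));
	std::vector<unsigned char> dst(src.size(), 0);
	for (int i = 1; i < rows - 1; ++i) {
		const unsigned char* up = src.data() + static_cast<std::size_t>(i - 1) * stride;
		const unsigned char* cur = src.data() + static_cast<std::size_t>(i) * stride;
		const unsigned char* down = src.data() + static_cast<std::size_t>(i + 1) * stride;
		unsigned char* out = dst.data() + static_cast<std::size_t>(i) * stride;
		// Start at the first channel of the second column, stop before the last column.
		for (int j = channels; j < (cols - 1) * channels; ++j) {
			out[j] = saturateToByte(5 * cur[j] - up[j] - down[j]
			                        - cur[j - channels] - cur[j + channels]);
		}
	}
	return dst;
}
```

On a 3 x 3 single-channel buffer filled with 10, the center stays 10 (5 * 10 - 4 * 10). A center of 100 surrounded by zeros would give 500 and is clamped to 255, and a center of 0 surrounded by 50 would give -200 and is clamped to 0.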

Finally, the pixels that were not filtered, namely the first and last rows and the first and last columns, are set to 0. This is the simple way the official tutorial deals with them. I thought, wouldn't it be even simpler not to deal with them at all, haha~

Then we come to our favorite part, the API. OpenCV wraps the filtering for us in a function, filter2D. Its parameters are, in order, the source image, the destination image, the desired depth of the destination image, and the filter kernel.
With it we don't have to write the loops by hand every time. The filter matrix may be larger, for example 15 * 15, in which case a hand-written loop would have to start from row index 7... It really does free our hard-working hands.

A side note: filter2D is usually said to be faster, but that is not necessarily so. In my own tests it never came out faster. Something odd is going on here and I don't understand it yet!

Basic image operation

The most basic skill is reading and writing a picture. Reading uses the imread function.
The simplest form, already seen in the code above, takes only the file path as its parameter.
Sometimes all we want is a grayscale image, like this:

#include<iostream>
#include<string>
#include<opencv2/opencv.hpp>
#include<opencv2/imgcodecs.hpp>
#include<opencv2/imgproc.hpp>
using namespace std;
using namespace cv;

const string file_path = "D://C++//kingjames//Debug//favorite.jpg";

int main() {
	Mat img = imread(file_path, IMREAD_GRAYSCALE);
	if (img.empty()) {
		cerr << "Could not read " << file_path << endl;
		return 1;
	}
	imshow("im", img);
	waitKey(0);
	return 0;
}

What you get is a picture like this:
[Figure: the picture loaded as a grayscale image]
In fact there are many reading options. There is no need to introduce them one by one; you can explore them yourself:
[Figure: list of the available imread flags]
Reading and writing come as a pair. This is not like an OS or a database, where reading takes one lock and writing takes another, or where you need one permission to read and another to write.
The writing function is also very simple:

#include<iostream>
#include<string>
#include<opencv2/opencv.hpp>
#include<opencv2/imgcodecs.hpp>
#include<opencv2/imgproc.hpp>
using namespace std;
using namespace cv;

const string file_path = "D://C++//kingjames//Debug//favorite.jpg";

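int main() {
	Mat img = imread(file_path, IMREAD_GRAYSCALE);
	if (img.empty()) {
		cerr << "Could not read " << file_path << endl;
		return 1;
	}
	// imwrite returns false when the file could not be written
	if (!imwrite("D://new.png", img)) {
		cerr << "Could not write D://new.png" << endl;
		return 1;
	}
	return 0;
}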

You will find a new file, new.png, on drive D. It's that simple!
Note: if you don't have write permission on drive D, this may fail. Just switch to a location where you are allowed to write!

(Note: the following parts are taken directly from the official website!)
We have already traversed the image; indexing a single pixel works in much the same way. For a single-channel image we can use the at function, with uchar as the element type and y and x as the row and column of the pixel:

Scalar intensity = img.at<uchar>(y, x);

The most important thing to watch with this form is the order of the arguments: y (the row) comes first, then x (the column). If we prefer the (x, y) order, we can write:

Scalar intensity = img.at<uchar>(Point(x, y));
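The argument order follows from row-major storage: the row index selects the row first, and the column index is then an offset inside that row. Here is a minimal sketch in plain C++ (no OpenCV); GrayImage and PixelPoint are hypothetical types made up for this illustration, with PixelPoint using the same (x, y) order as Point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type using (x, y) order.
struct PixelPoint {
	int x;
	int y;
};

// Hypothetical single-channel image stored row by row.
struct GrayImage {
	int rows;
	int cols;
	std::vector<unsigned char> data;  // rows * cols values, row-major

	// Row first, then column: this is the (y, x) order.
	unsigned char at(int row, int col) const {
		assert(row >= 0 && row < rows && col >= 0 && col < cols);
		return data[static_cast<std::size_t>(row) * cols + col];
	}
	// The point overload swaps the order internally.
	unsigned char at(PixelPoint p) const {
		return at(p.y, p.x);
	}
};
```

For a 2 x 3 image holding the values 0 to 5, at(1, 2) and at(PixelPoint{2, 1}) both refer to the last value, 5.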

The three-channel index method has to use Vec3b:

            Vec3b intensity = img.at<Vec3b>(y, x);
            uchar blue = intensity.val[0];
            uchar green = intensity.val[1];
            uchar red = intensity.val[2];

This indexing method can also be used to directly modify the value corresponding to its position.

Next, we select a region that we are interested in. We have to decide its shape ourselves; the simplest one, a rectangle, is shown here.
The first two parameters of Rect fix the position of the top-left corner, and the last two are the width and the height.
Rect can also be initialized in other ways, for example by passing in two points.

            Rect r(10, 10, 100, 100);
            Mat smallImg = img(r);
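One point worth knowing is that img(r) does not copy pixels: smallImg is a new header that refers to the same data as img, so writing through smallImg changes img as well (use clone() when an independent copy is needed). The idea can be sketched in plain C++ without OpenCV; GrayView and its roi method are hypothetical names for this illustration, and step is the distance in bytes between the starts of two consecutive rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view onto a single-channel 8-bit buffer.
struct GrayView {
	unsigned char* data;  // first byte of the view
	int rows;
	int cols;
	std::size_t step;     // bytes between the starts of consecutive rows

	unsigned char& at(int row, int col) {
		assert(row >= 0 && row < rows && col >= 0 && col < cols);
		return data[static_cast<std::size_t>(row) * step + col];
	}
	// Sub-view with top-left corner (x, y); it keeps the parent's step.
	GrayView roi(int x, int y, int width, int height) {
		assert(x >= 0 && y >= 0 && width > 0 && height > 0);
		assert(x + width <= cols && y + height <= rows);
		return GrayView{data + static_cast<std::size_t>(y) * step + x, height, width, step};
	}
	bool isContinuous() const {
		return step == static_cast<std::size_t>(cols);
	}
};
```

Because the sub-view keeps the step of its parent, its rows are no longer adjacent in memory. That is exactly the situation in which the continuity check from the traversal section matters.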

The color space conversion mentioned in the second section only needs a single cvtColor call.
There are many conversion codes and I won't explain them one by one. Look one up when you need it; many of them are rarely used anyway.
[Figure: list of cvtColor color conversion codes]
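As one concrete case, the OpenCV documentation gives the color-to-gray conversion as the weighted sum Y = 0.299 * R + 0.587 * G + 0.114 * B. The sketch below applies that formula to a single pixel in plain C++; bgrToGray is a hypothetical helper, and it rounds with floating-point arithmetic, so results from cvtColor itself may differ slightly in rounding.

```cpp
#include <cassert>
#include <cmath>

// Weighted sum of the three channels, given in B, G, R order.
unsigned char bgrToGray(unsigned char blue, unsigned char green, unsigned char red) {
	const double y = 0.114 * blue + 0.587 * green + 0.299 * red;
	return static_cast<unsigned char>(std::lround(y));
}
```

Green contributes the most and blue the least, which is why a pure green pixel ends up much brighter in the gray image than a pure blue one.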

The next step is changing the storage type of the picture.
The 8-bit unsigned char type is the most common, but there are many others, and converting between them is done with convertTo (the list is long anyway):
[Figure]
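Besides the target type, convertTo accepts an optional scale factor alpha and offset beta, and every element becomes alpha * value + beta in the new type. The following is a rough sketch of that per-element rule for the 8-bit to float direction, in plain C++ without OpenCV; toFloat is a hypothetical helper, not an OpenCV function.

```cpp
#include <cassert>
#include <vector>

// Converts 8-bit values to float, applying alpha * value + beta to each element.
std::vector<float> toFloat(const std::vector<unsigned char>& src, double alpha, double beta) {
	std::vector<float> dst;
	dst.reserve(src.size());
	for (unsigned char v : src)
		dst.push_back(static_cast<float>(alpha * v + beta));
	return dst;
}
```

A typical use of the scale factor is alpha = 1.0 / 255 with beta = 0, which maps 8-bit values into the range from 0 to 1.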
As for displaying pictures, you have surely seen it often enough by now: it's imshow.
The first parameter of imshow is the name of the window, and the second is the Mat to display.
We need a waitKey call after it to keep the window open, otherwise the picture will not be displayed!

That's basically it.
To borrow Guo Jerry's famous line: "Okay, then I'm leaving now, bye-bye!"

Origin blog.csdn.net/kingvingjames/article/details/115027205