Jizhi AI | Understanding of Mask and ROI in Image Processing


Welcome to follow my WeChat official account [极智视界] for more of my notes.

  Hello everyone, I am Jizhi Vision. This article explains masks and ROIs in image processing.

  Masks have long been used in traditional image processing, and they have become even more prominent in deep learning, for example: (1) masking out parts of the input as a "cloze"-style (fill-in-the-blank) data augmentation in self-supervised learning; (2) the masks used in newer models such as Swin Transformer. Masks therefore play an important role in deep learning as well. For clarity, this article introduces masks and ROIs through traditional image processing; the idea is basically the same as the mask used in deep learning.

1. Understanding of ROI

  An ROI is easy to understand. ROI stands for region of interest, that is, the area of the image to be operated on.

2. Understanding of masks

  • A mask is an 8-bit single-channel image (grayscale or binary);
  • Where the mask is 0, the operation has no effect at that position;
  • Where the mask is nonzero, the operation takes effect at that position; these positions form the ROI;
  • A mask can therefore be used to extract an irregular ROI.

  Illustration:

   source          mask           dst
125 100  85     0   0   0      0   0   0
 66  25  35     0 255   0      0  25   0
120 125 100     0   0   0      0   0   0
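  The illustration can be reproduced with a small sketch in plain C++. To keep it runnable without OpenCV, it assumes images are stored as flat row-major byte buffers; the names `applyMask` and `illustrationExample` are hypothetical helpers for this example, not OpenCV functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy src into the result wherever mask is nonzero; all other positions stay 0.
// Both buffers are assumed to be 8-bit, single-channel, row-major, same size.
std::vector<std::uint8_t> applyMask(const std::vector<std::uint8_t>& src,
                                    const std::vector<std::uint8_t>& mask) {
    assert(src.size() == mask.size());
    std::vector<std::uint8_t> dst(src.size(), 0);
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (mask[i] != 0) {
            dst[i] = src[i];
        }
    }
    return dst;
}

// The 3x3 source and mask from the illustration, flattened row by row.
std::vector<std::uint8_t> illustrationExample() {
    const std::vector<std::uint8_t> source = {125, 100, 85,
                                               66,  25, 35,
                                              120, 125, 100};
    const std::vector<std::uint8_t> mask = {0,   0, 0,
                                            0, 255, 0,
                                            0,   0, 0};
    return applyMask(source, mask);  // only the centre value survives
}
```

  Only the centre position has a nonzero mask value, so the centre pixel (25) is the only one copied into the result. With OpenCV itself, `img.copyTo(dst, mask)` is the usual way to get the same effect.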

3. Why use an image mask

  For a regular ROI, such as the commonly used rectangle (Rect), the image data in the region of interest is easy to access, as follows:

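// Assumes img is an 8-bit single-channel cv::Mat and rect lies inside the image.
for(int y = rect.tl().y; y < rect.br().y; ++y)
{
	uchar* imgRow = img.ptr<uchar>(y);   // pointer to the start of row y
	for(int x = rect.tl().x; x < rect.br().x; ++x)
	{
		// ... operate on imgRow[x] ...
	}
}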

  For an irregular ROI, however, it is difficult to extract the image data in the region of interest directly. This is where the mask comes in. Assuming the mask image has the same size as the source image:

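// Assumes img and mask are 8-bit single-channel cv::Mat objects of the same size.
cv::Mat dst = cv::Mat::zeros(img.size(), CV_8UC1);
for(int y = 0; y < img.rows; ++y)
{
	// ptr<uchar>(y) uses each Mat's own row stride, which may differ between img, mask and dst
	const uchar* imgRow = img.ptr<uchar>(y);
	const uchar* maskRow = mask.ptr<uchar>(y);
	uchar* dstRow = dst.ptr<uchar>(y);
	for(int x = 0; x < img.cols; ++x)
	{
		if(maskRow[x] > 0)
		{
			dstRow[x] = imgRow[x];   // ... or any other operation on imgRow[x] ...
		}
	}
}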
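  To make the "irregular ROI" case concrete, here is a minimal sketch in plain C++ that builds a circular mask and then runs an operation only inside it. The flat row-major buffers, the helper names `makeCircleMask` and `maskedMean`, and the choice of a mean as the operation are all assumptions made for illustration; the loops above leave the actual operation open.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a rows x cols mask that is 255 inside the circle centred at (cx, cy)
// with the given radius, and 0 everywhere else.
std::vector<std::uint8_t> makeCircleMask(int rows, int cols, int cx, int cy, int radius) {
    assert(rows > 0 && cols > 0 && radius >= 0);
    std::vector<std::uint8_t> mask(static_cast<std::size_t>(rows) * cols, 0);
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            const int dx = x - cx;
            const int dy = y - cy;
            if (dx * dx + dy * dy <= radius * radius) {
                mask[static_cast<std::size_t>(y) * cols + x] = 255;
            }
        }
    }
    return mask;
}

// Mean of the image values at nonzero mask positions.
// Returns -1.0 when the mask selects no pixels.
double maskedMean(const std::vector<std::uint8_t>& img,
                  const std::vector<std::uint8_t>& mask) {
    assert(img.size() == mask.size());
    unsigned long long sum = 0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < img.size(); ++i) {
        if (mask[i] != 0) {
            sum += img[i];
            ++count;
        }
    }
    return count == 0 ? -1.0 : static_cast<double>(sum) / static_cast<double>(count);
}
```

  The mask carries the shape of the region, so the per-pixel loop stays the same whether the ROI is a circle, a polygon, or any other shape.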

4. Image fusion using masks

  Example of image fusion using masks:

  In the example images, the OpenCV logo on the right is fused into the vehicle image on the left. The OpenCV logo has an irregular shape, so this is a good case for a mask of the same size as the logo image on the right: only the positions where the mask is nonzero are copied into the vehicle image. The result is a single image in which the logo appears inside the vehicle image.
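  In code, a fusion of this kind can be sketched in plain C++ as below. This is not the code behind the example images; it assumes 8-bit single-channel images in flat row-major buffers, and the function name `fuseWithMask` and its offset parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixels = std::vector<std::uint8_t>;

// Copy logo pixels into the background at (offsetX, offsetY), but only where
// the logo-sized mask is nonzero. Pixels that would fall outside the
// background are skipped.
void fuseWithMask(Pixels& background, int bgRows, int bgCols,
                  const Pixels& logo, const Pixels& mask,
                  int logoRows, int logoCols, int offsetX, int offsetY) {
    assert(background.size() == static_cast<std::size_t>(bgRows) * bgCols);
    assert(logo.size() == static_cast<std::size_t>(logoRows) * logoCols);
    assert(mask.size() == logo.size());
    for (int y = 0; y < logoRows; ++y) {
        const int by = y + offsetY;
        if (by < 0 || by >= bgRows) continue;
        for (int x = 0; x < logoCols; ++x) {
            const int bx = x + offsetX;
            if (bx < 0 || bx >= bgCols) continue;
            const std::size_t li = static_cast<std::size_t>(y) * logoCols + x;
            if (mask[li] != 0) {
                background[static_cast<std::size_t>(by) * bgCols + bx] = logo[li];
            }
        }
    }
}
```

  With OpenCV, the same effect is usually obtained through the mask argument of `cv::Mat::copyTo`, for example `logo.copyTo(background(roi), mask)`, where `roi` is assumed to be a `cv::Rect` of the logo's size inside the background.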


  That concludes this introduction to masks and ROIs in image processing. I hope it helps you a little in your studies.



Origin juejin.im/post/7113489244684812302