Understanding Image Binarization Algorithms in One Article

Traditional machine vision usually involves two steps: preprocessing and object detection. The bridge between the two is image segmentation [1]. Image segmentation makes an image easier to analyze by simplifying or changing its representation.

For example, suppose a food processing plant has received a new batch of broiler chickens and wants to judge visually how delicious they are. After preprocessing and enhancing the image, the machine must first separate the chicken from the background, and then analyze the region of interest on its own in order to make a quick and accurate judgment.

Visual processing in food processing plants

However, image segmentation is not easy for a dim-witted AI. A smart human can tell at a glance what in the picture below can and cannot be eaten, but a computer has to work hard to separate these things.

Original image

Image segmentation result

The simplest image segmentation method is binarization.

Image binarization sets the gray value of every pixel in an image to either 0 or 255, so that the whole image is rendered in stark black and white. Each pixel of a binary image takes one of only two values: pure black or pure white.

Comparison of color image, gray image, and binary image

Because binary image data is so simple, many vision algorithms rely on binary images. A binary image makes it easier to analyze the shape and contour of an object. It is also often used as a mask for the original image: it acts like a sheet of paper with parts cut out, covering the areas we are not interested in. There are many ways to binarize an image; the most common is thresholding.
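As a rough illustration of the masking idea, the sketch below keeps the pixels of a grayscale image wherever a binary mask is white and blanks out the rest. The function name `apply_mask` and the tiny sample arrays are made up for this example, and plain Python lists are used so that it runs without any libraries.

```python
def apply_mask(image, mask):
    """Keep image pixels where the mask is white (255); set the rest to 0."""
    return [
        [pixel if m == 255 else 0 for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]

# Hypothetical 3x3 grayscale image and a mask that exposes only the middle column.
image = [
    [10, 200, 30],
    [40, 150, 60],
    [70, 180, 90],
]
mask = [
    [0, 255, 0],
    [0, 255, 0],
    [0, 255, 0],
]

masked = apply_mask(image, mask)
print(masked)
```

Only the middle column survives; everything under the black part of the mask is hidden.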

In computer vision, an image is generally represented as a matrix. In other words, no matter how delicious your pictures look, to the computer they are nothing more than a matrix.

Each pixel is one element of this matrix. In a three-channel color image, that element is a tuple of three numbers.

Color three-channel image

In a single-channel grayscale image, each element is a single number that represents the brightness of the image at that point. The larger the number, the brighter the pixel. In the common 8-bit single-channel format, 0 represents pure black and 255 represents pure white.

Single-channel grayscale image
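To make the matrix view concrete, here is a small sketch of a three-channel image stored as nested lists of tuples, together with one possible conversion to a single-channel grayscale matrix. The helper `to_gray` and the sample pixels are hypothetical, and the weights are the widely used ITU-R BT.601 luma coefficients, which is an assumption of this sketch rather than something the article prescribes.

```python
# A hypothetical 2x2 three-channel image: each element is an (R, G, B) tuple.
color_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def to_gray(pixel):
    """Convert one (R, G, B) pixel to a single 8-bit brightness value.

    The weights are the common BT.601 luma coefficients (an assumption here).
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# The grayscale image is a matrix of plain numbers between 0 and 255.
gray_image = [[to_gray(p) for p in row] for row in color_image]
print(gray_image)
```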

The threshold method picks a number: pixels brighter than it are treated as pure white, and pixels darker than it as pure black. It works like a light switch in a classroom: we push it gently, and once it passes a certain point the light snaps on.
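The rule above fits in a few lines. This is only a minimal sketch: the function name `binarize` and the sample values are invented for illustration, and sending pixels exactly equal to the threshold to black is a convention chosen here (it happens to match `cv2.threshold` with `cv2.THRESH_BINARY` in OpenCV).

```python
def binarize(gray, threshold):
    """Return a binary image: 255 where a pixel is brighter than the threshold, else 0."""
    return [[255 if p > threshold else 0 for p in row] for row in gray]

# Hypothetical 2x3 grayscale image.
gray = [
    [12, 80, 200],
    [130, 127, 128],
]

binary = binarize(gray, 127)
print(binary)
```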

Depending on how the threshold is chosen, thresholding can be divided into global and local methods.

1. Global Threshold (Global Method)

A global threshold means that the same threshold is used for every pixel in the image. We can try this operation in Photoshop under Image > Adjustments > Threshold:

Threshold in Photoshop

As the threshold level moves from 1 to 255, more and more of the image turns black. When the threshold lies within a certain range, the outline of the red rice roll is clearly distinguishable.

Correct binarization makes the outline of the red rice roll clear and recognizable

On a production line the lighting is known, so a fixed number is often used as the global threshold. Outdoors or at robot competitions, however, lighting conditions are often more complicated.

The same Oreo ice cream may look different to the camera by day and by night, and a constant threshold cannot suit both situations at once.

Bright and dark versions of the picture

For the darker night-time picture we need a lower threshold. Setting it to 50, for example, cleanly separates dark from light at night, but turns the daytime picture entirely white (left). If we set the threshold higher, say 172, the daytime picture is segmented well, but the night-time picture becomes entirely black (right). We need an algorithm that can adapt to complex environments.

Threshold on the left = 50, threshold on the right = 172
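The same effect can be reproduced on toy data. In the sketch below, the brightness values for the night and day versions of one dark/light pattern are made up; only the thresholds 50 and 172 come from the text.

```python
def binarize(gray, threshold):
    """Return 255 where a pixel is brighter than the threshold, else 0."""
    return [[255 if p > threshold else 0 for p in row] for row in gray]

# Made-up brightness values for the same dark/light pattern, shot at night and by day.
night = [[20, 80], [80, 20]]
day = [[120, 230], [230, 120]]

night_low, day_low = binarize(night, 50), binarize(day, 50)      # low threshold
night_high, day_high = binarize(night, 172), binarize(day, 172)  # high threshold

print(night_low, day_low)    # night is separated, day is all white
print(night_high, day_high)  # night is all black, day is separated
```

Neither fixed value handles both pictures, which is the motivation for choosing the threshold from the image itself.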

In fact, a little analysis shows that the tonal contrast in this image is quite pronounced: there are only two tones, dark and light. So whether it is day or night, its histogram should show two distinct peaks, one for the dark areas and one for the light areas. The histogram is merely shifted to the right during the day and to the left at night.

Image histogram

If you choose the trough between the two peaks as the threshold, you can easily separate the two kinds of pixels. In practice, however, image histograms are often jagged, full of spikes and jitter, so the exact minimum is very hard to locate.
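A small sketch shows both the histogram and the problem with naive trough hunting. The helper names `histogram` and `local_minima` and the 8-level toy image are assumptions made for this illustration; real 8-bit images have 256 levels.

```python
def histogram(gray, levels=256):
    """Count how many pixels take each gray level from 0 to levels - 1."""
    counts = [0] * levels
    for row in gray:
        for p in row:
            counts[p] += 1
    return counts

def local_minima(counts):
    """Indices whose count is strictly lower than both neighbouring counts."""
    return [
        i for i in range(1, len(counts) - 1)
        if counts[i] < counts[i - 1] and counts[i] < counts[i + 1]
    ]

# Hypothetical image with only 8 gray levels (0..7) to keep the numbers readable.
gray = [
    [1, 1, 1, 1, 1, 0],
    [2, 2, 2, 3, 3, 3],
    [3, 4, 5, 5, 6, 6],
    [6, 6, 6, 6, 7, 7],
]

counts = histogram(gray, levels=8)
troughs = local_minima(counts)
print(counts, troughs)
```

Even this tiny histogram has more than one local minimum, so "the trough" is ambiguous unless we define more carefully what we are looking for.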

The Japanese engineer Nobuyuki Otsu found a suitable mathematical expression for this trough and published it in 1979 [2]. This binarization method is called Otsu's method. Otsu's algorithm can be viewed as a discrete analogue of one-dimensional Fisher discriminant analysis. It uses an exhaustive search to find the threshold that splits the pixels into two classes such that the intra-class variance of their brightness is smallest. The intra-class variance is the weighted sum of the variances of the two classes, where each weight is the fraction of the image's pixels that fall in that class.
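The description above translates fairly directly into code. The following is a minimal, unoptimized sketch, not a reference implementation: the function name `otsu_threshold`, the convention that pixels less than or equal to the threshold form the dark class, and the sample image are all assumptions of this example. Practical implementations usually maximize the between-class variance instead, which gives the same threshold with less work; in OpenCV the method is available through the `cv2.THRESH_OTSU` flag of `cv2.threshold`.

```python
def otsu_threshold(gray, levels=256):
    """Exhaustively search for the threshold with the smallest intra-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    counts = [0] * levels
    for row in gray:
        for p in row:
            counts[p] += 1
    total = sum(counts)

    def weight_and_variance(lo, hi):
        """Pixel share and brightness variance of gray levels lo..hi-1."""
        n = sum(counts[lo:hi])
        if n == 0:
            return 0.0, 0.0
        mean = sum(v * counts[v] for v in range(lo, hi)) / n
        var = sum(counts[v] * (v - mean) ** 2 for v in range(lo, hi)) / n
        return n / total, var

    best_t, best_score = 0, float("inf")
    for t in range(levels - 1):
        w0, var0 = weight_and_variance(0, t + 1)
        w1, var1 = weight_and_variance(t + 1, levels)
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so this is not a real split
        score = w0 * var0 + w1 * var1  # weighted sum of the two class variances
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical image with one dark cluster and one bright cluster.
gray = [
    [20, 22, 25, 21],
    [200, 205, 210, 198],
]
t = otsu_threshold(gray)
binary = [[255 if p > t else 0 for p in row] for row in gray]
print(t, binary)
```

The threshold found lands in the gap between the two clusters, so the dark row turns black and the bright row turns white without anyone choosing a number by hand.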

Your picture may contain more than two clearly different blocks of color. The histogram of this ice cream, for example, has three sharp peaks.

Three-color ice cream (histogram taken over the ice cream region)

In this case, a slight extension of Otsu's algorithm is enough. The multi-level generalization of Otsu's algorithm is called the multi-Otsu method [3].
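As a sketch of what the extension looks like, the code below searches by brute force for two thresholds that split the pixels into three classes with the smallest total intra-class variance. This is a naive search written for clarity, not the fast algorithm of reference [3]; the function name `multi_otsu_thresholds` and the sample image are hypothetical.

```python
from itertools import combinations

def multi_otsu_thresholds(gray, classes=3, levels=256):
    """Brute-force search for classes - 1 thresholds minimising intra-class variance."""
    counts = [0] * levels
    for row in gray:
        for p in row:
            counts[p] += 1

    # Prefix sums give the pixel count, sum and sum of squares of any
    # gray-level range in constant time.
    n_pre, s_pre, q_pre = [0], [0], [0]
    for v, c in enumerate(counts):
        n_pre.append(n_pre[-1] + c)
        s_pre.append(s_pre[-1] + c * v)
        q_pre.append(q_pre[-1] + c * v * v)

    def scatter(lo, hi):
        """Sum of squared deviations from the class mean for gray levels lo..hi-1."""
        n = n_pre[hi] - n_pre[lo]
        if n == 0:
            return None  # empty class
        s = s_pre[hi] - s_pre[lo]
        q = q_pre[hi] - q_pre[lo]
        return q - s * s / n

    # Minimising the summed scatter is the same as minimising the weighted sum
    # of class variances, because the two differ only by the factor 1 / total.
    best, best_score = None, float("inf")
    for cuts in combinations(range(levels - 1), classes - 1):
        edges = [0] + [t + 1 for t in cuts] + [levels]
        parts = [scatter(lo, hi) for lo, hi in zip(edges, edges[1:])]
        if None in parts:
            continue
        score = sum(parts)
        if score < best_score:
            best, best_score = cuts, score
    return best

# Hypothetical image with three brightness clusters.
gray = [
    [10, 12, 11],
    [120, 125, 122],
    [240, 238, 242],
]
thresholds = multi_otsu_thresholds(gray)
print(thresholds)
```

With more classes the number of threshold combinations grows quickly, which is why faster formulations such as the one in [3] exist.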

2. Local Threshold (Local Method)

The local method is also called adaptive thresholding.

During competitions, spotlights often shine on one specific area, so that part of the scene is lit and part is not.

Partially illuminated image

When a global threshold is applied to a partially illuminated image, we may run into the awkward situation where no threshold value works for the whole image. In the image above, for example, once all the sushi in the upper-left half has become visible under a global threshold, the lower-right half is still black.

Global thresholding of a partially illuminated image

This is where the local threshold comes in. In fact, the human eye performs a similar step. Our judgment of how dark or light an object is tends to be influenced by the colors around it, which is why teeth look whiter against a darker complexion.

The local threshold method assumes that illumination is roughly uniform within a small region of the image. It scans the image with a sliding window and compares the brightness of the window's center point with the brightness of the rest of the window (called the neighborhood). If the center point is brighter than its neighborhood, it is marked white; otherwise it is marked black.

Local threshold sliding window

What is described here is the most basic local threshold method. For other local threshold methods commonly used in practice, see the Chow-Kaneko adaptive threshold method [4].
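The basic sliding-window comparison can be sketched as follows. Several details are assumptions of this example rather than part of the method as stated: the neighborhood brightness is taken to be the mean of the other pixels in the window, the window is clipped at the image border, and the name `local_threshold` and the striped test image are invented. OpenCV's `cv2.adaptiveThreshold` implements a closely related idea with a few more parameters.

```python
def local_threshold(gray, radius=1):
    """Mark a pixel white if it is brighter than the mean of its neighbours.

    The window is (2 * radius + 1) pixels wide and is clipped at the image border.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbours = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
                if (j, i) != (y, x)
            ]
            if neighbours and gray[y][x] > sum(neighbours) / len(neighbours):
                out[y][x] = 255
    return out

# Hypothetical striped pattern (paper, ink, paper, ink, ...) under lighting that
# gets brighter from left to right: the ink on the right (160) is brighter than
# the paper on the left (50), so no single global threshold can separate them.
row = [50, 40, 110, 100, 170, 160]
gray = [row, row, row]

binary = local_threshold(gray)
print(binary[0])
```

Every paper column comes out white and every ink column black, even though their absolute brightness ranges overlap.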

Local thresholding is widely used and is especially effective for black text on white paper. Many algorithms for optical character recognition (OCR) and QR code scanning use local threshold operations.

For example, the QR code below is a typical partially illuminated image:

Partially illuminated QR code

If a global threshold is used for this picture (the image below, for example, was segmented with Otsu's algorithm), the picture cannot be segmented correctly no matter what.

The global method cannot handle the partially illuminated image

The local threshold method segments the image well. The picture also clearly shows that the local threshold method is more sensitive to detail in large, clean areas, which is why the result contains many specks on the paper that we had not noticed.

QR code segmented with the local method

In practice, we have to choose a binarization method to suit the task; no method is perfect.

For example, when identifying enemy robots, the light bars on the armor plates emit their own light and are therefore less affected by ambient light. To keep the program efficient, we use a fixed number as the global threshold:

Base automatic counterattack

When recognizing the energy mechanism, which has only two colors, black and white, we use the Otsu algorithm and its many variants:

Binary image of each region of the large energy mechanism

When the aerial robot reads the QR code in the base area, the local threshold method is used:

Aerial robot recognizing the base

Besides threshold-based methods, commonly used segmentation methods can also be based on edges (such as the Yanowitz-Bruckstein adaptive threshold method [5]) or regions (such as the region growing algorithm [6]), among others. They play a major role in satellite image processing, traffic control systems, industrial production monitoring, medical imaging and other fields.

Brain tissue image segmentation

For an OpenCV implementation of the threshold methods described in this article, see the blog post: Python+OpenCV image processing experiment

Project effect

References

[1] Spirkovska, L. (1993). A summary of image segmentation techniques.

[2] Nobuyuki Otsu (1979). "A threshold selection method from gray-level histograms". IEEE Trans. Sys., Man., Cyber. 9 (1): 62–66.

[3] Ping-Sung Liao, Tse-Sheng Chen and Pau-Choo Chung (2001). "A Fast Algorithm for Multilevel Thresholding". J. Inf. Sci. Eng. 17 (5): 713–727.

[4] Chow, C. K., and Kaneko, T. (1972). Boundary Detection of Radiographic Images by a Thresholding Method. Frontiers of Pattern Recognition, S. Watanabe, ed., Academic Press, New York, pp. 61-82.

[5] Yanowitz, S. D., & Bruckstein, A. M. (1988, November). A new method for image segmentation. In Pattern Recognition, 1988., 9th International Conference on (pp. 270-275). IEEE.

[6] Richardson, H. W. (1973). Regional growth theory. Macmillan.

Origin blog.csdn.net/m0_38106923/article/details/115206093