OpenCV4 image segmentation: the watershed algorithm

1. Watershed concept

The watershed method is a segmentation algorithm that groups pixels into regions based on differences in their gray values. Think of each pixel's gray value as a height, so that an image becomes a relief map of hills and valleys. If a certain amount of water is poured onto the low-lying areas, the water covers everything below a certain level.
The watershed method pours water into many local minima at the same time. As more water is injected, the water level rises and "submerges" the areas with smaller pixel values. Eventually the water in two adjacent basins meets, and a watershed line forms where they merge.

2. Watershed processing steps

Syntax: result_img = cv2.watershed(img, masker)
img: the original image
masker: the seed (marker) information for the watershed, containing the sure foreground, the sure background, and the unknown region; this becomes clearer after reading the full article. A minimal synthetic example follows the processing steps below.
result_img: the segmented image; edge pixels are marked with the value -1

  1. Find the background and mark it
  2. Find the foreground and mark it
  3. Mark the unknown region (i.e. the edges)
  4. Construct the marker image
  5. Run the watershed segmentation
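
Before walking through the coin example, here is a minimal sketch of how the marker convention works, on a small synthetic image I made up purely for illustration (the array sizes, seed positions, and label values are arbitrary): 0 means unknown, positive labels are seeds, and the returned result uses -1 for the boundaries.

import cv2
import numpy as np

# Synthetic test image: two gray squares on a black background (illustration only)
img = np.zeros((100, 100, 3), np.uint8)
cv2.rectangle(img, (10, 10), (40, 40), (200, 200, 200), -1)
cv2.rectangle(img, (60, 60), (90, 90), (200, 200, 200), -1)

# Marker image: 0 = unknown, 1 = sure background, 2 and 3 = sure foreground seeds
marker = np.zeros((100, 100), np.int32)
marker[5, 5] = 1      # a sure-background pixel
marker[25, 25] = 2    # a seed inside the first square
marker[75, 75] = 3    # a seed inside the second square

result = cv2.watershed(img, marker)

# Pixels on the computed boundaries are labelled -1
print(np.unique(result))  # e.g. [-1  1  2  3]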

3. Code example

3.1 Image binarization

When the image is slightly more complex, you can let OpenCV choose the threshold automatically (Otsu's method) when binarizing, which avoids the inaccuracy of picking a value by hand.

import cv2
import numpy as np

# First, binarize the image
img = cv2.imread('./image/water_coins.jpeg')
gary = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# The fourth argument includes THRESH_OTSU, which lets the algorithm find a suitable threshold by itself
ret1, thresh = cv2.threshold(gary, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

Binarized image:
[Figure: binarization result]

3.2 Morphological operations

Morphological opening first removes the noise inside the coins; the coins are then dilated to enlarge them and shrink the background, so that whatever remains outside is guaranteed to be background.

# Opening: first remove the noise inside the coins
kernel = np.ones((3, 3), np.uint8)
open1 = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)

# Dilation (enlarge the coins, shrink the background) so the remaining background is definitely background
bg = cv2.dilate(open1, kernel, iterations=1)

3.3 Get foreground

The farther a pixel is from the center of a coin, the more likely it is background; the closer it is to the center, the more likely it is foreground.
In principle, the coins could be shrunk by erosion so that what remains is certainly coin, but the coins touch each other and erosion cannot separate them well. As shown in the picture below, even after four erosions the coins are still not well separated.
[Figure: result after four erosions]
Obtaining the foreground with a distance transform:
In two-dimensional space, a binary image contains only two kinds of pixels: target pixels with value 1 and background pixels with value 0. The result of the distance transform is not a binary image but a grayscale "distance image", in which the value of each pixel is the distance from that target (1) pixel to the nearest background (0) pixel.

Syntax: dist = cv2.distanceTransform(img, distanceType, maskSize)
img: the binary image whose distances are computed
distanceType: how distance is measured, e.g. DIST_L1 (sum of absolute differences, |x1 - x2| + |y1 - y2|) or DIST_L2 (Euclidean distance, from the Pythagorean theorem)
maskSize: kernel size, 3 for DIST_L1 and 5 for DIST_L2
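
To make the "distance to the nearest background pixel" idea concrete, here is a tiny sketch on a made-up 5x5 binary array (the values are only for illustration, not from the coin image):

import cv2
import numpy as np

# A 5x5 binary image: a 3x3 block of foreground (255) surrounded by background (0)
small = np.zeros((5, 5), np.uint8)
small[1:4, 1:4] = 255

dist = cv2.distanceTransform(small, cv2.DIST_L2, 5)
print(np.round(dist, 2))
# The centre pixel is farthest from any background pixel, so it gets the
# largest value; the background pixels stay 0.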

# Obtain the foreground (shrink the coin regions) so those pixels are definitely coin (and cannot be confused with the boundary)
dist = cv2.distanceTransform(open1, cv2.DIST_L2, 5)
# Only keep pixels above 70% of the maximum distance
ret2, fg = cv2.threshold(dist, 0.7*dist.max(), 255, cv2.THRESH_BINARY)

# Plot the distance image with matplotlib: the farther from the background, the brighter
# import matplotlib.pyplot as plt
# plt.imshow(dist, cmap='gray')
# plt.show()
# exit()

As shown in the picture below, the center of each coin is farthest from the background and is therefore the brightest; pixels whose distance exceeds 70 percent of the maximum are kept as foreground.
[Figure: distance transform result]

3.4 Get the edge (unknown area)

The edge (unknown) region is obtained by subtracting the foreground from the dilated background. (If we did not need to find the edge, the foreground could be obtained directly by erosion and the distance transform would not be needed.)

# Obtain the unknown region; the edges must lie inside it
fg = np.uint8(fg)
unknow = cv2.subtract(bg, fg)

[Figure: unknown region (background minus foreground)]

3.5 Build markers

The key to the watershed method is building a good marker image. First, the connected-components function labels each connected foreground region with a non-zero value and sets the background to 0. But if the background stayed at 0, the watershed function would treat it as an unknown region, so we add 1 to every label to make sure the background is non-zero, and only then set the unknown region to 0.

Function for finding connected components
Syntax: num_objects, labels = cv2.connectedComponents(img)
img: the input image, which must be a binary, 8-bit single-channel image
num_objects: the number of labels (the background counts as one of them)
labels: the label of each pixel, written as 1, 2, 3, ... (different numbers mark different connected components), with 0 marking the background
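
A tiny sketch of the label convention, on a made-up 6x6 array (values chosen only for illustration):

import cv2
import numpy as np

# Two separate foreground blobs in an 8-bit single-channel image
blob = np.zeros((6, 6), np.uint8)
blob[1:3, 1:3] = 255   # first blob
blob[4:6, 4:6] = 255   # second blob

num_objects, labels = cv2.connectedComponents(blob)
print(num_objects)   # 3: the background label 0 plus the two blobs
print(labels)        # 0 marks the background, 1 and 2 mark the two components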

# Compute the connected components and construct the marker
ret, marker = cv2.connectedComponents(fg)
# Add 1 to every label so the background is no longer 0
marker = marker + 1
# Mark only the unknown region as 0
marker[unknow==255] = 0

3.6 Image Segmentation

Segment the image using the constructed marker. The watershed function labels edge pixels with the value -1, so at the end we only need to color those edge pixels.

# Watershed
result = cv2.watershed(img, marker)

# Draw the edges in red
img[result==-1] = [0, 0, 255]
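
One possible way to inspect the result (not part of the original code; the window names and the random coloring are just an illustration) is to show the edge-marked image and, optionally, paint every labelled region with its own color:

# Optionally give every segmented region its own random color for inspection
colors = np.random.randint(0, 255, (result.max() + 1, 3), np.uint8)
overlay = colors[np.maximum(result, 0)]   # clamp the -1 edges to label 0 before indexing

cv2.imshow('watershed result', img)
cv2.imshow('regions', overlay)
cv2.waitKey(0)
cv2.destroyAllWindows()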

The final result is shown below:
[Figure: segmentation result with coin edges drawn in red]

That is the whole walkthrough of the watershed algorithm. If you have any questions, please leave a message in the comment section.

Origin blog.csdn.net/weixin_45153969/article/details/131830860