Converting semantic segmentation Ground Truth (GT) into yolov5 segmentation labels (road surface water detection example)

overview

Developers had long been asking in the issues for yolov5 to support segmentation tasks, and the yolov5 team has been genuinely responsive. After the v6.0 release, the latest solution was launched together with a tutorial.


Before that, there were approaches that modified yolo by adding a segmentation head to perform detection and segmentation together. The latest v7.0 release works very well, and yolov8 also puts a strong emphasis on segmentation.



Therefore, using yolo for the detection task is a practical option for production projects, and the yolo ecosystem is well suited to deployment. However, most current public datasets target other segmentation models, so their labels are of course adapted to those models. I ran into this while competing on the Jishi platform: I felt it would save effort to handle the road area with an object detection model, but the competition provided segmentation data.


I tried to convert the GT image labels into yolo's format, but after searching for a long time I could not find a good solution, so I wrote one myself based on my previous object detection experience.

process

Since the segmented regions come with no json (or other format) label files, the coordinates have to be recovered from the GT itself. This can be understood as the polygon label format, in which every inflection point would be marked in labelimg; so we need contour detection to obtain rough outline coordinates and then convert them to the format yolo requires:

1. Find the segmentation regions
2. Get the contour coordinates of each region
3. Simplify the coordinate points
4. Dump them to a txt file

All of the above operations are done with OpenCV.

read and process

Convert the image to a single-channel grayscale image, then binarize it, letting OpenCV choose the threshold automatically:

cv2.threshold(src, thresh, maxval, type)

src: source image, must be single-channel
thresh: threshold, in the range 0-255
maxval: fill value, in the range 0-255
type: threshold type, see the table below

Threshold types:

Value  Type                   Pixels below the threshold      Pixels above the threshold
0      cv2.THRESH_BINARY      set to 0                        set to maxval (the fill value)
1      cv2.THRESH_BINARY_INV  set to maxval (the fill value)  set to 0
2      cv2.THRESH_TRUNC       keep the original value         set to the threshold value
3      cv2.THRESH_TOZERO      set to 0                        keep the original value
4      cv2.THRESH_TOZERO_INV  keep the original value         set to 0
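
To make the table concrete, here is a small sanity check (the pixel values 50, 120, 200 and thresh=100 are illustrative, not from this post) that applies each type to three sample pixels:

import cv2
import numpy as np

px = np.array([[50, 120, 200]], dtype=np.uint8)  # three sample pixels
for name, flag in [("THRESH_BINARY", cv2.THRESH_BINARY),
                   ("THRESH_BINARY_INV", cv2.THRESH_BINARY_INV),
                   ("THRESH_TRUNC", cv2.THRESH_TRUNC),
                   ("THRESH_TOZERO", cv2.THRESH_TOZERO),
                   ("THRESH_TOZERO_INV", cv2.THRESH_TOZERO_INV)]:
    _, out = cv2.threshold(px, 100, 255, flag)
    print(name, out.ravel())  # e.g. THRESH_BINARY -> [  0 255 255]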


Here I use automatic thresholding (Otsu's method, cv2.THRESH_OTSU), so only the 0-255 range needs to be given; the returned ret holds the threshold Otsu actually selected.

gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to single-channel grayscale
ret, bin_img = cv2.threshold(gray_img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu picks the threshold

Find the contours and get the coordinate points

The binarized single-channel image is then passed to contour detection to extract contour points. Because GT masks are annotated very finely, there will be many edge points, so we need either an approximation method that estimates fewer points, or a way to filter the points afterwards.

cv2.findContours(image, mode, method[, offset])

method: the contour approximation method; the options are as follows (a small comparison sketch follows the list):

cv2.CHAIN_APPROX_NONE: store all contour points
cv2.CHAIN_APPROX_SIMPLE: compress horizontal, vertical, and diagonal segments, keeping only their endpoints; a rectangular contour, for example, can be encoded with 4 points
cv2.CHAIN_APPROX_TC89_L1, cv2.CHAIN_APPROX_TC89_KCOS: use the Teh-Chin chain approximation algorithm
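
To see the difference concretely, here is a minimal sketch (a synthetic filled rectangle, not the road data) that prints how many points each method stores for the same contour:

import cv2
import numpy as np

# synthetic binary mask with one filled rectangle
mask = np.zeros((100, 100), dtype=np.uint8)
cv2.rectangle(mask, (20, 20), (80, 80), 255, -1)

for name, method in [("CHAIN_APPROX_NONE", cv2.CHAIN_APPROX_NONE),
                     ("CHAIN_APPROX_SIMPLE", cv2.CHAIN_APPROX_SIMPLE),
                     ("CHAIN_APPROX_TC89_KCOS", cv2.CHAIN_APPROX_TC89_KCOS)]:
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, method)
    # number of points stored for the rectangle's outer contour
    print(name, len(contours[0]))

CHAIN_APPROX_NONE stores every boundary pixel, while CHAIN_APPROX_SIMPLE collapses the rectangle to its 4 corners; the TC89 variants land in between on curved shapes.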

After testing, cv2.CHAIN_APPROX_TC89_KCOS fits our needs best. Below is a comparison of the methods:

[figure: contour points drawn on the same image for comparison — original image, cv2.CHAIN_APPROX_NONE, cv2.CHAIN_APPROX_SIMPLE, cv2.CHAIN_APPROX_TC89_L1, cv2.CHAIN_APPROX_TC89_KCOS]

Yolo's polygon label format only needs points at turns or along longer edges, so it does not require many adjacent points; a rough outline is enough. Of the methods compared above, the best fit is the cv2.CHAIN_APPROX_TC89_KCOS approximation.

Even so, the result still contains some unnecessary points, so we apply a simple rule: a point is kept only if its change in x or y relative to the previously kept point exceeds a threshold; otherwise it is discarded and not used as a segmentation point. The threshold is not fixed; I set it to 30 pixels.


The code for this part:

H, W = img.shape[:2]  # original image size, used for yolo normalization
img1 = img.copy()     # copy of the original image for visualization

# OpenCV 4.x: findContours returns (contours, hierarchy)
cnt, hit = cv2.findContours(bin_img, cv2.RETR_TREE, cv2.CHAIN_APPROX_TC89_KCOS)
cv2.drawContours(img1, cnt, -1, (0, 255, 0), 5)  # draw all contours in green
cnt = list(cnt)
for j in cnt:
    result = []
    pre = j[0]  # the previously kept point, initialized to the contour's first point
    for i in j:
        # keep a point only if it moved more than 30 px in x or y from the last kept point
        if abs(i[0][0] - pre[0][0]) > 30 or abs(i[0][1] - pre[0][1]) > 30:
            pre = i
            temp = list(i[0])
            # following yolo's normalization, divide x and y by the image width and height
            temp[0] /= W
            temp[1] /= H
            result.append(temp)
            cv2.circle(img1, i[0], 1, (0, 0, 255), 2)  # mark the kept point in red
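
To view the annotated copy (green contours, red kept points), a standard OpenCV display works; writing it to disk with cv2.imwrite is an alternative:

cv2.imshow("contour points", img1)  # or cv2.imwrite("vis.png", img1)
cv2.waitKey(0)
cv2.destroyAllWindows()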

Calculate and dump txt

The coordinates are written out per contour; a polygon can contain many coordinates, so the whole array is written as one line. The "0" written first is the current class index; multiple classes would need to be handled separately.

f.write("0 ")
for line in result:
    line = str(line)[1:-2].replace(",","")
    # print(line)
    f.write(line+" ")
f.write("\n")

Effect demo: https://live.csdn.net/v/271857

Complete code: https://github.com/magau123/CSDN/blob/master/GT2yolo-seg.py

Origin blog.csdn.net/charles_zhang_/article/details/128786621