YOLOv5-7.0 instance segmentation: training on your own data, cropping the masked regions and straightening them

YOLOv5-7.0 can be used for instance segmentation tasks, and after trying it I found it works impressively well.

Table of contents

Project Introduction

Data labeling and processing

        Convert json to txt

        Split into training, validation and test sets

Modify the configuration file

Model Training and Inference

Post-processing


Project Introduction

This article has two main purposes:

  1. Train the YOLOv5 segmentation network on your own data
  2. Post-process the YOLOv5 segmentation results and crop out the segmented objects

My project needs to identify the small blocks shown in Figure 1, cut out each block, and rotate it to horizontal before passing it to the next step. Because the project is confidential, I have replaced the real images with blurred ones; sorry for the inconvenience. The results are shown below. If your project needs similar functionality, this article can serve as a reference.

Figure 1

Figure 2

Figure 3

Figure 4

Notes:

  • Figure 1 is the original image
  • Figure 2 is the image after YOLOv5 prediction: each target's mask is drawn in a different color, along with the target's bounding rectangle
  • Figure 3 (post-processing) shows the four corner points computed for each target, drawn on the original image
  • Figure 4 (post-processing) shows the targets rotated upright and cropped into small images

Data labeling and processing

  1. Labeling tool: labelme
  2. Annotation file: json format
  3. Training data requirements: txt file with normalized coordinates

The sample dataset coco128-seg provided by the author (download link: https://ultralytics.com/assets/coco128-seg.zip ) shows what the txt files contain: each line holds a class index followed by normalized polygon coordinates, separated by spaces, with one line per target object.
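To make the label format concrete, here is a minimal sketch that reads one label line and converts it back to pixel coordinates. The function name parse_seg_label_line, the example line and the 640x480 image size are made up for illustration; they are not part of YOLOv5 or of the dataset.

```python
def parse_seg_label_line(line, img_w, img_h):
    """Split one label line into (class_index, [(x_px, y_px), ...])."""
    fields = line.split()
    class_index = int(fields[0])               # first field: class index
    coords = [float(v) for v in fields[1:]]    # remaining fields: x1 y1 x2 y2 ...
    if len(coords) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    # x values are normalized by image width, y values by image height
    points = [(x * img_w, y * img_h) for x, y in zip(coords[0::2], coords[1::2])]
    return class_index, points


# Hypothetical example: class 0, a triangle, on a 640x480 image
example_line = "0 0.25 0.5 0.75 0.5 0.5 0.25"
cls, pts = parse_seg_label_line(example_line, img_w=640, img_h=480)
print(cls, pts)
```

Running the same function over a line produced by the conversion script below is a quick way to check that the polygon lands where you annotated it.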

        Convert json to txt

How do we convert the json files annotated with labelme into this format? The following script does it:

import json
import os
import argparse
from tqdm import tqdm

def convert_label_json(json_dir, save_dir, classes):
    # Only process json annotation files; skip anything else in the directory
    json_paths = [p for p in os.listdir(json_dir) if p.endswith('.json')]
    classes = classes.split(',')
    os.makedirs(save_dir, exist_ok=True)

    for json_path in tqdm(json_paths):
        path = os.path.join(json_dir, json_path)
        with open(path, 'r') as load_f:
            json_dict = json.load(load_f)
        h, w = json_dict['imageHeight'], json_dict['imageWidth']

        # Output path: same base name with a .txt extension
        txt_path = os.path.join(save_dir, os.path.splitext(json_path)[0] + '.txt')
        with open(txt_path, 'w') as txt_file:
            for shape_dict in json_dict['shapes']:
                label = shape_dict['label']
                label_index = classes.index(label)  # raises ValueError for an unknown label
                points = shape_dict['points']

                # Normalize x by the image width and y by the image height
                points_nor_list = []
                for point in points:
                    points_nor_list.append(point[0] / w)
                    points_nor_list.append(point[1] / h)

                points_nor_str = ' '.join(map(str, points_nor_list))

                # One line per object: class index followed by the normalized coordinates
                txt_file.write(str(label_index) + ' ' + points_nor_str + '\n')

if __name__ == "__main__":
    """
    python json2txt_nomalize.py --json-dir my_datasets/color_rings/jsons --save-dir my_datasets/color_rings/txts --classes "cat,dog"
    """
    parser = argparse.ArgumentParser(description='json convert to txt params')
    parser.add_argument('--json-dir', type=str, help='json path dir')
    parser.add_argument('--save-dir', type=str, help='txt save dir')
    parser.add_argument('--classes', type=str, help='classes')
    args = parser.parse_args()
    json_dir = args.json_dir
    save_dir = args.save_dir
    classes = args.classes
    convert_label_json(json_dir, save_dir, classes)

Script description:

    --json-dir: directory containing only the labelme json annotation files;

    --save-dir: directory where the txt files are saved;

    --classes: class names, in the same order as in the configuration files described below. For example, with the classes cat and dog the command can be written like this:

python json2txt_nomalize.py --json-dir my_datasets/color_rings/jsons --save-dir my_datasets/color_rings/txts --classes "cat,dog"

        Split into training, validation and test sets

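# Split the images and label files into training, validation and test sets by ratio
import shutil
import random
import os
import argparse

# Create the folder if it does not exist yet
def mkdir(path):
    if not os.path.exists(path):
        os.makedirs(path)


def main(image_dir, txt_dir, save_dir):
    # Create the output folders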
    mkdir(save_dir)
    images_dir = os.path.join(save_dir, 'images')
    labels_dir = os.path.join(save_dir, 'labels')

    img_train_path = os.path.join(images_dir, 'train')
    img_test_path = os.path.join(images_dir, 'test')
    img_val_path = os.path.join(images_dir, 'val')

    label_train_path = os.path.join(labels_dir, 'train')
    label_test_path = os.path.join(labels_dir, 'test')
    label_val_path = os.path.join(labels_dir, 'val')

    for folder in (images_dir, labels_dir, img_train_path, img_test_path, img_val_path,
                   label_train_path, label_test_path, label_val_path):
        mkdir(folder)

    # Split ratios: 80% training, 10% validation, 10% test; adjust as needed
    train_percent = 0.8
    val_percent = 0.1
    test_percent = 0.1  # informational only: the test set is whatever remains after train and val

    total_txt = os.listdir(txt_dir)
    num_txt = len(total_txt)
    list_all_txt = range(num_txt)  # indices 0 .. num_txt - 1

    num_train = int(num_txt * train_percent)
    num_val = int(num_txt * val_percent)
    num_test = num_txt - num_train - num_val

    train = random.sample(list_all_txt, num_train)
    # Everything not drawn for training is left for validation and test
    val_test = [i for i in list_all_txt if i not in train]
    # Draw num_val items from val_test for validation; the remainder becomes the test set
    val = random.sample(val_test, num_val)

    print("train: {}, val: {}, test: {}".format(len(train), len(val), num_test))
    for i in list_all_txt:
        name = total_txt[i][:-4]

        srcImage = os.path.join(image_dir, name+'.jpg')
        srcLabel = os.path.join(txt_dir, name + '.txt')

        if i in train:
            dst_train_Image = os.path.join(img_train_path, name + '.jpg')
            dst_train_Label = os.path.join(label_train_path, name + '.txt')
            shutil.copyfile(srcImage, dst_train_Image)
            shutil.copyfile(srcLabel, dst_train_Label)
        elif i in val:
            dst_val_Image = os.path.join(img_val_path, name + '.jpg')
            dst_val_Label = os.path.join(label_val_path, name + '.txt')
            shutil.copyfile(srcImage, dst_val_Image)
            shutil.copyfile(srcLabel, dst_val_Label)
        else:
            dst_test_Image = os.path.join(img_test_path, name + '.jpg')
            dst_test_Label = os.path.join(label_test_path, name + '.txt')
            shutil.copyfile(srcImage, dst_test_Image)
            shutil.copyfile(srcLabel, dst_test_Label)


if __name__ == '__main__':
    """
    python split_datasets.py --image-dir my_datasets/color_rings/imgs --txt-dir my_datasets/color_rings/txts --save-dir my_datasets/color_rings/train_data
    """
    parser = argparse.ArgumentParser(description='split datasets to train,val,test params')
    parser.add_argument('--image-dir', type=str, help='image path dir')
    parser.add_argument('--txt-dir', type=str, help='txt path dir')
    parser.add_argument('--save-dir', type=str, help='save dir')
    args = parser.parse_args()
    image_dir = args.image_dir
    txt_dir = args.txt_dir
    save_dir = args.save_dir

    main(image_dir, txt_dir, save_dir)

Script description:

    --image-dir: directory of the training images (the script assumes .jpg files);

    --txt-dir: directory of the txt files generated in the previous step;

    --save-dir: directory where the split dataset is stored. Example command:

python split_datasets.py --image-dir my_datasets/color_rings/imgs --txt-dir my_datasets/color_rings/txts --save-dir my_datasets/color_rings/train_data

After execution, the storage path contains two automatically generated folders, images and labels, each with three subfolders: train, val and test.
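The split ratios in the script can be checked on plain indices before any files are copied. The sketch below is a variant, not the script's exact method: it shuffles once and slices, and it takes a seed so the split is reproducible. The function name split_indices and the seed parameter are assumptions added for illustration.

```python
import random


def split_indices(num_items, train_percent=0.8, val_percent=0.1, seed=0):
    """Return (train, val, test) index lists; test receives whatever is left over."""
    rng = random.Random(seed)          # fixed seed so the split can be repeated
    indices = list(range(num_items))
    rng.shuffle(indices)
    num_train = int(num_items * train_percent)
    num_val = int(num_items * val_percent)
    train = indices[:num_train]
    val = indices[num_train:num_train + num_val]
    test = indices[num_train + num_val:]
    return train, val, test


train, val, test = split_indices(25)
print(len(train), len(val), len(test))
```

Because the counts are truncated with int(), small datasets do not split exactly 80/10/10: with 25 files the validation set gets 2 and the test set picks up the remaining 3.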

Modify the configuration file

1. The data folder contains a yaml file for each dataset. Taking data/coco128-seg.yaml as the template, set the following fields:

    path: the dataset root, i.e. the directory passed as --save-dir to the split script above;

    train, val, test: the corresponding subfolders under images, filled in according to your actual layout;

    names: the class names and their indices, in the same order as the --classes argument used when converting json to txt

2. The models/segment folder also contains yaml files. If you use the yolov5m model, edit nc in yolov5m-seg.yaml to match your number of classes; for two classes, set nc to 2
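As a rough illustration of the dataset yaml from step 1, the sketch below prints the fields for the two-class example used earlier. The helper name build_dataset_yaml is hypothetical, the root path is simply the --save-dir from the example commands, and the images/train, images/val and images/test entries assume the folder layout created by the split script.

```python
def build_dataset_yaml(root, class_names):
    """Return the text of a minimal dataset yaml for a segmentation dataset."""
    lines = [
        f"path: {root}",           # dataset root (the --save-dir of the split script)
        "train: images/train",     # image folders, relative to path
        "val: images/val",
        "test: images/test",
        "names:",
    ]
    # Class indices must follow the same order as --classes in the conversion step
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


class_names = ["cat", "dog"]
yaml_text = build_dataset_yaml("my_datasets/color_rings/train_data", class_names)
print(yaml_text)
print("nc in the model yaml should be", len(class_names))
```

Saving the printed text under the data folder and passing that file name to --data is one way to keep the class order and the nc value in sync.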

Model Training and Inference

1. Training command

python segment/train.py --epochs 300 --data coco128-seg.yaml --weights yolov5m-seg.pt --img 640 --cfg models/segment/yolov5m-seg.yaml --batch-size 16 --device 2

Command notes: this specifies the dataset configuration file, the pretrained weights, the model configuration and so on; see segment/train.py for the full list of parameters

Result: a train-seg directory is created under runs, and each training run writes its weight files into its own subfolder

2. Model inference

python segment/predict.py --weights ./runs/train-seg/exp2/weights/best.pt --source ./my_datasets/color_rings/train_data/images/test/000030.jpg

Command notes: this specifies the weight path and the image or folder to predict; see segment/predict.py for the full list of parameters

Result: a predict-seg directory is created under runs, containing result images like Figure 2 above

Post-processing

Now for the important part: post-processing!

Around line 169 of segment/predict.py, the predicted coordinates are saved to a txt file. If you print segments there, you will see that it is a list: if the predicted image contains 6 targets, the list has 6 elements, and each element consists of multiple coordinate points that trace the predicted contour of one target

The steps required for post-processing are:

  1. Coordinate denormalization: the coordinates in segments use the same format as the txt labels, i.e. normalized values, so they must be converted back to pixel coordinates of the image
  2. Get four points: from the contour points, compute the top-left, top-right, bottom-right and bottom-left corners and output them in clockwise order
  3. Rotation and straightening: with the four corners known, the angle can be computed, and the target is straightened and saved as a small image
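Steps 1 and 2 can be sketched with the standard library alone. This sketch assumes the four corners are already known and orders them by their angle around the centroid, which is a different technique from the sort used in the article's code; the function names and the sample values are made up for illustration.

```python
import math


def denormalize(points, img_w, img_h):
    """Convert normalized (x, y) pairs into integer pixel coordinates."""
    return [(int(x * img_w), int(y * img_h)) for x, y in points]


def order_clockwise(corners):
    """Order four corners clockwise as seen on screen, starting near the top-left."""
    cx = sum(x for x, _ in corners) / 4.0
    cy = sum(y for _, y in corners) / 4.0
    # Image y grows downward, so an increasing atan2 angle runs clockwise on screen
    ordered = sorted(corners, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Rotate the list so it starts at the corner with the smallest x + y
    start = min(range(4), key=lambda i: ordered[i][0] + ordered[i][1])
    return ordered[start:] + ordered[:start]


# Hypothetical normalized corners on a 400x200 image, given in arbitrary order
norm_corners = [(0.75, 0.75), (0.25, 0.25), (0.25, 0.75), (0.75, 0.25)]
pixel_corners = denormalize(norm_corners, img_w=400, img_h=200)
ordered_corners = order_clockwise(pixel_corners)
print(ordered_corners)
```

The smallest x + y rule for picking the starting corner is only a heuristic; for boxes rotated close to 45 degrees, two corners can tie and the starting point becomes ambiguous.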

Here is the code:

# Needs copy, cv2, numpy (as np) and image_crop_tools imported at the top of predict.py
# segments holds the contour points of each predicted mask (normalized coordinates)
segments = [
    scale_segments(im0.shape if retina_masks else im.shape[2:], x, im0.shape, normalize=True)
    for x in reversed(masks2segments(masks))]
new_segments = []  # sorted corner points of each target, in pixel coordinates
image_list = []    # cropped small images
im0_h, im0_w, im0_c = im0.shape
for k, seg_list in enumerate(segments):
    if len(seg_list) == 0:  # skip masks that produced no contour
        continue
    # Convert the normalized points back to pixel coordinates
    new_seg_list = []
    for s_point in seg_list:
        pt1, pt2 = s_point
        new_pt1 = int(pt1 * im0_w)
        new_pt2 = int(pt2 * im0_h)
        new_seg_list.append([new_pt1, new_pt2])
    # Minimum-area bounding rectangle: (center (x, y), (width, height), rotation angle)
    rect = cv2.minAreaRect(np.array(new_seg_list, dtype=np.int32))
    seg_bbox = cv2.boxPoints(rect)  # the 4 corner points of that rectangle (cv2.boxPoints is the OpenCV 3.x+ name)
    seg_bbox = seg_bbox.astype(np.int32)  # replaces np.int0, which was removed in NumPy 2.0
    # Skip degenerate boxes whose sides are shorter than 5 pixels
    if np.linalg.norm(seg_bbox[0] - seg_bbox[1]) < 5 or np.linalg.norm(seg_bbox[3] - seg_bbox[0]) < 5:
        continue

    # Sort the corner points top-to-bottom, then left-to-right
    box1 = sorted(seg_bbox, key=lambda x: (x[1], x[0]))
    # Arrange the points clockwise: fix the left/right order within the top pair and the bottom pair
    if box1[0][0] > box1[1][0]:
        box1[0], box1[1] = box1[1], box1[0]
    if box1[2][0] < box1[3][0]:
        box1[2], box1[3] = box1[3], box1[2]
    if box1[0][1] > box1[1][1]:
        box1[0], box1[1], box1[2], box1[3] = box1[1], box1[2], box1[3], box1[0]
    box1_list = [b.tolist() for b in box1]  # convert the corners to plain lists
    new_segments.append(box1_list)
    tmp_box = copy.deepcopy(np.array(box1)).astype(np.float32)
    partImg_array = image_crop_tools.get_rotate_crop_image(im0, tmp_box)
    image_list.append(partImg_array)
    # cv2.imwrite(str(k)+'.jpg', partImg_array)  # save the cropped small image

# Draw the segmented outlines on the original image
# src_image = im0.copy()
# for ns_box in new_segments:
#     cv2.drawContours(src_image, [np.array(ns_box)], -1, (0, 255, 0), 2)
# cv2.imwrite('1.jpg', src_image)

Code description: copy this snippet into segment/predict.py, at the same indentation level as the if save_txt block. The commented-out cv2.imwrite line inside the loop saves the small images shown in Figure 4 at the beginning of the article. The commented-out block after the loop draws the outlines on the original image, which gives Figure 3.

The rotation step uses the image_crop_tools.get_rotate_crop_image function, which handles the angle calculation and image straightening. Its code (saved as image_crop_tools.py) is as follows:

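import cv2
import numpy as np
def get_rotate_crop_image(img, points):
    """
    Crop the region described by four corner points and warp it upright.
    :param img: source image (H x W x C array)
    :param points: float32 array of shape (4, 2), ordered top-left, top-right,
                   bottom-right, bottom-left; it is modified in place
    :return: the straightened crop
    """

    h, w, _ = img.shape

    # Axis-aligned bounds of the box, clamped to the image so slicing stays valid
    left = max(int(np.min(points[:, 0])), 0)
    right = min(int(np.max(points[:, 0])), w)
    top = max(int(np.min(points[:, 1])), 0)
    bottom = min(int(np.max(points[:, 1])), h)

    img_crop = img[top:bottom, left:right, :].copy()

    # Shift the corners into the coordinate system of the crop
    points[:, 0] = points[:, 0] - left
    points[:, 1] = points[:, 1] - top

    # Output size: length of the top edge and of the left edge
    img_crop_width = int(np.linalg.norm(points[0] - points[1]))
    img_crop_height = int(np.linalg.norm(points[0] - points[3]))

    # Corners of the upright output rectangle, in the same clockwise order
    pts_std = np.float32([[0, 0], [img_crop_width, 0], [img_crop_width, img_crop_height], [0, img_crop_height]])

    M = cv2.getPerspectiveTransform(points, pts_std)

    dst_img = cv2.warpPerspective(
        img_crop,
        M, (img_crop_width, img_crop_height),
        borderMode=cv2.BORDER_REPLICATE)
    dst_img_height, dst_img_width = dst_img.shape[0:2]
    # If the result is at least as tall as it is wide, turn it so the long side is horizontal
    if dst_img_height * 1.0 / dst_img_width >= 1:
        dst_img = np.rot90(dst_img, -1)  # k=-1 rotates 90 degrees clockwise; k=1 rotates counterclockwise

    return dst_img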


def sorted_boxes(dt_boxes):
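    """
    Sort boxes top-to-bottom, then left-to-right.
    Neighbouring boxes whose first corners are less than 10 pixels apart
    vertically are treated as the same row and ordered by x.
    """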
   
    num_boxes = dt_boxes.shape[0]
    sorted_boxes = sorted(dt_boxes, key=lambda x: (x[0][1], x[0][0]))
    _boxes = list(sorted_boxes)

    for i in range(num_boxes - 1):
        if abs(_boxes[i+1][0][1] - _boxes[i][0][1]) < 10 and \
            (_boxes[i + 1][0][0] < _boxes[i][0][0]):
            tmp = _boxes[i]
            _boxes[i] = _boxes[i + 1]
            _boxes[i + 1] = tmp
   
    return _boxes
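get_rotate_crop_image above never computes the rotation angle explicitly; the perspective transform takes care of it. To show the quantities involved, here is a small standard-library sketch that derives the crop width, crop height and tilt angle from four ordered corners. The function name box_size_and_angle and the sample corners are made up for illustration.

```python
import math


def box_size_and_angle(corners):
    """corners: four (x, y) points ordered top-left, top-right, bottom-right, bottom-left."""
    (x0, y0), (x1, y1), _, (x3, y3) = corners
    width = math.hypot(x1 - x0, y1 - y0)    # length of the top edge
    height = math.hypot(x3 - x0, y3 - y0)   # length of the left edge
    # Tilt of the top edge relative to the image x axis, in degrees
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    return width, height, angle


# Hypothetical box: a 50x20 rectangle whose top edge runs along the direction (3, 4)
box_w, box_h, box_angle = box_size_and_angle([(0, 0), (30, 40), (14, 52), (-16, 12)])
print(box_w, box_h, round(box_angle, 2))
```

The width and height here correspond to img_crop_width and img_crop_height in the function above, before they are truncated to integers.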

That's a wrap!

Origin blog.csdn.net/jin__9981/article/details/128385498