[Deep Learning] YOLOv5 label normalization processing - convert json to txt, COCO format to YOLO format

This script processes JSON annotation files. It consists of three functions and a main program:

cor_convert_simple(img_size, box): Converts a single bounding box given as corner coordinates (x1, y1, x2, y2) to the normalized format. It takes the image size img_size, as [width, height], and the box. It computes the center point, width, and height of the box, divides the horizontal values by the image width and the vertical values by the image height, and returns the four results as a list of strings.

cor_convert_seg(img_size, box_list): Converts a list of points, such as the vertices of a segmentation polygon, to the normalized format. It takes the image size img_size and the point list box_list. Unlike cor_convert_simple, it does not compute a center or a size: it only divides each x by the image width and each y by the image height, and returns all values as one flat list of strings.

rewrite_seg(from_path, to_path): Reads one JSON file and writes the converted labels to a new file. It takes the input path from_path and the output path to_path. The function loads the JSON data, reads the image size from imageWidth and imageHeight, selects the entries in shapes whose label is one of the target classes, and converts their points with cor_convert_seg. Each selected shape is written to the output file as one line: the class index followed by the normalized coordinates.

The main program lists all files in the input directory and iterates over them. For each file, it takes the part of the filename before the first dot (the numeric ID, if the files are named by number), builds the input and output paths from it, and calls rewrite_seg.

The code also contains comments that explain what each step does. For example:

# Calculate normalization parameters: marks the step that computes the normalization factors.

# Write transformed results to file: marks the step that writes the converted result to the output file.

import json


def cor_convert_simple(img_size, box):
    # Compute the normalization factors
    dw = 1. / (img_size[0])
    dh = 1. / (img_size[1])

    # Compute the center point, width, and height
    x = (box[0] + box[2]) / 2.0
    y = (box[1] + box[3]) / 2.0
    w = box[2] - box[0]
    h = box[3] - box[1]

    # Normalize
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh

    temp_res = [x, y, w, h]
    result = [str(v) for v in temp_res]
    return result


def cor_convert_seg(img_size, box_list):
    # Compute the normalization factors
    dw = 1. / (img_size[0])
    dh = 1. / (img_size[1])

    result = []
    for box in box_list:
        # Normalize each point and append it to the result list
        result.append(str(box[0] * dw))
        result.append(str(box[1] * dh))
    return result


def rewrite_seg(from_path, to_path):
    # Class names in the order of their YOLO class index.
    # Replace these placeholders with the labels used in your JSON files.
    label = ['class_1', 'class_2', 'class_3', 'class_4']
    label_to_num = {lab: i for i, lab in enumerate(label)}

    with open(from_path, "r", encoding='utf-8') as f:
        origin_json = json.load(f)

    img_size = [origin_json["imageWidth"], origin_json["imageHeight"]]
    with open(to_path, "w", encoding='utf-8') as fw:
        for reg_shape in origin_json["shapes"]:
            reg_label = reg_shape["label"]
            # Skip shapes whose label is not in the class list
            if reg_label not in label_to_num:
                continue
            # Write the converted result to the file
            points = cor_convert_seg(img_size, reg_shape["points"])
            fw.write("{} {}\n".format(label_to_num[reg_label], " ".join(points)))


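if __name__ == "__main__":
    import os

    # Make sure the output directory exists before writing to it
    os.makedirs("./gen_txt/", exist_ok=True)

    filenames = os.listdir("./json/")
    for fn in filenames:
        # Use the part of the filename before the first dot as the output name
        num_str = fn.split('.')[0]
        from_file = "./json/" + fn
        to_file = "{}{}.txt".format("./gen_txt/", num_str)
        rewrite_seg(from_file, to_file)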
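
To make the output format concrete, here is a minimal sketch that runs the same normalization on one in-memory annotation and prints the resulting label line. The sample data, the helper name normalize_polygon, and the class_ids mapping are assumptions made for illustration; they are not part of the script above.

```python
def normalize_polygon(img_size, points):
    """Divide each x by the image width and each y by the image height."""
    width, height = img_size
    flat = []
    for x, y in points:
        flat.append(x / width)
        flat.append(y / height)
    return flat


# Hypothetical annotation, shaped like the JSON the script reads.
sample = {
    "imageWidth": 1920,
    "imageHeight": 1080,
    "shapes": [
        {"label": "class_1",
         "points": [[480, 270], [1440, 270], [1440, 810], [480, 810]]},
    ],
}
class_ids = {"class_1": 0}  # assumed mapping from label name to class index

img_size = [sample["imageWidth"], sample["imageHeight"]]
lines = []
for shape in sample["shapes"]:
    coords = normalize_polygon(img_size, shape["points"])
    lines.append("{} {}".format(class_ids[shape["label"]],
                                " ".join(str(c) for c in coords)))

print(lines[0])  # 0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75
```

Each line of the txt file therefore starts with the class index, followed by alternating normalized x and y values for the polygon vertices.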

Bounding box representations:

xyxy format: the bounding box is given by its upper-left corner (x1, y1) and its lower-right corner (x2, y2)

xywh format: the bounding box is given by its center (x, y) and its width and height (w, h). This is the format YOLO uses.

Before converting, you therefore need to know which of these forms the bbox in your JSON file uses. Note that COCO annotation files store bbox as [x_min, y_min, width, height], where (x_min, y_min) is the upper-left corner rather than the center.
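
The relationship between the two representations can be sketched in pixel coordinates, before any normalization. The function names xyxy_to_xywh and xywh_to_xyxy are chosen here for illustration only.

```python
def xyxy_to_xywh(box):
    """Corners (x1, y1, x2, y2) -> center, width, height (x, y, w, h)."""
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1]


def xywh_to_xyxy(box):
    """Center, width, height (x, y, w, h) -> corners (x1, y1, x2, y2)."""
    x, y, w, h = box
    return [x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0]


corner_box = [100, 200, 300, 600]
center_box = xyxy_to_xywh(corner_box)
print(center_box)                # [200.0, 400.0, 200, 400]
print(xywh_to_xyxy(center_box))  # [100.0, 200.0, 300.0, 600.0]
```

Converting there and back returns the original corners, which is a quick way to check that a box is being read in the form you expect.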

3. Conversion

1. The case of bbox (x1, y1, x2, y2)

size is the image size, which can usually be read from the JSON file. It is stored as a list [width, height], such as [1920, 1080]

box is the bounding box (bbox) from the JSON file, also stored as a list

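def cor_convert_simple(size, box):
    # Normalization factors: 1 / image width and 1 / image height
    dw = 1. / size[0]
    dh = 1. / size[1]
    # Center point, width, and height in pixels
    x = (box[0] + box[2]) / 2.0
    y = (box[1] + box[3]) / 2.0
    w = box[2] - box[0]
    h = box[3] - box[1]
    # Scale everything to the range [0, 1]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)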
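
As a worked check of this case, the sketch below converts one corner box on a 1920 x 1080 image and then maps the result back to pixels. The names to_yolo and from_yolo are hypothetical, and the sketch divides by the image size directly instead of multiplying by dw and dh, which is mathematically the same operation.

```python
def to_yolo(size, box):
    """Corner box (x1, y1, x2, y2) in pixels -> normalized (x, y, w, h)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0 / size[0],
            (y1 + y2) / 2.0 / size[1],
            (x2 - x1) / size[0],
            (y2 - y1) / size[1])


def from_yolo(size, box):
    """Normalized (x, y, w, h) -> corner box (x1, y1, x2, y2) in pixels."""
    x, y, w, h = box
    return ((x - w / 2.0) * size[0],
            (y - h / 2.0) * size[1],
            (x + w / 2.0) * size[0],
            (y + h / 2.0) * size[1])


size = [1920, 1080]
box = [480, 270, 1440, 810]
yolo_box = to_yolo(size, box)
print(yolo_box)                   # (0.5, 0.5, 0.5, 0.5)
print(from_yolo(size, yolo_box))  # (480.0, 270.0, 1440.0, 810.0)
```

All four normalized values should fall between 0 and 1; a value outside that range usually means the width and height in size were swapped or the box was in a different format.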

2. The case of bbox (x, y, w, h)

def cor_convert_simple(size, box):
    # box already holds the center point, width, and height in pixels
    x, y, w, h = box
    dw = 1. / size[0]
    dh = 1. / size[1]
    # Only normalization is needed
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)
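
The case above treats (x, y) as the center of the box. COCO annotation files instead store bbox as [x_min, y_min, width, height], with (x_min, y_min) as the upper-left corner, so the center has to be computed first. A minimal sketch of that variant, with the hypothetical name coco_to_yolo:

```python
def coco_to_yolo(size, box):
    """COCO bbox [x_min, y_min, w, h] in pixels -> normalized center (x, y, w, h)."""
    x_min, y_min, w, h = box
    # Shift from the upper-left corner to the center of the box
    x_center = x_min + w / 2.0
    y_center = y_min + h / 2.0
    # Normalize by image width (size[0]) and image height (size[1])
    return (x_center / size[0], y_center / size[1], w / size[0], h / size[1])


result = coco_to_yolo([1920, 1080], [480, 270, 960, 540])
print(result)  # (0.5, 0.5, 0.5, 0.5)
```

Skipping the shift to the center would place every box up and to the left of its true position by half its size.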

 

 

Origin blog.csdn.net/dsafefvf/article/details/130645806