[Deep Learning] YOLOv5 label normalization: converting JSON to txt and COCO format to YOLO format

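This code is a script for processing JSON files. It consists mainly of the following functions and a main program.

cor_convert_simple(img_size, box): Converts a single bounding box given as corner coordinates (x1, y1, x2, y2) into the target format. It takes the image size img_size ([width, height]) and the box box, computes the center point, width, and height, normalizes each value by the image size, and returns the four results as a list of strings.

cor_convert_seg(img_size, box_list): Converts a list of points, such as the vertices of a segmentation polygon, into the target format. It takes the image size img_size and the point list box_list, divides each x by the image width and each y by the image height, and returns the flattened results as a list of strings. The normalization is the same as in cor_convert_simple, but it is applied to every point rather than to one box, and no center, width, or height is computed.

rewrite_seg(from_path, to_path): Processes one JSON file and writes the result to a new file. It takes the input file path from_path and the output file path to_path. The function opens the input file and reads its JSON data, reads the image size from imageWidth and imageHeight, selects the entries in shapes whose label matches, converts their points with cor_convert_seg, and writes one line per shape (the class index followed by the normalized coordinates) to the output file.

The main program lists all file names in the specified directory and iterates over them. For each file, it takes the part of the file name before the extension (a number, in this dataset), builds the input and output file paths, and calls rewrite_seg.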

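The code also contains short comments (originally written in Chinese) that describe each step. For example:

# Compute the normalization factors: marks where the normalization factors are calculated.

# Write the converted result to the file: marks where the converted result is written out.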

import json


def cor_convert_simple(img_size, box):
    # Compute the normalization factors
    dw = 1. / (img_size[0])
    dh = 1. / (img_size[1])

    # Compute the center coordinates, width, and height
    x = (box[0] + box[2]) / 2.0
    y = (box[1] + box[3]) / 2.0
    w = box[2] - box[0]
    h = box[3] - box[1]

    # Normalize by the image size
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh

    temp_res = [x, y, w, h]
    result = [str(v) for v in temp_res]
    return result


def cor_convert_seg(img_size, box_list):
    # Compute the normalization factors
    dw = 1. / (img_size[0])
    dh = 1. / (img_size[1])

    result = []
    for box in box_list:
        # Normalize each (x, y) point and append it to the result list
        result.append(str(box[0] * dw))
        result.append(str(box[1] * dh))
    return result


def rewrite_seg(from_path, to_path):
    # Placeholder class names; replace them with the labels used in your JSON files
    label = ['class_1', 'class_2', 'class_3', 'class_4']
    label_to_num = {lab: i for i, lab in enumerate(label)}

    with open(from_path, "r", encoding='utf-8') as f:
        origin_json = json.load(f)

    img_size = [origin_json["imageWidth"], origin_json["imageHeight"]]
    with open(to_path, "w", encoding='utf-8') as fw:
        for reg_shape in origin_json["shapes"]:
            reg_label = reg_shape["label"]
            # Skip shapes whose label is not in the class list
            if reg_label not in label_to_num:
                continue
            # Write the converted result to the file
            fw.write(
                "{} {}\n".format(label_to_num[reg_label], " ".join(cor_convert_seg(img_size, reg_shape["points"]))))


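if __name__ == "__main__":
    import os

    # Create the output directory if it does not exist yet
    os.makedirs("./gen_txt/", exist_ok=True)

    filenames = os.listdir("./json/")
    for fn in filenames:
        # Use the file name without its extension as the output name
        num_str = fn.split('.')[0]
        from_file = "./json/" + fn
        to_file = "{}{}.txt".format("./gen_txt/", num_str)
        rewrite_seg(from_file, to_file)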
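
As a quick check of the script above, the following sketch writes a tiny LabelMe-style JSON file to a temporary directory and normalizes its polygon in the same way cor_convert_seg does. The file contents, the label name class_1, the class index 0, and the helper normalize_points are made up for illustration; only the keys imageWidth, imageHeight, shapes, label, and points come from the script.

```python
import json
import os
import tempfile


def normalize_points(img_size, points):
    """Divide each x by the image width and each y by the image height."""
    width, height = img_size
    flat = []
    for px, py in points:
        flat.append(px / width)
        flat.append(py / height)
    return flat


# Hypothetical annotation: one triangle in a 200x100 image
sample = {
    "imageWidth": 200,
    "imageHeight": 100,
    "shapes": [
        {"label": "class_1", "points": [[50, 25], [150, 25], [100, 75]]},
    ],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "0001.json")
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)

    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

size = [data["imageWidth"], data["imageHeight"]]
shape = data["shapes"][0]
coords = normalize_points(size, shape["points"])

# One label line: class index followed by the normalized x y pairs
line = "0 " + " ".join(str(c) for c in coords)
print(line)
```

Every value in the printed line lies between 0 and 1, which is a simple sanity check to run on real label files as well.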

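Bounding box representations:

xyxy format: the bounding box is given by its top-left corner (x1, y1) and its bottom-right corner (x2, y2).

xywh format: the bounding box is given by its center (x, y) and its width and height (w, h). This is the format mainly used by YOLO, which additionally expects the values to be normalized to the range 0 to 1 by the image width and height.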
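
For reference, the conversion from xyxy to normalized xywh can be written as follows, where W and H denote the image width and height in pixels (symbols introduced here for illustration):

```latex
\[
x = \frac{x_1 + x_2}{2W}, \quad
y = \frac{y_1 + y_2}{2H}, \quad
w = \frac{x_2 - x_1}{W}, \quad
h = \frac{y_2 - y_1}{H}
\]
```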

Given these two conventions, you need to know which format the JSON file uses to store its bboxes before converting.
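
One rough way to find out is to look at the top-level keys of the JSON file. The sketch below assumes only two candidates: LabelMe-style files, which carry shapes, imageWidth, and imageHeight (the keys read by the script above), and COCO-style files, which carry images, annotations, and categories. The helper name guess_annotation_format is hypothetical, and any other layout is reported as unknown.

```python
def guess_annotation_format(data):
    """Guess the annotation layout from the top-level keys of a parsed JSON object."""
    if not isinstance(data, dict):
        return "unknown"
    if {"shapes", "imageWidth", "imageHeight"}.issubset(data.keys()):
        return "labelme"
    if {"images", "annotations", "categories"}.issubset(data.keys()):
        return "coco"
    return "unknown"


# Minimal made-up inputs, just enough to exercise each branch
labelme_like = {"imageWidth": 640, "imageHeight": 480, "shapes": []}
coco_like = {"images": [], "annotations": [], "categories": []}

print(guess_annotation_format(labelme_like))
print(guess_annotation_format(coco_like))
print(guess_annotation_format({"foo": 1}))
```

A key check like this only identifies the container layout. It is still worth opening one annotation and comparing its numbers against the image to confirm whether they are corners or a position plus a size.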

3. Conversion

1. When the bbox is (x1, y1, x2, y2)

size is the image size, which can usually be read from the JSON file and is stored as a list such as [1920, 1080].

box is the bounding box (bbox) from the JSON file, also given as a list.

def cor_convert_simple(size, box):
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[2]) / 2.0
    y = (box[1] + box[3]) / 2.0
    w = box[2] - box[0]
    h = box[3] - box[1]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)
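
As a worked example, assume a 1920x1080 image and a box with corners (480, 270) and (1440, 810); these numbers and the class index 0 are made up for illustration. The sketch repeats the function so that it runs on its own, then formats the result as one line of a YOLO label file (class index followed by x, y, w, h).

```python
def cor_convert_simple(size, box):
    """Convert (x1, y1, x2, y2) in pixels to normalized (x_center, y_center, w, h)."""
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[2]) / 2.0 * dw
    y = (box[1] + box[3]) / 2.0 * dh
    w = (box[2] - box[0]) * dw
    h = (box[3] - box[1]) * dh
    return (x, y, w, h)


size = [1920, 1080]
box = [480, 270, 1440, 810]
x, y, w, h = cor_convert_simple(size, box)

# Six decimal places is an arbitrary choice made for this example
label_line = "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(0, x, y, w, h)
print(label_line)
```

The box covers the middle half of the image in both directions, so all four normalized values come out at about 0.5.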

2. When the bbox is (x, y, w, h)

def cor_convert_simple(size, box):
    # (x, y) is taken to be the box center, as in the xywh format described above
    x, y, w, h = box
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)
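
Note that in COCO annotation files the bbox is [x, y, width, height] with (x, y) at the top-left corner rather than the center, so normalizing alone is not enough there. A minimal sketch for that case follows; the function name coco_bbox_to_yolo and the sample numbers are assumptions for illustration, not part of the original script.

```python
def coco_bbox_to_yolo(size, box):
    """Convert a COCO bbox [x_min, y_min, width, height] in pixels
    to normalized YOLO (x_center, y_center, width, height)."""
    img_w, img_h = size
    x_min, y_min, w, h = box
    # Shift from the top-left corner to the center before normalizing
    x_center = (x_min + w / 2.0) / img_w
    y_center = (y_min + h / 2.0) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


# Hypothetical box: top-left at (100, 50), 200 wide and 100 high, in a 400x200 image
result = coco_bbox_to_yolo([400, 200], [100, 50, 200, 100])
print(result)
```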