Object detection data preprocessing: cropping images around target boxes expanded by a fixed margin (labels included)

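Cropping helps in several situations: the images collected by the camera are mostly background and the targets occupy only a small part of each frame; only some of the annotated classes are needed for recognition, so the discarded targets are treated as background too; specific targets need to be stitched together into a collage for training and recognition; or specific targets should be learned within a specific region to reduce the false detection rate. All of these can be handled by cropping. This post uses the simplest approach: cropping around the boxes of chosen classes with a fixed margin. An upgraded version for more complex cases would enlarge the crop in proportion to the size of the box.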

Straight to the code:

from copy import deepcopy
import cv2
import json
import os


def main(img_path, json_path, save_path, label, extend, in_label):
    os.makedirs(save_path, exist_ok=True)
    for file in os.listdir(json_path):
        stem, ext = os.path.splitext(file)
        if ext != ".json":
            continue
        # Find the matching image. Only .jpg, .png and .jpeg are tried;
        # add other extensions here if your dataset uses them.
        img_file = None
        for img_ext in (".jpg", ".png", ".jpeg"):
            candidate = os.path.join(img_path, stem + img_ext)
            if os.path.exists(candidate):
                img_file = candidate
                break
        if img_file is None:
            print("no image found for", file)
            continue
        img = cv2.imread(img_file)
        if img is None:
            print("could not read", img_file)
            continue
        with open(os.path.join(json_path, file), encoding="utf-8") as f:
            json_data = json.load(f)
        img_h = json_data["imageHeight"]
        img_w = json_data["imageWidth"]
        img_name = os.path.basename(img_file)
        i = 0
        for shape in json_data["shapes"]:
            if shape["label"] not in label:
                continue
            # Work on a copy so shifted coordinates never leak into the next crop.
            json_data_1 = deepcopy(json_data)
            json_save = os.path.join(save_path, file)
            img_save = os.path.join(save_path, img_name)
            # If a crop with this name already exists, prefix a counter.
            if os.path.exists(json_save):
                json_save = os.path.join(save_path, str(i) + file)
                img_save = os.path.join(save_path, str(i) + img_name)
                i += 1
            json_data_1["imagePath"] = os.path.basename(img_save)
            p = shape["points"]
            # Expand the box outward by the given margin, then clamp to the image.
            x1 = max(int(min(p[0][0], p[1][0])) - extend, 0)
            y1 = max(int(min(p[0][1], p[1][1])) - extend, 0)
            x2 = min(int(max(p[0][0], p[1][0])) + extend, img_w)
            y2 = min(int(max(p[0][1], p[1][1])) + extend, img_h)
            crop_w = x2 - x1
            crop_h = y2 - y1
            print(img_save)
            print(x1, y1, x2, y2)
            bbox = []
            for shape1 in json_data_1["shapes"]:
                if shape1["label"] not in label and shape1["label"] not in in_label:
                    continue
                # Keep a box when its center lies inside the crop, a simple
                # stand-in for "more than half of the box falls inside".
                m_p = shape1["points"]
                m_x1 = int(min(m_p[0][0], m_p[1][0]))
                m_y1 = int(min(m_p[0][1], m_p[1][1]))
                m_x2 = int(max(m_p[0][0], m_p[1][0]))
                m_y2 = int(max(m_p[0][1], m_p[1][1]))
                if not (x1 < (m_x1 + m_x2) / 2 < x2 and y1 < (m_y1 + m_y2) / 2 < y2):
                    continue
                print(m_x1, m_y1, m_x2, m_y2)
                # Shift into crop coordinates and clamp to the crop size
                # (not to x2/y2, which are full-image coordinates).
                for pt in m_p:
                    pt[0] = min(max(pt[0] - x1, 0), crop_w)
                    pt[1] = min(max(pt[1] - y1, 0), crop_h)
                bbox.append(shape1)
            json_data_1["shapes"] = bbox
            json_data_1["imageHeight"] = crop_h
            json_data_1["imageWidth"] = crop_w
            with open(json_save, "w", encoding="utf-8") as f:
                json.dump(json_data_1, f, ensure_ascii=False, indent=2)
            cv2.imwrite(img_save, img[y1:y2, x1:x2])


if __name__ == "__main__":
    img_path = "/data/wearmask/images"
    json_path = "/data/wearmask/json"
    save_path = "/data/wearmask/save"

    label = ["head", "hat", "workhat", "helmet"]   # crop around boxes with these labels
    in_label = ["wearmask", "eye", "glasses"]      # other boxes to keep inside each crop
    extend = 5
    main(img_path, json_path, save_path, label, extend, in_label)
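The least obvious step in the script is deciding which neighbouring boxes survive a crop and what their coordinates become. The following is a minimal standalone sketch of that idea, not part of the original script: the helper name `remap_box_to_crop` and the sample numbers are made up for illustration. It keeps a box only when its center lies inside the crop, shifts it into crop coordinates, and clamps it against the crop's own width and height.

```python
def remap_box_to_crop(points, crop):
    """Return the box in crop coordinates, or None if its center is outside.

    points: [[xa, ya], [xb, yb]], two opposite corners in full-image pixels.
    crop:   (x1, y1, x2, y2), the crop rectangle in full-image pixels.
    (Hypothetical helper, written only to illustrate the rule.)
    """
    x1, y1, x2, y2 = crop
    (xa, ya), (xb, yb) = points
    cx, cy = (xa + xb) / 2, (ya + yb) / 2
    if not (x1 < cx < x2 and y1 < cy < y2):
        return None
    crop_w, crop_h = x2 - x1, y2 - y1
    # Shift by the crop origin, then clamp into [0, crop_w] x [0, crop_h].
    return [
        [min(max(x - x1, 0), crop_w), min(max(y - y1, 0), crop_h)]
        for x, y in points
    ]


crop = (100, 50, 300, 250)
print(remap_box_to_crop([[120, 60], [180, 140]], crop))   # [[20, 10], [80, 90]]
print(remap_box_to_crop([[250, 200], [340, 280]], crop))  # [[150, 150], [200, 200]]
print(remap_box_to_crop([[290, 240], [400, 400]], crop))  # None
```

The second box sticks out of the crop, but its center is inside, so it is kept and clipped to the crop border. The third box has its center outside and is dropped.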

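First, the original image:
[image: original picture]
The sample image is not a great choice: it has little background, so it does not really need cropping. It is used here only to show what the code does.
The image after running the script:
[image: cropped result]
The margin can be adjusted in the code; here it is 5 pixels. After cropping, the left, right and bottom sides are all expanded, but the top is not. As the original image shows, there is no room to expand there, so the crop is clamped: coordinates cannot be negative, and the maximum cannot exceed the image size. Otherwise, errors occur when the dataset is used.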
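The clamping behaviour can be checked in isolation. The sketch below is an illustration rather than the author's code: `expand_and_clamp` is a hypothetical helper, and its `ratio` parameter is one possible reading of the proportional "upgraded version" mentioned in the introduction, which the post does not show. The image size and box are invented sample values.

```python
def expand_and_clamp(box, img_w, img_h, margin=0, ratio=0.0):
    """Grow a box by a fixed pixel margin plus a fraction of its own size,
    then clamp the result to the image bounds.

    box: (x1, y1, x2, y2) in pixels. ratio is an assumed extension,
    not something the original script implements.
    """
    x1, y1, x2, y2 = box
    dx = margin + int((x2 - x1) * ratio)
    dy = margin + int((y2 - y1) * ratio)
    return (max(x1 - dx, 0), max(y1 - dy, 0),
            min(x2 + dx, img_w), min(y2 + dy, img_h))


# A box only 2 px below the top edge of a 640x480 image.
box = (100, 2, 200, 150)
print(expand_and_clamp(box, 640, 480, margin=5))     # (95, 0, 205, 155)
print(expand_and_clamp(box, 640, 480, ratio=0.25))   # (75, 0, 225, 187)
```

With a margin of 5, the left, right and bottom edges each move by 5 pixels, while the top edge can only move by 2 before it reaches 0. This matches the result described above.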
