Data annotation for training a PaddleDetection object detection model: labeling with labelme

0 Preface

Training a PaddleDetection object detection model requires a manually labeled dataset. Using truck detection as an example, this article records the full labelme annotation workflow so that it can be looked up again later.

Please download the files used in this article here: https://download.csdn.net/download/loutengyuan/87616492

1 labelme environment construction

labelme is a graphical image annotation tool written in Python that uses Qt for its interface. In other words, it is an interactive GUI application, but it is started from the command line, which is slightly less convenient than launching an ordinary desktop program.

Note: I have localized the interface into Chinese. It was originally in English, and the localization file is provided below.

1.1 labelme tool installation

Open a cmd command window and run the following command to install labelme:

# -i selects a package index mirror hosted in China to speed up installation
pip install labelme -i https://mirror.baidu.com/pypi/simple

Check if it is installed with the following command:

pip list
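
As an alternative to scanning the `pip list` output, the installed version can also be queried from Python. The following is a minimal sketch based on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of labelme.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("labelme")
    if version is None:
        print("labelme is not installed")
    else:
        print("labelme {}".format(version))
```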


1.2 Chinese localization of labelme

Copy the Chinese-localized app.py file from the download package into the following directory, overwriting the existing file (adjust the path to your actual installation directory):

C:\Users\Administrator\AppData\Local\Programs\Python\Python310\Lib\site-packages\labelme
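
Because the installation directory differs from machine to machine, it can help to ask Python where a package lives instead of guessing the path. This is a small sketch using `importlib.util.find_spec` from the standard library; the helper name `package_dir` is hypothetical.

```python
import importlib.util
import os
from typing import Optional


def package_dir(package: str) -> Optional[str]:
    """Return the directory that holds `package`, or None if it cannot be found."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


if __name__ == "__main__":
    # Prints the folder that contains labelme's app.py, or None if labelme is missing
    print(package_dir("labelme"))
```

Run it with the same Python interpreter that labelme was installed into, otherwise the reported directory may belong to a different environment.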


2 Labeling operation process

Open the downloaded package; it contains the following files:

  • truck_detect: the images to be annotated
  • app.py: the Chinese localization file for labelme
  • labels.txt: the label list
  • Double-click to open the annotation tool.bat: the startup script

2.1 Open the annotation tool

labelme is normally launched from the command line. Here the command has been placed in the file Double-click to open the annotation tool.bat, so double-clicking that script opens labelme. The content of the startup script is as follows:

:: Change to the directory that contains this script (/d also switches drives)
cd /d "%~dp0"
labelme truck_detect --labels labels.txt
pause
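
For readers who are not on Windows, the same launch can be expressed in Python. The sketch below only assembles the command used in the batch file (`labelme truck_detect --labels labels.txt`); the function names `build_labelme_command` and `launch` are hypothetical, and `launch` assumes the `labelme` executable is on the PATH.

```python
import subprocess
from typing import List


def build_labelme_command(image_dir: str, labels_file: str) -> List[str]:
    """Build the same command line that the batch script runs."""
    return ["labelme", image_dir, "--labels", labels_file]


def launch(working_dir: str) -> int:
    """Start labelme from `working_dir`, mirroring the `cd` step of the batch file."""
    command = build_labelme_command("truck_detect", "labels.txt")
    return subprocess.call(command, cwd=working_dir)


if __name__ == "__main__":
    # Only print the command here; call launch("<folder with truck_detect>") to start the tool
    print(" ".join(build_labelme_command("truck_detect", "labels.txt")))
```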

2.2 Turn on the auto save option

  • Select File -> Save Automatically to turn on automatic saving

Create a new annotation

Select the Draw Rectangle or Draw Polygon option, then mark the position of each region in the image according to its label type. After you click Next, the drawn regions are saved automatically.

2.3 Edit/modify annotation

If the wrong label was selected or a region needs adjusting, click Edit and select the region to modify, or right-click it and choose Edit Label.

2.4 Annotation files

The annotation files are saved in the truck_detect folder alongside the images. Each annotation file is in JSON format, with content like the following:

{
  "version": "5.1.1",
  "flags": {},
  "shapes": [
    {
      "label": "整车侧面",
      "points": [
        [
          96.73611111111113,
          66.625
        ],
        [
          446.0416666666667,
          290.23611111111114
        ]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "车头侧面",
      "points": [
        [
          101.59722222222224,
          70.79166666666666
        ],
        [
          299.5138888888889,
          294.40277777777777
        ]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "车轮侧面",
      "points": [
        [
          177.29166666666669,
          188.84722222222223
        ],
        [
          431.45833333333337,
          288.84722222222223
        ]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "车身侧面",
      "points": [
        [
          290.48611111111114,
          70.79166666666666
        ],
        [
          439.09722222222223,
          279.125
        ]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    }
  ],
  "imagePath": "81675.jpg_800x533.jpg",
  "imageData": "",
  "imageHeight": 327,
  "imageWidth": 490
}
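
Each rectangle in the file above is stored as two corner points, so reading a box back means taking the minimum and maximum of the x and y coordinates. The following is a minimal sketch of that step using only the standard library; the function name `rectangles_from_labelme` and the shortened `SAMPLE` document are illustrative, with the coordinates copied from the first shape above.

```python
import json
from typing import List, Tuple

Box = Tuple[str, float, float, float, float]


def rectangles_from_labelme(annotation: dict) -> List[Box]:
    """Return (label, xmin, ymin, xmax, ymax) for every rectangle shape."""
    boxes = []
    for shape in annotation.get("shapes", []):
        if shape.get("shape_type") != "rectangle":
            continue  # polygons and other shapes are ignored in this sketch
        xs = [point[0] for point in shape["points"]]
        ys = [point[1] for point in shape["points"]]
        boxes.append((shape["label"], min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Shortened copy of the annotation file shown above (first shape only)
SAMPLE = """
{
  "shapes": [
    {"label": "整车侧面",
     "points": [[96.73611111111113, 66.625],
                [446.0416666666667, 290.23611111111114]],
     "shape_type": "rectangle"}
  ],
  "imageHeight": 327,
  "imageWidth": 490
}
"""

if __name__ == "__main__":
    for box in rectangles_from_labelme(json.loads(SAMPLE)):
        print(box)
```

Taking the min and max rather than trusting the point order matters because the two corners are recorded in the order they were drawn, which is not necessarily top-left first.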

3 Format conversion

Commonly used object detection datasets come in two formats: VOC and COCO.

3.1 Convert to COCO format

If you use the COCO format, the x2coco tool in PaddleDetection is recommended for converting the annotation files into a COCO dataset. The conversion command is as follows:

python tools/x2coco.py \
                --dataset_type labelme \
                --json_input_dir ./labelme_annos/ \
                --image_input_dir ./labelme_imgs/ \
                --output_dir ./cocome/ \
                --train_proportion 0.8 \
                --val_proportion 0.2 \
                --test_proportion 0.0
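
One difference worth keeping in mind when inspecting the converted output: a COCO `bbox` is stored as `[x, y, width, height]` measured from the top-left corner, whereas labelme rectangles and VOC boxes are stored as corner coordinates. The sketch below only illustrates that relationship; the conversion tool handles it internally, and the function names here are hypothetical.

```python
from typing import List, Sequence


def corners_to_coco_bbox(xmin: float, ymin: float, xmax: float, ymax: float) -> List[float]:
    """Convert corner coordinates to the COCO [x, y, width, height] layout."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def coco_bbox_to_corners(bbox: Sequence[float]) -> List[float]:
    """Convert a COCO [x, y, width, height] box back to [xmin, ymin, xmax, ymax]."""
    x, y, width, height = bbox
    return [x, y, x + width, y + height]


if __name__ == "__main__":
    print(corners_to_coco_bbox(10, 20, 110, 70))
```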

3.2 Convert to VOC format

This project uses a dataset in VOC format. The following script converts the labelme annotations into VOC2007 format:

import os
import numpy as np
import codecs
import json
from glob import glob
import cv2
import shutil

# Paths
labelme_path = "D:/DataProcess/vehicle/truck_detect/truck_detect/"  # directory with the original labelme annotations
saved_path = "D:/DataProcess/vehicle/truck_detect/truck_detect_voc/"  # output directory

# Map the Chinese label names to English identifiers
label_translate_dict = {
    "车头正面": "head_front",
    "车头侧面": "head_side",
    "车身侧面": "body_side",
    "车轮侧面": "wheel_side",
    "整车侧面": "vehicle_side",
    "工程车": "vehicle_side"  # engine_truck
}

label_txt = ["__ignore__", "_background_"]

# Create the required output folders
if not os.path.exists(saved_path + "Annotations"):
    os.makedirs(saved_path + "Annotations")
if not os.path.exists(saved_path + "JPEGImages/"):
    os.makedirs(saved_path + "JPEGImages/")

# Collect the files to process (base names without the .json extension)
files = glob(labelme_path + "*.json")
files = [os.path.splitext(os.path.basename(i))[0] for i in files]
total_len = len(files)
total_idx = 0

out_count = 0
skip_count = 0
err_count = 0

# Read each annotation file and write the corresponding XML
for json_file_ in files:
    total_idx += 1
    if " " in json_file_:
        continue
    json_filename = labelme_path + json_file_ + ".json"
    img_filename = labelme_path + json_file_ + ".jpg"
    if not os.path.exists(json_filename) or not os.path.exists(img_filename):
        continue
    json_file = json.load(open(json_filename, "r", encoding="utf-8"))

    if total_idx % 100 == 0:
        print("Progress: {:.1f}%".format((total_idx / total_len) * 100))

    imagePath = json_file["imagePath"]
    if json_file_ not in imagePath:
        err_count += 1
        print("imagePath error fileName = {}    imagePath = {}".format(json_file_, imagePath))
        continue

    height, width, channels = cv2.imread(img_filename).shape
    success_count = 0
    xml_filename = saved_path + "Annotations/" + json_file_ + ".xml"
    with codecs.open(xml_filename, "w", "utf-8") as xml:
        xml.write('<annotation>\n')
        xml.write('\t<folder>' + 'UAV_data' + '</folder>\n')
        xml.write('\t<filename>' + json_file_ + ".jpg" + '</filename>\n')
        xml.write('\t<source>\n')
        xml.write('\t\t<database>The UAV autolanding</database>\n')
        xml.write('\t\t<annotation>UAV AutoLanding</annotation>\n')
        xml.write('\t\t<image>flickr</image>\n')
        xml.write('\t\t<flickrid>NULL</flickrid>\n')
        xml.write('\t</source>\n')
        xml.write('\t<owner>\n')
        xml.write('\t\t<flickrid>NULL</flickrid>\n')
        xml.write('\t\t<name>ChaojieZhu</name>\n')
        xml.write('\t</owner>\n')
        xml.write('\t<size>\n')
        xml.write('\t\t<width>' + str(width) + '</width>\n')
        xml.write('\t\t<height>' + str(height) + '</height>\n')
        xml.write('\t\t<depth>' + str(channels) + '</depth>\n')
        xml.write('\t</size>\n')
        xml.write('\t<segmented>0</segmented>\n')

        for multi in json_file["shapes"]:
            points = np.array(multi["points"])
            xmin = min(points[:, 0])
            xmax = max(points[:, 0])
            ymin = min(points[:, 1])
            ymax = max(points[:, 1])

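            # Stop processing this image at the first shape labeled "未知" (unknown)
            if "未知" in multi["label"]:
                break
            # Reset for every shape so that an unmatched label is skipped
            # instead of reusing the label of the previous shape
            label = None
            for key in label_translate_dict.keys():
                if key in multi["label"]:
                    label = label_translate_dict[key]
                    break
            if label is None:
                continue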
            if label not in label_txt:
                label_txt.append(label)

            if xmax <= xmin:
                continue
            elif ymax <= ymin:
                continue
            elif (xmax - xmin) * (ymax - ymin) < 1000:
                continue
            else:
                xml.write('\t<object>\n')
                xml.write('\t\t<name>' + label + '</name>\n')
                xml.write('\t\t<pose>Unspecified</pose>\n')
                xml.write('\t\t<truncated>1</truncated>\n')
                xml.write('\t\t<difficult>0</difficult>\n')
                xml.write('\t\t<bndbox>\n')
                xml.write('\t\t\t<xmin>' + str(xmin) + '</xmin>\n')
                xml.write('\t\t\t<ymin>' + str(ymin) + '</ymin>\n')
                xml.write('\t\t\t<xmax>' + str(xmax) + '</xmax>\n')
                xml.write('\t\t\t<ymax>' + str(ymax) + '</ymax>\n')
                xml.write('\t\t</bndbox>\n')
                xml.write('\t</object>\n')
                # print(total_idx, total_len, json_filename, xmin, ymin, xmax, ymax, label)
                success_count += 1
        xml.write('</annotation>')

    if success_count > 0:
        shutil.copy(img_filename, saved_path + "JPEGImages/")
        out_count += 1
    else:
        skip_count += 1
        if os.path.exists(xml_filename):
            os.remove(xml_filename)

# Write the label list file
with open(os.path.join(saved_path, "labels.txt"), "w", encoding="utf-8") as labels_txt_writer:
    for text in label_txt:
        labels_txt_writer.write(text + "\n")

print("-------------------------------- Finished  --------------------------------")
print("out_count = {}".format(out_count))
print("skip_count = {}".format(skip_count))
print("err_count = {}".format(err_count))
print("label_txt = {}".format(label_txt))
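
After the conversion it can be useful to read a generated XML file back and confirm that the object names and boxes look right. The following is a minimal sketch using `xml.etree.ElementTree` from the standard library; the function name `read_voc_objects` and the `SAMPLE_XML` content are illustrative and only mimic the structure written by the script above.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

VocObject = Tuple[str, float, float, float, float]


def read_voc_objects(xml_text: str) -> List[VocObject]:
    """Return (name, xmin, ymin, xmax, ymax) for every <object> in a VOC annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        objects.append((obj.findtext("name"), coords[0], coords[1], coords[2], coords[3]))
    return objects


# Hand-written sample with the same element layout as the generated files
SAMPLE_XML = """<annotation>
    <filename>example.jpg</filename>
    <size><width>490</width><height>327</height><depth>3</depth></size>
    <object>
        <name>head_side</name>
        <bndbox>
            <xmin>101.5</xmin><ymin>70.75</ymin>
            <xmax>299.5</xmax><ymax>294.25</ymax>
        </bndbox>
    </object>
</annotation>"""

if __name__ == "__main__":
    for item in read_voc_objects(SAMPLE_XML):
        print(item)
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to the same function.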

3.3 Output label area

The following script parses the labelme JSON files and either crops each labeled region into its own image or draws the boxes onto the original image, so that the annotation results can be inspected:

import os
import numpy as np
import json
from glob import glob
import cv2
from PIL import Image, ImageDraw, ImageFont

labelme_path = "D:/DataProcess/vehicle/truck_detect/truck_detect/"  # directory with the original labelme annotations
# Output folder
outputPath = "D:/DataProcess/vehicle/predict_output/"

files = glob(labelme_path + "*.json")
files = [os.path.splitext(os.path.basename(i))[0] for i in files]
files_size = len(files)


# Crop each labeled region and save it as a separate image
def clip_save():
    file_idx = 0
    for json_file_ in files:
        file_idx += 1
        if " " in json_file_:
            continue
        json_filename = labelme_path + json_file_ + ".json"
        img_filename = labelme_path + json_file_ + ".jpg"
        if not os.path.exists(json_filename) or not os.path.exists(img_filename):
            continue
        json_file = json.load(open(json_filename, "r", encoding="utf-8"))
        # image = cv2.imread(img_filename)
        image = cv2.imdecode(np.fromfile(img_filename, dtype=np.uint8), -1)
        # height, width, channels = image.shape

        # "车头正面", "车头侧面", "车身侧面", "车轮侧面", "整车侧面", "工程车"
        for label_name in ["车头正面", "车头侧面"]:
            outputLabelPath = outputPath + label_name + "/"
            if not os.path.exists(outputLabelPath):
                os.makedirs(outputLabelPath)
            idx = 0
            for multi in json_file["shapes"]:
                points = np.array(multi["points"])
                xmin = min(points[:, 0])
                xmax = max(points[:, 0])
                ymin = min(points[:, 1])
                ymax = max(points[:, 1])

                #     "车头正面", "车头侧面", "车身侧面", "车轮侧面", "整车侧面"
                label = multi["label"]
                if label_name not in label:
                    continue

                if xmax <= xmin:
                    continue
                elif ymax <= ymin:
                    continue
                elif (xmax - xmin) * (ymax - ymin) < 1000:
                    continue
                else:
                    print("{}/{} {} {} {} {} {} {}".format(file_idx, files_size, json_filename, xmin, ymin, xmax, ymax,
                                                           label))
                    # Crop the region and save it to the output folder;
                    # the first slice is the row (height) range, the second is the column (width) range
                    img_crop = image[int(ymin):int(ymax), int(xmin):int(xmax)]
                    if idx == 0:
                        out_image_file = outputLabelPath + json_file_ + ".jpg"
                    else:
                        out_image_file = outputLabelPath + "crop_" + str(idx) + "_" + json_file_ + ".jpg"
                    # cv2.imwrite(out_image_file, img_crop)
                    cv2.imencode('.jpg', img_crop)[1].tofile(out_image_file)

                    idx += 1


# Draw the annotated boxes onto the image
def draw_save():
    file_idx = 0
    for json_file_ in files:
        file_idx += 1
        if " " in json_file_:
            continue
        json_filename = labelme_path + json_file_ + ".json"
        img_filename = labelme_path + json_file_ + ".jpg"
        if not os.path.exists(json_filename) or not os.path.exists(img_filename):
            continue
        json_file = json.load(open(json_filename, "r", encoding="utf-8"))
        # image = cv2.imread(img_filename)
        image = cv2.imdecode(np.fromfile(img_filename, dtype=np.uint8), -1)[:, :, ::-1]
        if isinstance(image, np.ndarray):
            image = Image.fromarray(image)
        # height, width, channels = image.shape
        is_success = False
        for multi in json_file["shapes"]:
            points = np.array(multi["points"])
            xmin = min(points[:, 0])
            xmax = max(points[:, 0])
            ymin = min(points[:, 1])
            ymax = max(points[:, 1])

            #     "车头正面", "车头侧面", "车身侧面", "车轮侧面", "整车侧面"
            label = multi["label"]
            tag = False
            for text in ["车头正面"]:
                if text in label:
                    tag = True
                    break
            if not tag:
                continue

            if xmax <= xmin:
                continue
            elif ymax <= ymin:
                continue
            else:
                is_success = True
                print("{}/{} {} {} {} {} {} {}".format(file_idx, files_size, json_filename, xmin, ymin, xmax, ymax,
                                                       label))
                draw = ImageDraw.Draw(image)
                font_size = 36
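                # simfang.ttf is included in the download package
                font = ImageFont.truetype("./simfang.ttf", font_size, encoding="utf-8")
                text = "标签:{}".format(multi["label"])  # "标签" means "label"
                th = font_size
                # font.getsize() was removed in Pillow 10; getlength() (Pillow 8.0+) returns the text width
                tw = int(font.getlength(text))
                # tw = int(len(result["rec_docs"]) * font_size) + 60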
                start_y = max(0, ymin - th)

                draw.rectangle(
                    [(xmin + 1, start_y), (xmin + tw + 1, start_y + th)], fill=(0, 102, 255))

                draw.text((xmin + 1, start_y), text, fill=(255, 255, 255), font=font)

                draw.rectangle(
                    [(xmin, ymin), (xmax, ymax)], outline=(255, 0, 0), width=2)

        if is_success:
            # image.show(img_filename)

            os.makedirs(outputPath, exist_ok=True)
            output_path = os.path.join(outputPath, json_file_ + ".jpg")

            image.save(output_path, quality=95)


if __name__ == '__main__':
    # Crop each labeled region and save it separately
    clip_save()

    # # Draw the results onto the original image and save it
    # draw_save()
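
Both clip_save and the VOC conversion script skip boxes whose width or height is not positive, as well as boxes smaller than 1000 square pixels. That rule can be isolated into a small helper, sketched below; the names `MIN_BOX_AREA` and `is_usable_box` are hypothetical, and the threshold is simply the value used in the scripts.

```python
MIN_BOX_AREA = 1000  # same threshold as in the scripts above, in square pixels


def is_usable_box(xmin: float, ymin: float, xmax: float, ymax: float,
                  min_area: float = MIN_BOX_AREA) -> bool:
    """Return True if the box has positive width and height and is large enough."""
    width = xmax - xmin
    height = ymax - ymin
    if width <= 0 or height <= 0:
        return False  # degenerate box: the corners collapse onto a line or point
    return width * height >= min_area


if __name__ == "__main__":
    print(is_usable_box(0, 0, 100, 100))  # a 100 x 100 box passes the filter
    print(is_usable_box(0, 0, 10, 10))    # a 10 x 10 box is below the area threshold
```

Keeping the threshold in one place makes it easier to adjust when the dataset contains legitimately small objects.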


Origin blog.csdn.net/loutengyuan/article/details/129751419