[Model training] Labeling and processing segmentation data with labelme


Welcome to follow my official account [极智视界] (Jizhi Vision) to get more of my notes.

  Hello everyone, I'm Jizhi Vision. This article explains in detail how to label and process segmentation data with labelme.

  Image segmentation is a common computer vision task, and includes instance segmentation, semantic segmentation, panoptic segmentation, etc. Before data can be fed to a segmentation model, it needs to be labeled. As we all know, data quality has a great impact on the final results in deep learning, so the importance of data labeling is self-evident.

  Start below.

1. Install labelme

  Whether you are on Windows or Linux, you can install it like this:

# First install anaconda (not covered in detail here)
# Install pyqt5
pip install -i https://pypi.douban.com/simple pyqt5

# Install labelme
pip install -i https://pypi.douban.com/simple labelme

# Launch labelme
labelme

  After you label an image and save it, labelme generates a json file alongside the image, containing each label and the polygon points of its segmentation mask.
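For reference, the annotation file is plain JSON; the field names below follow the labelme format, though the exact contents vary by version. A minimal sketch of inspecting one:

```python
import json

# Abridged example of the structure labelme writes (imageData shortened):
annotation = {
    "version": "3.16.2",
    "shapes": [
        {
            "label": "dog",
            "points": [[10.0, 20.0], [50.0, 20.0], [50.0, 60.0]],
            "shape_type": "polygon",
        }
    ],
    "imagePath": "image.jpg",
    "imageData": None,  # base64-encoded image when embedded, else None
}

# A real annotation file is read the same way:
# with open("xxx.json") as f:
#     annotation = json.load(f)

for shape in annotation["shapes"]:
    print(shape["label"], "-", len(shape["points"]), "polygon points")
```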

2. Built-in json_to_dataset

2.1 Single image json to dataset

  Execute directly:

labelme_json_to_dataset xxx.json

  This will generate:

  • img.png: original image;

  • label.png: mask image;

  • label_viz.png: mask image with background;

  • info.yaml, label_names.txt: label information;
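The generated label.png stores one class index per pixel. As a quick sketch (with a toy array standing in for a loaded label.png), the pixels per class can be counted like this:

```python
import numpy as np

# Toy 4x4 label map standing in for a loaded label.png
# (0 = background, 1 = the labeled object):
label = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
], dtype=np.uint8)

# Count pixels per class index:
classes, counts = np.unique(label, return_counts=True)
for c, n in zip(classes, counts):
    print(f"class {c}: {n} pixels")
```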

2.2 Batch json to dataset

  Find labelme's cli directory (the one containing json_to_dataset.py), then:

cd cli
touch json_to_datasetP.py
vim json_to_datasetP.py

  Add the following:

import argparse
import json
import os
import os.path as osp
import warnings
 
import PIL.Image
import yaml
 
from labelme import utils
import base64
 
def main():
    warnings.warn("This script is aimed to demonstrate how to convert the\n"
                  "JSON file to a single image dataset, and not to handle\n"
                  "multiple JSON files to generate a real-use dataset.")
    parser = argparse.ArgumentParser()
    parser.add_argument('json_file')
    parser.add_argument('-o', '--out', default=None)
    args = parser.parse_args()
 
    json_file = args.json_file
    if args.out is None:
        out_dir = osp.basename(json_file).replace('.', '_')
        out_dir = osp.join(osp.dirname(json_file), out_dir)
    else:
        out_dir = args.out
    if not osp.exists(out_dir):
        os.mkdir(out_dir)
 
    count = os.listdir(json_file) 
    for i in range(0, len(count)):
        path = os.path.join(json_file, count[i])
        if os.path.isfile(path):
            data = json.load(open(path))
            
            if data['imageData']:
                imageData = data['imageData']
            else:
                imagePath = os.path.join(os.path.dirname(path), data['imagePath'])
                with open(imagePath, 'rb') as f:
                    imageData = f.read()
                    imageData = base64.b64encode(imageData).decode('utf-8')
            img = utils.img_b64_to_arr(imageData)
            label_name_to_value = {'_background_': 0}
            for shape in data['shapes']:
                label_name = shape['label']
                if label_name in label_name_to_value:
                    label_value = label_name_to_value[label_name]
                else:
                    label_value = len(label_name_to_value)
                    label_name_to_value[label_name] = label_value
            
            # label_values must be dense
            label_values, label_names = [], []
            for ln, lv in sorted(label_name_to_value.items(), key=lambda x: x[1]):
                label_values.append(lv)
                label_names.append(ln)
            assert label_values == list(range(len(label_values)))
            
            lbl = utils.shapes_to_label(img.shape, data['shapes'], label_name_to_value)
            
            captions = ['{}: {}'.format(lv, ln)
                for ln, lv in label_name_to_value.items()]
            lbl_viz = utils.draw_label(lbl, img, captions)
            
            out_dir = osp.basename(count[i]).replace('.', '_')
            out_dir = osp.join(json_file, out_dir)  # put outputs next to the json files
            if not osp.exists(out_dir):
                os.mkdir(out_dir)
 
            PIL.Image.fromarray(img).save(osp.join(out_dir, 'img.png'))
            #PIL.Image.fromarray(lbl).save(osp.join(out_dir, 'label.png'))
            utils.lblsave(osp.join(out_dir, 'label.png'), lbl)
            PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, 'label_viz.png'))
 
            with open(osp.join(out_dir, 'label_names.txt'), 'w') as f:
                for lbl_name in label_names:
                    f.write(lbl_name + '\n')
 
            warnings.warn('info.yaml is being replaced by label_names.txt')
            info = dict(label_names=label_names)
            with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
                yaml.safe_dump(info, f, default_flow_style=False)
 
            print('Saved to: %s' % out_dir)
if __name__ == '__main__':
    main()

  Then do the conversion in batches:

python path/cli/json_to_datasetP.py path/JPEGImages

  If an error is reported:

lbl_viz = utils.draw_label(lbl, img, captions)
AttributeError: module 'labelme.utils' has no attribute 'draw_label'

  Solution: downgrade labelme to version 3.16.2 (utils.draw_label was removed in later versions). In the labelme environment, run pip install labelme==3.16.2, and the script will work.

3. Another way to produce segmentation labels

  If you want to generate labels like the following example:

  Original image:

  Corresponding label (0 for background, 1 for circle):

  This label is an 8-bit single-channel image, so the method supports at most 256 classes.
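To see why 256 is the limit, note that each pixel of an 8-bit single-channel image holds one uint8 class index. A small sketch:

```python
import numpy as np

# An 8-bit single-channel label image stores one class index per pixel;
# uint8 covers 0..255, so at most 256 classes fit.
mask = np.zeros((4, 4), dtype=np.uint8)  # everything starts as background (0)
mask[1:3, 1:3] = 1                       # a small region labeled as class 1

print(mask.shape)                  # (4, 4): no color axis, single channel
print(np.iinfo(np.uint8).max + 1)  # 256 representable class indices
```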

  The dataset can be produced by the following script:

import cv2
import numpy as np
import json
import os

#    0     1    2    3
#  backg  Dog  Cat  Fish
category_types = ["Background", "Dog", "Cat", "Fish"]

# Get the original image size (assumes all images share the same size)
img = cv2.imread("image.bmp")
h, w = img.shape[:2]

for root, dirs, files in os.walk("data/Annotations"):
    for file in files:
        mask = np.zeros([h, w, 1], np.uint8)    # blank mask the same size as the original image

        print(file[:-5])

        jsonPath = "data/Annotations/"
        with open(jsonPath + file, "r") as f:
            label = json.load(f)

        shapes = label["shapes"]
        for shape in shapes:
            category = shape["label"]
            points = shape["points"]
            # Fill the polygon with the class index
            points_array = np.array(points, dtype=np.int32)
            mask = cv2.fillPoly(mask, [points_array], category_types.index(category))

        imgPath = "data/masks/"
        cv2.imwrite(imgPath + file[:-5] + ".png", mask)

  The above handles the 4-class case.
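Because class indices 0 to 3 all look nearly black when the mask is opened directly, a quick way to eyeball the result is to spread the indices over the full 0..255 range (a sketch using a toy array in place of a saved mask):

```python
import numpy as np

# Toy mask with the four class indices 0..3:
mask = np.array([[0, 1], [2, 3]], dtype=np.uint8)

# Spread the indices evenly over 0..255 so each class is visible:
num_classes = 4
step = 255 // (num_classes - 1)          # 85
vis = (mask.astype(np.uint16) * step).astype(np.uint8)
print(vis.tolist())                      # [[0, 85], [170, 255]]
```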

  That's it. The above shared how to label and process segmentation data with labelme. I hope my sharing helps you a little in your learning.


Origin juejin.im/post/7083771288203821063