Tutorial - Train a BiSeNet (semantic segmentation) network on your own dataset from scratch

Introduction

To segment the features we want from an image, we use BiSeNet as the segmentation model and train and test it on a dataset we make ourselves. Note: the training is done in a Linux environment; training under Windows may run into problems.

1. Download the BiSeNet source code

1. Download link: https://github.com/CoinCheung/BiSeNet

2. Download and unzip it to a directory of your choice. The file structure is shown in the figure below:

2. Make the dataset

1. We use the labelme tool for data annotation. How to use labelme will not be covered in detail here.

2. After step 1, we have the original JPG files and the corresponding JSON files, as shown below:

3. Convert the JSON files into visualized segmentation images

We borrow the method from https://github.com/caozhiwei1994/labelme2dataset. Download and unzip it to any directory; opening it, there are three files, as shown below:

Go to the labelme2BiSeNet folder and put the original images and JSON files obtained in step 1 into it, as shown below (PV is the name of my dataset; the PV folder contains the JPGs and JSONs):

Execute the json_to_dataset.py file, i.e. the following code (remember to change the paths to your own):

import argparse
import json
import os
import os.path as osp
import warnings
import PIL.Image
import yaml
from labelme import utils
import base64
def main():
    count = os.listdir(r"E:\File\Pycharm\BiSeNet-master\datasets\labelme2dataset-main\labelme2BiSeNet\PV")  # path to the dataset (change to your own)
    for i in range(0, len(count)):
        path = os.path.join("./PV", count[i])  # mind the file path here as well
        if os.path.isfile(path) and path.endswith('json'):
            data = json.load(open(path))
            if data['imageData']:
                imageData = data['imageData']
            else:
                imagePath = os.path.join(os.path.dirname(path), data['imagePath'])
                with open(imagePath, 'rb') as f:
                    imageData = f.read()
                    imageData = base64.b64encode(imageData).decode('utf-8')
            img = utils.img_b64_to_arr(imageData)
            label_name_to_value = {'_background_': 0}
            for shape in data['shapes']:
                label_name = shape['label']
                if label_name in label_name_to_value:
                    label_value = label_name_to_value[label_name]
                else:
                    label_value = len(label_name_to_value)
                    label_name_to_value[label_name] = label_value
            # label_values must be dense
            label_values, label_names = [], []
            for ln, lv in sorted(label_name_to_value.items(), key=lambda x: x[1]):
                label_values.append(lv)
                label_names.append(ln)
            assert label_values == list(range(len(label_values)))
            lbl = utils.shapes_to_label(img.shape, data['shapes'], label_name_to_value)
            captions = ['{}: {}'.format(lv, ln)
                        for ln, lv in label_name_to_value.items()]
            lbl_viz = utils.draw_label(lbl, img, captions)
            out_dir = osp.basename(count[i]).replace('.', '_')
            out_dir = osp.join(osp.dirname(count[i]), out_dir)
            out_dir = osp.join("output", out_dir)
            if not osp.exists(out_dir):
                #os.mkdir(out_dir)
                os.makedirs(out_dir)
            PIL.Image.fromarray(img).save(osp.join(out_dir, 'img.png'))
            utils.lblsave(osp.join(out_dir, 'label.png'), lbl)
            PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, 'label_viz.png'))
            with open(osp.join(out_dir, 'label_names.txt'), 'w') as f:
                for lbl_name in label_names:
                    f.write(lbl_name + '\n')
            warnings.warn('info.yaml is being replaced by label_names.txt')
            info = dict(label_names=label_names)
            with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
                yaml.safe_dump(info, f, default_flow_style=False)
            print('Saved to: %s' % out_dir)
if __name__ == '__main__':
    main()

Note: an error may occur when executing this step: module 'labelme.utils' has no attribute 'draw_label'.

Solution: this is mainly a labelme version problem; just downgrade the version:

pip install labelme==3.16.7

After execution, the output folder is obtained. The files under it (the visualized segmentations) are as shown below:

4. Get paired JPG/PNG training images

First, create a new class_name.txt file in the labelme2BiSeNet directory, listing all the categories we annotated (three categories here, including the background), as shown below:
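
For example, with a background class and two foreground classes, class_name.txt contains one class name per line, with _background_ first. The names a and b below are placeholders; use your own labelme label names:

_background_
a
b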

Next, run the get_png.py file, i.e. the following code:

import os
from PIL import Image
import numpy as np


def main():
    # read the source folder
    count = os.listdir("PV")  # change to your own path
    for i in range(0, len(count)):
        # for each file ending in .jpg,
        # look for its corresponding .png label
        if count[i].endswith("jpg"):
            path = os.path.join("PV", count[i])  # change to your own path
            img = Image.open(path)
            if not os.path.exists('jpg_png/jpg'):
                os.makedirs('jpg_png/jpg')
            img.save(os.path.join("jpg_png/jpg", count[i]))
            # find the corresponding png label
            path = "output/" + count[i].split(".")[0] + "_json/label.png"
            img = Image.open(path)
            # load the global class list
            class_txt = open("class_name.txt", "r")
            class_name = class_txt.read().splitlines()
            # e.g. ["_background_", "a", "b"]
            # read the classes present in this json file (the "local" classes)
            with open("output/" + count[i].split(".")[0] + "_json/label_names.txt", "r") as f:
                names = f.read().splitlines()
                # e.g. ["_background_", "b"]
                new = Image.new("RGB", [np.shape(img)[1], np.shape(img)[0]])
                # print('new:',new)
                for name in names:
                    index_json = names.index(name)
                    index_all = class_name.index(name)
                    # remap the local class index to the global class index
                    new = new + np.expand_dims(index_all * (np.array(img) == index_json), -1)
            new = Image.fromarray(np.uint8(new))
            print('new:',new)
            if not os.path.exists('jpg_png/png'):
                os.makedirs('jpg_png/png')
            new.save(os.path.join("jpg_png/png", count[i].replace("jpg", "png")))
            print(np.max(new), np.min(new))

if __name__ == '__main__':
    main()

Execute the above code and you will get the jpg_png folder. The jpg subfolder stores the original images, and the png subfolder stores the corresponding 24-bit label images. These look all black to the naked eye because the categories are encoded as pixel values: the seemingly all-black images actually contain pixel values such as 0, 1, and so on:
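
To confirm that a label image really contains class indices rather than being empty, a quick check like the following prints its distinct pixel values (the filename 0001.png is a placeholder; point it at one of your own files):

import numpy as np
from PIL import Image

# load one generated label image and list the distinct pixel values it holds;
# with three classes (background + 2) you should see something like [0 1 2]
label = np.array(Image.open("jpg_png/png/0001.png"))
print(np.unique(label))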

5. However, BiSeNet needs 8-bit grayscale label images, and the ones above are 24-bit. We continue the conversion by executing the get_dataset.py file, i.e. the following code:

import cv2
import os
from PIL import Image

# if the picture is a jpg, convert it to png
jpg_read = "jpg_png/jpg/"
if not os.path.exists('dataset/gt_png'):
    os.makedirs('dataset/gt_png')
png_write = "dataset/gt_png/"
jpg_names = os.listdir(jpg_read)
for j in jpg_names:
    img = Image.open(jpg_read + j)
    j = j.split(".")
    if j[-1] == "jpg":
        j[-1] = "png"
        j = str.join(".", j)
        # r,g,b,a=img.split()
        # img=Image.merge("RGB",(r,g,b))
        to_save_path = png_write + j
        img.save(to_save_path)
    else:
        continue

# 24-bit to 8-bit
bit24_dir = 'jpg_png/png'      # folder of .png label images saved in the previous step
if not os.path.exists('dataset/label_png'):
    os.makedirs('dataset/label_png')
bit8_dir = 'dataset/label_png'
png_names = os.listdir(bit24_dir)
for i in png_names:
    img = cv2.imread(bit24_dir + '/' + i)
    # cv2.imread returns BGR; since the three label channels are identical,
    # grayscale conversion simply keeps the class index values
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imencode('.png', gray)[1].tofile(bit8_dir + '/' + i)


After execution, the dataset folder is obtained, which stores the paired training images, as shown in the figure below:
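
Based on the paths used in the scripts above, the resulting layout is:

dataset/
├── gt_png/       # original images, saved as .png
└── label_png/    # 8-bit grayscale label images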

Since the BiSeNet network requires all images to have the same size (otherwise testing will go wrong), if your training images are already all the same size you can skip the following code. The code below resizes and crops the images, i.e. execute the resize.py and crop.py files respectively (a quick size check is sketched after the two scripts):

# resize.py
import cv2
import os
import shutil


def main(path):
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith('.png'):
                image_name = os.path.join(root, file)
                image = cv2.imread(image_name, -1)
                crop_image = cv2.resize(image,(1080,704))
                os.remove(image_name)
                cv2.imwrite(image_name, crop_image)

image_path = './dataset/gt_png'
label_path = './dataset/label_png'
if __name__ == '__main__':
    main(image_path)
    main(label_path)

# crop.py
# crop 1280×720 images down to 1080×704

import cv2
import os
import shutil


def main(path):
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith('.png'):
                image_name = os.path.join(root, file)
                image = cv2.imread(image_name, -1)
                crop_image = image[:704, :1080]  # 704 rows x 1080 columns
                os.remove(image_name)
                cv2.imwrite(image_name, crop_image)

image_path = './dataset/gt_png'
label_path = './dataset/label_png'
if __name__ == '__main__':
    main(image_path)
    main(label_path)
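
After resizing and cropping, you can verify that every image and label really share one size with a small sketch like the following (it assumes the dataset/ layout produced above):

import os
import cv2

# collect the (height, width) of every png under dataset/; if more than one
# distinct size is printed, some files still need resizing or cropping
sizes = set()
for folder in ['dataset/gt_png', 'dataset/label_png']:
    for root, dirs, files in os.walk(folder):
        for file in files:
            if file.endswith('.png'):
                img = cv2.imread(os.path.join(root, file), -1)
                sizes.add(img.shape[:2])
print(sizes)  # expect a single entry such as {(704, 1080)}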

6. Generate the txt files

After completing the above steps, execute the train_val.py and train_val_txt.py files (i.e. the following code) to split the dataset into a training set and a validation set and generate the txt files:

# train_val.py

'''
Split the data into train and val
'''

import os
import random
import shutil

total_list = []
train_list = []
val_list = []


image_path = 'dataset/gt_png'
label_path = 'dataset/label_png'

# clear out any previous split
for dir in ['train', 'val']:
    image_dir = os.path.join(image_path, dir)
    label_dir = os.path.join(label_path, dir)
    if os.path.exists(image_dir):
        shutil.rmtree(image_dir)
    os.makedirs(image_dir)
    if os.path.exists(label_dir):
        shutil.rmtree(label_dir)
    os.makedirs(label_dir)


for root, dirs, files in os.walk(image_path):
    for file in files:
        if file.endswith('png'):
            total_list.append(file)

total_size = len(total_list)
train_size = int(total_size * 0.8)
val_size = total_size-train_size

train_list = random.sample(total_list, train_size)
remain_list = list(set(total_list) - set(train_list))
val_list = random.sample(remain_list, val_size)



for file in total_list:
    image_path_0 = os.path.join(image_path, file)
    label_file = file.split('.')[0] + '.png'
    label_path_0 = os.path.join(label_path, label_file)
    if file in train_list:
        image_path_1 = os.path.join(image_path, 'train', file)
        shutil.move(image_path_0, image_path_1)

        label_path_1 = os.path.join(label_path, 'train', label_file)
        shutil.move(label_path_0, label_path_1)

    elif file in val_list:
        image_path_1 = os.path.join(image_path, 'val', file)
        shutil.move(image_path_0, image_path_1)

        label_path_1 = os.path.join(label_path, 'val', label_file)
        shutil.move(label_path_0, label_path_1)


print(len(total_list))
print(len(train_list))
print(len(val_list))
# train_val_txt.py

import os


def write_txt(type,txt):
    gt = os.listdir("dataset/gt_png/"+type)
    label = os.listdir("dataset/label_png/"+type)
    with open(txt, "w") as f:
        for i in gt:
            j = i.replace("gt_png","label_png")
            # check that the image has a corresponding label png
            if j in label:
                f.write("gt_png/"+ type + '/' + i + ","+"label_png/"+type + '/'+ j + "\n")

write_txt("train","train.txt")
write_txt("val","val.txt")

 

We now have the files we need, namely the dataset folder, train.txt and val.txt. The preparation of the dataset is complete.

3. Training the BiSeNet network

1. Place the gt_png and label_png folders, train.txt and val.txt from the dataset folder into the BiSeNet-master/datasets/cityscapes directory, as shown below (the other two files are generated automatically later, so ignore them here):

2. Modify some code

This step requires modifying the code according to your own situation. The basic change is the number of classes: the original author uses 19 classes, so you can search for 19 globally and change every 19 to your own number of classes (the number of classes here includes the background).

In addition, a more important modification is in the cityscapes_cv2.py file, which needs to be changed in two places, as shown below:

Leave the first entry (the background class) untouched, change the second and third entries to your own classes (the blogger has only two classes besides the background), and change 'ignoreInEval' to False. Set the color however you like; trainId and name also need to be modified. (If this step is not done, the loss may become NaN during training.)
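
As a rough sketch only (the class names class_a/class_b and the colors are placeholders, and your copy of cityscapes_cv2.py may carry extra fields in each entry that should be kept as they are), the modified entries could look like this:

# keep the background entry first, then one entry per foreground class;
# trainId runs 0..n_cats-1 and 'ignoreInEval' is set to False everywhere
labels_info = [
    {"name": "_background_", "id": 0, "trainId": 0, "color": [0, 0, 0], "ignoreInEval": False},
    {"name": "class_a", "id": 1, "trainId": 1, "color": [128, 0, 0], "ignoreInEval": False},
    {"name": "class_b", "id": 2, "trainId": 2, "color": [0, 128, 0], "ignoreInEval": False},
]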

Then modify the following code (near the bottom of the file): set n_cats to your number of classes (including the background) and change the arange to the same number of classes, roughly as sketched below:
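
As a minimal sketch (the exact variable names depend on your version of cityscapes_cv2.py; verify against your own copy), the change amounts to something like:

import numpy as np

# before (Cityscapes defaults): n_cats = 19 and np.arange(19)
# after, for a background class plus two foreground classes:
n_cats = 3
lb_map = np.arange(3)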

3. Start training

The blogger uses single-GPU training in a Linux environment:

export CUDA_VISIBLE_DEVICES=0
torchrun --nproc_per_node=1 tools/train_amp.py --config ./configs/bisenetv2_city.py

The training parameters are adjusted in the files under the configs/ directory:
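
For reference, here is a rough sketch of the kind of fields typically adjusted in configs/bisenetv2_city.py (field names and defaults may differ across repo versions, so check your own copy):

# a sketch of commonly adjusted fields, not the full config file
cfg = dict(
    model_type='bisenetv2',
    n_cats=3,                                         # classes, background included
    max_iter=150000,                                  # total training iterations
    ims_per_gpu=8,                                    # batch size per GPU
    cropsize=[512, 1024],                             # training crop size
    im_root='./datasets/cityscapes',                  # dataset root directory
    train_im_anns='./datasets/cityscapes/train.txt',  # pair list from step 6
    val_im_anns='./datasets/cityscapes/val.txt',
)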

For detailed training and testing commands, please refer to the code author's GitHub: https://github.com/CoinCheung/BiSeNet

Reference article: BiSeNet training on a labelme-annotated semantic segmentation dataset, Wuwei Traveler's Blog, CSDN.

If you encounter any problems, please leave a message in the comment area, and the blogger will answer them one by one.

Origin: blog.csdn.net/qq_39149619/article/details/131882664