Mask R-CNN image segmentation in PyTorch: hands-on practice with a custom dataset

Table of contents

Mask-RCNN overview

Steps for training on your own data

Tool: Labelme

Label the data

The source code needs to be changed

Test results after training

Mask-RCNN overview

Mask R-CNN is a deep learning model widely used for object detection and instance segmentation. It builds on Faster R-CNN, a two-stage object detection model, by adding a mask prediction branch. Mask R-CNN keeps the RPN of Faster R-CNN and replaces the RoI Pooling layer with an RoIAlign layer, which makes pixel-level segmentation possible. It can detect multiple objects at the same time and segment each one at the pixel level.

The main idea of Mask R-CNN is to add a branch network, the mask branch, on top of Faster R-CNN. This branch performs pixel-level segmentation on the detected objects to obtain a segmentation mask for each object instance. Like Faster R-CNN, Mask R-CNN uses an RPN to generate candidate boxes, but it extracts the features inside each candidate box with an RoIAlign layer instead of RoI Pooling. RoIAlign samples the feature map at exact, non-quantized locations using bilinear interpolation, and the resulting features are sent to the mask branch, which predicts an accurate mask for each instance.
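To make the bilinear interpolation step concrete, here is a minimal pure-Python sketch of sampling a feature map at a fractional location, which is the basic operation RoIAlign repeats at several sample points per output bin. The function name `bilinear_sample` and the tiny 2x2 feature map are illustrative assumptions, not code from any library.

```python
import math


def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map (list of rows) at a fractional (y, x).

    Illustrative helper, not a library API. Coordinates are clamped to
    the valid range so the four neighbouring cells always exist.
    """
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two rows, then along y between them.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


feature_map = [[0.0, 10.0],
               [20.0, 30.0]]
center = bilinear_sample(feature_map, 0.5, 0.5)
print(center)  # 15.0, the average of the four cells
```

Because the sample point is never rounded to an integer cell, small shifts of a candidate box change the extracted features smoothly, which is what the mask branch needs for pixel-accurate output.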

Facebook AI Research has open-sourced MaskRCNN-Benchmark, a PyTorch 1.0 implementation of Faster R-CNN and Mask R-CNN. Compared with Detectron and mmdetection, MaskRCNN-Benchmark has comparable accuracy, faster training speed, and a lower GPU memory footprint.

The advantages are as follows:

  • PyTorch 1.0: RPN, Faster R-CNN, and Mask R-CNN implementations that match or exceed the accuracy of Detectron;
  • Very fast: training is up to twice as fast as Detectron and about 1.3 times as fast as mmdetection;
  • Memory efficient: uses about 500MB less GPU memory than mmdetection during training;
  • Multi-GPU training and inference;
  • Batched inference: each GPU can run inference on multiple images per batch;
  • CPU support for inference: the model can run on the CPU at inference time;
  • Pre-trained models for almost all reference Mask R-CNN and Faster R-CNN configurations with the 1x schedule.

Source code of the open-source Mask R-CNN project: https://github.com/facebookresearch/maskrcnn-benchmark

Steps for training on your own data

  • Install Labelme
  • Label the data
  • Make the required source code changes
  • Test the results after training

Tool: Labelme

1. Install the labelme tool

pip install labelme
pip install pyqt5
pip install pillow==4.0.0

Label the data

1. Use labelme to annotate the images. It produces one .json file per image; put the .json files and the original images in the same folder.

2. Batch conversion: convert the labelme annotations into a COCO dataset:

  • Run the labelmetococo.py file;
  • A coco folder is generated under the dataset folder (the path set in saved_coco_path);
  • The coco folder contains an annotations folder and an images folder;
  • The annotations folder stores two json files (instances_train2017.json and instances_val2017.json);
  • The images folder contains two subfolders: train2017 (images for training) and val2017 (images for validation).
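Before running the conversion, it can help to check which label names actually occur in the .json files, because the script below looks up every label in its classname_to_id dictionary and fails with a KeyError on any name that is missing. The following is a minimal sketch under that assumption; the helper names `collect_labels` and `build_classname_to_id` and the example folder are hypothetical.

```python
import glob
import json
import os
from collections import Counter


def collect_labels(labelme_dir):
    """Count how often each label occurs across labelme .json files."""
    counts = Counter()
    for path in sorted(glob.glob(os.path.join(labelme_dir, "*.json"))):
        with open(path, "r", encoding="utf-8") as f:
            obj = json.load(f)
        for shape in obj.get("shapes", []):
            counts[shape["label"]] += 1
    return counts


def build_classname_to_id(counts):
    """Assign ids starting at 1, since 0 is reserved for the background."""
    return {name: i for i, name in enumerate(sorted(counts), start=1)}


# Usage (hypothetical folder, point it at your own labelme output):
# counts = collect_labels("dataset/gps")
# print(counts)
# print(build_classname_to_id(counts))
```

A label with a surprisingly low count is often a typo made while annotating, so it is worth fixing in the .json files before converting.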

The following is the labelmetococo.py file code:

import os
import json
import numpy as np
import glob
import shutil
import cv2
from sklearn.model_selection import train_test_split

np.random.seed(41)

# 0 is reserved for the background.
# Note: list your own label names here, with ids starting from 1.
classname_to_id = {
    "class1": 1
}

class Lableme2CoCo:

    def __init__(self):
        self.images = []
        self.annotations = []
        self.categories = []
        self.img_id = 0
        self.ann_id = 0

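    def save_coco_json(self, instance, save_path):
        # indent=1 keeps the json readable; the with-block closes the file
        with open(save_path, 'w', encoding='utf-8') as f:
            json.dump(instance, f, ensure_ascii=False, indent=1)

    # Build the COCO structure from the labelme json files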
    def to_coco(self, json_path_list):
        self._init_categories()
        for json_path in json_path_list:
            obj = self.read_jsonfile(json_path)
            self.images.append(self._image(obj, json_path))
            shapes = obj['shapes']
            for shape in shapes:
                annotation = self._annotation(shape)
                self.annotations.append(annotation)
                self.ann_id += 1
            self.img_id += 1
        instance = {}
        instance['info'] = 'spytensor created'
        instance['license'] = ['license']
        instance['images'] = self.images
        instance['annotations'] = self.annotations
        instance['categories'] = self.categories
        return instance

    # Build the categories
    def _init_categories(self):
        for k, v in classname_to_id.items():
            category = {}
            category['id'] = v
            category['name'] = k
            self.categories.append(category)

    # Build the COCO "image" field
    def _image(self, obj, path):
        image = {}
        from labelme import utils
        img_x = utils.img_b64_to_arr(obj['imageData'])
        # shape[:2] also works for grayscale images, which have no channel axis
        h, w = img_x.shape[:2]
        image['height'] = h
        image['width'] = w
        image['id'] = self.img_id
        # The main block below saves every image as .png, so record that name
        image['file_name'] = os.path.basename(path).replace(".json", ".png")
        return image

    # Build the COCO "annotation" field
    def _annotation(self, shape):
        # print('shape', shape)
        label = shape['label']
        points = shape['points']
        annotation = {}
        annotation['id'] = self.ann_id
        annotation['image_id'] = self.img_id
        annotation['category_id'] = int(classname_to_id[label])
        annotation['segmentation'] = [np.asarray(points).flatten().tolist()]
        annotation['bbox'] = self._get_box(points)
        annotation['iscrowd'] = 0
        # Placeholder value: the real polygon area is not computed here
        annotation['area'] = 1.0
        return annotation

    # Read a json file and return the parsed object
    def read_jsonfile(self, path):
        with open(path, "r", encoding='utf-8') as f:
            return json.load(f)

    # COCO bbox format: [x1, y1, w, h]
    def _get_box(self, points):
        min_x = min_y = np.inf
        max_x = max_y = 0
        for x, y in points:
            min_x = min(min_x, x)
            min_y = min(min_y, y)
            max_x = max(max_x, x)
            max_y = max(max_y, y)
        return [min_x, min_y, max_x - min_x, max_y - min_y]


if __name__ == '__main__':

    # Change labelme_path to the folder holding your images and labelme .json files.
    labelme_path = "D:\\maskrcnn-benchmark-main\\dataset\\gps\\"
    # Change saved_coco_path to where the COCO data should be written;
    # a "coco" folder is created underneath it.
    saved_coco_path = "D:\\maskrcnn-benchmark-main\\dataset\\"
    print('reading...')
    # Create the output folders.
    coco_root = os.path.join(saved_coco_path, "coco")
    os.makedirs(os.path.join(coco_root, "annotations"), exist_ok=True)
    os.makedirs(os.path.join(coco_root, "images", "train2017"), exist_ok=True)
    os.makedirs(os.path.join(coco_root, "images", "val2017"), exist_ok=True)
    # List every labelme .json file in labelme_path.
    json_pattern = os.path.join(labelme_path, "*.json")
    print(json_pattern)
    json_list_path = glob.glob(json_pattern)
    print('json_list_path: ', len(json_list_path))
    # Split into training and validation sets at 9:1; adjust the ratio as needed.
    train_path, val_path = train_test_split(json_list_path, test_size=0.1, train_size=0.9)
    print("train_n:", len(train_path), 'val_n:', len(val_path))

    def copy_images(json_paths, split):
        # Save each annotated image as .png under coco/images/<split>.
        # The source images may be stored as .png or .jpg.
        for file in json_paths:
            stem = os.path.splitext(file)[0]
            temp_img = cv2.imread(stem + '.png')
            # A None result means there is no readable .png, so try the .jpg.
            if temp_img is None:
                temp_img = cv2.imread(stem + '.jpg')
            if temp_img is None:
                print('no image found for', file)
                continue
            filename = os.path.basename(stem) + '.png'
            cv2.imwrite(os.path.join(coco_root, "images", split, filename), temp_img)

    # Convert the training set to COCO json format.
    l2c_train = Lableme2CoCo()
    train_instance = l2c_train.to_coco(train_path)
    l2c_train.save_coco_json(
        train_instance, os.path.join(coco_root, 'annotations', 'instances_train2017.json'))
    copy_images(train_path, "train2017")

    # Convert the validation set to COCO json format.
    l2c_val = Lableme2CoCo()
    val_instance = l2c_val.to_coco(val_path)
    l2c_val.save_coco_json(
        val_instance, os.path.join(coco_root, 'annotations', 'instances_val2017.json'))
    copy_images(val_path, "val2017")

Run the program to generate the coco folder with the structure described above.
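The conversion script above stores a constant placeholder (1.0) in annotation['area']. COCO-style evaluation groups objects into small, medium, and large by that field, so a real value may matter for the reported metrics. The following is a minimal sketch of computing the area of a labelme polygon with the shoelace formula; `polygon_area` is a hypothetical helper, not part of the original script.

```python
def polygon_area(points):
    """Area of a simple polygon given as [[x, y], ...] (shoelace formula)."""
    n = len(points)
    if n < 3:
        # Fewer than three vertices do not enclose any area.
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


square = [[0, 0], [4, 0], [4, 3], [0, 3]]
print(polygon_area(square))  # 12.0
```

One way to use it would be to replace the placeholder with `polygon_area(points)` inside `_annotation`. Note that it assumes polygon shapes: a labelme rectangle stored as two corner points would yield 0.0 here.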

The source code needs to be changed:

Assuming you are currently in the maskrcnn-benchmark/ directory, the datasets folder is organized as follows:

datasets
-> coco
  -> annotations
    -> instances_train2014.json // training annotations
    -> instances_test2014.json  // validation annotations
  -> train2014 // training images
  -> val2014  // validation images

For convenience, it is recommended to follow the COCO naming shown above, but none of these names is mandatory: you can choose any reasonable names, as long as every data path in the program matches the actual files. Note that the conversion script above writes to coco/images/train2017 and coco/images/val2017 with instances_train2017.json and instances_val2017.json, so either move or rename its output to match this layout, or use those paths in the configuration.
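Since training fails as soon as an annotation points to an image that is not on disk, a quick consistency check can save time. The sketch below lists every file_name in a COCO json that is missing from the image folder; `find_missing_images` is a hypothetical helper and the paths in the usage comment are only examples.

```python
import json
import os


def find_missing_images(ann_file, img_dir):
    """Return the file names listed in a COCO json that are absent from img_dir."""
    with open(ann_file, "r", encoding="utf-8") as f:
        coco = json.load(f)
    return [img["file_name"] for img in coco["images"]
            if not os.path.isfile(os.path.join(img_dir, img["file_name"]))]


# Usage (example paths, adjust to your own layout):
# missing = find_missing_images(
#     "datasets/coco/annotations/instances_train2014.json",
#     "datasets/coco/train2014")
# print(len(missing), "missing images", missing[:5])
```

An empty list means the annotation file and the image folder agree; a mismatch in file extension (.jpg versus .png) shows up here immediately.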

  1. Take maskrcnn-benchmark/configs/e2e_mask_rcnn_R_50_FPN_1x.yaml as an example. The following settings must be modified:
  2. The first is the number of classes. If the YAML file does not set it, the program automatically falls back to the value in defaults.py.
  3. The second is DATASETS, which specifies the datasets used for this training run. For convenience I did not change it and simply reused the name coco_2014_train. The name can be chosen freely, but maskrcnn-benchmark/maskrcnn_benchmark/config/paths_catalog.py must contain a matching entry whose paths point to the files in datasets/coco/.
  4. Set BASE_LR, the number of iterations, and the checkpoint saving interval according to the needs of your own model.
  5. In the checkpoint file under maskrcnn_benchmark/utils, comment out lines 65-68 (self.optimizer.load_state.. self.scheduler.load_...) so that the optimizer and scheduler state stored in the pre-trained checkpoint is not loaded.
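For item 4, one common starting point is the linear scaling rule: when the batch size shrinks by some factor, divide the learning rate by that factor and multiply the iteration counts by it. The sketch below applies the rule to reference values that I believe correspond to the 1x schedule (16 images per batch, BASE_LR 0.02, MAX_ITER 90000, STEPS (60000, 80000)); treat these numbers as assumptions and check them against your own YAML file. `scale_schedule` is a hypothetical helper.

```python
def scale_schedule(ims_per_batch, ref_batch=16, ref_lr=0.02,
                   ref_max_iter=90000, ref_steps=(60000, 80000)):
    """Scale the learning rate and schedule linearly with the batch size."""
    factor = ref_batch / ims_per_batch
    return {
        "IMS_PER_BATCH": ims_per_batch,
        "BASE_LR": ref_lr / factor,
        "MAX_ITER": int(round(ref_max_iter * factor)),
        "STEPS": tuple(int(round(s * factor)) for s in ref_steps),
    }


# Example: a single GPU that fits 2 images per batch.
single_gpu = scale_schedule(2)
print(single_gpu)
```

The scaled iteration counts keep the number of images seen equal to the reference schedule, which is sized for the full COCO dataset. A small custom dataset usually needs far fewer iterations, so the result is an upper bound rather than a recommendation.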

Change the dataset paths in the paths_catalogs file under myconfig to your own.
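As a rough illustration of what such catalog entries look like, here is a standalone sketch modeled on my understanding of the dataset dictionary in paths_catalog.py. The key names `img_dir` and `ann_file`, the `DATA_DIR` constant, and the `resolve` helper are assumptions for illustration, so compare them with the file in your own checkout before editing it. The paths follow the datasets/coco/ layout shown earlier.

```python
import os

# Assumed root folder, relative to the maskrcnn-benchmark/ directory.
DATA_DIR = "datasets"

# One entry per dataset name used under DATASETS in the YAML config.
DATASETS = {
    "coco_2014_train": {
        "img_dir": "coco/train2014",
        "ann_file": "coco/annotations/instances_train2014.json",
    },
    "coco_2014_val": {
        "img_dir": "coco/val2014",
        "ann_file": "coco/annotations/instances_test2014.json",
    },
}


def resolve(name):
    """Return the image folder and annotation file for a dataset name."""
    attrs = DATASETS[name]
    return {
        "root": os.path.join(DATA_DIR, attrs["img_dir"]),
        "ann_file": os.path.join(DATA_DIR, attrs["ann_file"]),
    }


print(resolve("coco_2014_train"))
```

Whatever names you choose, the keys here must match the names listed under DATASETS in the YAML file, and both paths must exist on disk.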

maskrcnn-benchmark/maskrcnn_benchmark/config/defaults.py is the general default configuration file for the model; some values in it need to be modified:

_C.INPUT = CN()
# Size of the smallest side of the image during training
_C.INPUT.MIN_SIZE_TRAIN = (400,)  # (800,)
# Maximum size of the side of the image during training
_C.INPUT.MAX_SIZE_TRAIN = 667
# Size of the smallest side of the image during testing
_C.INPUT.MIN_SIZE_TEST = 400
# Maximum size of the side of the image during testing
_C.INPUT.MAX_SIZE_TEST = 667
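# The two settings below also need special attention: they must match your own classes. With a single class, use 2.
_C.MODEL.ROI_BOX_HEAD.NUM_CLASSES = 2  # number of classes + 1 (background)
_C.MODEL.RETINANET.NUM_CLASSES = 2  # number of classes + 1 (background)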
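To see how the MIN_SIZE and MAX_SIZE settings interact, the sketch below applies the usual shorter-side resizing rule: scale the image so its shorter side equals the minimum size, unless that would push the longer side past the maximum, in which case the scale is reduced. I believe this mirrors what the project's resize transform does, but treat it as an illustration; `resized_size` is a hypothetical helper.

```python
def resized_size(width, height, min_size=400, max_size=667):
    """Return (new_width, new_height) under the shorter-side resizing rule."""
    size = min_size
    short, long = float(min(width, height)), float(max(width, height))
    # Shrink the target if the longer side would exceed max_size.
    if long / short * size > max_size:
        size = int(round(max_size * short / long))
    if width < height:
        new_w, new_h = size, int(size * height / width)
    else:
        new_h, new_w = size, int(size * width / height)
    return new_w, new_h


print(resized_size(800, 600))    # (533, 400)
print(resized_size(1920, 1080))  # (666, 375)
```

With the values above, a 4:3 image is limited by the minimum size, while a 16:9 image is limited by the maximum size. Smaller values reduce GPU memory use at the cost of detail for small objects.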

Origin blog.csdn.net/qq_31807039/article/details/130575381