MMDetection fine-tunes the RTMDet model for instance segmentation tasks

foreword

  • In the previous blog, MMdetectionin addition to being suitable for target detection tasks, the framework can also do instance segmentation tasks.
  • However, MMdetectionthe tutorial file on the instance segmentation task in the official project of the framework will report an error during the actual operation due to the update of the framework version, cannot import name 'build_dataset' from 'mmdet.datasets'so this article is mainly a tutorial on instance segmentation for the new version of the framework
  • Because the data volume of the official balloondataset is too small, the dataset kaggleon the platform is used here, the data addressMotorcycle Night Ride
  • All the following codes are run on kagglethe platform, GPUfor P100the environment.

Environment configuration

import IPython.display as display

!pip install openmim
!mim install mmengine==0.7.2
!pip install -q /kaggle/input/frozen-packages-mmdetection/mmcv-2.0.1-cp310-cp310-linux_x86_64.whl

!rm -rf mmdetection
!git clone https://github.com/open-mmlab/mmdetection.git
!git clone https://github.com/open-mmlab/mmyolo.git
%cd mmdetection

!mkdir ./data
%pip install -e .

!pip install wandb
display.clear_output()
  • Since we are using wandbthe platform to visualize the training process, we also need to log in to wandbthe platform first.
import wandb
wandb.login()

Pre-trained model inference

  • RTMDet-lWe first download the model weights that need to be fine-tuned , and then perform inference on the test image. By the way, we can also check whether the environment is complete.
  • Regarding the model number, it can configs/rtmdetbe found under project files. But it should be noted that Readme.mdthere are two tables in the document are Object Detectionand Instance Segmentation, since it is an instance segmentation task, we should look up the model model in the `Instance Segmentation`` table
    insert image description here
!mkdir ./checkpoints
!mim download mmdet --config rtmdet-ins_l_8xb32-300e_coco --dest ./checkpoints
  • Model inference on test images
import mmcv
import mmengine
from mmdet.apis import init_detector, inference_detector
from mmdet.utils import register_all_modules

config_file = 'configs/rtmdet/rtmdet-ins_l_8xb32-300e_coco.py'

checkpoint_file = 'checkpoints/rtmdet-ins_l_8xb32-300e_coco_20221124_103237-78d1d652.pth'

register_all_modules()

model = init_detector(config_file, checkpoint_file, device='cuda:0')

image = mmcv.imread('demo/demo.jpg',channel_order='rgb')
result = inference_detector(model, image)

from mmdet.registry import VISUALIZERS
visualizer = VISUALIZERS.build(model.cfg.visualizer)
visualizer.dataset_meta = model.dataset_meta

visualizer.add_datasample('result',image,data_sample=result,draw_gt = None,wait_time=0,)

display.clear_output()
visualizer.show()

Please add a picture description

Data Exploration and Visualization

  • Because the name of the dataset is too long, copy the image folder and annotation file to a datasubfolder in the project folder here
import os
import shutil

def copy_files(src_folder, dest_folder):
    # 确保目标文件夹存在
    os.makedirs(dest_folder, exist_ok=True)

    # 遍历源文件夹中的所有内容
    for root, _, files in os.walk(src_folder):
        for file in files:
            # 拼接源文件的完整路径
            src_file_path = os.path.join(root, file)

            # 拼接目标文件的完整路径
            dest_file_path = os.path.join(dest_folder, os.path.relpath(src_file_path, src_folder))

            # 确保目标文件的文件夹存在
            os.makedirs(os.path.dirname(dest_file_path), exist_ok=True)

            # 复制文件
            shutil.copy(src_file_path, dest_file_path)

source_folder = '/kaggle/input/motorcycle-night-ride-semantic-segmentation/www.acmeai.tech ODataset 1 - Motorcycle Night Ride Dataset'
destination_folder = './data'
copy_files(source_folder, destination_folder)
  • Data visualization, since the visualized graph is already provided in the data set, we compare the original graph and the annotated graph together
import mmcv
import matplotlib.pyplot as plt

img_og = mmcv.imread('data/images/Screenshot (446).png')
img_fuse = mmcv.imread('data/images/Screenshot (446).png___fuse.png')

fig, axes = plt.subplots(1, 2, figsize=(15, 10))
axes[0].imshow(mmcv.bgr2rgb(img_og))
axes[0].set_title('Original Image')
axes[0].axis('off')

axes[1].imshow(mmcv.bgr2rgb(img_fuse))
axes[1].set_title('mask Image')
axes[1].axis('off')
plt.show()

Please add a picture description

  • Use pycocotoolsthe library to read the annotation file and output the category information
from pycocotools.coco import COCO

# 初始化COCO对象
coco = COCO('data/COCO_motorcycle (pixel).json')

# 获取所有的类别标签和对应的类别ID
categories = coco.loadCats(coco.getCatIds())
category_id_to_name = {
    
    cat['id']: cat['name'] for cat in categories}

display.clear_output()
# 打印所有类别ID和对应的类别名称
for category_id, category_name in category_id_to_name.items():
    print(f"Category ID: {
      
      category_id}, Category Name: {
      
      category_name}")
  • output:
Category ID: 1329681, Category Name: Rider
Category ID: 1323885, Category Name: My bike
Category ID: 1323884, Category Name: Moveable
Category ID: 1323882, Category Name: Lane Mark
Category ID: 1323881, Category Name: Road
Category ID: 1323880, Category Name: Undrivable
  • You can see that there are 6 categories in total, namely: Rider, My bike, Moveable, Lane Mark, Road, Undrivable

Modify the configuration file

  • For a detailed explanation of the configuration file, I have detailed instructions on the entire process of training and testing the MMDetection framework in the blog , and fine-tuning the Mask2Former model using MMSegmentation .
  • The main modification is the pre-training weight path, image path, annotation file path, batch_size, epochs, learning rate scaling,number of categories, multi-card to single-card (SyncBN --> BN),Category labels and palettes
  • Pay great attention to the two parameters of category number and category label and palette, otherwise an error will be reported class EpochBasedTrainLoop in mmengine/runner/loops.py: class CocoDataset in mmdet/datasets/coco.py: need at least one array to concatenate. This kind of error is very, very common. You must check whether the category number and label information are correct
from mmengine import Config
cfg = Config.fromfile('./configs/rtmdet/rtmdet-ins_l_8xb32-300e_coco.py')
from mmengine.runner import set_random_seed

cfg.load_from = 'checkpoints/rtmdet-ins_l_8xb32-300e_coco_20221124_103237-78d1d652.pth'

cfg.work_dir = './work_dir'

cfg.max_epochs = 100
cfg.stage2_num_epochs = 7
cfg.train_dataloader.batch_size = 4
cfg.train_dataloader.num_workers = 2

scale_factor = cfg.train_dataloader.batch_size / (8 * 32)

cfg.base_lr *= scale_factor
cfg.optim_wrapper.optimizer.lr = cfg.base_lr

# cfg.model.backbone.frozen_stages = 4
cfg.model.bbox_head.num_classes = 6

# 单卡训练时,需要把 SyncBN 改成 BN
cfg.norm_cfg = dict(type='BN', requires_grad=True)

cfg.metainfo = {
    
    
    'classes': ('Rider', 'My bike', 'Moveable', 'Lane Mark', 'Road', 'Undrivable', ),
    'palette': [
        (141, 211, 197),(255, 255, 179),(190, 186, 219),(245, 132, 109),(127, 179, 209),(251, 180, 97),
    ]
}

cfg.data_root = './data'

cfg.train_dataloader.dataset.ann_file = 'COCO_motorcycle (pixel).json'
cfg.train_dataloader.dataset.data_root = cfg.data_root
cfg.train_dataloader.dataset.data_prefix.img = 'images/'
cfg.train_dataloader.dataset.metainfo = cfg.metainfo


cfg.val_dataloader.dataset.ann_file = 'COCO_motorcycle (pixel).json'
cfg.val_dataloader.dataset.data_root = cfg.data_root
cfg.val_dataloader.dataset.data_prefix.img = 'images/'
cfg.val_dataloader.dataset.metainfo = cfg.metainfo


cfg.test_dataloader = cfg.val_dataloader

cfg.val_evaluator.ann_file = cfg.data_root+'/'+'COCO_motorcycle (pixel).json'
cfg.val_evaluator.metric = ['segm']

cfg.test_evaluator = cfg.val_evaluator

cfg.default_hooks.checkpoint = dict(type='CheckpointHook', interval=10, max_keep_ckpts=2, save_best='auto')
cfg.default_hooks.logger.interval = 20

cfg.custom_hooks[1].switch_epoch = 300 - cfg.stage2_num_epochs

cfg.train_cfg.max_epochs = cfg.max_epochs
cfg.train_cfg.val_begin = 20
cfg.train_cfg.val_interval = 2
cfg.train_cfg.dynamic_intervals = [(300 - cfg.stage2_num_epochs, 1)]

# cfg.train_dataloader.dataset = dict(dict(type='RepeatDataset',times=5,dataset = cfg.train_dataloader.dataset))

cfg.param_scheduler[0].end = 100

cfg.param_scheduler[1].eta_min = cfg.base_lr * 0.05
cfg.param_scheduler[1].begin = cfg.max_epochs // 2
cfg.param_scheduler[1].end = cfg.max_epochs
cfg.param_scheduler[1].T_max = cfg.max_epochs //2

set_random_seed(0, deterministic=False)

cfg.visualizer.vis_backends.append({
    
    "type":'WandbVisBackend'})

#------------------------------------------------------
config=f'./configs/rtmdet/rtmdet-ins_l_1xb4-100e_motorcycle.py'
with open(config, 'w') as f:
    f.write(cfg.pretty_text)
  • start training
!python tools/train.py {
    
    config}
  • Shows the value of the indicator with the best performance of the model
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.561
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.758
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.614
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.017
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.195
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.543
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.645
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.649
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.036
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.246
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.721
07/23 19:31:52 - mmengine - INFO - segm_mAP_copypaste: 0.561 0.758 0.614 0.017 0.195 0.633
07/23 19:31:52 - mmengine - INFO - Epoch(val) [98][40/40]    coco/segm_mAP: 0.5610  coco/segm_mAP_50: 0.7580  coco/segm_mAP_75: 0.6140  coco/segm_mAP_s: 0.0170  coco/segm_mAP_m: 0.1950  coco/segm_mAP_l: 0.6330  data_time: 0.0491  time: 3.1246

Visualize the training process

  • We can log in to wandbthe platform to view the indicator changes during the training process

insert image description here
insert image description here

  • It can be seen segm_mAPthat the value is still rising. Due to time constraints, I only ran 100. epochIf you try it yourself, you can try to run 300. It is estimated that the effect will be better.

Inference on test images

  • After training is complete, we load the best performing model and perform inference on test images
from mmengine.visualization import Visualizer
import mmcv
from mmdet.apis import init_detector, inference_detector
import glob

img = mmcv.imread('data/images/Screenshot (446).png',channel_order='rgb')

checkpoint_file = glob.glob('./work_dir/best_coco_segm_mAP*.pth')[0]

model = init_detector(cfg, checkpoint_file, device='cuda:0')

new_result = inference_detector(model, img)

visualizer_now = Visualizer.get_current_instance()

visualizer_now.dataset_meta = model.dataset_meta

visualizer_now.add_datasample('new_result', img, data_sample=new_result, draw_gt=False, wait_time=0, out_file=None, pred_score_thr=0.5)

visualizer_now.show()

Please add a picture description

Troubleshooting

  • The official wrote a document for troubleshooting some very common problems, I will put the address here
  • The most frequent error I encountered in the process of running is that it valueerror-need-at-least-one-array-to-concatenateis also explained in the official troubleshooting manual above. To check the number of categories and tags and palettes, but I checked and found that there is no such problem, and finally found that there is another factor that may cause this error
  • Due to the update of the framework,Labels and PalettesThe way of writing has changed, and the wrong way of writing is:
cfg.metainfo = {
    
    
    'CLASSES': ('Rider', 'My bike', 'Moveable', 'Lane Mark', 'Road', 'Undrivable', ),
    'PALETTE': [
        (141, 211, 197),(255, 255, 179),(190, 186, 219),(245, 132, 109),(127, 179, 209),(251, 180, 97),
    ]
}
  • The uppercase CLASSESand PALETTEare no longer applicable in the new version. If you want to change it to lowercase, it will not be saved
cfg.metainfo = {
    
    
    'classes': ('Rider', 'My bike', 'Moveable', 'Lane Mark', 'Road', 'Undrivable', ),
    'palette': [
        (141, 211, 197),(255, 255, 179),(190, 186, 219),(245, 132, 109),(127, 179, 209),(251, 180, 97),
    ]
}

Guess you like

Origin blog.csdn.net/qq_20144897/article/details/131887980