Foreword
- As covered in the previous blog post, the MMDetection framework is suitable not only for object detection tasks but also for instance segmentation tasks.
- However, because of a framework version update, the instance segmentation tutorial file in the official MMDetection project fails when actually run, with the error: cannot import name 'build_dataset' from 'mmdet.datasets'. This article is therefore a tutorial on instance segmentation with the new version of the framework.
- Because the official balloon dataset is too small, a dataset from the Kaggle platform is used here: Motorcycle Night Ride.
- All of the following code was run on the Kaggle platform with a P100 GPU.
Environment configuration
- For a detailed description of this part, see the environment configuration section of the blog post "MMDetection framework training and testing process"; I won't repeat the details here
import IPython.display as display
!pip install openmim
!mim install mmengine==0.7.2
!pip install -q /kaggle/input/frozen-packages-mmdetection/mmcv-2.0.1-cp310-cp310-linux_x86_64.whl
!rm -rf mmdetection
!git clone https://github.com/open-mmlab/mmdetection.git
!git clone https://github.com/open-mmlab/mmyolo.git
%cd mmdetection
!mkdir ./data
%pip install -e .
!pip install wandb
display.clear_output()
- Since we use the wandb platform to visualize the training process, we also need to log in to wandb first.
import wandb
wandb.login()
Pre-trained model inference
- We first download the RTMDet-l model weights that will be fine-tuned, and then run inference on a test image. This also lets us check that the environment is complete.
- The model name can be found under configs/rtmdet in the project files. Note that the README.md there contains two tables, Object Detection and Instance Segmentation. Since this is an instance segmentation task, we should look up the model in the Instance Segmentation table.
!mkdir ./checkpoints
!mim download mmdet --config rtmdet-ins_l_8xb32-300e_coco --dest ./checkpoints
- Run model inference on the test image
import mmcv
import mmengine
from mmdet.apis import init_detector, inference_detector
from mmdet.utils import register_all_modules
config_file = 'configs/rtmdet/rtmdet-ins_l_8xb32-300e_coco.py'
checkpoint_file = 'checkpoints/rtmdet-ins_l_8xb32-300e_coco_20221124_103237-78d1d652.pth'
register_all_modules()
model = init_detector(config_file, checkpoint_file, device='cuda:0')
image = mmcv.imread('demo/demo.jpg',channel_order='rgb')
result = inference_detector(model, image)
from mmdet.registry import VISUALIZERS
visualizer = VISUALIZERS.build(model.cfg.visualizer)
visualizer.dataset_meta = model.dataset_meta
visualizer.add_datasample('result',image,data_sample=result,draw_gt = None,wait_time=0,)
display.clear_output()
visualizer.show()
Data Exploration and Visualization
- Because the dataset name is too long, we copy the image folder and the annotation file to the data subfolder of the project folder
import os
import shutil
def copy_files(src_folder, dest_folder):
    # Make sure the destination folder exists
    os.makedirs(dest_folder, exist_ok=True)
    # Walk through everything in the source folder
    for root, _, files in os.walk(src_folder):
        for file in files:
            # Full path of the source file
            src_file_path = os.path.join(root, file)
            # Full path of the destination file
            dest_file_path = os.path.join(dest_folder, os.path.relpath(src_file_path, src_folder))
            # Make sure the destination file's folder exists
            os.makedirs(os.path.dirname(dest_file_path), exist_ok=True)
            # Copy the file
            shutil.copy(src_file_path, dest_file_path)
source_folder = '/kaggle/input/motorcycle-night-ride-semantic-segmentation/www.acmeai.tech ODataset 1 - Motorcycle Night Ride Dataset'
destination_folder = './data'
copy_files(source_folder, destination_folder)
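For reference, on Python 3.8 and later the same recursive copy can probably be done with the standard library's shutil.copytree and dirs_exist_ok=True. This is only a minimal sketch, not the code used in this article. The helper name copy_tree, the temporary directories, and the file name frame.png are made up for the demonstration. Note that copytree copies with shutil.copy2 by default, so it also preserves file metadata, unlike shutil.copy above.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tree(src_folder: str, dest_folder: str) -> None:
    """Recursively copy src_folder into dest_folder, keeping the sub-folder layout."""
    # dirs_exist_ok=True (Python 3.8+) allows dest_folder to exist already.
    shutil.copytree(src_folder, dest_folder, dirs_exist_ok=True)


# Demonstration on throwaway directories (hypothetical file names).
src_dir = Path(tempfile.mkdtemp())
dest_dir = Path(tempfile.mkdtemp())
(src_dir / "images").mkdir()
(src_dir / "images" / "frame.png").write_bytes(b"fake image bytes")

copy_tree(str(src_dir), str(dest_dir))

# List the copied files relative to the destination folder.
copied_files = sorted(
    p.relative_to(dest_dir).as_posix() for p in dest_dir.rglob("*") if p.is_file()
)
print(copied_files)
```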
- Data visualization: since the dataset already provides visualized (fused) images, we show the original image and the annotated image side by side
import mmcv
import matplotlib.pyplot as plt
img_og = mmcv.imread('data/images/Screenshot (446).png')
img_fuse = mmcv.imread('data/images/Screenshot (446).png___fuse.png')
fig, axes = plt.subplots(1, 2, figsize=(15, 10))
axes[0].imshow(mmcv.bgr2rgb(img_og))
axes[0].set_title('Original Image')
axes[0].axis('off')
axes[1].imshow(mmcv.bgr2rgb(img_fuse))
axes[1].set_title('mask Image')
axes[1].axis('off')
plt.show()
- Use the pycocotools library to read the annotation file and output the category information
from pycocotools.coco import COCO
# Initialize the COCO object
coco = COCO('data/COCO_motorcycle (pixel).json')
# Get all category labels and their category IDs
categories = coco.loadCats(coco.getCatIds())
category_id_to_name = {cat['id']: cat['name'] for cat in categories}
display.clear_output()
# Print every category ID and its category name
for category_id, category_name in category_id_to_name.items():
    print(f"Category ID: {category_id}, Category Name: {category_name}")
- Output:
Category ID: 1329681, Category Name: Rider
Category ID: 1323885, Category Name: My bike
Category ID: 1323884, Category Name: Moveable
Category ID: 1323882, Category Name: Lane Mark
Category ID: 1323881, Category Name: Road
Category ID: 1323880, Category Name: Undrivable
- You can see that there are 6 categories in total, namely: Rider, My bike, Moveable, Lane Mark, Road, Undrivable
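The category IDs printed above are large and not contiguous, while a model with 6 classes predicts label indices 0 to 5. As far as I understand, COCO-style dataset loaders bridge this gap by matching category names against the configured class list. The sketch below illustrates that idea in plain Python. build_cat2label is a hypothetical helper written for this article, not an MMDetection function, and the real loader may differ in details.

```python
# Category records as printed above, in the same order.
categories = [
    {"id": 1329681, "name": "Rider"},
    {"id": 1323885, "name": "My bike"},
    {"id": 1323884, "name": "Moveable"},
    {"id": 1323882, "name": "Lane Mark"},
    {"id": 1323881, "name": "Road"},
    {"id": 1323880, "name": "Undrivable"},
]

# Class names in the order we want the model's label indices to follow.
classes = ("Rider", "My bike", "Moveable", "Lane Mark", "Road", "Undrivable")


def build_cat2label(categories, classes):
    """Map each annotation category ID to a contiguous label index (hypothetical helper)."""
    name_to_id = {cat["name"]: cat["id"] for cat in categories}
    missing = [name for name in classes if name not in name_to_id]
    if missing:
        # A class name that is absent from the annotations would match no instances.
        raise ValueError(f"class names not found in annotations: {missing}")
    return {name_to_id[name]: label for label, name in enumerate(classes)}


cat2label = build_cat2label(categories, classes)
for cat_id, label in cat2label.items():
    print(cat_id, "->", label)
```

If a class name is misspelled, the helper raises instead of silently matching nothing, which is the kind of mismatch worth ruling out before training.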
Modify the configuration file
- I explain the configuration file in detail in my earlier blog posts on the entire training and testing process of the MMDetection framework and on fine-tuning the Mask2Former model using MMSegmentation.
- The main modifications are the pre-trained weight path, image path, annotation file path, batch_size, epochs, learning rate scaling, number of categories, multi-GPU to single-GPU (SyncBN --> BN), and category labels and palette.
- Pay close attention to the number of categories and to the category labels and palette, otherwise this error will be reported: class EpochBasedTrainLoop in mmengine/runner/loops.py: class CocoDataset in mmdet/datasets/coco.py: need at least one array to concatenate. This error is very common. Always check that the number of categories and the label information are correct.
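The learning rate scaling mentioned above follows the linear scaling rule: the learning rate is multiplied by the ratio of the new total batch size to the one the original config was tuned for (8 GPUs x 32 images, as the 8xb32 part of the config name suggests). Here is a minimal sketch of the arithmetic. The value 0.004 for the base learning rate is an assumption for illustration only; read the real value from cfg.base_lr.

```python
def scale_lr(base_lr: float, batch_size: int, reference_batch_size: int) -> float:
    """Linear scaling rule: keep the learning rate proportional to the total batch size."""
    return base_lr * batch_size / reference_batch_size


reference_batch_size = 8 * 32  # 8 GPUs x 32 images per GPU in the original config
assumed_base_lr = 0.004        # ASSUMPTION for illustration; use cfg.base_lr in practice

scaled_lr = scale_lr(assumed_base_lr, batch_size=4, reference_batch_size=reference_batch_size)
print(f"scale factor: {4 / reference_batch_size}")
print(f"scaled learning rate: {scaled_lr:.3e}")
```

With a batch size of 4 the scale factor is 4 / 256 = 1/64, so under the assumed base value the learning rate drops to roughly 6.25e-5.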
from mmengine import Config
cfg = Config.fromfile('./configs/rtmdet/rtmdet-ins_l_8xb32-300e_coco.py')
from mmengine.runner import set_random_seed
cfg.load_from = 'checkpoints/rtmdet-ins_l_8xb32-300e_coco_20221124_103237-78d1d652.pth'
cfg.work_dir = './work_dir'
cfg.max_epochs = 100
cfg.stage2_num_epochs = 7
cfg.train_dataloader.batch_size = 4
cfg.train_dataloader.num_workers = 2
scale_factor = cfg.train_dataloader.batch_size / (8 * 32)
cfg.base_lr *= scale_factor
cfg.optim_wrapper.optimizer.lr = cfg.base_lr
# cfg.model.backbone.frozen_stages = 4
cfg.model.bbox_head.num_classes = 6
# For single-GPU training, SyncBN needs to be changed to BN
cfg.norm_cfg = dict(type='BN', requires_grad=True)
cfg.metainfo = {
'classes': ('Rider', 'My bike', 'Moveable', 'Lane Mark', 'Road', 'Undrivable', ),
'palette': [
(141, 211, 197),(255, 255, 179),(190, 186, 219),(245, 132, 109),(127, 179, 209),(251, 180, 97),
]
}
cfg.data_root = './data'
cfg.train_dataloader.dataset.ann_file = 'COCO_motorcycle (pixel).json'
cfg.train_dataloader.dataset.data_root = cfg.data_root
cfg.train_dataloader.dataset.data_prefix.img = 'images/'
cfg.train_dataloader.dataset.metainfo = cfg.metainfo
cfg.val_dataloader.dataset.ann_file = 'COCO_motorcycle (pixel).json'
cfg.val_dataloader.dataset.data_root = cfg.data_root
cfg.val_dataloader.dataset.data_prefix.img = 'images/'
cfg.val_dataloader.dataset.metainfo = cfg.metainfo
cfg.test_dataloader = cfg.val_dataloader
cfg.val_evaluator.ann_file = cfg.data_root+'/'+'COCO_motorcycle (pixel).json'
cfg.val_evaluator.metric = ['segm']
cfg.test_evaluator = cfg.val_evaluator
cfg.default_hooks.checkpoint = dict(type='CheckpointHook', interval=10, max_keep_ckpts=2, save_best='auto')
cfg.default_hooks.logger.interval = 20
cfg.custom_hooks[1].switch_epoch = 300 - cfg.stage2_num_epochs
cfg.train_cfg.max_epochs = cfg.max_epochs
cfg.train_cfg.val_begin = 20
cfg.train_cfg.val_interval = 2
cfg.train_cfg.dynamic_intervals = [(300 - cfg.stage2_num_epochs, 1)]
# cfg.train_dataloader.dataset = dict(dict(type='RepeatDataset',times=5,dataset = cfg.train_dataloader.dataset))
cfg.param_scheduler[0].end = 100
cfg.param_scheduler[1].eta_min = cfg.base_lr * 0.05
cfg.param_scheduler[1].begin = cfg.max_epochs // 2
cfg.param_scheduler[1].end = cfg.max_epochs
cfg.param_scheduler[1].T_max = cfg.max_epochs //2
set_random_seed(0, deterministic=False)
cfg.visualizer.vis_backends.append({"type": 'WandbVisBackend'})
#------------------------------------------------------
config = './configs/rtmdet/rtmdet-ins_l_1xb4-100e_motorcycle.py'
with open(config, 'w') as f:
    f.write(cfg.pretty_text)
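One detail worth checking in the configuration above: switch_epoch and dynamic_intervals are computed from a hard-coded 300, which gives 300 - 7 = 293, so with max_epochs = 100 that epoch is never reached. If the intent is to switch for the last stage2_num_epochs epochs, these values could be derived from max_epochs instead. The sketch below only shows the arithmetic. derive_schedule is a hypothetical helper, not part of MMDetection, and the metrics reported in this article come from the configuration exactly as written above.

```python
def derive_schedule(max_epochs: int, stage2_num_epochs: int) -> dict:
    """Derive epoch-dependent settings from max_epochs (hypothetical helper)."""
    if not 0 < stage2_num_epochs < max_epochs:
        raise ValueError("stage2_num_epochs must be between 1 and max_epochs - 1")
    switch_epoch = max_epochs - stage2_num_epochs
    return {
        # Epoch at which the second-stage pipeline would start.
        "switch_epoch": switch_epoch,
        # Validate every epoch once the second stage has started.
        "dynamic_intervals": [(switch_epoch, 1)],
        # Cosine decay over the second half of training, as in the config above.
        "cosine_begin": max_epochs // 2,
        "cosine_end": max_epochs,
        "cosine_T_max": max_epochs // 2,
    }


schedule = derive_schedule(max_epochs=100, stage2_num_epochs=7)
for key, value in schedule.items():
    print(key, "=", value)
```

The resulting values could then be assigned to the corresponding cfg fields before the config file is written.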
- Start training
!python tools/train.py {config}
- The metric values of the best-performing model are shown below
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.561
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.758
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.614
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.017
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.195
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.543
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.645
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.649
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.036
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.246
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.721
07/23 19:31:52 - mmengine - INFO - segm_mAP_copypaste: 0.561 0.758 0.614 0.017 0.195 0.633
07/23 19:31:52 - mmengine - INFO - Epoch(val) [98][40/40] coco/segm_mAP: 0.5610 coco/segm_mAP_50: 0.7580 coco/segm_mAP_75: 0.6140 coco/segm_mAP_s: 0.0170 coco/segm_mAP_m: 0.1950 coco/segm_mAP_l: 0.6330 data_time: 0.0491 time: 3.1246
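To compare runs, it can be convenient to turn the segm_mAP_copypaste line from the log above into named values. The following is a small sketch using only the standard library. parse_copypaste is a hypothetical helper, and the metric order is taken from the Epoch(val) line above.

```python
import re

# Order of the six values, as they appear in the Epoch(val) log line.
METRIC_NAMES = ("mAP", "mAP_50", "mAP_75", "mAP_s", "mAP_m", "mAP_l")


def parse_copypaste(log_line: str) -> dict:
    """Extract the six segm mAP values from a 'segm_mAP_copypaste' log line."""
    match = re.search(r"segm_mAP_copypaste:\s*(.+)$", log_line.strip())
    if match is None:
        raise ValueError("no segm_mAP_copypaste entry in this line")
    values = [float(v) for v in match.group(1).split()]
    if len(values) != len(METRIC_NAMES):
        raise ValueError(f"expected {len(METRIC_NAMES)} values, got {len(values)}")
    return dict(zip(METRIC_NAMES, values))


line = ("07/23 19:31:52 - mmengine - INFO - "
        "segm_mAP_copypaste: 0.561 0.758 0.614 0.017 0.195 0.633")
metrics = parse_copypaste(line)
print(metrics)
```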
Visualize the training process
- We can log in to the wandb platform to view how the metrics change during training
- The segm_mAP value is still rising at the end of training. Due to time constraints, I only ran 100 epochs. If you try it yourself, you can run 300 epochs; the results will probably be better.
Inference on test images
- After training is complete, we load the best performing model and perform inference on test images
from mmengine.visualization import Visualizer
import mmcv
from mmdet.apis import init_detector, inference_detector
import glob
img = mmcv.imread('data/images/Screenshot (446).png',channel_order='rgb')
checkpoint_file = glob.glob('./work_dir/best_coco_segm_mAP*.pth')[0]
model = init_detector(cfg, checkpoint_file, device='cuda:0')
new_result = inference_detector(model, img)
visualizer_now = Visualizer.get_current_instance()
visualizer_now.dataset_meta = model.dataset_meta
visualizer_now.add_datasample('new_result', img, data_sample=new_result, draw_gt=False, wait_time=0, out_file=None, pred_score_thr=0.5)
visualizer_now.show()
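The glob call above takes the first match and raises an IndexError if no checkpoint exists yet. A slightly more defensive variant, sketched below, picks the most recently modified match and fails with a clearer message. find_best_checkpoint is a hypothetical helper, and the two file names in the demonstration are made up.

```python
import glob
import os
import tempfile


def find_best_checkpoint(work_dir: str, pattern: str = "best_coco_segm_mAP*.pth") -> str:
    """Return the most recently modified checkpoint matching pattern in work_dir."""
    candidates = glob.glob(os.path.join(work_dir, pattern))
    if not candidates:
        raise FileNotFoundError(f"no checkpoint matching {pattern!r} in {work_dir!r}")
    return max(candidates, key=os.path.getmtime)


# Demonstration with two empty placeholder files (hypothetical names).
demo_dir = tempfile.mkdtemp()
older = os.path.join(demo_dir, "best_coco_segm_mAP_epoch_90.pth")
newer = os.path.join(demo_dir, "best_coco_segm_mAP_epoch_98.pth")
for path, mtime in ((older, 1_000_000_000), (newer, 1_000_000_100)):
    with open(path, "wb") as f:
        f.write(b"")
    # Set explicit modification times so the ordering is deterministic.
    os.utime(path, (mtime, mtime))

best = find_best_checkpoint(demo_dir)
print(os.path.basename(best))
```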
Troubleshooting
- The official project provides a troubleshooting document for some very common problems; I put the address here
- The error I hit most often while running was valueerror-need-at-least-one-array-to-concatenate, which is also explained in the official troubleshooting manual above: check the number of categories and the labels and palette. I checked and found no such problem, and finally found another factor that can cause this error
- Due to the framework update, the way labels and palettes are written has changed. The wrong way of writing is:
cfg.metainfo = {
'CLASSES': ('Rider', 'My bike', 'Moveable', 'Lane Mark', 'Road', 'Undrivable', ),
'PALETTE': [
(141, 211, 197),(255, 255, 179),(190, 186, 219),(245, 132, 109),(127, 179, 209),(251, 180, 97),
]
}
- The uppercase CLASSES and PALETTE no longer apply in the new version. Change them to lowercase and the error is no longer reported:
cfg.metainfo = {
'classes': ('Rider', 'My bike', 'Moveable', 'Lane Mark', 'Road', 'Undrivable', ),
'palette': [
(141, 211, 197),(255, 255, 179),(190, 186, 219),(245, 132, 109),(127, 179, 209),(251, 180, 97),
]
}
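Since this mistake is easy to make and the resulting error message does not point at it, a quick sanity check before training may help. The sketch below is plain Python under my own assumptions. check_metainfo is a hypothetical helper, not an MMDetection API; it only checks for the lowercase keys, the class count, and that every class name exists in the annotation file.

```python
def check_metainfo(metainfo, annotation_class_names, num_classes):
    """Return a list of problems found in metainfo; an empty list means none were found."""
    problems = []
    for key in ("classes", "palette"):
        if key not in metainfo:
            problems.append(f"missing lowercase key '{key}'")
    classes = tuple(metainfo.get("classes", ()))
    palette = list(metainfo.get("palette", []))
    if classes and len(classes) != num_classes:
        problems.append(f"{len(classes)} class names but num_classes is {num_classes}")
    unknown = [name for name in classes if name not in annotation_class_names]
    if unknown:
        problems.append(f"class names not in the annotation file: {unknown}")
    if classes and palette and len(palette) != len(classes):
        problems.append(f"{len(palette)} palette colors for {len(classes)} classes")
    return problems


class_names = ("Rider", "My bike", "Moveable", "Lane Mark", "Road", "Undrivable")
colors = [(141, 211, 197), (255, 255, 179), (190, 186, 219),
          (245, 132, 109), (127, 179, 209), (251, 180, 97)]

old_style = {"CLASSES": class_names, "PALETTE": colors}  # uppercase keys
new_style = {"classes": class_names, "palette": colors}  # lowercase keys

old_problems = check_metainfo(old_style, set(class_names), num_classes=6)
new_problems = check_metainfo(new_style, set(class_names), num_classes=6)
print("old style:", old_problems)
print("new style:", new_problems)
```

In practice, annotation_class_names could be built from the category_id_to_name dictionary printed earlier.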