mmdetection model training tips

1. Pre-trained model

      For detection, the usual starting point is a backbone pre-trained on ImageNet. This is the basic configuration, and mmdetection officially supports loading weights this way.

      A more advanced option is to pre-train on your own data set: crop out every annotated target and train a good classification model on the crops. Initializing the backbone from this model typically works much better than initializing from ImageNet weights.
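
The cropping step can be sketched as follows. This is only an illustration, assuming COCO-style annotations (a `bbox` given as `[x, y, w, h]`); the function name `crop_boxes` and the `pad` parameter are hypothetical and not part of mmdetection.

```python
def crop_boxes(annotations, image_sizes, pad=0.1):
    """Turn COCO-style annotations into clamped crop rectangles.

    annotations: list of dicts with "image_id", "category_id", "bbox" = [x, y, w, h]
    image_sizes: dict mapping image_id -> (width, height)
    pad: fraction of the box size added as context on each side (assumed value)
    Returns a list of (image_id, category_id, (left, top, right, bottom)).
    """
    crops = []
    for ann in annotations:
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            continue  # skip degenerate boxes
        img_w, img_h = image_sizes[ann["image_id"]]
        # Expand the box by the padding and clamp it to the image borders
        left = max(0, int(x - pad * w))
        top = max(0, int(y - pad * h))
        right = min(img_w, int(x + w + pad * w + 0.5))
        bottom = min(img_h, int(y + h + pad * h + 0.5))
        crops.append((ann["image_id"], ann["category_id"], (left, top, right, bottom)))
    return crops
```

Each rectangle can then be cut out with an image library and saved into a per-category folder, which gives a classification data set for pre-training the backbone.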

      The last option is to start from the weights of a complete detection model pre-trained on COCO. The model converges quickly and the results are generally better, so this is also the most commonly used method. Because every task has its own set of categories, the classification layers of the COCO weights have to be resized to the new number of classes before fine-tuning. Here is a script that modifies the number of categories in COCO pre-trained weights for mmdetection.

      The script takes Cascade R-CNN as an example; other models can be modified in a similar way.

# for cascade rcnn
import torch

# number of classes in the new task
num_classes = 21
model_coco = torch.load(
    "cascade_rcnn_x101_32x4d_fpn_2x_20181218-28f73c4c.pth", map_location="cpu")

# Cascade R-CNN has three stages, each with its own classification layer.
# resize_ shrinks each tensor in place and keeps its leading elements.
for stage in range(3):
    prefix = "bbox_head.%d.fc_cls" % stage
    # weight: (num_classes, 1024)
    model_coco["state_dict"][prefix + ".weight"].resize_(num_classes, 1024)
    # bias: (num_classes,)
    model_coco["state_dict"][prefix + ".bias"].resize_(num_classes)

# save the new model
torch.save(model_coco, "coco_pretrained_weights_classes_%d.pth" % num_classes)

2. Soft-NMS

    Soft-NMS softens the hard suppression of standard NMS. When the IoU between a box and a higher-scoring box exceeds a threshold, the box is no longer deleted outright; instead, its confidence score is lowered. If the lowered score falls below a score threshold, the box is discarded; otherwise it is kept.

    The settings in mmdetection are as follows:

test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05, nms=dict(type='soft_nms', iou_thr=0.5), max_per_img=100),
    keep_all_stages=False)
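
To make the score decay concrete, here is a minimal pure-Python sketch of the linear variant of Soft-NMS. It is not mmdetection's implementation; the function names and the `min_score` default are assumptions chosen for illustration, and boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms_linear(boxes, scores, iou_thr=0.5, min_score=0.05):
    """Linear Soft-NMS. Returns (box_index, final_score) pairs in selection order."""
    remaining = [[i, s] for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the remaining box with the highest current score
        best = max(range(len(remaining)), key=lambda k: remaining[k][1])
        idx, score = remaining.pop(best)
        if score < min_score:
            break  # everything left is below the score threshold
        kept.append((idx, score))
        for item in remaining:
            overlap = iou(boxes[idx], boxes[item[0]])
            if overlap > iou_thr:
                # Hard NMS would delete the box here; Soft-NMS lowers its score
                item[1] *= 1.0 - overlap
        # Drop boxes whose decayed score fell below the threshold
        remaining = [it for it in remaining if it[1] >= min_score]
    return kept
```

With hard NMS, a box that overlaps a stronger box by more than `iou_thr` is always lost. Here it survives with a reduced score unless that score drops below `min_score`, which helps when two real objects overlap heavily.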

3. GIoULoss

      In most cases, using GIoULoss instead of L1Loss for box regression improves accuracy.

      The configuration file for the original version (using L1Loss) is as follows:

    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', out_size=7, sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=10,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='L1Loss', loss_weight=1.0))))

    The configuration file after adding GIoULoss is as follows:

    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        reg_decoded_bbox=True,      # add this when using GIoULoss
        loss_bbox=dict(type='GIoULoss', loss_weight=5.0)),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', out_size=7, sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=10,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            reg_decoded_bbox=True,     # add this when using GIoULoss
            loss_bbox=dict(type='GIoULoss', loss_weight=5.0))))
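
For reference, the quantity behind GIoULoss can be sketched in a few lines of plain Python. This is an illustration rather than mmdetection's implementation: the function names are hypothetical, and boxes are assumed to be `(x1, y1, x2, y2)` tuples with positive area. It also shows why `reg_decoded_bbox=True` is needed: the loss is computed on box coordinates, not on encoded deltas.

```python
def giou(a, b):
    """Generalized IoU of two boxes (x1, y1, x2, y2) with positive area."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    # Intersection and union, as in plain IoU
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both boxes
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    enclose = cw * ch
    # IoU minus the share of the enclosing box not covered by the union
    return inter / union - (enclose - union) / enclose


def giou_loss(pred, target):
    """Loss is 0 for a perfect match and grows as the boxes move apart."""
    return 1.0 - giou(pred, target)
```

Unlike plain IoU, GIoU still changes when the two boxes do not overlap at all, so the loss keeps giving a useful signal for badly placed predictions.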

4. Model slimming tips

      When mmdetection saves a model, the checkpoint contains not only the weights but also metadata and the optimizer state. None of this extra data is needed when the model is tested. Removing it shrinks the checkpoint by about 50%. See the code below:

import torch

model_path = "epoch_30.pth"
checkpoint = torch.load(model_path, map_location="cpu")

# Keep only the model weights; the 'meta' and 'optimizer' entries
# are not needed for testing and are dropped.
state_dict = {"state_dict": checkpoint["state_dict"]}

torch.save(state_dict, "./epoch_30_new.pth")

5. Online Hard Example Mining (OHEM)

   Online hard example mining selects difficult samples (those with a larger loss) on the fly during training and trains on them.

   The idea is relatively simple. In mmdetection it is applied as follows, taking Faster R-CNN as an example:

_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
train_cfg = dict(rcnn=dict(sampler=dict(type='OHEMSampler')))

   The first line inherits the configuration file of the model you are training, and the second line sets the RCNN sampling method to online hard example mining.
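
The selection step itself can be sketched as below. This only illustrates the idea of keeping the samples with the largest loss; it is not how OHEMSampler is implemented, and the function name is hypothetical.

```python
def select_hard_examples(losses, num_keep):
    """Return the indices of the num_keep samples with the largest loss."""
    # Rank sample indices by loss, largest first
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    # Keep the hardest ones, reported in their original order
    return sorted(order[:num_keep])


# Per-sample losses from one forward pass (made-up numbers)
losses = [0.1, 2.0, 0.5, 1.5, 0.05]
hard = select_hard_examples(losses, num_keep=2)
# Only the selected samples contribute to the training loss
hard_loss = sum(losses[i] for i in hard) / len(hard)
```

Here the two samples with losses 2.0 and 1.5 are selected, and the easy samples are ignored for this training step.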

To do:

(1). GIoULoss (done)

(2). Online hard example mining (done)

(3). Mixed-precision training

(4). Deformable convolution

(5). Multi-scale training

(6). Multi-scale testing and test-time augmentation

(7). Using the Albu data augmentation library

(8). Model ensembling

(9). Split test

(10). Mosaic data augmentation

(11). PAFPN

(12). Sample balancing to mitigate the long-tail distribution problem

Origin blog.csdn.net/Guo_Python/article/details/108148385