Playing with MMDetection (2): Dataset Files, Training Schedule Files, Runtime Files, and Their Parameters

1. Interpretation of dataset files and their parameters in MMDetection

This article explains in detail the dataset files and parameter meanings you need to understand in order to write your own MMDetection configuration file.
First, let's look at the CocoDataset class in the coco.py file. As the name suggests, if the dataset is in COCO format, MMDetection goes through coco.py; if you use the public COCO dataset itself, it can be used directly without changes.
To train on your own dataset, change CLASSES in the CocoDataset class in coco.py to the classes of your own dataset. If there is only one class, remember to keep a trailing comma after it so Python still treats it as a tuple.
① Classes of the original public COCO dataset

CLASSES = ('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',  
           'train', 'truck', 'boat', 'traffic light', 'fire hydrant',  
           'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',  
           'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe',  
           'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',  
           'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat',  
           'baseball glove', 'skateboard', 'surfboard', 'tennis racket',  
           'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',  
           'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',  
           'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',  
           'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop',  
           'mouse', 'remote', 'keyboard', 'cell phone', 'microwave',  
           'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock',  
           'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush')

The original COCO dataset has 80 classes in total, as listed in the tuple above.

② Self-built dataset with multiple classes

CLASSES = ('plane', 'baseball-diamond', 'bridge', 'ground-track-field', 
            'small-vehicle', 'large-vehicle', 'ship', 'tennis-court', 'basketball-court', 
            'storage-tank', 'soccer-ball-field', 'roundabout', 'harbor', 'swimming-pool', 
            'helicopter')

This self-built DOTA dataset in COCO format has 15 classes in total: 'plane', 'baseball-diamond', 'bridge', 'ground-track-field', 'small-vehicle', 'large-vehicle', 'ship', 'tennis-court', 'basketball-court', 'storage-tank', 'soccer-ball-field', 'roundabout', 'harbor', 'swimming-pool', and 'helicopter'.

③ Self-built dataset with only one class

CLASSES = ('Belt', )
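The trailing comma is what makes this a tuple: without it, Python parses the parentheses as plain grouping. A quick sanity check (the 'Belt' class is just the example above):

CLASSES_ok = ('Belt', )    # a one-element tuple: a single class named 'Belt'
CLASSES_bad = ('Belt')     # just the string 'Belt'; code that iterates over
                           # CLASSES would then see four "classes": B, e, l, t
assert isinstance(CLASSES_ok, tuple)
assert isinstance(CLASSES_bad, str)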

The directory layout of a COCO-format dataset is shown below:

cocodata
├── annotations
│   ├── instances_train2017.json
│   └── instances_val2017.json
├── train2017
└── val2017

A COCO-format dataset is organized into three folders: annotations, train2017, and val2017. The annotations folder contains two files, instances_train2017.json and instances_val2017.json, which are the annotation JSON files for the training set and validation set respectively; train2017 holds the training images, and val2017 holds the validation images.
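Each of those annotation files follows the standard COCO instance format. Below is a minimal sketch of the structure of instances_train2017.json, written as a Python dict; the file name, ids, and bbox numbers are illustrative placeholders:

# skeleton of a COCO-format annotation file; values are placeholders
coco_skeleton = {
    'images': [        # one entry per image
        {'id': 1, 'file_name': '000001.jpg', 'width': 1024, 'height': 1024},
    ],
    'annotations': [   # one entry per object instance; bbox is [x, y, w, h]
        {'id': 1, 'image_id': 1, 'category_id': 1,
         'bbox': [100.0, 200.0, 50.0, 80.0], 'area': 4000.0, 'iscrowd': 0},
    ],
    'categories': [    # class ids referenced by category_id above
        {'id': 1, 'name': 'Belt'},
    ],
}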

The following is coco_detection.py, the dataset configuration file used for object detection on the COCO dataset, with an explanation of each parameter.

dataset_type = 'CocoDataset'     # use the CocoDataset(CustomDataset) class from mmdet/datasets/coco.py
data_root = 'data/coco/'         # path to the COCO dataset
img_norm_cfg = dict(             # normalization config for input images: subtract mean, divide by std; to_rgb=True converts the loaded BGR image to RGB channel order
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),                             # first, read the image data
    dict(type='LoadAnnotations', with_bbox=True),               # load the annotations for the image; with_bbox=True loads bounding boxes, the default for object detection
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),# a preprocessing/augmentation step that rescales the image to a maximum size of (1333, 800); keep_ratio=True preserves the original aspect ratio. With keep_ratio=False, the image is scaled directly to img_scale (the larger value is the long side, the smaller the short side) without preserving the aspect ratio.
    dict(type='RandomFlip', flip_ratio=0.5),                    # an augmentation step that flips the image with probability 0.5
    dict(type='Normalize', **img_norm_cfg),                     # normalize the image using the img_norm_cfg defined above
    dict(type='Pad', size_divisor=32),                          # pad the resized image so its sides are divisible by 32, avoiding feature loss at the borders during convolution
    dict(type='DefaultFormatBundle'),                           # fill in missing fields in the results dict and wrap the data into tensor format
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),# collect only the listed keys from the results dict
]
test_pipeline = [
    dict(type='LoadImageFromFile'),                             # read the image data
    dict(                                                       # a wrapper for test-time augmentation, broadly similar to the steps above
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),                                  # rescale the image to a maximum size of (1333, 800)
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),               # keep_ratio=True preserves the original aspect ratio; with keep_ratio=False the image is scaled directly to img_scale (larger value = long side, smaller = short side), aspect ratio not preserved
            dict(type='RandomFlip'),                            # image flipping; the default flip probability is 0.5 (not applied here since flip=False)
            dict(type='Normalize', **img_norm_cfg),             # normalize the image using the img_norm_cfg defined above
            dict(type='Pad', size_divisor=32),                  # pad the resized image so its sides are divisible by 32
            dict(type='ImageToTensor', keys=['img']),           # convert the image to a torch tensor
            dict(type='Collect', keys=['img']),                 # collect only the listed keys from the results dict
        ])
]
data = dict(
    samples_per_gpu=2,                                          # batch size per GPU; on the public COCO dataset a batch of 2 takes roughly 5000 MB of GPU memory, so size it to your hardware
    workers_per_gpu=2,                                          # number of data-loading workers per GPU; more is not always better, tune it for your machine
    train=dict(
        type=dataset_type,                                          # COCO dataset format
        ann_file=data_root + 'annotations/instances_train2017.json',# path to the training-set annotation file
        img_prefix=data_root + 'train2017/',                        # path to the training-set images
        pipeline=train_pipeline),                                   # use the train_pipeline defined above
    val=dict(
        type=dataset_type,                                          # COCO dataset format
        ann_file=data_root + 'annotations/instances_val2017.json',  # path to the validation-set annotation file
        img_prefix=data_root + 'val2017/',                          # path to the validation-set images
        pipeline=test_pipeline),                                    # use the test_pipeline defined above
    test=dict(
        type=dataset_type,                                          # COCO dataset format
        ann_file=data_root + 'annotations/instances_val2017.json',  # path to the test-set annotation file (here the validation set is reused)
        img_prefix=data_root + 'val2017/',                          # path to the test-set images
        pipeline=test_pipeline))                                    # use the test_pipeline defined above
evaluation = dict(interval=1, metric='bbox')                        # evaluate every `interval` epochs, using the metric(s) defined in `metric`
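With these parameters understood, pointing MMDetection at your own data usually does not require editing coco.py at all: in MMDetection 2.x a dataset dict accepts a classes key that overrides CLASSES. Below is a minimal sketch assuming a hypothetical single-class 'Belt' dataset stored under data/belt/; the paths, file names, and class name are placeholders, not part of the original config:

# my_coco_config.py — a minimal sketch; inherits coco_detection.py and
# overrides the dataset paths and classes (all names here are placeholders)
_base_ = './coco_detection.py'

classes = ('Belt', )
data = dict(
    train=dict(
        classes=classes,
        ann_file='data/belt/annotations/instances_train2017.json',
        img_prefix='data/belt/train2017/'),
    val=dict(
        classes=classes,
        ann_file='data/belt/annotations/instances_val2017.json',
        img_prefix='data/belt/val2017/'),
    test=dict(
        classes=classes,
        ann_file='data/belt/annotations/instances_val2017.json',
        img_prefix='data/belt/val2017/'))

Remember that the model config must also set num_classes in its head to match the number of classes in your dataset.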

2. Interpretation of training schedule files and their parameters in MMDetection

The following is an explanation of the parameters in the training schedule file. This file mainly covers the choice and settings of the optimizer and of the learning-rate policy. The meaning of warmup: at the start of training the model weights are randomly initialized, so choosing a large learning rate right away may make the model unstable (oscillate). The warmup strategy keeps the learning rate small for the first few epochs or steps, letting the model gradually stabilize; once the model is relatively stable, training switches to the preset learning rate, which speeds up convergence and yields a better model.

# optimizer
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001) # use the SGD optimizer with an initial learning rate of 0.02; momentum is an acceleration hyperparameter of SGD; the weight_decay penalty is 0.0001
optimizer_config = dict(grad_clip=None)                                  # gradient clipping, a strategy to prevent exploding gradients (disabled here)
# learning policy
lr_config = dict(
    policy='step',                                                       # step policy: decay the learning rate at fixed epochs
    warmup='linear', # at the start of training, use a linear warmup of the learning rate; after some epochs or steps, switch to the preset learning rate to finish training
                     # since the weights are randomly initialized at the beginning, a large learning rate may destabilize the model; warmup keeps the learning rate small during the first steps so the model can stabilize before training at the preset rate, giving faster convergence and better results
    warmup_iters=500,   # warm up over 500 iterations
    warmup_ratio=0.001, # initial ratio of the warmup learning rate: warmup starts at lr * warmup_ratio (the author considers 0.001 rather small; 0.1 is common)
    step=[8, 11])       # decay the learning rate at epochs 8 and 11
runner = dict(type='EpochBasedRunner', max_epochs=12) # the default maximum number of training epochs is 12; change it as needed
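To make this schedule concrete, here is a small sketch of the learning rate it produces. It follows the linear-warmup rule used by MMCV's LrUpdaterHook (the lr starts at lr * warmup_ratio and ramps linearly up to lr over warmup_iters iterations); the function itself is our own illustration, not an MMDetection API:

# sketch of the learning-rate schedule defined above (illustrative only)
def lr_at(iteration, epoch, base_lr=0.02, warmup_iters=500,
          warmup_ratio=0.001, steps=(8, 11), gamma=0.1):
    # step policy: multiply the lr by gamma at each epoch listed in `steps`
    lr = base_lr * gamma ** sum(epoch >= s for s in steps)
    if iteration < warmup_iters:  # linear warmup overrides the regular lr
        k = (1 - iteration / warmup_iters) * (1 - warmup_ratio)
        lr = lr * (1 - k)
    return lr

print(lr_at(0, 0))        # ≈ 2e-05  (= base_lr * warmup_ratio)
print(lr_at(500, 0))      # 0.02     (warmup finished)
print(lr_at(10_000, 8))   # ≈ 0.002  (after the decay at epoch 8)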

3. Interpretation of the runtime information file and its parameters in MMDetection

3.1 Runtime file code

The following is the runtime information file included in a complete MMDetection configuration, mainly covering the checkpoint configuration, the logging configuration, and a few other settings.

checkpoint_config = dict(interval=1) # checkpoint configuration: save a checkpoint every `interval` epochs
# yapf:disable
log_config = dict(                   # logger configuration: write a log record every `interval` iterations
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'), # enable text logging
        # dict(type='TensorboardLoggerHook') # enable TensorBoard logging, recording training results for easy visualization; uncomment to enable
    ])
# yapf:enable
custom_hooks = [dict(type='NumClassCheckHook')] # custom hooks; a hook specifies what to do at each stage of training via before_run, after_run, before_epoch, after_epoch, before_iter and after_iter

dist_params = dict(backend='nccl') # parameters for distributed training, e.g. multi-machine multi-GPU or single-machine single-GPU setups
log_level = 'INFO'                 # log INFO-level messages during normal operation
load_from = None                   # path to a pretrained checkpoint to initialize the model weights from
resume_from = None                 # path to a checkpoint to resume training from, continuing the epoch count and optimizer state of the previous run
workflow = [('train', 1)]          # workflow of the runner; ('train', 1) means one training epoch per cycle

3.2 Parameter details

checkpoint_config = dict(interval=1) : checkpoint configuration; a checkpoint is saved every `interval` epochs.
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) : logger configuration; a log record is written every `interval` iterations. Adding dict(type='TensorboardLoggerHook') enables TensorBoard logging, recording training results for easy visualization; uncomment it in the file above if you need it.
custom_hooks = [dict(type='NumClassCheckHook')] : custom hooks; a hook specifies what to do at each stage of training via before_run, after_run, before_epoch, after_epoch, before_iter and after_iter (see the sketch after this list).
dist_params = dict(backend='nccl') : parameters for distributed training, e.g. multi-machine multi-GPU or single-machine single-GPU setups.
log_level = 'INFO' : log INFO-level messages during the normal operation of the program.
load_from = None : path to a pretrained checkpoint whose weights initialize the model.
resume_from = None : path to a checkpoint to resume training from; unlike load_from, it also restores the epoch count and optimizer state.
workflow = [('train', 1)] : the runner's workflow; ('train', 1) means one training epoch per cycle. For example, [('train', n), ('val', 1)] would run n training epochs followed by one validation epoch.
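As a concrete illustration of the hook mechanism, here is a minimal sketch of a custom hook, assuming MMCV's Hook base class and HOOKS registry; the hook name and its message are our own invention, not part of MMDetection:

from mmcv.runner import HOOKS, Hook

@HOOKS.register_module()
class PrintEpochHook(Hook):
    # a toy hook: real hooks override any of before_run, after_run,
    # before_epoch, after_epoch, before_iter, after_iter
    def before_epoch(self, runner):
        print(f'starting epoch {runner.epoch + 1}')

# enabled from the config, just like NumClassCheckHook above:
# custom_hooks = [dict(type='PrintEpochHook')]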

 

The dataset file, training schedule file, and runtime information file interpreted above, together with the model framework file covered in Playing with MMDetection (1), are the four files that make up a complete MMDetection training and testing configuration. Readers should first read these four files in detail and understand the meaning of each parameter, and then follow the next part of the series, Playing with MMDetection (3), to make their own configuration files.


Origin blog.csdn.net/weixin_42715977/article/details/130108936