MMYOLO framework: labeling, training, and testing the whole process (supplementary)

foreword

pycocotools installation problem

  • First, download the entire MMYOLO project from the project address, then enter pip install openmim in a command window. Then use cd mmyolo to enter the project folder. Note that mmyolo here is a path: for example, if your project folder is on the D drive, you should write cd D:\mmyolo. After entering the project folder, enter mim install -r requirements/mminstall.txt.
  • During installation, only one library reported an error: pycocotools, with the message "Microsoft Visual C++ 14.0 or greater is required". The author of the corresponding GitHub library emphasizes that this error means the Visual C++ 2015 build tools are not installed. The author provided a download address in the documentation, but the download kept failing, so I found an offline version instead: download link.
  • After that installation completed, mim install -r requirements/mminstall.txt ran again without error; problem solved!

xml file path problem

  • Because two people annotated separately during labeling, the path fields inside the xml files ended up inconsistent. Not everyone will run into this problem; it is just recorded here. The following script rewrites each path tag to the bare file name:
import xml.etree.ElementTree as ET
import os

def modify_xml_path(xml_file):
    """Replace the absolute path in each <path> tag with just the file name."""
    tree = ET.parse(xml_file)
    root = tree.getroot()

    for path_elem in root.iter('path'):
        path_elem.text = os.path.basename(path_elem.text)

    tree.write(xml_file)

folder_path = './data/xml'

for file in os.listdir(folder_path):
    # Skip non-xml files so ET.parse does not fail on stray files
    if not file.endswith('.xml'):
        continue
    file_path = os.path.join(folder_path, file)
    modify_xml_path(file_path)

xml file to json file

  • Because I hadn't read the MMYOLO tutorial document before doing the data labeling, I used the LabelImg software to annotate in VOC format, generating xml files.
  • However, the tutorial uses json files generated by Labelme, and the subsequent dataset splitting, label checking, dataset exploration, etc. are all based on json files, so the xml files need to be converted into json files.
  • First organize the files as follows:
-mmyolo
	- data
		- images
			- 0001.bmp
			- 0002.bmp
			- ...
		- xml
			- 0001.xml
			- 0002.xml
			- ...
	 - configs
	 ...
  • Create a new data folder under the mmyolo project folder, put the images in .\data\images, and put the annotation files in .\data\xml.
  • Then create a new file xml2json.py in the .\tools\dataset_converters folder and fill in the code:
import xml.etree.ElementTree as ET
import os
import json

coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []

category_set = dict()
image_set = set()

category_item_id = -1
image_id = 0
annotation_id = 0


def addCatItem(name):
    global category_item_id
    category_item = dict()
    category_item['supercategory'] = 'none'
    category_item_id += 1
    category_item['id'] = category_item_id
    category_item['name'] = name
    coco['categories'].append(category_item)
    category_set[name] = category_item_id
    return category_item_id


def addImgItem(file_name, size):
    global image_id
    if file_name is None:
        raise Exception('Could not find filename tag in xml file.')
    if size['width'] is None:
        raise Exception('Could not find width tag in xml file.')
    if size['height'] is None:
        raise Exception('Could not find height tag in xml file.')
    image_id += 1
    image_item = dict()
    image_item['id'] = image_id
    print(file_name)
    image_item['file_name'] = file_name + ".jpg"
    image_item['width'] = size['width']
    image_item['height'] = size['height']
    coco['images'].append(image_item)
    image_set.add(file_name)
    return image_id


def addAnnoItem(object_name, image_id, category_id, bbox):
    global annotation_id
    annotation_item = dict()
    annotation_item['segmentation'] = []
    seg = []
    # bbox[] is x,y,w,h
    # left_top
    seg.append(bbox[0])
    seg.append(bbox[1])
    # left_bottom
    seg.append(bbox[0])
    seg.append(bbox[1] + bbox[3])
    # right_bottom
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1] + bbox[3])
    # right_top
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1])

    annotation_item['segmentation'].append(seg)

    annotation_item['area'] = bbox[2] * bbox[3]
    annotation_item['iscrowd'] = 0
    annotation_item['ignore'] = 0
    annotation_item['image_id'] = image_id
    annotation_item['bbox'] = bbox
    annotation_item['category_id'] = category_id
    annotation_id += 1
    annotation_item['id'] = annotation_id
    coco['annotations'].append(annotation_item)


def parseXmlFiles(xml_path):
    for f in os.listdir(xml_path):
        if not f.endswith('.xml'):
            continue
        xmlname = f.split('.xml')[0]

        bndbox = dict()
        size = dict()
        current_image_id = None
        current_category_id = None
        file_name = None
        size['width'] = None
        size['height'] = None
        size['depth'] = None

        xml_file = os.path.join(xml_path, f)
        print(xml_file)

        tree = ET.parse(xml_file)
        root = tree.getroot()
        if root.tag != 'annotation':
            raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))

        # elem is <folder>, <filename>, <size>, <object>
        for elem in root:
            current_parent = elem.tag
            current_sub = None
            object_name = None

            if elem.tag == 'folder':
                continue

            if elem.tag == 'filename':
                file_name = xmlname
                if file_name in category_set:
                    raise Exception('file_name duplicated')

            # add img item only after parse <size> tag
            elif current_image_id is None and file_name is not None and size['width'] is not None:
                if file_name not in image_set:
                    current_image_id = addImgItem(file_name, size)
                    print('add image with {} and {}'.format(file_name, size))
                else:
                    raise Exception('duplicated image: {}'.format(file_name))

            # subelem is <width>, <height>, <depth>, <name>, <bndbox>
            for subelem in elem:
                bndbox['xmin'] = None
                bndbox['xmax'] = None
                bndbox['ymin'] = None
                bndbox['ymax'] = None

                current_sub = subelem.tag
                if current_parent == 'object' and subelem.tag == 'name':
                    object_name = subelem.text
                    if object_name not in category_set:
                        current_category_id = addCatItem(object_name)
                    else:
                        current_category_id = category_set[object_name]

                elif current_parent == 'size':
                    if size[subelem.tag] is not None:
                        raise Exception('xml structure broken at size tag.')
                    size[subelem.tag] = int(subelem.text)

                # option is <xmin>, <ymin>, <xmax>, <ymax>, when subelem is <bndbox>
                for option in subelem:
                    if current_sub == 'bndbox':
                        if bndbox[option.tag] is not None:
                            raise Exception('xml structure corrupted at bndbox tag.')
                        bndbox[option.tag] = int(float(option.text))

                # only after parse the <object> tag
                if bndbox['xmin'] is not None:
                    if object_name is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_image_id is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_category_id is None:
                        raise Exception('xml structure broken at bndbox tag')
                    bbox = []
                    # x
                    bbox.append(bndbox['xmin'])
                    # y
                    bbox.append(bndbox['ymin'])
                    # w
                    bbox.append(bndbox['xmax'] - bndbox['xmin'])
                    # h
                    bbox.append(bndbox['ymax'] - bndbox['ymin'])
                    print('add annotation with {},{},{},{}'.format(object_name, current_image_id, current_category_id,
                                                                   bbox))
                    addAnnoItem(object_name, current_image_id, current_category_id, bbox)


if __name__ == '__main__':
    xml_path = './data/xml'
    json_file = './data/annotations/annotations_all.json'
    # Make sure the output folder exists before writing the json file
    os.makedirs(os.path.dirname(json_file), exist_ok=True)
    parseXmlFiles(xml_path)
    json.dump(coco, open(json_file, 'w'))
  • After running the code, annotations_all.json is generated under the ./data/annotations folder. The conversion code above is adapted from another blogger's post, not written by me; due to time constraints I have forgotten the source address. If anyone recognizes it, please private message me so I can credit the source.
  • Note that if your image suffix is not jpg or png, open the generated annotations_all.json file, check the file_name fields, and replace the suffix with a text editor. For example, since my images are in .bmp format, I need to replace .jpg with .bmp.
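  • For example, a minimal sketch (my addition, not from the tutorial) that rewrites the suffix programmatically instead of using a text editor:
import json

# Assumption: my images are .bmp, while the converter hard-codes ".jpg"
json_path = './data/annotations/annotations_all.json'
with open(json_path) as f:
    coco = json.load(f)

for image_item in coco['images']:
    image_item['file_name'] = image_item['file_name'].replace('.jpg', '.bmp')

with open(json_path, 'w') as f:
    json.dump(coco, f)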
  • Finally, create a new class_with_id.txt under the ./data/annotations folder to record which class each label value corresponds to. Open annotations_all.json again, drag to the end, and find the "categories" field; for example, in my json file:
"categories": [{
    
    "supercategory": "none", "id": 0, "name": "cat"}, {
    
    "supercategory": "none", "id": 1, "name": "dog"}]}
  • You can see that id 0 corresponds to cat and id 1 corresponds to dog. Open class_with_id.txt and fill in the following content:
0 cat
1 dog
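  • Alternatively, a minimal sketch (my addition) that generates class_with_id.txt from the categories field of annotations_all.json instead of typing it by hand:
import json

# Assumption: annotations_all.json was produced by the xml2json.py script above
with open('./data/annotations/annotations_all.json') as f:
    categories = json.load(f)['categories']

# Write "id name" lines sorted by id, matching the class_with_id.txt format
with open('./data/annotations/class_with_id.txt', 'w') as f:
    for cat in sorted(categories, key=lambda c: c['id']):
        f.write('{} {}\n'.format(cat['id'], cat['name']))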
  • So far, we have used scripts to complete the conversion in tutorial section 3.1, and the format is the same as the tutorial's. The final file organization:
-mmyolo
	- data
		- images
			- 0001.bmp
			- 0002.bmp
			- ...
		- xml
			- 0001.xml
			- 0002.xml
			- ...
		- annotations
			- annotations_all.json
			- class_with_id.txt
	 - configs
	 ...

Check the converted COCO label

  • Check the data format using the .\tools\analysis_tools\browse_coco_json.py file under the mmyolo project folder.
  • Modify the default values of the file's parameters. Following the tutorial, only modify the --img-dir and --ann-file parameters by adding default options; the code is as follows:
def parse_args():
    parser = argparse.ArgumentParser(description='Show coco json file')
    parser.add_argument('--data-root', default=None, help='dataset root')
    parser.add_argument(
        '--img-dir', default='data/images', help='image folder path')
    parser.add_argument(
        '--ann-file',
        default='data/annotations/annotations_all.json',
        help='ann file path')
    parser.add_argument(
        '--wait-time', type=float, default=2, help='the interval of show (s)')
    parser.add_argument(
        '--disp-all',
        action='store_true',
        help='Whether to display all types of data, '
        'such as bbox and mask.'
        ' Default is to display only bbox')
    parser.add_argument(
        '--category-names',
        type=str,
        default=None,
        nargs='+',
        help='Display category-specific data, e.g., "bicycle", "person"')
    parser.add_argument(
        '--shuffle',
        action='store_true',
        help='Whether to display in disorder')
    args = parser.parse_args()
    return args
  • You can also follow the method in the tutorial and enter in the command window:
python tools/analysis_tools/browse_coco_json.py --img-dir ${image folder path} \
                                                --ann-file ${COCO label json path}
  • After the check passes, this step is complete.

Divide the dataset

  • We can again use the .\tools\misc\coco_split.py file under the project folder to complete this step.
  • Modify the default values of the file's parameters. Following the tutorial, only modify the --json, --out-dir, and --ratios parameters by adding default options; the code is as follows:
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--json', type=str, default='./data/annotations/annotations_all.json', help='COCO json label path')
    parser.add_argument(
        '--out-dir', type=str, default='./data/annotations', help='output path')
    parser.add_argument(
        '--ratios',
        default=[0.9,0.1],
        nargs='+',
        type=float,
        help='ratio for sub dataset, if set 2 number then will generate '
        'trainval + test (eg. "0.85 0.15" or "2 1"), if set 3 number '
        'then will generate train + val + test (eg. "0.8 0.1 0.1" or "2 1 1")')
    parser.add_argument(
        '--shuffle',
        action='store_true',
        help='Whether to display in disorder')
    parser.add_argument('--seed', default=2023, type=int, help='seed')
    args = parser.parse_args()
    return args
  • Pay particular attention to how --ratios is written here: as the list [0.9, 0.1]. You can also use the method in the tutorial and enter in the command window:
python tools/misc/coco_split.py --json ${COCO label json path} \
                                --out-dir ${output root path for split label json} \
                                --ratios ${split ratios} \
                                [--shuffle] \
                                [--seed ${random seed for the split}]
python tools/misc/coco_split.py --json ./data/cat/annotations/annotations_all.json \
                                --out-dir ./data/cat/annotations \
                                --ratios 0.8 0.2 \
                                --shuffle \
                                --seed 10
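  • After the split, trainval.json and test.json are generated, and the file organization becomes: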
-mmyolo
	- data
		- images
			- 0001.bmp
			- 0002.bmp
			- ...
		- xml
			- 0001.xml
			- 0002.xml
			- ...
		- annotations
			- annotations_all.json
			- class_with_id.txt
			- trainval.json
			- test.json
	 - configs
	 ...

Create a new config file

  • Create a new folder custom_dataset under the ./configs folder, and create a new file yolov6_l_syncbn_fast_1xb8-100e_animal.py under the custom_dataset folder. The configuration file could in fact be named anything, but this naming carries meaning: yolov6_l means I am training the YOLOv6-l model; syncbn means that during multi-card training the BN layer statistics (mean and standard deviation) are computed over the data on all cards (global sample data); fast refers to the faster training mode of the model; 1xb8-100e indicates training with 1 GPU, a batch size of 8, and a max_epoch of 100. Hence the name.
  • Create a new folder work_dirs under the project folder as the directory for saving models and other work. Open the file .\configs\yolov6\README.md under the project folder, download the YOLOv6-l pre-trained weight yolov6_l_syncbn_fast_8xb32-300e_coco_20221109_183156-91e3c447.pth in advance, and put it in the work_dirs folder.
  • Since I am training a YOLOv6-l model, the file I inherit from is yolov6_l_syncbn_fast_8xb32-300e_coco.py.
  • The configuration file and its comments are as follows:
# Inherited configuration file
_base_ = '../yolov6/yolov6_l_syncbn_fast_8xb32-300e_coco.py'

# Maximum number of training epochs
max_epochs = 100
# Folder containing the data
data_root = './data/'

# Model working directory
work_dir = './work_dirs'

# Pre-trained model weights
load_from = './work_dirs/yolov6_l_syncbn_fast_8xb32-300e_coco_20221109_183156-91e3c447.pth'  # noqa

# Adjust the batch size according to your GPU; YOLOv6-l defaults to 8 GPUs x 32 bs
# Here the batch size is set to 8
train_batch_size_per_gpu = 8
# train_num_workers = nGPU x 4, i.e. 1 x 4 = 4
train_num_workers = 4
# Save the weights once every `interval` epochs
save_epoch_intervals = 2

# Adjust base_lr according to your GPU; the scaling factor is (your_bs / default_bs)
# i.e. (your_bs / default_bs) = (8 / (8 x 32)) = 1 / 32
base_lr = _base_.base_lr / 32

# Set class_name according to the class info in class_with_id.txt; the order must be correct
class_name = ('cracked', 'complete')
num_classes = len(class_name)
# palette must contain one (r, g, b) tuple per class, otherwise an error is raised
metainfo = dict(
    classes=class_name,
    palette=[(220, 17, 58), (0, 143, 10)]  # colors used when plotting; any values are fine
)

train_cfg = dict(
    max_epochs=max_epochs,
    val_begin=20,  # epoch after which validation starts; accuracy is low in the first 20 epochs, so evaluating them is not very meaningful and is skipped
    val_interval=save_epoch_intervals,  # run a test evaluation every val_interval epochs
    dynamic_intervals=[(max_epochs - _base_.num_last_epochs, 1)]  # from epoch max_epochs - _base_.num_last_epochs on, evaluate every epoch
)

model = dict(
    bbox_head=dict(
        head_module=dict(num_classes=num_classes)),
    train_cfg=dict(
        initial_assigner=dict(num_classes=num_classes),
        assigner=dict(num_classes=num_classes))
)

train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    dataset=dict(
        _delete_=True,
        type='ClassBalancedDataset',
        oversample_thr=0.5,
        dataset=dict(
            type=_base_.dataset_type,
            data_root=data_root,
            metainfo=metainfo,
            ann_file='annotations/trainval.json',
            data_prefix=dict(img='images/'),
            filter_cfg=dict(filter_empty_gt=False, min_size=32),
            pipeline=_base_.train_pipeline)))

val_dataloader = dict(
    dataset=dict(
        metainfo=metainfo,
        data_root=data_root,
        ann_file='annotations/trainval.json',
        data_prefix=dict(img='images/')))

test_dataloader = val_dataloader

val_evaluator = dict(ann_file=data_root + 'annotations/trainval.json')
test_evaluator = val_evaluator

optim_wrapper = dict(optimizer=dict(lr=base_lr))

default_hooks = dict(
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_epoch_intervals,
        max_keep_ckpts=5,
        save_best='auto'),
    param_scheduler=dict(max_epochs=max_epochs),
    # logger output interval
    logger=dict(type='LoggerHook', interval=10))

custom_hooks = [
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0001,
        update_buffers=True,
        strict_load=False,
        priority=49),
    dict(
        type='mmdet.PipelineSwitchHook',
        switch_epoch=max_epochs - _base_.num_last_epochs,
        switch_pipeline=_base_.train_pipeline_stage2)
]

# Visualization backends: choose one of the two lines below (see the training visualization section)
visualizer = dict(vis_backends=[dict(type='LocalVisBackend'), dict(type='WandbVisBackend')])
# visualizer = dict(vis_backends=[dict(type='LocalVisBackend'), dict(type='TensorboardVisBackend')])
  • The first half of the configuration file is not hard to understand, but train_cfg and what follows may be confusing at first. I will explain those parts in more detail in the sections below.

Detailed explanation of config section

  • In fact, the way the above configuration file is written is inherited from the mmengine library (GitHub project address, reference documents). There is a Chinese version of the documentation, which is not too difficult to follow.
  • In general, all configuration files are written around mmengine.runner. You can read its API to gain a deeper understanding of configuration files.

train_cfg

  • In mmengine.runner, the train_cfg parameter is described as follows: a dict to build the training loop. If it does not provide a "type" key, it should contain "by_epoch" to decide whether EpochBasedTrainLoop or IterBasedTrainLoop should be used. If train_cfg is specified, train_dataloader should also be specified. Defaults to None.
  • See the EpochBasedTrainLoop parameter documentation; now let's look at the configuration code:
train_cfg = dict(
    max_epochs=max_epochs,
    val_begin=20, 
    val_interval=save_epoch_intervals,
    dynamic_intervals=[(max_epochs - _base_.num_last_epochs, 1)])
  • max_epochs=max_epochs: train for at most max_epochs epochs
  • val_begin=20: start evaluating the validation set after the 20th epoch
  • val_interval=save_epoch_intervals: run one evaluation every val_interval epochs
  • dynamic_intervals=[(max_epochs - _base_.num_last_epochs, 1)]: from epoch max_epochs - _base_.num_last_epochs onward, evaluate every epoch (for example, if the inherited config sets num_last_epochs to 15, evaluation runs every epoch from epoch 85)

model

  • The model section mainly controls the model architecture, so its changes depend on the inherited base model. For example, since I want to train YOLOv6-l, I follow the _base_ chain down to the base config, i.e. .\configs\yolov6\yolov6_s_syncbn_fast_8xb32-400e_coco.py, and find the bbox_head field there. The code is as follows:
bbox_head=dict(
        type='YOLOv6Head',
        head_module=dict(
            type='YOLOv6HeadModule',
            num_classes=num_classes,
            in_channels=[128, 256, 512],
            widen_factor=widen_factor,
            norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
            act_cfg=dict(type='SiLU', inplace=True),
            featmap_strides=[8, 16, 32]),
        loss_bbox=dict(
            type='IoULoss',
            iou_mode='giou',
            bbox_format='xyxy',
            reduction='mean',
            loss_weight=2.5,
            return_iou=False)),
  • There is also a train_cfg field; the code is as follows:
train_cfg=dict(
        initial_epoch=4,
        initial_assigner=dict(
            type='BatchATSSAssigner',
            num_classes=num_classes,
            topk=9,
            iou_calculator=dict(type='mmdet.BboxOverlaps2D')),
        assigner=dict(
            type='BatchTaskAlignedAssigner',
            num_classes=num_classes,
            topk=13,
            alpha=1,
            beta=6),
    ),
  • Compare this with the code given in the tutorial:
model = dict(
    bbox_head=dict(
        head_module=dict(num_classes=num_classes)),
    train_cfg=dict(
        initial_assigner=dict(num_classes=num_classes),
        assigner=dict(num_classes=num_classes))
)
  • It can be seen that only the parameters related to the number of classes are changed. For a more detailed explanation of the architecture parameters, you can read the official tutorial on the YOLOv5 configuration file.

train_dataloader

  • train_dataloader: used in Runner.train() to provide training data for the model; for more configurable DataLoader parameters, refer to the PyTorch API documentation.
  • Because the amount of data in the tutorial is small, the dataset there uses a RepeatDataset operation, which repeats the current dataset n times within each epoch; setting it to 5 means repeating 5 times. If your dataset is large enough and you don't need this operation, you can delete it directly, leaving:
train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    dataset=dict(
        type=_base_.dataset_type,
        data_root=data_root,
        metainfo=metainfo,
        ann_file='annotations/trainval.json',
        data_prefix=dict(img='images/'),
        filter_cfg=dict(filter_empty_gt=False, min_size=32),
        pipeline=_base_.train_pipeline))
  • Because my dataset has a class-imbalance problem, I use the ClassBalancedDataset operation instead, which makes the number of samples per class relatively balanced by resampling the original dataset (repeating some images more often than others).
  • oversample_thr is a floating-point number between 0 and 1. It specifies a threshold for deciding which classes need oversampling: roughly, if a class's frequency in the dataset falls below oversample_thr, images containing that class are repeated more often (a rough illustration follows the config below).
train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    dataset=dict(
        _delete_=True,
        type='ClassBalancedDataset',
        oversample_thr=0.5,
        dataset=dict(
            type=_base_.dataset_type,
            data_root=data_root,
            metainfo=metainfo,
            ann_file='annotations/trainval.json',
            data_prefix=dict(img='images/'),
            filter_cfg=dict(filter_empty_gt=False, min_size=32),
            pipeline=_base_.train_pipeline)))
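  • To make oversample_thr concrete, here is a rough illustration of the repeat-factor idea behind ClassBalancedDataset (my understanding, modeled on LVIS-style repeat-factor sampling; a sketch, not mmdet code):
import math

def repeat_factor(category_freqs, oversample_thr=0.5):
    # Repeat factor of one image: max over its categories of sqrt(thr / f(c)),
    # clipped below at 1, where f(c) is the fraction of images containing c
    return max(1.0, max(math.sqrt(oversample_thr / f) for f in category_freqs))

print(repeat_factor([0.125]))  # image with only a rare class: repeated ~2x
print(repeat_factor([0.9]))    # image with a common class: not repeated (1.0)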
  • For more data processing methods, you can consult the documentation.
  • More detailed dataset parameters can be found in BASEDATASET (refer to the documentation). MMYOLO also adds a new parameter; the default type, 'CocoDataset', is the COCO data format:
    • type: data format type
    • data_root: the root directory for data_prefix and ann_file
    • metainfo: metadata of the dataset, such as class information
    • ann_file: Annotation file path
    • data_prefix: The prefix of the training data. The default is dict(img_path='')
    • filter_cfg: configuration for filtering data
    • pipeline: processing pipeline

val_dataloader

  • The dataset parameters of val_dataloader are the same as those in train_dataloader, so I won't go into detail here:
val_dataloader = dict(
    dataset=dict(
        metainfo=metainfo,
        data_root=data_root,
        ann_file='annotations/trainval.json',
        data_prefix=dict(img='images/')))

test_dataloader

  • In the tutorial, val_dataloader is directly assigned to test_dataloader. Of course, we can also write our own:
test_dataloader = dict(
    dataset=dict(
        metainfo=metainfo,
        data_root=data_root,
        ann_file='annotations/test.json',
        data_prefix=dict(img='images/')))

val_evaluator

  • val_evaluator: an evaluator object for computing validation metrics. It can be a dict or a list of dicts used to build the evaluator.
val_evaluator = dict(ann_file=data_root + 'annotations/trainval.json')
  • In the inherited configuration file, the complete val_evaluator configuration is:
val_evaluator = dict(
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file=data_root + val_ann_file,
    metric='bbox')
  • This is equivalent to only changing the ann_file parameter from the inherited file.

test_evaluator

  • In the tutorial, val_evaluator is directly assigned to test_evaluator; we can also write it ourselves:
test_evaluator = dict(ann_file=data_root + 'annotations/test.json')

optim_wrapper

  • optim_wrapper computes the gradients of the model parameters. If automatic mixed precision or gradient accumulation training is required, the type of optim_wrapper should be AmpOptimWrapper.
optim_wrapper = dict(optimizer=dict(lr=base_lr))
  • In the inherited configuration file, the complete optimizer wrapper code is:
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(
        type='SGD',
        lr=base_lr,
        momentum=0.937,
        weight_decay=weight_decay,
        nesterov=True,
        batch_size_per_gpu=train_batch_size_per_gpu),
    constructor='YOLOv5OptimizerConstructor')
  • The tutorial's version is thus equivalent to only changing the learning rate from the inherited file (a quick arithmetic check follows).
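  • As a quick sanity check of the scaling rule (a sketch; the actual _base_.base_lr comes from the inherited config, and 0.01 here is only an assumed value for illustration):
# Assumed numbers for illustration: the default setup is 8 GPUs x 32 bs
default_bs = 8 * 32
# My setup: 1 GPU x 8 bs
your_bs = 1 * 8
base_lr_default = 0.01  # assumption, not necessarily the real inherited value
base_lr = base_lr_default * (your_bs / default_bs)
print(base_lr)  # 0.0003125, i.e. base_lr_default / 32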

hook

  • Hook programming is a programming pattern: one or more mount points are set at specific locations in a program, and when execution reaches a mount point, all the methods registered to it at runtime are automatically called, as the toy sketch below shows.
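  • A toy illustration of this pattern (my sketch, not MMEngine code): callables are registered at a mount point and fired automatically when execution reaches it:
# Toy hook registry with one mount point named 'after_epoch'
hooks = {'after_epoch': []}

def register(point, fn):
    # Register a callable at the given mount point
    hooks[point].append(fn)

def run_epoch(epoch):
    print('training epoch', epoch)
    # The mount point: every registered method is called automatically
    for fn in hooks['after_epoch']:
        fn(epoch)

register('after_epoch', lambda e: print('saving checkpoint after epoch', e))
run_epoch(1)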

default hook

default_hooks = dict(
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_epoch_intervals,
        max_keep_ckpts=5,
        save_best='auto'),
    param_scheduler=dict(max_epochs=max_epochs),
    # logger output interval
    logger=dict(type='LoggerHook', interval=10))
  • The default hook used in the tutorial is CheckpointHook, which saves the model weights at a given interval. In distributed multi-card training, only the master process saves the weights.
  • If you want to learn about more of its parameters, refer to the CheckpointHook API documentation. Here I pick out the parameters that appear in the tutorial file.
    • interval: the save period. If by_epoch=True, interval represents epochs; otherwise it represents the number of iterations.
    • max_keep_ckpts: the maximum number of checkpoints to keep. In some cases we only need the latest few checkpoints and want to delete the old ones to save disk space.
    • save_best: if a metric is specified, it measures the best checkpoint during evaluation; if a list of metrics is passed, it measures a group of best checkpoints corresponding to those metrics.
  • About ParamSchedulerHook, we can find the parameters in the inherited configuration file:
default_hooks = dict(
    param_scheduler=dict(
        type='YOLOv5ParamSchedulerHook',
        scheduler_type='cosine',
        lr_factor=lr_factor,
        max_epochs=max_epochs),
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_epoch_intervals,
        max_keep_ckpts=max_keep_ckpts,
        save_best='auto'))
  • The tutorial's version is equivalent to only changing max_epochs in param_scheduler.
  • LoggerHook is responsible for collecting logs and outputting them to the terminal, to files, or to backends such as TensorBoard. In the tutorial, interval=10 outputs (or saves) a log every 10 iterations.

custom hook

custom_hooks = [
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0001,
        update_buffers=True,
        strict_load=False,
        priority=49),
    dict(
        type='mmdet.PipelineSwitchHook',
        switch_epoch=max_epochs - _base_.num_last_epochs,
        switch_pipeline=_base_.train_pipeline_stage2)
]
  • EMAHook performs an exponential moving average (EMA) over the model weights during training to improve model robustness. Note: the model generated by the exponential moving average is only used for validation and testing and does not affect training. (A toy sketch of the EMA update follows this list.)
  • EMAHook API documentation, ExponentialMovingAverage API documentation; explanation of some parameters:
    • momentum: the momentum used to update the EMA parameters
    • update_buffers: if True, running averages are computed for both the model parameters and the buffers
    • strict_load: whether to strictly enforce that the keys in the checkpoint's state_dict match the keys returned by self.module.state_dict
    • priority: the hook's priority
  • mmdet.PipelineSwitchHook is part of the MMDetection library; it switches the data pipeline at switch_epoch (see the API docs).
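  • To make the EMA update concrete, here is a toy sketch (my assumption of the exponential-moving-average convention, averaged = (1 - momentum) * averaged + momentum * source; not the MMEngine implementation):
m = 0.0001  # momentum, as in the config above

def ema_update(ema_w, w, m=m):
    # The shadow (EMA) weight drifts slowly toward the live training weight
    return (1 - m) * ema_w + m * w

ema_w, w = 0.0, 1.0
for step in range(10000):
    ema_w = ema_update(ema_w, w)
print(round(ema_w, 3))  # ~0.632 after 10000 steps: the EMA lags the raw weight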

Dataset visualization

  • We can follow the tutorial and use the .\tools\analysis_tools\dataset_analysis.py file to analyze the data. Note: the data at this point has already been transformed, e.g. by the ClassBalancedDataset or RepeatDataset operation.
  • This file can generate 4 types of analysis plots:
    • show_bbox_num: the distribution of bbox classes and instance counts
    • show_bbox_wh: the distribution of bbox instance widths and heights per class
    • show_bbox_wh_ratio: the distribution of bbox instance width/height ratios per class
    • show_bbox_area: the distribution of bbox instance areas per class under the area rule
  • Modify the default values of the file's parameters. Following the tutorial, modify the --config, --val-dataset, --class-name, --area-rule, --func, and --out-dir parameters and add default options (note that you must change the positional config argument to --config for the code to run, otherwise an error will be reported). The code is as follows:
def parse_args():
    parser = argparse.ArgumentParser(
        description='Distribution of categories and bbox instances')
    parser.add_argument('--config', default='./configs/custom_dataset/yolov6_l_syncbn_fast_1xb8-100e_animal.py', help='config file path')
    parser.add_argument(
        '--val-dataset',
        default=False,
        action='store_true',
        help='The default train_dataset.'
        'To change it to val_dataset, enter "--val-dataset"')
    parser.add_argument(
        '--class-name',
        default=None,
        type=str,
        help='Display specific class, e.g., "bicycle"')
    parser.add_argument(
        '--area-rule',
        default=None,
        type=int,
        nargs='+',
        help='Redefine area rules,but no more than three numbers.'
        ' e.g., 30 70 125')
    parser.add_argument(
        '--func',
        default=None,
        type=str,
        choices=[
            'show_bbox_num', 'show_bbox_wh', 'show_bbox_wh_ratio',
            'show_bbox_area'
        ],
        help='Dataset analysis function selection.')
    parser.add_argument(
        '--out-dir',
        default='./dataset_analysis',
        type=str,
        help='Output directory of dataset analysis visualization results,'
        ' Save in "./dataset_analysis/" by default')
    args = parser.parse_args()
    return args
  • You can also follow the method in the tutorial and enter in the command window:
python tools/analysis_tools/dataset_analysis.py ${CONFIG} \
                                                [--val-dataset ${TYPE}] \
                                                [--class-name ${CLASS_NAME}] \
                                                [--area-rule ${AREA_RULE}] \
                                                [--func ${FUNC}] \
                                                [--out-dir ${OUT_DIR}]
  • Check the training set data distribution
python tools/analysis_tools/dataset_analysis.py ./configs/custom_dataset/yolov6_l_syncbn_fast_1xb8-100e_animal.py \
                                                --out-dir work_dirs/dataset_analysis_cat/train_dataset

Optimize Anchor Size

  • Since the YOLOv6 model I am training is anchor-free, this step is not required.

Visualize the data processing part of the config

  • We can follow the tutorial and use the .\tools\analysis_tools\browse_dataset.py file to visualize the data processing part of the pipeline.
  • Modify the default values of the file's parameters: following the tutorial, change the positional config to --config, then run the code.
def parse_args():
    parser = argparse.ArgumentParser(description='Browse a dataset')
    parser.add_argument('--config', default='./configs/custom_dataset/yolov6_l_syncbn_fast_1xb8-100e_animal.py', help='train config file path')
    parser.add_argument(
        '--phase',
        '-p',
        default='train',
        type=str,
        choices=['train', 'test', 'val'],
        help='phase of dataset to visualize, accept "train" "test" and "val".'
        ' Defaults to "train".')
    parser.add_argument(
        '--mode',
        '-m',
        default='transformed',
        type=str,
        choices=['original', 'transformed', 'pipeline'],
        help='display mode; display original pictures or '
        'transformed pictures or comparison pictures. "original" '
        'means show images load from disk; "transformed" means '
        'to show images after transformed; "pipeline" means show all '
        'the intermediate images. Defaults to "transformed".')
    parser.add_argument(
        '--out-dir',
        default='output',
        type=str,
        help='If there is no display interface, you can save it.')
    parser.add_argument('--not-show', default=False, action='store_true')
    parser.add_argument(
        '--show-number',
        '-n',
        type=int,
        default=sys.maxsize,
        help='number of images selected to visualize, '
        'must bigger than 0. if the number is bigger than length '
        'of dataset, show all the images in dataset; '
        'default "sys.maxsize", show all images in dataset')
    parser.add_argument(
        '--show-interval',
        '-i',
        type=float,
        default=3,
        help='the interval of show (s)')
    parser.add_argument(
        '--cfg-options',
        nargs='+',
        action=DictAction,
        help='override some settings in the used config, the key-value pair '
        'in xxx=yyy format will be merged into config file. If the value to '
        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
        'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
        'Note that the quotation marks are necessary and that no white space '
        'is allowed.')
    args = parser.parse_args()
    return args
  • You can use the following command to check whether the data processing is up to standard:
python tools/analysis_tools/browse_dataset.py ./configs/custom_dataset/yolov6_l_syncbn_fast_1xb8-100e_animal.py \
                                              --show-interval 3

train

training visualization

  • MMYOLO currently provides 2 visualization methods, wandb and TensorBoard; choose one according to your own situation.

wandb

  • I personally recommend the wandb method, because you only need to log in to a web page to watch training in real time, which is very convenient, and the visualization is better.
  • First, register on the official website of wandb, and obtain the API Keys in the wandb settings.
  • Then install wandb on the command line and log in:
pip install wandb
# After running wandb login, enter the API Keys obtained above to complete the login.
wandb login
  • Finally, add the configuration code at the end of the newly created config file .\configs\custom_dataset\yolov6_l_syncbn_fast_1xb8-100e_animal.py:
visualizer = dict(vis_backends=[dict(type='LocalVisBackend'), dict(type='WandbVisBackend')])

TensorBoard

  • First you need to install the TensorBoard environment:
pip install tensorboard
  • Then add the configuration code at the end of the newly created config file .\configs\custom_dataset\yolov6_l_syncbn_fast_1xb8-100e_animal.py:
visualizer = dict(vis_backends=[dict(type='LocalVisBackend'),dict(type='TensorboardVisBackend')])
  • After running the training command, TensorBoard files will be generated under the visualization folder work_dirs\yolov6_l_syncbn_fast_1xb8-100e_animal\${TIMESTAMP}\vis_data. Run the following command to view the loss, learning rate, coco/bbox_mAP, and other visualization data in the browser:
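# Assumed command (standard TensorBoard CLI); point --logdir at the folder containing vis_data
tensorboard --logdir=work_dirs/yolov6_l_syncbn_fast_1xb8-100e_animal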

execute training

  • Open the .\tools\train.py file and modify the parameters: change the positional config to --config and set a default value.
def parse_args():
    parser = argparse.ArgumentParser(description='Train a detector')
    parser.add_argument('--config',default='./configs/custom_dataset/yolov6_l_syncbn_fast_1xb8-100e_animal.py' ,help='train config file path')
    parser.add_argument('--work-dir', help='the dir to save logs and models')
    parser.add_argument(
        '--amp',
        action='store_true',
        default=False,
        help='enable automatic-mixed-precision training')
    parser.add_argument(
        '--resume',
        nargs='?',
        type=str,
        const='auto',
        help='If specify checkpoint path, resume from it, while if not '
        'specify, try to auto resume from the latest checkpoint '
        'in the work directory.')
    parser.add_argument(
        '--cfg-options',
        nargs='+',
        action=DictAction,
        help='override some settings in the used config, the key-value pair '
        'in xxx=yyy format will be merged into config file. If the value to '
        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
        'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
        'Note that the quotation marks are necessary and that no white space '
        'is allowed.')
    parser.add_argument(
        '--launcher',
        choices=['none', 'pytorch', 'slurm', 'mpi'],
        default='none',
        help='job launcher')
    parser.add_argument('--local_rank', type=int, default=0)
    args = parser.parse_args()
    if 'LOCAL_RANK' not in os.environ:
        os.environ['LOCAL_RANK'] = str(args.local_rank)

    return args
  • After running, you can see the specific training information on the wandb web page or in TensorBoard.
  • The following is the accuracy of the best weights (work_dirs\best_coco_bbox_mAP_epoch_97.pth) from training 100 epochs on 1 x 2080Ti with batch size = 8; a value of -1 means the dataset contains no objects of that scale:
coco/bbox_mAP: 0.8910  coco/bbox_mAP_50: 1.0000  coco/bbox_mAP_75: 1.0000  coco/bbox_mAP_s: -1.0000  coco/bbox_mAP_m: -1.0000  coco/bbox_mAP_l: 0.8910  data_time: 0.0004  time: 0.0256

inference

  • Use the best model for inference: open the .\demo\image_demo.py file and change the --img, --config, and --checkpoint parameters. It is worth noting that the img parameter can be a folder path, a single file path, or a URL; the --config parameter is the configuration file we created; and --checkpoint is the best weight from training.
def parse_args():
    parser = ArgumentParser()
    parser.add_argument('--img', default='./data/images/Image_20230621152815633.bmp', help='Image path, include image file, dir and URL.')
    parser.add_argument('--config', default='./configs/custom_dataset/yolov6_l_syncbn_fast_1xb8-100e_animal.py', help='Config file')
    parser.add_argument('--checkpoint', default='./work_dirs/best_coco_bbox_mAP_epoch_97.pth', help='Checkpoint file')
    parser.add_argument(
        '--out-dir', default='./output', help='Path to output file')
    parser.add_argument(
        '--device', default='cpu', help='Device used for inference')
    parser.add_argument(
        '--show', action='store_true', help='Show the detection results')
    parser.add_argument(
        '--deploy',
        action='store_true',
        help='Switch model to deployment mode')
    parser.add_argument(
        '--tta',
        action='store_true',
        help='Whether to use test time augmentation')
    parser.add_argument(
        '--score-thr', type=float, default=0.3, help='Bbox score threshold')
    parser.add_argument(
        '--class-name',
        nargs='+',
        type=str,
        help='Only Save those classes if set')
    parser.add_argument(
        '--to-labelme',
        action='store_true',
        help='Output labelme style label file')
    args = parser.parse_args()
    return args
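  • With the defaults set above, inference can be run directly, and the results are saved to ./output:
python demo/image_demo.py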

Origin: blog.csdn.net/qq_20144897/article/details/131476209