[Object detection] Training YOLOX on my own infrared dataset, and the pitfalls I ran into

I recently trained the YOLOX network on my own infrared dataset (organized in COCO dataset format), and this post records the process.
YOLOX code: https://github.com/Megvii-BaseDetection/YOLOX

Dataset preparation

Generate the JSON annotation file

My dataset consists of jpg images, each with a corresponding txt label file. To train YOLOX on it, the label files first have to be converted into annotation files in COCO dataset format. The official code does not cover this step, so I found similar code in the data_prepare.py file of QueryDet and modified it slightly to generate the JSON annotation files I needed.
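To give an idea of what the generated file has to contain, here is a minimal sketch of the COCO structure. It is not the QueryDet script. It assumes the boxes for each image have already been parsed into pixel coordinates as (x_min, y_min, x_max, y_max, class_index); parsing the txt files is left out because their layout is not covered here, and build_coco and its arguments are names made up for this example.

```python
import json


def build_coco(records, class_names):
    """Build a COCO-style annotation dict.

    records: list of (file_name, width, height, boxes), where boxes is a list
    of (x_min, y_min, x_max, y_max, class_index) in pixels (an assumed layout).
    """
    coco = {
        "images": [],
        "annotations": [],
        # Category ids start at 1 in this sketch.
        "categories": [
            {"id": i + 1, "name": name} for i, name in enumerate(class_names)
        ],
    }
    ann_id = 1
    for image_id, (file_name, width, height, boxes) in enumerate(records, start=1):
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for x_min, y_min, x_max, y_max, cls in boxes:
            w, h = x_max - x_min, y_max - y_min
            coco["annotations"].append(
                {
                    "id": ann_id,
                    "image_id": image_id,  # ties the box to its image
                    "category_id": cls + 1,
                    "bbox": [x_min, y_min, w, h],  # COCO uses [x, y, width, height]
                    "area": w * h,
                    "iscrowd": 0,
                }
            )
            ann_id += 1
    return coco


if __name__ == "__main__":
    demo = build_coco([("0001.jpg", 640, 512, [(10, 20, 50, 60, 0)])], ["target"])
    print(json.dumps(demo["annotations"][0]))
```

The important part is that every annotation carries the image_id of the image it belongs to. If that link is wrong, training will still run but the loss will not go down.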

Code modifications

Modify the directory information of the dataset

Create your own exp file.
I want to train yolox_l first, so I created a new file yolox_l_mydata.py in the exps\default directory, copied the contents of yolox_l.py into it, and then added the dataset directory information:

        self.data_dir = "dataset"
        self.train_ann = "dataset/annotations/instances_train2017.json"
        self.val_ann = "dataset/annotations/instances_val2017.json"

Modify the number of classes

Also set the number of classes in the yolox_l_mydata.py file. I have only one class of target here:

        self.num_classes = 1

The final yolox_l_mydata.py file is as follows:

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# Copyright (c) Megvii, Inc. and its affiliates.

import os

from yolox.exp import Exp as MyExp

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 1.0
        self.width = 1.0
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
        
        self.data_dir = "dataset"
        self.train_ann = "dataset/annotations/instances_train2017.json"
        self.val_ann = "dataset/annotations/instances_val2017.json"
        
        self.num_classes = 1

Edit the class labels

The object I need to detect is named target, so I commented out all the original classes in the yolox\data\datasets\coco_classes.py file and changed it to:

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# Copyright (c) Megvii, Inc. and its affiliates.

COCO_CLASSES = (
    "target",
)

Training

After that, you should be able to train (I hope I didn't miss anything):

python tools/train.py -f exps/default/yolox_l_mydata.py -d 2 -b 16 --fp16 -o -c pretrained/yolox_l.pth --cache

Where:
-f: the exp file to use
-d: number of GPUs
-b: total batch size (recommended to be 8 times the number of GPUs)
-o: occupy GPU memory first before training
-c: checkpoint file to load (here the pretrained yolox_l weights)
--fp16: mixed precision training
--cache: cache images in RAM to speed up training (make sure you have enough system RAM when using it)

Pitfalls I ran into

Images and labels did not correspond when generating the JSON annotation file

I recorded this one separately in the post "[python] The tragedy caused by os.listdir".
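Without repeating that post: os.listdir returns names in arbitrary order, so an image listing and a label listing should never be matched by position. A minimal sketch of matching by file name instead is below. pair_by_stem is a hypothetical helper written for this illustration; it is not code from QueryDet or YOLOX.

```python
import os


def pair_by_stem(image_names, label_names):
    """Match each image to the label file that has the same base name."""
    labels = {os.path.splitext(name)[0]: name for name in label_names}
    pairs = []
    # Sorting makes the output order reproducible between runs.
    for image in sorted(image_names):
        stem = os.path.splitext(image)[0]
        if stem not in labels:
            raise FileNotFoundError(f"no label file for {image}")
        pairs.append((image, labels[stem]))
    return pairs


if __name__ == "__main__":
    # In practice the two lists would come from os.listdir on each folder.
    print(pair_by_stem(["b.jpg", "a.jpg"], ["a.txt", "b.txt"]))
```

Failing loudly on a missing label is deliberate here: a silent mismatch is much harder to track down than an exception.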

After changing the dataset, you need to manually delete the cached data file

This is the real pit! It tormented me for many days, and it is the only reason I wanted to write this post!
As described above, the labels I first generated did not correspond to the images, so I regenerated the JSON file and retrained the network. But the loss curve looked exactly the same as before and would not go down at all. I could not figure it out. I assumed my data processing was still wrong, so I kept hunting for a bug in the code that generates the JSON. I searched and searched and found nothing.
Today I was about to switch the dataset format for training (I had seen a YOLOX variant that trains on YOLO-format data: https://github.com/xialuxi/yolox-yolov5 and wanted to try it). When I opened the dataset folder again, I suddenly noticed a file that had not been there before: img_resized_cache_train2017.array. It is generated automatically the first time the network is trained and is tens of GB in size. My guess is that the whole dataset is stored in this cache file, and that later training runs read it directly instead of re-reading from the annotations, train2017 and val2017 folders, in order to speed up training. In other words, although I had corrected the label file, every later training run was still reading the cache generated from my earlier, wrong label file! That is why the loss would not go down: the images and labels did not correspond!
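To avoid forgetting this again, the stale cache can be cleared with a few lines before retraining. This is only a sketch: the glob pattern is an assumption generalized from the single file name I saw (img_resized_cache_train2017.array), "dataset" is the data_dir from my exp file, and remove_image_caches is a helper name made up here, not part of YOLOX.

```python
import glob
import os


def remove_image_caches(data_dir, pattern="img_resized_cache_*.array"):
    """Delete cached image arrays in data_dir and return the removed paths."""
    removed = []
    for path in sorted(glob.glob(os.path.join(data_dir, pattern))):
        os.remove(path)
        removed.append(path)
    return removed


if __name__ == "__main__":
    # "dataset" matches self.data_dir in yolox_l_mydata.py.
    for path in remove_image_caches("dataset"):
        print("removed", path)
```

Run it whenever the images or annotation files change, then start training again with --cache so the cache is rebuilt from the corrected data.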

It makes me want to cry just talking about it. All that time spent hunting for this problem TT


That is all on YOLOX training for now. I will see how well it performs once the network has finished training.

Origin blog.csdn.net/zylooooooooong/article/details/121380271