[YOLOv6: deploying the model and training it on your own dataset in detail (with a long record of the bugs I hit)]

Foreword:

Recently I read the YOLOv6 paper published by Meituan, which reports speeds of up to 1234 FPS, so I wanted to deploy the model, test it, and train it on my own dataset. Unexpectedly, the freshly cloned official source code was full of bugs. While deploying it I saw that other people were stuck on the same bugs, and for some of them the maintainers had not given a clear solution. By combining other people's experience with my own understanding, I got the full workflow running: training, evaluation, and inference (TensorRT acceleration will follow in a later update). I summarize the pitfalls here in the hope of helping newcomers avoid them.

Model download and environment deployment

YOLOv6 model code: https://github.com/meituan/YOLOv6/tree/0.2.0
YOLOv6 paper: https://arxiv.org/abs/2209.02976

Environment configuration

The YOLO family of repositories ships a "requirements.txt" file listing the required packages and their versions. Activate the environment in which you plan to run the model, cd into the YOLOv6 project directory, and run the install command to install the packages listed in "requirements.txt":

cd YOLOv6
pip install -r requirements.txt

Note: To run inference on the GPU, you first need to install the graphics driver, CUDA, cuDNN, and a GPU build of torch that matches your machine. (For this part, see my other blog post: GPU version PyTorch detailed installation tutorial.)
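
Before moving on, it can help to confirm which torch build the environment actually picked up. The snippet below is a minimal sketch and not part of YOLOv6. The helper name describe_torch_env is made up for illustration, and it returns a short report instead of failing when torch is not installed.

```python
def describe_torch_env():
    """Return a small report on the installed torch build (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # torch is missing: the requirements were not installed in this environment
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # False means inference would fall back to the CPU
        "cuda_available": torch.cuda.is_available(),
        # CUDA version torch was built against (None for CPU-only builds)
        "cuda_build": torch.version.cuda,
        # cuDNN version seen by torch (None if cuDNN is unavailable)
        "cudnn_version": torch.backends.cudnn.version(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If cuda_available is reported as False although a GPU is present, the installed torch is most likely a CPU-only build or does not match the installed driver.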

Inference test

Once the environment is configured, you can run the official weights to check that the basic setup is correct.
(To download the official weights, click the name of the weight you want in the performance comparison table in the README; it links to the download.)
[Screenshot: performance comparison table in the README]
Test command:

python tools/infer.py --weights yolov6s.pt --source img.jpg / imgdir 

Note: Detection from a camera stream is not yet supported.

Pitfalls during testing!!!

If you look at the structure of the YOLOv6 project, you will find it unusual. As the test step above shows, the authors put the inference, training, and evaluation scripts under tools, and the project also has a yolov6 subfolder that contains many of the functions. This layered structure leads to some errors when the scripts are called:

1. The path in infer.py is wrong

In fact, this one can be solved by reading the error message carefully. The source code does not take the folder structure into account, so the default paths of some parameters are wrong, such as the paths in the figure below. infer.py lives under tools, so when it refers to a file in the parent directory, add "../" in front of the path:
[Screenshot: corrected paths in tools/infer.py]
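
Relative paths are resolved against the current working directory, so this fix matters when infer.py is launched from inside tools (as an IDE often does). A way to avoid depending on the working directory is to build paths from the script's own location. The following is a minimal sketch of that idea, not code from YOLOv6. The helper name resolve_from_root and the example path weights/yolov6s.pt are assumptions for illustration.

```python
from pathlib import Path


def resolve_from_root(script_file, relative_path):
    """Resolve relative_path against the project root (hypothetical helper).

    The project root is assumed to be the parent of the directory that
    contains script_file, e.g. <root>/tools/infer.py -> <root>.
    """
    project_root = Path(script_file).resolve().parent.parent
    return project_root / relative_path


if __name__ == "__main__":
    # Inside tools/infer.py one would pass __file__ as the first argument.
    print(resolve_from_root("/YOLOv6/tools/infer.py", "weights/yolov6s.pt"))
```

With this pattern the same default path works whether the script is started from the project root or from tools.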

2. Error: AssertionError: font path not exists: ./yolov6/utils/Arial.ttf

This error has the same cause: the official code uses the wrong path. Specifically, on line 244 of yolov6/core/inferer.py, add one more "." in front of this path. The corrected code is shown in the figure below:
[Screenshot: corrected font path in yolov6/core/inferer.py]
After the modification, run the test command again:
[Screenshot: output of the test command]

Train your own dataset

1. Data set preparation

The official README recommends a dataset in COCO format. I used a VOC dataset for training. I will not explain dataset annotation here; it is similar to YOLOv5. Export the labels in XML format, then use the conversion script below to convert the XML labels into the txt labels that YOLO requires and to arrange the files automatically in the VOC dataset layout. (I wrote the script when deploying YOLOv5, so the generated files are named after yolov5. This is not a bug and does not affect use!)

import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join
import random
from shutil import copyfile

classes = ["A", "B"]   # class names used when annotating the labels

TRAIN_RATIO = 80       # dataset split: roughly 80% of the images go to training

def clear_hidden_files(path):
    dir_list = os.listdir(path)
    for i in dir_list:
        abspath = os.path.join(os.path.abspath(path), i)
        if os.path.isfile(abspath):
            if i.startswith("._"):
                os.remove(abspath)
        else:
            clear_hidden_files(abspath)

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)

def convert_annotation(image_id):
    in_file = open('VOCdevkit/VOC2007/Annotations/%s.xml' %image_id)        # only the XML annotation path loaded here needs to change; running the script then generates a standard VOC-format dataset automatically
    out_file = open('VOCdevkit/VOC2007/YOLOLabels/%s.txt' %image_id, 'w')
    tree=ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w,h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()

wd = os.getcwd()
data_base_dir = os.path.join(wd, "VOCdevkit/")
if not os.path.isdir(data_base_dir):
    os.mkdir(data_base_dir)
work_sapce_dir = os.path.join(data_base_dir, "VOC2007/")
if not os.path.isdir(work_sapce_dir):
    os.mkdir(work_sapce_dir)
annotation_dir = os.path.join(work_sapce_dir, "Annotations/")
if not os.path.isdir(annotation_dir):
        os.mkdir(annotation_dir)
clear_hidden_files(annotation_dir)
image_dir = os.path.join(work_sapce_dir, "JPEGImages/")
if not os.path.isdir(image_dir):
        os.mkdir(image_dir)
clear_hidden_files(image_dir)
yolo_labels_dir = os.path.join(work_sapce_dir, "YOLOLabels/")
if not os.path.isdir(yolo_labels_dir):
        os.mkdir(yolo_labels_dir)
clear_hidden_files(yolo_labels_dir)
yolov5_images_dir = os.path.join(data_base_dir, "images/")
if not os.path.isdir(yolov5_images_dir):
        os.mkdir(yolov5_images_dir)
clear_hidden_files(yolov5_images_dir)
yolov5_labels_dir = os.path.join(data_base_dir, "labels/")
if not os.path.isdir(yolov5_labels_dir):
        os.mkdir(yolov5_labels_dir)
clear_hidden_files(yolov5_labels_dir)
yolov5_images_train_dir = os.path.join(yolov5_images_dir, "train/")
if not os.path.isdir(yolov5_images_train_dir):
        os.mkdir(yolov5_images_train_dir)
clear_hidden_files(yolov5_images_train_dir)
yolov5_images_test_dir = os.path.join(yolov5_images_dir, "val/")
if not os.path.isdir(yolov5_images_test_dir):
        os.mkdir(yolov5_images_test_dir)
clear_hidden_files(yolov5_images_test_dir)
yolov5_labels_train_dir = os.path.join(yolov5_labels_dir, "train/")
if not os.path.isdir(yolov5_labels_train_dir):
        os.mkdir(yolov5_labels_train_dir)
clear_hidden_files(yolov5_labels_train_dir)
yolov5_labels_test_dir = os.path.join(yolov5_labels_dir, "val/")
if not os.path.isdir(yolov5_labels_test_dir):
        os.mkdir(yolov5_labels_test_dir)
clear_hidden_files(yolov5_labels_test_dir)

train_file = open(os.path.join(wd, "yolov5_train.txt"), 'w')
test_file = open(os.path.join(wd, "yolov5_val.txt"), 'w')
train_file.close()
test_file.close()
train_file = open(os.path.join(wd, "yolov5_train.txt"), 'a')
test_file = open(os.path.join(wd, "yolov5_val.txt"), 'a')
list_imgs = os.listdir(image_dir) # list image files
prob = random.randint(1, 100)
print("Probability: %d" % prob)
for i in range(0,len(list_imgs)):
    path = os.path.join(image_dir,list_imgs[i])
    if os.path.isfile(path):
        image_path = image_dir + list_imgs[i]
        voc_path = list_imgs[i]
        (nameWithoutExtention, extention) = os.path.splitext(os.path.basename(image_path))
        (voc_nameWithoutExtention, voc_extention) = os.path.splitext(os.path.basename(voc_path))
        annotation_name = nameWithoutExtention + '.xml'
        annotation_path = os.path.join(annotation_dir, annotation_name)
        label_name = nameWithoutExtention + '.txt'
        label_path = os.path.join(yolo_labels_dir, label_name)
        # Split and copy only when the entry is a regular file, so that
        # paths left over from a previous iteration are never reused.
        prob = random.randint(1, 100)
        print("Probability: %d" % prob)
        if(prob < TRAIN_RATIO): # train dataset
            if os.path.exists(annotation_path):
                train_file.write(image_path + '\n')
                convert_annotation(nameWithoutExtention) # convert label
                copyfile(image_path, yolov5_images_train_dir + voc_path)
                copyfile(label_path, yolov5_labels_train_dir + label_name)
        else: # test dataset
            if os.path.exists(annotation_path):
                test_file.write(image_path + '\n')
                convert_annotation(nameWithoutExtention) # convert label
                copyfile(image_path, yolov5_images_test_dir + voc_path)
                copyfile(label_path, yolov5_labels_test_dir + label_name)
train_file.close()
test_file.close()

I will not go into more detail about preparing the dataset; there are many tutorials on building VOC datasets.
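
To make the coordinate conversion in the script concrete, here is a small worked example together with a sanity check for the generated label lines. It is a sketch that mirrors the arithmetic of convert() above. The function names voc_box_to_yolo and is_valid_yolo_line are introduced here for illustration and are not part of the script or of YOLOv6.

```python
def voc_box_to_yolo(img_w, img_h, xmin, xmax, ymin, ymax):
    """Convert a VOC pixel box to normalized YOLO (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return (x_center, y_center, width, height)


def is_valid_yolo_line(line, num_classes):
    """Check one label line of the form 'class_id x y w h'."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        class_id = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    # The class index must exist and all coordinates must be normalized.
    return 0 <= class_id < num_classes and all(0.0 <= c <= 1.0 for c in coords)


if __name__ == "__main__":
    # A 200x240 pixel box inside a 640x480 image.
    box = voc_box_to_yolo(640, 480, xmin=100, xmax=300, ymin=120, ymax=360)
    print(box)  # (0.3125, 0.5, 0.3125, 0.5)
    line = "0 " + " ".join(str(v) for v in box)
    print(is_valid_yolo_line(line, num_classes=2))  # True
```

Running a check like this over the generated txt files catches a wrong class list or swapped coordinates before a long training run starts.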

2. Modify voc.yaml

Set the paths of the training dataset in data/voc.yaml:
[Screenshot: dataset paths in data/voc.yaml]
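
As a rough illustration of what such a file can contain, the sketch below renders a minimal dataset description as text. The key names (train, val, is_coco, nc, names) are my assumption based on the sample dataset file in the YOLOv6 repository, so check them against the data/voc.yaml in your checkout. The helper render_dataset_yaml is hypothetical, and the directories are the ones created by the conversion script above.

```python
def render_dataset_yaml(train_dir, val_dir, class_names):
    """Render a minimal dataset description as YAML text (hypothetical helper)."""
    names = ", ".join(f"'{name}'" for name in class_names)
    lines = [
        f"train: {train_dir}",        # directory with the training images
        f"val: {val_dir}",            # directory with the validation images
        "is_coco: False",             # assumed key: custom data is not COCO
        f"nc: {len(class_names)}",    # number of classes
        f"names: [{names}]",          # class names, same order as in the script
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dataset_yaml("VOCdevkit/images/train",
                              "VOCdevkit/images/val",
                              ["A", "B"]))
```

The class names must be in the same order as the classes list in the conversion script, because the txt labels store only the index.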

3. Modify the parameters loaded in tools/train.py and start training

Pay attention to the paths here and to the training weights that are loaded. Set '--batch-size' and '--workers' according to the performance of your computer. For example:

If you get the error: RuntimeError: CUDA out of memory. Tried to allocate 200.00 MiB (GPU 0; 4.00 GiB total capacity; 2.88 GiB already allocated; 0 bytes free; 2.89 GiB reserved in total by PyTorch)
reduce '--batch-size'.

If you get the error: the paging file is too small to complete the operation.
reduce '--workers'.
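
Instead of editing the defaults in tools/train.py, the same values can be passed on the command line. The sketch below only assembles such a command so the two knobs are easy to see. '--batch-size' and '--workers' come from the text above. The '--conf-file' and '--data-path' flags and the config path are assumptions about the 0.2.0 training script and should be checked against its argument parser. The helper build_train_command is hypothetical.

```python
def build_train_command(batch_size, workers):
    """Assemble a tools/train.py command line as a list of arguments."""
    return [
        "python", "tools/train.py",
        "--batch-size", str(batch_size),      # lower this on "CUDA out of memory"
        "--workers", str(workers),            # lower this on the Windows paging-file error
        "--conf-file", "configs/yolov6s.py",  # assumed flag and path
        "--data-path", "data/voc.yaml",       # assumed flag
    ]


if __name__ == "__main__":
    # Conservative settings for a GPU with about 4 GiB of memory.
    print(" ".join(build_train_command(batch_size=8, workers=2)))
```

Halving the batch size until the out-of-memory error disappears is usually quicker than guessing a value.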

Pitfalls during training!!!

1. Error: AttributeError: 'Trainer' object has no attribute 'epoch'

[Screenshot: AttributeError traceback]
Analysis and solution:
The error message says that the "Trainer" class has no attribute named "epoch". Tracing it back to the "Trainer" class in yolov6/core/engine.py, the program uses "self.epoch" at line 261, as follows:
[Screenshot: the use of self.epoch in yolov6/core/engine.py]
Tracing the problem further up, I was surprised to find that the attribute "epoch" is not defined in this class!!! That is what causes this very basic error when the attribute is accessed!!! The solution is simple: define the attribute in the initialization of the class. The epoch value is nothing unfamiliar; it is the parameter defined in "args". So, as shown in the figure below, read the parameter from "args" and assign it to the epoch attribute:
[Screenshot: defining self.epoch in the Trainer initialization]
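
The failure mode and the fix can be reproduced with a toy class. This is a minimal sketch and not the real Trainer from yolov6/core/engine.py. The class names, the report method, and the argument name args.epochs are assumptions made for illustration.

```python
from types import SimpleNamespace


class BrokenTrainer:
    """Toy stand-in for a trainer that never defines self.epoch."""

    def __init__(self, args):
        self.max_epoch = args.epochs

    def report(self):
        # Raises AttributeError because self.epoch was never assigned.
        return f"epoch {self.epoch} of {self.max_epoch}"


class FixedTrainer(BrokenTrainer):
    """Same toy class with the attribute defined during initialization."""

    def __init__(self, args):
        super().__init__(args)
        # Assumed argument name; mirrors the fix of reading the value from args.
        self.epoch = args.epochs


if __name__ == "__main__":
    args = SimpleNamespace(epochs=100)
    try:
        BrokenTrainer(args).report()
    except AttributeError as err:
        print(err)
    print(FixedTrainer(args).report())
```

Any attribute that a method reads has to be assigned before that method can run, which is why defining it in the initialization removes the error.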

Successful training:

After fixing the pitfalls above, training runs:
[Screenshot: training output]

TensorRT accelerated inference

I am currently working on TensorRT acceleration for YOLOv6; this section will be updated once I have organized the material!

Origin blog.csdn.net/uuhhy/article/details/127622432