Training YoloV4 on your own dataset


1- Create a working folder

Create a new project folder to store the files needed for training:

mkdir train    # or any other English name

Switch to the working folder:

cd train

Create the relevant folders and files as shown in the directory tree below.

.
├── JPEGImages
├── Annotations
├── labels
├── backup
├── data
│   ├── train.data
│   ├── train.names
│   ├── yolov4.cfg
│   ├── yolov4-tiny.cfg
├── darknet
├── gen_files.py
├── yolov4.conv.137
├── yolov4-tiny.conv.29
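
For reference, the subfolders can be created in one go from inside the working folder (the gen_files.py script below will also create JPEGImages, Annotations, labels and backup if they are missing):

mkdir JPEGImages Annotations labels backup data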
Role of the folders:

  • JPEGImages: stores the images used for training
  • Annotations: stores the XML annotation files corresponding to the training images
  • labels: stores the YOLO-format txt annotation files
  • backup: stores the trained model (weights) files
  • data: stores the parameter files needed for training

Role of the files:

  • darknet: the darknet executable. Copy it in after compiling darknet.
  • gen_files.py: processes and splits the training images and annotations. See its contents below.
  • yolov4.conv.137: pre-trained weights of YoloV4 on the COCO dataset (without the yolo head layers). Download it yourself.
  • yolov4-tiny.conv.29: pre-trained weights of YoloV4-Tiny on the COCO dataset (without the yolo head layers). Download it yourself.
  • train.data: training data description. Create it yourself; its contents are added later.
  • train.names: training labels. Create it yourself; its contents are added later.
  • yolov4.cfg: network structure file needed to train YoloV4. Copy it in from the cfg folder of the darknet project.
  • yolov4-tiny.cfg: network structure file needed to train YoloV4-Tiny. Copy it in from the cfg folder of the darknet project.

The content of gen_files.py is as follows:

import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join
import random

# List of classes
classes=["ball"]


# Recursively remove hidden files (names starting with '._') under a path
def clear_hidden_files(path):
    dir_list = os.listdir(path)
    for i in dir_list:
        abspath = os.path.join(os.path.abspath(path), i)
        if os.path.isfile(abspath):
            if i.startswith("._"):
                os.remove(abspath)
        else:
            clear_hidden_files(abspath)


# Convert a Pascal VOC box (absolute pixel coordinates) to YOLO format (normalized center x/y, width, height)
def convert(size, box):
    '''
    size = (w, h)
    box = (xmin, xmax, ymin, ymax)
    '''
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)


# Convert a single annotation file from Pascal VOC XML to a YOLO-format txt file
def convert_annotation(image_id):
    in_file = open('Annotations/%s.xml' %image_id)
    out_file = open('labels/%s.txt' %image_id, 'w')
    tree=ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w,h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()


# Current working directory
wd = os.getcwd()

# Create the Annotations folder if it does not exist
annotation_dir = os.path.join(wd, "Annotations/")
if not os.path.isdir(annotation_dir):
    os.mkdir(annotation_dir)
# Remove hidden files
clear_hidden_files(annotation_dir)

# Create the JPEGImages folder if it does not exist
image_dir = os.path.join(wd, "JPEGImages/")
if not os.path.isdir(image_dir):
    os.mkdir(image_dir)
# Remove hidden files
clear_hidden_files(image_dir)

# Create the backup folder if it does not exist
backup_dir = os.path.join(wd, "backup/")
if not os.path.isdir(backup_dir):
    os.mkdir(backup_dir)
# Remove hidden files
clear_hidden_files(backup_dir)

# Create the labels folder if it does not exist
labels_dir = os.path.join(wd, "labels/")
if not os.path.isdir(labels_dir):
    os.mkdir(labels_dir)
# Remove hidden files
clear_hidden_files(labels_dir)

# Create (truncate) train.txt and test.txt, which will hold
# the full paths of the training and test images
train_file = open(os.path.join(wd, "train.txt"), 'w')
test_file = open(os.path.join(wd, "test.txt"), 'w')
train_file.close()
test_file.close()


# Training set list (append mode)
train_file = open(os.path.join(wd, "train.txt"), 'a')
# Test set list (append mode)
test_file = open(os.path.join(wd, "test.txt"), 'a')

# List all image files
image_list = os.listdir(image_dir)
for i in range(0, len(image_list)):
    path = os.path.join(image_dir, image_list[i])
    if os.path.isfile(path):
        image_path = image_dir + image_list[i]
        # Split the file name into base name and extension
        (nameWithoutExtention, extention) = os.path.splitext(os.path.basename(image_path))
        # Annotation file name
        annotation_name = nameWithoutExtention + '.xml'
        # Annotation file path
        annotation_path = os.path.join(annotation_dir, annotation_name)
        # Random number that decides the train/test split for this image
        probo = random.randint(1, 100)
        print("Probability: %d" % probo)
        # Train/test split: 75 here gives an (approximate) 75:25 train/test ratio
        if probo < 75:
            # Training set
            if os.path.exists(annotation_path):
                # Write the full image path into train.txt in the current directory
                train_file.write(image_path + '\n')
                # Convert the annotation to YOLO format
                convert_annotation(nameWithoutExtention)
        else:
            # Test set
            if os.path.exists(annotation_path):
                # Write the full image path into test.txt in the current directory
                test_file.write(image_path + '\n')
                # Convert the annotation to YOLO format
                convert_annotation(nameWithoutExtention)
# Close the file streams when done
train_file.close()
test_file.close()

2- Prepare the training dataset

2-1 Copy the images needed for training into the JPEGImages folder.

2-2 Copy the XML annotation files corresponding to the training images into the Annotations folder.

Make sure the labels folder contains no files, including hidden files.
Make sure the training images and annotation files correspond to each other one-to-one.

2-3 Modify classes in gen_files.py to your own labels.

E.g:

classes = ["person", "phone", "chair"]

Write exactly as many labels as you have annotated.

2-4 Run gen_files.py in the terminal

python3 gen_files.py

At this point, the text files train.txt and test.txt are generated in the current training folder,
and YOLO-format txt annotation files are generated under the labels folder.

The content of train.txt is the absolute paths of the training set images, one per line.
The content of test.txt is the absolute paths of the test set images, one per line.

The ratio of the number of entries in train.txt to test.txt is approximately the 75:25 split set earlier; a quick check is shown below.
Of course, other training/test split ratios can also be used.
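
As a quick sanity check, the split can be verified by counting lines (run from the training folder):

wc -l train.txt test.txt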

The files under labels are the YOLO-format annotation files for each image in the JPEGImages folder, converted from the XML annotation files in the Annotations folder.

The final training only needs:

  • train.txt
  • test.txt
  • the txt annotation files in the labels folder
  • the image files in the JPEGImages folder

The directory tree of the training folder is as follows:

.
├── JPEGImages
├── Annotations
├── labels
├── backup
├── data
│   ├── train.data
│   ├── train.names
│   ├── yolov4.cfg
│   ├── yolov4-tiny.cfg
├── darknet
├── gen_files.py
├── yolov4.conv.137
├── yolov4-tiny.conv.29
├── train.txt
├── test.txt

Compared with before, train.txt and test.txt are new,
and n YOLO-format txt annotation files have been added under the labels folder.


3 Modify the configuration file

3-1 Create the data/train.names file

You can copy data/voc.names from the darknet directory into the data folder of the training directory and rename it to train.names.
Then modify it according to your own situation.

The names file stores the training labels, one label per line, with no blank lines.

For example, in the case of 3 labels, it could be:

person
phone
chair

Just replace these with the labels you are training on.

3-2 Create the data/train.data file

You can copy cfg/coco.data from the darknet directory into the data folder of the training directory and rename it to train.data.

Then modify it according to your own situation.

The data file stores the number of classes, the location of the training set list, the location of the test set list, the location of the names file, and the folder where the trained models are stored.

E.g. (remember to change classes to your own number of categories):

classes = 80
train  = train.txt
valid  = test.txt
#valid = data/coco_val_5k.list
names = data/train.names
backup = backup
eval = coco

The file paths here are relative to the training working folder.

Parameters in train.data and their functions (see the example after this list):

  • classes: the number of categories; write however many categories you annotated and are training
  • train: path to the training set list train.txt
  • valid: path to the test set list test.txt
  • names: path to the names label file
  • backup: folder in which the trained models are stored
  • eval: evaluation type (the author has not looked into this yet)
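
For instance, for the hypothetical 3-label example from section 3-1 (person, phone, chair), train.data would look like:

classes = 3
train = train.txt
valid = test.txt
names = data/train.names
backup = backup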

Choose one of 3-3 and 3-4, depending on whether the model to be trained is YoloV4 or YoloV4-Tiny.


3-3 Create the data/yolov4-tiny.cfg file

You can copy cfg/yolov4-tiny.cfg from the darknet directory into the data folder of the training directory.
You can rename it according to your own situation, e.g. data/yolov4-tiny-xxx.cfg.

In the [net] section, set for example:

batch = 64
subdivisions = 32

In the data/yolov4-tiny.cfg file, the parameters of the two [yolo] layers and of the [convolutional] layer immediately before each of them need to be modified (see the excerpt after the examples below):

  • In both [yolo] layers, change classes to your number of categories.
  • In the [convolutional] layer immediately before each [yolo] layer, change filters to (classes + 5) * 3.

For example:
yolo layer: classes=1; convolutional layer: filters=18
yolo layer: classes=2; convolutional layer: filters=21
yolo layer: classes=4; convolutional layer: filters=27
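
As a concrete sketch, assuming a single class (such as "ball" in gen_files.py above), the changed lines in each of the two [yolo] blocks and the [convolutional] block right above it would look like this; every other line stays as in the original cfg:

[convolutional]
# filters = (classes + 5) * 3 = (1 + 5) * 3 = 18
filters=18

[yolo]
# classes = number of categories
classes=1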

3-4 Create the data/yolov4.cfg file

You can copy cfg/yolov4.cfg from the darknet directory into the data folder of the training directory.
You can rename it according to your own situation, e.g. data/yolov4-xxx.cfg.

In the [net] section, set for example:

batch = 64
subdivisions = 32

In the data/yolov4.cfg file, the parameters of the three [yolo] layers and of the [convolutional] layer immediately before each of them need to be modified:

  • In all three [yolo] layers, change classes to your number of categories.
  • In the [convolutional] layer immediately before each [yolo] layer, change filters to (classes + 5) * 3.

For example:
yolo layer: classes=1; convolutional layer: filters=18
yolo layer: classes=2; convolutional layer: filters=21
yolo layer: classes=4; convolutional layer: filters=27


4- Train your own dataset

4-1 Make sure yolov4-tiny.conv.29 or yolov4.conv.137 is in the training folder

4-2 Start training

Training command:

./darknet detector train <path to .data file> <path to .cfg file> <pre-trained weights file (without the yolo head layers)> -map

If you do not need to plot the mAP curve during training, omit -map at the end of the command, i.e.:

./darknet detector train data/train.data data/yolov4-tiny.cfg yolov4-tiny.conv.29

If you need to plot the mAP curve during training, add -map at the end of the command, i.e.:

./darknet detector train data/train.data data/yolov4-tiny.cfg yolov4-tiny.conv.29 -map

For training the YoloV4 model, you only need to replace the cfg configuration file and the pre-trained weights file in the command with the YoloV4 versions, as shown below.
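
E.g. (using the file names from this guide):

./darknet detector train data/train.data data/yolov4.cfg yolov4.conv.137 -map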

4-3 Training suggestions

  • batch=64
  • subdivisions=4 (or 2, 1)
  • YoloV4: set max_batches to classes*2000, but no less than 4000; for YoloV4-Tiny it can be reduced accordingly
  • YoloV4: set steps to 80% and 90% of max_batches, e.g. steps=3200,3600 for max_batches=4000; for YoloV4-Tiny it can be reduced accordingly (see the example after this list)
  • To increase the network resolution, increase the height and width values; they must be multiples of 32 (e.g. height=608, width=608, or any other multiple of 32). This helps improve detection accuracy.
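
For example, with 3 classes the cfg would use max_batches = 3*2000 = 6000 and steps at 80% and 90% of that:

max_batches = 6000
steps = 4800,5400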

5- Test the trained network model

After training finishes, the weight files can be found in the backup directory.

Test on an image:

./darknet detector test data/train.data data/yolov4-tiny.cfg backup/yolov4-tiny_best.weights xxx.jpg

Test on a video:

./darknet detector demo data/train.data data/yolov4-tiny.cfg backup/yolov4-tiny_best.weights xxx.mp4

6-Performance Statistics

The performance of the model is mainly measured by mAP.

6-1 Compute mAP@IoU=0.50:

./darknet detector map data/train.data data/yolov4-tiny.cfg backup/yolov4-tiny_best.weights

6-2 Compute mAP@IoU=0.75:

./darknet detector map data/train.data data/yolov4-tiny.cfg backup/yolov4-tiny_best.weights -iou_thresh 0.75

7- Anchor box (prior box) cluster analysis and modification

7-1 Use k-means clustering to obtain the prior box sizes for your own dataset

For YoloV4-Tiny:

./darknet detector calc_anchors data/train.data -num_of_clusters 6 -width 416 -height 416

For YoloV4:

./darknet detector calc_anchors data/train.data -num_of_clusters 9 -width 416 -height 416

7-2 Modify the prior box sizes in the cfg file

Replace the numbers after anchors= in each [yolo] layer of the cfg file with the values produced by the clustering step, e.g. as shown below.
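
For reference (an illustration only; use your own computed values), the anchors line in the stock yolov4-tiny.cfg looks like this:

anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319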

7-3 Retrain and test

Origin: blog.csdn.net/LK007CX/article/details/109719420