Training YOLOv5 on Your Own Dataset (Detailed Guide)


Table of contents

1. Prepare the deep learning environment

2. Prepare your own dataset

   1. Create a dataset

   2. Convert data format

   3. Configuration file

3. Model training

   1. Download the pre-trained model

   2. Training

4. Model testing

5. Model inference


The full workflow for training YOLOv5 on your own dataset is: environment installation ---- dataset creation ---- model training ---- model testing ---- model inference.

 1. Prepare the deep learning environment

My laptop runs Windows 10.
First, go to the YOLOv5 open-source page on GitHub and either download the zip manually or git clone the remote repository. I downloaded the v5.0 release of YOLOv5. The code folder contains a requirements.txt file that lists the required packages.

For this article I installed PyTorch 1.8.1, torchvision 0.9.1, and Python 3.7.10. The remaining dependencies can be installed from the requirements.txt file.
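As a quick sanity check (a minimal sketch, not part of the YOLOv5 repository), the installed versions and GPU visibility can be confirmed from Python:

import torch
import torchvision

print(torch.__version__)          # expect 1.8.1
print(torchvision.__version__)    # expect 0.9.1
print(torch.cuda.is_available())  # True once CUDA and the GPU driver are set up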

2. Prepare your own dataset

I annotated my data in VOC format, so the following sections describe how to convert a VOC-style dataset into one that YOLOv5 can use directly.

1. Create a dataset

Create a mydata folder under the data directory of the YOLOv5 repository (the name can be customized) and move the images and the xml files previously annotated with labelImg into the corresponding subdirectories. The structure is as follows:

mydata
    images     # stores the pictures
    xml        # stores the xml annotation file corresponding to each picture
    dataSet    # train.txt, val.txt, test.txt and trainval.txt will be generated here,
               # each listing the image names for its split (without the .jpg suffix)

  • images corresponds to JPEGImages in the VOC dataset layout and holds the pictures themselves.

  • The xml folder holds the .xml annotation files (the labeling tool used is labelImg).

  • The dataSet folder stores the split into training, validation, and test sets. Create a split_train_val.py file to generate the split; the code is as follows:
# coding:utf-8

import os
import random
import argparse

parser = argparse.ArgumentParser()
# path to the xml annotations; change to match your data (in standard VOC this is Annotations)
parser.add_argument('--xml_path', default='xml', type=str, help='input xml label path')
# output directory for the split lists (ImageSets/Main in standard VOC; dataSet here)
parser.add_argument('--txt_path', default='dataSet', type=str, help='output txt label path')
opt = parser.parse_args()

trainval_percent = 1.0  # fraction of the data used for train+val (the remainder becomes the test set)
train_percent = 0.9     # fraction of trainval used for training (the remainder becomes the val set)
xmlfilepath = opt.xml_path
txtsavepath = opt.txt_path
total_xml = os.listdir(xmlfilepath)
if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

num = len(total_xml)
list_index = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list_index, tv)
train = random.sample(trainval, tr)

file_trainval = open(txtsavepath + '/trainval.txt', 'w')
file_test = open(txtsavepath + '/test.txt', 'w')
file_train = open(txtsavepath + '/train.txt', 'w')
file_val = open(txtsavepath + '/val.txt', 'w')

for i in list_index:
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        file_trainval.write(name)
        if i in train:
            file_train.write(name)
        else:
            file_val.write(name)
    else:
        file_test.write(name)

file_trainval.close()
file_train.close()
file_val.close()
file_test.close()
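Assuming the script is saved in the data/mydata directory (so the default relative paths xml and dataSet resolve), it can be run as:

python split_train_val.py --xml_path xml --txt_path dataSet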
  • After running the script, four txt files are generated under the dataSet folder: trainval.txt, train.txt, val.txt, and test.txt.

  • Each txt file lists image names (without the extension), one per line; note that with trainval_percent = 1.0, test.txt stays empty.
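For illustration (hypothetical file names), train.txt might contain:

000001
000003
000007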

 2. Convert data format

Next, prepare the labels by converting the dataset into yolo_txt format: the bounding-box information in each xml annotation is extracted into a txt file, one txt file per image. Each line of the file describes one object in the format class x_center y_center width height, with all coordinates normalized to [0, 1].
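As a worked example with hypothetical numbers: on a 640x480 image, an object of class 0 with VOC box xmin=120, xmax=280, ymin=80, ymax=240 gives x_center = ((120+280)/2)/640 = 0.3125, y_center = ((80+240)/2)/480 = 0.3333, width = (280-120)/640 = 0.25, and height = (240-80)/480 = 0.3333, so its label line is:

0 0.3125 0.3333 0.25 0.3333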

  • Create a voc_label.py file to generate the labels (used during training) from the training, validation, and test splits, and to write the image paths of each split into a txt file. The code is as follows:
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
import os
from os import getcwd

sets = ['train', 'val', 'test']
classes = ["a", "b"]   # change to your own class names
abs_path = os.getcwd()
print(abs_path)

def convert(size, box):
    # convert a VOC box (xmin, xmax, ymin, ymax) into normalized YOLO (x_center, y_center, w, h)
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h

def convert_annotation(image_id):
    in_file = open('data/mydata/xml/%s.xml' % (image_id), encoding='UTF-8')
    out_file = open('data/mydata/labels/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        # some tools write the tag as 'difficult', others as 'Difficult'; handle both
        difficult = obj.find('difficult')
        if difficult is None:
            difficult = obj.find('Difficult')
        difficult = difficult.text if difficult is not None else '0'
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        b1, b2, b3, b4 = b
        # clamp annotations that extend beyond the image border
        if b2 > w:
            b2 = w
        if b4 > h:
            b4 = h
        b = (b1, b2, b3, b4)
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()
for image_set in sets:
    if not os.path.exists('data/mydata/labels/'):
        os.makedirs('data/mydata/labels/')
    image_ids = open('data/mydata/dataSet/%s.txt' % (image_set)).read().strip().split()
    # write the list of absolute image paths for this split (consumed by the yaml config)
    list_file = open('data/mydata/%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        list_file.write(abs_path + '/data/mydata/images/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    list_file.close()
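Run the script from the yolov5 root directory so the relative data/mydata/... paths resolve:

python voc_label.py

This produces one txt label file per annotated image under data/mydata/labels, plus train.txt, val.txt, and test.txt under data/mydata listing the absolute image paths for each split.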

 3. Configuration file

1) Dataset configuration
Create a new mydata.yaml file (the name can be customized) under the data folder in the yolov5 directory. It points to the training and validation split files (train.txt and val.txt, both generated by running voc_label.py above) and then defines the number of target classes and the list of class names.
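A minimal sketch of mydata.yaml, assuming the two example classes ('a' and 'b') used in voc_label.py above:

train: data/mydata/train.txt   # image list written by voc_label.py
val: data/mydata/val.txt

nc: 2                          # number of classes
names: ['a', 'b']              # class names, in the same order as in voc_label.py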

2) Model selection
The models folder under the yolov5 directory holds the model configuration files. Four variants are provided: s, m, l, and x, in increasing size (and, as the architecture grows, increasing training time). Assuming yolov5x.yaml is used, only one parameter needs to be modified: change nc to your own number of classes.
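For example, the top of models/yolov5x.yaml would then start like this (only nc is edited; the two multipliers are the stock yolov5x values):

nc: 2                  # number of classes (80 in the original COCO config)
depth_multiple: 1.33   # model depth multiple
width_multiple: 1.25   # layer channel multiple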

At this point the custom dataset is ready; the next step is to train the model.

 3. Model training

1. Download the pre-trained model

Download the pre-trained weights matching your chosen model configuration (yolov5x.pt in this article) from the YOLOv5 GitHub releases page and place them under the weights folder.

2. Training

Before starting training, check the following parameters in train.py:

  • epochs: the number of passes over the entire dataset during training; reduce it if your GPU is weak.
  • batch-size: the number of images processed before each weight update (mini-batch gradient descent); reduce it if GPU memory is limited.
  • cfg: the configuration file describing the model structure.
  • data: the file describing the training and test data.
  • img-size: the width and height of the input images; reduce it if GPU memory is insufficient.

Then run the training command (here using two GPUs; pass --device 0 for a single GPU):

python train.py --img 640 --batch 32 --epochs 300 --data data/mydata.yaml --cfg models/yolov5x.yaml --weights weights/yolov5x.pt --device '0,1'

4. Model testing

Evaluating the model means measuring its performance on a labeled test or validation set; the most common evaluation metric in object detection is mAP. Specify the dataset configuration file and the trained model, then test with the following command:

python test.py  --data data/mydata.yaml --weights runs/exp1/weights/best.pt --augment

The test run prints the evaluation results, including precision, recall, and mAP.

 5. Model inference

Finally, run the trained model on unlabeled data. Specify the path to the test images and to the model in detect.py; other parameters (img-size, the object confidence threshold conf-thres, and the IoU threshold for NMS, iou-thres) can be adjusted as needed.

Use the following command, where weights points to the training checkpoint you are most satisfied with and source is a folder containing all the test images.

 python detect.py --weights runs/exp1/weights/best.pt --source inference/images/ --device 0,1
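In the v5.0 code these thresholds can also be passed on the command line instead of editing detect.py; the values below are the repository defaults:

python detect.py --weights runs/exp1/weights/best.pt --source inference/images/ --img-size 640 --conf-thres 0.25 --iou-thres 0.45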

After inference completes, a result image for each test picture is written to the specified inference/output folder.

The dataset I trained on is a face-mask dataset. (Figure: detection results on mask images.)

Origin: blog.csdn.net/qq_40716944/article/details/118188085