Use yolov5 to train your own data set

1. Configuring the environment

Configure the environment required for yolov5 on Windows. For the specific steps, see: https://blog.csdn.net/Wxy971122/article/details/114641375
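
Once the environment is installed, a quick sanity check (a minimal sketch, assuming PyTorch was set up as in the linked guide) confirms that PyTorch is importable and the GPU is visible:

import torch

# minimal sanity check: PyTorch version and CUDA availability
print(torch.__version__)
print(torch.cuda.is_available())   # True if a CUDA GPU is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of GPU 0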

2. Building the data set
  1. This time we mainly collect pictures of car wheels for wheel identification and detection.
    The collection objects are Toyota cars: 483 images in total, covering 32 classes.
    Insert image description here
  2. Create a data folder in the yolov5-master folder and create the following four folders inside it (because I want to test different brands of vehicles, the Toyota wheel-hub pictures go into the data\images\FT folder; see the sketch after this item):
    Insert image description here
    The Annotations folder stores the .xml files generated by labeling; their file names correspond one-to-one to the image names.
    The images folder stores the collected images (.jpg format).
    The ImageSets folder stores the files describing the training/validation/test split.
    The labels folder stores the txt files with the label annotation information, corresponding one-to-one to the images.
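
    If you prefer to create this layout in code, a minimal sketch (assuming yolov5-master as the working directory and the FT subfolder used in this article) is:

import os

# create the four data folders, each with an FT subfolder for the Toyota images
for sub in ('Annotations', 'images', 'ImageSets', 'labels'):
    os.makedirs(os.path.join('data', sub, 'FT'), exist_ok=True)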
  3. Label the data set.
    First install the labelimg tool: enter the yolov5 environment in Anaconda and run pip install labelimg.
    Insert image description here
    After the installation completes, enter labelimg to open the tool. Use "Open Dir" to select the folder to be labeled, then click "Change Save Dir" and choose the save path for the .xml files generated by labeling, here the \data\Annotations\FT folder.
    Insert image description here
    After labeling is complete, an .xml file corresponding to each image name will be generated in the Annotations folder.
    Insert image description here
  4. Open yolov5-master with PyCharm and create a new file "makeTxt.py" in the root directory of yolov5-master. It splits the data set into training, validation, and test sets; by default train, val, and test are split randomly in roughly an 8:1:1 ratio. The code is as follows:
import os
import random


trainval_percent = 0.9   # fraction of images used for train+val (the rest is test)
train_percent = 0.9      # fraction of trainval used for train (the rest is val)
xmlfilepath = 'data/Annotations/FT'
txtsavepath = 'data/ImageSets/FT'
os.makedirs(txtsavepath, exist_ok=True)  # make sure the output folder exists
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open('data/ImageSets/FT/trainval.txt', 'w')
ftest = open('data/ImageSets/FT/test.txt', 'w')
ftrain = open('data/ImageSets/FT/train.txt', 'w')
fval = open('data/ImageSets/FT/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
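
With the 483 images collected here, this split yields int(483 * 0.9) = 434 trainval images, of which int(434 * 0.9) = 390 go to train and the remaining 44 to val, leaving 49 images for test — the counts that appear again in the ft.yaml file below.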

Four text files will be generated, containing the image names of each split:
train.txt: the names of the images used for training
val.txt: the names of the images used for validation
trainval.txt: the union of train and val
test.txt: the names of the images used for testing

The results are as follows:
Insert image description here
5. Create a new file "voc_label.py" that reads the annotation information from the xml files produced during labeling and writes it into txt label files. The code is as follows:

import xml.etree.ElementTree as ET
import os
from os import getcwd


sets = ['train', 'test', 'val']
classes = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18',
         '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32']


# normalize box coordinates to the YOLO format
def convert(size, box):  # size: (image w, image h), box: (xmin, xmax, ymin, ymax)
    dw = 1./size[0]      # 1/w
    dh = 1./size[1]      # 1/h
    x = (box[0] + box[1])/2.0   # x coordinate of the object's center in the image
    y = (box[2] + box[3])/2.0   # y coordinate of the object's center in the image
    w = box[1] - box[0]         # object width in pixels
    h = box[3] - box[2]         # object height in pixels
    x = x*dw    # center x as a fraction of image width (x / image w)
    w = w*dw    # width as a fraction of image width (w / image w)
    y = y*dh    # center y as a fraction of image height (y / image h)
    h = h*dh    # height as a fraction of image height (h / image h)
    return (x, y, w, h)    # center x ratio, center y ratio, width ratio, height ratio, all in [0, 1]


def convert_annotation(image_id):
    '''
    Convert the xml file with the given name into a label file. The xml file contains
    the bounding boxes, the image width/height, and so on; after parsing and
    normalization the information is written to the label file. Each image has exactly
    one xml file, and parsing it produces exactly one label file.
    The label format is: class x y w h. An image may contain several objects, so a
    label file may contain several lines.
    '''
    # open the xml file for the given image_id
    in_file = open('data/Annotations/FT/%s.xml' % (image_id), encoding='utf-8')
    # prepare the label file to write into, one line per object:
    # <object-class> <x> <y> <width> <height>
    out_file = open('data/labels/FT/%s.txt' % (image_id), 'w', encoding='utf-8')
    # parse the xml file
    tree = ET.parse(in_file)
    # get the root element
    root = tree.getroot()
    # get the image size
    size = root.find('size')
    # guard against xml files with empty annotations
    if size is not None:
        # width
        w = int(size.find('width').text)
        # height
        h = int(size.find('height').text)
        # iterate over the annotated objects
        for obj in root.iter('object'):
            # the 'difficult' flag
            difficult = obj.find('difficult').text
            # class name (string)
            cls = obj.find('name').text
            # skip classes not in our predefined list, and objects marked difficult
            if cls not in classes or int(difficult) == 1:
                continue
            # map the class name to its id
            cls_id = classes.index(cls)
            # find the bndbox element
            xmlbox = obj.find('bndbox')
            # box coordinates as (xmin, xmax, ymin, ymax)
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
                 float(xmlbox.find('ymax').text))
            print(image_id, cls, b)
            # normalize: w = image width, h = image height, b = (xmin, xmax, ymin, ymax)
            bb = convert((w, h), b)
            # bb is the normalized (x, y, w, h)
            # write "class x y w h" to the label file
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


# print the current working directory
wd = getcwd()
print(wd)


for image_set in sets:
    '''
    Iterate over every split and do two things:
    1. Write the full path of every image to the split's txt file, so the images are
       easy to locate during training.
    2. Parse every image's xml file and write its bounding boxes and class ids to the
       corresponding label file; the training code can then read the labels directly.
    '''
    # create the labels folder if it does not exist
    if not os.path.exists('data/labels/FT/'):
        os.makedirs('data/labels/FT/')
    # read the image names of this split from ImageSets (train.txt, test.txt, ...)
    image_ids = open('data/ImageSets/FT/%s.txt' % (image_set)).read().strip().split()
    # open the split's list file (e.g. data/train.txt) for writing
    list_file = open('data/%s.txt' % (image_set), 'w')
    # write each image's path, one per line, and convert its annotation
    for image_id in image_ids:
        list_file.write('data/images/FT/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    # close the file
    list_file.close()

The results are as follows:
Insert image description here
Running it produces the test.txt, train.txt, and val.txt files; each contains the paths of the images in that split. For example, train.txt lists the paths of all training-set images:
Insert image description here
After running, the annotation information for every image appears in the labels folder, as shown in the following figure:
Insert image description here
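
To make the normalization in convert() concrete, here is a worked example with made-up numbers: a hypothetical 640x480 image with a box at xmin=100, xmax=300, ymin=50, ymax=250.

# hypothetical example: 640x480 image, box (xmin=100, xmax=300, ymin=50, ymax=250)
x, y, w, h = convert((640, 480), (100, 300, 50, 250))
# the box center is (200, 150) and the box is 200x200 pixels, so:
print(x, y, w, h)   # 0.3125 0.3125 0.3125 0.4166666666666667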
6. Create the "ft.yaml" file in the data directory. Here train, val, and test point to the txt files listing the training, validation, and test images; nc is the number of classes in the data set, and names are the class names. Set these according to your own paths and parameters. The code is as follows:

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: data/train.txt     # 390 images
val: data/val.txt         # 44 images
test: data/test.txt       # 49 images

# number of classes
nc: 32

# class names
names: [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18',
         '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32' ]

Then modify the class-count parameter in "yolov5x.yaml": change nc to 32. (Modify the .yaml file of whichever weights you train with.)
Insert image description here
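
For reference, the top of the modified yolov5x.yaml then looks roughly like this (a sketch based on the stock yolov5x.yaml; only nc changes, the multipliers keep their defaults):

# parameters
nc: 32  # number of classes (changed from the default 80)
depth_multiple: 1.33  # model depth multiple
width_multiple: 1.25  # model width multiple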
7. Modify the parameters in "train.py". Because yolov5x is used for training and it needs a lot of GPU memory, change batch-size to 2.

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    # initial weights to load
    parser.add_argument('--weights', type=str, default='yolov5x.pt', help='initial weights path')
    # model configuration file (network structure); use the modified yolov5x.yaml
    parser.add_argument('--cfg', type=str, default='models/yolov5x.yaml', help='model.yaml path')
    # data set configuration file (paths, class names, etc.); use the data set's .yaml file
    parser.add_argument('--data', type=str, default='data/ft.yaml', help='data.yaml path')
    # hyperparameter file
    parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path')
    # total number of epochs; one epoch uses every sample in the training set once;
    # larger values train longer and usually give a more accurate model
    parser.add_argument('--epochs', type=int, default=300)
    # batch size: the number of samples used per training step
    parser.add_argument('--batch-size', type=int, default=2, help='total batch size for all GPUs')
    # input image resolution; nargs='+' means one or more values may be given
    parser.add_argument('--img-size', nargs='+', type=int, default=[320, 320], help='[train, test] image sizes')
    # rectangular training, default False; can noticeably reduce inference time
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    # resume the most recent interrupted training run
    parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
    # do not save intermediate checkpoints, default False
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    # do not test during training, default False
    parser.add_argument('--notest', action='store_true', help='only test final epoch')
    # disable automatic anchor adjustment, default False
    parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
    # hyperparameter evolution, default False
    parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
    # Google Cloud Storage bucket; rarely needed
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    # cache images in memory to speed up training, default False
    parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
    # use weighted image selection for training
    parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
    # training device: cpu; 0 (one GPU, cuda:0); 0,1,2,3 (multiple GPUs);
    # when empty, the machine's GPU or CPU is used automatically
    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    # multi-scale training, default False
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    # treat the data set as a single class, default False
    parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
    # use the Adam optimizer
    parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
    # synchronize batch norm across GPUs, only used in DDP mode
    parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
    # GPU rank for DDP
    parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
    # number of images logged to W&B, max 100
    parser.add_argument('--log-imgs', type=int, default=16, help='number of images for W&B logging, max 100')
    # log the final trained model, i.e. last.pt
    parser.add_argument('--log-artifacts', action='store_true', help='log artifacts, i.e. final trained model')
    # maximum number of dataloader workers
    parser.add_argument('--workers', type=int, default=4, help='maximum number of dataloader workers')
    # path where training results are saved, default runs/train
    parser.add_argument('--project', default='runs/train', help='save to project/name')
    # name of the results folder, default exp
    parser.add_argument('--name', default='exp', help='save to project/name')
    # if project/name already exists, do not increment the folder name
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--quad', action='store_true', help='quad dataloader')
    opt = parser.parse_args()
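
If you would rather not edit the defaults, the same settings can be passed on the command line instead (shown as an illustration; the flags match the argparse definitions above):

python train.py --weights yolov5x.pt --cfg models/yolov5x.yaml --data data/ft.yaml --epochs 300 --batch-size 2 --img-size 320 320 --device 0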

After modifying the parameters, you can start training. Here are some errors you may encounter:
(1) api_key not configured
Insert image description here
If this error occurs, run wandb login in the yolov5 environment
Insert image description here
and open the given URL to register and obtain an API key.
(2) No module named 'ipywidgets'
Insert image description here
If this error occurs, run pip install ipywidgets in the yolov5 environment
Insert image description here
(3) CUDA out of memory
Insert image description here
This error means there is not enough CUDA memory; make the batch-size smaller.

If none of the above errors appear, the program runs directly:
Insert image description here
This means training has officially started; the process requires some patience.
After training completes, an exp folder containing the training results is generated in \runs\train. The results.png looks like this:
Insert image description here
8. After training is over, you can test the trained model. First open the "test.py" file and modify its parameters:
Insert image description here
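
As an illustration (assuming the default save paths above; best.pt is the best checkpoint written during training), a typical invocation would be:

python test.py --weights runs/train/exp/weights/best.pt --data data/ft.yaml --img-size 320 --batch-size 2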
The results are as follows:
Insert image description here
Then modify the parameters in the "detect.py" file and run detection:
Insert image description here
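
For example (assuming the images to detect sit in data\images\FT and the same checkpoint as above):

python detect.py --weights runs/train/exp/weights/best.pt --source data/images/FT --img-size 320 --conf-thres 0.25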
The results are as follows:
Insert image description here
The results can be viewed in the exp folder generated in \runs\detect. In this test, all 483 photos were correctly identified.
Insert image description here
Insert image description here
At this point, training and testing your own data set with yolov5 is complete.

Origin blog.csdn.net/Wxy971122/article/details/114841452