YOLOv7 step-by-step tutorial (after hitting countless pitfalls myself): train on your own dataset

Table of contents

1. Introduction

2. YOLOv7 code download

3. Environment configuration

4. Test results

5. Make your own dataset

6. Train your own data set


1. Introduction

The previous article explained in detail how to install the environment required for deep learning. This article explains how to set up YOLOv7 on a local computer or a server, and then how to use your own dataset for training, inference, and detection.

2. YOLOv7 code download

YOLOv7 was built by the original YOLOv4 team. It achieves a good balance between accuracy and speed and is currently an excellent object detection model.

Paper address: https://arxiv.org/abs/2207.02696

Code download address: mirrors / WongKinYiu / yolov7 · GitCode

Here I simply download the zip archive and extract it.

3. Environment configuration

On Windows, open the Anaconda terminal; on a remote server, create the environment directly in the shell. From here on, the steps are the same on Windows and on a server.

Enter conda create -n yolov7 python=3.8 to create the environment, where yolov7 is the environment name and python=3.8 is the Python version to use.

After the environment is created, run conda activate yolov7 to enter it. (I named mine yolov7_1; it is only a name and makes no difference.)

Then cd into the yolov7-main folder that you downloaded and extracted earlier.

Next, install the packages listed in requirements.txt. Appending the Tsinghua mirror makes the download faster.

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple 

 

Here I pinned my torch and torchvision versions myself; you can simply follow the official requirements.txt. (The requirements file also states that torch must not be 1.12.0 and torchvision must not be 0.13.0, so pay attention to this.)

Next, there is a very important point that must be emphasized!

(1) If the latest numpy (1.24.1) is installed, you will get the error "module 'numpy' has no attribute 'int'". I spent a long time tracking this down. The cause is the numpy version: the np.int alias was removed in NumPy 1.24. Either downgrade to 1.23, or replace each np.int that raises the error with the built-in int. I recommend replacing the numpy line in requirements.txt directly with numpy==1.23.0, which works without problems.
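If you prefer not to edit requirements.txt by hand, the pin can be scripted. The following is only a minimal sketch: the helper name pin_numpy is my own, and it assumes the numpy requirement sits on its own line in the file.

```python
import re


def pin_numpy(requirements_text: str, version: str = "1.23.0") -> str:
    """Return the requirements text with numpy pinned to an exact version."""
    # Match a line that starts with the package name "numpy" itself,
    # not a different package such as "numpy-quaternion".
    pattern = re.compile(r"^numpy(?![\w.-]).*$", flags=re.MULTILINE)
    pinned = f"numpy=={version}"
    if pattern.search(requirements_text):
        return pattern.sub(pinned, requirements_text)
    # No numpy line found: append the pin at the end.
    return requirements_text.rstrip("\n") + "\n" + pinned + "\n"
```

To apply it, read requirements.txt into a string, pass it through pin_numpy, write the result back, and then run the pip install command again.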

After the installation finishes, run pip list to check that everything installed correctly.

Here you can see that both torch and torchvision are the CPU builds, not the GPU builds. You need to find the install command that matches your CUDA version on the following page and run it.

PyTorch download address: Previous PyTorch Versions | PyTorch

 

For example, my CUDA version is 11.3, so I copy this command:

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113

 

After the installation is complete, you can see the extra +cu113 suffix on the torch and torchvision versions. The basic GPU training environment is now configured.
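Instead of reading the pip list output by eye, the same check can be scripted. This is a rough sketch with helper names of my own (cuda_tag, describe_build). It only looks at the version string, so a build without a +cu suffix is not necessarily a CPU build; torch.cuda.is_available() is the more reliable signal.

```python
import re
from typing import Optional


def cuda_tag(version: str) -> Optional[str]:
    """Return the CUDA tag (e.g. 'cu113') at the end of a version string, or None."""
    match = re.search(r"\+(cu\d+)$", version)
    return match.group(1) if match else None


def describe_build(name: str, version: str) -> str:
    """Summarize whether a package version string carries a CUDA tag."""
    tag = cuda_tag(version)
    kind = f"GPU build ({tag})" if tag else "no CUDA tag (likely a CPU build)"
    return f"{name} {version}: {kind}"


if __name__ == "__main__":
    try:
        import torch
        import torchvision
    except ImportError:
        print("torch / torchvision are not installed in this environment")
    else:
        print(describe_build("torch", torch.__version__))
        print(describe_build("torchvision", torchvision.__version__))
        print("CUDA available:", torch.cuda.is_available())
```

Run it inside the activated yolov7 environment. If CUDA available prints False, the GPU build is not being used.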

4. Test results

Now open detect.py and change the default of --weights to the weights file you downloaded. The weights can be downloaded from the lower part of the YOLOv7 source code page.

 

After running detect.py, find the output image under runs. If detection boxes appear on it, the basic setup of the model is complete.

Now for the second important point.

(2) If your setup is as follows:

Operating system: Windows 10 or Windows 11

GPU: GTX 1650, 1660, or 1660 Ti. Detection boxes appear with the CPU build of torch, but no boxes are detected when running on the GPU.

My own computer has a 1660 Ti. My personal guess is that the GTX 1660 Ti does not meet the requirements for using CUDNN_HALF. For yolov7, add this line in the main function:

torch.backends.cudnn.enabled = False

 5. Make your own dataset

Build the dataset carefully, since the quality of the trained model, and any later improvement, depends on it.

The folder layout is as follows.

Annotations holds the xml annotation files, ImageSets contains a Main folder that you create, and JPEGImages holds the images. Next, the dataset needs to be split and the xml files converted to txt files, because YOLO uses the txt label format.
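The layout can also be created with a few lines of Python. This is a sketch only: the function name make_voc_layout is my own, and the VOCTrainVal and VOCTest folder names are taken from the paths used by the scripts and the yaml file later in this article.

```python
from pathlib import Path

# Sub-folders expected inside each dataset part.
SUBDIRS = ("Annotations", "ImageSets/Main", "JPEGImages")


def make_voc_layout(root, parts=("VOCTrainVal", "VOCTest")):
    """Create <root>/<part>/{Annotations, ImageSets/Main, JPEGImages}."""
    created = []
    for part in parts:
        for sub in SUBDIRS:
            folder = Path(root) / part / sub
            folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
            created.append(folder)
    return created
```

Calling make_voc_layout("VOCData") from the yolov7-main folder creates the empty folders; afterwards copy your xml files into Annotations and your images into JPEGImages.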

Create a split.py file and paste the following code into it. With the default trainval_percent = 0.9, 90% of the files are divided between the training and validation sets and the remaining 10% are written to test.txt. Change the ratios yourself if you need a different split.

import os
import random

xmlfilepath = r'../VOCData/VOCTrainVal/Annotations/'  # path to the xml files
saveBasePath = r'../VOCData/VOCTrainVal/ImageSets/'  # where the generated txt files are saved

trainval_percent = 0.9  # share of the whole dataset used for train + val (the rest goes to test)
train_percent = 0.9  # share of train + val used for training (the rest goes to val)
total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)  # not named `list`, which would shadow the built-in
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

print("train and val size", tv)
print("train size", tr)
ftrainval = open(os.path.join(saveBasePath, 'Main/trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'Main/test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'Main/train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'Main/val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the '.xml' extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

The script prints the split sizes. After the split, four txt files are generated.
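To see what the two percentages mean in practice, here is the same arithmetic on its own. It is a small sketch; split_sizes is a helper name of my own and simply mirrors the tv and tr calculation in split.py.

```python
def split_sizes(num, trainval_percent=0.9, train_percent=0.9):
    """Return how many files end up in train, val and test."""
    tv = int(num * trainval_percent)  # files in train + val
    tr = int(tv * train_percent)      # files in train
    return {"train": tr, "val": tv - tr, "test": num - tv}


print(split_sizes(100))
```

With 100 annotation files and the default ratios, this gives 81 training files, 9 validation files and 10 test files. Setting trainval_percent to 1.0 leaves the test list empty.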

Next, convert the xml files to txt files.

For this step, create a voc_label.py file:

import xml.etree.ElementTree as ET
import os
from os import getcwd
import shutil

# (folder suffix, split name): VOC<suffix>/ImageSets/Main/<split>.txt lists the image ids
sets=[('TrainVal', 'train'), ('TrainVal', 'val'), ('Test', 'test')]

# replace with your own class names
classes =["mask_weared_incorrect","with_mask","without_mask"]

def convert(size, box):
    # size = (width, height) in pixels; box = (xmin, xmax, ymin, ymax) in pixels
    # returns (x_center, y_center, width, height), normalized by the image size
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)

def convert_annotation(year, image_set, image_id):
    in_path = 'VOC%s/Annotations/%s.xml'%(year, image_id)
    out_path = 'VOC%s/labels/%s_%s/%s.txt'%(year, year, image_set, image_id)
    # pass the path to ET.parse so the xml parser handles the file encoding itself
    # instead of relying on the platform default (gbk on Chinese Windows)
    tree=ET.parse(in_path)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    # the with block closes the label file even if an annotation is malformed
    with open(out_path, 'w', encoding='utf-8') as out_file:
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # skip classes we do not train on and objects marked as difficult
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            bb = convert((w,h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

def copy_images(year,image_set, image_id):
    # copy the image next to its label so each split has its own images folder
    in_file = 'VOC%s/JPEGImages/%s.jpg'%(year, image_id)
    out_file = 'VOC%s/images/%s_%s/%s.jpg'%(year, year, image_set, image_id)
    shutil.copy(in_file, out_file)

wd = getcwd()

for year, image_set in sets:
    if not os.path.exists('VOC%s/labels/%s_%s'%(year,year, image_set)):
        os.makedirs('VOC%s/labels/%s_%s'%(year,year, image_set))
    if not os.path.exists('VOC%s/images/%s_%s'%(year,year, image_set)):
        os.makedirs('VOC%s/images/%s_%s'%(year,year, image_set))
    image_ids = open('VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
    list_file = open('VOC%s/%s_%s.txt'%(year, year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOC%s/images/%s_%s/%s.jpg\n'%(wd, year, year, image_set, image_id))
        convert_annotation(year, image_set, image_id)
        copy_images(year, image_set, image_id)

    list_file.close()



After the conversion, the three folders become five (images and labels are added), and the txt files listing the training and validation images are generated.
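To make the conversion concrete, here is the box arithmetic from voc_label.py on a single example, together with its inverse as a sanity check. This is a standalone sketch; the function names voc_to_yolo and yolo_to_voc are my own.

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Pixel corner box -> normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_to_voc(img_w, img_h, x_center, y_center, width, height):
    """Normalized YOLO box -> pixel corners (xmin, ymin, xmax, ymax)."""
    xmin = (x_center - width / 2.0) * img_w
    xmax = (x_center + width / 2.0) * img_w
    ymin = (y_center - height / 2.0) * img_h
    ymax = (y_center + height / 2.0) * img_h
    return xmin, ymin, xmax, ymax


# A 200 x 200 pixel box with corners (100, 50) and (300, 250) in a 640 x 480 image.
print(voc_to_yolo(640, 480, 100, 50, 300, 250))
```

For this box the centre is at (200, 150) pixels, so the label line holds roughly 0.3125 0.3125 0.3125 0.4167 after the class id. If a converted label looks wrong, converting it back with yolo_to_voc and comparing against the xml is a quick way to check it.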

The next step is to create a yaml file for your own dataset. I named mine myvoc.yaml.

# locations of the three txt files generated above
train: ./VOCData/VOCTrainVal/TrainVal_train.txt
val: ./VOCData/VOCTrainVal/TrainVal_val.txt
test: ./VOCData/VOCTest/Test_test.txt
# number of classes
nc: 3  # change to your own number of classes
# class names
names: ["first_label", "second_label", "third_label"]  # your own class names, in the order of ids 0, 1, 2

Set nc to the number of classes you have. At this point the preparation is done, and the next step is training.
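A mismatch between nc and the number of names is easy to introduce when editing the file by hand. One way to avoid it is to generate the yaml text from the class list, as in this sketch. The helper make_data_yaml is my own; it writes the names as a JSON-style list, which YAML parsers accept as a flow sequence.

```python
import json


def make_data_yaml(train, val, test, names):
    """Build the dataset yaml text; nc is derived from names so they cannot disagree."""
    lines = [
        f"train: {train}",
        f"val: {val}",
        f"test: {test}",
        f"nc: {len(names)}",
        f"names: {json.dumps(names)}",
    ]
    return "\n".join(lines) + "\n"


text = make_data_yaml(
    "./VOCData/VOCTrainVal/TrainVal_train.txt",
    "./VOCData/VOCTrainVal/TrainVal_val.txt",
    "./VOCData/VOCTest/Test_test.txt",
    ["mask_weared_incorrect", "with_mask", "without_mask"],
)
print(text)
```

Write the returned text to VOCData/myvoc.yaml. The order of names must be the same as the classes list in voc_label.py, because the class ids in the txt labels are indices into that list.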

6. Train your own data set

--weights specifies the initial weights. You can use the default weights or the official training weights yolov7_training.pt, or train without pretrained weights.

--data specifies the dataset. Point it to the yaml file of the dataset we just made; either a relative or an absolute path works.

--batch-size is the batch size. Adjust it to what your machine can handle; it is generally an even number between 2 and 16.

--resume continues an interrupted run. If training stops because of a power failure or some other unavoidable cause, change its default to True to continue from the last run.

Then start training. The procedure is the same on Windows and on a server.

Enter the following command:

python train.py --weights yolov7.pt --cfg ./cfg/training/yolov7.yaml --data VOCData/myvoc.yaml --device 0 --batch-size 2 --epochs 300

Training then starts. The results are saved under runs/train/exp, and once training finishes you can see a series of result files there.
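If you launch training from a script rather than typing the command, building the argument list in Python avoids quoting and line-break mistakes. This is only a sketch: build_train_cmd is a helper name of my own, the defaults are the values from the command above, and the epoch flag is spelled --epochs, which is how I believe train.py declares it (adjust it if your copy differs).

```python
import subprocess
import sys


def build_train_cmd(weights="yolov7.pt", cfg="./cfg/training/yolov7.yaml",
                    data="VOCData/myvoc.yaml", device="0",
                    batch_size=2, epochs=300):
    """Return the train.py command as a list of arguments."""
    return [
        sys.executable, "train.py",  # use the Python of the active environment
        "--weights", weights,
        "--cfg", cfg,
        "--data", data,
        "--device", str(device),
        "--batch-size", str(batch_size),
        "--epochs", str(epochs),
    ]


def run_training(**kwargs):
    """Run train.py from the current folder and raise if it exits with an error."""
    return subprocess.run(build_train_cmd(**kwargs), check=True)
```

Call run_training(batch_size=4) from inside the yolov7-main folder with the yolov7 environment activated.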

If you run into any problems while reproducing this or during training, you can send me a private message and I will reply as soon as I see it. Writing is not easy, so please give this a like, and let's learn and improve together.

Updated February 16

Many readers sent private messages saying they get the decoding error UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 34: illegal multibyte sequence.

First, check whether the xml files contain a Chinese path. If they do, this error occurs.

Second, if you are training on Windows and the error persists after that check, I recommend dropping the xml dataset and splitting the txt files directly. YOLO trains on txt labels anyway (the xml files are only converted to txt), and the readers I helped got the txt version working on Windows. If you only have xml files, I recommend training on Linux. v7 has many small issues: with xml files I can train successfully on Ubuntu, but Windows still reports errors, so on Windows I recommend a dataset in txt format.
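Checking the xml files one by one is tedious, so the first step can be scripted. This is a rough sketch with a function name of my own; it assumes the annotations follow the usual VOC layout, where the folder, filename and path elements sit directly under the root element.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_non_ascii_fields(xml_dir, tags=("folder", "filename", "path")):
    """List (file name, tag, text) for every checked field that contains non-ASCII text."""
    problems = []
    for xml_file in sorted(Path(xml_dir).glob("*.xml")):
        # Give ElementTree the path so the parser handles the file encoding itself.
        root = ET.parse(str(xml_file)).getroot()
        for tag in tags:
            node = root.find(tag)
            text = node.text if node is not None and node.text else ""
            if not text.isascii():
                problems.append((xml_file.name, tag, text))
    return problems
```

Run find_non_ascii_fields("VOCData/VOCTrainVal/Annotations") and fix or re-export every file it reports before converting again.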

Updated March 31

Because of many readers' private messages and comments, I am updating this article again. I was busy for a while, so the txt version comes only now.

Use the txt version in the following situations:

1. You are on Windows and training keeps reporting errors after converting the xml files to txt files. In that case, use this version directly.

2. Your dataset only has images and txt labels. In that case, build the txt version of the dataset directly, starting with the steps below.

First, create a new folder named datasets under the yolov7-main folder, then create images and labels folders under datasets. images stores the pictures and labels stores the annotations in txt format.

The data.yaml file is the dataset source used by train.py. Its settings are as follows.

 

The dataset is divided into train (training set), val (validation set), and test (test set). nc is the number of classes and must match the number of entries in names.

After preparing these folders, create a split_txt.py file next to train.py in the yolov7-main folder to split the dataset.

# Split the images and labels into training, validation and test sets by ratio
# Works directly on the txt label files and jpg image files
#### Important!!! The paths must not contain Chinese characters, otherwise the files will not be found
import shutil
import random
import os

# original locations of the images and labels (replace with your own paths)
image_original_path = r"image_path/JPEGImages/"
label_original_path = r"label_path/labels/"

cur_path = os.getcwd()

# training set paths
train_image_path = os.path.join(cur_path, "datasets/images/train/")
train_label_path = os.path.join(cur_path, "datasets/labels/train/")
print("----------")
# validation set paths
val_image_path = os.path.join(cur_path, "datasets/images/val/")
val_label_path = os.path.join(cur_path, "datasets/labels/val/")
print("----------")
# test set paths
test_image_path = os.path.join(cur_path, "datasets/images/test/")
test_label_path = os.path.join(cur_path, "datasets/labels/test/")
print("----------")
# txt files that list the images of each split
list_train = os.path.join(cur_path, "datasets/train.txt")
list_val = os.path.join(cur_path, "datasets/val.txt")
list_test = os.path.join(cur_path, "datasets/test.txt")
print("----------")
train_percent = 0.8
val_percent = 0.1
test_percent = 0.1
print("----------")

def del_file(path):
    # remove every file in `path`; os.path.join keeps this working on both Windows and Linux
    for i in os.listdir(path):
        file_data = os.path.join(path, i)
        os.remove(file_data)


def mkdir():
    if not os.path.exists(train_image_path):
        os.makedirs(train_image_path)
    else:
        del_file(train_image_path)
    if not os.path.exists(train_label_path):
        os.makedirs(train_label_path)
    else:
        del_file(train_label_path)

    if not os.path.exists(val_image_path):
        os.makedirs(val_image_path)
    else:
        del_file(val_image_path)
    if not os.path.exists(val_label_path):
        os.makedirs(val_label_path)
    else:
        del_file(val_label_path)

    if not os.path.exists(test_image_path):
        os.makedirs(test_image_path)
    else:
        del_file(test_image_path)
    if not os.path.exists(test_label_path):
        os.makedirs(test_label_path)
    else:
        del_file(test_label_path)


def clearfile():
    if os.path.exists(list_train):
        os.remove(list_train)
    if os.path.exists(list_val):
        os.remove(list_val)
    if os.path.exists(list_test):
        os.remove(list_test)


def main():
    mkdir()
    clearfile()

    file_train = open(list_train, 'w')
    file_val = open(list_val, 'w')
    file_test = open(list_test, 'w')

    total_txt = os.listdir(label_original_path)
    num_txt = len(total_txt)
    list_all_txt = range(num_txt)

    num_train = int(num_txt * train_percent)
    num_val = int(num_txt * val_percent)
    num_test = num_txt - num_train - num_val

    train = random.sample(list_all_txt, num_train)
    # train holds num_train indices sampled from list_all_txt;
    # the indices that were not picked make up val_test
    val_test = [i for i in list_all_txt if i not in train]
    # sample num_val indices from val_test as val; whatever is left in val_test is test
    val = random.sample(val_test, num_val)

    print("train size: {}, val size: {}, test size: {}".format(len(train), len(val), len(val_test) - len(val)))
    for i in list_all_txt:
        name = total_txt[i][:-4]

        srcImage = image_original_path + name + '.jpg'
        srcLabel = label_original_path + name + ".txt"

        if i in train:
            dst_train_Image = train_image_path + name + '.jpg'
            dst_train_Label = train_label_path + name + '.txt'
            shutil.copyfile(srcImage, dst_train_Image)
            shutil.copyfile(srcLabel, dst_train_Label)
            file_train.write(dst_train_Image + '\n')
        elif i in val:
            dst_val_Image = val_image_path + name + '.jpg'
            dst_val_Label = val_label_path + name + '.txt'
            shutil.copyfile(srcImage, dst_val_Image)
            shutil.copyfile(srcLabel, dst_val_Label)
            file_val.write(dst_val_Image + '\n')
        else:
            dst_test_Image = test_image_path + name + '.jpg'
            dst_test_Label = test_label_path + name + '.txt'
            shutil.copyfile(srcImage, dst_test_Image)
            shutil.copyfile(srcLabel, dst_test_Label)
            file_test.write(dst_test_Image + '\n')

    file_train.close()
    file_val.close()
    file_test.close()


if __name__ == "__main__":
    main()

image_original_path and label_original_path are the locations of your original images and labels. The dataset is split 8:1:1. After you run the script, it prints the sizes of the training, validation, and test sets.

The datasets folder now also contains several new txt files.
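Before training, it is worth confirming that every copied image has a label file with the same name, and the other way round. The following is a small sketch; find_unpaired is a helper name of my own, and it assumes the images are .jpg files as in split_txt.py.

```python
from pathlib import Path


def find_unpaired(images_dir, labels_dir, image_ext=".jpg"):
    """Return (images without a label, labels without an image) as sorted lists of stems."""
    image_stems = {p.stem for p in Path(images_dir).glob(f"*{image_ext}")}
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

For example, find_unpaired("datasets/images/train", "datasets/labels/train") should return two empty lists; anything it lists needs to be fixed in the original folders before splitting again.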

The txt version of the dataset is now complete. In train.py, replace the dataset source in the --data argument.

Point it to the data.yaml file created earlier under datasets, then run train.py to start training.

If you run into any problems while reproducing this or during training, you can send me a private message and I will reply as soon as I see it. Writing is not easy, so please give me a like and a follow, and let's learn and improve together.

Origin blog.csdn.net/weixin_55749226/article/details/128480595