1. Introduction
The previous article explained in detail how to install the environment required for deep learning. This article explains how to configure YOLOv7 on a local computer or server, and then how to train it on your own dataset and run inference and detection.
2. YOLOv7 code download
YOLOv7 was built by the original YOLOv4 team. It strikes a good balance between accuracy and speed and is currently an excellent object detection model.
Paper address: https://arxiv.org/abs/2207.02696
Code download address: mirrors / WongKinYiu / yolov7 · GitCode
Here I simply downloaded the zip archive and extracted it.
3. Environment configuration
On Windows, open the Anaconda terminal. On a remote server, just create the environment directly in the shell. From here on, the steps are the same on Windows and on a server.
Enter conda create -n yolov7 python=3.8 to create the environment, where yolov7 is the environment name and python=3.8 is the Python version to use.
After the environment has been created, run conda activate yolov7 to enter it (I named mine yolov7_1; it is only a name and makes no difference).
Then cd into the yolov7-main folder that you downloaded and extracted earlier.
Next, install the packages listed in requirements.txt. Appending the Tsinghua mirror makes the download faster.
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
I pinned my own torch and torchvision versions directly; you can also follow the official ones. (requirements.txt also states that the torch version must not be 1.12.0 and the torchvision version must not be 0.13.0, so be sure to pay attention.)
Next, there is a very important point that must be emphasized!
(1) If the latest numpy (1.24.1) is installed, the error module 'numpy' has no attribute 'int' will occur. I spent a long time tracking this down. The cause is the numpy version: the np.int alias was removed in 1.24. Either switch to version 1.23, or replace each np.int that raises the error with the built-in int. I recommend replacing the numpy line in requirements.txt directly with numpy==1.23.0, which works without problems.
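To make the numpy point concrete: np.int was only an alias for the built-in int, and it disappeared in numpy 1.24. The helper below is an illustrative sketch, not part of YOLOv7 (the function name is made up for this example); it tells from a version string whether the alias is still available.

```python
def has_np_int_alias(numpy_version: str) -> bool:
    """Return True if this numpy version still provides the np.int alias.

    Hypothetical helper for illustration: the alias was removed in 1.24,
    so every earlier release still has it.
    """
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) < (1, 24)


for version in ("1.23.0", "1.24.1"):
    print(version, has_np_int_alias(version))
```

In a real environment you would pass numpy.__version__ to the helper; if it returns False, either pin numpy==1.23.0 or replace np.int with int in the code that fails.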
Once the installation finishes, run pip list to check that everything was installed correctly.
In the list you can see that both torch and torchvision are the CPU versions, not the GPU versions. You need to find the torch install command that matches your CUDA version on the following page and run it.
PyTorch download address: Previous PyTorch Versions | PyTorch
For example, my CUDA version is 11.3, so I copied this command:
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
After the installation is complete, pip list shows +cu113 appended to the version numbers, and the basic configuration of the GPU training environment is done.
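Besides reading pip list, you can ask torch itself whether it was built with CUDA and whether it can see the GPU. This is a minimal sketch; the function name is hypothetical, and the import is guarded so the snippet also runs in an environment where torch is missing.

```python
def describe_torch_build() -> str:
    """Summarize whether the installed torch build can use the GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    cuda_version = torch.version.cuda  # None for CPU-only builds
    if cuda_version is None:
        return f"torch {torch.__version__}: CPU-only build"
    if not torch.cuda.is_available():
        return (f"torch {torch.__version__}: built for CUDA {cuda_version}, "
                "but no usable GPU was found")
    return (f"torch {torch.__version__}: CUDA {cuda_version}, "
            f"GPU 0 is {torch.cuda.get_device_name(0)}")


print(describe_torch_build())
```

If the output reports a CPU-only build after you installed the +cu113 wheels, the most likely cause is that the command was run in a different conda environment.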
4. Test results
Now open detect.py and change the default of --weights to the weights file you downloaded; the download link is further down the YOLOv7 source code page.
After running detect.py, look for the output image under runs. If detection boxes appear on it, the basic setup of the model is complete.
Now for the second important point.
(2) If your project setup is as follows:
Operating system: Windows 10 or Windows 11
GPU: GTX 1650, 1660, or 1660 Ti. Detection boxes appear when running with the CPU build of torch, but no boxes are detected on the GPU.
As shown in the picture above, my own computer has a 1660 Ti. My personal guess is that the GTX 1660 Ti does not meet the requirements for using CUDNN_HALF.
For YOLOv7, add the following line in the main function:
torch.backends.cudnn.enabled = False
5. Make your own dataset
Take care when building the dataset; after all, how well the model trains, and any later improvements, depend on it.
The folder layout is as follows.
Annotations holds the xml files of the dataset, a Main folder is created inside ImageSets, and JPEGImages holds the images. Next, the xml files need to be split and then converted to txt files, because YOLO uses the txt format.
Create a split.py file and paste the following code into it. The script is mainly meant to produce the training and validation sets; with the ratios below, the remaining 10% of the files are written to test.txt. Change it yourself if you need a different split.
import os
import random

xmlfilepath = r'../VOCData/VOCTrainVal/Annotations/'  # path to the xml files
saveBasePath = r'../VOCData/VOCTrainVal/ImageSets/'  # where the generated txt files are saved
trainval_percent = 0.9  # share of the whole dataset used for train+val (the rest goes to test)
train_percent = 0.9  # share of train+val used for training (the rest goes to val)

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)  # renamed from "list" so the built-in is not shadowed
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)
print("train and val size", tv)
print("train size", tr)

ftrainval = open(os.path.join(saveBasePath, 'Main/trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'Main/test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'Main/train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'Main/val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the ".xml" extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
Running the script prints the split sizes. After the dataset has been split, four txt files are generated.
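Because split.py draws a fresh random sample on every run, the four txt files change each time. If you want a split you can reproduce later, one option is to seed the random generator. The sketch below is a variant written for illustration (the function name and the seed parameter are my own, and it shuffles once instead of sampling twice), using the same two ratios.

```python
import random


def split_ids(ids, trainval_percent=0.9, train_percent=0.9, seed=0):
    """Split ids into train/val/test lists with the same two-stage ratios."""
    rng = random.Random(seed)  # fixed seed: the same split on every run
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_trainval = int(len(shuffled) * trainval_percent)
    n_train = int(n_trainval * train_percent)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_trainval]
    test = shuffled[n_trainval:]
    return train, val, test


train, val, test = split_ids([f"img_{i:03d}" for i in range(100)])
print(len(train), len(val), len(test))  # 81 9 10
```

With 100 files, 90 go to train+val, of which 81 are used for training and 9 for validation, and the remaining 10 form the test list.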
Then convert the xml files to txt files.
For this, create a voc_label.py file:
import xml.etree.ElementTree as ET
import os
from os import getcwd
import shutil

sets = [('TrainVal', 'train'), ('TrainVal', 'val'), ('Test', 'test')]
classes = ["mask_weared_incorrect", "with_mask", "without_mask"]


def convert(size, box):
    # size = (width, height); box = (xmin, xmax, ymin, ymax) in pixels
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)  # normalized center x, center y, width, height


def convert_annotation(year, image_set, image_id):
    # Passing the path lets the XML parser handle the file encoding itself
    tree = ET.parse('VOC%s/Annotations/%s.xml' % (year, image_id))
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    out_path = 'VOC%s/labels/%s_%s/%s.txt' % (year, year, image_set, image_id)
    with open(out_path, 'w', encoding='utf-8') as out_file:
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # skip unknown classes and objects marked as difficult
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                 float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


def copy_images(year, image_set, image_id):
    in_file = 'VOC%s/JPEGImages/%s.jpg' % (year, image_id)
    out_file = 'VOC%s/images/%s_%s/%s.jpg' % (year, year, image_set, image_id)
    shutil.copy(in_file, out_file)


wd = getcwd()
for year, image_set in sets:
    # create the output folders if they do not exist yet
    os.makedirs('VOC%s/labels/%s_%s' % (year, year, image_set), exist_ok=True)
    os.makedirs('VOC%s/images/%s_%s' % (year, year, image_set), exist_ok=True)
    with open('VOC%s/ImageSets/Main/%s.txt' % (year, image_set)) as f:
        image_ids = f.read().strip().split()
    list_file = open('VOC%s/%s_%s.txt' % (year, year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOC%s/images/%s_%s/%s.jpg\n' % (wd, year, year, image_set, image_id))
        convert_annotation(year, image_set, image_id)
        copy_images(year, image_set, image_id)
    list_file.close()
After the conversion, the three folders become five (images and labels are added), and txt files listing the training and validation images are generated as well.
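To see what the conversion does to a single box, here is a small worked example. It is a standalone sketch of the same arithmetic as convert in voc_label.py; the function name and the argument order (xmin, ymin, xmax, ymax) are chosen for readability and differ from the script.

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert pixel corner coordinates to normalized (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# A 200x240 pixel box inside a 640x480 image
print(voc_to_yolo(640, 480, xmin=100, ymin=120, xmax=300, ymax=360))
# (0.3125, 0.5, 0.3125, 0.5)
```

Each line of a YOLO txt label is the class id followed by these four values, so this box for class 1 would be written as 1 0.3125 0.5 0.3125 0.5.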
The next step is to create a yaml file for your own dataset. I named mine myvoc.yaml.
# locations of the three txt files generated above
train: ./VOCData/VOCTrainVal/TrainVal_train.txt
val: ./VOCData/VOCTrainVal/TrainVal_val.txt
test: ./VOCData/VOCTest/Test_test.txt
# number of classes
nc: 3 # change to your own number of classes
# class names
names: ["first_label", "second_label", "third_label"] # your own class names, with ids 0, 1, 2
Set nc to however many label classes you have. At this point the preparation is done, and the next step is training.
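A mismatch between nc and names is easy to introduce when editing the yaml by hand. Note also that voc_label.py writes each class id as the index of the label in its classes list, so names should list the classes in that same order. The check below is a small sketch with a hypothetical function name; it assumes the yaml has already been loaded into a dict, for example with PyYAML's yaml.safe_load.

```python
def check_dataset_config(cfg: dict) -> None:
    """Raise ValueError if nc and names in a parsed dataset yaml disagree."""
    nc = cfg["nc"]
    names = cfg["names"]
    if nc != len(names):
        raise ValueError(f"nc is {nc} but names lists {len(names)} classes")
    if len(set(names)) != len(names):
        raise ValueError("names contains duplicate class labels")


# Example with the three classes used in voc_label.py
check_dataset_config({
    "nc": 3,
    "names": ["mask_weared_incorrect", "with_mask", "without_mask"],
})
print("dataset config looks consistent")
```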
6. Train your own data set
--weights specifies the initial weights. You can use the default weights or the official training weights yolov7_training.pt, or train without pretrained weights.
--data specifies the dataset. Point it at the dataset yaml file we just made, using either a relative or an absolute path.
--batch-size is the batch size. Adjust it to what your machine can handle; it is usually an even number between 2 and 16.
--resume continues a previous training run. If training was interrupted by a power failure or some other unexpected event, change the default here to True to continue from where the last run stopped.
Next, just start training. The procedure is the same on Windows and on a server.
Enter the following command:
python train.py --weights yolov7.pt --cfg ./cfg/training/yolov7.yaml --data VOCData/myvoc.yaml --device 0 --batch-size 2 --epochs 300
Training then starts. The results are saved under runs/train/exp, and once training is over you can look through the result data there.
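After several training runs it is easy to lose track of which results folder is the newest. The helper below is only a convenience sketch with a hypothetical name; it makes no assumption about how the run folders are named and simply returns the most recently modified subfolder of runs/train.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(root: str = "runs/train") -> Optional[Path]:
    """Return the most recently modified run directory under root, if any."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    run_dirs = [p for p in root_path.iterdir() if p.is_dir()]
    if not run_dirs:
        return None
    return max(run_dirs, key=lambda p: p.stat().st_mtime)


print("latest run:", latest_run_dir())
```

Run it from the yolov7-main folder; it prints None if no training run has been saved yet.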
If you run into any problems while reproducing or training, you can send me a private message and I will reply as soon as I see it. Writing is not easy, so please give it a like, and let's learn and improve together.
February 16 update
Many readers have sent private messages saying they hit this decoding error: UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 34: illegal multibyte sequence.
The first step is to check whether the xml files contain a Chinese path. If they do, this problem occurs, because on Chinese-language Windows Python opens text files with the GBK codec by default and cannot decode bytes written in another encoding such as UTF-8.
The second step: if you are training on Windows and the problem persists after this check, I recommend dropping the xml dataset and splitting the txt files directly. YOLO trains on txt files converted from the xml anyway, and the txt route has been debugged successfully on Windows for other readers. If you only have xml files, I recommend training on Linux. YOLOv7 has many minor problems: I personally ran the xml workflow successfully on Ubuntu, but it also reported errors on Windows, so on Windows I recommend datasets in txt format.
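Checking hundreds of xml files by hand for Chinese characters is tedious, so the first step can be automated. The function below is a sketch with a hypothetical name: it reads each xml file as raw bytes, so no decoding error can occur, and reports the files that contain any non-ASCII byte.

```python
import os


def find_non_ascii_xml(xml_dir: str) -> list:
    """Return the names of .xml files in xml_dir that contain non-ASCII bytes."""
    flagged = []
    for name in sorted(os.listdir(xml_dir)):
        if not name.lower().endswith(".xml"):
            continue
        # Read as bytes so that no text decoding is involved
        with open(os.path.join(xml_dir, name), "rb") as f:
            data = f.read()
        if not data.isascii():
            flagged.append(name)
    return flagged
```

For example, find_non_ascii_xml("VOCTrainVal/Annotations") returns the annotation files worth inspecting; an empty list means the files contain only ASCII characters.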
March 31 update
In response to many readers' private messages and comments, I am updating this article again. I was busy for a while, so the txt version comes only now.
The txt version is the one to choose in the following situations:
1. You are on Windows and training often reports errors after the xml files have been converted to txt files. In that case, use this version directly.
2. Your dataset consists only of images and txt labels. In that case, build the txt version of the dataset directly, starting with the steps below.
First, create a new folder named datasets inside the yolov7-main folder, then create images and labels folders inside datasets. images stores the pictures and labels stores the annotations in txt format.
The data.yaml file is the dataset source used by train.py, and its settings are as follows.
The dataset is divided into a training set (train), a validation set (val), and a test set (test); nc is the number of classes and must match the number of entries in names.
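As a rough idea of what such a data.yaml can contain, the sketch below builds the text from a list of class names. The function name is hypothetical, and the layout is an assumption: it points train, val, and test at list files named train.txt, val.txt, and test.txt under datasets, so adjust the paths to your own setup.

```python
def render_data_yaml(names, root="datasets"):
    """Build the text of a dataset yaml from class names and a dataset root."""
    quoted = ", ".join(f'"{name}"' for name in names)
    lines = [
        f"train: {root}/train.txt",  # assumed location of the training list
        f"val: {root}/val.txt",      # assumed location of the validation list
        f"test: {root}/test.txt",    # assumed location of the test list
        f"nc: {len(names)}",         # always equals the number of names
        f"names: [{quoted}]",
    ]
    return "\n".join(lines) + "\n"


print(render_data_yaml(["mask_weared_incorrect", "with_mask", "without_mask"]))
```

Deriving nc from the length of names keeps the two values in agreement automatically.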
After preparing these folders, create a split_txt.py file in the yolov7-main folder, at the same level as train.py, to split the dataset.
# Split the images and label files into training, validation, and test sets by ratio.
# Works directly on txt label files and jpg image files.
# Important: the paths must not contain Chinese characters, otherwise a file-not-found error is raised.
import shutil
import random
import os

# Original paths (replace with your own image and label folders)
image_original_path = r"path_to_images/JPEGImages/"
label_original_path = r"path_to_labels/labels/"
cur_path = os.getcwd()

# Training set paths
train_image_path = os.path.join(cur_path, "datasets/images/train/")
train_label_path = os.path.join(cur_path, "datasets/labels/train/")
# Validation set paths
val_image_path = os.path.join(cur_path, "datasets/images/val/")
val_label_path = os.path.join(cur_path, "datasets/labels/val/")
# Test set paths
test_image_path = os.path.join(cur_path, "datasets/images/test/")
test_label_path = os.path.join(cur_path, "datasets/labels/test/")

# List files, one per subset
list_train = os.path.join(cur_path, "datasets/train.txt")
list_val = os.path.join(cur_path, "datasets/val.txt")
list_test = os.path.join(cur_path, "datasets/test.txt")

train_percent = 0.8
val_percent = 0.1
test_percent = 0.1  # informational: the test set receives whatever remains


def del_file(path):
    # Remove every file in path (os.path.join works on both Windows and Linux)
    for i in os.listdir(path):
        os.remove(os.path.join(path, i))


def mkdir():
    # Create each output folder, or empty it if it already exists
    for path in (train_image_path, train_label_path,
                 val_image_path, val_label_path,
                 test_image_path, test_label_path):
        if not os.path.exists(path):
            os.makedirs(path)
        else:
            del_file(path)


def clearfile():
    # Delete list files left over from a previous run
    for path in (list_train, list_val, list_test):
        if os.path.exists(path):
            os.remove(path)


def main():
    mkdir()
    clearfile()

    file_train = open(list_train, 'w')
    file_val = open(list_val, 'w')
    file_test = open(list_test, 'w')

    total_txt = os.listdir(label_original_path)
    num_txt = len(total_txt)
    list_all_txt = range(num_txt)

    num_train = int(num_txt * train_percent)
    num_val = int(num_txt * val_percent)

    train = random.sample(list_all_txt, num_train)
    # train takes num_train indices; val_test holds the indices that are left
    val_test = [i for i in list_all_txt if i not in train]
    # val takes num_val indices from val_test; the rest of val_test is the test set
    val = random.sample(val_test, num_val)

    print("Training set: {}, validation set: {}, test set: {}".format(len(train), len(val), len(val_test) - len(val)))
    for i in list_all_txt:
        name = total_txt[i][:-4]  # strip the ".txt" extension

        srcImage = image_original_path + name + '.jpg'
        srcLabel = label_original_path + name + ".txt"

        if i in train:
            dst_train_Image = train_image_path + name + '.jpg'
            dst_train_Label = train_label_path + name + '.txt'
            shutil.copyfile(srcImage, dst_train_Image)
            shutil.copyfile(srcLabel, dst_train_Label)
            file_train.write(dst_train_Image + '\n')
        elif i in val:
            dst_val_Image = val_image_path + name + '.jpg'
            dst_val_Label = val_label_path + name + '.txt'
            shutil.copyfile(srcImage, dst_val_Image)
            shutil.copyfile(srcLabel, dst_val_Label)
            file_val.write(dst_val_Image + '\n')
        else:
            dst_test_Image = test_image_path + name + '.jpg'
            dst_test_Label = test_label_path + name + '.txt'
            shutil.copyfile(srcImage, dst_test_Image)
            shutil.copyfile(srcLabel, dst_test_Label)
            file_test.write(dst_test_Image + '\n')

    file_train.close()
    file_val.close()
    file_test.close()


if __name__ == "__main__":
    main()
Fill in the original image path and label path at the top of the script. The original dataset is split 8:1:1. After you run the script, it prints the sizes of the training, validation, and test sets. The results are shown in the figure.
At this point several new txt files appear in the datasets folder.
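split_txt.py derives each image name from a label file name and assumes a .jpg extension, so a label without a matching image makes shutil.copyfile fail with FileNotFoundError. Before running the split, a quick check like the one below can list such labels. It is a sketch with a hypothetical function name, and the .jpg default is the same assumption the script makes.

```python
import os


def unmatched_labels(image_dir, label_dir, image_ext=".jpg"):
    """Return label stems that have no image with the same name in image_dir."""
    image_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.lower().endswith(image_ext)
    }
    label_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(label_dir)
        if name.lower().endswith(".txt")
    }
    return sorted(label_stems - image_stems)
```

Call it with the same two folders you set as image_original_path and label_original_path. Any stem it returns, for example a leftover classes.txt from an annotation tool, should be moved out of the labels folder before splitting.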
The txt version of the dataset is now complete. All that remains is to change the dataset source in the --data entry of train.py.
Replace it with the data.yaml file created earlier under datasets and click Run to start training.