VOC test set and training set
###############################################
Students, please don't just follow along and run the files in this series step by step: this is a record of the pitfalls I ran into, not a tutorial. I recorded the whole process so that future students can avoid these pitfalls. I hope you will read through the entire series first, think it over, and only then start operating.
###############################################
We continue in the virtual machine from the previous section, after the data set has been modified again.
Training requires both a training set and a test set, so the data has to be divided with code and the resulting lists placed in the ImageSets/Main folder. The ratio between the train+validation set and the test set, and the ratio between the training set and the validation set, depend on your own data. The script that does the division (set_txt.py) is given further down in this section.
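To make the two ratios concrete, here is a minimal sketch of how they turn into sample counts. The function name `split_counts` is hypothetical, and it assumes the usual layout in which test is whatever lies outside trainval and val is whatever trainval keeps outside train; the `int(...)` truncation mirrors the arithmetic in set_txt.py.

```python
def split_counts(num, trainval_percent, train_percent):
    """Return the sizes of the trainval, train, val and test splits.

    Assumption: test = samples outside trainval, val = trainval minus train.
    """
    tv = int(num * trainval_percent)  # size of train + val
    tr = int(tv * train_percent)      # size of train
    return {"trainval": tv, "train": tr, "val": tv - tr, "test": num - tv}


# Hypothetical data set of 100 annotated images, with the ratios used in this section
print(split_counts(100, 1, 0.5))
```

Note that with trainval_percent = 1 nothing is left over for a separate test split under this layout, which is worth keeping in mind when choosing the ratios.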
Move the data set to [/home/heying/darknet/scripts/VOCdevkit]
and name it VOC2021.
Next, fix the paths stored in the xml files. Labeling and training were done on two different hosts, so the paths written during labeling are wrong on the training host.
Edit the rename_xml.py file:
######## This file rewrites the paths stored in the .xml files ########
import os
from xml.etree.ElementTree import parse

# Directory containing the .xml annotation files
path = "/home/xiong/VOC/VOC_MAX/Annotations/"
# Get the names of all entries in that directory
files = os.listdir(path)

# Walk through the directory
for xmlFile in files:
    # The key step: join directory and file name to get the full path
    newStr = os.path.join(path, xmlFile)
    # Only open the entry if it is not a directory
    # (the check must use the full path, not the bare file name)
    if not os.path.isdir(newStr):
        print(xmlFile)
        # Parse the .xml file and get its root element (annotation)
        dom = parse(newStr)
        root = dom.getroot()
        # File name without its extension
        part = os.path.splitext(xmlFile)[0]
        # File name plus the image extension
        part1 = part + '.jpg'
        # New value for the <path> element
        newStr1 = '/home/xiong/VOC/VOC_MAX/JPEGImages/' + part1
        # Find the <path> child node and set its text
        root.find('path').text = newStr1
        # Write the modified tree back to the same file
        dom.write(newStr, xml_declaration=True)
        print('Modified')
# Original source: https://blog.csdn.net/weixin_45392405/article/details/106679679
Run the script.
You can see that the path modification was successful.
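If you would rather check the result with code than by opening the files by hand, something like the following sketch could be used. The helper name `read_image_path` is hypothetical; it only assumes that each annotation file has the `path` element that rename_xml.py rewrites.

```python
import xml.etree.ElementTree as ET


def read_image_path(xml_file):
    """Return the text of the <path> element, or None if the element is missing."""
    root = ET.parse(xml_file).getroot()
    node = root.find("path")
    return None if node is None else node.text
```

Calling `read_image_path(...)` on a few files in the Annotations folder should return paths that point into the JPEGImages folder on the training host.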
Then find and open the set_txt.py file,
which converts the VOC data set into txt list files.
Pay attention to the paths in the code and make sure they match your own file paths.
##################### Generates the .txt files the data set needs #######################
import os
import random

trainval_percent = 1   # fraction of all samples that goes into trainval
train_percent = 0.5    # fraction of trainval that is drawn into `train`
xmlfilepath = '/home/xiong/VOC/VOC_MAX/Annotations'  # folder with the annotation files
txtsavepath = '/home/xiong/VOC/VOC_MAX/ImageSets/Main'  # folder for the test, val, trainval, ... lists
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

# Train+validation list
ftrainval = open('/home/xiong/VOC/VOC_MAX/ImageSets/Main/trainval.txt', 'w')
# Test list: images drawn at random for testing
ftest = open('/home/xiong/VOC/VOC_MAX/ImageSets/Main/test.txt', 'w')
# Training list, used for the actual training
ftrain = open('/home/xiong/VOC/VOC_MAX/ImageSets/Main/train.txt', 'w')
# Validation list, used to check accuracy after training
fval = open('/home/xiong/VOC/VOC_MAX/ImageSets/Main/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # file name without the .xml extension
    if i in trainval:
        # As written, every trainval sample also goes into train.txt
        ftrainval.write(name)
        ftrain.write(name)
        # Samples not drawn into `train` are also written to val.txt and test.txt
        if i not in train:
            fval.write(name)
            ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
Save and exit after modification.
Use python3 to run the set_txt.py file to generate test.txt, train.txt, trainval.txt and val.txt:
python3 set_txt.py
You can then check the generated files in the ImageSets/Main folder.
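A quick way to look at the result is to count the entries in each list. This is only a sketch: `count_entries` is a hypothetical helper, and it assumes the four files sit in one folder under the names produced by set_txt.py.

```python
import os


def count_entries(main_dir, names=("trainval", "train", "val", "test")):
    """Count the non-empty lines in each <name>.txt file under main_dir."""
    counts = {}
    for name in names:
        txt = os.path.join(main_dir, name + ".txt")
        with open(txt) as f:
            counts[name] = sum(1 for line in f if line.strip())
    return counts
```

For example, `count_entries('/home/xiong/VOC/VOC_MAX/ImageSets/Main')` returns a dict with one count per list, which makes it easy to see whether the split came out the way you intended.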
Preparation before training
Open the voc_label.py file in the same directory as VOCdevkit and
modify the relevant configuration in the file according to your own situation:
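# The year here is the one used in the folder name (VOC2021)
sets=[('2021', 'train'),('2021', 'test'),('2021', 'val'),('2021', 'trainval')]
# All the labels in use
classes = ["red", "green", "null"]
When modifying the labels used here, pay attention to the order of the labels.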
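The reason the order matters can be sketched as follows, assuming voc_label.py looks each label up with `classes.index(...)` as the stock darknet script does. The helper `class_id` is hypothetical and only illustrates the lookup.

```python
classes = ["red", "green", "null"]  # same order as in voc_label.py


def class_id(name, classes=classes):
    """Return the zero-based position of a label, which becomes its class id."""
    return classes.index(name)


print(class_id("green"))                           # 1 with the order above
print(class_id("green", ["green", "red", "null"])) # 0 if the list is reordered
```

So if the list is reordered after the label files have been generated, the ids in those files no longer mean the same classes.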
When done, run voc_label.py with python3:
python3 voc_label.py
In that path you can now see the list files generated by the program (trainval, val and so on).
Each file contains the absolute paths of the corresponding image files and is used as a reference during training.
Create a voc2021.names file in the darknet/data directory (the file name does not matter, but the suffix must be .names). Its content is the class names, one per line:
touch voc2021.names
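Instead of typing the names by hand after `touch`, the file can also be written from the same list that voc_label.py uses, so the two cannot drift apart. This is a sketch only; `write_names` is a hypothetical helper and the path in the usage note is the one from this section.

```python
def write_names(path, classes):
    """Write one class name per line, in the given order."""
    with open(path, "w") as f:
        for name in classes:
            f.write(name + "\n")
```

For example, `write_names("data/voc2021.names", ["red", "green", "null"])` run from the darknet directory produces a three-line file in the same order as the classes list.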
Then find and modify the cfg/voc.data file, as shown below after modification:
classes= 3
train = /home/heying/darknet/scripts/2021_train.txt
valid = /home/heying/darknet/scripts/2021_test.txt
names = data/voc2021.names
backup = /home/heying/darknet/backup/
Among them:
[classes= 3] is the number of labels, which is 3 in this walkthrough
[train = /home/heying/darknet/scripts/2021_train.txt] is the absolute path of the newly generated 2021_train.txt
[valid = /home/heying/darknet/scripts/2021_test.txt] is the absolute path of the newly generated 2021_test.txt
[names = data/voc2021.names] is the file with the label names that was just created
[backup = /home/heying/darknet/backup/] is the path where the weights are saved during training
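Since a mismatch between classes and the names file is an easy pit to fall into, a small check like the one below may help. It is a sketch under the assumption that the .data file is plain `key = value` lines as shown above; both function names are hypothetical.

```python
def parse_data_cfg(path):
    """Parse a darknet-style 'key = value' file into a dict of strings."""
    options = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments and anything without '='
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options


def classes_match_names(options, names_path):
    """Return True if 'classes' equals the number of non-empty lines in names_path."""
    with open(names_path) as f:
        n_names = sum(1 for line in f if line.strip())
    return int(options["classes"]) == n_names
```

Run from the darknet directory, `classes_match_names(parse_data_cfg("cfg/voc.data"), "data/voc2021.names")` should return True for the configuration shown above.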
Save and exit after completion, then modify the cfg/yolov3-voc.cfg file.
yolov3-voc.cfg explanation
The main changes are to filters and classes. Search for yolo: there are three places that need to be modified, and in each of them classes should be set to the number of labels (3 here).
Also pay attention to the [max_batches] parameter, which is the total number of training iterations; here I use 5000. Save and exit when done.
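For the filters value, the usual YOLOv3 rule is that the convolutional layer right before each [yolo] layer needs (classes + 5) * 3 filters, where 3 is the number of anchors per scale. A minimal sketch of that arithmetic (the function name `yolo_filters` is hypothetical):

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the conv layer directly before each [yolo] layer in YOLOv3."""
    # Each anchor predicts 4 box coordinates + 1 objectness score + one score per class
    return (num_classes + 5) * anchors_per_scale


print(yolo_filters(3))   # 24 for the three labels used here
print(yolo_filters(20))  # 75, the value in the stock 20-class VOC config
```

So with classes = 3, each of the three places should end up with filters=24 in the convolutional block just above the [yolo] block.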