YOLOv3 training (5): darknet VOC test set, training set, and pre-training preparation

VOC test set and training set



###############################################
Readers, please don't simply follow along and repeat every operation in this series, because this is a record of the pitfalls I ran into, not a tutorial. I recorded the whole process so that others can avoid these pitfalls later. I hope you will read through the entire series first and only start working after thinking it over.
###############################################

This section continues in the virtual machine from the previous section, after the data set has been modified again.

Training requires both a test set and a training set, so the data has to be divided with code and the resulting lists placed in the ImageSets/Main folder. The split ratio between the training-validation set and the test set, and between the training set and the validation set, depends on your own data. The script used for the split (set_txt.py) appears later in this section.

Move the data set to [/home/heying/darknet/scripts/VOCdevkit] and name it VOC2021.

Then fix the paths in the xml files again. Labeling and training were done on two different hosts, so the paths written during labeling are wrong on the training host.

Edit the rename_xml.py file

######## This file rewrites the path stored in each .xml file ##############

import os
from xml.etree.ElementTree import parse

# Directory holding the .xml annotation files
path = "/home/xiong/VOC/VOC_MAX/Annotations/"
# Directory holding the images that the annotations refer to
image_dir = "/home/xiong/VOC/VOC_MAX/JPEGImages/"

# Walk through every entry in the annotation directory
for xmlFile in os.listdir(path):
    # Full path of the entry (the directory check needs the full path, not the bare name)
    newStr = os.path.join(path, xmlFile)
    # Skip sub-directories and anything that is not an .xml file
    if os.path.isdir(newStr) or not xmlFile.endswith('.xml'):
        continue
    print(xmlFile)
    # Parse the file and get its root element (<annotation>)
    dom = parse(newStr)
    root = dom.getroot()
    # File name without its extension
    part = os.path.splitext(xmlFile)[0]
    # Image file name: base name + extension
    part1 = part + '.jpg'
    # New value for the <path> element
    newStr1 = image_dir + part1
    # Find the <path> child node and set its text
    root.find('path').text = newStr1
    print('Modified')
    dom.write(newStr, xml_declaration=True)


# Original link: https://blog.csdn.net/weixin_45392405/article/details/106679679

Run the script.

You can see that the path modification was successful
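
If you would rather not check the xml files by eye, the result can also be verified with a few lines of Python. This is only a rough sketch and not part of the original scripts: the function name find_bad_paths is my own, and it assumes every annotation has a `path` element that should point at an existing image file.

```python
import os
import xml.etree.ElementTree as ET


def find_bad_paths(annotations_dir):
    """Return (xml name, path text) pairs whose <path> is missing or not an existing file."""
    bad = []
    for name in sorted(os.listdir(annotations_dir)):
        if not name.endswith(".xml"):
            continue  # ignore anything that is not an annotation file
        root = ET.parse(os.path.join(annotations_dir, name)).getroot()
        node = root.find("path")
        text = node.text if node is not None else None
        if not text or not os.path.isfile(text):
            bad.append((name, text))
    return bad
```

Calling find_bad_paths("/home/xiong/VOC/VOC_MAX/Annotations/") should return an empty list when every path has been rewritten correctly.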

Then find and open the set_txt.py file, which converts the VOC data set into txt list files.
Pay attention to the paths in the code and make sure they match your own file paths.


##################### Generates the .txt files that the data set needs #######################

import os
import random

trainval_percent = 1
train_percent = 0.5
xmlfilepath = '/home/xiong/VOC/VOC_MAX/Annotations'  # folder with the annotation files
txtsavepath = '/home/xiong/VOC/VOC_MAX/ImageSets/Main'  # folder for the test, validation and train-validation lists
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)  # renamed from `list` to avoid shadowing the built-in
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

# Training-validation list
ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
# Test list: images picked at random for testing
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
# Training list, used for the actual training
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
# Validation list, used to check accuracy after training
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        ftrainval.write(name)
        # Note: every trainval entry is also written to train.txt
        ftrain.write(name)
        # Entries not sampled into `train` go to both val.txt and test.txt
        if i not in train:
            fval.write(name)
            ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

Save and exit after modification

Use python3 to run the set_txt.py file to generate test.txt, train.txt, trainval.txt, val.txt.

python3 set_txt.py

You can then check the generated files.
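
Note that set_txt.py, as written, puts every trainval entry into train.txt and puts the same entries into both val.txt and test.txt, so the lists overlap. If you want to see this for your own data, something like the following may help. It is only a sketch: the helper names read_ids, load_splits and overlap are hypothetical, and it assumes the four list files sit in one folder with one image ID per line.

```python
import os


def read_ids(path):
    """Read one image ID per line, ignoring blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def load_splits(main_dir, names=("trainval", "train", "val", "test")):
    """Return {split name: set of IDs} for the list files found in main_dir."""
    return {n: set(read_ids(os.path.join(main_dir, n + ".txt"))) for n in names}


def overlap(splits, a, b):
    """IDs that appear in both split a and split b."""
    return splits[a] & splits[b]
```

For example, after splits = load_splits("/home/xiong/VOC/VOC_MAX/ImageSets/Main"), the expression overlap(splits, "train", "test") lists the test images that the model has also been trained on.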

Preparation before training

Open the voc_label.py file in the same directory as VOCdevkit and modify the relevant configuration in the file to match your own situation

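# The year here is taken from the name of the data set folder (VOC2021)
sets=[('2021', 'train'),('2021', 'test'),('2021', 'val'),('2021', 'trainval')]

# All of the labels in use
classes = ["red", "green", "null"]

When modifying the labels, pay attention to their order.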
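
The order matters because darknet label files store each class as a number, namely its position in the classes list, followed by the box as a normalized center and size. The sketch below illustrates the idea. It is not the exact code from voc_label.py (whose own conversion may differ in small details such as a pixel offset), and the function names are my own.

```python
classes = ["red", "green", "null"]


def class_index(name, class_list=classes):
    """The class ID written to the label file is the position in the list."""
    return class_list.index(name)


def voc_box_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert a VOC pixel box to (x_center, y_center, width, height), each in 0..1."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / float(img_w)
    height = (ymax - ymin) / float(img_h)
    return x_center, y_center, width, height
```

With this list "red" becomes 0, "green" becomes 1 and "null" becomes 2, so reordering the list after the labels have been generated would silently change what every number means.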

When done, run voc_label.py with python3:

python3 voc_label.py


You can see the files generated by the program in the path, including the trainval and val lists.

Each file contains the absolute paths of the corresponding image files, which are used as a reference during training.


Create a voc2021.names file in the darknet/data directory (the file name does not matter, but the suffix must be .names). Its content is the class names, one per line.

touch voc2021.names
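
Instead of typing the names by hand, the file can also be written from the same classes list that voc_label.py uses, which keeps the order consistent. A minimal sketch, with write_names being a hypothetical helper of my own:

```python
def write_names(path, class_list):
    """Write one class name per line, in the same order as the list."""
    with open(path, "w") as f:
        for name in class_list:
            f.write(name + "\n")
```

For this data set the call would be write_names("data/voc2021.names", ["red", "green", "null"]), run from the darknet directory.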



Then find and modify the cfg/voc.data file, as shown below after modification:

classes= 3
train  = /home/heying/darknet/scripts/2021_train.txt
valid  = /home/heying/darknet/scripts/2021_test.txt
names = data/voc2021.names
backup = /home/heying/darknet/backup/


Among them:
[classes= 3] is the number of labels, which is 3 in this case
[train = /home/heying/darknet/scripts/2021_train.txt] is the absolute path of the newly generated 2021_train.txt
[valid = /home/heying/darknet/scripts/2021_test.txt] is the absolute path of the newly generated 2021_test.txt
[names = data/voc2021.names] is the label name file that was just created
[backup = /home/heying/darknet/backup/] is the path where weights are saved during training
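
A mismatch between classes and the number of lines in the names file is easy to miss. The following sketch reads the key = value lines of a .data file and compares the two. It is my own illustration, not darknet's parser, and the function names are hypothetical.

```python
def parse_data_file(text):
    """Parse 'key = value' lines into a dict, skipping blank and comment lines."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def classes_match(options, names):
    """True if the classes entry equals the number of class names."""
    return int(options["classes"]) == len(names)
```

Feeding it the contents of cfg/voc.data together with the lines of data/voc2021.names should give True for the setup shown above.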

Save and exit after completion


Then modify the cfg/yolov3-voc.cfg file
(see: yolov3-voc.cfg explanation)

The main changes are to filters and classes. Search for yolo: there are three places that need to be modified. Also pay attention to the [max_batches] parameter, which is the total number of training iterations; here I use 5000. Save and exit when done.
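
For reference, in YOLOv3 configs the filters value of the convolutional layer directly before each [yolo] layer follows from the number of classes: each of the 3 anchors at a scale predicts 4 box values, 1 objectness score and one score per class. The small sketch below only does that arithmetic, and the function name is my own.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """filters = (classes + 4 box values + 1 objectness) * anchors per scale."""
    return (num_classes + 5) * anchors_per_scale
```

For the 3 classes used here this gives 24, and for the 20 classes of the standard VOC config it gives 75, so at each of the three places classes becomes 3 and the filters line just above it becomes 24.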



Origin blog.csdn.net/Xiong2840/article/details/127936760