Creating a VOC-format dataset: how to generate train.txt, test.txt, trainval.txt and val.txt in the ImageSets/Segmentation folder

When building a VOC-format dataset, once labeling is finished, the file names have to be split into a training set, a validation set, and a test set according to a custom ratio. The split is done as described below.

Use the code below to perform the split. Before running it, set the path of the folder that holds the label files (segfilepath) and the folder where the split lists are saved (saveBasePath), and choose the ratios of the training, validation, and test sets (trainval_percent and train_percent).
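To see how the two ratios turn into file counts, here is a small sketch of the arithmetic the script performs. The function name split_counts is hypothetical and only used for illustration; the formulas mirror the int() truncation in the script.

```python
def split_counts(num, trainval_percent, train_percent):
    """Return (train, val, test) sizes for num label files."""
    tv = int(num * trainval_percent)  # files used for train + val
    tr = int(tv * train_percent)      # files used for train
    return tr, tv - tr, num - tv


# 100 labels, no test set, 80% of train + val used for training
print(split_counts(100, 1, 0.8))    # (80, 20, 0)
# 100 labels, half of them held out as the test set
print(split_counts(100, 0.5, 0.8))  # (40, 10, 50)
```

Note that trainval_percent is the share kept for training and validation, so the test set receives whatever is left over.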

See the code below:

import os
import random

random.seed(0)

segfilepath = r'D:\Code\Python\MMsegmentation\Week3_mmseg\mmsegmentation-0.14.0\tools\data\rs128\SegmentationClass'         # folder containing the label files
saveBasePath = r"D:\Code\Python\MMsegmentation\Week3_mmseg\mmsegmentation-0.14.0\tools\data\rs128\ImageSets\Segmentation/"  # folder where the shuffled file-name lists are saved
# ----------------------------------------------------------------------#
#   To add a test set, set trainval_percent below 1
#   Change train_percent to adjust the share of the validation set
# ----------------------------------------------------------------------#
trainval_percent = 1  # share used for train + val; e.g. 0.9 leaves 0.1 for the test set, 1 means no test set
train_percent = 0.8  # share of train + val used for training; the rest is validation

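# Collect the .png label files; os.listdir order is arbitrary, so sort
# to make the seeded split reproducible
total_seg = sorted(f for f in os.listdir(segfilepath) if f.endswith(".png"))

num = len(total_seg)
indices = range(num)  # renamed from "list" to avoid shadowing the built-in
tv = int(num * trainval_percent)  # size of train + val
tr = int(tv * train_percent)      # size of train
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)
# Sets make the membership tests in the loop below fast
trainval_set, train_set = set(trainval), set(train)

print("train and val size", tv)
print("train size", tr)

# Create the output folder if it does not exist yet
os.makedirs(saveBasePath, exist_ok=True)

# The with statement closes all four files, even if an error occurs
with open(os.path.join(saveBasePath, 'trainval.txt'), 'w') as ftrainval, \
     open(os.path.join(saveBasePath, 'test.txt'), 'w') as ftest, \
     open(os.path.join(saveBasePath, 'train.txt'), 'w') as ftrain, \
     open(os.path.join(saveBasePath, 'val.txt'), 'w') as fval:
    for i in indices:
        # Write the file name without its .png extension
        name = os.path.splitext(total_seg[i])[0] + '\n'
        if i in trainval_set:
            ftrainval.write(name)
            if i in train_set:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)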

After running the code, four .txt files are generated in the save folder. Each file lists one label file name per line, without the .png extension. With trainval_percent = 1, test.txt is empty.
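One way to sanity-check the result is to read the four lists back and confirm that they do not overlap and that train.txt plus val.txt equals trainval.txt. This is only a sketch and not part of the original script: the helper names read_names and check_splits are hypothetical, and split_dir is assumed to be the same folder as saveBasePath.

```python
import os


def read_names(path):
    """Read one file name per line, ignoring blank lines."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}


def check_splits(split_dir):
    """Check the four lists for consistency and return their sizes."""
    names = {s: read_names(os.path.join(split_dir, s + ".txt"))
             for s in ("train", "val", "trainval", "test")}
    # train and val must be disjoint and together form trainval
    assert not names["train"] & names["val"], "train and val overlap"
    assert names["train"] | names["val"] == names["trainval"], \
        "train + val differs from trainval"
    # test must not share any name with trainval
    assert not names["trainval"] & names["test"], "trainval and test overlap"
    return {s: len(n) for s, n in names.items()}
```

Calling check_splits(saveBasePath) after the script has run returns a dictionary with the number of names in each list, or raises an AssertionError if the lists are inconsistent.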
That covers the whole process of generating train.txt, test.txt, trainval.txt and val.txt in the ImageSets/Segmentation folder of a VOC-format dataset. Thank you!

Origin blog.csdn.net/qq_40280673/article/details/132143751