When building a VOC-format dataset, once labeling is finished, the file names need to be split into a training set, a validation set, and a test set according to a custom ratio. The detailed procedure is as follows:
Use the code below to perform the split. Before running it, set the path of the directory that contains the label files (segfilepath) and the directory where the split lists will be saved (saveBasePath), then adjust trainval_percent and train_percent to set the ratio of the training, validation, and test sets:
import os
import random
random.seed(0)
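segfilepath = r'D:\Code\Python\MMsegmentation\Week3_mmseg\mmsegmentation-0.14.0\tools\data\rs128\SegmentationClass' # directory containing the label files
saveBasePath = r"D:\Code\Python\MMsegmentation\Week3_mmseg\mmsegmentation-0.14.0\tools\data\rs128\ImageSets\Segmentation/" # directory where the shuffled file name lists are saved
# ----------------------------------------------------------------------#
# To add a test set, lower trainval_percent
# Change train_percent to adjust the proportion of the validation set
# ----------------------------------------------------------------------#
# Fraction of all samples used for train + val; the remaining
# (1 - trainval_percent) becomes the test set, e.g. 0.9 leaves 10% for testing.
# Use 1 if no test set is wanted.
trainval_percent = 1
train_percent = 0.8 # fraction of train + val used for training; the rest is validation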
# Collect the label file names (only .png masks are considered)
temp_seg = os.listdir(segfilepath)
total_seg = []
for seg in temp_seg:
    if seg.endswith(".png"):
        total_seg.append(seg)

num = len(total_seg)
indices = range(num)              # renamed from `list` to avoid shadowing the built-in
tv = int(num * trainval_percent)  # number of train + val samples
tr = int(tv * train_percent)      # number of train samples
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

print("train and val size", tv)
print("train size", tr)

ftrainval = open(os.path.join(saveBasePath, 'trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'val.txt'), 'w')

for i in indices:
    name = total_seg[i][:-4] + '\n'  # strip the ".png" extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
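To see how the two ratios interact, the size arithmetic can be isolated into a small function. This is only a minimal sketch and not part of the script itself: the name split_counts and the sample count of 100 are assumptions chosen for illustration.

```python
def split_counts(num, trainval_percent, train_percent):
    """Return (train, val, test) sizes using the same integer arithmetic as the script."""
    tv = int(num * trainval_percent)  # train + val
    tr = int(tv * train_percent)      # train only
    return tr, tv - tr, num - tv


# Hypothetical example: 100 label files with 10% held out for testing
print(split_counts(100, 0.9, 0.8))  # (72, 18, 10)

# The ratios used in the script: no test set at all
print(split_counts(100, 1, 0.8))    # (80, 20, 0)
```

Note that train_percent applies to the train + val portion, not to the whole dataset, so 0.8 yields 72 training samples rather than 80 once a test set is held out.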
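After running the code, four .txt files (train.txt, val.txt, trainval.txt, and test.txt) are generated in the save path. Each one lists one file name per line without the .png extension; with trainval_percent = 1, test.txt is empty.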
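One way to sanity-check the generated files is to read the four lists back and confirm that they partition the data as intended. The helper below is a sketch under assumptions, not part of the original code: read_names and check_split are hypothetical names, and save_dir is expected to be the directory that holds the four .txt files.

```python
import os


def read_names(path):
    """Read one file name per line, ignoring blank lines."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}


def check_split(save_dir):
    """Check that train and val partition trainval, and that test is disjoint from it."""
    names = {
        split: read_names(os.path.join(save_dir, split + ".txt"))
        for split in ("train", "val", "trainval", "test")
    }
    assert names["train"] | names["val"] == names["trainval"]
    assert not names["train"] & names["val"]
    assert not names["trainval"] & names["test"]
    # Return the number of entries in each list
    return {split: len(entries) for split, entries in names.items()}
```

Calling check_split(saveBasePath) after the script has run returns the number of entries in each list and raises an AssertionError if the lists overlap or are inconsistent.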
That completes the generation of train.txt, val.txt, trainval.txt, and test.txt under ImageSets->Segmentation for a VOC-format dataset. Thank you!