Quickly make your own VOC semantic segmentation dataset

Semantic Segmentation Dataset Production and Conversion Method

Series table of contents:
Chapter 1 Semantic Segmentation and Labeling Based on PS
Chapter 2 Construction of VOC Semantic Segmentation Dataset



Foreword

Introduction to PASCAL VOC2012 Dataset

PASCAL VOC2012 is one of the most important benchmark datasets for semantic segmentation, and many excellent segmentation models report results on it. Therefore, when using other people's open-source code, organizing your own data into the same format as the official dataset lets you quickly verify a model's performance and saves a lot of work.
The official VOC2012 dataset contains five top-level folders.
Semantic segmentation only needs the following three:
1. ImageSets/Segmentation: the txt files that list the names of the training and validation images;
2. JPEGImages: the folder that stores the original images;
3. SegmentationClass: the folder that stores the annotation (mask) images.
Therefore, when making your own dataset, you only need to keep the contents of these three folders consistent with the official layout, and the dataset is done.


1. Build the folder structure

Following the folder structure above, create a VOC2012 folder; inside it create the ImageSets, JPEGImages, and SegmentationClass folders, and create a Segmentation folder inside ImageSets.
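The layout can also be created with a few lines of Python. The snippet below is a minimal sketch, assuming it is run from the directory that should contain VOC2012:

import os

# Create the VOC-style folder skeleton used in this article
for sub in ("ImageSets/Segmentation", "JPEGImages", "SegmentationClass"):
    os.makedirs(os.path.join("VOC2012", sub), exist_ok=True)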

2. Move the images

Move the original images of your own dataset into the JPEGImages folder, and move the annotated mask images into the SegmentationClass folder.
For the PS (Photoshop) labeling workflow, see: [PS is a real scientific research tool, helping to quickly segment and label work]
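If you prefer to do this step in code, the sketch below copies the files with shutil. The source folders my_images and my_masks are hypothetical placeholders for wherever your own data currently lives:

import glob
import os
import shutil

# Hypothetical source folders -- replace with the paths to your own data
src_images = "my_images"   # original .jpg images
src_masks = "my_masks"     # annotated .png masks

for jpg in glob.glob(os.path.join(src_images, "*.jpg")):
    shutil.copy(jpg, os.path.join("VOC2012", "JPEGImages"))
for png in glob.glob(os.path.join(src_masks, "*.png")):
    shutil.copy(png, os.path.join("VOC2012", "SegmentationClass"))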

3. Generate the TXT files

Here is a piece of code that automatically generates train.txt, val.txt, and trainval.txt, checks that each original image and its mask have the same width and height, checks that each mask is a two-dimensional (single-channel) image, and computes the per-channel mean and standard deviation of the dataset as well as the foreground/background pixel ratio:

import os
import os.path as osp
import random

import cv2 as cv
import mmcv
import numpy as np
from PIL import Image, ImageOps


"""1. Check image dimensions"""
def get_Image_dim_len(png_dir: str, jpg_dir: str):
    # If Image.open fails here, the jpg image has no corresponding png mask
    png = Image.open(png_dir)
    png_w, png_h = png.width, png.height
    png_dim_len = len(np.array(png).shape)
    assert png_dim_len == 2, "Warning: mask %s is not a single-channel (2-D) image" % png_dir
    # Apply the EXIF orientation so the saved jpg matches its displayed size
    jpg = Image.open(jpg_dir)
    jpg = ImageOps.exif_transpose(jpg)
    jpg.save(jpg_dir)
    jpg_w, jpg_h = jpg.width, jpg.height
    print(jpg_w, jpg_h, png_w, png_h)
    assert png_w == jpg_w and png_h == jpg_h, \
        "Warning: mask %s does not match the width/height of its original image" % png_dir


"""2. Read the mean and standard deviation of a single image"""
def pixel_operation(image_path: str):
    img = cv.imread(image_path, cv.IMREAD_COLOR)
    means, dev = cv.meanStdDev(img)  # per-channel (BGR) mean and std, shape (3, 1)
    return means, dev


"""3. Split the dataset and generate the list files"""
# Dataset root (parent of the annotation folder)
data_root = 'VOC2012'
# Original image folder
image_dir = "JPEGImages"
# Annotation (mask) folder
ann_dir = "SegmentationClass"
# Folder where the txt files are saved
split_dir = 'ImageSets/Segmentation'
mmcv.mkdir_or_exist(osp.join(data_root, split_dir))

png_filename_list = [osp.splitext(filename)[0] for filename in mmcv.scandir(
    osp.join(data_root, ann_dir), suffix='.png')]
jpg_filename_list = [osp.splitext(filename)[0] for filename in mmcv.scandir(
    osp.join(data_root, image_dir), suffix='.jpg')]
assert len(jpg_filename_list) == len(png_filename_list), \
    "Warning: the numbers of original images and mask images differ"
print("Image/mask count check passed")

# Shuffle several times before splitting into train/val
for i in range(10):
    random.shuffle(jpg_filename_list)

red_num = 0    # foreground (non-zero) pixel count
black_num = 0  # background (zero) pixel count
with open(osp.join(data_root, split_dir, 'trainval.txt'), 'w+') as f:
    length = len(jpg_filename_list)
    for line in jpg_filename_list[:length]:
        pngpath = osp.join(data_root, ann_dir, line + '.png')
        jpgpath = osp.join(data_root, image_dir, line + '.jpg')
        get_Image_dim_len(pngpath, jpgpath)
        img = cv.imread(pngpath, cv.IMREAD_GRAYSCALE)
        red_num += img.size - len(img[img == 0])
        black_num += len(img[img == 0])
        f.writelines(line + '\n')
    value = red_num / black_num  # foreground / background pixel ratio

train_mean, train_dev = np.zeros((3, 1)), np.zeros((3, 1))
with open(osp.join(data_root, split_dir, 'train.txt'), 'w+') as f:
    train_length = int(len(jpg_filename_list) * 7 / 10)  # 70% train / 30% val split
    for line in jpg_filename_list[:train_length]:
        jpgpath = osp.join(data_root, image_dir, line + '.jpg')
        mean, dev = pixel_operation(jpgpath)
        train_mean += mean
        train_dev += dev
        f.writelines(line + '\n')
with open(osp.join(data_root, split_dir, 'val.txt'), 'w+') as f:
    for line in jpg_filename_list[train_length:]:
        jpgpath = osp.join(data_root, image_dir, line + '.jpg')
        mean, dev = pixel_operation(jpgpath)
        train_mean += mean
        train_dev += dev
        f.writelines(line + '\n')
# Average over all (train + val) images
train_mean, train_dev = train_mean / length, train_dev / length

with open('均值方差像素比.txt', 'a+') as doc:
    doc.write("Mean:" + '\n')
    for item in train_mean:
        doc.write(str(item[0]) + '\n')
    doc.write("Std:" + '\n')
    for item in train_dev:
        doc.write(str(item[0]) + '\n')
    doc.write("Pixel ratio (foreground/background):" + '\n')
    doc.write(str(value))

After running the code above, a 均值方差像素比.txt file (recording the dataset mean, standard deviation, and foreground/background pixel ratio) is written to the working directory, and the three split files train.txt, val.txt, and trainval.txt are generated under ImageSets/Segmentation.
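The mean and standard deviation written to the txt file are in the 0-255 range and in OpenCV's BGR channel order. Below is a minimal sketch of how they might be plugged into a torchvision normalization transform; the numeric values are placeholders, not results from this article:

from torchvision import transforms

# Placeholder values -- replace with the mean/std printed by the script,
# reversed from BGR to RGB order and divided by 255
mean_rgb = [0.485, 0.456, 0.406]
std_rgb = [0.229, 0.224, 0.225]

preprocess = transforms.Compose([
    transforms.ToTensor(),                            # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=mean_rgb, std=std_rgb), # per-channel normalization
])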

Summary


This article describes how to build your own VOC-format semantic segmentation dataset. Comments and discussion are welcome!
Past review:
(1) Interpretation of the CBAM paper + PyTorch implementation of CBAM-ResNeXt
(2) Interpretation of the SENet paper and code examples
(3) Understanding of the ShuffleNet-V1 paper and code reproduction
(4) Understanding of the ShuffleNet-V2 paper and code reproduction
(5) Understanding of the GhostNet paper and code reproduction
(6) PS is a real scientific research tool, helping to quickly segment and label work
Next issue:
Using the VOC dataset – how to run any semantic segmentation algorithm in the mmsegmentation library on your own dataset


Origin blog.csdn.net/qq_44840741/article/details/127744681