自制DeepLabV3Plus框架训练语义分割的VOC格式数据集

DeepLabV3Plus框架训练语义分割的数据集是VOC格式，单通道的彩色图（Pillow中的一种色彩模式）。

小常识：OpenCV打开图片默认是三通道格式，显示和保存也是只支持三通道的图像。即cv::imshow()只能显示三通道的图片，cv::imwrite()、cv::VideoWriter如果要保存灰度图，必须将调用cv::cvtColor()先将图像转换成三通道的。

我是做二分类，只用得上下面代码中colors_map前面两种颜色，已有的数据集的mask是24位三通道的，要转换成VOC格式。代码如下：

# coding=utf-8

import os
import numpy as np
from tqdm import tqdm
from PIL import Image, ImageOps


# 颜色映射表，当把颜色表的颜色种类增加到17种时（17>2^4），图片的位深度就会变成8。
colors_map = [
    (0, 0, 0),  # 黑色(背景像素) 0
    (128, 0, 0),  # 暗红色 1
    (255, 0, 0),  # 红色 2
    (0, 255, 0),  # 绿色 3
    (0, 0, 255),  # 蓝色 4
    (255, 255, 0),  # 黄色 5
    (0, 255, 255),  # 青色 6
    (255, 0, 255),  # 深红色 7
    (255, 192, 203),  # 粉红色 8
    (0, 199, 140),  # 玉色 9
    (128, 42, 42),  # 棕色 10
    (255, 97, 0),  # 橙色 11
    (238, 130, 238),  # 紫罗兰 12
    (128, 0, 128),  # 紫色 13
    (0, 0, 139),  # 深蓝色 14
    (0, 0, 128),  # 海军蓝 15
    (211, 242, 231),  # 水绿色 16
    (0, 139, 139),  # 深青色 17
    (0, 100, 0)  # 深绿色 18
]


def rgb_to_mask(image_file, out_dir):
    image = Image.open(image_file)
    image_gray = ImageOps.grayscale(image)

    width = image.size[0]
    height = image.size[1]

    for x in range(0, width):
        for y in range(0, height):
            pix = image_gray.getpixel((x, y))
            if pix != 0:
                image_gray.putpixel((x, y), 1)

    image_p = image_gray.convert('P')  # P代表索引图片，
    palette = np.array(colors_map).reshape(-1).tolist()
    image_p.putpalette(palette)

    (filename, extension) = os.path.splitext(image_file)
    filename = os.path.basename(filename)
    image_p_name = os.path.join(out_dir, filename + ".png")
    image_p.save(image_p_name)
    # image_p.show()


if __name__ == '__main__':
    input_path = 'D:\\BaiduNetdiskDownload\\CelebAMask-HQ\\CelebA-HQ-mask\\'
    output_path = 'D:\\BaiduNetdiskDownload\\CelebAMask-HQ\\CelebA-HQ-mask\\result\\'
    if not os.path.exists(output_path):
        os.makedirs(output_path)

    files = os.listdir(input_path)
    image_files = list(filter(lambda x: '.png' in x, files))
    image_files = tqdm(image_files)
    for image_file in image_files:
        image_file = input_path + image_file
        rgb_to_mask(image_file, output_path)

自制DeepLabV3Plus框架训练语义分割的VOC格式数据集

猜你喜欢