How to implement semantic segmentation with UNet (PyTorch version)

Code source: https://github.com/milesial/Pytorch-UNet

1. Build the environment

Before starting to build the environment, be sure to read the readme carefully

I chose the "Without Docker" route, so I will configure the environment according to the following requirements:

Install CUDA

Official website: https://developer.nvidia.com/cuda-toolkit-archive

You can check the highest CUDA version your driver supports with the command nvidia-smi


In the nvidia-smi output you can see that the highest CUDA version my machine supports is 11.7. Then go to the official website and pick a CUDA version no higher than that to download. I chose 10.2 at first but ran into problems during installation, and finally settled on 11.3; the reason is explained later. I recommend reading through the whole tutorial before choosing your CUDA version.


After selecting the version, download the exe that matches your system configuration.


Run the exe to start the installation; you can customize the installation path.


Keep clicking Next until the installation completes.

Install cuDNN

Official website: https://developer.nvidia.com/rdp/cudnn-archive#a-collapse51b

Select the version corresponding to your own CUDA


Downloading directly requires registering an account. Alternatively, expand the entry for the version you want, right-click it to copy the link address, and paste it into a download manager such as Thunder (Xunlei) to download without registering.


After the download completes, unzip it and copy the three extracted folders (bin, include, lib) into the corresponding CUDA installation directory; that completes the configuration.


Install Anaconda

There are plenty of online tutorials for this part, so I won't go into details. (Honestly, I'm too lazy to take screenshots.)

Because different projects require different environments, we can create virtual environments to run our projects:

conda create -n pytorch python=3.8   # create a virtual environment named "pytorch" with Python 3.8
conda activate pytorch   # activate the virtual environment
conda deactivate   # exit the virtual environment
conda remove -n pytorch --all   # delete the virtual environment

Install PyTorch

Note: according to the readme, PyTorch 1.12 or above is required.

Corresponding version installation instructions: https://pytorch.org/get-started/previous-versions/

Activate the virtual environment we just created and enter the corresponding command:

conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch

Test whether the installation succeeded: press Win+R, type cmd, and press Enter to open a command prompt.
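In the command prompt, activate the environment, start Python, and run the check (a minimal sketch of what the screenshot showed):

import torch

# Prints True when PyTorch was built with CUDA support and can see the GPU
print(torch.cuda.is_available())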


If you get True, the installation is successful!

A record of the pits I stepped in!!!

When I first installed it here, the check always returned False. I thought it was an environment problem, but after deleting and reinstalling many times it was still False. After searching around online, I found the likely cause: the version conda downloaded was not the GPU build at all!

Run conda list and check the pytorch entry; in a correct installation the build string mentions cuda rather than cpu (e.g. py3.8_cuda11.3_cudnn8_0).


If conda list shows a CPU build of pytorch after downloading, then you have fallen into conda's pit. With the Tsinghua mirror configured as the source, conda downloads pytorch from that mirror, and if it cannot find the exact build you specified, it silently falls back to a default CPU build. To solve this I took the simplest, bluntest approach: look at which GPU builds the mirror actually offers, and then install the matching CUDA version. That is why I ended up downloading CUDA 11.3.


python3.8 + cuda11.3 + cudnn8_0 are matching versions, so no errors will occur!

Link address: https://mirrors.bfsu.edu.cn/anaconda/cloud/pytorch/win-64/
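After installation you can also verify that the versions line up (a quick sanity check using standard PyTorch attributes):

import torch

print(torch.__version__)               # PyTorch version, e.g. 1.12.0
print(torch.version.cuda)              # CUDA version the build was compiled with, e.g. 11.3
print(torch.backends.cudnn.version())  # cuDNN version, e.g. 8302 for a cuDNN 8.x build
print(torch.cuda.is_available())       # should print True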

Install dependencies

Following the readme, you can simply run pip install -r requirements.txt

The requirements.txt contains:

matplotlib==3.6.2
numpy==1.23.5
Pillow==9.3.0
tqdm==4.64.1
wandb==0.13.5

But this can be very slow, so it is recommended to use a mirror source:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple matplotlib==3.6.2

Note that these commands are executed in the virtual environment we just created.

2. Data preparation

Reference blog: https://blog.csdn.net/ECHOSON/article/details/122914826

Prepare two folders: one for the original images and one for the annotated masks.

The labeling software used is labelme

You can install and launch it from the command line. Activate the virtual environment and enter:

pip install labelme  # a mirror source can be used here as well

Then simply type labelme on the command line to start it.

After annotating you get json files, which need to be converted to png format before they can be used. The conversion code:

import os
import os.path as osp
import shutil
import cv2


def json2png(json_folder, png_save_folder):
    # Start from a clean output folder
    if osp.isdir(png_save_folder):
        shutil.rmtree(png_save_folder)
    os.makedirs(png_save_folder)
    json_files = os.listdir(json_folder)
    for json_file in json_files:
        json_path = osp.join(json_folder, json_file)
        # Let labelme's CLI tool convert the json into a dataset folder
        os.system("labelme_json_to_dataset {}".format(json_path))
        label_path = osp.join(json_folder, json_file.split(".")[0] + "_json/label.png")
        png_save_path = osp.join(png_save_folder, json_file.split(".")[0] + ".png")
        # Read the label as grayscale and binarize: every labeled pixel becomes 255
        label_png = cv2.imread(label_path, 0)
        label_png[label_png > 0] = 255
        cv2.imwrite(png_save_path, label_png)


if __name__ == '__main__':
    # !!!! The json folder must contain only json files, nothing else
    json2png(json_folder="D:/Project/testData/jsons/",
             png_save_folder="D:/Project/testData/jsons/labels/")

The final file structure: the original images go into imgs and the annotated masks into masks. Note that the image names must correspond one-to-one. This part is covered in the reference blog, where the blogger explains it in detail.
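A quick way to verify the one-to-one correspondence is to compare the file stems in the two folders (a small sketch; adjust the paths to your own data directory):

import os

imgs = {os.path.splitext(f)[0] for f in os.listdir('data/imgs')}
masks = {os.path.splitext(f)[0] for f in os.listdir('data/masks')}

# Any stem present in one folder but not the other breaks the pairing
print('images without masks:', imgs - masks)
print('masks without images:', masks - imgs)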

What I mainly want to cover here is data augmentation and the pits I ran into.

Because the amount of original data was small, training results were poor, so the idea was to expand the number of images through data augmentation.

Using Augmentor for semantic segmentation data augmentation

Create a virtual environment named Augmentor, activate it, and install Augmentor:

conda create -n Augmentor python=3.8  
conda activate Augmentor
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple Augmentor

Create two new folders, test1 for the original images and test2 for the masks.

import Augmentor


# Path of the original images and path of the mask images
# (use "/" instead of "\" in the paths)
p = Augmentor.Pipeline("D:/Project/Augmentor/test1")  # original images
p.ground_truth("D:/Project/Augmentor/test2")          # annotated masks

# Horizontal / vertical flips, each applied with probability 0.5
p.flip_left_right(probability=0.5)
p.flip_top_bottom(probability=0.5)

# Random brightness change; min_factor and max_factor control how strongly
# the brightness varies and can be tuned to the desired effect
p.random_brightness(probability=1, min_factor=0.7, max_factor=1.2)

# Random color / contrast change
# p.random_color(probability=1, min_factor=0.0, max_factor=1)
p.random_contrast(probability=1, min_factor=0.7, max_factor=1.2)

# Random flip (flip_random)
p.flip_random(probability=1)

# Number of augmented samples to generate; change to 100, 1000, etc.
p.sample(1000)

The augmented images are written to an output folder, from which the original images and masks then have to be separated; a helper script for this is sketched below.
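Instead of sorting them by hand, a small script can do the split. This sketch assumes Augmentor's default behavior of writing to an output subfolder of the source directory and prefixing ground-truth files with _groundtruth_; check your own output folder before relying on it:

import os
import shutil

output = 'D:/Project/Augmentor/test1/output'  # where Augmentor writes its results
img_dir = 'D:/Project/Augmentor/imgs'
mask_dir = 'D:/Project/Augmentor/masks'
os.makedirs(img_dir, exist_ok=True)
os.makedirs(mask_dir, exist_ok=True)

for name in os.listdir(output):
    src = os.path.join(output, name)
    # ground-truth (mask) files carry the _groundtruth_ prefix
    dst_dir = mask_dir if name.startswith('_groundtruth_') else img_dir
    shutil.move(src, os.path.join(dst_dir, name))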

To prepare for training we need to rename the images: first, each original image and its mask must share the same name; second, the generated filenames contain two concatenated names, which gets in the way when the training code splits filenames.

The code for batch-renaming images is below; tweak it to your own needs:

import os

# Batch-rename files: keep the stem before the first '.' and append '.png'
path = 'D:/Project/Pytorch-UNet-master/data/imgs'  # folder to process
list_path = os.listdir(path)  # file names inside the folder
for index in list_path:
    name = index.split('.')[0] + '.png'
    print(name)
    os.rename(os.path.join(path, index), os.path.join(path, name))

At this point we have the 1000 augmented images and their corresponding masks. A new problem arises: for training I only need two classes, i.e. masks containing only the pixel values 0 and 255.

However, the masks produced by augmentation may contain many other pixel values, so we need a simple post-processing step to binarize them to meet the training requirement (C++ implementation):

#include <cstring>
#include <io.h>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

void getFiles(string path, vector<string>& files);

int main()
{
    vector<string> files;
    string path = "D:\\Project\\Augmentor\\mask";
    getFiles(path, files);
    // Iterate over every file in the folder
    for (size_t k = 0; k < files.size(); k++)
    {
        Mat src = imread(files[k]);
        for (int i = 0; i < src.rows; i++) {
            for (int j = 0; j < src.cols; j++) {
                // Binarize: pixels brighter than 50 become white, the rest black
                if (src.at<cv::Vec3b>(i, j)[0] > 50)
                {
                    src.at<cv::Vec3b>(i, j)[0] = 255;
                    src.at<cv::Vec3b>(i, j)[1] = 255;
                    src.at<cv::Vec3b>(i, j)[2] = 255;
                }
                else
                {
                    src.at<cv::Vec3b>(i, j)[0] = 0;
                    src.at<cv::Vec3b>(i, j)[1] = 0;
                    src.at<cv::Vec3b>(i, j)[2] = 0;
                }
            }
        }
        imwrite(files[k], src);
    }
    return 0;
}

// Recursively collect all file paths under `path` (Windows-only, uses <io.h>)
void getFiles(string path, vector<string>& files)
{
    // file handle
    intptr_t hFile = 0;
    // file info
    struct _finddata_t fileinfo;
    string p;
    if ((hFile = _findfirst(p.assign(path).append("\\*").c_str(), &fileinfo)) != -1)
    {
        do
        {
            // If it is a directory, recurse into it; otherwise add it to the list
            if ((fileinfo.attrib & _A_SUBDIR))
            {
                if (strcmp(fileinfo.name, ".") != 0 && strcmp(fileinfo.name, "..") != 0)
                    getFiles(p.assign(path).append("\\").append(fileinfo.name), files);
            }
            else
            {
                files.push_back(p.assign(path).append("\\").append(fileinfo.name));
            }
        } while (_findnext(hFile, &fileinfo) == 0);
        _findclose(hFile);
    }
}
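If you would rather stay in Python, the same binarization can be written in a few lines with OpenCV (a sketch that, like the C++ version, overwrites the masks in place):

import os
import cv2

mask_dir = 'D:/Project/Augmentor/mask'
for name in os.listdir(mask_dir):
    path = os.path.join(mask_dir, name)
    img = cv2.imread(path)
    # Same rule as the C++ code: pixels above 50 become 255, the rest become 0
    _, binary = cv2.threshold(img, 50, 255, cv2.THRESH_BINARY)
    cv2.imwrite(path, binary)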

At this point I was very close to success, but another problem appeared when I fed the images into training: an error complained that the two inputs had different dimensions. It turned out that the original masks are 8-bit images while the augmented masks are 24-bit, so we need to convert the bit depth:

import os
import cv2

# Convert 24-bit masks to 8-bit grayscale
path = 'D:/Project/Augmentor/mask'       # source folder
path1 = 'D:/Project/Augmentor/masktest'  # destination folder
list_path = os.listdir(path)  # file names inside the folder
for index in list_path:
    p1 = os.path.join(path, index)   # image to convert
    p2 = os.path.join(path1, index)  # where to save the converted image
    print(p1)
    print(p2)

    img = cv2.imread(p1)
    # Collapse the three channels into a single 8-bit grayscale channel
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # If saving into the same folder, make sure names do not collide
    cv2.imwrite(p2, img)

At this point, all the image processing is done.
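Before training, it is worth double-checking that every mask is now a single-channel 8-bit image containing only the values 0 and 255 (a quick sanity check; the folder path is an assumption, adjust it to yours):

import os
import cv2
import numpy as np

mask_dir = 'D:/Project/Pytorch-UNet-master/data/masks'
for name in os.listdir(mask_dir):
    mask = cv2.imread(os.path.join(mask_dir, name), cv2.IMREAD_UNCHANGED)
    assert mask.ndim == 2, name + ' is not single-channel'
    assert set(np.unique(mask)) <= {0, 255}, name + ' has unexpected pixel values'
print('all masks OK')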

3. Start training

Modify the parameters as appropriate, including the number of classes you want to segment. img_scale is the ratio by which images are resized; if your images are large and training fails with an out-of-memory error, try lowering this value.


Then you can start training! If you run it from the command line, remember to switch to the correct drive and activate the virtual environment: on the wrong drive the command will error out, and in the wrong environment it won't run at all.

Run the training command:
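For example (the flags below come from the repository's train.py; run python train.py -h to confirm them for your version):

cd /d D:\Project\Pytorch-UNet-master
conda activate pytorch
python train.py --epochs 5 --batch-size 1 --scale 0.5 --classes 2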

