Series index: PyTorch (image segmentation with UNet) quick start and hands-on practice - 0. Preface
PyTorch quick start and hands-on practice - 1. Background knowledge (introduction to the basic elements)
PyTorch quick start and hands-on practice - 2. The development of classic deep learning networks
PyTorch quick start and hands-on practice - 3. Implementing UNet
PyTorch quick start and hands-on practice - 4. Network training and testing

Continued from the previous part: PyTorch quick start and hands-on practice - 3. Implementing UNet
With the network implemented, the next steps are to read the data and backpropagate to update the parameters.

Semantic Segmentation Implementation Process

1. Read data
- 1.1 Inherit the Dataset class
- 1.2 The DataLoader
2. Other initialization
3. Overall code
4. Q&A
- ① The mkdir imported at the top of the script

Semantic Segmentation Implementation Process
Training:

The training samples and labels in the dataset are fed to the convolutional neural network one batch at a time, according to the batch size. Depending on the task, the training images and labels are preprocessed first, for example by cropping or data augmentation. This helps the deep network train, speeds up convergence, reduces over-fitting, and improves the model's ability to generalize.

Validation:

After each training epoch, the validation samples and labels are fed to the network with the weights from that epoch. The semantic segmentation metrics you have implemented are computed to score the current state of training, and the corresponding weights are saved. Alternating one epoch of training with one round of validation is a common way to monitor the model's performance.

Testing:

After all training is finished, the test samples and labels are fed to the network with the best saved weights loaded into the model. Test results come in two forms: metric scores that measure the network's performance, and predictions saved as images so that the accuracy of the segmentation can be judged by eye.
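The train, validate, keep-the-best cycle described above can be sketched in plain Python. This is only a control-flow illustration under stated assumptions: fit, train_one_epoch and validate are hypothetical names, and the scores are made-up stand-ins rather than results from any model.

```python
def fit(n_epochs, train_one_epoch, validate):
    """Train for n_epochs, validating after each epoch and remembering the best one."""
    best_score = float("-inf")
    best_epoch = None
    for epoch in range(n_epochs):
        train_one_epoch(epoch)   # one pass over the training set
        score = validate(epoch)  # metric on the validation set, higher is better
        if score > best_score:   # this is the point where the weights would be saved
            best_score, best_epoch = score, epoch
    return best_epoch, best_score


# Stand-in callbacks with made-up scores, only to show the control flow.
fake_scores = [0.61, 0.70, 0.68, 0.74, 0.73]
best_epoch, best_score = fit(
    n_epochs=len(fake_scores),
    train_one_epoch=lambda epoch: None,
    validate=lambda epoch: fake_scores[epoch],
)
print(best_epoch, best_score)
```

In a real run the two callbacks would wrap the training and evaluation loops, and the weights saved at the best epoch are the ones loaded for the final test.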

1. Read data

If you are working on classification rather than segmentation, read this part or see other articles:

ImageFolder
PyTorch has a ready-made data reader for this, torchvision.datasets.ImageFolder. The API is modeled on Keras and targets classification problems, where each class of data is stored in its own folder. For example, with 10 classes you create 10 subfolders under one root folder, and each subfolder holds the images of one class.
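As a rough illustration of that folder layout, the sketch below builds a small directory tree with the standard library and derives class indices from the sorted subfolder names, which to my understanding is how ImageFolder assigns labels. The class names and the empty placeholder files are assumptions made up for the example; torchvision itself is not used here.

```python
import os
import tempfile

# Build root/<class_name>/<image files>, the layout ImageFolder expects.
root = tempfile.mkdtemp()
for class_name in ["cat", "dog", "bird"]:
    class_dir = os.path.join(root, class_name)
    os.makedirs(class_dir)
    for i in range(2):
        # Empty placeholder files stand in for real images.
        open(os.path.join(class_dir, f"{class_name}_{i}.png"), "w").close()

# One class per subfolder, indexed in sorted order of the folder names.
classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
class_to_idx = {name: idx for idx, name in enumerate(classes)}
print(class_to_idx)
```

With real images in place, the same root would be passed to torchvision.datasets.ImageFolder(root).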

For image segmentation, see below:

1.1 Inherit the Dataset class

PyTorch reads images mainly through the torch.utils.data.Dataset class, which is imported like this:

from torch.utils.data import Dataset

Then subclass it to implement our own data-reading class. It is very concise,
with three methods: the constructor __init__, the element accessor __getitem__, and __len__, which reports the number of samples.

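import glob
import os

import matplotlib.pyplot as plt


class myImageDataset(Dataset):
    def __init__(self, inputs_root, labels_root):
        # Sort both lists so that input i and label i refer to the same image
        self.files = sorted(glob.glob(os.path.join(inputs_root, "*.png")))
        self.files_using = sorted(glob.glob(os.path.join(labels_root, "*.png")))

    def __getitem__(self, index):
        # plt.imread returns a NumPy array; PNG files come back as floats in [0, 1]
        inputs = plt.imread(self.files[index % len(self.files)])
        labels = plt.imread(self.files_using[index % len(self.files_using)])
        return inputs, labels

    def __len__(self):
        return len(self.files)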

A quick walkthrough:

__init__

inputs_root is the root directory of the input images, and labels_root is the directory of the labels.
glob is a module in the Python standard library. glob.glob returns every path that matches a pattern, so the statements above collect all the png files in each directory. sorted puts both lists in the same order so that inputs and labels line up one to one, assuming an input and its label share the same file name.
self is the instance itself; it is passed automatically, so you skip it when supplying arguments.

__getitem__

index is supplied by the DataLoader, which requests one sample at a time; you do not increment it yourself.
The method uses the index to look up the paths, reads the two images, and returns them. That is all.
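Because the pairing depends entirely on the two sorted lists lining up, it can be worth checking that before training. The following is a minimal sketch using temporary folders and empty placeholder files; the folder names and file names are assumptions chosen for the example.

```python
import glob
import os
import tempfile

# Create inputs/ and labels/ folders holding files with matching names.
root = tempfile.mkdtemp()
inputs_root = os.path.join(root, "inputs")
labels_root = os.path.join(root, "labels")
os.makedirs(inputs_root)
os.makedirs(labels_root)
for name in ["003.png", "001.png", "002.png"]:
    for folder in (inputs_root, labels_root):
        open(os.path.join(folder, name), "w").close()

# glob.glob makes no promise about order, so sort both lists the same way.
input_files = sorted(glob.glob(os.path.join(inputs_root, "*.png")))
label_files = sorted(glob.glob(os.path.join(labels_root, "*.png")))

# Compare file names position by position.
pairs = [(os.path.basename(i), os.path.basename(l))
         for i, l in zip(input_files, label_files)]
all_matched = (len(input_files) == len(label_files)
               and all(i == l for i, l in pairs))
print(pairs, all_matched)
```

A check like this could run once in __init__ so that a missing label is caught early instead of silently shifting every pair after it.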

1.2 The DataLoader

After the Dataset has read the images, how are they fed to the network in batches?
PyTorch provides a tool for this as well: the DataLoader class in the torch.utils.data module.

from torch.utils.data import DataLoader

train_loader = DataLoader(
    dataset=myImageDataset(inputs_root=in_folder + r"\train\inputs",
                         labels_root=in_folder + r"\train\labels"),
    batch_size=16,  # samples per batch, usually a power of 2
    shuffle=True,
    num_workers=8  # number of worker subprocesses used to load data
)

We build the dataset from our data paths, hand it to the DataLoader, and let the loader serve it in batches. batch_size sets the number of samples per batch, and num_workers sets the number of worker subprocesses that load data in parallel; the other parameters are described in the PyTorch documentation.
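To make the effect of batch_size and shuffle concrete, here is a plain-Python sketch of the bookkeeping: shuffle the sample indices, then cut them into chunks. It is an illustration only, not how DataLoader is implemented, and batch_indices is a hypothetical helper.

```python
import math
import random


def batch_indices(n_samples, batch_size, shuffle=True, seed=0):
    """Return the sample indices grouped into consecutive batches."""
    order = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(order)
    return [order[start:start + batch_size]
            for start in range(0, n_samples, batch_size)]


batches = batch_indices(n_samples=100, batch_size=16)
sizes = [len(batch) for batch in batches]
print(len(batches), sizes)
```

With 100 samples and a batch size of 16 there are ceil(100 / 16) = 7 batches and the last one holds only 4 samples. DataLoader keeps such a short final batch by default; its drop_last parameter discards it instead.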

The test set is the same:

test_loader = DataLoader(
    dataset=myImageDataset(inputs_root=in_folder + r"\test\inputs",
                         labels_root=in_folder + r"\test\labels"),
    batch_size=1,
    shuffle=True,
    num_workers=opt.n_cpu
)

2. Other initialization

2.1 Global variable settings

Define the number of training epochs, the batch size, the learning rate, and the other hyperparameters (set them according to your own needs; I copied mine from my brother):

parser = argparse.ArgumentParser()
parser.add_argument("--epoch", type=int, default=0, help="epoch to start training from")
parser.add_argument("--n_epochs", type=int, default=100, help="number of epochs of training")
# parser.add_argument("--dataset_name", type=str, default="img_align_celeba", help="name of the dataset")
parser.add_argument("--batch_size", type=int, default=16, help="size of the batches")
parser.add_argument("--lr", type=float, default=0.0002, help="adam: learning rate")
parser.add_argument("--b1", type=float, default=0.5, help="adam: decay of first order momentum of gradient")
parser.add_argument("--b2", type=float, default=0.999, help="adam: decay of second order momentum of gradient")
parser.add_argument("--decay_epoch", type=int, default=100, help="epoch from which to start lr decay")
parser.add_argument("--n_cpu", type=int, default=4, help="number of cpu threads to use during batch generation")
parser.add_argument("--channels", type=int, default=1, help="number of image channels")
parser.add_argument("--sample_interval", type=int, default=100, help="interval between saving image samples")
parser.add_argument("--checkpoint_interval", type=int, default=-1, help="interval between model checkpoints")
opt = parser.parse_args()
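The defaults above apply when the script is started without arguments, and any of them can be overridden on the command line. A small sketch with a subset of the options shows the mechanism; passing an explicit list to parse_args stands in for typing the same flags in a terminal.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--n_epochs", type=int, default=100, help="number of epochs of training")
parser.add_argument("--batch_size", type=int, default=16, help="size of the batches")
parser.add_argument("--lr", type=float, default=0.0002, help="adam: learning rate")

# Equivalent to running: python train.py --batch_size 8 --lr 0.001
# (train.py is a placeholder script name)
opt = parser.parse_args(["--batch_size", "8", "--lr", "0.001"])
print(opt.n_epochs, opt.batch_size, opt.lr)
```

Options that are not given keep their defaults, so n_epochs is still 100 here, and type=int or type=float converts the command-line strings for you.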

2.2 Set up the network, loss function, optimizer, and tensor type

Optimizers are not covered in detail here; Adam is the one used.
As for tensors: mathematically, tensors are multilinear functions, and a matrix is the representation of a tensor under a specific set of basis vectors. In PyTorch, a tensor is in practice a multi-dimensional array.
You can look up the rest yourself, or read this article: Talking about what is a tensor

# Initialize net
net = AdUNet()
# Losses
loss = torch.nn.L1Loss()

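# Move the network and the loss to the GPU if one is available
cuda = torch.cuda.is_available()
if cuda:
    net = net.cuda()
    loss = loss.cuda()

if opt.epoch != 0:
    # Resume from a saved checkpoint. The file name needs an epoch number filled in
    # (opt.epoch is assumed here), and load_state_dict expects a file written with
    # torch.save(net.state_dict(), ...)
    net.load_state_dict(torch.load("../saved_models/AdUnet_%d.pkl" % opt.epoch))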

# Optimizers
optimizer = torch.optim.Adam(net.parameters(), lr=opt.lr, betas=(opt.b1, opt.b2))

Tensor = torch.cuda.FloatTensor if cuda else torch.Tensor

2.3 Training

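The general training step: zero the gradients, backpropagate, update the parameters

optimizer.zero_grad()  # reset the gradients; do this before each new backward pass
loss.backward()  # backpropagate to compute the gradient of every parameter
optimizer.step()  # update the parameters (weights); the learning rate sets the step size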
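To see what these three calls do, the sketch below runs a single step on a tiny stand-in model. torch.nn.Linear and the random data are assumptions made for the example (the article's network is AdUNet); the loss and optimizer settings match the ones used in this article.

```python
import torch

torch.manual_seed(0)
net = torch.nn.Linear(4, 1)  # stand-in for the real network
criterion = torch.nn.L1Loss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.0002, betas=(0.5, 0.999))

inputs = torch.randn(8, 4)  # random stand-in data
labels = torch.randn(8, 1)

before = net.weight.detach().clone()

optimizer.zero_grad()                        # clear gradients from any previous step
loss_value = criterion(net(inputs), labels)  # forward pass and loss
loss_value.backward()                        # compute gradients
optimizer.step()                             # move the weights along the gradients

weights_changed = not torch.equal(before, net.weight.detach())
lr_after = optimizer.param_groups[0]["lr"]
print(weights_changed, lr_after)
```

The weights change while the learning rate stays at 0.0002: step() updates the parameters, and the learning rate only changes if a scheduler is added.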

The process in detail:
A side note: in the code below, each batch loaded from train_loader contains not only inputs and labels but also a third item, inputs_path. I pass it along because I want to save result images later and need a name for each one. So the __getitem__ in my dataset differs slightly from the earlier version; here is mine (it simply returns the file name as well):

    def __getitem__(self, index):
        inputs_path = self.files[index % len(self.files)]
        inputs = plt.imread(inputs_path)
        labels = plt.imread(self.files_using[index % len(self.files_using)])
        inputs_name = os.path.basename(inputs_path)  # file name only; needs `import os`
        return inputs, labels, inputs_name

2.3.1 Read training data, convert training data

    for i, data in enumerate(train_loader):
        # One batch
        inputs, labels, inputs_path = data
        inputs = inputs.unsqueeze(1)
        labels = labels.unsqueeze(1)
        # Wrap the data in Variable (a no-op since PyTorch 0.4, kept from the original code)
        inputs, labels = Variable(inputs), Variable(labels)
        device = torch.device("cuda" if cuda else "cpu")
        inputs = inputs.to(device)
        labels = labels.to(device)

My grayscale images are two-dimensional when read in, (120, 240). Batching adds one dimension, and with a batch size of 16 the inputs come out as (16, 120, 240). The network, however, expects four-dimensional input. What to do?
A grayscale image has a single channel, i.e. (120, 240, 1), so we add that channel dimension ourselves. While debugging I found that the network expects the channel in the second position, so the shape must become (16, 1, 120, 240). unsqueeze does exactly this: inputs.unsqueeze(1) inserts a dimension of size 1 at position 1. Likewise, inputs.unsqueeze(0) inserts it at the front.
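The shape changes described above can be checked directly on dummy tensors. The zero tensors are stand-ins with the same shapes as the grayscale batches in this article.

```python
import torch

batch = torch.zeros(16, 120, 240)  # (batch, height, width), as delivered by the loader
with_channel = batch.unsqueeze(1)  # insert a channel dimension at position 1
print(with_channel.shape)          # torch.Size([16, 1, 120, 240])

single = torch.zeros(120, 240)     # one image without a batch dimension
as_batch = single.unsqueeze(0)     # insert a dimension at the front
print(as_batch.shape)              # torch.Size([1, 120, 240])

restored = with_channel.squeeze(1)  # squeeze removes a size-1 dimension again
print(restored.shape)               # torch.Size([16, 120, 240])
```

squeeze is the inverse operation and is what the test code uses later to bring the network output back to a two-dimensional image.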

Once the shapes of inputs and labels are fixed, the code wraps them in Variable. Old PyTorch versions required this for backpropagation; since PyTorch 0.4 Variable has been merged into Tensor, so the wrapping has no effect and I only kept it because I copied it. Finally the data is moved to the GPU if one is available; without a GPU it stays on the CPU.

2.3.2 Training network

Zero the gradients (they are 0 at the start; this resets them for the next iteration)
Put the network in training mode
Run the forward pass to get the result netout
Compute the loss
Backpropagate
Update the parameters
You don't need to understand every detail at first, just copy the pattern.

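        optimizer.zero_grad()  # reset the gradients; do this before each new backward pass
        net.train()
        netout = net(inputs)

        # Total loss
        gloss = loss(netout, labels)

        gloss.backward()  # backpropagate to compute the gradient of every parameter
        optimizer.step()  # update the parameters (weights)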

2.4 Testing

Initialize the running totals for the evaluation metrics SSIM, PSNR and RMSE.

    total_s = 0  # ssim
    total_p = 0  # psnr
    total_r = 0  # rmse

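skimage provides these metrics. In scikit-image 0.18 and later they live in skimage.metrics; older releases exposed them as skimage.measure.compare_ssim, compare_psnr and compare_mse:

from skimage.metrics import structural_similarity as ssim
from skimage.metrics import peak_signal_noise_ratio as psnr
from skimage.metrics import mean_squared_error as mse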
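For reference, RMSE and PSNR are simple enough to write out by hand. The sketch below uses plain Python lists and the textbook definitions, with data_range as the largest possible pixel value (1.0 for images scaled to [0, 1]). The function names and the tiny 2x2 images are made up for the example; SSIM is left out because it is considerably more involved.

```python
import math


def mse_plain(a, b):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def rmse_plain(a, b):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse_plain(a, b))


def psnr_plain(a, b, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    err = mse_plain(a, b)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(data_range ** 2 / err)


label = [[0.0, 1.0], [1.0, 0.0]]
prediction = [[0.0, 1.0], [1.0, 0.5]]  # one pixel is off by 0.5
rmse_value = rmse_plain(label, prediction)
psnr_value = psnr_plain(label, prediction)
print(rmse_value, psnr_value)
```

A lower RMSE and a higher PSNR both mean the prediction is closer to the label.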

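Read the test data
Unify the data format
Convert to Variable
Send the inputs to the GPU
Put the network in evaluation mode with net.eval(), which switches layers such as dropout and batch normalization to inference behavior; gradient tracking is turned off separately by torch.no_grad() on the first line, so the test data takes no part in backpropagation
Feed the inputs to the network to get the result netout
Squeeze the result down to two dimensions (grayscale images are two-dimensional)
Score the output image against the label
Done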

    with torch.no_grad():
        for inputs, labels, inputs_path in test_loader:
            # One batch (batch_size is 1 for the test loader)
            inputs = inputs.unsqueeze(1)
            # labels = labels.unsqueeze(0)
            # Wrap the data in Variable (a no-op since PyTorch 0.4)
            inputs, labels = Variable(inputs), Variable(labels)
            device = torch.device("cuda" if cuda else "cpu")
            inputs = inputs.to(device)
            labels = labels.squeeze(0).cpu().numpy()

            # optimizer.zero_grad()
            net.eval()
            netout = net(inputs)
            img_out = netout.squeeze(1)
            img_out = img_out.squeeze(0)
            img_out = img_out.cpu().numpy()

            # Total loss
            # gloss = loss(netout, labels)
            # print(netout.shape)
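            # PNG images read with plt.imread are floats in [0, 1], hence data_range=1.0
            s = ssim(labels, img_out, data_range=1.0)
            p = psnr(labels, img_out, data_range=1.0)
            r = sqrt(mse(labels, img_out))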

            total_s += float(s.item())
            total_p += float(p.item())
            total_r += float(r)

3. Overall code:

import argparse
import os
import sys

import torch
from torch.autograd import Variable
from torch.utils.data import DataLoader

from main.AdUNet import AdUNet
from main.datasets import *

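# scikit-image 0.18+ provides these in skimage.metrics (formerly skimage.measure.compare_*)
from skimage.metrics import structural_similarity as ssim
from skimage.metrics import peak_signal_noise_ratio as psnr
from skimage.metrics import mean_squared_error as mse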
from math import sqrt

from other.mkdir import mkdir

mkdir("../saved_models")
parser = argparse.ArgumentParser()
parser.add_argument("--epoch", type=int, default=0, help="epoch to start training from")
parser.add_argument("--n_epochs", type=int, default=100, help="number of epochs of training")
# parser.add_argument("--dataset_name", type=str, default="img_align_celeba", help="name of the dataset")
parser.add_argument("--batch_size", type=int, default=16, help="size of the batches")
parser.add_argument("--lr", type=float, default=0.0002, help="adam: learning rate")
parser.add_argument("--b1", type=float, default=0.5, help="adam: decay of first order momentum of gradient")
parser.add_argument("--b2", type=float, default=0.999, help="adam: decay of second order momentum of gradient")
parser.add_argument("--decay_epoch", type=int, default=100, help="epoch from which to start lr decay")
parser.add_argument("--n_cpu", type=int, default=4, help="number of cpu threads to use during batch generation")
parser.add_argument("--channels", type=int, default=1, help="number of image channels")
parser.add_argument("--sample_interval", type=int, default=100, help="interval between saving image samples")
parser.add_argument("--checkpoint_interval", type=int, default=-1, help="interval between model checkpoints")
opt = parser.parse_args()

in_folder = r"..\data\pigs"
train_folder = in_folder + "\\train"

# Initialize net
net = AdUNet()
# Losses
loss = torch.nn.L1Loss()

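# Move the network and the loss to the GPU if one is available
cuda = torch.cuda.is_available()
if cuda:
    net = net.cuda()
    loss = loss.cuda()

if opt.epoch != 0:
    # Resume from a saved checkpoint. The file name needs an epoch number filled in
    # (opt.epoch is assumed here), and load_state_dict expects a file written with
    # torch.save(net.state_dict(), ...)
    net.load_state_dict(torch.load("../saved_models/AdUnet_%d.pkl" % opt.epoch))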

# Optimizers
optimizer = torch.optim.Adam(net.parameters(), lr=opt.lr, betas=(opt.b1, opt.b2))

Tensor = torch.cuda.FloatTensor if cuda else torch.Tensor

train_loader = DataLoader(
    dataset=ImageDataset(inputs_root=in_folder + r"\train\inputs",
                         labels_root=in_folder + r"\train\labels"),
    batch_size=opt.batch_size,
    shuffle=True,
    num_workers=opt.n_cpu
)
test_loader = DataLoader(
    dataset=ImageDataset(inputs_root=in_folder + r"\test\inputs",
                         labels_root=in_folder + r"\test\labels"),
    batch_size=1,
    shuffle=True,
    num_workers=opt.n_cpu
)


# ----------
#  Training
# ----------
def train(epoch):
    for i, data in enumerate(train_loader):
        # One batch
        inputs, labels, inputs_path = data
        inputs = inputs.unsqueeze(1)
        labels = labels.unsqueeze(1)
        # Wrap the data in Variable (a no-op since PyTorch 0.4, kept from the original code)
        inputs, labels = Variable(inputs), Variable(labels)
        device = torch.device("cuda" if cuda else "cpu")
        inputs = inputs.to(device)
        labels = labels.to(device)
        # ------------------
        #  Train net
        # ------------------

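        optimizer.zero_grad()  # reset the gradients; do this before each new backward pass
        net.train()
        netout = net(inputs)

        # Total loss
        gloss = loss(netout, labels)

        gloss.backward()  # backpropagate to compute the gradient of every parameter
        optimizer.step()  # update the parameters (weights)
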
        # --------------
        #  Log Progress
        # --------------

        sys.stdout.write(
            "\n[Epoch %d/%d] [Batch %d/%d] [D loss: %f]"
            % (epoch, opt.n_epochs, i, len(train_loader), gloss.item())
        )
        # The with block closes the file automatically
        with open(r"../data/pigs/loss.txt", "a") as f1:
            f1.write("\n[Epoch %d/%d] [Batch %d/%d] [D loss: %f]"
                     % (epoch, opt.n_epochs, i, len(train_loader), gloss.item()))


def test():
    total_s = 0  # ssim
    total_p = 0  # psnr
    total_r = 0  # rmse

    with torch.no_grad():
        for inputs, labels, inputs_path in test_loader:
            # One batch (batch_size is 1 for the test loader)
            inputs = inputs.unsqueeze(1)
            # labels = labels.unsqueeze(0)
            # Wrap the data in Variable (a no-op since PyTorch 0.4)
            inputs, labels = Variable(inputs), Variable(labels)
            device = torch.device("cuda" if cuda else "cpu")
            inputs = inputs.to(device)
            labels = labels.squeeze(0).cpu().numpy()

            # optimizer.zero_grad()
            net.eval()
            netout = net(inputs)
            img_out = netout.squeeze(1)
            img_out = img_out.squeeze(0)
            img_out = img_out.cpu().numpy()

            # Total loss
            # gloss = loss(netout, labels)
            # print(netout.shape)
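            # PNG images read with plt.imread are floats in [0, 1], hence data_range=1.0
            s = ssim(labels, img_out, data_range=1.0)
            p = psnr(labels, img_out, data_range=1.0)
            r = sqrt(mse(labels, img_out))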

            total_s += float(s.item())
            total_p += float(p.item())
            total_r += float(r)
    print('\n|Epoch %d/%d| |Average SSIM: %f | |Average PSNR: %f| |Average RMSE: %f| '
          % ((epoch + 1), opt.n_epochs, total_s / len(test_loader), total_p / len(test_loader),
             total_r / len(test_loader)))

    # The with block closes the file automatically
    with open(r"../data/pigs/test_parameters.txt", "a") as f:
        f.write("|Epoch %d/%d| |Average SSIM: %f | |Average PSNR: %f| |Average RMSE: %f| \r\n"
                % ((epoch + 1), opt.n_epochs, total_s / len(test_loader), total_p / len(test_loader),
                   total_r / len(test_loader)))
    return total_s / len(test_loader), total_p / len(test_loader), total_r / len(test_loader)


if __name__ == '__main__':
    for epoch in range(opt.epoch, opt.n_epochs):
        # One epoch
        train(epoch)
        # Save a checkpoint. Storing the state_dict (not the whole model object)
        # is what net.load_state_dict(torch.load(...)) expects when resuming.
        torch.save(net.state_dict(), "../saved_models/AdUnet_%d.pkl" % epoch)
        epoch_s, epoch_p, epoch_r = test()

Besides this script there are the net class and the dataset class from earlier, three files in total. Just run this file and you're done!
"Hehe, that's it"

4. Q&A

① The mkdir imported at the top of the script

Sorry, I forgot to explain it. It comes from the mkdir.py I wrote in the other folder, and its content is as follows:

import os

def mkdir(path):
    if os.path.exists(path):
        return
    else:
        os.mkdir(path)

The main reason is that os.mkdir raises an error if the directory already exists, so you normally check with os.path.exists first and create the directory only when it is missing. I simply wrapped that check in this function and put it in my own toolkit package, other. If you like building your own wheels, writing your own toolkit is worthwhile, and you can give it any name you like, such as myUtils.
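The standard library can also do the check and the creation in one call. This is a possible alternative rather than the version used in this series: os.makedirs with exist_ok=True does nothing when the directory is already there, and it also creates missing parent directories. The temporary directory and folder names are placeholders for the demonstration.

```python
import os
import tempfile


def mkdir(path):
    # exist_ok=True turns the call into a no-op if the directory already exists;
    # os.makedirs also creates any missing parent directories along the way.
    os.makedirs(path, exist_ok=True)


base = tempfile.mkdtemp()
target = os.path.join(base, "saved_models", "run1")
mkdir(target)
mkdir(target)  # calling it a second time does not raise
created = os.path.isdir(target)
print(created)
```

The wrapper keeps the same name and call signature, so the rest of the script would not need to change.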
