(4) Implementing the loss function, and some pitfalls encountered while building the network

Let’s talk about the loss first. In the previous code, we used PyTorch's official loss function, the cross-entropy loss:

criterion = torch.nn.CrossEntropyLoss()
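As a quick illustration of how this criterion is called (a minimal sketch with made-up shapes and values, not from the original post): it expects raw logits of shape (N, C) and class-index targets of shape (N,).

import torch

criterion = torch.nn.CrossEntropyLoss()

logits = torch.randn(4, 3)            # raw, un-softmaxed scores: 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])  # class indices, NOT one-hot
loss = criterion(logits, targets)
print(loss.item())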

The cross-entropy loss function consists of three parts chained together: softmax ---> log ---> nll_loss. If I have time, I will write a separate article introducing it. The implementation code is as follows; I define it as a class, but a plain function would also do.

class Compute_Loss(nn.Module):
    def __init__(self):
        super(Compute_Loss, self).__init__()

    def forward(self, pred, target):
        # `device` is a global defined in the training script below
        pred = pred.to(device)
        target = target.to(device)
        # log_softmax followed by nll_loss reproduces nn.CrossEntropyLoss
        log_soft = F.log_softmax(pred, dim=1)
        loss = F.nll_loss(log_soft, target)
        return loss
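A minimal sanity check (my own addition, using random made-up inputs) that this manual log_softmax + nll_loss composition matches the built-in nn.CrossEntropyLoss; note that both take raw logits:

import torch
import torch.nn.functional as F

logits = torch.randn(8, 3)           # raw scores: 8 samples, 3 classes
targets = torch.randint(0, 3, (8,))  # class indices

manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
builtin = torch.nn.CrossEntropyLoss()(logits, targets)
print(torch.allclose(manual, builtin))  # True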

Some pitfalls:

1. The last layer of the resnet18 network built earlier is the fc fully connected layer. I added a softmax layer after the fc layer and found that the loss would not decrease. After searching online and looking at how nn.CrossEntropyLoss is composed, I found:

  • The input confidence score (input) for each class must be raw, not softmaxed or otherwise normalized, because this loss function applies softmax to the input scores itself; the input must therefore be the raw score for each class. Also, the target must be class indices, not one-hot encoded.

2. In the same way, I added a ReLU layer after the fc layer, and the loss also would not decrease. This is likely because ReLU clamps every negative logit to zero, distorting the class scores before the softmax inside the loss is applied.

3. When building the network, if you create an nn.Linear() layer directly inside forward, an error is reported when training on the GPU: the data is on the GPU but the model is not, even though I called model.to("cuda"). The fix is to define nn.Linear() as a member variable in the class's __init__ and then call it in forward; see the sketch after this list.
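A minimal sketch (my own illustration, using a made-up toy module with hypothetical layer sizes) of why pitfall 3 happens: a layer created inside forward is never registered as a submodule, so model.to("cuda") cannot move its weights, and the optimizer never sees its parameters.

import torch
import torch.nn as nn

class Wrong(nn.Module):
    def forward(self, x):
        # Created on the fly: not registered as a submodule, so it stays
        # on the CPU, gets fresh random weights every call, and is
        # invisible to both .to(device) and the optimizer.
        fc = nn.Linear(512, 3)
        return fc(x)

class Right(nn.Module):
    def __init__(self):
        super(Right, self).__init__()
        self.fc = nn.Linear(512, 3)  # registered in __init__, moved by .to()

    def forward(self, x):
        return self.fc(x)

net = Right().to("cuda" if torch.cuda.is_available() else "cpu")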

Defining your own loss function: although the class above just imitates nn.CrossEntropyLoss(), you can build your own loss function class in the same way to compute whatever loss you need.
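For instance, here is a minimal sketch (my own illustrative addition, not part of the original post) of a different loss built in the same style: cross-entropy with label smoothing. The class name and the smoothing parameter are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SmoothedCELoss(nn.Module):
    # Hypothetical example: cross-entropy with label smoothing,
    # written in the same nn.Module style as Compute_Loss above.
    def __init__(self, num_classes, smoothing=0.1):
        super(SmoothedCELoss, self).__init__()
        self.num_classes = num_classes
        self.smoothing = smoothing

    def forward(self, pred, target):
        log_soft = F.log_softmax(pred, dim=1)
        with torch.no_grad():
            # Soft targets: 1 - smoothing on the true class,
            # smoothing / (C - 1) spread over the other classes.
            soft = torch.full_like(log_soft, self.smoothing / (self.num_classes - 1))
            soft.scatter_(1, target.unsqueeze(1), 1.0 - self.smoothing)
        return torch.mean(torch.sum(-soft * log_soft, dim=1))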

The full training code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms, models
import os
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageFile
from my_resnet import MainNet

ImageFile.LOAD_TRUNCATED_IMAGES = True
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"


def train():
    running_loss = 0
    for batch_idx, (data, target) in enumerate(train_data):
        data, target = data.to(device), target.to(device)
        out = net(data)  # raw logits; no softmax before the loss (see pitfall 1)
        loss = criterion(out, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    return running_loss


def test():
    correct, total = 0, 0
    with torch.no_grad():
        for _, (data, target) in enumerate(val_data):
            data, target = data.to(device), target.to(device)
            out = net(data)
            # softmax is monotonic, so argmax over the raw logits would
            # give the same prediction; it is applied here only for clarity
            out = F.softmax(out, dim=1)
            prediction = out.argmax(dim=1)
            # prediction = torch.max(out.data, dim=1)[1]
            total += target.size(0)
            correct += (prediction == target).sum().item()
        print('Accuracy on test set: (%d/%d)=%d %%' % (correct, total, 100 * correct / total))


class Compute_Loss(nn.Module):
    def __init__(self):
        super(Compute_Loss, self).__init__()

    def forward(self, pred, target):
        pred = pred.to(device)
        target = target.to(device)
        log_soft = F.log_softmax(pred, dim=1)
        loss = F.nll_loss(log_soft, target)
        return loss




if __name__ == '__main__':
    loss_list = []
    Epoches = 200
    Batch_Size = 4
    Image_Size = [256, 256]

    # 1. Load the data
    data_dir = r'D:\Code\python\完整项目放置\classify_project\multi_classification\my_dataset1'
    # 1.1 Define the transforms to apply to the data
    data_transform = {x: transforms.Compose([transforms.Resize(Image_Size), transforms.ToTensor()]) for x in
                      ["train", "valid"]}
    image_datasets = {x: datasets.ImageFolder(root=os.path.join(data_dir, x), transform=data_transform[x]) for x in
                      ["train", "valid"]}
    dataloader = {x: torch.utils.data.DataLoader(dataset=image_datasets[x], batch_size=Batch_Size, shuffle=True) for x in
                  ["train", "valid"]}
    train_data, val_data = dataloader["train"], dataloader["valid"]

    index_classes = image_datasets["train"].class_to_idx
    print(index_classes)
    example_classes = image_datasets["train"].classes
    print(example_classes)

    num_classes = 3
    net = MainNet(num_classes)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    net.to(device)

    # 5. Define the loss function and the optimizer
    LR = 0.0001
    criterion = Compute_Loss()
    optimizer = optim.Adam(net.parameters(), lr=LR)

    best_loss = 100
    for epoch in range(Epoches):
        loss = train()
        loss_list.append(loss)
        print("第%d轮的loss为:%5f:" % (epoch, loss))
        test()

        if loss < best_loss:
            best_loss = loss
            torch.save(net, "best1.pth")
        torch.save(net, "last1.pth")


    plt.title("Graph")
    plt.plot(range(Epoches), loss_list)
    plt.ylabel("loss")
    plt.xlabel("epoch")
    plt.show()

Source: blog.csdn.net/m0_48095841/article/details/125751332