PyTorch advanced learning (eight): using the trained neural network model for image prediction

Course resources:

[PyTorch for primary school students] 9. Use your model to make predictions (1) (bilibili)

Previous notes:

PyTorch advanced learning (four): use different classification models for data training (alexnet, resnet, vgg, etc.) - Programmer Sought


Table of contents

One, principle introduction

1. Load the model and parameters

2. Read the picture

3. Image preprocessing

4. Convert image to tensor

5. Increase the dimension of batch_size

6. Model Validation

6.1 Preliminary output of the model

6.2 Output the highest probability and its index

6.3 Convert tensor to numpy

6.4 Prediction categories

Two, the code

1. Make predictions on a single image

2. Predict the entire folder picture


Now that the model has been trained in the previous sections, we can feed it our own images for prediction. The process is similar to the training process. In the project directory, the pic folder contains the photos to predict.

One, principle introduction

1. Load the model and parameters

The model architecture is resnet18, and its parameters are loaded from the weight file "model_resnet18_100.pth" that was saved during training.

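import torch
from torchvision.models import resnet18

# Use the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

'''
    Load the model and its parameters
'''

# Build the model (weights=None requires torchvision >= 0.13; older versions use pretrained=False)
model = resnet18(weights=None, num_classes=5).to(device)  # 43.6%

# Load the trained parameters; map_location places them on the current device,
# so a weight file saved on a GPU also loads on a CPU-only machine
model.load_state_dict(torch.load("model_resnet18_100.pth", map_location=device))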
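
The save and load round trip can be tried without the real weight file. The following is only a minimal sketch: build_model is a hypothetical stand-in (a tiny linear layer instead of resnet18), and an in-memory buffer replaces the .pth file. It illustrates that the architecture has to be rebuilt first and that the parameters are then copied into it.

```python
import io

import torch
from torch import nn


def build_model() -> nn.Module:
    # Hypothetical stand-in for the trained network (not resnet18).
    return nn.Linear(4, 5)


trained = build_model()

# Save only the parameters (the state_dict). An in-memory buffer is used
# here instead of a .pth file so the sketch leaves nothing on disk.
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# Rebuild the same architecture, then load the saved parameters into it.
# map_location="cpu" remaps tensors that were saved on a GPU, so loading
# also works on a machine without CUDA.
restored = build_model()
state_dict = torch.load(buffer, map_location="cpu")
restored.load_state_dict(state_dict)

same_weights = torch.equal(trained.weight, restored.weight)
print(same_weights)
```

If the rebuilt architecture did not match the saved parameters (for example a different num_classes), load_state_dict would raise an error instead of loading silently.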

2. Read the picture

The image we want to predict is sunflower1.jpg in the pic folder.

img_path = './pic/sunflower1.jpg'

 

3. Image preprocessing

Image.open opens the image, convert('RGB') converts it to RGB format, and padding_black scales the image to fit inside 224x224 and pads the remaining area with black.

img = Image.open(img_path)  # open the image
img = img.convert('RGB')  # convert to RGB format
# Pad to a square with black borders
img = padding_black(img)
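
To see what padding_black does, here is a small sketch that applies the same logic to a synthetic image. The 400x200 white test image is made up for illustration; the function body follows the one in the full script in part Two, with a tuple passed to resize.

```python
from PIL import Image


def padding_black(img):
    # Scale the image so that its longer side becomes 224 pixels,
    # then paste it centered onto a 224x224 black background.
    w, h = img.size
    scale = 224.0 / max(w, h)
    img_fg = img.resize((int(w * scale), int(h * scale)))
    size_fg = img_fg.size
    size_bg = 224
    img_bg = Image.new("RGB", (size_bg, size_bg))  # black by default
    img_bg.paste(img_fg, ((size_bg - size_fg[0]) // 2,
                          (size_bg - size_fg[1]) // 2))
    return img_bg


# Made-up test image: a wide white rectangle.
wide = Image.new("RGB", (400, 200), "white")
padded = padding_black(wide)

print(padded.size)                 # the output is always square
print(padded.getpixel((0, 0)))     # a corner lies in the black padding
print(padded.getpixel((112, 112))) # the center lies in the scaled picture
```

The aspect ratio of the picture is preserved, which a plain resize to 224x224 would not do.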

4. Convert image to tensor

val_tf = transforms.Compose([
                transforms.Resize(224),
                transforms.ToTensor(),
                transform_BZ    # normalization (defined in the full script in part Two)
            ])

# Convert the image to a tensor
img_tensor = val_tf(img)

5. Increase the dimension of batch_size

If the image tensor is passed to the model directly, the model raises an error about the number of input dimensions.

Reason:

The model expects a four-dimensional input, but our image tensor has only three dimensions. The first of the four required dimensions is batch_size. When we trained the model, batch_size was 64, but a single picture does not have this dimension, so we need to add a batch dimension to the image tensor before passing it in.

  • dim=0 inserts the new dimension at the first position
# Add the batch_size dimension
img_tensor = torch.unsqueeze(img_tensor, dim=0).float().to(device)
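
The effect of unsqueeze is easiest to see on the tensor shapes. A minimal sketch, using a zero tensor as a placeholder for a preprocessed image:

```python
import torch

# Placeholder for a preprocessed image: (channels, height, width).
img_tensor = torch.zeros(3, 224, 224)

# Insert a new dimension of size 1 at position 0: the batch dimension.
batched = torch.unsqueeze(img_tensor, dim=0)

print(img_tensor.shape)  # three dimensions
print(batched.shape)     # four dimensions, batch size 1

# Indexing with None is an equivalent shorthand for unsqueeze(0).
same_as_indexing = batched.shape == img_tensor[None].shape

# squeeze(0) removes the batch dimension again.
restored = batched.squeeze(0)
```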

6. Model Validation

6.1 Preliminary output of the model

Passing the image tensor through the model gives an output tensor containing 5 numbers, one score per class.

model.eval()
# Disable gradient tracking during inference
with torch.no_grad():
    output_tensor = model(img_tensor)
    print(output_tensor)

But these numbers are not between 0 and 1, so they are not the per-class probabilities we need. We use softmax to normalize them.

    # Convert the outputs to probabilities with softmax
    output = torch.softmax(output_tensor,dim=1)
    print(output)

After the softmax operation, the results are printed in scientific notation, and the 5 numbers sum to 1.
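
The normalization can be checked by hand. In the sketch below the logits are made-up values, not the real model output. It compares torch.softmax with the formula exp(x_i) / sum_j exp(x_j) and turns off scientific notation for printing.

```python
import torch

# Made-up raw scores for one image and 5 classes (not real model output).
logits = torch.tensor([[-2.0, 0.5, 1.0, 9.0, 3.0]])

# dim=1 normalizes across the class dimension of each row.
probs = torch.softmax(logits, dim=1)

# The same result written out by hand: exp(x_i) / sum_j exp(x_j).
exp = torch.exp(logits)
manual = exp / exp.sum(dim=1, keepdim=True)

# Print fixed-point numbers instead of scientific notation.
torch.set_printoptions(sci_mode=False, precision=4)
print(probs)
print(probs.sum(dim=1))  # each row sums to 1 (up to floating-point rounding)
```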

6.2 Output the highest probability and its index

    # Take the most likely class: its probability and its index
    pred_value, pred_index = torch.max(output, 1)
    print(pred_value)
    print(pred_index)

In the output, the highest probability is 1, that is 100%, and its index is 3, which corresponds to the fourth category, sunflower.
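
When torch.max is given a dimension, it returns both the maximum values and their positions. A small sketch with made-up probabilities (not the real output):

```python
import torch

# Made-up probabilities for one image and 5 classes.
probs = torch.tensor([[0.01, 0.02, 0.03, 0.90, 0.04]])

# With a dim argument, torch.max returns (values, indices) along that dim.
pred_value, pred_index = torch.max(probs, dim=1)
print(pred_value)  # the highest probability in each row
print(pred_index)  # the position of that probability in each row

# If only the index is needed, argmax gives the same positions.
index_only = probs.argmax(dim=1)
```

Both results keep one entry per image in the batch, which is why the later code indexes them with [0].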

 6.3 Convert tensor to numpy

The values from the previous step are still tensors. We first convert them to numpy arrays, and then use the index to look up the class name.

# Move the data from CUDA back to the CPU and convert it to numpy
pred_value = pred_value.detach().cpu().numpy()
pred_index = pred_index.detach().cpu().numpy()

print(pred_value)
print(pred_index)

The printed result shows that the values are now numpy arrays, without the tensor(...) wrapper.
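
A short sketch of the conversion, again with made-up values. The .item() call at the end is an alternative that is not used in this post: for a single-element tensor it returns a plain Python number directly.

```python
import numpy as np
import torch

# Made-up results for a batch of one image.
pred_value = torch.tensor([0.9])
pred_index = torch.tensor([3])

# detach() drops any autograd history, cpu() moves the data to host memory
# (a no-op if it is already there), numpy() converts it to an ndarray.
value_np = pred_value.detach().cpu().numpy()
index_np = pred_index.detach().cpu().numpy()

print(type(value_np), value_np)
print(type(index_np), index_np)

# Alternative for single-element tensors: a plain Python number.
index_as_int = pred_index.item()
```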

6.4 Prediction categories

Write out the list of class names. Its order must match the order of the labels in the training set.

classes = ["daisy", "dandelion", "rose", "sunflower", "tulip"]

print("Predicted class: ",classes[pred_index[0]]," probability: ",pred_value[0]*100,"%")

The printout shows that the prediction is correct and that the predicted probability is high.
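
How the label order comes about depends on how the training set was built. As an assumption for illustration only: if the data was loaded with torchvision.datasets.ImageFolder, class indices follow the alphabetical order of the class folder names. The sketch below rebuilds that order from a temporary, made-up folder layout.

```python
import os
import tempfile

# Made-up dataset layout: one sub-folder per class, created in random order.
with tempfile.TemporaryDirectory() as root:
    for name in ["tulip", "rose", "daisy", "sunflower", "dandelion"]:
        os.mkdir(os.path.join(root, name))

    # Assumption: indices are assigned by sorted folder name,
    # which is how ImageFolder orders its classes.
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())

print(classes)
print(classes[3])  # the class behind index 3
```

Under this assumption the result matches the hand-written classes list above, so index 3 maps to sunflower.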

Two, the code

1. Make predictions on a single image

'''
    Purpose: load a single image from a path and predict its class
    Author: Leo在这

'''
from torchvision.models import resnet18
import torch
from PIL import Image
import torchvision.transforms as transforms

# Use the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

'''
    Load the model and its parameters
'''

# Build the model (weights=None requires torchvision >= 0.13; older versions use pretrained=False)
model = resnet18(weights=None, num_classes=5).to(device)  # 43.6%

# Load the trained parameters; map_location places them on the current device,
# so a weight file saved on a GPU also loads on a CPU-only machine
model.load_state_dict(torch.load("model_resnet18_100.pth", map_location=device))

'''
    Image path
'''
img_path = './pic/sunflower1.jpg'

'''
    Image preprocessing
'''
# Image normalization
transform_BZ= transforms.Normalize(
    mean=[0.5, 0.5, 0.5],  # depends on the dataset
    std=[0.5, 0.5, 0.5]
)

val_tf = transforms.Compose([
                transforms.Resize(224),
                transforms.ToTensor(),
                transform_BZ    # normalization
            ])


def padding_black(img):  # scale the image to fit 224x224 and pad the rest with black
    w, h = img.size
    scale = 224. / max(w, h)
    img_fg = img.resize([int(x) for x in [w * scale, h * scale]])
    size_fg = img_fg.size
    size_bg = 224
    img_bg = Image.new("RGB", (size_bg, size_bg))
    img_bg.paste(img_fg, ((size_bg - size_fg[0]) // 2,
                          (size_bg - size_fg[1]) // 2))
    img = img_bg
    return img

# Open the image and convert it to RGB

img = Image.open(img_path)  # open the image
img = img.convert('RGB')  # convert to RGB format
# Pad to a square with black borders
img = padding_black(img)
# print(type(img))

# Convert the image to a tensor
img_tensor = val_tf(img)
# print(type(img_tensor))

# Add the batch_size dimension
img_tensor = torch.unsqueeze(img_tensor, dim=0).float().to(device)


'''
    Run the model and convert its output
'''
model.eval()
# Disable gradient tracking during inference
with torch.no_grad():
    output_tensor = model(img_tensor)
    print(output_tensor)

    # Convert the outputs to probabilities with softmax
    output = torch.softmax(output_tensor,dim=1)
    print(output)

    # Take the most likely class: its probability and its index
    pred_value, pred_index = torch.max(output, 1)
    print(pred_value)
    print(pred_index)

    # Move the data from CUDA back to the CPU and convert it to numpy
    pred_value = pred_value.detach().cpu().numpy()
    pred_index = pred_index.detach().cpu().numpy()
    print(pred_value)
    print(pred_index)

    # Map the index to its class name
    classes = ["daisy", "dandelion", "rose", "sunflower", "tulip"]
    print("Predicted class: ",classes[pred_index[0]]," probability: ",pred_value[0]*100,"%")

2. Predict the entire folder picture

To predict every picture in the pic folder, the steps are the same as for a single picture, with a for loop that traverses the files in the folder.

'''
    Purpose: predict every image in a folder
    Author: Leo在这
'''

from torchvision.models import resnet18
import torch
from PIL import Image
import torchvision.transforms as transforms

import os

# Use the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

'''
    Load the model and its parameters
'''

# Build the model (weights=None requires torchvision >= 0.13; older versions use pretrained=False)
model = resnet18(weights=None, num_classes=5).to(device)  # 43.6%

# Load the trained parameters; map_location places them on the current device,
# so a weight file saved on a GPU also loads on a CPU-only machine
model.load_state_dict(torch.load("model_resnet18_100.pth", map_location=device))
'''
    Image preprocessing
'''

# Image normalization
transform_BZ= transforms.Normalize(
    mean=[0.5, 0.5, 0.5],  # depends on the dataset
    std=[0.5, 0.5, 0.5]
)

val_tf = transforms.Compose([  # resize the image and convert it to a tensor
                transforms.Resize(224),
                transforms.ToTensor(),
                transform_BZ  # normalization
            ])


def padding_black(img):  # scale the image to fit 224x224 and pad the rest with black
    w, h = img.size
    scale = 224. / max(w, h)
    img_fg = img.resize([int(x) for x in [w * scale, h * scale]])
    size_fg = img_fg.size
    size_bg = 224
    img_bg = Image.new("RGB", (size_bg, size_bg))
    img_bg.paste(img_fg, ((size_bg - size_fg[0]) // 2,
                          (size_bg - size_fg[1]) // 2))
    img = img_bg
    return img


dir_loc = r"./pic"

# Class names, in the same order as the training labels
classes = ["daisy", "dandelion", "rose", "sunflower", "tulip"]

model.eval()
with torch.no_grad():
    # Walk the folder tree and predict every file found
    for dirpath, _dirnames, filenames in os.walk(dir_loc):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # print(full_path)

            img = Image.open(full_path)  # open the image
            img = img.convert('RGB')  # convert to RGB format
            img = padding_black(img)
            # print(type(img))

            img_tensor = val_tf(img)
            # print(type(img_tensor))

            # Add the batch_size dimension
            img_tensor = torch.unsqueeze(img_tensor, dim=0).float().to(device)

            '''
                Run the model and convert its output
            '''

            output_tensor = model(img_tensor)
            # Convert the outputs to probabilities with softmax
            output = torch.softmax(output_tensor,dim=1)

            # Take the most likely class: its probability and its index
            pred_value, pred_index = torch.max(output, 1)

            pred_value = pred_value.detach().cpu().numpy()
            pred_index = pred_index.detach().cpu().numpy()

            print("Predicted class: ",classes[pred_index[0]]," probability: ",pred_value[0]*100,"%")

Running the script prints one prediction line for each image in the folder.
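
The loop above hands every file under pic to Image.open, so a stray non-image file would stop the run with an error. One possible guard, shown only as a sketch, is to collect the image paths first. list_image_files and IMAGE_EXTENSIONS are hypothetical names introduced here, and the extension set is an assumption to adjust to your data.

```python
import os
import tempfile

# Assumed set of image extensions; adjust it to your own data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_image_files(root):
    """Return the paths of all image files under root, in sorted order."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare the lower-cased extension so that .JPG and .jpg both match.
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


# Demo on a temporary folder with made-up (empty) files.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ["sunflower1.jpg", "rose.PNG", "notes.txt", os.path.join("sub", "tulip.jpeg")]:
        open(os.path.join(root, rel), "w").close()

    found = [os.path.basename(p) for p in list_image_files(root)]

print(found)  # notes.txt is skipped
```

In the folder script, the two nested for loops could then be replaced by a single loop over list_image_files(dir_loc).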


Origin blog.csdn.net/weixin_45662399/article/details/130146237