6. The complete training process of a neural network (using the MNIST dataset as an example)

1. Download the dataset

MNIST dataset
Unzip the downloaded dataset and place it in a directory at the same level as the project.

2. Import packages

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import matplotlib.pyplot as plt
import numpy as np
from torchvision import datasets,transforms
%matplotlib inline

%matplotlib inline embeds plots directly in the notebook output, so plt.show() can be omitted.

3. Load the dataset

Set a few parameters, then build the datasets and data loaders.
Each MNIST image is 28*28 pixels, and there are 10 classes in total.

# input images are 28*28 with 10 classes; loop over the full training set 3 times, 64 images per batch
input_size = 28
num_classes = 10
num_epochs = 3
batch_size = 64

train_dataset = datasets.MNIST(root="./data/",train=True,transform=transforms.ToTensor(),download=True)
test_dataset = datasets.MNIST(root="./data/",train=False,transform=transforms.ToTensor(),download=True)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,batch_size=batch_size,shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,batch_size=batch_size,shuffle=True)
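
As a quick sanity check (not shown in the original post), one batch can be pulled from train_loader to confirm the shapes; the variable names below are just for illustration:

# take one batch from the training loader and inspect its shapes
images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 28, 28]) -> [batch, channel, height, width]
print(labels.shape)  # torch.Size([64])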

4. Model architecture

The model consists of two convolutional blocks (layer1 and layer2) and one output layer.

1. layer1: convolution layer, activation function, max-pooling layer

Each sample has a single color channel and is 28*28 pixels, so the input shape is [1,28,28].

Convolution layer:
The kernel size is 5*5, there are 16 kernels, the stride is 1, and the padding is 2.
From the output-size formula, the feature map produced by the convolution layer has shape [16,28,28] (see the quick check below).
Pooling layer: the pooling kernel is 2*2, which only halves the spatial size of the feature map, i.e. 28*28 becomes 14*14.
So the feature map after layer1 has shape [16,14,14].
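
Those numbers can be verified with the standard output-size formula out = (in - kernel + 2*padding) / stride + 1; a minimal sketch (the helper name conv_out is just for illustration):

# out = (in - kernel + 2*padding) // stride + 1
def conv_out(size, kernel, stride, padding):
    return (size - kernel + 2 * padding) // stride + 1

print(conv_out(28, 5, 1, 2))  # 28 -> the 5*5 convolution with padding 2 keeps 28*28
print(conv_out(28, 2, 2, 0))  # 14 -> the 2*2 max pooling halves it to 14*14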

2. layer2: convolution layer, activation function, max-pooling layer

The feature map coming out of layer1 has shape [16,14,14].

Convolution layer:
The kernel size is 5*5, there are 32 kernels, the stride is 1, and the padding is 2.
By the same formula, the convolution output has shape [32,14,14].

Pooling layer: the pooling kernel is 2*2, which again halves the spatial size, i.e. 14*14 becomes 7*7.
So the feature map after layer2 has shape [32,7,7] (checked again below).
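
The same helper confirms the layer2 shapes (assuming conv_out from the sketch above):

print(conv_out(14, 5, 1, 2))  # 14 -> the 5*5 convolution with padding 2 keeps 14*14
print(conv_out(14, 2, 2, 0))  # 7  -> the 2*2 max pooling halves it to 7*7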

3. The output layer is a fully connected (linear) layer

The feature map after layer2 has shape [32,7,7].
First the feature map is flattened into one row with x.view(x.size(0),-1): the features of each image occupy one row, and with a whole batch of images this gives a tensor of shape (batch_size, 32*7*7).

Since the final task is a ten-class classification, the flattened features are then transformed by a linear layer of shape [32*7*7,10] into ten output values per image, i.e. [batch_size,10]: each image gets ten scores, one for each of the predicted digits 0-9.
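
A minimal sketch of that flatten-plus-linear step on a dummy tensor, just to show the shapes (a batch size of 64 is assumed here):

x = torch.randn(64, 32, 7, 7)   # pretend feature map after layer2 for a batch of 64
x = x.view(x.size(0), -1)       # flatten -> torch.Size([64, 1568])
fc = nn.Linear(in_features=32*7*7, out_features=10)
print(fc(x).shape)              # torch.Size([64, 10]), one score per digit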

class yy_model(nn.Module):
    def __init__(self):
        super(yy_model,self).__init__()
        self.layer1 = nn.Sequential(   # [1,28,28] -> [16,14,14]
            nn.Conv2d(in_channels=1,out_channels=16,kernel_size=5,stride=1,padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.layer2 = nn.Sequential(   # [16,14,14] -> [32,7,7]
            nn.Conv2d(in_channels=16,out_channels=32,kernel_size=5,stride=1,padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.out = nn.Linear(in_features=7*7*32,out_features=10)   # [32*7*7] -> 10 classes
        
    def forward(self,x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = x.view(x.size(0),-1) #(batch_size, 32 * 7 * 7)
        output = self.out(x)
        return output
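
Before training, a dummy batch can be passed through the model as a quick shape check (a sketch, not part of the original post):

model = yy_model()
dummy = torch.randn(64, 1, 28, 28)   # a fake batch: [batch, channel, height, width]
print(model(dummy).shape)            # torch.Size([64, 10])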

5. Accuracy

The function takes the model predictions and the true labels: the index of the largest value in each prediction row is the digit the model predicts, and it is compared against the corresponding label.

def accuracy(predictions, labels):
    pred = torch.max(predictions.data, 1)[1]           # index of the largest score = predicted digit
    rights = pred.eq(labels.data.view_as(pred)).sum()  # number of correct predictions in the batch
    return rights, len(labels)                         # (correct count, batch size)
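
A small usage example with made-up scores (the values are chosen only for illustration):

scores = torch.tensor([[0.1, 2.0, 0.3],
                       [1.5, 0.2, 0.1]])   # two samples, three fake classes
labels = torch.tensor([1, 2])              # the second label is wrong on purpose
print(accuracy(scores, labels))            # (tensor(1), 2) -> 1 correct out of 2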

6. Model Training

# Instantiate the model
net = yy_model() 
# Loss function
criterion = nn.CrossEntropyLoss() 
# Optimizer: Adam with a learning rate of 0.001
optimizer = optim.Adam(net.parameters(), lr=0.001)

# Start the training loop
for epoch in range(num_epochs):
    # store the results of the current epoch
    train_rights = [] 
    
    for batch_idx, (data, target) in enumerate(train_loader):  # loop over every batch in the loader
        net.train()                             
        output = net(data) 
        loss = criterion(output, target) 
        optimizer.zero_grad() 
        loss.backward() 
        optimizer.step() 
        right = accuracy(output, target) 
        train_rights.append(right) 

    
        if batch_idx % 100 == 0: 
            
            net.eval() 
            val_rights = [] 
            
            for (data, target) in test_loader:
                output = net(data) 
                right = accuracy(output, target) 
                val_rights.append(right)
                
            # accuracy calculation
            train_r = (sum([tup[0] for tup in train_rights]), sum([tup[1] for tup in train_rights]))
            val_r = (sum([tup[0] for tup in val_rights]), sum([tup[1] for tup in val_rights]))

            print('Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}\tTrain accuracy: {:.2f}%\tTest accuracy: {:.2f}%'.format(
                epoch, batch_idx * batch_size, len(train_loader.dataset),
                100. * batch_idx / len(train_loader), 
                loss.data, 
                100. * train_r[0].numpy() / train_r[1], 
                100. * val_r[0].numpy() / val_r[1]))
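
After training, an overall test-set accuracy can be computed with one more pass over test_loader (a sketch built on the accuracy helper above; torch.no_grad() is added here to skip gradient tracking):

net.eval()
correct, total = 0, 0
with torch.no_grad():                        # no gradients needed for evaluation
    for data, target in test_loader:
        rights, n = accuracy(net(data), target)
        correct += rights.item()
        total += n
print('Final test accuracy: {:.2f}%'.format(100. * correct / total))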

7. Test the trained model

The training result looks good: the actual label of the sample is 4, and the model also predicts 4.

x,y = train_dataset[9] # the sample at index 9: x is the image, its label is 4
x.shape # torch.Size([1, 28, 28])
y # 4
x = x.view(-1,1,28,28) # the network expects input shaped [B,C,H,W], so reshape accordingly
x.shape # torch.Size([1, 1, 28, 28])
y_hat = net(x) # model prediction
y_hat
"""
tensor([[ -7.2561,  -3.0549,  -3.3932,  -6.8128,  10.3861, -11.8726,  -7.2241,
          -0.6564,  -2.5825, -10.9693]], grad_fn=<AddmmBackward0>)
"""
pred_maxvalue, pred_maxindex = torch.max(y_hat,dim=1) # the largest value and its index
pred_maxvalue # tensor([10.3861], grad_fn=<MaxBackward0>)
pred_maxindex # tensor([4])
pred_maxindex.item() # 4
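
Since matplotlib was imported at the top, the sample can also be displayed as an image (a small sketch; .numpy() and the reshape are only for plotting):

plt.imshow(x.view(28, 28).numpy(), cmap='gray')   # back to 28*28 for display
plt.title('label: {}, prediction: {}'.format(y, pred_maxindex.item()))
plt.show()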


Origin: blog.csdn.net/qq_41264055/article/details/131430177