Building a convolutional neural network (CNN) for handwritten digit recognition with PyTorch

1. Introduction to convolutional neural networks

A convolutional neural network (CNN) is a class of feedforward neural network that contains convolution computations and has a deep structure; it is one of the representative algorithms of deep learning. Convolutional neural networks have representation learning capability and can perform shift-invariant classification of input information according to their hierarchical structure, which is why they are also called "Shift-Invariant Artificial Neural Networks" (SIANN).

2. Convolutional neural network architecture 

Convolutional neural networks mainly consist of convolutional layers, pooling (subsampling) layers (usually max pooling), and fully connected (FC) layers.

3. Implementing a convolutional neural network in PyTorch

  • Convolution layer: nn.Conv2d() 

Its parameters are as follows:
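A minimal usage sketch of nn.Conv2d with its main parameters and their defaults (the channel and kernel values here match the first convolutional layer of the MNIST network defined later):

import torch.nn as nn

# nn.Conv2d(in_channels, out_channels, kernel_size,
#           stride=1, padding=0, dilation=1, groups=1, bias=True)
conv = nn.Conv2d(in_channels=1,    # channels of the input feature map (1 for grayscale images)
                 out_channels=25,  # number of convolution kernels, i.e. output channels
                 kernel_size=3,    # size of the convolution kernel (3×3)
                 stride=1,         # step of the sliding window
                 padding=0)        # zero-padding added around the input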

  • Pooling layer: nn.MaxPool2d()

Its parameters are as follows: 
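Likewise, a minimal sketch of nn.MaxPool2d with the values used for the pooling layers below:

import torch.nn as nn

# nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1,
#              return_indices=False, ceil_mode=False)
pool = nn.MaxPool2d(kernel_size=2,  # size of the pooling window (2×2)
                    stride=2)       # stride of the window; defaults to kernel_size when omitted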

4. Implementing MNIST handwritten digit recognition

A total of five layers are defined: two convolutional layers, two pooling layers, and a final FC block for the classification output. The network structure is as follows:
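Reading the channel counts and kernel sizes off the code in Section 5, the layer sequence is:

Input (1×28×28) → Conv2d(1→25, 3×3) + BatchNorm + ReLU → MaxPool(2×2) → Conv2d(25→50, 3×3) + BatchNorm + ReLU → MaxPool(2×2) → Flatten → FC (1250 → 1024 → 128 → 10)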

 The specific image size is calculated as follows:
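For a convolution or pooling layer, the output spatial size is

output = floor((input − kernel_size + 2 × padding) / stride) + 1

Applied to a 28×28 MNIST image with the layers above (no padding):

conv1 (3×3, stride 1): 28 − 3 + 1 = 26 → 25 × 26 × 26
pool1 (2×2, stride 2): floor((26 − 2) / 2) + 1 = 13 → 25 × 13 × 13
conv2 (3×3, stride 1): 13 − 3 + 1 = 11 → 50 × 11 × 11
pool2 (2×2, stride 2): floor((11 − 2) / 2) + 1 = 5 → 50 × 5 × 5

Flattening gives 50 × 5 × 5 = 1250 features, which is exactly the in_features of the first FC layer, nn.Linear(50 * 5 * 5, 1024).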

5. Code implementation

import torch
from torchvision import transforms  # commonly used image transformation utilities
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn.functional as F

batch_size = 64
transform = transforms.Compose(
    [
        transforms.ToTensor(),  # convert the data to tensors
        transforms.Normalize((0.1307,), (0.3081,))  # 0.1307 is the mean and 0.3081 the standard deviation of MNIST
    ]
)
train_dataset = datasets.MNIST(root='../dataset/mnist',
                               train=True,
                               download=True,
                               transform=transform)
train_loader = DataLoader(train_dataset,
                          shuffle=True,
                          batch_size=batch_size)
test_dataset = datasets.MNIST(root='../dataset/mnist',
                              train=False,
                              download=True,
                              transform=transform)
test_loader = DataLoader(test_dataset,
                         shuffle=True,
                         batch_size=batch_size)


class CNN(torch.nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(1, 25, kernel_size=3),
            torch.nn.BatchNorm2d(25),
            torch.nn.ReLU(inplace=True)
        )

        self.layer2 = torch.nn.Sequential(
            torch.nn.MaxPool2d(kernel_size=2, stride=2)
        )

        self.layer3 = torch.nn.Sequential(
            torch.nn.Conv2d(25, 50, kernel_size=3),
            torch.nn.BatchNorm2d(50),
            torch.nn.ReLU(inplace=True)
        )

        self.layer4 = torch.nn.Sequential(
            torch.nn.MaxPool2d(kernel_size=2, stride=2)
        )

        self.fc = torch.nn.Sequential(
            torch.nn.Linear(50 * 5 * 5, 1024),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(1024, 128),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(128, 10)
        )

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = x.view(x.size(0), -1)  # flatten the feature maps before the fully connected layers
        x = self.fc(x)
        return x


model = CNN()
# The next two lines run the model on the GPU if one is available, otherwise on the CPU.
# "cuda:0" refers to the first GPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)
criterion = torch.nn.CrossEntropyLoss()  # cross-entropy loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.5)  # momentum helps push past local minima


def train(epoch):
    model.train()  # make sure the BatchNorm layers are in training mode
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        inputs, target = inputs.to(device), target.to(device)  # move the data to the GPU if available
        optimizer.zero_grad()
        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:  # print every 300 mini-batches rather than after every batch
            print('[%d,%5d] loss:%.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0
    torch.save(model, 'model_{}.pth'.format(epoch))  # save the whole model after each epoch


def test():
    model.eval()  # switch the BatchNorm layers to evaluation mode
    correct = 0
    total = 0
    with torch.no_grad():  # no gradients are computed inside this block
        for data in test_loader:
            inputs, target = data
            inputs, target = inputs.to(device), target.to(device)  # move the data to the GPU if available
            outputs = model(inputs)
            _, predicted = torch.max(outputs.data, dim=1)  # _ is the max value in each row; predicted is its column index
            total += target.size(0)
            correct += (predicted == target).sum().item()
    print('Accuracy on test set: %d %%' % (100 * correct / total))


if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        test()
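Because train() saves the entire model object after each epoch, a checkpoint can later be reloaded for inference. A minimal sketch (the filename model_9.pth assumes the last of the 10 epochs; on recent PyTorch versions torch.load may additionally need weights_only=False to unpickle a full model):

loaded = torch.load('model_9.pth', map_location=device)  # load the saved model onto the current device
loaded.eval()  # switch BatchNorm layers to evaluation mode before inference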

6. Results
