Recognizing MNIST Handwritten Digits with a CNN (Convolutional Neural Network) in PyTorch

This post shows how to classify the classic MNIST handwritten digits with a convolutional neural network (CNN) built in the PyTorch deep learning framework.

The MNIST dataset comes from the National Institute of Standards and Technology (NIST). It contains 60,000 training images and 10,000 test images, and each image is 28*28 pixels.

PyTorch (through torchvision) can download this dataset directly. The recognition program starts with a convolutional layer (16 convolution kernels of size 5*5, stride 1, with padding at the edges), followed by a ReLU activation layer and a 2*2 max pooling layer. A second block of the same form comes next (a convolutional layer with 32 convolution kernels, an activation layer, and a downsampling layer). The last layer is a fully connected layer with 10 output neurons, one for each of the ten classes 0, 1, 2, 3, ..., 9.
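
To get a feel for how small this network is, the number of trainable parameters can be worked out by hand. The sketch below is plain Python and does not need PyTorch. The helper names conv_params and linear_params are made up for this illustration. It assumes every layer has a bias term (the default for nn.Conv2d and nn.Linear) and takes the layer sizes from the code further down: nn.Conv2d(16,32,5,1,2) for the second convolution and nn.Linear(32*7*7,10) for the output layer.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel of a square 2-D convolution."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels


def linear_params(in_features, out_features):
    """Weights plus one bias per output neuron of a fully connected layer."""
    return in_features * out_features + out_features


conv1 = conv_params(1, 16, 5)        # 16*1*5*5 weights + 16 biases
conv2 = conv_params(16, 32, 5)       # 32*16*5*5 weights + 32 biases
fc = linear_params(32 * 7 * 7, 10)   # 1568*10 weights + 10 biases
total = conv1 + conv2 + fc

print(conv1, conv2, fc, total)       # 416 12832 15690 28938
```

Most of the parameters sit in the second convolution and the fully connected layer. The ReLU and max pooling layers have no trainable parameters.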

The shape of the data changes as it passes through the network:

Input: 1*28*28 (one color channel; each image is 28*28 pixels)

After the first convolutional layer: 16*28*28 (16 convolution kernels, stride 1, each side padded by 2 pixels, so the 28*28 size is unchanged)

After the first max pooling layer: 16*14*14 (2*2 downsampling halves the width and the height)

After the second convolutional layer: 32*14*14

After the second max pooling layer: 32*7*7

After the fully connected layer: 10 (one output per class)
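
These sizes can be checked with the usual output-size rule for convolution and pooling windows, out = floor((in + 2*padding - kernel) / stride) + 1. The sketch below applies it layer by layer in plain Python. The function name conv_out and the layer table are made up for this illustration, and the pooling stride of 2 is an assumption based on nn.MaxPool2d using its kernel size as the default stride.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, output channels or None if unchanged, window settings)
layers = [
    ("conv1", 16, dict(kernel=5, stride=1, padding=2)),
    ("pool1", None, dict(kernel=2, stride=2)),
    ("conv2", 32, dict(kernel=5, stride=1, padding=2)),
    ("pool2", None, dict(kernel=2, stride=2)),
]

channels, size = 1, 28                 # input: 1*28*28
trace = [("input", channels, size)]
for name, out_channels, cfg in layers:
    if out_channels is not None:       # pooling keeps the channel count
        channels = out_channels
    size = conv_out(size, **cfg)
    trace.append((name, channels, size))

for name, c, s in trace:
    print(f"{name}: {c}*{s}*{s}")

flat_features = channels * size * size  # inputs to the fully connected layer
print("flattened:", flat_features)      # 1568
```

With a 5*5 kernel and stride 1, a padding of 2 is what keeps the side length at 28: (28 + 4 - 5) / 1 + 1 = 28.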


The complete code:

import torch
import torch.nn as nn
from torch.autograd import Variable  # Deprecated since PyTorch 0.4: tensors no longer need wrapping
import torch.utils.data as Data
import torchvision

torch.manual_seed(1)  # Fix the random seed so results are reproducible

# Hyperparameters
EPOCH = 5  # Number of passes over the training set
BATCH_SIZE = 50
LR = 0.01  # Learning rate
DOWNLOAD_MNIST = True  # Whether to download the data

train_data = torchvision.datasets.MNIST(
    root='./mnist/',  # Where the data is stored
    train=True,  # Load the training split
    transform=torchvision.transforms.ToTensor(),  # Convert images to float tensors in [0, 1]
    download=DOWNLOAD_MNIST,
)

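# Test split; the raw uint8 images are scaled by hand below, so no transform is passed
test_data = torchvision.datasets.MNIST(root='./mnist/', train=False)

train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)

# Use the first 2000 test images: add a channel dimension (N, 1, 28, 28) and scale to [0, 1]
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000] / 255
test_y = test_data.targets[:2000]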

class CNN(nn.Module):
    def __init__(self):
        super(CNN,self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,  # Grayscale input: a single channel
                out_channels=16,  # Number of convolution kernels
                kernel_size=5,  # 5*5 convolution kernels
                stride=1,  # Move the kernel one pixel at a time
                padding=2,  # Pad each side by 2 pixels so the 28*28 size is kept
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(16, 32, 5, 1, 2),  # 16 -> 32 channels, 5*5 kernel, stride 1, padding 2
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.out = nn.Linear(32*7*7,10)

    def forward(self,x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)  # Flatten the feature maps to (batch_size, 32*7*7)
        output = self.out(x)
        return output

cnn = CNN()
print(cnn)

optimizer = torch.optim.Adam(cnn.parameters(),lr=LR)
loss_func = nn.CrossEntropyLoss()

for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(train_loader):
        output = cnn(b_x)  # Forward pass
        loss = loss_func(output, b_y)
        optimizer.zero_grad()  # Clear gradients left over from the previous step
        loss.backward()  # Backpropagate
        optimizer.step()  # Update the weights

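    # After each epoch, evaluate on the 2000 test images without tracking gradients
    with torch.no_grad():
        test_output = cnn(test_x)
    pred_y = torch.max(test_output, 1)[1]  # Index of the largest score = predicted digit
    accuracy = (pred_y == test_y).sum().item() / float(test_y.size(0))
    # The reported train loss is that of the last batch of the epoch
    print('Epoch: ', epoch, '| train loss: %.4f' % loss.item(), '| test accuracy: %.2f' % accuracy)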

# Compare predictions with the true labels for the first 10 test images
with torch.no_grad():
    test_output = cnn(test_x[:10])
pred_y = torch.max(test_output, 1)[1].numpy()
print(pred_y, 'predicted digits')
print(test_y[:10].numpy(), 'true digits')
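
The evaluation lines above rely on torch.max(test_output, 1)[1], which returns the index of the largest score in each row, that is, the predicted digit. As a rough illustration of that step without PyTorch, the sketch below does the same on plain Python lists. The scores and labels are made-up toy values with three classes, not real model output, and the function names argmax_rows and accuracy are hypothetical.

```python
def argmax_rows(scores):
    """Index of the largest value in each row (the predicted class)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Toy scores for 4 samples and 3 classes (illustrative values only)
toy_scores = [
    [0.1, 2.3, -1.0],
    [1.5, 0.2, 0.9],
    [-0.3, 0.0, 0.4],
    [0.7, 0.6, 0.1],
]
toy_labels = [1, 0, 2, 1]

predictions = argmax_rows(toy_scores)
print(predictions)                        # [1, 0, 2, 0]
print(accuracy(predictions, toy_labels))  # 0.75
```

No softmax is needed for this step: softmax preserves the ordering of the scores, so the largest raw score and the largest probability fall on the same class.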

Running the script prints the structure of the CNN (from print(cnn)), then the training loss and the accuracy on the first 2000 test images after each epoch, and finally the predicted and true digits for the first 10 test images.

