Reproducing LeNet in PyTorch -- building and training a simple neural network

Introduction to the LeNet network:
First, a brief introduction to LeNet, which defined the basic components of a CNN and is widely regarded as the originator of the CNN architecture.
LeNet consists of seven layers:
C1 convolutional layer: six 5*5 convolution kernels produce six feature maps. The input image is 32*32, so each feature map is 28*28.
Number of parameters: all neurons of a feature map share the same convolution kernel, so the number of parameters is (5*5+1)*6 = 156, where 5*5 is the kernel weights and 1 is the bias.
Number of connections: each feature map contains 28*28 neurons, so the number of connections is (5*5+1)*6*28*28 = 122304.
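As a quick check (not part of the original write-up), the C1 counts can be reproduced directly with PyTorch's nn.Conv2d, assuming the 32*32 input and 28*28 output sizes stated above:

from torch import nn

# C1: 1 input channel, 6 output feature maps, 5*5 kernels
c1 = nn.Conv2d(1, 6, 5)
n_params = sum(p.numel() for p in c1.parameters())
print(n_params)            # (5*5+1)*6 = 156
print(n_params * 28 * 28)  # connections: (5*5+1)*6*28*28 = 122304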


S2 downsampling layer: the pooling unit is 2*2, so each feature map shrinks to 14*14.
Each output unit is computed by summing the four values inside a 2*2 window, multiplying by a trainable weight w, adding a trainable bias b, and passing the result through a sigmoid.
Number of parameters: each feature map shares one pair of parameters (w, b), so 2*6 = 12 parameters in total.
Number of connections: each pooling unit has 2*2+1 connections, so the total is (2*2+1)*14*14*6 = 5880.
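Note that this is the original LeNet subsampling, not a plain max pooling. A minimal sketch of such a layer (a hypothetical helper written only to illustrate the sum * w + b + sigmoid computation; the reproduction code further below simply uses nn.MaxPool2d instead) could look like this:

import torch
from torch import nn
import torch.nn.functional as F

class ClassicSubsample(nn.Module):
    # Each 2*2 window is summed, scaled by a trainable w, shifted by a
    # trainable b (one pair per feature map), then passed through a sigmoid.
    def __init__(self, channels):
        super().__init__()
        self.w = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.b = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        s = F.avg_pool2d(x, 2) * 4   # average * 4 = sum of the 2*2 window
        return torch.sigmoid(s * self.w + self.b)

# 6 feature maps of 28*28 -> 6 feature maps of 14*14, with 2*6 = 12 parameters
print(sum(p.numel() for p in ClassicSubsample(6).parameters()))  # 12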


C3 convolution layer: the input is the 6 feature maps of 14*14; 16 convolution kernels of 5*5 produce 16 feature maps of 10*10.
How do the 16 feature maps correspond to the previous pooling layer? Each C3 feature map is convolved with a subset of the S2 feature maps. Numbering the C3 maps 0, 1, ..., 15 and the S2 maps 0, 1, ..., 5, the connection table of the original LeNet works out as follows: the first six C3 maps (0-5) each take 3 of the S2 maps, maps 6-14 each take 4 of the S2 maps, and the last map (15) takes all 6.
Therefore the number of connections is:
(5*5*3+1)*10*10*6 + (5*5*4+1)*10*10*9 + (5*5*6+1)*10*10 = 151600
Number of parameters: (5*5*3+1)*6 + (5*5*4+1)*9 + (5*5*6+1)*1 = 456 + 909 + 151 = 1516
The purpose of this partial connectivity is to break the symmetry of the network and force different feature maps to extract different features.
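These two totals are easy to verify with plain arithmetic (the standard nn.Conv2d cannot express this partial connectivity, so this is just the book-keeping, not a layer definition):

# C3: 6 maps see 3 S2 maps each, 9 maps see 4 each, 1 map sees all 6
params_c3 = (5*5*3 + 1)*6 + (5*5*4 + 1)*9 + (5*5*6 + 1)*1
print(params_c3)            # 456 + 909 + 151 = 1516
print(params_c3 * 10 * 10)  # connections: 151600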


S4 pooling layer: with a 2*2 pooling unit, we get 16 feature maps of 5*5.
Number of connections: (2*2+1)*5*5*16 = 2000
Trainable parameters: 2*16 = 32 (one weight for the sum plus one bias per feature map)


C5 convolutional layer: the input is the 16 feature maps of 5*5; 120 convolution kernels of 5*5 produce 120 feature maps of 1*1. Because the feature maps output by S4 are exactly 5*5, this layer can be regarded as a fully connected layer: each kernel yields a single 1*1 output, so the layer contains 120 neurons, and each neuron is connected to all 16 feature maps of the pooling layer. Each C5 neuron therefore has 5*5*16+1 connections.
Number of connections = number of parameters: (16*5*5+1)*120 = 48120
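Because the C5 input is exactly 5*5, a 5*5 convolution over all 16 maps is numerically equivalent to a fully connected layer over the 400 flattened values; the parameter counts agree, as this small check (not from the original post) shows:

from torch import nn

c5 = nn.Conv2d(16, 120, 5)             # (16*5*5+1)*120 parameters
fc_equivalent = nn.Linear(16*5*5, 120)
print(sum(p.numel() for p in c5.parameters()))             # 48120
print(sum(p.numel() for p in fc_equivalent.parameters()))  # 48120 as well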


F6 fully connected layer: 84 neurons, each fully connected to the 120 neurons of C5; the number of connections (and parameters) is (120+1)*84 = 10164.
Output layer: a fully connected layer with 10 nodes, representing the digits 0 to 9.
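The counts for the last two layers can be verified the same way:

from torch import nn

f6 = nn.Linear(120, 84)
out = nn.Linear(84, 10)
print(sum(p.numel() for p in f6.parameters()))   # (120+1)*84 = 10164
print(sum(p.numel() for p in out.parameters()))  # (84+1)*10 = 850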

PyTorch reproduction code:
Build the model: model.py

from torch.nn import Module
from torch import nn
 
 
class Model(Module):
    def __init__(self):
        super(Model, self).__init__()
        # Convolutional layer C1: 1 input channel, 6 kernels of size 5*5
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.relu1 = nn.ReLU()
        # Downsampling layer S2: 2*2 max pooling (stride defaults to the kernel size, i.e. 2)
        self.pool1 = nn.MaxPool2d(2)
        # Convolutional layer C3: 6 input channels, 16 kernels of size 5*5
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.relu2 = nn.ReLU()
        # Pooling layer S4
        self.pool2 = nn.MaxPool2d(2)
        # Fully connected layer C5 (for 28*28 MNIST input the flattened size is 16*4*4 = 256)
        self.fc1 = nn.Linear(256, 120)
        self.relu3 = nn.ReLU()
        # Fully connected layer F6
        self.fc2 = nn.Linear(120, 84)
        self.relu4 = nn.ReLU()
        # Output layer
        self.fc3 = nn.Linear(84, 10)
        self.relu5 = nn.ReLU()
 
    # Define the forward pass
    def forward(self, x):
        y = self.conv1(x)
        y = self.relu1(y)
        y = self.pool1(y)
        y = self.conv2(y)
        y = self.relu2(y)
        y = self.pool2(y)
        # y.size(0) is the batch size
        # view reshapes the tensor
        # the multi-dimensional output of the conv/pool stack must be flattened before the fully connected layers
        # -1 lets PyTorch infer the remaining dimension automatically
        y = y.view(y.shape[0], -1)
        y = self.fc1(y)
        y = self.relu3(y)
        y = self.fc2(y)
        y = self.relu4(y)
        y = self.fc3(y)
        y = self.relu5(y)
        return y
 
 
# model = Model()
# print(model)
 
# Model(
#   (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
#   (relu1): ReLU()
#   (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
#   (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
#   (relu2): ReLU()
#   (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
#   (fc1): Linear(in_features=256, out_features=120, bias=True)
#   (relu3): ReLU()
#   (fc2): Linear(in_features=120, out_features=84, bias=True)
#   (relu4): ReLU()
#   (fc3): Linear(in_features=84, out_features=10, bias=True)
#   (relu5): ReLU()
# )
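A quick shape check (not in the original post) confirms that a batch of 28*28 MNIST images flows through the network and that the flattened feature size of 256 (16*4*4) matches fc1:

# sanity check of the forward pass, assuming model.py as defined above
import torch
from model import Model

net = Model()
x = torch.randn(4, 1, 28, 28)   # a batch of 4 fake MNIST images
print(net(x).shape)             # torch.Size([4, 10])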
 
 


Training script: train.py

from model import Model
import numpy as np
import os
 
import torch
from torchvision.datasets import mnist
from torch.nn import CrossEntropyLoss
# package containing the optimization algorithms
from torch.optim import SGD
# DataLoader provides an iterable over a dataset
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
 
if __name__ == '__main__':
    batch_size = 256
    # define the training and test datasets
    train_dataset = mnist.MNIST(root='./train', train=True, transform=ToTensor())
    test_dataset = mnist.MNIST(root='./test', train=False, transform=ToTensor())
    # wrap the datasets in data loaders
    train_loader = DataLoader(train_dataset, batch_size=batch_size)
    test_loader = DataLoader(test_dataset, batch_size=batch_size)
    # instantiate the model
    model = Model()
    # make sure the directory for saved checkpoints exists
    os.makedirs('models', exist_ok=True)
    # optimizer: plain SGD with learning rate 0.1
    sgd = SGD(model.parameters(), lr=1e-1)
    # cross-entropy loss
    cost = CrossEntropyLoss()
    # number of training epochs
    epoch = 40
 
    # training loop
    for _epoch in range(epoch):
        for idx, (train_x, train_label) in enumerate(train_loader):
            # zero array of shape (batch_size, 10); prepared for one-hot labels but not actually used, since CrossEntropyLoss takes class indices
            label_np = np.zeros((train_label.shape[0], 10))
            # reset the parameter gradients to zero
            sgd.zero_grad()
            # forward pass: predicted logits
            predict_y = model(train_x.float())
            # compute the loss
            loss = cost(predict_y, train_label.long())
            if idx % 10 == 0:
                print('idx: {}, loss: {}'.format(idx, loss.sum().item()))
            # backpropagation starts from the scalar loss and fills in the gradients
            loss.backward()
            sgd.step()
 
        # number of correct predictions
        correct = 0
        # total number of test samples
        _sum = 0
 
        # evaluate accuracy on the test set
        for idx, (test_x, test_label) in enumerate(test_loader):
            predict_y = model(test_x.float()).detach()
            predict_ys = np.argmax(predict_y, axis=-1)
            label_np = test_label.numpy()
 
            _ = predict_ys == test_label
            correct += np.sum(_.numpy(), axis=-1)
            _sum += _.shape[0]
 
        # print the accuracy
        print('accuracy: {:.2f}'.format(correct / _sum))
        # save the whole model, tagged with the accuracy reached
        torch.save(model, 'models/mnist_{:.2f}.pkl'.format(correct / _sum))
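After training, the saved model can be loaded back for inference. Since torch.save(model, ...) stores the whole module, the Model class must be importable at load time; the checkpoint name below is only an example of what the script produces, so this is a sketch rather than part of the original post:

# inference sketch: load a saved checkpoint and classify one test image
import torch
from torchvision.datasets import mnist
from torchvision.transforms import ToTensor
from model import Model  # required so torch.load can rebuild the module

# the actual file name depends on the accuracy reached during training;
# on PyTorch >= 2.6 you may need torch.load(..., weights_only=False)
net = torch.load('models/mnist_0.99.pkl')
net.eval()

test_dataset = mnist.MNIST(root='./test', train=False, transform=ToTensor())
image, label = test_dataset[0]
with torch.no_grad():
    pred = net(image.unsqueeze(0)).argmax(dim=-1).item()
print('predicted: {}, actual: {}'.format(pred, label))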
 


Training result:

Origin blog.csdn.net/candice5566/article/details/121686696