Building neural networks using PyTorch

Typical process for building a neural network

  • Define a neural network with learnable parameters
  • Iterate over the training dataset
  • Pass the input data through the network
  • Calculate the loss value
  • Backpropagate the gradients of the network parameters
  • Update the weights of the network according to an update rule (a minimal sketch of this loop follows the list)
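
Putting these steps together, one pass over the data might look like the following minimal sketch (hypothetical names: train_loader stands for a DataLoader over the training set; net, criterion and optimizer are the network, loss function and optimizer defined later in this section):

for data, target in train_loader:      # iterate over the training dataset (train_loader is hypothetical)
    optimizer.zero_grad()              # clear previously accumulated gradients
    output = net(data)                 # pass the input data through the network
    loss = criterion(output, target)   # calculate the loss value
    loss.backward()                    # backpropagate the gradients
    optimizer.step()                   # update the weights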

We first define a neural network in PyTorch:

# Import the required packages
import torch
import torch.nn as nn
import torch.nn.functional as F


# Define a simple network class
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # First convolutional layer: input channels = 1, output channels = 6, kernel size 3x3
        self.conv1 = nn.Conv2d(1, 6, 3)
        # Second convolutional layer: input channels = 6, output channels = 16, kernel size 3x3
        self.conv2 = nn.Conv2d(6, 16, 3)
        # Three fully connected layers
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # A single number is enough when the pooling window is square
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # Flatten all dimensions except the batch dimension
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        # Compute the size of all dimensions except dim 0 (the batch dimension)
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)

Running this prints the structure of the network, showing each submodule and its configuration.
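
The in_features value of fc1 (16 * 6 * 6) comes from the shape of the feature maps after the two convolution/pooling stages. For the 32 * 32 input used later in this section, a quick sanity check (illustrative only, not part of the original example):

x = torch.randn(1, 1, 32, 32)
x = F.max_pool2d(F.relu(net.conv1(x)), (2, 2))  # 32 -> 30 after the 3x3 conv, -> 15 after pooling
x = F.max_pool2d(F.relu(net.conv2(x)), 2)       # 15 -> 13 after the 3x3 conv, -> 6 after pooling (floor)
print(x.shape)  # torch.Size([1, 16, 6, 6]), i.e. 16 * 6 * 6 features per sample
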
Note:
All trainable parameters in the model can be obtained through net.parameters().

params = list(net.parameters())
print(len(params))
print(params[0].size())

Running this prints the number of parameter tensors in the model and the size of the first one (the conv1 weights).
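
To see which layer each parameter tensor belongs to, you can also iterate over net.named_parameters() (a small illustrative addition):

for name, param in net.named_parameters():
    print(name, param.size())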

  • Assume the input size of the image is 32 * 32:
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

Running this prints the 1 x 10 output tensor produced by the network.

  • Once you have the output tensor, you can zero the gradients and backpropagate (here with a random gradient, just to exercise the graph).
net.zero_grad()
out.backward(torch.randn(1, 10))
  • Note
    - A neural network built with torch.nn only accepts mini-batch input; it does not accept a single sample.
    - For example, nn.Conv2d expects a 4D Tensor of shape (nSamples, nChannels, Height, Width). If your input is a single sample, call input.unsqueeze(0) to add a batch dimension and turn the 3D Tensor into a 4D one, as in the sketch below.
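
A minimal illustration of this (the input tensor here is made up for the example):

single = torch.randn(1, 32, 32)   # a made-up single sample: (nChannels, Height, Width)
batched = single.unsqueeze(0)     # add a batch dimension -> shape (1, 1, 32, 32)
out = net(batched)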

Loss function

  • A loss function takes an input pair (output, target) and computes a value that estimates how far the output is from the target.
  • torch.nn provides several different loss functions. For example, nn.MSELoss measures the mean squared error between the input and the target.
  • An example of computing a loss with nn.MSELoss:
output = net(input)
target = torch.randn(10)

# Reshape target into a 2D tensor so that it matches the shape of output
target = target.view(1, -1)
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)

Running this prints the scalar loss tensor.

  • Regarding the backpropagation chain: if we follow loss backwards through the .grad_fn attribute, we will see the complete computation graph:
input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
     -> view -> linear -> relu -> linear -> relu -> linear
     -> MSELoss
     -> loss
  • When loss.backward() is called, gradients of the loss are computed through the whole computation graph. All Tensors with requires_grad=True take part in the gradient computation, and their gradients are accumulated into their .grad attribute.
print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU

Running this prints the grad_fn objects along the backward graph (the MSELoss, Linear and ReLU nodes).
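
Following next_functions by hand quickly becomes tedious; a small hypothetical helper (not part of the original example, not a PyTorch API) can print the whole backward graph:

def print_graph(fn, depth=0):
    # Illustrative helper: recursively walk the backward graph from a grad_fn node
    if fn is None:
        return
    print('  ' * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        print_graph(next_fn, depth + 1)

print_graph(loss.grad_fn)
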
Backpropagation

  • Performing backpropagation in PyTorch is very simple: all it takes is calling loss.backward().
  • Before performing backpropagation, the gradients must be cleared to zero; otherwise gradients accumulate across batches.
    A small backpropagation example:
# Zero the gradient buffers of all parameters
net.zero_grad()

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

# Run backpropagation
loss.backward()

print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)

Running this shows the conv1 bias gradients: zero (or None, if no backward pass has run yet) before loss.backward(), and nonzero values afterwards.
Update network parameters

  • The simplest algorithm for updating parameters is SGD (stochastic gradient descent).
  • The update rule is: weight = weight - learning_rate * gradient. First, implement SGD in plain Python:
learning_rate = 0.01
for f in net.parameters():
    # In-place update: weight = weight - learning_rate * gradient
    f.data.sub_(f.grad.data * learning_rate)
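
In recent PyTorch versions the same manual update is usually written inside a torch.no_grad() block instead of going through .data (an equivalent sketch):

learning_rate = 0.01
with torch.no_grad():
    for f in net.parameters():
        # In-place update: weight = weight - learning_rate * gradient
        f -= f.grad * learning_rate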

Then use the standard approach officially recommended by PyTorch:

# Import the optimizer package; optim contains common optimization algorithms such as SGD and Adam
import torch.optim as optim

# Create an optimizer object through optim
optimizer = optim.SGD(net.parameters(), lr=0.01)

# Zero the gradient buffers through the optimizer
optimizer.zero_grad()

output = net(input)
loss = criterion(output, target)

# Backpropagate from the loss value
loss.backward()
# The parameter update is performed by a single standard call
optimizer.step()

Section Summary
Learned the typical process of building a neural network:

  • Define a neural network with learnable parameters
  • Iterate over the training dataset
  • Pass the input data through the network
  • Calculate the loss value
  • Backpropagate the gradients of the network parameters
  • Update the weights of the network according to an update rule

Learned the definition of loss function:

  • Use torch.nn.MSELoss() to compute the mean squared error.
  • When backpropagation is performed through loss.backward(), gradients of the loss are computed through the whole computation graph.
    All Tensors with requires_grad=True take part in the gradient computation, and their gradients are accumulated into their .grad attribute.

Learned the calculation method of backpropagation:

  • Performing backpropagation in PyTorch is very simple: all it takes is calling loss.backward().
  • Before performing backpropagation, the gradients must be cleared to zero; otherwise gradients accumulate across batches.
  • net.zero_grad()
  • loss.backward()

Learned how to update parameters:

  • Define an optimizer to perform optimization and update of parameters.

    optimizer = optim.SGD(net.parameters(), lr=0.01)

  • Specific parameter updates are performed through the optimizer.

    optimizer.step()

Source: blog.csdn.net/qq_41309350/article/details/133561373