PyTorch study notes: torch.nn, the basic skeleton of a neural network, and the principle of convolution

1. Introduction to torch.nn.Module

torch.nn helps us build and train neural networks more elegantly, making the code more concise and flexible. Official documentation: torch.nn.

In the documentation, you can see that the first section is called Containers, which are equivalent to the skeleton of a neural network. The sections after Containers are used to fill in that skeleton, such as Convolution Layers and Pooling Layers; anyone with a basic background in convolutional neural networks should be familiar with these terms.

There are six modules under Containers: Module, Sequential, ModuleList, ModuleDict, ParameterList, and ParameterDict. The most commonly used is Module, the base class for all neural network modules. Its basic construction is as follows:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):  # initialization
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):  # forward propagation
        x = F.relu(self.conv1(x))  # apply the first convolution to x, then the ReLU activation
        return F.relu(self.conv2(x))  # apply the second convolution and ReLU, then return the result
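
As a quick check of this skeleton, here is a hedged usage sketch (the input shape is an assumption; Conv2d(1, 20, 5) expects a 1-channel input):

import torch

model = Model()
x = torch.randn(1, 1, 32, 32)  # dummy batch: one 1-channel 32x32 image (assumed shape)
y = model(x)
print(y.shape)  # torch.Size([1, 20, 24, 24]): each 5x5 convolution shrinks 32 -> 28 -> 24

The same stack could also be written with the Sequential container mentioned above, e.g. nn.Sequential(nn.Conv2d(1, 20, 5), nn.ReLU(), nn.Conv2d(20, 20, 5), nn.ReLU()).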

Now let's try to create a simple neural network ourselves and print the result of its forward propagation:

import torch
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):  # initialization
        super(Network, self).__init__()

    def forward(self, input):
        output = input + 1
        return output

network = Network()
x = torch.tensor(1.0)  # x is a tensor
output = network(x)  # __call__ in nn.Module invokes forward
print(output)  # tensor(2.)
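
Note that we call network(x) rather than network.forward(x): nn.Module defines __call__, which runs any registered hooks before and after dispatching to forward, so modules should always be invoked like functions.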

2. The principle of convolution in neural networks

The convolution operation in a convolutional neural network (CNN) is the equivalent of the "filter operation" in image processing. The convolution operation slides the window of a filter (convolution kernel) over the input data at a fixed interval (the stride). As shown in the figure below, at each position the elements of the filter are multiplied by the corresponding elements of the input and summed (this is sometimes called a multiply-accumulate operation), and the result is saved to the corresponding location of the output. Performing this process at all positions yields the output of the convolution operation:

[Figure: the filter slides over the input; at each position the element-wise products are summed to give one output element]
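
Since the figure may not render here, below is a minimal pure-Python sketch of the sliding-window multiply-accumulate just described; conv2d_naive is an illustrative helper written for these notes, not a PyTorch API:

def conv2d_naive(x, w, stride=1):
    rows, cols = len(x), len(x[0])
    k_rows, k_cols = len(w), len(w[0])
    out = []
    for i in range(0, rows - k_rows + 1, stride):
        row = []
        for j in range(0, cols - k_cols + 1, stride):
            # multiply-accumulate over the window whose top-left corner is (i, j)
            acc = sum(x[i + di][j + dj] * w[di][dj]
                      for di in range(k_rows)
                      for dj in range(k_cols))
            row.append(acc)
        out.append(row)
    return out

Applied to the 4x4 input and 3x3 kernel from the PyTorch example later in this section, this reproduces the same 2x2 result.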

In a fully connected neural network there are biases in addition to weight parameters. The same holds for CNNs: the filter parameters correspond to the weights, and there is also a bias. The processing flow of a convolution operation that includes a bias is shown in the following figure:

[Figure: convolution with a bias, where the bias is added to every element of the filter output]
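
Continuing the sketch above, the bias is simply added to every element of the convolution output (in a real CNN there is one bias per output channel; here we assume a single channel with a scalar bias b):

def conv2d_naive_with_bias(x, w, b, stride=1):
    # add the scalar bias b to every element of the convolution output
    return [[acc + b for acc in row] for row in conv2d_naive(x, w, stride)]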

For a more detailed explanation of convolutional neural networks, see: [Study Notes] Introduction to Deep Learning: Theory and Implementation Based on Python.

Now let's go back to PyTorch and take the conv2d function as an example. Official documentation: TORCH.NN.FUNCTIONAL.CONV2D.

This function has the following parameters:

  • input: The input image, of shape (minibatch, in_channels, height, width).
  • weight: The convolution kernel, of shape (out_channels, in_channels/groups, height, width).
  • bias: The bias, a tensor of shape (out_channels); defaults to None.
  • stride: The stride, which controls how far the kernel moves at each step. A single number x means a stride of x in both directions; a tuple (x, y) means a vertical stride of x and a horizontal stride of y.
  • padding: Pads the edges of the input image, which can be used to keep the output the same size as the input. In PyTorch the default padding is 0, and padded positions are filled with zeros.
  • padding_mode: How the padded edge is filled (note that this is a parameter of the nn.Conv2d layer rather than of F.conv2d).
  • dilation: The spacing between kernel elements (used for dilated convolutions).
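
Putting these together, for an input of height H the output height is floor((H + 2 * padding - dilation * (kernel_height - 1) - 1) / stride + 1), and likewise for the width, as given in the PyTorch documentation. In the example below, H = 4, kernel_height = 3, stride = 1, padding = 0, and dilation = 1, so the output height is (4 + 0 - 2 - 1) / 1 + 1 = 2.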

For example:

import torch
import torch.nn.functional as F

input = torch.tensor([
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1]
])

kernel = torch.tensor([
    [2, 0, 1],
    [0, 1, 2],
    [1, 0, 2]
])

input = torch.reshape(input, (1, 1, 4, 4))  # batch_size = 1, channels = 1
kernel = torch.reshape(kernel, (1, 1, 3, 3))

output = F.conv2d(input, kernel, stride=1)
print(output)
# tensor([[[[15, 16],
#           [ 6, 15]]]])

output = F.conv2d(input, kernel, stride=1, bias=torch.tensor([3]))  # note: bias must be a 1-D tensor with one element per output channel, not a plain number
print(output)
# tensor([[[[18, 19],
#           [ 9, 18]]]])
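
Continuing the same example, we can also vary stride and padding (a sketch assuming the input and kernel tensors defined above are still in scope):

output = F.conv2d(input, kernel, stride=2)  # with stride 2, only the top-left window fits
print(output)
# tensor([[[[15]]]])

output = F.conv2d(input, kernel, stride=1, padding=1)  # zero-pad one ring of pixels
print(output.shape)
# torch.Size([1, 1, 4, 4]), the spatial size now matches the input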
