PyTorch Neural Network Practical Study Notes_15: Convolutional Neural Network Implementation + Illustrated Convolution Computation

1 Convolutional Neural Network Interface

1.1 Introduction to the convolution interface

  • torch.nn.functional.conv1d: performs 1-dimensional convolution, typically used to process sequence data.
  • torch.nn.functional.conv2d: performs 2-dimensional convolution, typically used to process two-dimensional images.
  • torch.nn.functional.conv3d: performs 3-dimensional convolution, typically used to process 3D graphics data.
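
The three interfaces differ mainly in the number of spatial dimensions they expect. The sketch below is only an illustration; the tensor sizes and the all-ones kernels are arbitrary choices, not values from the original notes.

```python
import torch
import torch.nn.functional as F

# Input layouts (sizes chosen arbitrarily for illustration):
seq = torch.ones(1, 1, 10)        # (batch, channels, length)
img = torch.ones(1, 1, 5, 5)      # (batch, channels, height, width)
vol = torch.ones(1, 1, 4, 4, 4)   # (batch, channels, depth, height, width)

# A kernel of size 3 along every spatial dimension, no padding, stride 1
out1d = F.conv1d(seq, torch.ones(1, 1, 3))
out2d = F.conv2d(img, torch.ones(1, 1, 3, 3))
out3d = F.conv3d(vol, torch.ones(1, 1, 3, 3, 3))

print(out1d.shape)  # torch.Size([1, 1, 8])
print(out2d.shape)  # torch.Size([1, 1, 3, 3])
print(out3d.shape)  # torch.Size([1, 1, 2, 2, 2])
```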

1.2 Definition of Convolution Function

torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
  • input: the input tensor of shape (minibatch, in_channels, H, W), a four-dimensional tensor
  • weight: the convolution kernels, of shape (out_channels, in_channels/groups, kH, kW), also a four-dimensional tensor
  • bias: the bias of each output channel, a one-dimensional tensor of length out_channels (optional, defaults to None)
  • stride: a number or a two-tuple (SH, SW), representing the vertical and horizontal stride
  • padding: a number or a two-tuple (PH, PW), representing the vertical and horizontal padding values
  • dilation: a number or a two-tuple giving the spacing between the elements inside the convolution kernel (not commonly used; defaults to 1, meaning no gaps)
  • groups: a number giving the number of groups for grouped convolution; in_channels and out_channels must both be divisible by it. In particular, when groups = in_channels the operation is a depth-wise convolution (depth-wise conv).
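
To make the weight shape and the groups and dilation parameters concrete, here is a minimal sketch. The 4-channel 7×7 input and the all-ones kernels are assumptions picked so that the results are easy to check by hand.

```python
import torch
import torch.nn.functional as F

x = torch.ones(1, 4, 7, 7)  # (minibatch, in_channels, H, W)

# Ordinary convolution: weight is (out_channels, in_channels/groups, kH, kW) with groups=1
w_full = torch.ones(8, 4, 3, 3)
y_full = F.conv2d(x, w_full)
print(y_full.shape)            # torch.Size([1, 8, 5, 5])

# Depth-wise convolution: groups == in_channels, so each kernel sees a single channel
w_dw = torch.ones(4, 1, 3, 3)
y_dw = F.conv2d(x, w_dw, groups=4)
print(y_dw.shape)              # torch.Size([1, 4, 5, 5])

# dilation=2 spreads a 3x3 kernel over a 5x5 window, so the output is 7 - 5 + 1 = 3
y_dil = F.conv2d(x, w_dw, groups=4, dilation=2)
print(y_dil.shape)             # torch.Size([1, 4, 3, 3])
```

With an all-ones input, every value of y_full is 4 × 9 = 36 (all four channels are summed), while every value of y_dw is 9 (one channel per kernel).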

1.2 Class implementation of convolution function

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
  • in_channels(int) The number of channels of the input feature map
  • out_channels(int) The number of channels of the output feature map
  • kernel_size(int or tuple) The size of the convolution kernel
  • stride(int or tuple, optional) The stride of the convolution kernel, the default is 1
  • padding(int or tuple, optional) The number of rows/columns of zeros added to each side of the input, the default is 0
  • dilation(int or tuple, optional) The spacing between convolution kernel elements, the default is 1
  • groups(int, optional) The number of groups into which the input channels are divided, the default is 1
  • bias(bool, optional) Defaults to True, meaning a learnable bias is added to the output
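
As a quick illustration of the class interface, the sketch below builds a layer and inspects the parameters it creates. The channel counts and the 32×32 input size are arbitrary assumptions for the example.

```python
import torch
import torch.nn as nn

# The layer allocates and initializes its own weight and bias
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)

print(conv.weight.shape)  # torch.Size([16, 3, 3, 3])
print(conv.bias.shape)    # torch.Size([16])

x = torch.randn(2, 3, 32, 32)  # a batch of two 3-channel 32x32 inputs
y = conv(x)
print(y.shape)            # torch.Size([2, 16, 32, 32])
```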
     

1.3 The difference between the two

torch.nn.Conv2d is a class, whereas torch.nn.functional.conv2d is a function. Only nn.xxx modules can be placed in nn.Sequential; nn.functional.xxx functions cannot.

A layer in nn is a class that subclasses nn.Module, defined as class Layer(nn.Module), whereas the functions in nn.functional are plain functions, defined as def function(input).

With nn.functional.xxx you must create the weights yourself and pass them in on every call, whereas an nn.xxx layer creates and manages its own weights.
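
The two forms compute the same thing when given the same weights. The following sketch, with layer sizes chosen only for illustration, runs a module inside nn.Sequential and then reproduces its output with the functional interface by passing the weights in by hand.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 5, 5)

# Module form: the layer owns its weights and can live inside nn.Sequential
layer = nn.Conv2d(1, 2, kernel_size=3, padding=1)
model = nn.Sequential(layer, nn.ReLU())
out_module = model(x)

# Functional form: weight and bias must be supplied explicitly on each call
out_functional = F.relu(F.conv2d(x, layer.weight, layer.bias, stride=1, padding=1))

print(torch.allclose(out_module, out_functional))  # True
```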

1.4 Operation steps of convolution function

1.5 Types of convolution operations

1.5.1 Narrow convolution (valid convolution)

The generated feature map is smaller than the original image, and the stride is variable. Suppose the sliding stride is S, the original image is N1×N1, and the convolution kernel is N2×N2. The image after convolution then has size [(N1-N2)/S + 1]×[(N1-N2)/S + 1], with the division rounded down.
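
The size formula can be checked against conv2d directly. In this sketch, valid_output_size is a hypothetical helper written for this note (not a PyTorch function), and the 5×5 input with a 2×2 kernel and stride 2 is just one example configuration.

```python
import torch
import torch.nn.functional as F

def valid_output_size(n1: int, n2: int, s: int) -> int:
    """Side length after valid convolution: N1 x N1 input, N2 x N2 kernel, stride S."""
    return (n1 - n2) // s + 1

x = torch.ones(1, 1, 5, 5)   # N1 = 5
w = torch.ones(1, 1, 2, 2)   # N2 = 2
y = F.conv2d(x, w, stride=2, padding=0)  # S = 2, no padding

print(valid_output_size(5, 2, 2))  # 2
print(tuple(y.shape[-2:]))         # (2, 2)
```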


1.5.2 Same convolution

The convolved image has the same size as the original one. The stride of same convolution is fixed at 1, and padding is generally required (zeros are added around the original image so that the output size stays unchanged).
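
A minimal sketch of same convolution, assuming an odd kernel size so that padding (N2-1)/2 on each side keeps the size unchanged; the 5×5 input and 3×3 all-ones kernel are illustrative choices.

```python
import torch
import torch.nn.functional as F

x = torch.ones(1, 1, 5, 5)
w = torch.ones(1, 1, 3, 3)

# For an odd kernel size, (kernel_size - 1) // 2 zeros per side preserve the input size
pad = (w.shape[-1] - 1) // 2
y = F.conv2d(x, w, stride=1, padding=pad)

print(y.shape)        # torch.Size([1, 1, 5, 5])
print(y[0, 0, 0, 0])  # tensor(4.)  corner: only 4 of the 9 kernel taps hit real pixels
print(y[0, 0, 2, 2])  # tensor(9.)  center: all 9 taps hit real pixels
```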


1.5.3 Full convolution

Full convolution, also known as deconvolution, is mainly used in deconvolution networks for image restoration.

Every pixel in the original image is expanded by the convolution operation. As shown in Figure 7-16, the white block is the original image, the light-colored block is the convolution kernel, and the dark-colored block is the pixel currently being convolved. Full convolution also requires padding the original image, and the result is larger than the original image. The stride is fixed at 1; for an N1×N1 image and an N2×N2 convolution kernel, the image after convolution has size [N1+N2-1]×[N1+N2-1].
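
One way to reproduce full convolution with conv2d is to pad N2-1 zeros on every side. The sketch below, using an arbitrary 5×5 input and 3×3 kernel, also compares the result with F.conv_transpose2d, which gives the same values once the kernel is flipped; treat this as an illustration rather than the implementation the notes refer to.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 5, 5)   # N1 = 5
w = torch.randn(1, 1, 3, 3)   # N2 = 3

# Full convolution: pad N2 - 1 zeros per side, stride 1
y_full = F.conv2d(x, w, stride=1, padding=w.shape[-1] - 1)
print(y_full.shape)           # torch.Size([1, 1, 7, 7]), i.e. N1 + N2 - 1 = 7

# Transposed convolution with the spatially flipped kernel yields the same result
y_transposed = F.conv_transpose2d(x, torch.flip(w, dims=[2, 3]), stride=1)
print(torch.allclose(y_full, y_transposed, atol=1e-5))  # True
```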

2 The use of the convolution function

2.1 Defining convolution input variables --- CNN_New.py (Part 01)

import torch

### 1.1 Define the input variables
# [batch, in_channels, in_height, in_width]
# [number of images in a training batch, number of image channels, image height, image width]
input1 = torch.ones([1,1,5,5])
input2 = torch.ones([1,2,5,5])
input3 = torch.ones([1,1,4,4])

2.2 Verifying the zero-padding rule of convolution --- CNN_New.py (Part 02)

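### 1.2 Verify the zero-padding rule
# padding=1: one row/column of zeros is added on every side of the input
padding1 = torch.nn.functional.conv2d(input1,torch.ones([1,1,1,1]),stride=1,padding=1)
print(padding1)
# padding=(1,2): one row of zeros above and below, two columns of zeros on the left and right
padding2 = torch.nn.functional.conv2d(input1,torch.ones([1,1,1,1]),stride=1,padding=(1,2))
print(padding2)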

tensor([[[[0., 0., 0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 1., 1., 0.],
          [0., 1., 1., 1., 1., 1., 0.],
          [0., 1., 1., 1., 1., 1., 0.],
          [0., 1., 1., 1., 1., 1., 0.],
          [0., 1., 1., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0., 0., 0.]]]])
tensor([[[[0., 0., 0., 0., 0., 0., 0., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 0., 0., 0., 0., 0., 0., 0.]]]])
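
The padding argument should behave like zero-padding the input explicitly and then convolving without padding. The sketch below checks this with torch.nn.functional.pad, whose pad tuple for the last two dimensions is ordered (left, right, top, bottom).

```python
import torch
import torch.nn.functional as F

input1 = torch.ones([1, 1, 5, 5])
kernel = torch.ones([1, 1, 1, 1])  # 1x1 kernel of ones: the output is just the padded input

# padding=(1, 2): 1 row above/below, 2 columns left/right
via_conv = F.conv2d(input1, kernel, stride=1, padding=(1, 2))

# Explicit zero padding, then a convolution with padding=0
padded = F.pad(input1, (2, 2, 1, 1), mode="constant", value=0.0)
via_pad = F.conv2d(padded, kernel, stride=1, padding=0)

print(via_conv.shape)                  # torch.Size([1, 1, 7, 9])
print(torch.equal(via_conv, via_pad))  # True
```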

2.3 Definition of convolution kernel --- CNN_New.py (Part 03)

### 1.3 Define the convolution kernel variables
# [out_channels, in_channels, filter_height, filter_width]
# [number of kernels, number of image channels, kernel height, kernel width]
filter1 = torch.tensor([-1.0,0,0,-1]).reshape([1,1,2,2]) # 2x2 kernel: 1 input channel, 1 output channel
filter2 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1]).reshape([2,1,2,2]) # 2x2 kernels: 1 input channel, 2 output channels
filter3 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1,-1.0,0,0,-1]).reshape([3,1,2,2]) # 2x2 kernels: 1 input channel, 3 output channels
filter4 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1,-1.0,0,0,-1,-1.0,0,0,-1]).reshape([2,2,2,2]) # 2x2 kernels: 2 input channels, 2 output channels
filter5 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1]).reshape([1,2,2,2]) # 2x2 kernel: 2 input channels, 1 output channel

2.4 Convolution operation and its result --- CNN_New.py (Part 04)

### 1.4 Convolution operations
## 1 input channel, producing 1 feature map (one per kernel)
pl1 = torch.nn.functional.conv2d(input1,filter1,stride=2,padding=1)
print("p1",pl1)
## 1 input channel, producing 2 feature maps (one per kernel)
pl2 = torch.nn.functional.conv2d(input1,filter2,stride=2,padding=1)
print("p2",pl2)
## 1 input channel, producing 3 feature maps (one per kernel)
pl3 = torch.nn.functional.conv2d(input1,filter3,stride=2,padding=1)
print("p3",pl3)
## 2 input channels, producing 2 feature maps (one per kernel)
pl4 = torch.nn.functional.conv2d(input2,filter4,stride=2,padding=1)
print("p4",pl4)
## 2 input channels, producing 1 feature map ====> for multi-channel input, the kernel convolves each channel and the per-channel results are added together
pl5 = torch.nn.functional.conv2d(input2,filter5,stride=2,padding=1)
print("p5",pl5)
## Different padding produces a different result
pl6 = torch.nn.functional.conv2d(input1,filter1,stride=2,padding=0)
print("p6",pl6)

p1 tensor([[[[-1., -1., -1.],
          [-1., -2., -2.],
          [-1., -2., -2.]]]])
p2 tensor([[[[-1., -1., -1.],
          [-1., -2., -2.],
          [-1., -2., -2.]],

         [[-1., -1., -1.],
          [-1., -2., -2.],
          [-1., -2., -2.]]]])
p3 tensor([[[[-1., -1., -1.],
          [-1., -2., -2.],
          [-1., -2., -2.]],

         [[-1., -1., -1.],
          [-1., -2., -2.],
          [-1., -2., -2.]],

         [[-1., -1., -1.],
          [-1., -2., -2.],
          [-1., -2., -2.]]]])
p4 tensor([[[[-2., -2., -2.],
          [-2., -4., -4.],
          [-2., -4., -4.]],

         [[-2., -2., -2.],
          [-2., -4., -4.],
          [-2., -4., -4.]]]])
p5 tensor([[[[-2., -2., -2.],
          [-2., -4., -4.],
          [-2., -4., -4.]]]])
p6 tensor([[[[-2., -2.],
          [-2., -2.]]]])

Tip: Diagram of multi-channel convolution
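
The summation across input channels seen in p5 can be checked numerically. This sketch rebuilds input2 and filter5 from the code above, convolves each channel separately with its own slice of the kernel, and adds the two results.

```python
import torch
import torch.nn.functional as F

input2 = torch.ones([1, 2, 5, 5])
filter5 = torch.tensor([-1.0, 0, 0, -1, -1.0, 0, 0, -1]).reshape([1, 2, 2, 2])

# Multi-channel convolution in one call (same call as pl5 above)
pl5 = F.conv2d(input2, filter5, stride=2, padding=1)

# Convolve each input channel with the matching kernel slice, then add the results
ch0 = F.conv2d(input2[:, 0:1], filter5[:, 0:1], stride=2, padding=1)
ch1 = F.conv2d(input2[:, 1:2], filter5[:, 1:2], stride=2, padding=1)
summed = ch0 + ch1

print(torch.allclose(pl5, summed))  # True
print(summed[0, 0])
# tensor([[-2., -2., -2.],
#         [-2., -4., -4.],
#         [-2., -4., -4.]])
```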

2.5 Code Summary

import torch

### 1.1 Define the input variables
# [batch, in_channels, in_height, in_width]
# [number of images in a training batch, number of image channels, image height, image width]
input1 = torch.ones([1,1,5,5])
input2 = torch.ones([1,2,5,5])
input3 = torch.ones([1,1,4,4])

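### 1.2 Verify the zero-padding rule
# padding=1: one row/column of zeros is added on every side of the input
padding1 = torch.nn.functional.conv2d(input1,torch.ones([1,1,1,1]),stride=1,padding=1)
print(padding1)
# padding=(1,2): one row of zeros above and below, two columns of zeros on the left and right
padding2 = torch.nn.functional.conv2d(input1,torch.ones([1,1,1,1]),stride=1,padding=(1,2))
print(padding2)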

### 1.3 Define the convolution kernel variables
# [out_channels, in_channels, filter_height, filter_width]
# [number of kernels, number of image channels, kernel height, kernel width]
filter1 = torch.tensor([-1.0,0,0,-1]).reshape([1,1,2,2]) # 2x2 kernel: 1 input channel, 1 output channel
filter2 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1]).reshape([2,1,2,2]) # 2x2 kernels: 1 input channel, 2 output channels
filter3 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1,-1.0,0,0,-1]).reshape([3,1,2,2]) # 2x2 kernels: 1 input channel, 3 output channels
filter4 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1,-1.0,0,0,-1,-1.0,0,0,-1]).reshape([2,2,2,2]) # 2x2 kernels: 2 input channels, 2 output channels
filter5 = torch.tensor([-1.0,0,0,-1,-1.0,0,0,-1]).reshape([1,2,2,2]) # 2x2 kernel: 2 input channels, 1 output channel

### 1.4 Convolution operations
## 1 input channel, producing 1 feature map (one per kernel)
pl1 = torch.nn.functional.conv2d(input1,filter1,stride=2,padding=1)
print("p1",pl1)
## 1 input channel, producing 2 feature maps (one per kernel)
pl2 = torch.nn.functional.conv2d(input1,filter2,stride=2,padding=1)
print("p2",pl2)
## 1 input channel, producing 3 feature maps (one per kernel)
pl3 = torch.nn.functional.conv2d(input1,filter3,stride=2,padding=1)
print("p3",pl3)
## 2 input channels, producing 2 feature maps (one per kernel)
pl4 = torch.nn.functional.conv2d(input2,filter4,stride=2,padding=1)
print("p4",pl4)
## 2 input channels, producing 1 feature map ====> for multi-channel input, the kernel convolves each channel and the per-channel results are added together
pl5 = torch.nn.functional.conv2d(input2,filter5,stride=2,padding=1)
print("p5",pl5)
## Different padding produces a different result
pl6 = torch.nn.functional.conv2d(input1,filter1,stride=2,padding=0)
print("p6",pl6)

Origin blog.csdn.net/qq_39237205/article/details/123443301