PyTorch Convolution Operations


What is convolution?
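The convolution kernel slides over the input image. At each position, every kernel value is multiplied by the input value it overlaps, and the products are summed to give one output value. The stride parameter (stride=1 by default) controls how far the kernel moves at each step.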
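The multiply-and-sum step described above can be sketched by hand. This is a minimal illustration, not how PyTorch implements it: the helper name naive_conv2d is hypothetical, and float tensors are used here as an assumption to keep the comparison with F.conv2d simple. Note that, like F.conv2d, the sketch does not flip the kernel (strictly speaking this is cross-correlation).

```python
import torch
import torch.nn.functional as F


def naive_conv2d(image, kernel, stride=1):
    """Slide `kernel` over a 2-D `image` and sum the element-wise products."""
    k_h, k_w = kernel.shape
    out_h = (image.shape[0] - k_h) // stride + 1
    out_w = (image.shape[1] - k_w) // stride + 1
    result = torch.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            # The patch of the image currently covered by the kernel
            window = image[top:top + k_h, left:left + k_w]
            result[i, j] = (window * kernel).sum()
    return result


image = torch.tensor([[1., 2., 0., 3., 1.],
                      [0., 1., 2., 3., 1.],
                      [1., 2., 1., 0., 0.],
                      [5., 2., 3., 1., 1.],
                      [2., 1., 0., 1., 1.]])
kernel = torch.tensor([[1., 2., 1.],
                       [0., 1., 0.],
                       [2., 1., 0.]])

manual = naive_conv2d(image, kernel, stride=1)
reference = F.conv2d(image.reshape(1, 1, 5, 5),
                     kernel.reshape(1, 1, 3, 3)).reshape(3, 3)

print(manual)
print(torch.allclose(manual, reference))  # the two results should agree
```

For example, the top-left output value is 1*1 + 2*2 + 0*1 + 0*0 + 1*1 + 2*0 + 1*2 + 2*1 + 1*0 = 10.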

Example code for the convolution operation:

import torch.nn.functional as F
import torch

# Input image (5x5)
input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]])

# Convolution kernel (3x3)
kernel = torch.tensor([[1,2,1],
                       [0,1,0],
                       [2,1,0]])


# input: torch.Size([5, 5])
print("input:\n",input.shape)
# kernel:torch.Size([3, 3])
print("kernel:\n",kernel.shape)

input = torch.reshape(input,(1,1,5,5))
kernel = torch.reshape(kernel,(1,1,3,3))


# input:torch.Size([1, 1, 5, 5])
print("input:\n",input.shape)
# kernel:torch.Size([1, 1, 3, 3])
print("kernel:\n",kernel.shape)

# Run the convolution and observe how stride affects the result
output = F.conv2d(input,kernel,stride=1)
print('output\n',output)


output2 = F.conv2d(input,kernel,stride=2)
print('output2\n',output2)

# Run the convolution with the input's borders extended and filled, to observe how padding affects the result
output3 = F.conv2d(input,kernel,stride=1,padding=1)
print("output3\n",output3)

Explanation of selected code:

1. The role of reshape

# Before reshape
# input: torch.Size([5, 5])  kernel: torch.Size([3, 3])
input = torch.reshape(input,(1,1,5,5))
kernel = torch.reshape(kernel,(1,1,3,3))
# After reshape
# input: torch.Size([1, 1, 5, 5])  kernel: torch.Size([1, 1, 3, 3])

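Why do input and kernel need to be reshaped?
Because torch.nn.functional.conv2d places requirements on the shapes of its arguments. It expects input to have shape (minibatch, in_channels, iH, iW), where minibatch is the batch size, in_channels is the number of input channels, iH is the height of the input image, and iW is its width. The weight argument is the kernel, and its required shape is similar: (out_channels, in_channels/groups, kH, kW), where out_channels is the number of output channels, in_channels is the number of input channels (groups defaults to 1), kH is the kernel height, and kW is the kernel width. Both input and kernel therefore have to be reshaped into 4-D tensors.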
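To make the shape requirements concrete, here is a small sketch. The tensor values and the multi-channel sizes (2 images, 3 input channels, 4 output channels) are arbitrary assumptions chosen for illustration. It also shows that indexing with None is one alternative to torch.reshape for adding the two leading dimensions.

```python
import torch
import torch.nn.functional as F

image = torch.arange(25, dtype=torch.float32).reshape(5, 5)
kernel = torch.ones(3, 3)

# Add the batch and channel dimensions: (5, 5) -> (1, 1, 5, 5)
batched_image = image[None, None]
# Add the out_channels and in_channels dimensions: (3, 3) -> (1, 1, 3, 3)
weight = kernel[None, None]

out = F.conv2d(batched_image, weight)
print(out.shape)  # torch.Size([1, 1, 3, 3])

# Illustrative multi-channel case (sizes are assumptions):
# input  is (minibatch=2, in_channels=3, iH=5, iW=5)
# weight is (out_channels=4, in_channels=3, kH=3, kW=3), with groups=1
rgb_batch = torch.randn(2, 3, 5, 5)
multi_weight = torch.randn(4, 3, 3, 3)
multi_out = F.conv2d(rgb_batch, multi_weight)
print(multi_out.shape)  # torch.Size([2, 4, 3, 3])
```

The output keeps the batch size of input and takes its channel count from the first dimension of weight.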
2. The stride parameter

# Run the convolution and observe how stride affects the result
output = F.conv2d(input,kernel,stride=1)
print('output\n',output)

output2 = F.conv2d(input,kernel,stride=2)
print('output2\n',output2)

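Result: with stride=1 the output is the 3x3 matrix [[10, 12, 12], [18, 16, 16], [13, 9, 3]] (shape (1, 1, 3, 3)); with stride=2 the kernel visits only every second position, and the output shrinks to the 2x2 matrix [[10, 12], [13, 3]] (shape (1, 1, 2, 2)).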

The official documentation describes stride as follows:

stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1
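When stride is a single number, that number is used as the kernel's step size in both the vertical and horizontal directions. When stride is a tuple (sH, sW), the vertical and horizontal step sizes can be set separately.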
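The difference between a single number and a tuple can be checked on the output shapes. In this sketch the helper conv_output_size is hypothetical; it encodes the usual output-size formula under the assumption that dilation is left at its default of 1. The tensor values are arbitrary.

```python
import torch
import torch.nn.functional as F


def conv_output_size(in_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension, assuming dilation=1."""
    return (in_size + 2 * padding - kernel_size) // stride + 1


image = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)
weight = torch.ones(1, 1, 3, 3)

# A single number: the same step size vertically and horizontally
same_step = F.conv2d(image, weight, stride=2)
print(same_step.shape)  # torch.Size([1, 1, 2, 2])

# A tuple (sH, sW): step 1 vertically, step 2 horizontally
mixed_step = F.conv2d(image, weight, stride=(1, 2))
print(mixed_step.shape)  # torch.Size([1, 1, 3, 2])

# The formula predicts the same sizes
print(conv_output_size(5, 3, stride=1), conv_output_size(5, 3, stride=2))  # 3 2
```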

3. The padding parameter

# Run the convolution with the input's borders extended and filled, to observe how padding affects the result
output3 = F.conv2d(input,kernel,stride=1,padding=1)
print("output3\n",output3)

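In the code above, padding=1 extends every border of the input image by one cell, both horizontally and vertically, fills the new cells with 0, and then runs the convolution on the enlarged 7x7 image. With a 3x3 kernel and stride=1, the output is therefore 5x5, the same size as the original input.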
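One way to convince yourself of this is to pad the input explicitly and compare. The sketch below assumes float tensors and uses torch.nn.functional.pad, whose pad argument for the last two dimensions is ordered (left, right, top, bottom).

```python
import torch
import torch.nn.functional as F

image = torch.tensor([[1., 2., 0., 3., 1.],
                      [0., 1., 2., 3., 1.],
                      [1., 2., 1., 0., 0.],
                      [5., 2., 3., 1., 1.],
                      [2., 1., 0., 1., 1.]]).reshape(1, 1, 5, 5)
weight = torch.tensor([[1., 2., 1.],
                       [0., 1., 0.],
                       [2., 1., 0.]]).reshape(1, 1, 3, 3)

# Let conv2d pad implicitly
implicit = F.conv2d(image, weight, stride=1, padding=1)

# Pad by hand with zeros (5x5 -> 7x7), then convolve without padding
padded_image = F.pad(image, (1, 1, 1, 1), mode="constant", value=0)
explicit = F.conv2d(padded_image, weight, stride=1)

print(padded_image.shape)                  # torch.Size([1, 1, 7, 7])
print(implicit.shape)                      # torch.Size([1, 1, 5, 5])
print(torch.allclose(implicit, explicit))  # both routes should agree
```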
The official documentation describes padding as follows:


padding – implicit paddings on both sides of the input. Can be a string {‘valid’, ‘same’}, single number or a tuple (padH, padW). Default: 0 padding=‘valid’ is the same as no padding. padding=‘same’ pads the input so the output has the same shape as the input. However, this mode doesn’t support any stride values other than 1.
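When padding is a single number, that number is the width by which the image borders are extended and filled (with 0) in both the vertical and horizontal directions. When padding is a tuple (padH, padW), the vertical and horizontal border widths can be set separately.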
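The tuple and string forms can be compared by looking at output shapes. This is a minimal sketch with arbitrary tensor values; it assumes a PyTorch version recent enough to accept string values for padding, as the quoted documentation describes.

```python
import torch
import torch.nn.functional as F

image = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)
weight = torch.ones(1, 1, 3, 3)

# Tuple (padH, padW): no padding on top/bottom, 1 column on left/right
tuple_pad = F.conv2d(image, weight, padding=(0, 1))
print(tuple_pad.shape)  # torch.Size([1, 1, 3, 5])

# 'valid' means no padding at all
valid_pad = F.conv2d(image, weight, padding="valid")
print(valid_pad.shape)  # torch.Size([1, 1, 3, 3])

# 'same' pads so the output matches the input size (stride must be 1)
same_pad = F.conv2d(image, weight, padding="same")
print(same_pad.shape)  # torch.Size([1, 1, 5, 5])
```

For a 3x3 kernel with stride=1, padding="same" gives the same output size as padding=1.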
