Study notes on Mu Li's "Dive into Deep Learning" (动手学深度学习) course, recording my learning process; please buy the book for the detailed content.
Bilibili video link
Open-source tutorial link
Padding and strides in convolutional layers
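Applying a 5x5 convolution kernel to a 32x32 input gives a 28x28 output.
Larger convolution kernels shrink the output faster.
As a result, the network cannot be made very deep: with 5x5 kernels and no padding, a 32x32 input supports at most 7 layers, after which the output is 4x4 and too small for another 5x5 convolution.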
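The 7-layer limit can be checked by counting. A minimal sketch, assuming no padding and a stride of 1; the helper name `max_conv_layers` is made up for illustration and is not part of the course code:

```python
def max_conv_layers(size: int, kernel: int) -> int:
    """Count how many unpadded, stride-1 convolutions fit
    before the input becomes smaller than the kernel."""
    layers = 0
    while size >= kernel:
        size = size - kernel + 1  # each layer removes kernel - 1 rows/columns
        layers += 1
    return layers


print(max_conv_layers(32, 5))  # 7
print(max_conv_layers(32, 3))  # 15
```

The smaller 3x3 kernel allows roughly twice as many layers on the same input, which matches the remark that larger kernels shrink the output faster.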
Padding adds extra rows and columns around the input, so the output can even be made larger than the input.
The padding value is generally 0.
With a stride of 1, the output size shrinks only linearly with the number of layers, so reducing a large input to a small output takes many layers and a large amount of computation; to avoid this, the stride needs to be increased.
When the height and width are multiples of 2, a convolution kernel of size 2 with a stride of 2 divides the input height and width by 2 each time.
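The effect of padding and stride on the output size can be written as a one-line formula. The sketch below assumes the padding is given per side (the convention used by `nn.Conv2d`) and covers a single dimension; `conv_output_size` is a hypothetical helper name, not something from the course:

```python
def conv_output_size(n: int, k: int, padding: int = 0, stride: int = 1) -> int:
    """Output length along one dimension.

    n: input length, k: kernel length,
    padding: rows/columns added on EACH side, stride: step of the kernel.
    """
    return (n + 2 * padding - k) // stride + 1


print(conv_output_size(32, 5))            # 28: 5x5 kernel on a 32x32 input
print(conv_output_size(3, 2, padding=1))  # 4: padding makes the output larger than the input
print(conv_output_size(32, 2, stride=2))  # 16: kernel 2, stride 2 halves the size
```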
Summary
Padding is used when you want to make the model deeper, because it keeps the output from shrinking; the stride reduces the output shape by a multiplicative factor (for example, halving it).
Hands-on learning
Padding
import torch
from torch import nn
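# For convenience, we define a helper function that applies a convolutional layer.
# It adds batch and channel dimensions to the input and removes them from the output.
def comp_conv2d(conv2d, X):
    # (1, 1) means the batch size and the number of channels are both 1
    X = X.reshape((1, 1) + X.shape)
    Y = conv2d(X)
    # Drop the first two dimensions: batch size and channels
    return Y.reshape(Y.shape[2:])
# Note that 1 row or column is padded on each side, so 2 rows or columns are added in total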
conv2d = nn.Conv2d(1, 1, kernel_size=3, padding=1)
X = torch.rand(size=(8, 8))
comp_conv2d(conv2d, X).shape
torch.Size([8, 8])
conv2d = nn.Conv2d(1, 1, kernel_size=(5, 3), padding=(2, 1))  # non-square kernel
comp_conv2d(conv2d, X).shape
torch.Size([8, 8])
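Both examples above keep the 8x8 shape because the per-side padding equals (kernel size - 1) / 2. A small sketch of that rule, assuming odd kernel sizes and a stride of 1; `same_padding` is an illustrative name, not a PyTorch function:

```python
def same_padding(kernel_size):
    """Per-side padding that preserves the input shape at stride 1.

    Assumes every kernel dimension is odd, so (k - 1) splits evenly.
    """
    return tuple((k - 1) // 2 for k in kernel_size)


padding = same_padding((5, 3))
print(padding)  # (2, 1)

# Check on an 8x8 input: n + 2p - k + 1 should equal n in each dimension.
for k, p in zip((5, 3), padding):
    assert 8 + 2 * p - k + 1 == 8
```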
Stride
conv2d = nn.Conv2d(1, 1, kernel_size=3, padding=1, stride=2)
comp_conv2d(conv2d, X).shape
torch.Size([4, 4])
conv2d = nn.Conv2d(1, 1, kernel_size=(3, 5), padding=(0, 1), stride=(3, 4))
comp_conv2d(conv2d, X).shape
torch.Size([2, 2])
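The two shapes printed in this part can be checked by hand, which is useful for the second case where height and width use different kernels, paddings, and strides. A rough sketch in plain Python, assuming per-side padding as in `nn.Conv2d`; `conv2d_output_shape` is a hypothetical helper:

```python
def conv2d_output_shape(shape, kernel_size, padding, stride):
    """Apply floor((n + 2p - k) / s) + 1 to height and width separately."""
    return tuple(
        (n + 2 * p - k) // s + 1
        for n, k, p, s in zip(shape, kernel_size, padding, stride)
    )


# kernel 3, padding 1, stride 2 on an 8x8 input
print(conv2d_output_shape((8, 8), (3, 3), (1, 1), (2, 2)))  # (4, 4)
# kernel (3, 5), padding (0, 1), stride (3, 4) on an 8x8 input
print(conv2d_output_shape((8, 8), (3, 5), (0, 1), (3, 4)))  # (2, 2)
```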