Dimension mismatch between downsampling and upsampling in PyTorch


The problem is as the title says. For example, consider the following piece of PyTorch network code:

# Downsampling: a stride-2 convolution that halves the spatial size
model += [nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=1),
          norm_layer(ngf * mult * 2), activation]
....

# Upsampling variant with output_padding (round-trips correctly for even input sizes):
# model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2), kernel_size=3, stride=2, padding=1,
#                              output_padding=1),
#           norm_layer(int(ngf * mult / 2)), activation]
# Upsampling variant without output_padding:
model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2), kernel_size=3, stride=2, padding=1),
          norm_layer(int(ngf * mult / 2)), activation]

This is a very common pattern in PyTorch networks: a convolution with kernel_size=3, stride=2, padding=1, paired with a matching transposed convolution.

If the input image size is a power of two, the commented-out version works perfectly. For arbitrary input sizes, however, the dimensions after downsampling and then upsampling may no longer match the original.

When that happens, check whether your transposed convolution includes the output_padding argument.
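To see the mismatch concretely, here is a minimal, self-contained sketch (the channel counts and the 25-pixel input size are arbitrary choices for illustration) comparing the two transposed-convolution variants on an odd-sized input:

import torch
import torch.nn as nn

down = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
up_with_pad = nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1,
                                 output_padding=1)
up_no_pad = nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1)

x = torch.randn(1, 64, 25, 25)   # odd spatial size
y = down(x)
print(y.shape)                   # torch.Size([1, 128, 13, 13])
print(up_with_pad(y).shape)      # torch.Size([1, 64, 26, 26]) -- does not match 25
print(up_no_pad(y).shape)        # torch.Size([1, 64, 25, 25]) -- matches the input

With an even input size such as 32 the situation reverses: 32 -> 16 -> 32 with output_padding=1, but 32 -> 16 -> 31 without it.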

The PyTorch manual says:

 The padding argument effectively adds kernel_size - 1 - padding amount of zero padding to both sides of the input. This is set so that when a Conv2d and a ConvTranspose2d are initialized with the same parameters, they are inverses of each other in regard to the input and output shapes. However, when stride > 1, Conv2d maps multiple input shapes to the same output shape. output_padding is provided to resolve this ambiguity by effectively increasing the calculated output shape on one side. Note that output_padding is only used to find output shape, but does not actually add zero-padding to output.

Roughly, this means: when the stride is greater than 1, Conv2d maps several input sizes to the same output size, so the size after downsampling and then upsampling may not match the original. output_padding resolves this ambiguity by enlarging the computed output on one side so the sizes line up again. The flip side is that when the size arithmetic already works out exactly, i.e. the convolution's size computation (H_in + 2*padding - kernel_size) is exactly divisible by the stride at every stage, no correction is needed, and adding output_padding makes the output one pixel larger than the original input, so the sizes fail to match.
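To make the arithmetic explicit (using the standard PyTorch shape formulas with dilation 1): Conv2d gives H_out = floor((H_in + 2*padding - kernel_size) / stride) + 1, which for kernel_size=3, stride=2, padding=1 is floor((H_in - 1) / 2) + 1; ConvTranspose2d gives H_out = (H_in - 1)*stride - 2*padding + kernel_size + output_padding, which here is 2*H_in - 1 + output_padding. For an odd input like 25, (25 - 1) is exactly divisible by the stride, 25 maps down to 13, and 13 maps back to 2*13 - 1 = 25 only when output_padding=0. For an even input like 32, the floor discards a remainder, 32 maps down to 16, and 16 maps back to 32 only when output_padding=1.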


Reposted from blog.csdn.net/Willen_/article/details/88618604