Use upsampling and convolution instead of deconvolution

URL: https://distill.pub/2016/deconv-checkerboard/

PAPER: Deconvolution and Checkerboard Artifacts

Overlap & Learning

Images generated by neural networks often exhibit a checkerboard pattern of high-frequency artifacts, most visibly in regions of strong color. One contributing factor is that neural networks are biased toward producing the average color: the further a value deviates from the average, the harder it is to generate.


Deconvolution can produce uneven overlap: when the kernel size is not divisible by the stride, some output positions receive contributions from more input values than others.

In the two-dimensional case, the uneven overlap along each axis multiplies with the other, producing a checkerboard pattern of varying intensities.
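A minimal sketch (assuming PyTorch) that makes the uneven overlap visible: a transposed convolution with kernel size 3 and stride 2, applied to a constant input with an all-ones kernel, so that any variation in the output reflects only how many times each position was written to.

```python
import torch
import torch.nn.functional as F

# Constant input and an all-ones kernel: any variation in the output
# comes purely from how often each output position is written to.
x = torch.ones(1, 1, 4, 4)
w = torch.ones(1, 1, 3, 3)

# Kernel size 3 is not divisible by stride 2 -> uneven overlap.
y = F.conv_transpose2d(x, w, stride=2)
print(y[0, 0])  # interior values alternate between 1, 2 and 4: a checkerboard
```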

Image generators today typically stack multiple deconvolution layers. In theory the artifacts introduced at each layer could cancel out, but in practice the stacked deconvolutions compound, producing artifacts at multiple scales.

In theory, the model could learn to write carefully to the unevenly overlapping positions so that the output remains balanced, but this balance is difficult to achieve in practice.

Restricting the filters to avoid severe artifacts sacrifices model capacity.

In fact, the problem is not limited to uneven overlap: models with even overlap also learn kernels that cause similar artifacts. While even overlap does not produce artifacts by default, it still makes it easy for learned kernels to do so.

Many factors contribute to this behavior. Producing artifacts is the default behavior of deconvolution: even when the kernel size is chosen carefully, deconvolution remains prone to them.
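For contrast, a sketch (again assuming PyTorch) of even overlap: with kernel size 4 divisible by stride 2, the interior coverage is uniform, yet as noted above, learned kernels can still reintroduce similar artifacts.

```python
import torch
import torch.nn.functional as F

# Kernel size 4 is divisible by stride 2, so every interior output
# position receives the same number of contributions.
x = torch.ones(1, 1, 4, 4)
w = torch.ones(1, 1, 4, 4)
y = F.conv_transpose2d(x, w, stride=2)
print(y[0, 0])  # interior is uniform (4); only the borders differ
```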

Better Upsampling

Separate the upsampling step from the convolution: first resize the image with an interpolation method, then apply an ordinary convolution.

Compared to deconvolution, where each output window can take an independent value, resize-convolution implicitly ties weights in a way that discourages high-frequency artifacts.

---> nearest-neighbor interpolation
---> bilinear interpolation
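
A minimal resize-convolution sketch, assuming PyTorch (the module name, channel sizes, and defaults below are illustrative, not from the article):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResizeConv2d(nn.Module):
    # Upsample by interpolation first, then apply an ordinary convolution,
    # instead of using a single transposed convolution.
    def __init__(self, in_ch, out_ch, kernel_size=3, scale=2, mode="nearest"):
        super().__init__()
        self.scale = scale
        self.mode = mode  # "nearest" or "bilinear"
        # Padding keeps the spatial size at exactly scale * input size.
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, x):
        x = F.interpolate(x, scale_factor=self.scale, mode=self.mode)
        return self.conv(x)

# Usage: upsample an 8x8 feature map to 16x16.
up = ResizeConv2d(16, 8)
print(up(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 8, 16, 16])
```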


Artifacts in Gradients

Whenever we use a convolutional layer in the forward pass, backpropagation performs a deconvolution (transposed convolution) on the gradient, which can produce checkerboard patterns in the gradient itself.
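A sketch (assuming PyTorch) showing the artifact in the gradient itself: push a uniform gradient back through a stride-2, kernel-3 convolution and the input gradient comes out unevenly overlapped.

```python
import torch
import torch.nn.functional as F

# Forward pass: stride-2 convolution, kernel size 3 (not divisible by 2).
x = torch.zeros(1, 1, 16, 16, requires_grad=True)
w = torch.ones(1, 1, 3, 3)
y = F.conv2d(x, w, stride=2, padding=1)

# The backward pass of a convolution is a transposed convolution, so a
# uniform output gradient maps to an uneven input gradient.
y.sum().backward()
print(x.grad[0, 0, :4, :4])  # periodic magnitudes 1, 2, 2, 4: a checkerboard
```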

Max pooling has also been linked to some degree of high-frequency gradient artifacts [Geodesics of learned representations].

These gradient artifacts matter for GANs, where the generator is trained directly on the discriminator's gradients. Their broader implications are unclear: one way to think about them is that some neurons receive many times the gradient of their neighbors, or equivalently that the network cares much more about some input pixels than others.
