Four boundary padding methods in PyTorch

As the core building block of a convolutional neural network, the convolution operation has to decide how to handle the pixels at the edge of the image. After some research, we found that padding can be applied either before or after the convolution, and that the specific boundary padding methods include constant padding, zero padding, mirror (reflection) padding and repeat (replication) padding.

Before looking at the individual padding layers, create a 2D tensor to test the padding operations that follow:

import torch
import torch.nn as nn

x = torch.Tensor([[1, 2], [3, 4]])

The created tensor is 2x2, with rows [1, 2] and [3, 4].

1. Zero padding: ZeroPad2d

The most commonly used method is nn.ZeroPad2d, which pads the tensor with zeros. The amount of padding can be set separately for each of the four sides, in the order (left, right, top, bottom). For example, to add 1 column on the left, 2 columns on the right, 3 rows on top and 4 rows on the bottom, set the padding parameter to (1, 2, 3, 4):

pad = nn.ZeroPad2d(padding=(1, 2, 3, 4))
y = pad(x)

The resulting y is x zero-padded on its four sides according to (1, 2, 3, 4): the original 2x2 values sit in rows 4 and 5 and columns 2 and 3 of a 9x5 tensor, and every other entry is 0.
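As a quick check of the (left, right, top, bottom) ordering, here is a minimal sketch. The names x4d and y_zero are illustrative only, and the input is reshaped to (N, C, H, W), the layout the layer is documented for:

```python
import torch
import torch.nn as nn

# Same values as x, reshaped to (N, C, H, W); x4d is an illustrative name.
x4d = torch.tensor([[1.0, 2.0], [3.0, 4.0]]).reshape(1, 1, 2, 2)

# padding order is (left, right, top, bottom)
zero_pad = nn.ZeroPad2d(padding=(1, 2, 3, 4))
y_zero = zero_pad(x4d)

# Height grows by top + bottom = 7, width by left + right = 3.
print(y_zero.shape)   # torch.Size([1, 1, 9, 5])
print(y_zero[0, 0])   # the padded 9x5 image
```

Because only zeros are added, the sum of all elements stays at 10, which is a cheap sanity check when debugging padding sizes.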

2. Constant padding: ConstantPad2d

Zero padding is a special case of constant padding. nn.ConstantPad2d() takes two arguments: the amount of padding, padding, and the constant used to fill it, value. Here we pad by 1 on all four sides, so padding is (1, 1, 1, 1):

pad = nn.ConstantPad2d(padding=(1, 1, 1, 1), value=666)
y = pad(x)

The resulting y is x surrounded by a one-element border of 666 on all sides, giving a 4x4 tensor.
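The claim that zero padding is a special case of constant padding can be checked directly. The following is a small sketch with illustrative variable names, comparing nn.ZeroPad2d against nn.ConstantPad2d with value=0:

```python
import torch
import torch.nn as nn

# Illustrative (N, C, H, W) input with the same values as x.
x4d = torch.tensor([[1.0, 2.0], [3.0, 4.0]]).reshape(1, 1, 2, 2)

# Constant padding with 666 on all four sides.
y_const = nn.ConstantPad2d(padding=(1, 1, 1, 1), value=666)(x4d)
print(y_const[0, 0])
# tensor([[666., 666., 666., 666.],
#         [666.,   1.,   2., 666.],
#         [666.,   3.,   4., 666.],
#         [666., 666., 666., 666.]])

# Zero padding should match constant padding with value 0.
y_zero_a = nn.ZeroPad2d(1)(x4d)
y_zero_b = nn.ConstantPad2d(1, value=0.0)(x4d)
same = torch.equal(y_zero_a, y_zero_b)
print(same)  # True
```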

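3. Mirror padding: ReflectionPad2d

Compared with filling with a fixed value, mirror padding may give better convolution results. It is provided by nn.ReflectionPad2d, which fills each new row or column by reflecting the input about its edge row or column, without repeating the edge itself. For a 2x2 input padded by 1, each new element therefore takes the value from the opposite side of the input. The padding must be smaller than the corresponding input size, and the layer expects an input of shape (C, H, W) or (N, C, H, W), so x is reshaped to (1, 1, 2, 2) here:

pad = nn.ReflectionPad2d(padding=(1, 1, 1, 1))
# add batch and channel dimensions: (2, 2) -> (1, 1, 2, 2)
y = pad(x.reshape(1, 1, 2, 2))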

In the result, the 4 in the first row and first column comes from the original 4 in the lower right corner, and the 3 in the first row and second column comes from the original 3 in the lower left corner. The full first row is 4, 3, 4, 3.
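With a 2x2 input the reflection is hard to tell apart from simply taking the opposite element, so a slightly larger input may make the pattern clearer. The sketch below uses a hypothetical 3x3 tensor named grid (not part of the original example) alongside the 2x2 case:

```python
import torch
import torch.nn as nn

reflect = nn.ReflectionPad2d(padding=(1, 1, 1, 1))

# The 2x2 case from the text, as an (N, C, H, W) tensor.
x4d = torch.tensor([[1.0, 2.0], [3.0, 4.0]]).reshape(1, 1, 2, 2)
y_small = reflect(x4d)
print(y_small[0, 0])
# tensor([[4., 3., 4., 3.],
#         [2., 1., 2., 1.],
#         [4., 3., 4., 3.],
#         [2., 1., 2., 1.]])

# A hypothetical 3x3 input: the edge row/column is the mirror axis
# and is not duplicated, so the row [1, 2, 3] becomes [2, 1, 2, 3, 2].
grid = torch.arange(1.0, 10.0).reshape(1, 1, 3, 3)
y_grid = reflect(grid)
print(y_grid[0, 0])
# tensor([[5., 4., 5., 6., 5.],
#         [2., 1., 2., 3., 2.],
#         [5., 4., 5., 6., 5.],
#         [8., 7., 8., 9., 8.],
#         [5., 4., 5., 6., 5.]])
```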

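4. Repeat padding: ReplicationPad2d

Repeat padding copies the edge pixel values of the image outward, so each new border pixel takes the value of its nearest edge pixel. It is provided by nn.ReplicationPad2d(), and the amount of padding can again be specified for each of the four sides. Like nn.ReflectionPad2d, it expects an input of shape (C, H, W) or (N, C, H, W):

pad = nn.ReplicationPad2d(padding=(1, 1, 1, 1))
# add batch and channel dimensions: (2, 2) -> (1, 1, 2, 2)
y = pad(x.reshape(1, 1, 2, 2))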

In the result, every padded border value is a copy of the nearest of the original values 1, 2, 3 and 4; the first row, for example, is 1, 1, 2, 2.
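A short sketch of the replicated result, with illustrative variable names. It also compares the layer against the functional form F.pad with mode="replicate", which should produce the same output:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative (N, C, H, W) input with the same values as x.
x4d = torch.tensor([[1.0, 2.0], [3.0, 4.0]]).reshape(1, 1, 2, 2)

y_rep = nn.ReplicationPad2d(padding=(1, 1, 1, 1))(x4d)
print(y_rep[0, 0])
# tensor([[1., 1., 2., 2.],
#         [1., 1., 2., 2.],
#         [3., 3., 4., 4.],
#         [3., 3., 4., 4.]])

# Functional equivalent of the layer above.
y_rep_fn = F.pad(x4d, (1, 1, 1, 1), mode="replicate")
rep_same = torch.equal(y_rep, y_rep_fn)
print(rep_same)  # True
```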

Summary:

On the question of whether PyTorch pads before or after convolution: according to [1], padding is applied before the convolution. Where the word "dim" appears above, it does not mean dimension; I did not find a better word, and it stands for a row or column. The choice of padding method matters more when the image is small, and its effect may be minor for larger images.
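One way to test the "padding before convolution" reading is to compare a convolution that pads internally with an explicit pad followed by an unpadded convolution using the same weights. The sketch below assumes a hypothetical 4x4 input named img and a single 3x3 kernel. It only compares outputs and says nothing about how PyTorch implements padding internally:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical 4x4 single-channel image in (N, C, H, W) layout.
img = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

with torch.no_grad():
    # Built-in zero padding inside the convolution layer.
    conv = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
    out_builtin = conv(img)
    # Explicit zero padding first, then an unpadded convolution with the same kernel.
    out_manual = F.conv2d(nn.ZeroPad2d(1)(img), conv.weight)
    zero_match = torch.allclose(out_builtin, out_manual, atol=1e-4)

    # The same comparison for reflection padding via padding_mode.
    conv_reflect = nn.Conv2d(1, 1, kernel_size=3, padding=1,
                             padding_mode="reflect", bias=False)
    out_reflect = conv_reflect(img)
    out_reflect_manual = F.conv2d(nn.ReflectionPad2d(1)(img), conv_reflect.weight)
    reflect_match = torch.allclose(out_reflect, out_reflect_manual, atol=1e-4)

print(zero_match, reflect_match)
```

If both comparisons print True, the outputs are consistent with padding being applied to the input before the kernel slides over it.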

Reference:

[1] https://discuss.pytorch.org/t/d

Origin blog.csdn.net/BigDream123/article/details/122483573