4 boundary padding methods you must master in PyTorch

 

Convolution is the core operation of a convolutional neural network, and it has to decide how to handle the pixels at the edge of the image. The material I found mentions both "boundary padding before convolution" and "boundary padding after convolution". The specific padding methods are constant padding, zero padding, reflection (mirror) padding, and replication (repeat) padding.

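Before looking at each padding layer in detail, create a small tensor to test the padding operations that follow. The snippets assume import torch and import torch.nn as nn. The tensor is given the shape (1, 1, 2, 2), that is (batch, channel, height, width), because nn.ReflectionPad2d and nn.ReplicationPad2d do not accept a plain 2D tensor:

x = torch.tensor([[1., 2.], [3., 4.]]).reshape(1, 1, 2, 2)

The created tensor holds the values 1, 2, 3, and 4 in a 2x2 grid.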

 

1. Zero padding ZeroPad2d

The most commonly used one is nn.ZeroPad2d, which pads the boundary of a tensor with 0. The amount of padding can be set separately for each of the four sides. For example, to add 1 dim (column) on the left, 2 on the right, 3 dim (rows) on the top, and 4 on the bottom, set the padding parameter to (1, 2, 3, 4), in the order (left, right, top, bottom):

pad = nn.ZeroPad2d(padding=(1, 2, 3, 4))
y = pad(x)

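The resulting y is x zero-padded on the four sides according to (1, 2, 3, 4): one column of zeros on the left, two on the right, three rows of zeros on the top, and four on the bottom. The 2x2 input therefore becomes 9 rows by 5 columns.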
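To check the (left, right, top, bottom) order, here is a minimal sketch. It assumes a (1, 1, 2, 2) input so that it is self-contained; the comparison with torch.nn.functional.pad is my own addition for illustration, not part of the original post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# (batch, channel, height, width) input holding the values 1..4
x = torch.tensor([[1., 2.], [3., 4.]]).reshape(1, 1, 2, 2)

# padding order is (left, right, top, bottom)
pad = nn.ZeroPad2d(padding=(1, 2, 3, 4))
y = pad(x)

# height: 3 + 2 + 4 = 9, width: 1 + 2 + 2 = 5
print(y.shape)
print(y[0, 0])

# the original values sit below 3 padded rows and to the right of 1 padded column
assert torch.equal(y[0, 0, 3:5, 1:3], x[0, 0])

# the functional form with mode="constant" gives the same result
y_f = F.pad(x, (1, 2, 3, 4), mode="constant", value=0.0)
assert torch.equal(y, y_f)
```

Everything outside the 2x2 block at rows 3 to 4 and columns 1 to 2 is zero, so the sum of y is still 10.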

 

2. Constant padding ConstantPad2d

Zero padding is a special case of constant padding. nn.ConstantPad2d() additionally requires the constant value to pad with, while padding again gives the amount of padding on each side. Here every side is padded by 1 dim, that is, padding is (1, 1, 1, 1). The code is as follows:

pad = nn.ConstantPad2d(padding=(1, 1, 1, 1), value=666)
y = pad(x)

The resulting y is surrounded by 666 on all four sides: a 4x4 result with the original 2x2 values in the center.
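The sketch below checks the constant-padded result against a hand-written expected tensor, and also checks the statement that zero padding is a special case of constant padding. The (1, 1, 2, 2) input shape and the variable names are assumptions made for this example.

```python
import torch
import torch.nn as nn

x = torch.tensor([[1., 2.], [3., 4.]]).reshape(1, 1, 2, 2)

# pad every side by one row/column with the constant 666
y = nn.ConstantPad2d(padding=(1, 1, 1, 1), value=666)(x)

expected = torch.tensor([[666., 666., 666., 666.],
                         [666.,   1.,   2., 666.],
                         [666.,   3.,   4., 666.],
                         [666., 666., 666., 666.]])
assert torch.equal(y[0, 0], expected)

# constant padding with value=0 matches ZeroPad2d for the same padding tuple
zero_via_constant = nn.ConstantPad2d(padding=(1, 2, 3, 4), value=0)(x)
zero_via_zeropad = nn.ZeroPad2d(padding=(1, 2, 3, 4))(x)
assert torch.equal(zero_via_constant, zero_via_zeropad)
```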

 

3. Reflection (mirror) padding ReflectionPad2d

Compared with padding with a fixed value, reflection (mirror) padding may give a better convolution result. It is provided by nn.ReflectionPad2d. The new values mirror the tensor about its edge row or column without repeating the edge itself: the value placed k positions beyond an edge element is a copy of the value k positions inside it. The code is as follows:

pad = nn.ReflectionPad2d(padding=(1, 1, 1, 1))
y = pad(x)

In the result, the 4 in the first row, first column is the original 4 from the lower right corner, and the 3 in the first row, second column is the original 3 from the lower left corner. The full 4x4 result has the rows (4, 3, 4, 3), (2, 1, 2, 1), (4, 3, 4, 3), and (2, 1, 2, 1).
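A minimal sketch that verifies the reflected values, assuming a (1, 1, 2, 2) input. The expected tensor is worked out by hand from the mirroring rule: row -1 mirrors row 1, and column -1 mirrors column 1.

```python
import torch
import torch.nn as nn

x = torch.tensor([[1., 2.], [3., 4.]]).reshape(1, 1, 2, 2)

pad = nn.ReflectionPad2d(padding=(1, 1, 1, 1))
y = pad(x)

# mirror about the edge row/column without repeating the edge itself
expected = torch.tensor([[4., 3., 4., 3.],
                         [2., 1., 2., 1.],
                         [4., 3., 4., 3.],
                         [2., 1., 2., 1.]])
assert torch.equal(y[0, 0], expected)

# the original 2x2 block is still in the center
assert torch.equal(y[0, 0, 1:3, 1:3], x[0, 0])
```

Note that reflection padding needs the padding on each side to be smaller than the corresponding input size, so a 2x2 input can be reflected by at most 1.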

 

4. Replication (repeat) padding ReplicationPad2d

Replication padding repeats the edge pixel values of the image: each new boundary value is a copy of the nearest edge pixel. It is provided by nn.ReplicationPad2d(), and the amount of padding on each of the 4 sides can again be specified:

pad = nn.ReplicationPad2d(padding=(1, 1, 1, 1))
y = pad(x)

In the result, every padded value is a copy of the nearest original edge value 1, 2, 3, or 4. The full 4x4 result has the rows (1, 1, 2, 2), (1, 1, 2, 2), (3, 3, 4, 4), and (3, 3, 4, 4).
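The same kind of check for replication padding, again assuming a (1, 1, 2, 2) input. The expected tensor is written out by hand from the "copy the nearest edge value" rule.

```python
import torch
import torch.nn as nn

x = torch.tensor([[1., 2.], [3., 4.]]).reshape(1, 1, 2, 2)

pad = nn.ReplicationPad2d(padding=(1, 1, 1, 1))
y = pad(x)

# every padded cell copies the nearest edge value of the original tensor
expected = torch.tensor([[1., 1., 2., 2.],
                         [1., 1., 2., 2.],
                         [3., 3., 4., 4.],
                         [3., 3., 4., 4.]])
assert torch.equal(y[0, 0], expected)
```

Unlike reflection padding, the corner values of the result are the original corner values themselves.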

 

Summary:

On the question of whether PyTorch pads before or after convolution: according to [1], it should be padding before convolution. The word "dim" above does not mean a tensor dimension; I could not find a better word for "one row or column", so I used dim. The choice of padding method matters more for small images; for larger images, how the border is padded may not have a big impact.
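One way to sanity-check the "padding before convolution" reading is to compare a convolution that pads internally with an explicit padding layer followed by an unpadded convolution. This is only a sketch: the 5x5 random input, the 3x3 kernel, and the choice of padding_mode="reflect" are my own assumptions for illustration, and it does not reproduce what [1] says.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
img = torch.randn(1, 1, 5, 5)

# convolution that pads internally, using reflection padding of 1
conv_builtin = nn.Conv2d(1, 1, kernel_size=3, padding=1, padding_mode="reflect")

# same weights and bias, but no internal padding
conv_plain = nn.Conv2d(1, 1, kernel_size=3, padding=0)
conv_plain.load_state_dict(conv_builtin.state_dict())

with torch.no_grad():
    out_builtin = conv_builtin(img)
    # pad explicitly first, then convolve without padding
    out_manual = conv_plain(nn.ReflectionPad2d(1)(img))

# padding by 1 with a 3x3 kernel keeps the spatial size
assert out_builtin.shape == img.shape
assert torch.allclose(out_builtin, out_manual, atol=1e-6)
```

If the two outputs match, padding the input first and then convolving gives the same result as the convolution's built-in padding, which is consistent with the padding being applied before the convolution.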

Reference:

[1] https://discuss.pytorch.org/t/d

Origin blog.csdn.net/ch206265/article/details/107161456