Explanation of nn.Conv2d in PyTorch

1. Padding modes in PyTorch nn.Conv2d

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
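
For orientation, here is a minimal usage sketch (the shapes are arbitrary): the layer expects input of shape (N, C_in, H, W) and returns (N, C_out, H_out, W_out).

import torch

conv = torch.nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=1)
inp = torch.randn(2, 3, 32, 32)   # (N, C_in, H, W)
print(conv(inp).shape)            # torch.Size([2, 8, 32, 32]); padding=1 keeps H and W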

1.1 The meaning of the padding parameter

First, padding = N means that N values are added on each of the four sides: top, bottom, left and right;

For example, if padding = N = 1, one value is added on each of the top, bottom, left and right sides, so the original input matrix gains 2*N rows and 2*N columns; here, 2 rows and 2 columns are added;

This makes it clear why the output size of a 2-dimensional convolution is

$$o = \left\lfloor \frac{i + 2 \times \text{padding} - \text{kernel\_size}}{\text{stride}} \right\rfloor + 1$$
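
As a sanity check, here is a minimal sketch (the helper name conv2d_out_size is ours) that evaluates this formula and compares it against an actual layer:

import torch

def conv2d_out_size(i, padding, kernel_size, stride):
    # floor((i + 2*padding - kernel_size) / stride) + 1
    return (i + 2 * padding - kernel_size) // stride + 1

conv = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=1)
print(conv(torch.randn(1, 1, 7, 7)).shape)   # torch.Size([1, 1, 4, 4])
print(conv2d_out_size(7, 1, 3, 2))           # 4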

1.2 The padding_mode parameter

This parameter specifies how the values used for padding are generated, i.e. the method used to fill the border.

The PyTorch two-dimensional convolution torch.nn.Conv2d() has a padding_mode parameter with 4 options: 'zeros', 'reflect', 'replicate' or 'circular'; the default is 'zeros', i.e. zero padding. What exactly do these four padding modes do?

padding_mode (string, optional): `'zeros'`, `'reflect'`,  
        `'replicate'` or `'circular'`. Default: `'zeros'` 

To observe these four padding modes directly, we define a 1*1 convolution and set the kernel weight to 1, so that the output of the convolution is exactly the padded matrix. In this example, we generate a 4*4 matrix containing the values 1~16 and convolve it with each padding mode.

In [51]: x = torch.nn.Parameter(torch.reshape(torch.arange(1., 17.), (1, 1, 4, 4)))

In [52]: x
Out[52]:
Parameter containing:
tensor([[[[ 1.,  2.,  3.,  4.],
          [ 5.,  6.,  7.,  8.],
          [ 9., 10., 11., 12.],
          [13., 14., 15., 16.]]]], requires_grad=True) 
1. 'zeros'

'zeros' is the most common mode, plain zero padding: both the height and width dimensions of the matrix are padded with 0s, on both sides of each dimension.

 In [53]: conv_zeros = torch.nn.Conv2d(1,1,1,1,padding=1,padding_mode='zeros',bias=False)

In [54]: conv_zeros
Out[54]: Conv2d(1, 1, kernel_size=(1, 1), stride=(1, 1), padding=(1, 1), bias=False)

In [55]: conv_zeros.weight = torch.nn.Parameter(torch.ones(1,1,1,1))

In [56]: conv_zeros.weight
Out[56]:
Parameter containing:
tensor([[[[1.]]]], requires_grad=True)

In [57]: conv_zeros(x)
Out[57]:
tensor([[[[ 0.,  0.,  0.,  0.,  0.,  0.],
          [ 0.,  1.,  2.,  3.,  4.,  0.],
          [ 0.,  5.,  6.,  7.,  8.,  0.],
          [ 0.,  9., 10., 11., 12.,  0.],
          [ 0., 13., 14., 15., 16.,  0.],
          [ 0.,  0.,  0.,  0.,  0.,  0.]]]], grad_fn=<ThnnConv2DBackward>) 

Now compare the result when the bias parameter is set to True:

x = torch.nn.Parameter(torch.reshape(torch.arange(1., 17.), (1, 1, 4, 4)))
conv_zeros = torch.nn.Conv2d(1,1,1,1,padding=1,padding_mode='zeros',bias=False)
conv_zeros_bias = torch.nn.Conv2d(1,1,1,1,padding=1,padding_mode='zeros',bias=True)
# only conv_zeros gets its weight fixed to 1; conv_zeros_bias keeps its
# randomly initialized weight and bias
conv_zeros.weight = torch.nn.Parameter(torch.ones(1,1,1,1))
conv_zeros(x)
tensor([[[[ 0.,  0.,  0.,  0.,  0.,  0.],
          [ 0.,  1.,  2.,  3.,  4.,  0.],
          [ 0.,  5.,  6.,  7.,  8.,  0.],
          [ 0.,  9., 10., 11., 12.,  0.],
          [ 0., 13., 14., 15., 16.,  0.],
          [ 0.,  0.,  0.,  0.,  0.,  0.]]]],
       grad_fn=<MkldnnConvolutionBackward>)
conv_zeros_bias(x)
tensor([[[[ 0.5259,  0.5259,  0.5259,  0.5259,  0.5259,  0.5259],
          [ 0.5259,  0.4084,  0.2909,  0.1734,  0.0559,  0.5259],
          [ 0.5259, -0.0616, -0.1791, -0.2966, -0.4141,  0.5259],
          [ 0.5259, -0.5316, -0.6492, -0.7667, -0.8842,  0.5259],
          [ 0.5259, -1.0017, -1.1192, -1.2367, -1.3542,  0.5259],
          [ 0.5259,  0.5259,  0.5259,  0.5259,  0.5259,  0.5259]]]],
       grad_fn=<MkldnnConvolutionBackward>)



Note the subtlety here: the two layers receive the same input, but conv_zeros_bias keeps its randomly initialized weight and bias (only conv_zeros had its weight set to 1), so the two outputs differ. Still, look at the border of the biased output: every padded cell equals the same constant (0.5259 in this run). That constant is the bias b, because a padded cell contributes 0 * w + b.

So what role does the bias actually play?
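
To see it in isolation, here is a minimal sketch where we fix both the weight and the bias to known values (the bias value 0.5 is an arbitrary choice); every border cell then equals the bias, and every interior cell equals the input plus the bias:

conv_b = torch.nn.Conv2d(1, 1, 1, 1, padding=1, padding_mode='zeros', bias=True)
conv_b.weight = torch.nn.Parameter(torch.ones(1, 1, 1, 1))   # fix weight to 1
conv_b.bias = torch.nn.Parameter(torch.tensor([0.5]))        # fix bias to 0.5
conv_b(x)
# tensor([[[[ 0.5,  0.5,  0.5,  0.5,  0.5,  0.5],
#           [ 0.5,  1.5,  2.5,  3.5,  4.5,  0.5],
#           [ 0.5,  5.5,  6.5,  7.5,  8.5,  0.5],
#           [ 0.5,  9.5, 10.5, 11.5, 12.5,  0.5],
#           [ 0.5, 13.5, 14.5, 15.5, 16.5,  0.5],
#           [ 0.5,  0.5,  0.5,  0.5,  0.5,  0.5]]]], ...)

The bias is a per-output-channel constant added to every output element after the weighted sum, which is why it also shows up in the zero-padded border.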

2. 'reflect'

'reflect' takes the edge of the matrix as the axis of symmetry and mirrors the interior elements outward to the border; the edge elements themselves are not repeated.

 In [58]: conv_reflect = torch.nn.Conv2d(1,1,1,1,padding=1,padding_mode='reflect',bias=False)

In [59]: conv_reflect.weight = torch.nn.Parameter(torch.ones(1,1,1,1))

In [60]: conv_reflect(x)
Out[60]:
tensor([[[[ 6.,  5.,  6.,  7.,  8.,  7.],
          [ 2.,  1.,  2.,  3.,  4.,  3.],
          [ 6.,  5.,  6.,  7.,  8.,  7.],
          [10.,  9., 10., 11., 12., 11.],
          [14., 13., 14., 15., 16., 15.],
          [10.,  9., 10., 11., 12., 11.]]]], grad_fn=<ThnnConv2DBackward>) 
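
These padding modes correspond to the modes of torch.nn.functional.pad ('zeros' corresponds to the default 'constant' mode with value 0). As a sketch, the padded matrix can also be produced directly, without the 1*1 convolution trick:

import torch.nn.functional as F

# pad (left, right, top, bottom) by 1 element each; 'reflect' mirrors the
# interior across the edge without repeating the edge element itself
F.pad(x.detach(), (1, 1, 1, 1), mode='reflect')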
3. 'replicate'

'replicate' copies the edge values of the matrix outward to fill the border.

In [61]: conv_replicate = torch.nn.Conv2d(1,1,1,1,padding=1,padding_mode='replicate',bias=False)

In [62]: conv_replicate.weight = torch.nn.Parameter(torch.ones(1,1,1,1))

In [63]: conv_replicate(x)
Out[63]:
tensor([[[[ 1.,  1.,  2.,  3.,  4.,  4.],
          [ 1.,  1.,  2.,  3.,  4.,  4.],
          [ 5.,  5.,  6.,  7.,  8.,  8.],
          [ 9.,  9., 10., 11., 12., 12.],
          [13., 13., 14., 15., 16., 16.],
          [13., 13., 14., 15., 16., 16.]]]], grad_fn=<ThnnConv2DBackward>) 
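
The difference from 'reflect' is subtle: 'replicate' repeats the edge element itself, while 'reflect' mirrors around it without repeating it. Comparing the first row of each padded result (again via F.pad, as above) makes this visible:

F.pad(x.detach(), (1, 1, 1, 1), mode='replicate')[0, 0, 0]
# tensor([1., 1., 2., 3., 4., 4.])   edge value repeated
F.pad(x.detach(), (1, 1, 1, 1), mode='reflect')[0, 0, 0]
# tensor([6., 5., 6., 7., 8., 7.])   mirrored, edge not repeated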
4. 'circular'

As the name suggests, 'circular' padding wraps the matrix around periodically. How exactly does it wrap? Let's look at an example first:

In [64]: conv_circular = torch.nn.Conv2d(1,1,1,1,padding=1,padding_mode='circular',bias=False)

In [65]: conv_circular.weight = torch.nn.Parameter(torch.ones(1,1,1,1))

In [66]: conv_circular(x)
Out[66]:
tensor([[[[16., 13., 14., 15., 16., 13.],
          [ 4.,  1.,  2.,  3.,  4.,  1.],
          [ 8.,  5.,  6.,  7.,  8.,  5.],
          [12.,  9., 10., 11., 12.,  9.],
          [16., 13., 14., 15., 16., 13.],
          [ 4.,  1.,  2.,  3.,  4.,  1.]]]], grad_fn=<ThnnConv2DBackward>) 

If the input matrix is extended infinitely to the left and right and from top to bottom, it takes the following form:

tensor([[[[ 1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.],
          [ 5.,  6.,  7.,  8.,  5.,  6.,  7.,  8.,  5.,  6.,  7.,  8.],
          [ 9., 10., 11., 12.,  9., 10., 11., 12.,  9., 10., 11., 12.],
          [13., 14., 15., 16., 13., 14., 15., 16., 13., 14., 15., 16.],
          [ 1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.],
          [ 5.,  6.,  7.,  8.,  5.,  6.,  7.,  8.,  5.,  6.,  7.,  8.],
          [ 9., 10., 11., 12.,  9., 10., 11., 12.,  9., 10., 11., 12.],
          [13., 14., 15., 16., 13., 14., 15., 16., 13., 14., 15., 16.],
          [ 1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.],
          [ 5.,  6.,  7.,  8.,  5.,  6.,  7.,  8.,  5.,  6.,  7.,  8.],
          [ 9., 10., 11., 12.,  9., 10., 11., 12.,  9., 10., 11., 12.],
          [13., 14., 15., 16., 13., 14., 15., 16., 13., 14., 15., 16.]]]]) 


As you can see, the infinite extension simply tiles the original 4*4 matrix periodically. The matrix above corresponds to padding 4 units in both the height and width dimensions; with a padding of 1, only the ring one unit wide around the original matrix is kept, which is exactly the result shown in Out[66]. The sketch below verifies this tiling view.
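
Here is a quick check (a sketch using torch.Tensor.tile, available in PyTorch 1.8+): tile the 4*4 matrix 3*3 times and crop the 6*6 window centered on the original copy.

t = torch.reshape(torch.arange(1., 17.), (4, 4))
tiled = t.tile((3, 3))   # the 12*12 periodic extension shown above
tiled[3:9, 3:9]          # 6*6 crop centered on the original copy
# tensor([[16., 13., 14., 15., 16., 13.],
#         [ 4.,  1.,  2.,  3.,  4.,  1.],
#         ... matches the circular padding output in Out[66]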

References

https://www.jianshu.com/p/a6da4ad8e8e7
Recommendation: https://blog.csdn.net/g11d111/article/details/82665265
