Convolution kernel size, number of kernels, and number of channels

First, make two points clear:

The number of channels of a CNN convolution kernel = the number of channels of the convolution input layer;
The number of channels of the CNN convolution output layer = the number of convolution kernels.
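
These two facts can be checked directly in PyTorch, which stores a conv layer's weight as [out_channels, in_channels, kernel_height, kernel_width]. A minimal sketch:

```python
import torch
import torch.nn as nn

# A conv layer with a 3-channel input and 5 kernels (output channels) of size 3x3
conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)

# Each kernel has as many channels as the input (3), and there are 5 kernels,
# so the weight tensor is [out_channels, in_channels, kH, kW]
print(conv.weight.shape)  # torch.Size([5, 3, 3, 3])
```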

1. Input matrix x format: four dimensions, in order: number of samples, image height, image width, and number of image channels

  • Input x: [batch, height, width, in_channel] four dimensions
  • Weight w: [height, width, in_channel, out_channel]
  • Output y: [batch, height, width, out_channel]

[Figure: convolution of a multi-channel input with multiple kernels]

As the figure shows:


Input: batch=1, height=8, width=8, in_channel=3 (four-dimensional matrix)


Kernels: convolution kernel size 3x3 (determines the output layer feature size)

Number of convolution kernel channels: 3 (RGB)

Number of convolution kernels: 5 (determines the number of output channels)


Output size calculation formula (W: input size, F: kernel size, P: padding, S: stride):

output size = (W − F + 2P) / S + 1
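
The example above can be verified in code. Note that PyTorch itself uses a channels-first layout [batch, channel, height, width] rather than the [batch, height, width, channel] layout listed earlier; stride 1 and no padding are assumed here:

```python
import torch
import torch.nn as nn

# batch=1, in_channel=3, height=8, width=8 (PyTorch is channels-first)
x = torch.randn(1, 3, 8, 8)

# 5 kernels of size 3x3, each with 3 channels -> 5 output channels
conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)

y = conv(x)
# (8 - 3 + 2*0) / 1 + 1 = 6, so the output is [1, 5, 6, 6]
print(y.shape)  # torch.Size([1, 5, 6, 6])
```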

2. Schematic diagram of the convolution process (note the padding size)

[Figure: convolution of a 7×7×3 input with two 3×3 kernels]
The input size is 7×7 with 3 channels, and the two convolution kernels w0 and w1 are both 3×3 in size, so the full kernel tensor has shape 3×3×3×2 (height × width × in_channels × number of kernels).
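
A sketch of this two-kernel setup in PyTorch; the schematic's stride and padding are not stated in the text, so stride 2 and padding 1 are assumed here for illustration:

```python
import torch
import torch.nn as nn

# 7x7 input with 3 channels; two 3x3 kernels (w0 and w1) -> weight shape [2, 3, 3, 3]
x = torch.randn(1, 3, 7, 7)
conv = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=3, stride=2, padding=1)

y = conv(x)
# (7 - 3 + 2*1) / 2 + 1 = 4, so the output is [1, 2, 4, 4]
print(y.shape)  # torch.Size([1, 2, 4, 4])
```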

Checking the number of channels and dimension sizes in PyTorch

Inspecting a tensor:

x.shape   # size
x.size()  # shape (same as .shape)
x.ndim    # number of dimensions
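
For example, on a hypothetical 4-D tensor:

```python
import torch

x = torch.randn(1, 3, 8, 8)

print(x.shape)   # torch.Size([1, 3, 8, 8])
print(x.size())  # torch.Size([1, 3, 8, 8]), same as .shape
print(x.ndim)    # 4, the number of dimensions
```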

Several dimension-transformation functions in PyTorch


Tensor.size(): can be used to view the dimensions of a tensor

>>> import torch
>>> a = torch.Tensor([[[1, 2, 3], [4, 5, 6]]])
>>> a.size()
torch.Size([1, 2, 3])

Tensor.view(): reshapes the original tensor to the size you want (-1 means that dimension's size is inferred automatically)

>>> b = a.view(-1, 3, 2)
>>> b
tensor([[[1., 2.],
         [3., 4.],
         [5., 6.]]])
>>> b.size()
torch.Size([1, 3, 2])

torch.squeeze() and torch.unsqueeze()

torch.squeeze() compresses the data's dimensions by removing dimensions of size 1, such as a single row or column. For example, a tensor with one row and three columns, shape (1, 3), has its first dimension removed and becomes shape (3,).

squeeze(a) removes all size-1 dimensions of a; dimensions whose size is not 1 are unaffected. a.squeeze(N) removes dimension N of a only if its size is 1.

There is also a functional form, b = torch.squeeze(a, N): if dimension N of a has size 1, it is removed.

>>> b.squeeze(2).size()
torch.Size([1, 3, 2])
>>> b.squeeze(0).size()
torch.Size([3, 2])
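
A quick sketch of squeeze with no argument, which removes every size-1 dimension at once (the (1, 3, 1, 2) tensor here is a made-up example):

```python
import torch

a = torch.zeros(1, 3, 1, 2)

print(a.squeeze().size())   # torch.Size([3, 2]): both size-1 dims removed
print(a.squeeze(2).size())  # torch.Size([1, 3, 2]): only dim 2 removed
print(a.squeeze(1).size())  # torch.Size([1, 3, 1, 2]): dim 1 has size 3, unchanged
```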

torch.unsqueeze() expands the data's dimensions by inserting a dimension of size 1 at the specified position. For example, a tensor with three elements, shape (3,), becomes one row and three columns, shape (1, 3), after adding a dimension at position 0.

a.unsqueeze(N) inserts a size-1 dimension at position N in a; the equivalent functional form is b = torch.unsqueeze(a, N).

>>> b.unsqueeze(2).size()
torch.Size([1, 3, 1, 2])
>>> b.unsqueeze(2)
tensor([[[[1., 2.]],
         [[3., 4.]],
         [[5., 6.]]]])

torch.permute()

This function reorders the dimensions of the original tensor into the desired order. For example, if the 0th, 1st, and 2nd dimensions of the original tensor have sizes 1, 3, and 2, then permute(2, 0, 1) puts the 2nd dimension first, the 0th dimension in the middle, and the 1st dimension last, giving a 2 × 1 × 3 tensor. Note that the arguments are dimension indices, not the dimension sizes themselves:

>>> b
tensor([[[1., 2.],
         [3., 4.],
         [5., 6.]]])
>>> b.permute(2, 0, 1)
tensor([[[1., 3., 5.]],

        [[2., 4., 6.]]])
>>> b.permute(2, 0, 1).size()
torch.Size([2, 1, 3])
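
One caveat worth knowing: permute() returns a non-contiguous view of the original memory, so calling view() on the result raises an error; call .contiguous() first (or use .reshape()). A minimal sketch:

```python
import torch

b = torch.Tensor([[[1, 2], [3, 4], [5, 6]]])  # size [1, 3, 2]
p = b.permute(2, 0, 1)                        # size [2, 1, 3], non-contiguous

# p.view(6) would raise a RuntimeError here; make the memory contiguous first
flat = p.contiguous().view(6)
print(flat)  # tensor([1., 3., 5., 2., 4., 6.])
```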

Origin blog.csdn.net/weixin_46466198/article/details/130520446