Calculating the feature-map size before and after a convolution

Input image parameters:

  • Image size: W * W
  • Convolution kernel size: F * F
  • Stride: S
  • Zero padding (on each side): P

The calculation formula is:

N = \left\lfloor \frac{W - F + 2P}{S} \right\rfloor + 1

(the division is rounded down when it is not exact, which matches what PyTorch does)

 Output image size:

N * N
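
As a quick sanity check, here is a minimal helper (a sketch added for illustration, not part of the original post) that applies this formula; the floor division matches PyTorch's behaviour when (W − F + 2P) is not an exact multiple of S:

def conv_output_size(W, F, S, P):
    # output size of a square convolution: floor((W - F + 2P) / S) + 1
    return (W - F + 2 * P) // S + 1

print(conv_output_size(224, 7, 2, 3))  # 112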

Example:

# coding:utf-8
import torch
import torch.nn as nn


class CNNTestModel(nn.Module):
    def __init__(self):
        super().__init__()
        in_channels = 3
        out_channels = 64
        # 7x7 kernel, stride 2, padding 3, no bias; by the formula above this halves the spatial size
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=7, stride=2, padding=3, bias=False)

    def forward(self, x):
        out = self.conv(x)
        return out


if __name__ == '__main__':
    img = torch.rand(1, 3, 224, 224)  # [batch, channels, height, width]
    model = CNNTestModel()
    out = model(img)
    print(img.shape)
    print(out.shape)

Print result:

torch.Size([1, 3, 224, 224])
torch.Size([1, 64, 112, 112])

N = ⌊(W − F + 2P)/S⌋ + 1 = ⌊(224 − 7 + 2×3)/2⌋ + 1 = ⌊111.5⌋ + 1 = 112

Detailed explanation:

The layer creates as many convolution kernels as the configured number of output channels, and each kernel convolves the whole input to produce one output channel; in other words, the number of convolution kernels equals the number of channels of the output feature map.
This gives the final output size [1, 64, 112, 112].
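
This can be checked directly on the weight tensor of the example model (a small sketch reusing the CNNTestModel class defined above):

model = CNNTestModel()
# nn.Conv2d stores its weights as [out_channels, in_channels, kernel_h, kernel_w]
print(model.conv.weight.shape)  # torch.Size([64, 3, 7, 7])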

-------------------------------------------
(W − F + 2P) is the extent that remains available for sliding the kernel after the first convolution position is taken.

-------------------------------------------
(W − F + 2P)/S is how many times the kernel can be moved forward by a step of size S, i.e. how many further convolutions can be performed;
because the first convolution position is not counted by this division, 1 is added,
giving N = (W − F + 2P)/S + 1.

-------------------------------------------
output size = (image width or height − kernel size + 2 × padding) / stride + 1
For images whose width and height differ, the same formula is applied to the width and to the height separately to obtain the final output size, as in the sketch below.
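
For example (a hedged sketch reusing CNNTestModel from above; the 224x320 input size is only an illustration), each spatial dimension is computed independently:

img = torch.rand(1, 3, 224, 320)   # height 224, width 320
out = CNNTestModel()(img)
print(out.shape)  # torch.Size([1, 64, 112, 160]): (224-7+6)//2+1 = 112, (320-7+6)//2+1 = 160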

  • The number of channels of each convolution kernel equals the number of channels of the input feature map

  • The number of channels of the output feature map equals the number of convolution kernels

     

Visual display:

(animated demonstration of the convolution operation; see the original post)

Origin: blog.csdn.net/WakingStone/article/details/129418362