MaxPool2d

CLASS
torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)

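Function: applies 2D max pooling over an input signal composed of several input planes.

In the simplest case, for an input of size $(N, C, H, W)$, an output of size $(N, C, H_{out}, W_{out})$, and a pooling window (kernel_size) of size $(kH, kW)$, each output value can be described as: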

$\begin{aligned}\operatorname{out}\left(N_i, C_j, h, w\right)= & \max _{m=0, \ldots, k H-1} \max _{n=0, \ldots, k W-1} \\& \operatorname{input}\left(N_i, C_j, \operatorname{stride}[0] \times h+m, \text { stride }[1] \times w+n\right)\end{aligned}$
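Notes:
- If padding is non-zero, the input is implicitly padded with negative infinity on both sides.
- The parameters kernel_size, stride, padding, and dilation can each take one of two forms:
  - a single int, in which case the same value is used for the height and width dimensions
  - a tuple of two ints, in which case the first int is used for the height dimension and the second int for the width dimension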
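To make the formula and the padding note concrete, here is a minimal sketch that evaluates the max-pooling formula with explicit loops and compares it with nn.MaxPool2d. The helper naive_max_pool2d is a hypothetical name introduced only for this illustration, and it ignores padding and dilation.

```python
import torch
import torch.nn as nn


def naive_max_pool2d(x, kernel_size, stride):
    """Loop-based max pooling over an (N, C, H, W) tensor (no padding, no dilation)."""
    kH, kW = kernel_size
    sH, sW = stride
    N, C, H, W = x.shape
    H_out = (H - kH) // sH + 1
    W_out = (W - kW) // sW + 1
    out = torch.empty(N, C, H_out, W_out, dtype=x.dtype)
    for h in range(H_out):
        for w in range(W_out):
            # Window starting at (stride[0] * h, stride[1] * w), as in the formula.
            window = x[:, :, sH * h : sH * h + kH, sW * w : sW * w + kW]
            out[:, :, h, w] = window.amax(dim=(2, 3))
    return out


torch.manual_seed(0)
x = torch.randn(2, 3, 7, 6)
reference = nn.MaxPool2d((3, 2), stride=(2, 1))(x)
manual = naive_max_pool2d(x, (3, 2), (2, 1))
matches_module = torch.equal(manual, reference)

# Padding behaves like negative infinity, so an all-negative input stays negative
# (zero padding would have produced zeros at the border).
negative = -torch.ones(1, 1, 4, 4)
padded_max = nn.MaxPool2d(2, stride=2, padding=1)(negative)
```

The second part shows why the padding value matters: every window still contains at least one real input element, and that element wins over the padding.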
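Parameters:
- kernel_size (Union[int, Tuple[int, int]]): the size of the window to take the max over
- stride (Union[int, Tuple[int, int]]): the stride of the window; defaults to kernel_size
- padding (Union[int, Tuple[int, int]]): implicit negative-infinity padding added on both sides
- dilation (Union[int, Tuple[int, int]]): the spacing between the elements within the window
- return_indices (bool): if True, returns the indices of the max values along with the outputs; useful later for torch.nn.MaxUnpool2d
- ceil_mode (bool): if True, uses ceil instead of floor to compute the output shape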
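The effect of dilation is easier to see on a small input. The following sketch uses a 4x4 tensor holding the values 0 to 15 (an input chosen here purely for illustration) and compares a dilated window with a dense one.

```python
import torch
import torch.nn as nn

# 4x4 input holding 0..15, so each value identifies its own position.
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# A 2x2 window with dilation=2 samples every second element,
# so it spans a 3x3 region of the input.
dilated = nn.MaxPool2d(kernel_size=2, stride=1, dilation=2)(x)

# For comparison, a dense 2x2 window with the same stride.
dense = nn.MaxPool2d(kernel_size=2, stride=1)(x)

print(dilated)  # 2x2 output
print(dense)    # 3x3 output
```

With dilation=2 the first window reads positions (0,0), (0,2), (2,0), and (2,2), so it covers more of the input while still comparing only four values.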
Shape:
- Input: $(N, C, H_{in}, W_{in})$ or $(C, H_{in}, W_{in})$
- Output: $(N, C, H_{out}, W_{out})$ or $(C, H_{out}, W_{out})$
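The output size along each spatial dimension follows from the parameters. In the default floor mode, the PyTorch documentation gives $H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$, and likewise for $W_{out}$ with index 1. A small sketch of that calculation, where max_pool_out_size is a hypothetical helper and not part of torch:

```python
def max_pool_out_size(size, kernel_size, stride=None, padding=0, dilation=1):
    """Output length along one spatial dimension, in the default floor mode."""
    if stride is None:
        stride = kernel_size  # the stride defaults to the window size
    effective_kernel = dilation * (kernel_size - 1) + 1
    # Integer floor division implements the floor in the formula.
    return (size + 2 * padding - effective_kernel) // stride + 1


# Non-square window (3, 2) with stride (2, 1) on a 50 x 32 input.
h_out = max_pool_out_size(50, kernel_size=3, stride=2)
w_out = max_pool_out_size(32, kernel_size=2, stride=1)
print(h_out, w_out)
```

The function handles one dimension at a time, so call it once with the height values and once with the width values.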

Example

import torch
import torch.nn as nn

# pool of square window of size=3, stride=2
m = nn.MaxPool2d(3, stride=2)
# pool of non-square window
m = nn.MaxPool2d((3, 2), stride=(2, 1))
input = torch.randn(20, 16, 50, 32)
output = m(input)  # shape: (20, 16, 24, 31)

AvgPool2d

CLASS
torch.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)

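Function

Applies 2D average pooling over an input signal composed of several input planes.

In the simplest case, for an input of size $(N, C, H, W)$, an output of size $(N, C, H_{out}, W_{out})$, and a pooling window (kernel_size) of size $(kH, kW)$, each output value can be described as: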

$\operatorname{out}\left(N_i, C_j, h, w\right)=\frac{1}{k H * k W} \sum_{m=0}^{k H-1} \sum_{n=0}^{k W-1} \operatorname{input}\left(N_i, C_j, \text { stride }[0] \times h+m, \text { stride }[1] \times w+n\right)$

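If padding is non-zero, the input is implicitly zero-padded on both sides by padding points.

When ceil_mode=True, sliding windows are allowed to go off-bounds if they start within the left padding or the input. Sliding windows that would start in the right padded region are ignored.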
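The ceil_mode rule can be sketched as a one-dimensional size calculation. The helper below is hypothetical and reflects our reading of the rule above: round up, then drop the last window if it would start in the right padding.

```python
def pool_out_size(size, kernel_size, stride, padding=0, ceil_mode=False):
    """Output length along one dimension of a 2D pooling layer (no dilation)."""
    span = size + 2 * padding - kernel_size
    if not ceil_mode:
        return span // stride + 1
    out = -(-span // stride) + 1  # ceiling division
    # Drop the last window if it would start in the right padding
    # rather than inside the input or the left padding.
    if (out - 1) * stride >= size + padding:
        out -= 1
    return out


floor_len = pool_out_size(4, kernel_size=3, stride=2)
ceil_len = pool_out_size(4, kernel_size=3, stride=2, ceil_mode=True)
clipped_len = pool_out_size(5, kernel_size=2, stride=2, padding=1, ceil_mode=True)
print(floor_len, ceil_len, clipped_len)
```

In the last call, plain rounding up would give 4 windows, but the fourth would start at padded position 6, which is past the left padding plus the input (positions 0 to 5), so it is ignored.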

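The parameters kernel_size, stride, and padding can each be:

- a single int, in which case the same value is used for the height and width dimensions
- a tuple of two ints, in which case the first int is used for the height dimension and the second int for the width dimension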

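Parameters
- kernel_size (Union[int, Tuple[int, int]]): the size of the pooling window
- stride (Union[int, Tuple[int, int]]): the stride of the pooling window; defaults to kernel_size
- padding (Union[int, Tuple[int, int]]): implicit zero padding added on both sides
- ceil_mode (bool): if True, uses ceil instead of floor to compute the output shape
- count_include_pad (bool): if True, includes the zero padding in the averaging calculation
- divisor_override (Optional[int]): if specified, it is used as the divisor; otherwise the size of the pooling region is used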
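The last two parameters only change the divisor of the average. A small sketch on a 2x2 tensor of ones (an input chosen here for illustration) makes the difference visible:

```python
import torch
import torch.nn as nn

x = torch.ones(1, 1, 2, 2)

# Default: padding zeros are counted, so each corner window
# averages one real value and three zeros.
with_pad = nn.AvgPool2d(2, stride=2, padding=1)(x)

# count_include_pad=False: only real input values enter the average.
without_pad = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=False)(x)

# divisor_override=1 turns the average into a plain window sum.
window_sum = nn.AvgPool2d(2, divisor_override=1)(x)

print(with_pad, without_pad, window_sum, sep="\n")
```

Setting divisor_override=1 is a common way to get sum pooling out of AvgPool2d without writing a separate layer.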
Shape
- Input: $(N, C, H_{in}, W_{in})$ or $(C, H_{in}, W_{in})$
- Output: $(N, C, H_{out}, W_{out})$ or $(C, H_{out}, W_{out})$, where
  
  $\begin{aligned} H_{out} &= \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{kernel\_size}[0]}{\text{stride}[0]} + 1 \right\rfloor \\ W_{out} &= \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{kernel\_size}[1]}{\text{stride}[1]} + 1 \right\rfloor \end{aligned}$
Example

import torch
import torch.nn as nn

# pool of square window of size=3, stride=2
m = nn.AvgPool2d(3, stride=2)
# pool of non-square window
m = nn.AvgPool2d((3, 2), stride=(2, 1))
input = torch.randn(20, 16, 50, 32)
output = m(input)  # shape: (20, 16, 24, 31)

AdaptiveAvgPool2d

CLASS
torch.nn.AdaptiveAvgPool2d(output_size)

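Function: applies 2D adaptive average pooling over an input signal composed of several input planes. The output has size $H \times W$ for any input size. The number of output features is equal to the number of input planes.
Parameters
- output_size (Union[int, None, Tuple[Optional[int], Optional[int]]]): the target output size of the image, of the form H x W. It can be a tuple (H, W) or a single H for a square image H x H. H and W can each be an int, or None, which means the size will be the same as that of the input.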
Shape
- Input: $(N, C, H_{in}, W_{in})$ or $(C, H_{in}, W_{in})$
- Output: $(N, C, S_0, S_1)$ or $(C, S_0, S_1)$, where $S = \text{output\_size}$
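A short sketch of what "adaptive" means in practice: the same layer maps inputs of different spatial sizes to one fixed output size, and output_size=1 reduces each plane to its mean. The tensor sizes are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((5, 7))

# Different spatial sizes map to the same 5 x 7 output; N and C are unchanged.
small = pool(torch.randn(1, 64, 8, 9))
large = pool(torch.randn(2, 64, 32, 40))

# output_size=1 reduces each plane to its mean (global average pooling).
x = torch.randn(4, 16, 10, 9)
global_avg = nn.AdaptiveAvgPool2d(1)(x)
same_as_mean = torch.allclose(
    global_avg, x.mean(dim=(2, 3), keepdim=True), atol=1e-6
)

print(small.shape, large.shape, global_avg.shape, same_as_mean)
```

This is why adaptive pooling is often placed before a fully connected layer: the layer after it sees a fixed feature size regardless of the image resolution.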

Example

import torch
import torch.nn as nn

# target output size of 5x7
m = nn.AdaptiveAvgPool2d((5, 7))
input = torch.randn(1, 64, 8, 9)
output = m(input)  # shape: (1, 64, 5, 7)
# target output size of 7x7 (square)
m = nn.AdaptiveAvgPool2d(7)
input = torch.randn(1, 64, 10, 9)
output = m(input)  # shape: (1, 64, 7, 7)
# target output size of 10x7
m = nn.AdaptiveAvgPool2d((None, 7))
input = torch.randn(1, 64, 10, 9)
output = m(input)  # shape: (1, 64, 10, 7)

MaxUnpool2d

CLASS
torch.nn.MaxUnpool2d(kernel_size, stride=None, padding=0)

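Function: computes a partial inverse of MaxPool2d. MaxPool2d is not fully invertible, since the non-maximal values are lost. MaxUnpool2d places each maximal value back at its recorded index and sets all other positions to zero.
Parameters
- kernel_size (int or tuple): size of the max pooling window
- stride (int or tuple): stride of the max pooling window; defaults to kernel_size
- padding (int or tuple): padding that was added to the input
Inputs
- input: the input tensor to invert
- indices: the indices returned by MaxPool2d
- output_size (optional): the targeted output size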
Shape
- Input: $(N, C, H_{in}, W_{in})$ or $(C, H_{in}, W_{in})$
- Output: $(N, C, H_{out}, W_{out})$ or $(C, H_{out}, W_{out})$, where
  
  $\begin{gathered} H_{out} = (H_{in} - 1) \times \text{stride}[0] - 2 \times \text{padding}[0] + \text{kernel\_size}[0] \\ W_{out} = (W_{in} - 1) \times \text{stride}[1] - 2 \times \text{padding}[1] + \text{kernel\_size}[1] \end{gathered}$
  
  or as given by output_size in the call operator
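The reason output_size exists is that several input sizes can pool down to the same size, so the inverse is ambiguous. A minimal sketch of the arithmetic, using two hypothetical one-dimensional helpers (not part of torch):

```python
def unpool_out_size(size, kernel_size, stride=None, padding=0):
    """Default MaxUnpool2d output length along one dimension."""
    if stride is None:
        stride = kernel_size  # the stride defaults to the window size
    return (size - 1) * stride - 2 * padding + kernel_size


def pool_out_size(size, kernel_size, stride):
    """MaxPool2d output length along one dimension (floor mode, no padding)."""
    return (size - kernel_size) // stride + 1


# Widths 4 and 5 both pool down to width 2 with kernel_size=2, stride=2 ...
pooled_widths = [pool_out_size(w, 2, 2) for w in (4, 5)]
# ... but the default inverse can only recover one of them.
default_width = unpool_out_size(2, kernel_size=2, stride=2)
print(pooled_widths, default_width)
```

When the original width was 5, the default formula yields 4, so the original size has to be passed explicitly through output_size.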
Example

>>> import torch
>>> import torch.nn as nn
>>> pool = nn.MaxPool2d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool2d(2, stride=2)
>>> input = torch.tensor([[[[ 1.,  2.,  3.,  4.],
                            [ 5.,  6.,  7.,  8.],
                            [ 9., 10., 11., 12.],
                            [13., 14., 15., 16.]]]])
>>> output, indices = pool(input)
>>> unpool(output, indices)
tensor([[[[  0.,   0.,   0.,   0.],
          [  0.,   6.,   0.,   8.],
          [  0.,   0.,   0.,   0.],
          [  0.,  14.,   0.,  16.]]]])
>>> # Now using output_size to resolve an ambiguous size for the inverse
>>> input = torch.tensor([[[[ 1.,  2.,  3.,  4.,  5.],
                            [ 6.,  7.,  8.,  9., 10.],
                            [11., 12., 13., 14., 15.],
                            [16., 17., 18., 19., 20.]]]])
>>> output, indices = pool(input)
>>> # This call will not work without specifying output_size
>>> unpool(output, indices, output_size=input.size())
tensor([[[[ 0.,  0.,  0.,  0.,  0.],
          [ 0.,  7.,  0.,  9.,  0.],
          [ 0.,  0.,  0.,  0.,  0.],
          [ 0., 17.,  0., 19.,  0.]]]])
