[pytorch, learning]-5.4 Pooling layer

5.4 Pooling layer

This section introduces the pooling layer, which was proposed to reduce the convolutional layer's excessive sensitivity to position.
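To make the position sensitivity concrete, here is a minimal sketch. The 4x4 feature map and the one-column shift are made-up illustration values, not taken from the original text. A single activation moves by one column, yet a 2x2 max pooling window with stride 1 still reports it at the same output location.

```python
import torch
import torch.nn.functional as F

# Hypothetical feature map with one activation at row 1, column 1.
Y = torch.zeros(1, 1, 4, 4)
Y[0, 0, 1, 1] = 1.0

# The same map with the activation shifted one column to the right.
Y_shifted = torch.roll(Y, shifts=1, dims=3)

# Before pooling, the two maps disagree at position (1, 1).
print(Y[0, 0, 1, 1].item(), Y_shifted[0, 0, 1, 1].item())  # 1.0 0.0

# After 2x2 max pooling with stride 1, both outputs are active at (1, 1),
# because that window covers columns 1 and 2 of the input.
P = F.max_pool2d(Y, kernel_size=2, stride=1)
P_shifted = F.max_pool2d(Y_shifted, kernel_size=2, stride=1)
print(P[0, 0, 1, 1].item(), P_shifted[0, 0, 1, 1].item())  # 1.0 1.0
```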

5.4.1 Two-dimensional maximum pooling layer and average pooling layer

A pooling layer slides a fixed-shape window (the pooling window) over the input and directly computes the maximum or the average of the elements inside each window. These two operations are called max pooling and average pooling, respectively.

Next, the forward calculation of the pooling layer is implemented in the pool2d function.

import torch
import torch.nn as nn

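def pool2d(X, pool_size, mode="max"):
    """Slide a pool_size window over the 2-D tensor X and pool each window."""
    if mode not in ("max", "avg"):
        raise ValueError(f"unsupported mode: {mode!r}")
    X = X.float()
    p_h, p_w = pool_size
    # With stride 1 and no padding, the output shrinks by (window size - 1) per axis.
    Y = torch.zeros(X.shape[0] - p_h + 1, X.shape[1] - p_w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            window = X[i: i + p_h, j: j + p_w]
            Y[i, j] = window.max() if mode == "max" else window.mean()
    return Y

X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]])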
pool2d(X, (2, 2))
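
The double loop is easy to read but slow for large inputs. As a rough sketch of the same computation without Python loops, the windows can be extracted with `Tensor.unfold`. The name `pool2d_unfold` is a hypothetical helper introduced here for illustration; it is not part of the original post or of PyTorch.

```python
import torch

def pool2d_unfold(X, pool_size, mode="max"):
    """Loop-free variant of a stride-1, no-padding 2-D pooling (illustrative helper)."""
    p_h, p_w = pool_size
    # windows[i, j] is the p_h x p_w block whose top-left corner is X[i, j].
    windows = X.float().unfold(0, p_h, 1).unfold(1, p_w, 1)
    if mode == "max":
        return windows.amax(dim=(-2, -1))
    if mode == "avg":
        return windows.mean(dim=(-2, -1))
    raise ValueError(f"unsupported mode: {mode!r}")

X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
Y_max = pool2d_unfold(X, (2, 2))         # values [[4, 5], [7, 8]]
Y_avg = pool2d_unfold(X, (2, 2), "avg")  # values [[2, 3], [5, 6]]
print(Y_max)
print(Y_avg)
```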

Let's also verify the average pooling layer:

pool2d(X, (2,2),'avg')
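
As a sanity check, the same numbers can be reproduced with PyTorch's built-in layers. This is a sketch under two assumptions: the input is reshaped to the (batch, channels, height, width) layout the built-in layers expect, and `stride=1` is passed explicitly so the window moves one element at a time as in the hand-written function.

```python
import torch
import torch.nn as nn

# Same 3x3 values as above, with batch and channel axes added in front.
X = torch.arange(9, dtype=torch.float).reshape(1, 1, 3, 3)

# Without stride=1 the stride would default to the window size.
max_out = nn.MaxPool2d(kernel_size=2, stride=1)(X)
avg_out = nn.AvgPool2d(kernel_size=2, stride=1)(X)

print(max_out.squeeze())  # values [[4, 5], [7, 8]]
print(avg_out.squeeze())  # values [[2, 3], [5, 6]]
```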


5.4.2 Padding and stride

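Like the convolutional layer, the pooling layer can pad the input on both sides of its height and width and use a stride, which changes the output shape.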

X = torch.arange(16, dtype=torch.float).view((1, 1, 4, 4))
X

By default, the stride of a MaxPool2d instance has the same shape as the pooling window. Below we use a pooling window of shape (3, 3), so the default stride is also (3, 3).

pool2d = nn.MaxPool2d(3)
pool2d(X)

We can manually specify the stride and padding.

pool2d = nn.MaxPool2d(3, padding=1, stride=2)
pool2d(X)

We can also specify a non-square pooling window, with separate padding and stride values for the height and the width.

pool2d = nn.MaxPool2d((2, 4), padding=(1, 2), stride=(2, 3))
pool2d(X)
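
The output shapes of the three layers above follow the usual formula floor((n + 2p - k) / s) + 1 along each axis, where n is the input length, k the window length, p the padding, and s the stride. The sketch below checks this against PyTorch; `pool_out_size` is a hypothetical helper written for this illustration and assumes the default `ceil_mode=False` and `dilation=1`.

```python
import torch
import torch.nn as nn

def pool_out_size(n, k, p=0, s=None):
    """Output length along one axis (illustrative helper; s defaults to k)."""
    s = k if s is None else s
    return (n + 2 * p - k) // s + 1

X = torch.arange(16, dtype=torch.float).view(1, 1, 4, 4)

out1 = nn.MaxPool2d(3)(X)
out2 = nn.MaxPool2d(3, padding=1, stride=2)(X)
out3 = nn.MaxPool2d((2, 4), padding=(1, 2), stride=(2, 3))(X)

# Predicted (height, width) next to the shape PyTorch actually returns.
print((pool_out_size(4, 3), pool_out_size(4, 3)), tuple(out1.shape[2:]))              # (1, 1) (1, 1)
print((pool_out_size(4, 3, 1, 2), pool_out_size(4, 3, 1, 2)), tuple(out2.shape[2:]))  # (2, 2) (2, 2)
print((pool_out_size(4, 2, 1, 2), pool_out_size(4, 4, 2, 3)), tuple(out3.shape[2:]))  # (3, 2) (3, 2)
```

Max pooling pads with negative infinity rather than zero, so padded positions never win the maximum.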


5.4.3 Multi-channel

When the input has multiple channels, the pooling layer pools each input channel separately instead of summing across channels as the convolutional layer does. The number of output channels therefore equals the number of input channels.

X = torch.cat((X, X + 1), dim=1)
X


After pooling, we find that the number of output channels is still 2.
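
The pooling call for this step is not shown above. A minimal sketch, assuming the same `nn.MaxPool2d(3, padding=1, stride=2)` layer used in 5.4.2, would look like this:

```python
import torch
import torch.nn as nn

# Rebuild the two-channel input: channel 1 is channel 0 plus one.
X = torch.arange(16, dtype=torch.float).view(1, 1, 4, 4)
X = torch.cat((X, X + 1), dim=1)  # shape (1, 2, 4, 4)

# Assumed layer: same window, padding, and stride as in section 5.4.2.
pool = nn.MaxPool2d(3, padding=1, stride=2)
Y = pool(X)

print(Y.shape)  # torch.Size([1, 2, 2, 2])
# Each channel is pooled on its own, so the offset of one is preserved.
print(torch.equal(Y[:, 1], Y[:, 0] + 1))  # True
```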


Origin blog.csdn.net/piano9425/article/details/107199451