SPP: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

To handle input images of different sizes, this paper uses pooling layers to pool the convolutional feature maps down to a fixed size.

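For example, an SPP with three pyramid levels uses three pooling window sizes: the feature-map size divided by 1, 2 and 4 along each side. This gives grids of 1x1, 2x2 and 4x4, i.e. a fixed 1, 4 and 16 bins per channel, whatever the input size.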
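The window sizes can be worked out by hand. The snippet below is a small sketch of that arithmetic only (no pooling is performed); the helper name `pyramid_windows` and the 8x12 feature-map size are assumptions chosen for illustration.

```python
def pyramid_windows(h, w, num_levels=3):
    """Return (n, window_h, window_w, bins) for each pyramid level."""
    levels = []
    for i in range(num_levels):
        n = 2 ** i  # level i uses an n x n grid
        # window = feature-map size divided by n along each side
        levels.append((n, h // n, w // n, n * n))
    return levels


for n, kh, kw, bins in pyramid_windows(8, 12):
    print(f"{n}x{n} grid: window {kh}x{kw}, bins {bins}")
# 1x1 grid: window 8x12, bins 1
# 2x2 grid: window 4x6, bins 4
# 4x4 grid: window 2x3, bins 16
```

The window changes with the feature-map size, while the number of bins per level does not.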

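The pooled outputs of all levels are then flattened and concatenated into a one-dimensional feature of length 21 per channel (1 + 4 + 16). Feeding this into the fc layer solves the problem of inconsistent input sizes.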
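The length 21 can be checked by counting bins level by level. In the sketch below, the symbols $L$ (number of pyramid levels) and $c$ (number of channels) are introduced here for illustration and assume that level $i$ uses a $2^i \times 2^i$ grid.

```latex
\[
\text{length}
  = c \sum_{i=0}^{L-1} \left(2^{i}\right)^{2}
  = c \sum_{i=0}^{L-1} 4^{i}
  = c \,\frac{4^{L} - 1}{3},
\qquad
L = 3:\quad c\,(1 + 4 + 16) = 21\,c .
\]
```

So the fc layer sees $21c$ inputs regardless of the height and width of the feature map.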

import torch as th
import torch.nn as nn
import torch.nn.functional as F


class SPPLayer(nn.Module):

    def __init__(self, num_levels, pool_type='max_pool'):
        super(SPPLayer, self).__init__()

        self.num_levels = num_levels
        self.pool_type = pool_type

    def forward(self, x):
        bs, c, h, w = x.size()
        pooling_layers = []
        # num_levels is the number of pyramid levels; level i targets an n*n grid, n = 2 ** i
        for i in range(self.num_levels):
            # window size along height and width
            # (floor division: an exact n*n grid is guaranteed when h and w are divisible by n)
            kernel_size = (h // (2 ** i), w // (2 ** i))
            if self.pool_type == 'max_pool':
                # stride equals the window size, so windows do not overlap
                tensor = F.max_pool2d(x, kernel_size=kernel_size,
                                      stride=kernel_size).view(bs, -1)
            else:
                # use the same (height, width) window as the max-pool branch
                tensor = F.avg_pool2d(x, kernel_size=kernel_size,
                                      stride=kernel_size).view(bs, -1)
            pooling_layers.append(tensor)

        # concatenate all levels along the feature dimension
        x = th.cat(pooling_layers, dim=-1)
        return x
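
The layer above derives its window sizes by floor division, so an exact n*n grid is only guaranteed when h and w are divisible by n. One possible way around this, shown as a minimal sketch rather than the original author's or the paper's method, is to let PyTorch's adaptive pooling functions choose the windows. The class name `AdaptiveSPP` is hypothetical.

```python
import torch as th
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveSPP(nn.Module):
    """Hypothetical SPP variant built on adaptive pooling (not from the original post)."""

    def __init__(self, num_levels=3, pool_type='max_pool'):
        super().__init__()
        self.num_levels = num_levels
        self.pool_type = pool_type

    def forward(self, x):
        bs = x.size(0)
        if self.pool_type == 'max_pool':
            pool = F.adaptive_max_pool2d
        else:
            pool = F.adaptive_avg_pool2d
        outputs = []
        for i in range(self.num_levels):
            n = 2 ** i
            # adaptive pooling always returns an n x n grid per channel
            outputs.append(pool(x, (n, n)).reshape(bs, -1))
        return th.cat(outputs, dim=-1)


spp = AdaptiveSPP(num_levels=3)
for h, w in [(13, 13), (7, 10), (32, 24)]:
    feat = spp(th.randn(2, 8, h, w))
    # every input size yields shape (2, 168): 8 channels * 21 bins
    print((h, w), tuple(feat.shape))
```

With adaptive pooling the windows may overlap or differ slightly in size, which is the price paid for a fixed output length on arbitrary inputs.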

Reposted from blog.csdn.net/a362682954/article/details/85173693