[Attention] The simplest, easiest-to-implement attention mechanism in CV: the SE module

Squeeze-and-Excitation Networks

SENet is short for Squeeze-and-Excitation Networks. It won the ImageNet 2017 classification championship, and its effectiveness is well recognized. The idea behind the SE module is simple: it is easy to implement and can be conveniently plugged into existing network architectures. SENet mainly studies the correlations between channels and applies attention to select among them, adding only a small amount of computation in exchange for better results.

In terms of implementation, the module takes the feature map produced by a convolution and computes a one-dimensional vector, with one entry per channel, as an evaluation score for each channel. It then applies each score to the corresponding channel to produce the output. Only this one module is added on top of the original network. Below we implement this very simple module in PyTorch.

import torch
import torch.nn as nn


class SELayer(nn.Module):
    def __init__(self, channel, reduction=16):
        super(SELayer, self).__init__()
        # Squeeze: global average pooling collapses each channel to one value
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: a bottleneck MLP produces one weight per channel
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)   # (b, c, h, w) -> (b, c)
        y = self.fc(y).view(b, c, 1, 1)   # (b, c) -> (b, c, 1, 1)
        return x * y.expand_as(x)         # rescale each channel by its weight
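
A quick smoke test (a minimal sketch; the sizes below are arbitrary) confirms that the module preserves the input shape and only rescales the channels:

x = torch.randn(2, 64, 32, 32)          # batch of 2, 64 channels, 32x32 maps
se = SELayer(channel=64, reduction=16)
print(se(x).shape)                      # torch.Size([2, 64, 32, 32])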

Although the core code is no more than the above, we cannot simply stop here; the following points are worth examining:

  • Since attention is an important mechanism in this paper, how does the paper describe attention and organize the related work?

    The attention mechanism had already seen a fair amount of research and development, mostly focused on sequence learning, image captioning, and image understanding, and there was already plenty of good work exploring attention in those areas. This paper (SENet) explores improving the model's representational power by modeling the relationships between channels. Its related work is organized from three angles: network architectures, architecture search, and attention mechanisms, which is indeed quite comprehensive.

  • How should the SE module be interpreted?

    Squeeze: apply global average pooling to the C × H × W feature map, producing a feature map of size 1 × 1 × C. This feature map can be understood as having a global receptive field.

    Excitation: apply a nonlinear transformation to the Squeeze result using a small fully connected network.

    Feature recalibration: use the result of Excitation as per-channel weights and multiply them onto the input features.

  • How is the SE module added to classification networks, and how well does it work?

    Classification networks are now generally built block by block, so the SE module can be inserted at the end of each block to refine the information (a sketch of this placement follows below). The paper uses several state-of-the-art classification models such as ResNet50, ResNeXt50, and BN-Inception. Adding the SE module improves these models by 0.5-1.5%, a respectable gain, while the added computation is negligible. Experiments on the lightweight networks MobileNet and ShuffleNet show somewhat larger gains of roughly 1.5-2%.
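
    As a minimal sketch of this placement (a simplified residual block written for illustration, not the exact code of the paper or torchvision; stride and downsampling are omitted), the SE layer sits at the end of the block, just before the residual addition:

    class SEBasicBlock(nn.Module):
        def __init__(self, channel, reduction=16):
            super(SEBasicBlock, self).__init__()
            self.conv1 = nn.Conv2d(channel, channel, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channel)
            self.relu = nn.ReLU(inplace=True)
            self.conv2 = nn.Conv2d(channel, channel, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channel)
            self.se = SELayer(channel, reduction)  # refine at the end of the block

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            out = self.se(out)          # recalibrate channels
            return self.relu(out + x)  # identity shortcut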

  • How is the SE module added to object detection networks, and how well does it work?

    The SE module is mainly added to the backbone, refining what it learns. The object detection benchmark is MS COCO, with Faster R-CNN as the detector; replacing the ResNet50 backbone with SE-ResNet50 brings roughly a two-point AP improvement, so it is indeed effective.

  • How is the experimental section of the paper set up?

    The paper also runs ablation experiments to demonstrate the effectiveness of the SE module and to justify the setting reduction=16 (see the parameter-count sketch after the list below).

    • Squeeze method: only max and avg pooling were compared; avg turned out a bit better.
    • Excitation activation: ReLU, Tanh, and Sigmoid were tried; Sigmoid worked best.
    • Stage: ResNet50 has several stages; experiments show that applying SE to all stages works best.
    • Integration strategy: SE was placed before the residual unit, after it, or in parallel with it; in the end, placing it before worked best.
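
    The parameter cost behind the reduction choice is easy to estimate: each SE layer adds 2·C²/r weights from its two bias-free linear layers. A minimal sketch (the channel widths are ResNet50 stage widths, used here only for illustration):

    def se_extra_params(channel, reduction=16):
        # Two bias-free linear layers: C -> C/r and C/r -> C
        return 2 * channel * (channel // reduction)

    for c in [256, 512, 1024, 2048]:    # ResNet50 stage widths
        print(c, se_extra_params(c))    # e.g. 2048 -> 524288 extra weights
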
  • How can we inspect the attention information each channel has learned and verify its effectiveness?

    The author picks four ImageNet classes and runs an experiment inspecting the output of the last SE layer in the backbone, as shown in the figure below:

It can be seen that the activations for these classes differ to a certain extent, so the SE layer is indeed playing a role.
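
One way to reproduce this kind of inspection (a sketch based on the SELayer defined above, not the authors' exact tooling) is to register a forward hook on the fc branch, whose sigmoid output is exactly the per-channel attention weights:

attention = {}

def save_attention(module, inputs, output):
    attention['weights'] = output.detach()   # shape (b, c): one weight per channel

se = SELayer(channel=64)
se.fc.register_forward_hook(save_attention)
_ = se(torch.randn(1, 64, 32, 32))
print(attention['weights'].shape)            # torch.Size([1, 64])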

Clearly the SE module is quite useful: it is simple to implement, easy to integrate, and naturally works well for classification. For object detection, however, some people report that the results are not always satisfactory, and adding SE modules indiscriminately may not improve the model much. Adding SE tends to improve the detection of large objects but can hurt the detection of small objects, so it is worth running experiments before deciding whether to use the module.
