SE-net study notes

Squeeze-and-Excitation Networks (2019)

Abstract

The conventional convolution operation lets a network fuse spatial and channel-wise information within local receptive fields to build informative features. Numerous studies seek to strengthen a network's representational power by improving the quality of its spatial encodings. In this paper the authors focus instead on the relationship between channels and propose a new architectural unit, the SE block, which adaptively recalibrates channel-wise feature responses by explicitly modelling inter-channel dependencies. The design generalises well across architectures and tasks.

Introduction

[Figure: structure of the SE building block]

Deeper architectures

Increasing network depth (VGGNets, Inception); making the transformation more expressive with grouped convolutions.

Using algorithms to automatically search for or evolve the network structure.

Attention and gating mechanisms

Attention can be interpreted as a means of biasing the allocation of available computational resources towards the most informative components

A trunk-and-mask attention mechanism had previously been proposed that applies attention along both the spatial and channel dimensions; in contrast, this paper proposes SENet, a more lightweight channel-attention design.

Squeeze-and-excitation Blocks

Squeeze: Global Information Embedding

Since each output unit of a convolution only reflects information within its local receptive field and cannot exploit contextual information outside that region, the squeeze operation mitigates this by compressing global spatial information into a per-channel descriptor via global average pooling:

\[ z_c=F_{sq}(u_c)=\frac {1}{H\times W} \sum_{i=1}^{H} \sum _{j=1}^{W} u_c(i,j) \]
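
A minimal sketch of this squeeze step in PyTorch (the tensor shape and variable names are illustrative assumptions, not taken from the paper's code):

```python
import torch

# u: feature map of shape [N, C, H, W] produced by the preceding convolution
u = torch.randn(2, 64, 32, 32)

# Squeeze: global average pooling over the spatial dimensions H and W,
# producing one descriptor z_c per channel (shape [N, C])
z = u.mean(dim=(2, 3))
print(z.shape)  # torch.Size([2, 64])
```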

Excitation: Adaptive Recalibration

In order to fully capture channel-wise dependencies, the excitation function must satisfy two criteria:

  1. flexible (in particular, it must be capable of learning a nonlinear interaction between channels)
  2. learn a non-mutually-exclusive relationship (we would like to ensure that multiple channels are allowed to be emphasised rather than enforcing a one-hot activation)

Therefore, a simple gating mechanism with a sigmoid activation is used:

\[ s=F_{ex}(z,W)=\sigma(g(z,W))=\sigma(W_2\delta(W_1z)) \]

where $\delta$ is the ReLU function, $W_1\in \mathbb{R}^{\frac{C}{r}\times C}$, and $W_2\in \mathbb{R}^{C\times\frac{C}{r}}$.

In order to limit model complexity and aid generalisation, the gating mechanism is parameterised as a bottleneck: two fully connected layers around a non-linearity, in which the first layer reduces the channel dimension by a reduction ratio $r$ and the second restores it. The final output of the block is obtained by rescaling $u_c$ with the activation $s_c$:

\[ \widetilde x_c=F_{scale}(u_c,s_c)=s_cu_c \]

\[ \widetilde X=[\widetilde x_1,\widetilde x_2,\dots,\widetilde x_C] \]
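
Combining the squeeze, excitation and rescaling steps, a minimal PyTorch sketch of an SE block could look like the following (the class name, default reduction ratio and usage are illustrative assumptions, not the paper's released code):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation block: recalibrates channel-wise responses."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Bottleneck gating: FC (C -> C/r) -> ReLU -> FC (C/r -> C) -> sigmoid
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = u.shape
        # Squeeze: global average pooling over H and W -> z of shape [N, C]
        z = u.mean(dim=(2, 3))
        # Excitation: s = sigmoid(W2 * relu(W1 * z))
        s = torch.sigmoid(self.fc2(torch.relu(self.fc1(z))))
        # Scale: reweight each channel of u by its activation s_c
        return u * s.view(n, c, 1, 1)

x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```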

Instantiations

The SE block can be integrated into standard architectures (inserted after the non-linearity that follows a convolution layer), and it can also be applied to transformations beyond the standard convolution, for example the residual branch of a ResNet module (SE-ResNet).
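
As an illustration of such an insertion, the sketch below places the SE block on the residual branch of a simplified residual block, before the identity addition (it reuses the hypothetical `SEBlock` from the sketch above, and the layer layout is simplified compared with a real ResNet block):

```python
import torch
import torch.nn as nn

class SEBasicBlock(nn.Module):
    """Simplified residual block with an SE block on the residual branch."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.se = SEBlock(channels, reduction)  # SEBlock from the sketch above

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.se(out)           # recalibrate channels before the addition
        return torch.relu(out + x)   # identity shortcut
```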

Model and Computational Complexity

The number of additional parameters introduced is $\frac{2}{r}\sum_{s=1}^{S} N_s \cdot C_s^2$, where $S$ is the number of stages, $N_s$ is the number of repeated blocks in stage $s$, and $C_s$ is the output channel dimension of stage $s$. This amounts to roughly a 10% relative increase in parameters over the base network, which is a more efficient way to gain accuracy than increasing network depth.
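
A quick illustrative check of this formula (the ResNet-50 stage configuration used below is an assumption for the example, not stated in these notes):

```python
def se_extra_params(stages, r=16):
    """Extra SE parameters: (2 / r) * sum over stages of N_s * C_s^2.

    stages: list of (N_s, C_s) pairs (blocks per stage, channel dimension).
    """
    return (2 / r) * sum(n * c ** 2 for n, c in stages)

# ResNet-50 stage configuration: (blocks per stage, output channel dimension)
resnet50_stages = [(3, 256), (4, 512), (6, 1024), (3, 2048)]
print(se_extra_params(resnet50_stages))  # 2514944.0, i.e. ~2.5 million
```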

Glossary

  1. receptive field

    The region of the original input image that a pixel on a given layer's feature map corresponds to, i.e. the area of the input that can influence that output unit.

  2. ablation experiment

    An experiment in which components of a model are removed one at a time to measure each component's contribution to overall performance.
