Compared with AlexNet and VGG, GoogLeNet (Inception-v1) uses multiple parallel branches in each module and introduces 1×1 convolutions to reduce the amount of computation.
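A minimal PyTorch sketch of the multi-branch idea (channel sizes are illustrative, not GoogLeNet's exact configuration): each branch uses a 1×1 convolution to cut channels before the expensive 3×3/5×5 filter, and the branch outputs are concatenated.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Inception-v1 style module: four parallel branches, with 1x1
    convolutions reducing channels before the 3x3/5x5 convolutions."""
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 64, 1)                       # 1x1 branch
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, 96, 1),        # 1x1 reduce
                                nn.Conv2d(96, 128, 3, padding=1))
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, 16, 1),        # 1x1 reduce
                                nn.Conv2d(16, 32, 5, padding=2))
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, 32, 1))        # pool projection
    def forward(self, x):
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
```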
Inception-v2
introduces Batch Normalization (BN); replaces each 5×5 convolution with two stacked 3×3 convolutions
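A sketch of the factorization, with illustrative channel counts: two 3×3 conv-BN-ReLU stages cover the same 5×5 receptive field with fewer parameters (2·9 vs 25 weights per in/out channel pair).

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, k, **kw):
    # Conv -> BatchNorm -> ReLU, the basic Inception-v2 unit
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, k, bias=False, **kw),
                         nn.BatchNorm2d(out_ch),
                         nn.ReLU(inplace=True))

# Two stacked 3x3 convolutions replacing one 5x5 convolution.
five_by_five_equiv = nn.Sequential(conv_bn_relu(64, 64, 3, padding=1),
                                   conv_bn_relu(64, 64, 3, padding=1))
```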
Inception-v3
asymmetric convolution (an n×n convolution is factored into a 1×n convolution followed by an n×1 convolution);
efficient grid-size reduction (to avoid information loss without increasing computation, the serial conv-then-pool is replaced by parallel conv and pool branches whose outputs are concatenated);
label smoothing
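Sketches of the three tricks; channel counts are illustrative, and the label-smoothing argument requires PyTorch ≥ 1.10.

```python
import torch.nn as nn

# Asymmetric convolution: a 7x7 conv factored into 1x7 followed by 7x1,
# covering the same receptive field with fewer parameters.
asym = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 7), padding=(0, 3)),
    nn.Conv2d(64, 64, kernel_size=(7, 1), padding=(3, 0)),
)

# Efficient grid-size reduction: stride-2 conv and pooling run in parallel
# and their outputs are concatenated, instead of conv followed by pooling.
reduce_conv = nn.Conv2d(64, 64, 3, stride=2)
reduce_pool = nn.MaxPool2d(3, stride=2)
# forward: torch.cat([reduce_conv(x), reduce_pool(x)], dim=1)

# Label smoothing (built into PyTorch >= 1.10).
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
```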
Inception-v4 / Inception-ResNet
introduces ResNet's shortcut idea (the variants with residual connections are called Inception-ResNet)
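A hedged sketch of the residual idea: an Inception-style branch whose output is scaled down and added back to the input. The inner branch here is a placeholder, not the paper's exact module; the 0.1 scaling follows the stabilization trick described in the Inception-ResNet paper.

```python
import torch.nn as nn

class InceptionResBlock(nn.Module):
    """Inception branch combined with a ResNet shortcut: the branch output
    is scaled and summed with the identity path."""
    def __init__(self, ch, scale=0.1):
        super().__init__()
        self.branch = nn.Sequential(nn.Conv2d(ch, ch // 4, 1),
                                    nn.ReLU(inplace=True),
                                    nn.Conv2d(ch // 4, ch, 3, padding=1))
        self.scale = scale
    def forward(self, x):
        return x + self.scale * self.branch(x)
```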
Xception
Separable Convolution
a normal conv (3×3, 256 output channels) is replaced by
a pointwise conv (1×1, 256) followed by
a depthwise conv (3×3, one filter per channel); note the pointwise-first order, the reverse of MobileNet's
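A minimal sketch of the Xception-ordered separable convolution (pointwise first, then depthwise):

```python
import torch.nn as nn

class XceptionSeparable(nn.Module):
    """Xception-style separable conv: 1x1 pointwise conv mixes channels,
    then a 3x3 depthwise conv (groups == channels) filters spatially."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
        self.depthwise = nn.Conv2d(out_ch, out_ch, 3, padding=1, groups=out_ch)
    def forward(self, x):
        return self.depthwise(self.pointwise(x))
```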
ResNeXt
introduces a new dimension: cardinality.
ResNet bottleneck: 256-d in → (256, 1×1, 64) → (64, 3×3, 64) → (64, 1×1, 256) → sum with x → 256-d out
→ changed to →
256-d in → 32 parallel paths of (256, 1×1, 4) → (4, 3×3, 4) → (4, 1×1, 256) → merge the 32 paths → sum with x → 256-d out (the 32 paths share the same topology, which makes the block equivalent to a grouped convolution)
This equivalent reformulation only holds when the block depth is ≥ 3.
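A sketch of the bottleneck in its grouped-convolution form (cardinality 32, width 4 per path, so the 3×3 runs on 32 × 4 = 128 channels):

```python
import torch.nn as nn

class ResNeXtBlock(nn.Module):
    """ResNeXt bottleneck: with groups=32 the single 3x3 convolution is
    exactly the 32-path split-transform-merge written as one layer."""
    def __init__(self, ch=256, cardinality=32, width=4):
        super().__init__()
        mid = cardinality * width  # 128
        self.net = nn.Sequential(
            nn.Conv2d(ch, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, groups=cardinality, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, ch, 1, bias=False), nn.BatchNorm2d(ch),
        )
        self.relu = nn.ReLU(inplace=True)
    def forward(self, x):
        return self.relu(x + self.net(x))
```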
PreAct ResNet
conv → BN → ReLU → conv → BN → sum with x → ReLU → changed to → BN → ReLU → conv → BN → ReLU → conv → sum with x (no ReLU after the addition, so the identity path stays clean)
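A minimal pre-activation block sketch:

```python
import torch.nn as nn

class PreActBlock(nn.Module):
    """Pre-activation residual block: BN-ReLU come before each conv,
    and nothing is applied after the addition."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
        )
    def forward(self, x):
        return x + self.body(x)  # no ReLU after the sum
```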
SENet
x (c×h×w) → global average pooling (c×1×1) → FC (c/16×1×1) → ReLU → FC (c×1×1) → sigmoid (c×1×1) → scale: multiply x channel-wise → (c×h×w)
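A sketch of the SE block exactly as the pipeline above describes, with reduction ratio 16:

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global pooling -> FC (c/16) -> ReLU ->
    FC (c) -> sigmoid, then channel-wise rescaling of the input."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid(),
        )
    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # per-channel attention weights scale the input
```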
MobileNet V1
Depthwise Separable Convolution
a normal conv (e.g. 3×3, 256 output channels) is factored into
a depthwise convolution (3×3, one filter per input channel) followed by
a pointwise convolution (1×1, 256)
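A sketch of the V1 unit, with the BN + ReLU that the paper places after each stage:

```python
import torch.nn as nn

def depthwise_separable(in_ch, out_ch):
    """MobileNet V1 unit: 3x3 depthwise conv (one filter per input channel)
    followed by a 1x1 pointwise conv that mixes channels."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1, bias=False),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )
```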
MobileNet V2
Depthwise Separable Convolution, same as V1.
Improvement: the inverted residual with linear bottleneck.
Residual block (1×1 → 3×3 → 1×1: channels are first compressed, then expanded)
Inverted residual (1×1 → 3×3 → 1×1: channels are first expanded, then compressed, because depthwise convolution cannot change the number of channels, and feature extraction works poorly in low-dimensional space)
Linear bottleneck (the ReLU after the last pointwise conv is removed: an activation adds useful nonlinearity in high-dimensional space but destroys features in low-dimensional space, and the main function of the second pointwise conv is to reduce dimensionality)
v1: in → DW 3×3 → ReLU → PW 1×1 → ReLU → out
v2: in → PW 1×1 → ReLU6 → DW 3×3 → ReLU6 → PW 1×1 (linear) → out
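A sketch of the inverted residual (expansion factor 6, stride 1; the shortcut is only valid when stride is 1 and input/output channels match):

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MobileNet V2 inverted residual with linear bottleneck:
    1x1 expand -> ReLU6 -> 3x3 depthwise -> ReLU6 -> 1x1 project (linear)."""
    def __init__(self, ch, expand=6):
        super().__init__()
        mid = ch * expand
        self.body = nn.Sequential(
            nn.Conv2d(ch, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.ReLU6(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, groups=mid, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU6(inplace=True),
            nn.Conv2d(mid, ch, 1, bias=False), nn.BatchNorm2d(ch),  # linear: no ReLU
        )
    def forward(self, x):
        return x + self.body(x)
```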
MobileNet V3
introduces a lightweight attention module based on the squeeze-and-excitation structure;
optimizes the activation function (h-swish);
searches the architecture with NAS
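A sketch of h-swish, the cheap swish approximation V3 uses (recent PyTorch also ships it as nn.Hardswish):

```python
import torch.nn as nn

class HSwish(nn.Module):
    """h-swish: x * ReLU6(x + 3) / 6, a piecewise-linear swish substitute
    that avoids the sigmoid and is friendly to quantization."""
    def __init__(self):
        super().__init__()
        self.relu6 = nn.ReLU6(inplace=True)
    def forward(self, x):
        return x * self.relu6(x + 3.0) / 6.0
```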
DPN
viewed through a Higher Order RNN (HORNN) framework, it combines ResNeXt's residual path with DenseNet's densely connected path
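A rough sketch of the dual-path idea, not the paper's exact block: the output is split so that part of it is added to the residual path and the rest is concatenated onto the dense path. All names and channel sizes here are illustrative.

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    """DPN-style block: a shared grouped-conv bottleneck whose output feeds
    both a ResNeXt-style additive path and a DenseNet-style concat path."""
    def __init__(self, res_ch=256, dense_in=64, dense_growth=16, cardinality=32):
        super().__init__()
        in_ch, mid = res_ch + dense_in, 128
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, groups=cardinality, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, res_ch + dense_growth, 1, bias=False),
        )
        self.res_ch = res_ch
    def forward(self, res, dense):
        out = self.body(torch.cat([res, dense], dim=1))
        res = res + out[:, :self.res_ch]                          # residual path: add
        dense = torch.cat([dense, out[:, self.res_ch:]], dim=1)   # dense path: concat
        return res, dense
```

Note that the dense path grows by `dense_growth` channels at every block, while the residual path keeps a fixed width.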
ShuffleNet V1
Channel Shuffle for group convolutions: permutes channels after a group convolution so that information can flow across groups
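A sketch of the shuffle itself: reshape to (batch, groups, channels/groups, h, w), swap the two channel axes, and flatten back.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Channel shuffle: interleave channels from different groups so the
    next group convolution sees inputs from all groups."""
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)
```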
ShuffleNet V2
using FLOPs as the only efficiency metric is not comprehensive; one overlooked factor is MAC (memory access cost). Four design guidelines:
use "balanced" convolutional layers (equal input and output channels);
use group convolution carefully;
reduce fragmented operations;
reduce element-wise operations.
The 1×1 group convolution is abandoned. Channel Split: the feature map is split into two groups, A and B.
Group A is passed through as a shortcut; group B goes through a bottleneck whose input and output channels are equal;
finally A and B are concatenated
and a Channel Shuffle is applied (see the sketch below)
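A sketch of the stride-1 unit, reusing the channel_shuffle helper from the ShuffleNet V1 sketch above:

```python
import torch
import torch.nn as nn

class ShuffleV2Unit(nn.Module):
    """Stride-1 ShuffleNet V2 unit: channel split -> branch B runs through
    1x1 conv, 3x3 depthwise, 1x1 conv (equal in/out channels, no groups) ->
    concat with untouched branch A -> channel shuffle."""
    def __init__(self, ch):
        super().__init__()
        half = ch // 2
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False), nn.BatchNorm2d(half),
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        a, b = x.chunk(2, dim=1)           # channel split
        out = torch.cat([a, self.branch(b)], dim=1)
        return channel_shuffle(out, 2)     # mix the two branches
```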