【MobileNet】MobileNet V1

Document name: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Download address: https://arxiv.org/pdf/1704.04861.pdf


1. Introduction

1. MobileNets is based on a streamlined architecture: data flows straight from input to output, as in VGG, without the cross-layer (skip) connections of ResNet or DenseNet.

2. It uses depthwise separable convolutions to build a lightweight deep neural network, which greatly reduces the network's computation at the cost of a small loss in accuracy, making it suitable for mobile and embedded applications with limited computing power.

3. The network introduces two simple global hyperparameters, the width multiplier and the resolution multiplier, which trade off latency against accuracy. These hyperparameters let model builders choose the right size of model for their application based on the constraints of the problem.
  • α (Width Multiplier)
  • β (Resolution Multiplier)

These points are the focus of MobileNet; each is explained in detail below.


2. Depthwise Separable Convolution

MobileNet is built on depthwise separable convolutions, a form of factorized convolution that decomposes a standard convolution into a depthwise convolution and a pointwise convolution (1×1 conv).
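
Concretely, the factorization replaces one set of full filters with per-channel filters plus 1×1 channel-mixing filters. A quick look at the weight shapes in PyTorch makes this visible (the channel counts here are illustrative, not from the paper):

```python
import torch.nn as nn

M, N = 16, 32  # illustrative input/output channel counts

std = nn.Conv2d(M, N, kernel_size=3, padding=1)            # standard conv
dw  = nn.Conv2d(M, M, kernel_size=3, padding=1, groups=M)  # depthwise conv
pw  = nn.Conv2d(M, N, kernel_size=1)                       # pointwise conv

print(std.weight.shape)  # torch.Size([32, 16, 3, 3])
print(dw.weight.shape)   # torch.Size([16, 1, 3, 3]) -- one filter per channel
print(pw.weight.shape)   # torch.Size([32, 16, 1, 1])
```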

1) Advantages of Depthwise Separable Convolution

This factorization greatly reduces computation and model size.

Suppose the input feature map has height and width $D_F$ and $M$ channels; each filter has height and width $D_K$ and also $M$ channels, and there are $N$ filters. With padding = same and stride = 1, the output feature map has the same spatial size as the input, i.e. $(D_F, D_F)$.

The computation cost of one standard convolution is: $D_F * D_F * M * N * D_K * D_K$


To produce an output of the same size, the computation cost of a depthwise separable convolution is: $D_F * D_F * M * D_K * D_K + M * N * D_F * D_F$, where:

  • $D_F * D_F * M * D_K * D_K$ is the computation cost of the depthwise convolution part
  • $M * N * D_F * D_F$ is the computation cost of the pointwise convolution part


Dividing the computation cost of the two methods (MobileNet mostly uses 3×3 depthwise convolutions, i.e. $D_K = 3$ and $D_K^2 = 9$):

$$\frac{\text{cost of depthwise separable convolution}}{\text{cost of standard convolution}} = \frac{D_F * D_F * M * D_K * D_K + M * N * D_F * D_F}{D_F * D_F * M * N * D_K * D_K} = \frac{1}{N} + \frac{1}{D_K^2} = \frac{1}{N} + \frac{1}{9}$$

Therefore, since $N$ is usually large, the computation cost of a depthwise separable convolution is roughly 1/8 to 1/9 that of a standard convolution, so the model is much faster. The factorization does cost some accuracy, but the drop is small.
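
As a quick sanity check of the ratio above, the two costs can be computed directly; the layer sizes below are illustrative, not from the paper:

```python
# Illustrative layer sizes: 56x56 feature map, 128 input channels,
# 256 filters, 3x3 kernel.
D_F, M, N, D_K = 56, 128, 256, 3

standard_cost = D_F * D_F * M * N * D_K * D_K
separable_cost = D_F * D_F * M * D_K * D_K + M * N * D_F * D_F

print(separable_cost / standard_cost)  # ~0.115
print(1 / N + 1 / D_K**2)              # same value: 1/256 + 1/9 ≈ 0.115
```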


2) Depthwise Separable Convolution network structure

  • On the left is the standard convolution block: a 3×3 convolution followed by a BN layer and a ReLU layer.
  • On the right is the depthwise separable convolution block: first a depthwise convolution layer followed by a BN layer and a ReLU layer, then a pointwise convolution layer followed by a BN layer and a ReLU layer (sketched in code below).

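
A minimal PyTorch sketch of the right-hand block, assuming 3×3 depthwise filters with "same" padding; the class name and default arguments are illustrative, not from the paper:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv + BN + ReLU, then pointwise conv + BN + ReLU."""
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_channels).
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        self.bn1 = nn.BatchNorm2d(in_channels)
        # Pointwise: 1x1 convolution that mixes channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        x = self.relu(self.bn2(self.pointwise(x)))
        return x

# Quick shape check with an illustrative input.
print(DepthwiseSeparableConv(32, 64)(torch.randn(1, 32, 112, 112)).shape)
# torch.Size([1, 64, 112, 112])
```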


3) Implementing depthwise convolution in PyTorch

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

The above is PyTorch's convolution function. To perform a depthwise convolution, the key is the groups parameter:

  • In a standard convolution, groups=1: all input channels are convolved together as a single group.
  • When groups=in_channels, each input channel is treated as its own group and convolved separately, and the group outputs are concatenated, so the number of output channels equals the number of input channels. For example:
nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=0, groups=in_channels, bias=False)
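
A quick check of the resulting shapes and parameter count (the tensor sizes here are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 32, 32)  # batch 1, 16 channels, 32x32 input
dw = nn.Conv2d(16, 16, kernel_size=3, stride=1, padding=1,
               groups=16, bias=False)  # one 3x3 filter per input channel

print(dw(x).shape)                              # torch.Size([1, 16, 32, 32])
print(sum(p.numel() for p in dw.parameters()))  # 16 * 3 * 3 = 144 weights
```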

3. MobileNet network structure

[Figure: MobileNet V1 network architecture]

PyTorch implementation of the MobileNet architecture (baseline)

PyTorch model implementation: github address
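
The linked repository is not reproduced here; as a sketch, assuming the DepthwiseSeparableConv block from section 2 is in scope, the MobileNet V1 body can be stacked as follows. The (out_channels, stride) settings follow the architecture table in the paper; the class and variable names are illustrative:

```python
import torch
import torch.nn as nn

class MobileNetV1(nn.Module):
    # (out_channels, stride) for each depthwise separable block,
    # following the architecture table in the paper.
    cfg = [(64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2),
           (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
           (1024, 2), (1024, 1)]

    def __init__(self, num_classes=1000):
        super().__init__()
        # Stem: a standard 3x3 convolution with stride 2.
        layers = [nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1, bias=False),
                  nn.BatchNorm2d(32),
                  nn.ReLU(inplace=True)]
        in_ch = 32
        for out_ch, stride in self.cfg:
            # DepthwiseSeparableConv is the block sketched in section 2.
            layers.append(DepthwiseSeparableConv(in_ch, out_ch, stride))
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.features(x)         # (B, 1024, 7, 7) for a 224x224 input
        x = self.pool(x).flatten(1)  # (B, 1024)
        return self.fc(x)

# Usage: logits of shape (1, 1000) for a 224x224 RGB input.
print(MobileNetV1()(torch.randn(1, 3, 224, 224)).shape)
```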


4. Hyperparameter α: Width Multiplier

Although the MobileNet architecture is already small and low-latency, a specific scenario or application may require the model to be even smaller and faster. To build these smaller, computationally cheaper models, the authors introduce a very simple parameter: the width multiplier $\alpha$.

The width multiplier $\alpha$ thins the network uniformly at every layer, i.e. each layer's channel count is scaled by $\alpha$. For example, if a layer previously had $M$ input channels and $N$ output channels, then with the width multiplier it has $\alpha M$ input channels and $\alpha N$ output channels. The number of parameters is then roughly $\alpha^2$ times the original, and the computation cost becomes: $D_F * D_F * \alpha M * D_K * D_K + \alpha M * \alpha N * D_F * D_F$

Typical values of $\alpha$ are 1, 0.75, 0.5, and 0.25.
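
In code, applying the width multiplier just means scaling every layer's channel count before building the network. A minimal sketch, where the scale_channels helper and its rounding rule are assumptions (real implementations often also round to a multiple of 8 for hardware efficiency):

```python
def scale_channels(channels: int, alpha: float) -> int:
    # Scale a layer's channel count by the width multiplier; keep at least 1.
    return max(1, int(channels * alpha))

# Excerpt of (out_channels, stride) settings from the architecture above.
cfg = [(64, 1), (128, 2), (128, 1), (256, 2)]
alpha = 0.5
thin_cfg = [(scale_channels(c, alpha), s) for c, s in cfg]
print(thin_cfg)  # [(32, 1), (64, 2), (64, 1), (128, 2)]
```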


5. Hyperparameter β: Resolution Multiplier

The resolution multiplier $\beta$ is used to reduce the input image size; $\beta$ is usually chosen so that the input shrinks to (a multiple of 32): 224, 192, 160, or 128.
The computation cost then becomes about $\beta^2$ times the original (the parameter count is unchanged, since $\beta$ only changes the input resolution): $\beta D_F * \beta D_F * M * D_K * D_K + M * N * \beta D_F * \beta D_F$
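
A small sketch of the effect: the resolution multiplier only changes the input size, and the cost scales with $\beta^2$ (the sizes below are the standard ones listed above):

```python
base = 224
for size in (224, 192, 160, 128):
    beta = size / base
    # FLOPs scale with the square of the resolution multiplier.
    print(f"{size}x{size}: beta = {beta:.3f}, relative cost ≈ {beta**2:.2f}")
```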
