Learning and Implementing MobileNets


The paper *MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications* introduces MobileNets, a class of lightweight deep neural networks built on a streamlined architecture that uses depthwise separable convolutions. It also introduces two simple global hyperparameters that efficiently trade off computational cost against accuracy.
The network has 28 layers; 1.0 MobileNet-224 matches the accuracy of GoogLeNet and VGG16 with only 4.2M parameters.

The MobileNet architecture

Computational cost

Cost of standard convolution

A standard convolution takes a $D_F \times D_F \times M$ feature map and convolves it with $N$ kernels of size $D_K \times D_K \times M$ to produce a $D_F \times D_F \times N$ feature map.
To keep the output the same spatial size as the input, appropriate padding is applied. With stride 1, each kernel slides over $D_F \times D_F$ positions, so a single kernel costs $D_K \cdot D_K \cdot M \cdot D_F \cdot D_F$ multiply-adds, and $N$ kernels cost $D_K \cdot D_K \cdot M \cdot D_F \cdot D_F \cdot N$.

Depthwise separable convolution

A depthwise separable convolution factors this into two steps. Given a $D_F \times D_F \times M$ feature map, first apply $M$ depthwise kernels of size $D_K \times D_K \times 1$, one per input channel, with stride 1 and appropriate padding, producing an intermediate $D_F \times D_F \times M$ feature map. Then apply $N$ pointwise kernels of size $1 \times 1 \times M$ as a standard convolution to obtain the $D_F \times D_F \times N$ output.

Cost of the depthwise convolution:
$D_K \cdot D_K \cdot M \cdot D_F \cdot D_F$

Cost of the pointwise convolution:
$M \cdot N \cdot D_F \cdot D_F$

Total cost:
$D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$

Cost ratio

$$\frac{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F}{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F \cdot N} = \frac{1}{N} + \frac{1}{D_K^2}$$

MobileNets use $3 \times 3$ depthwise separable convolutions, so $1/D_K^2 = 1/9$ and computation drops to roughly 1/8 to 1/9 of a standard convolution, with only a small loss in accuracy.
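
To sanity-check this ratio, here is a minimal sketch in plain Python, using a hypothetical layer of size $D_F = 14$, $M = N = 512$, $D_K = 3$ (the numbers are illustrative, not taken from the paper):

def standard_conv_cost(d_f, m, n, d_k):
    # D_K * D_K * M * D_F * D_F * N multiply-adds
    return d_k * d_k * m * d_f * d_f * n

def separable_conv_cost(d_f, m, n, d_k):
    # depthwise: D_K * D_K * M * D_F * D_F, pointwise: M * N * D_F * D_F
    return d_k * d_k * m * d_f * d_f + m * n * d_f * d_f

std = standard_conv_cost(14, 512, 512, 3)
sep = separable_conv_cost(14, 512, 512, 3)
print(sep / std)             # ~0.113, i.e. about 1/9
print(1 / 512 + 1 / 3 ** 2)  # identical: 1/N + 1/D_K^2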

Two hyperparameters

Two global hyperparameters, the width multiplier α and the resolution multiplier ρ, trade off the model's computational cost against its accuracy.

Width multiplier α

For a given layer and width multiplier α, the number of input channels M becomes αM and the number of output channels N becomes αN.

The cost becomes $D_K \cdot D_K \cdot \alpha M \cdot D_F \cdot D_F + \alpha M \cdot \alpha N \cdot D_F \cdot D_F$, so computation and parameter count shrink roughly quadratically with α.

Resolution multiplier ρ

In practice, ρ is set implicitly by choosing the input resolution.

The cost becomes $D_K \cdot D_K \cdot \alpha M \cdot \rho D_F \cdot \rho D_F + \alpha M \cdot \alpha N \cdot \rho D_F \cdot \rho D_F$, i.e. computation also shrinks roughly quadratically with ρ.
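
Extending the cost sketch above to include both multipliers (same hypothetical layer sizes as before) shows the roughly quadratic effect of each:

def separable_cost(d_f, m, n, d_k, alpha=1.0, rho=1.0):
    # D_K * D_K * aM * rD_F * rD_F + aM * aN * rD_F * rD_F
    m, n = int(alpha * m), int(alpha * n)
    d_f = int(rho * d_f)
    return d_k * d_k * m * d_f * d_f + m * n * d_f * d_f

base = separable_cost(14, 512, 512, 3)
print(separable_cost(14, 512, 512, 3, alpha=0.5) / base)  # ~0.25
print(separable_cost(14, 512, 512, 3, rho=0.5) / base)    # exactly 0.25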

Architecture

Standard convolution vs. depthwise separable convolution

[Figure: a standard convolutional layer (3×3 Conv → BN → ReLU) compared with a depthwise separable layer (3×3 depthwise Conv → BN → ReLU, then 1×1 Conv → BN → ReLU)]

MobileNets architecture

[Table: the full MobileNet body architecture, layer by layer]

Implementation

The following code is adapted from keras_applications/mobilenet.py.

Depthwise separable convolution block

from tensorflow.keras import backend, layers


def _depthwise_conv_block(inputs, pointwise_conv_filters, alpha,
                          depth_multiplier=1, strides=(1, 1), block_id=1):
    channel_axis = 1 if backend.image_data_format() == 'channels_first' else -1
    # The width multiplier alpha thins the pointwise (output) channels.
    pointwise_conv_filters = int(pointwise_conv_filters * alpha)

    # Strided blocks pad explicitly and convolve with 'valid' so the
    # output size matches the paper's downsampling scheme.
    if strides == (1, 1):
        x = inputs
    else:
        x = layers.ZeroPadding2D(((0, 1), (0, 1)),
                                 name='conv_pad_%d' % block_id)(inputs)

    # Depthwise 3x3 convolution: one D_K x D_K filter per input channel.
    x = layers.DepthwiseConv2D((3, 3),
                               padding='same' if strides == (1, 1) else 'valid',
                               depth_multiplier=depth_multiplier,
                               strides=strides,
                               use_bias=False,
                               name='conv_dw_%d' % block_id)(x)
    x = layers.BatchNormalization(
        axis=channel_axis, name='conv_dw_%d_bn' % block_id)(x)
    x = layers.ReLU(6., name='conv_dw_%d_relu' % block_id)(x)

    # Pointwise 1x1 convolution: mixes the M channels into N outputs.
    x = layers.Conv2D(pointwise_conv_filters, (1, 1),
                      padding='same',
                      use_bias=False,
                      strides=(1, 1),
                      name='conv_pw_%d' % block_id)(x)
    x = layers.BatchNormalization(axis=channel_axis,
                                  name='conv_pw_%d_bn' % block_id)(x)
    return layers.ReLU(6., name='conv_pw_%d_relu' % block_id)(x)
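
The model body below also calls `_conv_block`, the standard-convolution stem that the excerpt omits. Here is a minimal sketch in the spirit of the keras_applications version, reusing the imports above; treat it as an approximation rather than a verbatim copy:

def _conv_block(inputs, filters, alpha, kernel=(3, 3), strides=(1, 1)):
    channel_axis = 1 if backend.image_data_format() == 'channels_first' else -1
    filters = int(filters * alpha)  # the width multiplier thins the stem, too
    x = layers.ZeroPadding2D(((0, 1), (0, 1)), name='conv1_pad')(inputs)
    x = layers.Conv2D(filters, kernel,
                      padding='valid',
                      use_bias=False,
                      strides=strides,
                      name='conv1')(x)
    x = layers.BatchNormalization(axis=channel_axis, name='conv1_bn')(x)
    return layers.ReLU(6., name='conv1_relu')(x)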

MobileNets implementation

# Stem: one standard convolution, then 13 depthwise separable blocks.
# All downsampling is done with stride-2 convolutions.
x = _conv_block(img_input, 32, alpha, strides=(2, 2))
x = _depthwise_conv_block(x, 64, alpha, depth_multiplier, block_id=1)

x = _depthwise_conv_block(x, 128, alpha, depth_multiplier,
                          strides=(2, 2), block_id=2)
x = _depthwise_conv_block(x, 128, alpha, depth_multiplier, block_id=3)

x = _depthwise_conv_block(x, 256, alpha, depth_multiplier,
                          strides=(2, 2), block_id=4)
x = _depthwise_conv_block(x, 256, alpha, depth_multiplier, block_id=5)

x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,
                          strides=(2, 2), block_id=6)
x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=7)
x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=8)
x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=9)
x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=10)
x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=11)

x = _depthwise_conv_block(x, 1024, alpha, depth_multiplier,
                          strides=(2, 2), block_id=12)
x = _depthwise_conv_block(x, 1024, alpha, depth_multiplier, block_id=13)

# Classifier head: global average pooling, then a 1x1 convolution acting
# as the fully connected layer. This reshape assumes channels_first data;
# for channels_last use shape = (1, 1, int(1024 * alpha)).
shape = (int(1024 * alpha), 1, 1)

x = layers.GlobalAveragePooling2D()(x)
x = layers.Reshape(shape, name='reshape_1')(x)
x = layers.Dropout(dropout, name='dropout')(x)
x = layers.Conv2D(classes, (1, 1),
                  padding='same',
                  name='conv_preds')(x)
x = layers.Activation('softmax', name='act_softmax')(x)
x = layers.Reshape((classes,), name='reshape_2')(x)
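
To turn the excerpt into a usable model, one can wrap it as below. This is a hedged usage sketch: `alpha`, `depth_multiplier`, `dropout`, `classes`, and the 224×224 input are illustrative choices, not values fixed by the excerpt, and with the channels_last default the head's reshape should use `shape = (1, 1, int(1024 * alpha))`:

from tensorflow.keras import Model, layers

alpha, depth_multiplier, dropout, classes = 1.0, 1, 1e-3, 1000
img_input = layers.Input(shape=(224, 224, 3))  # channels_last input

# ... build the stem, the 13 depthwise separable blocks, and the
# classifier head exactly as shown above, producing the tensor x ...

model = Model(img_input, x, name='mobilenet_%0.2f_224' % alpha)
model.summary()  # about 4.2M parameters for alpha = 1.0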
