Deep Learning: An Analysis of the AlexNet Model

1. AlexNet Background
2. AlexNet Structure

2.1 Network structure
2.2 conv1 layer
2.3 conv2 layer
2.4 conv3 layer
2.5 conv4 layer
2.6 conv5 layer
2.7 fc6 layer
2.8 fc7 layer
2.9 fc8 layer

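1. AlexNet Background

The classic early CNN is LeNet-5, which is also the architecture most of us study first. Later, the paper "ImageNet Classification with Deep Convolutional Neural Networks" introduced AlexNet and drew wide attention. The paper writes up the network that won the 2012 ImageNet competition (Large Scale Visual Recognition Challenge). Its first author is Alex Krizhevsky of the University of Toronto, and the team was led by his advisor Geoffrey Hinton, a name most readers will have heard.
Geoffrey Hinton, Yoshua Bengio, and Yann LeCun are often called the three giants of neural networks; they shared the 2018 Turing Award. LeCun, the author of LeNet-5, is recognized for developing convolutional neural networks, improving the backpropagation algorithm, and broadening the vision of neural networks. Hinton co-authored the paper that popularized backpropagation, and is also recognized for Boltzmann machines (one of the first neural networks able to learn internal representations in neurons that are neither inputs nor outputs) and for improvements to convolutional neural networks (using rectified linear units (ReLU) and Dropout regularization, which greatly improved the performance of deep convolutional networks). Bengio is recognized for combining neural networks with probabilistic models of sequences, for high-dimensional word embeddings and attention mechanisms, and for generative adversarial networks (GAN).
AlexNet, the subject of this article, has eight learned layers: five convolutional layers followed by three fully connected layers. Every convolutional layer is followed by a ReLU activation; local response normalization (LRN) follows conv1 and conv2, and max pooling (downsampling) follows conv1, conv2, and conv5. The network was also trained on two GPUs, which greatly sped up computation. The sections below analyze each layer in turn.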

2. AlexNet Structure

2.1 Network structure

[Figure: AlexNet network structure diagram]
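
As a rough companion to the structure diagram, the spatial sizes of the convolution and pooling layers can be traced with a few lines of Python. This is only a sketch: the `LAYERS` table and the helper names `out_size` and `trace_shapes` are made up for illustration, the kernel, stride, and padding values are the ones listed in the per-layer sections of this article, and a 227×227×3 input is assumed.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1

# (name, kernel, stride, pad, output channels; None means pooling keeps channels)
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]

def trace_shapes(input_size=227, input_channels=3):
    """Return (name, height, width, channels) for every layer in LAYERS."""
    size, channels = input_size, input_channels
    shapes = []
    for name, kernel, stride, pad, out_channels in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, size, channels))
    return shapes

shapes = trace_shapes()
for name, h, w, c in shapes:
    print(f"{name}: {h}x{w}x{c}")
```

The last line printed is `pool5: 6x6x256`, that is, 9216 values feeding the first fully connected layer.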

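2.2 conv1 layer

	Size	Stride	Outputs (num_output)
Kernel	11×11	4	96
Pooling	3×3	2	96

① Input: 227×227×3. The nominal input image is 224×224×3 (RGB); after preprocessing it is 227×227×3, which is the size the arithmetic below requires. The input is convolved with 96 kernels of the size listed above. Unlike most other architectures, the 96 kernels are split into two groups of 48, one group per GPU.
② Each placement of a kernel on the image produces one output pixel. With stride 4, an 11×11 kernel, and no padding, the output size is:
width = height = (input + 2×pad - kernel_size)/stride + 1 = (227 + 2×0 - 11)/4 + 1 = 55
channels = 96
This gives 96 feature maps of 55×55 (55×55×96), or two groups of 55×55×48 after the split.
③ Both groups pass through the ReLU activation, which sets negative values to zero and leaves positive values unchanged. The activated output is still two groups of 55×55×48.
④ Local response normalization (LRN) follows, with local_size 5: each value is normalized using the activations at the same position in 5 adjacent channels.
⑤ Finally, max pooling with the listed size and stride gives (55-3)/2+1 = 27, so the output is two groups of 27×27×48.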
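
The output-size formula used in the steps above is easy to check in code. The sketch below uses a hypothetical helper, `conv_output_size`, and assumes floor division for sizes that do not divide evenly. It also shows why 227 rather than 224 is the input size that makes the numbers work.

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0):
    """Output width/height of a convolution or pooling window (floor division)."""
    return (input_size + 2 * pad - kernel_size) // stride + 1

conv1_out = conv_output_size(227, kernel_size=11, stride=4)       # 55
pool1_out = conv_output_size(conv1_out, kernel_size=3, stride=2)  # 27

# With a 224-pixel input, (224 - 11) is not divisible by 4 and the result is 54.
conv1_out_224 = conv_output_size(224, kernel_size=11, stride=4)   # 54

print(conv1_out, pool1_out, conv1_out_224)
```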

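layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"    # input blob
  top: "conv1"      # output blob
  param {           # weights: learning-rate and weight-decay multipliers
    lr_mult: 1
    decay_mult: 1
  }
  param {           # biases: twice the learning rate, no weight decay
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96  # 96 kernels
    kernel_size: 11 # 11×11 kernel
    stride: 4
  }
}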
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}

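2.3 conv2 layer

	Size	Stride	Outputs (num_output)	Padding
Kernel	5×5	1	256	2
Pooling	3×3	2	256

① The input is the conv1 output, 27×27×96, handled as two groups of 27×27×48 as in the first layer. It is convolved with 256 kernels of the size listed above to extract higher-level features. Each feature map is padded with 2 pixels on every side so that the spatial size is preserved.
② With stride 1 and a 5×5 kernel, the output size is:
width = height = (27 - kernel_size + 2×pad)/stride + 1 = (27 - 5 + 2×2)/1 + 1 = 27
channels = 256
This gives 256 feature maps of 27×27 (27×27×256), or two groups of 27×27×128 after the split.
③ Both groups pass through the ReLU activation, which sets negative values to zero. The activated output is still two groups of 27×27×128.
④ Local response normalization (LRN) follows, with local_size 5 across adjacent channels, applied to both groups (256 channels in total).
⑤ Finally, max pooling with the listed size and stride gives (27-3)/2+1 = 13, so the output is two groups of 13×13×128.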
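
To make the LRN step more concrete, here is a minimal pure-Python sketch of normalization across channels at a single pixel position. The function name `lrn_across_channels` is hypothetical. The scaling `(k + alpha / local_size * sum of squares) ** beta` with `k = 1` is my assumption about how Caffe's LRN layer behaves by default; the original paper uses a slightly different constant and scaling, so treat the exact numbers as illustrative.

```python
def lrn_across_channels(activations, local_size=5, alpha=1e-4, beta=0.75, k=1.0):
    """Normalize one pixel's channel vector by the energy of nearby channels."""
    half = local_size // 2
    n_channels = len(activations)
    out = []
    for i, a in enumerate(activations):
        # Window of up to local_size channels centered on channel i.
        lo = max(0, i - half)
        hi = min(n_channels - 1, i + half)
        sq_sum = sum(activations[j] ** 2 for j in range(lo, hi + 1))
        scale = (k + (alpha / local_size) * sq_sum) ** beta
        out.append(a / scale)
    return out

# Hypothetical activations of 7 channels at one spatial position.
pixel = [0.0, 10.0, 200.0, 10.0, 0.0, 0.0, 5.0]
normalized = lrn_across_channels(pixel)
print(normalized)
```

A large activation (the 200.0 here) is damped the most, and it also damps its neighbors in adjacent channels, which is the "lateral inhibition" effect LRN is meant to provide.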

layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
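  convolution_param {
    num_output: 256
    pad: 2          # 2 pixels of zero padding on every side
    kernel_size: 5  # stride is not set, so it defaults to 1
    group: 2        # split input and output channels into two groups
  }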
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}

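2.4 conv3 layer

	Size	Stride	Outputs (num_output)	Padding
Kernel	3×3	1	384	1

① The third convolutional layer consists only of a convolution and a ReLU activation; it has no pooling and no normalization.
② The input is two groups of 13×13×128. Each feature map is padded with 1 pixel on every side so that the spatial size is preserved. In the original paper, conv3 is the one convolutional layer whose kernels connect to all feature maps of the previous layer across both GPUs; the description here follows the grouped configuration used in this article.
③ With stride 1 and a 3×3 kernel, the output size is:
width = height = (13 - kernel_size + 2×pad)/stride + 1 = (13 - 3 + 2×1)/1 + 1 = 13
channels = 384
This gives 384 feature maps of 13×13 (13×13×384), or two groups of 13×13×192 after the split.
④ Both groups pass through the ReLU activation, which sets negative values to zero. The activated output is still two groups of 13×13×192.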
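
The effect of splitting a layer into groups shows up directly in the number of weights. The sketch below uses a hypothetical helper, `conv_weight_count`, and the conv3 sizes from this section (256 input channels, 384 output channels, 3×3 kernels) to compare a two-group layer with an ungrouped one. Biases are left out.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, group=1):
    """Number of weights (excluding biases) in a 2D convolution with channel groups."""
    if in_channels % group or out_channels % group:
        raise ValueError("channels must be divisible by group")
    # Each output channel only sees in_channels / group input channels.
    return out_channels * (in_channels // group) * kernel_size * kernel_size

grouped = conv_weight_count(256, 384, kernel_size=3, group=2)    # 442368
ungrouped = conv_weight_count(256, 384, kernel_size=3, group=1)  # 884736
print(grouped, ungrouped)
```

With two groups, each kernel covers only 128 of the 256 input channels, so the weight count is halved.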

layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}

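2.5 conv4 layer

	Size	Stride	Outputs (num_output)	Padding
Kernel	3×3	1	384	1

① This layer also has no pooling (downsampling) layer. Its structure is the same as conv3: the input is two groups of 13×13×192, and the output after ReLU is again two groups of 13×13×192.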

layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}

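2.6 conv5 layer

	Size	Stride	Outputs (num_output)	Padding
Kernel	3×3	1	256	1
Pooling	3×3	2	256

① The input is two groups of 13×13×192. Each feature map is padded with 1 pixel on every side so that the spatial size is preserved, and the two groups are processed on the two GPUs separately.
② With stride 1 and a 3×3 kernel, the output size is:
width = height = (13 - kernel_size + 2×pad)/stride + 1 = (13 - 3 + 2×1)/1 + 1 = 13
channels = 256
This gives 256 feature maps of 13×13 (13×13×256), or two groups of 13×13×128 after the split.
③ Both groups pass through the relu5 activation, which sets negative values to zero. The activated output is still two groups of 13×13×128.
④ Finally, max pooling with the listed size and stride gives (13-3)/2+1 = 6, so the output is two groups of 6×6×128.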
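
The pooling step can be illustrated with a small pure-Python max pooling routine. This is a sketch on a single hypothetical 13×13 feature map, not the Caffe implementation; `max_pool2d` is a made-up name. Because the 3×3 window is larger than the stride of 2, neighboring windows overlap.

```python
def max_pool2d(feature_map, kernel_size=3, stride=2):
    """Max pooling over a 2D list of numbers, without padding."""
    height = len(feature_map)
    width = len(feature_map[0])
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    pooled = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Collect the kernel_size x kernel_size window starting at (r*stride, c*stride).
            window = [
                feature_map[r * stride + dr][c * stride + dc]
                for dr in range(kernel_size)
                for dc in range(kernel_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Hypothetical 13x13 feature map whose value at (r, c) is r * 13 + c.
feature_map = [[r * 13 + c for c in range(13)] for r in range(13)]
pooled = max_pool2d(feature_map)
print(len(pooled), len(pooled[0]))  # 6 6
```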

layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}

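2.7 fc6 layer

① The input is 6×6×256. Each neuron acts as a 6×6×256 filter that covers the entire input, so each filter produces a single value. The layer has 4096 such filters (neurons) and therefore outputs 4096 values.
② These 4096 values pass through the relu6 activation, giving 4096 activated values.
③ Dropout is an effective way to reduce overfitting and acts as a form of regularization: in each training batch, half of the feature detectors are ignored (half of the hidden units are set to 0). In this network, dropout is applied to the 4096 values during training, setting each neuron's output to 0 with probability 1/2. The layer still outputs 4096 values after dropout.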
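
Dropout itself takes only a few lines. The sketch below is a hypothetical pure-Python version, not the layer's actual code. It assumes the "inverted" variant, where the surviving values are scaled by 1/(1 - ratio) during training so that nothing needs to change at test time; the original paper instead scales the outputs at test time.

```python
import random

def dropout(values, ratio=0.5, training=True, rng=None):
    """Zero each value with probability `ratio`; scale survivors (inverted dropout)."""
    if not training:
        return list(values)
    if rng is None:
        rng = random.Random()
    keep_scale = 1.0 / (1.0 - ratio)
    return [0.0 if rng.random() < ratio else v * keep_scale for v in values]

rng = random.Random(0)       # fixed seed so the example is repeatable
activations = [1.0] * 4096   # stand-in for the 4096 fc6 outputs
dropped = dropout(activations, ratio=0.5, rng=rng)
zeroed = sum(1 for v in dropped if v == 0.0)
print(len(dropped), zeroed)
```

Roughly half of the 4096 values come out as zero, and the length of the output is unchanged.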

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}

2.8 fc7 layer

① The 4096 input values are fully connected to this layer's 4096 neurons. The rest (ReLU and dropout) is the same as in fc6.
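
Full connection is where most of the network's parameters live. As a quick check, the sketch below counts weights plus biases for fc6 and fc7 from the sizes given in this article; `fc_param_count` is a hypothetical helper name.

```python
def fc_param_count(n_inputs, n_outputs):
    """Weights plus biases of a fully connected (InnerProduct) layer."""
    return n_inputs * n_outputs + n_outputs

fc6_params = fc_param_count(6 * 6 * 256, 4096)  # 37752832
fc7_params = fc_param_count(4096, 4096)         # 16781312
print(fc6_params, fc7_params)
```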

layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}

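2.9 fc8 layer

① The 4096 input values are fully connected to this layer's 1000 neurons, producing one score per class. The Softmax layer (prob) then converts these scores into class probabilities.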
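
The `prob` layer in this section's configuration has type Softmax. A minimal sketch of what that computes, using a hypothetical three-class score vector instead of the real 1000 scores:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest score gets the largest probability, and subtracting the maximum first keeps `math.exp` from overflowing on large scores.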

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1000
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}

Raywit
