Convolutional Neural Networks 3: Classic Models

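Classic convolutional neural network models are an excellent way to learn CNNs. Beyond teaching principles and architecture, their hyperparameters and trained parameters are among the best starting material for transfer learning.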

1. LeNet-5 [LeCun et al., 1998]

We start with the model proposed in 1998 by LeCun, widely regarded as the father of CNNs.
Parameters: CONV filters are 5x5, applied at stride 1
Subsampling (pooling) layers are 2x2, applied at stride 2
Architecture: [CONV-POOL-CONV-POOL-CONV-FC]
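
To see how these filter sizes and strides shrink the feature maps, the sketch below traces the spatial size through the CONV and POOL layers listed above. The 32x32 input size is an assumption taken from the original LeNet-5 paper (it is not stated in this section), and `out_size` is a helper name introduced only for illustration.

```python
def out_size(size, kernel, stride, pad=0):
    """Spatial output size of a conv or pooling layer on a square input."""
    return (size + 2 * pad - kernel) // stride + 1

# Assumption: 32x32 input, as in the original LeNet-5 paper.
size = 32
layers = [
    ("CONV", 5, 1),  # 5x5 filters, stride 1
    ("POOL", 2, 2),  # 2x2 subsampling, stride 2
    ("CONV", 5, 1),
    ("POOL", 2, 2),
    ("CONV", 5, 1),
]

trace = [size]
for name, kernel, stride in layers:
    size = out_size(size, kernel, stride)
    trace.append(size)
    print(f"{name}: {size}x{size}")
```

Under that assumption the last CONV layer sees a 5x5 map and produces a 1x1 output, which is why it can feed the FC layer directly.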

2. AlexNet [Krizhevsky et al., 2012]

This model won the ImageNet competition in 2012. It is fair to say that it raised the curtain on the deep learning era.

Model parameters:
(1) First layer:
Input: 227x227x3 images
(CONV1): 96 11x11 filters applied at stride 4
Output volume: [55x55x96]
Parameters in this layer: (11*11*3)*96 = 35K

(2) Second layer (pooling):
3x3 filters applied at stride 2
Output volume: 27x27x96
This layer has no parameters
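
The output volumes and the parameter count of these first two layers can be checked with a few lines of arithmetic. This is a minimal sketch: `out_size` and `conv_weights` are helper names introduced here for illustration, zero padding is assumed for both layers, and biases are left out of the weight count to match the 35K figure above.

```python
def out_size(size, kernel, stride, pad=0):
    """Spatial output size: (W - F + 2P) / S + 1."""
    return (size + 2 * pad - kernel) // stride + 1

def conv_weights(kernel, in_depth, num_filters):
    """Number of weights in a conv layer, biases excluded."""
    return kernel * kernel * in_depth * num_filters

conv1 = out_size(227, kernel=11, stride=4)   # (227 - 11) / 4 + 1
pool1 = out_size(conv1, kernel=3, stride=2)  # (55 - 3) / 2 + 1
weights = conv_weights(kernel=11, in_depth=3, num_filters=96)

print(f"CONV1 output: {conv1}x{conv1}x96")
print(f"POOL1 output: {pool1}x{pool1}x96")
print(f"CONV1 weights: {weights} (plus 96 biases)")
```

The pooling layer only takes a maximum over each 3x3 window, so it contributes no learnable weights.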
…
(3) Full architecture:
(4) Model features and hyperparameter settings:

first use of ReLU
used Norm layers (not common anymore)
heavy data augmentation
dropout 0.5
batch size 128
SGD Momentum 0.9
Learning rate 1e-2, divided by 10 manually when val accuracy plateaus
L2 weight decay 5e-4
7 CNN ensemble: 18.2% -> 15.4%
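
As a rough illustration of how the optimizer settings above fit together, here is a pure-Python sketch of one SGD step with momentum and L2 weight decay, plus a plateau rule for the learning rate. The original schedule was adjusted by hand, so `decay_on_plateau` and its `patience` parameter are hypothetical stand-ins rather than the authors' procedure.

```python
def sgd_momentum_step(w, grad, velocity, lr, momentum=0.9, weight_decay=5e-4):
    """One SGD step with momentum; L2 weight decay is added to the gradient."""
    g = grad + weight_decay * w
    velocity = momentum * velocity - lr * g
    return w + velocity, velocity

def decay_on_plateau(lr, val_acc_history, patience=3, factor=10.0):
    """Divide lr by `factor` when the last `patience` epochs bring no new best.

    `patience` is a hypothetical knob; the original schedule was manual.
    """
    if len(val_acc_history) <= patience:
        return lr
    best_before = max(val_acc_history[:-patience])
    recent_best = max(val_acc_history[-patience:])
    return lr / factor if recent_best <= best_before else lr

# Single scalar weight, starting from zero velocity.
w, v = sgd_momentum_step(w=1.0, grad=0.5, velocity=0.0, lr=1e-2)

# Validation accuracy has stalled at 0.60, so the rate drops from 1e-2.
lr = decay_on_plateau(1e-2, [0.50, 0.60, 0.60, 0.59, 0.60])
print(w, v, lr)
```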

3. VGGNet [Simonyan and Zisserman, 2014]

Model features:
Only 3x3 CONV stride 1, pad 1 and 2x2 MAX POOL stride 2
Reduced top-5 error from 11.2% in ILSVRC 2013 to 7.3%
TOTAL memory: 24M * 4 bytes ~= 93MB / image (only forward! ~*2 for bwd)
TOTAL params: 138M parameters
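
The 138M figure can be reproduced by summing the weights layer by layer. The sketch below assumes the 16-layer configuration (VGG-16), a 224x224x3 input, three FC layers of 4096, 4096 and 1000 units, and no biases; none of these details are spelled out in this section, so treat the layer list as an assumption.

```python
# Assumption: VGG-16. Numbers are output channels of 3x3 CONV layers;
# "M" marks a 2x2 max pool with stride 2.
CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
       512, 512, 512, "M", 512, 512, 512, "M"]
FC = [4096, 4096, 1000]

def count_weights(cfg=CFG, fc=FC, in_depth=3, size=224):
    """Total number of weights (biases excluded) for the given layout."""
    total = 0
    depth = in_depth
    for item in cfg:
        if item == "M":
            size //= 2                      # pooling halves width and height
        else:
            total += 3 * 3 * depth * item   # 3x3 filters over `depth` channels
            depth = item
    features = size * size * depth          # flattened input to the first FC
    for units in fc:
        total += features * units
        features = units
    return total

total = count_weights()
print(f"{total:,} weights, about {total / 1e6:.0f}M")
```

Most of the weights sit in the first FC layer (7*7*512*4096, roughly 103M), not in the convolutional layers.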

Detailed model configuration and parameters:

4. GoogLeNet [Szegedy et al., 2014]

ILSVRC 2014 winner (6.7% top 5 error)
Model features:

Inception module


No FC layers
Only 5 million params! (Removes FC layers completely)
Compared to AlexNet:
12x fewer params
2x more compute
6.67% top 5 error (vs. 16.4%)

5. ResNet [He et al., 2015]

ILSVRC 2015 winner (3.6% top 5 error)
The deepest model I have seen so far: 152 layers!

2-3 weeks of training on an 8-GPU machine
At runtime: faster than a VGGNet! (even though it has 8x more layers)

(1) Architecture

(2) The residual concept
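
The core idea from He et al. is that a block learns a residual F(x) and outputs F(x) + x, so the input can pass through unchanged along a shortcut. The toy sketch below shows only that addition on plain Python lists. `residual_block` and the two branch functions are illustrative names, not the actual convolutional layers used in ResNet.

```python
def residual_block(x, f):
    """Return F(x) + x elementwise; `f` must preserve the length of x."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("F(x) and x must have the same shape")
    return [a + b for a, b in zip(fx, x)]

def zero_branch(x):
    """A branch that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]

def half_branch(x):
    """A stand-in for a learned transform: F(x) = 0.5 * x."""
    return [0.5 * v for v in x]

x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_branch)  # the block acts as identity
shifted_out = residual_block(x, half_branch)   # input plus learned residual
print(identity_out, shifted_out)
```

When the branch outputs zero the block reduces to the identity mapping, which is one intuition for why stacking many such blocks does not have to hurt a shallower solution.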

(3) Hyperparameter settings
Batch Normalization after every CONV layer
Xavier/2 initialization from He et al.
SGD + Momentum (0.9)
Learning rate: 0.1, divided by 10 when validation error plateaus
Mini-batch size 256
Weight decay of 1e-4
No dropout used
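
"Xavier/2" in the list above is shorthand for the He et al. initialization: weights are drawn with variance 2 / fan_in instead of 1 / fan_in, which compensates for ReLU zeroing roughly half of the activations. A minimal standard-library sketch follows; `he_std` and `he_init` are names introduced here for illustration, and the 3x3x64 layer shape is only an example.

```python
import math
import random

def he_std(fan_in):
    """Standard deviation for He initialization: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)

def he_init(fan_in, fan_out, seed=0):
    """Draw a fan_out x fan_in weight matrix from N(0, 2 / fan_in)."""
    rng = random.Random(seed)
    std = he_std(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]

# Example shape (an assumption): a 3x3 conv over 64 input channels, 64 filters.
fan_in = 3 * 3 * 64
weights = he_init(fan_in, fan_out=64)

flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
variance = sum((w - mean) ** 2 for w in flat) / len(flat)
print(f"target variance {2.0 / fan_in:.6f}, sampled variance {variance:.6f}")
```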
(4) Layer details