Reproducing the AlexNet Paper

AlexNet_v1: ImageNet Classification with Deep Convolutional Neural Networks

AlexNet won first place at ILSVRC 2012, outperforming the runner-up by more than 10 percentage points in accuracy.
AlexNet's key innovations
1. Data augmentation

Images are randomly cropped to 224x224 and horizontally flipped. This augmentation multiplies the amount of training data by (256-224) x (256-224) x 2 = 2048. In addition, the intensities of the RGB channels are altered.
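
A minimal sketch of the crop-and-flip part of this augmentation (the function name augment and the 256x256 input assumption are illustrative; the PCA-based RGB intensity change is omitted):

import tensorflow as tf

def augment(image):
  # Randomly crop a 224x224 patch from an assumed 256x256x3 image
  # (one of 32x32 possible positions), then randomly mirror it horizontally (x2).
  image = tf.random_crop(image, [224, 224, 3])
  image = tf.image.random_flip_left_right(image)
  return image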

2. ReLU activation

The ReLU activation function is used, which avoids the saturation that tanh and sigmoid exhibit at both ends of their range and thus alleviates the vanishing-gradient problem during backpropagation.
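
A tiny illustration (not from the paper) of the saturation issue: for a large input, the tanh gradient is essentially zero while the ReLU gradient stays at 1.

import tensorflow as tf

x = tf.constant(10.0)
grad_tanh = tf.gradients(tf.tanh(x), x)[0]     # 1 - tanh(10)^2, roughly 8e-9
grad_relu = tf.gradients(tf.nn.relu(x), x)[0]  # 1 for any positive input

with tf.Session() as sess:
  print(sess.run([grad_tanh, grad_relu]))      # approximately [8.2e-09, 1.0]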

3. Overlapping pooling

Replacing standard pooling (2x2, stride 2) with overlapping pooling (3x3, stride 2) reduces the top-1 and top-5 error rates by 0.4% and 0.3%, respectively.
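
A quick, illustrative check that 3x3/stride-2 pooling keeps the same output size as 2x2/stride-2 on a 55x55 feature map (the conv1 output size in the paper), so only the window overlap changes:

import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D

x = tf.zeros([1, 55, 55, 96])         # dummy conv1-sized feature map
print(MaxPool2D([2, 2], 2)(x).shape)  # (1, 27, 27, 96), non-overlapping windows
print(MaxPool2D([3, 3], 2)(x).shape)  # (1, 27, 27, 96), overlapping windows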

4. Local Response Normalization (LRN)

LRN helps generalization, although its actual effectiveness is disputed.

LRN is somewhat similar to BatchNorm.

The idea behind LRN comes mainly from the biological notion of "lateral inhibition".
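
In the paper, LRN computes b = a / (k + alpha * sum_over_neighbouring_channels(a^2)) ** beta with k=2, n=5, alpha=1e-4, beta=0.75. A hedged sketch of how those hyperparameters map onto tf.nn.local_response_normalization (the helper name lrn_paper is illustrative; depth_radius corresponds to n // 2):

import tensorflow as tf

def lrn_paper(x, name=None):
  # Paper hyperparameters: k (bias) = 2, n = 5 -> depth_radius = 2,
  # alpha = 1e-4, beta = 0.75. The code further below simply uses TF's defaults.
  return tf.nn.local_response_normalization(
      x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75, name=name)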

5. Dropout

Applying dropout in the fully connected (fc) layers prevents overfitting; it combines many different trained models in a very efficient way.
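
The "combining many models" intuition can be made concrete: each training-time forward pass zeroes a different random subset of units, i.e. samples a different thinned sub-network. An illustrative sketch (note that Keras's Dropout takes the drop rate, and kept units are scaled by 1 / (1 - rate)):

import tensorflow as tf
from tensorflow.keras.layers import Dropout

x = tf.ones([1, 8])
y = Dropout(rate=0.5)(x, training=True)  # force training-mode behaviour

with tf.Session() as sess:
  print(sess.run(y))  # e.g. [[2. 0. 2. 2. 0. 0. 2. 0.]], one thinned network
  print(sess.run(y))  # a different random mask on every run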

6. GPU implementation

The GPU implementation of AlexNet (the paper splits the network across two GTX 580 GPUs) greatly speeds up model training.

The figure below shows the AlexNet architecture:
[Figure: AlexNet architecture diagram]
The next figure is the AlexNet architecture as redrawn in the ZFNet paper (personally I find this one clearer):
[Figure: AlexNet architecture diagram from the ZFNet paper]
AlexNet has 8 layers in total: 5 convolutional layers and 3 fully connected layers.
Below is AlexNet written with a mix of TF and tf.keras:

# Implemented here with TF ops plus tf.keras.layers, which keeps everything compatible.
# TF's built-in tf.layers is a similar high-level API to keras.layers and also works well.
import tensorflow as tf
keras = tf.keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense


def inference(inputs,
              num_classes=1000,
              is_training=True,
              dropout_keep_prob=0.5):
  '''
  Inference

  inputs: a tensor of images
  num_classes: the number of categories
  is_training: set to True when used for training
  dropout_keep_prob: probability of keeping a unit during training
  '''

  x = inputs
  # conv1
  x = Conv2D(96, [11,11], 4, activation='relu', name='conv1')(x)
  # lrn1 (TF default hyperparameters; the paper uses k=2, n=5, alpha=1e-4, beta=0.75)
  x = tf.nn.local_response_normalization(x, name='lrn1')
  # pool1
  x = MaxPool2D([3,3], 2, name='pool1')(x)
  # conv2
  x = Conv2D(256, [5,5], activation='relu', padding='same', name='conv2')(x)
  # lrn2
  x = tf.nn.local_response_normalization(x, name='lrn2')
  # pool2
  x = MaxPool2D([3,3], 2, name='pool2')(x)
  # conv3
  x = Conv2D(384, [3,3], activation='relu', padding='same', name='conv3')(x)
  # conv4
  x = Conv2D(384, [3,3], activation='relu', padding='same', name='conv4')(x)
  # conv5
  x = Conv2D(256, [3,3], activation='relu', padding='same', name='conv5')(x)
  # pool5
  x = MaxPool2D([3,3], 2, name='pool5')(x)
  # flatten
  x = Flatten(name='flatten')(x)
  # dropout (Keras Dropout takes the drop rate, hence 1 - keep_prob)
  if is_training:
    x = Dropout(1 - dropout_keep_prob, name='dropout5')(x)
  # fc6
  x = Dense(4096, activation='relu', name='fc6')(x)
  # dropout
  if is_training:
    x = Dropout(1 - dropout_keep_prob, name='dropout6')(x)
  # fc7
  x = Dense(4096, activation='relu', name='fc7')(x)
  # fc8
  logits = Dense(num_classes, name='logit')(x)
  return logits


def build_cost(logits, labels, weight_decay_rate):
  '''
  Cost

  logits: predictions
  labels: true labels
  weight_decay_rate: coefficient for the L2 weight-decay term
  '''
  with tf.variable_scope('costs'):
    with tf.variable_scope('xent'):
      xent = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
          logits=logits, labels=labels))
    with tf.variable_scope('decay'):
      costs = []
      for var in tf.trainable_variables():
        costs.append(tf.nn.l2_loss(var))
        tf.summary.histogram(var.op.name, var) # summary
      cost_decay = tf.multiply(weight_decay_rate, tf.add_n(costs))
    cost = tf.add(xent, cost_decay)
    tf.summary.scalar('cost', cost) # summary
  return cost


def build_train_op(cost, lrn_rate, global_step):
  '''
  Train op

  cost: total cost to minimize
  lrn_rate: learning rate
  global_step: global step
  '''
  with tf.variable_scope('train'):
    lrn_rate = tf.constant(lrn_rate, tf.float32)
    tf.summary.scalar('learning_rate', lrn_rate) # summary

    trainable_variables = tf.trainable_variables()
    grads = tf.gradients(cost, trainable_variables)

    optimizer = tf.train.AdamOptimizer(lrn_rate)

    apply_op = optimizer.apply_gradients(
        zip(grads, trainable_variables),
        global_step=global_step, name='train_step')

    train_op = apply_op
  return train_op


if __name__ == '__main__':
  images = tf.placeholder(tf.float32, [None, 224, 224, 3])
  labels = tf.placeholder(tf.float32, [None, 1000])
  logits = inference(inputs=images,
                     num_classes=1000)
  print('inference: good job')
  cost = build_cost(logits=logits,
                    labels=labels,
                    weight_decay_rate=0.0002)
  print('build_cost: good job')
  global_step = tf.train.get_or_create_global_step()
  train_op = build_train_op(cost=cost,
                            lrn_rate=0.001,
                            global_step=global_step)
  print('build_train_op: good job')

Here we provide three functions, inference, build_cost, and build_train_op; together they implement most of AlexNet.
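
For completeness, a hedged sketch of how these pieces could be driven in a TF1-style training loop; next_batch() is a hypothetical helper and max_steps an example value, neither is part of the code above:

max_steps = 10000  # example number of training steps
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for step in range(max_steps):
    batch_images, batch_labels = next_batch()  # hypothetical helper yielding numpy arrays
    _, loss_val = sess.run([train_op, cost],
                           feed_dict={images: batch_images,
                                      labels: batch_labels})
    if step % 100 == 0:
      print('step %d, loss %.4f' % (step, loss_val))
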
The detailed AlexNet configuration:

Layer    Configuration
conv1    11x11@96, stride 4, ReLU
lrn1     local response normalization
pool1    3x3 max pool, stride 2
conv2    5x5@256, stride 1, ReLU
lrn2     local response normalization
pool2    3x3 max pool, stride 2
conv3    3x3@384, stride 1, ReLU
conv4    3x3@384, stride 1, ReLU
conv5    3x3@256, stride 1, ReLU
pool5    3x3 max pool, stride 2
flatten
dropout  rate 0.5
fc6      4096, ReLU
dropout  rate 0.5
fc7      4096, ReLU
fc8      1000 (logits)

Note: if you use the code from this blog, please add a citation.


Reposted from blog.csdn.net/u014061630/article/details/80259428