AlexNet 06

1. Development

In 1989, Yann LeCun proposed a convolutional neural network trained with backpropagation, called LeNet.

In 1998, Yann LeCun proposed an improved convolutional neural network trained with backpropagation, called LeNet-5.

Later milestone networks include AlexNet, VGG, GoogLeNet, and ResNet.

2. AlexNet

AlexNet was the champion network of the ILSVRC 2012 (ImageNet Large Scale Visual Recognition Challenge) competition. It raised classification accuracy from the 70%+ of traditional methods to 80%+. It was designed by Hinton and his student Alex Krizhevsky, and it was after that year that deep learning began to develop rapidly.

ILSVRC 2012

Training set: 1,281,167 labeled images

Validation set: 50,000 labeled images

Test set: 100,000 unlabeled images

The highlights of the network are:

1 For the first time, GPUs were used to accelerate network training.
2 The ReLU activation function is used instead of the traditional Sigmoid and Tanh activation functions.
3 LRN (Local Response Normalization) is used. (LRN normalizes each activation across adjacent channels; in later networks it was largely superseded by Batch Normalization.)
4 Dropout (randomly deactivating neurons) is used in the first two fully connected layers to reduce overfitting.

Points 2, 3, and 4 are sketched in code below.
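To make points 2 and 3 concrete, here is a minimal TensorFlow sketch (not from the original post; the LRN parameters k=2, n=5, alpha=1e-4, beta=0.75 follow the AlexNet paper):

import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tf.nn.relu(x).numpy())     # [0.  0.  0.  0.5 2. ] -- no saturation for positive inputs
print(tf.nn.sigmoid(x).numpy())  # squashed into (0, 1), saturates for large |x|
print(tf.nn.tanh(x).numpy())     # squashed into (-1, 1), also saturates

# LRN over a dummy 55x55x96 feature map; depth_radius=2 gives a window of n=5 channels
feat = tf.random.normal([1, 55, 55, 96])
lrn = tf.nn.local_response_normalization(feat, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75)
print(lrn.shape)  # (1, 55, 55, 96) -- shape unchanged, only the scale of activations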

Overfitting: the root causes are too many feature dimensions, overly complex model assumptions, too many parameters, too little training data, and too much noise. The result is a fitted function that predicts the training set perfectly but fails on new test data: the model fits the training data without regard for generalization ability.
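Dropout (point 4 above) is one countermeasure: during training it randomly zeroes neurons so the network cannot rely on any single co-adaptation. A minimal sketch of the layer's behavior (the 0.5 rate here is just for illustration; the implementation below uses 0.2):

from tensorflow import keras
import tensorflow as tf

drop = keras.layers.Dropout(0.5)
x = tf.ones([1, 6])
print(drop(x, training=True).numpy())   # about half the entries zeroed, the rest scaled by 1/(1-0.5)=2
print(drop(x, training=False).numpy())  # identity: dropout is disabled at inference time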

2.1 Detailed explanation of AlexNet

The network is divided into an upper and a lower half, run in parallel on two GPUs.

The first layer of convolution:

Run on two GPUs in parallel, with 48 convolution kernels per GPU (48 * 2 = 96 in total), kernel size 11 × 11, stride 4; output 55 × 55 × 48 per GPU.

padding [1, 2] means one column of zeros on the left, two columns of zeros on the right, one row of zeros on top, and two rows of zeros at the bottom, so the 224 × 224 input becomes 227 × 227 (see the Keras sketch below).
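Expressed with Keras, this asymmetric padding looks like the following minimal sketch (the same ZeroPadding2D call appears in the implementation further down):

from tensorflow import keras
import tensorflow as tf

img = tf.zeros([1, 224, 224, 3])
padded = keras.layers.ZeroPadding2D(((1, 2), (1, 2)))(img)  # ((top, bottom), (left, right))
print(padded.shape)  # (1, 227, 227, 3)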

The formula for calculating the matrix size after convolution is: N = (W − F + 2P) / S + 1, where:

① Input image size: W × W

② Filter (kernel) size: F × F

③ Stride: S

④ Padding: P pixels
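Plugging the first convolution layer into the formula as a check (the asymmetric padding adds 1 + 2 = 3 pixels per dimension in total, so 2P is replaced by the total padding):

def conv_output_size(W, F, S, P_total):
    # N = (W - F + P_total) / S + 1, with P_total the total padding per dimension (2P when symmetric)
    return (W - F + P_total) // S + 1

print(conv_output_size(224, 11, 4, 1 + 2))  # -> 55, i.e. a 55 x 55 output feature map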

The second layer maxpool: 3 × 3 window, stride 2; output 27 × 27 × 48.

The third layer of convolution: 128 kernels per GPU, 5 × 5, 'same' padding; output 27 × 27 × 128.

The fourth layer maxpool: 3 × 3 window, stride 2; output 13 × 13 × 128.

The fifth layer of convolution: 192 kernels per GPU, 3 × 3, 'same' padding; output 13 × 13 × 192.

The sixth layer of convolution: 192 kernels per GPU, 3 × 3, 'same' padding; output 13 × 13 × 192.

The seventh layer of convolution: 128 kernels per GPU, 3 × 3, 'same' padding; output 13 × 13 × 128.

The eighth layer maxpool: 3 × 3 window, stride 2; output 6 × 6 × 128, which is flattened to 4608 features for the fully connected layers. (The kernel counts here are the per-GPU numbers, matching the implementation below.)
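These spatial sizes can be verified with a small trace using the same formula ('same'-padded convolutions keep the spatial size, so only conv1 and the pooling layers shrink it):

size = 227  # 224 plus the (1, 2) padding on each dimension
for name, F, S, same in [('conv1', 11, 4, False), ('maxpool2', 3, 2, False),
                         ('conv3', 5, 1, True), ('maxpool4', 3, 2, False),
                         ('conv5', 3, 1, True), ('conv6', 3, 1, True),
                         ('conv7', 3, 1, True), ('maxpool8', 3, 2, False)]:
    if not same:
        size = (size - F) // S + 1  # N = (W - F) / S + 1 with no extra padding
    print(name, size)  # prints 55, 27, 27, 13, 13, 13, 13, 6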

2.2 AlexNet implementation

Model implementation

from tensorflow import keras
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


# Functional-style model definition (could also be written by subclassing keras.Model).
def AlexNet(im_height=224, im_width=224, num_classes=1000):
    input_image = keras.layers.Input(shape=(im_height, im_width, 3), dtype=tf.float32)
    # Manual asymmetric padding: ((top, bottom), (left, right)) -> 224 becomes 227
    x = keras.layers.ZeroPadding2D(((1, 2), (1, 2)))(input_image)
    x = keras.layers.Conv2D(48, kernel_size=11, strides=4, activation='relu')(x)
    x = keras.layers.MaxPool2D(pool_size=3, strides=2)(x)

    x = keras.layers.Conv2D(128, kernel_size=5, padding='same', activation='relu')(x)
    x = keras.layers.MaxPool2D(pool_size=3, strides=2)(x)
    x = keras.layers.Conv2D(192, kernel_size=3, padding='same', activation='relu')(x)
    x = keras.layers.Conv2D(192, kernel_size=3, padding='same', activation='relu')(x)
    x = keras.layers.Conv2D(128, kernel_size=3, padding='same', activation='relu')(x)
    x = keras.layers.MaxPool2D(pool_size=3, strides=2)(x)

    # Fully connected layers
    # Flatten collapses everything after the batch dimension into 2D (batch, features)
    x = keras.layers.Flatten()(x)
    x = keras.layers.Dropout(0.2)(x)   # randomly deactivate 20% of the neurons
    x = keras.layers.Dense(2048, activation='relu')(x)
    x = keras.layers.Dropout(0.2)(x)  # randomly deactivate 20% of the neurons
    x = keras.layers.Dense(2048, activation='relu')(x)
    x = keras.layers.Dense(num_classes)(x)  # num_classes: number of output categories

    # Prediction: softmax turns the logits into class probabilities
    predict = keras.layers.Softmax()(x)
    model = keras.models.Model(inputs=input_image, outputs=predict)
    return model
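A quick sanity check of the model above (a small sketch, assuming the imports at the top of the post): push a dummy batch through and confirm the output shape.

model = AlexNet(im_height=224, im_width=224, num_classes=10)
out = model(tf.random.normal([2, 224, 224, 3]))
print(out.shape)                            # (2, 10)
print(tf.reduce_sum(out, axis=1).numpy())   # each row sums to ~1.0 because of the softmax head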

Data preparation

train_dir = './training/training/'
valid_dir = './validation/validation/'

# Image data generator with augmentation
train_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale = 1. / 255,
    rotation_range = 40,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True,
    vertical_flip = True,
    fill_mode = 'nearest'
)

height = 224
width = 224
channels = 3
batch_size = 32
num_classes = 10

train_generator = train_datagen.flow_from_directory(train_dir,
                                 target_size = (height, width),
                                 batch_size = batch_size,
                                 shuffle = True,
                                 seed = 7,
                                 class_mode = 'categorical')

valid_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale = 1. / 255
)
valid_generator = valid_datagen.flow_from_directory(valid_dir,
                                 target_size = (height, width),
                                 batch_size = batch_size,
                                 shuffle = True,
                                 seed = 7,
                                 class_mode = 'categorical')
print(train_generator.samples)
print(valid_generator.samples)
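It can also help to inspect what the generators inferred from the directory layout (a sketch; the actual class names depend on your folders):

print(train_generator.class_indices)  # folder-name -> label-index mapping, e.g. {'cat': 0, ...}
x_batch, y_batch = next(train_generator)
print(x_batch.shape, y_batch.shape)   # (32, 224, 224, 3) (32, 10) with the settings above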

Training

model = AlexNet(im_height=224, im_width=224, num_classes=10)
model.summary()


model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['acc'])


history = model.fit(train_generator,
                   steps_per_epoch=train_generator.samples // batch_size,
                   epochs=10,
                   validation_data=valid_generator,
                   validation_steps = valid_generator.samples // batch_size
                   )
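Finally, the training curves can be plotted from the returned history object (this uses the matplotlib import from the top of the post; the keys are 'acc' and 'val_acc' because metrics=['acc'] was passed to compile):

plt.plot(history.history['acc'], label='train acc')
plt.plot(history.history['val_acc'], label='val acc')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()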

Origin: blog.csdn.net/peng_258/article/details/132742380