【ZJU-Machine Learning】Convolutional Neural Network-AlexNet


ImageNet contains more than 1.2 million color images belonging to 1000 different categories, making it the largest image recognition database of its time. Alex Krizhevsky and colleagues built a large-scale network on it with more than 650,000 neurons and more than 60 million parameters to be estimated; this network is called AlexNet.

Improvements

(1) Replace the sigmoid or tanh activation function with the ReLU function

This reduces the number of active neurons in the network and makes it converge faster. In addition, the derivative of ReLU is 1 for positive inputs, so compared with sigmoid the gradient can propagate further back through the network without vanishing.
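As a minimal NumPy sketch (not from the original post), the gradients of the two activations can be compared directly; the sigmoid derivative is at most 0.25, while the ReLU derivative is exactly 1 wherever the input is positive:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # at most 0.25, shrinks the gradient layer after layer

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(x.dtype)  # exactly 1 for positive inputs, 0 otherwise

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(sigmoid_grad(x))              # values well below 1
print(relu_grad(x))                 # [0. 0. 1. 1.]
```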

(2) The downsampling operation was given a new name, pooling: a group of adjacent pixels is treated as one "pool", and the pooling operation replaces the whole pool with a single output pixel.
AlexNet uses max pooling: for each "pool" of adjacent pixels, the maximum pixel value is selected as the output. In LeNet the pooling windows do not overlap, whereas AlexNet performs overlapping pooling (a 3×3 window with stride 2). Practice shows that overlapping max pooling helps overcome overfitting and improves performance.
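As an illustration (a minimal NumPy sketch, not taken from the original post), overlapping max pooling with a 3×3 window and stride 2 can be contrasted with the non-overlapping 2×2 case:

```python
import numpy as np

def max_pool2d(x, k, s):
    """Max pooling over a 2-D map with window size k and stride s."""
    h, w = x.shape
    out_h = (h - k) // s + 1
    out_w = (w - k) // s + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*s:i*s+k, j*s:j*s+k].max()
    return out

x = np.arange(36, dtype=np.float32).reshape(6, 6)
print(max_pool2d(x, k=3, s=2))      # overlapping pools (AlexNet style), 2x2 output
print(max_pool2d(x, k=2, s=2))      # non-overlapping pools (LeNet style), 3x3 output
```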

During backpropagation, a winner-takes-all strategy is adopted: within each pooling window, only the position that attained the maximum value receives a gradient, and all other positions get 0.
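A minimal sketch of this gradient routing for a single pooling window (illustrative only, with a made-up upstream gradient):

```python
import numpy as np

# One 2x2 pooling window and the gradient flowing back into its pooled output.
window = np.array([[1.0, 5.0],
                   [3.0, 2.0]])
upstream_grad = 0.7

# Only the position of the maximum "wins" the gradient; the rest receive 0.
mask = (window == window.max()).astype(window.dtype)
grad_window = mask * upstream_grad
print(grad_window)
# [[0.  0.7]
#  [0.  0. ]]
```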

Practice has shown that max pooling works better than average pooling, mainly for the following reasons:
1) Max pooling goes a step further in fusing nonlinearity with downsampling.
2) During backpropagation, the number of neurons that receive a gradient is greatly reduced, which speeds up convergence.

(3) Random dropout. To avoid the overfitting caused by updating the system parameters too quickly, every time a training sample is used to update the parameters, a certain proportion of neurons is randomly "dropped". The dropped neurons do not take part in that training step, and their input and output weight coefficients are not updated either. In this way, the network architecture being trained is different at every step, but these different architectures share a common set of weight coefficients. Experiments show that random dropout slows down convergence but avoids overfitting with high probability.
In each training pass, dropout drops each neuron of a layer with probability p, so the network being trained is different every time.
Testing after training must use the complete network structure, with the parameters (W, b) of those layers multiplied by (1-p).
In this way, it is roughly equivalent to training many different networks and averaging them at the end.
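A minimal NumPy sketch of this train/test behaviour (this is the original "scale at test time" formulation described above, not the inverted dropout used in most modern frameworks):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5                                  # probability of dropping a neuron

def dropout_train(a, p):
    mask = rng.random(a.shape) >= p      # keep each neuron with probability 1-p
    return a * mask                      # dropped neurons output 0 and receive no update

def dropout_test(a, p):
    return a * (1.0 - p)                 # full network, outputs scaled by (1-p)

a = np.array([0.2, 1.5, -0.3, 0.8])
print(dropout_train(a, p))               # a different sub-network on every pass
print(dropout_test(a, p))                # deterministic averaged network at test time
```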

(4) Augment the training samples. Although ImageNet provides more than 1.2 million training images, this is still not enough for the 60 million parameters to be estimated. Alex et al. used several methods to enlarge the training set, including: 1. flipping the original image horizontally; 2. randomly cropping 224×224 patches from the 256×256 images as input. Combining these two methods, one image can be turned into 2048 images. A certain amount of noise can also be added to each picture to form new images. This enlarges the training set on a much larger scale and avoids the performance loss caused by insufficient training data.
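A minimal sketch of these two augmentations (horizontal flip plus a random 224×224 crop from a 256×256 image), assuming images stored as H×W×3 NumPy arrays:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop_and_flip(img, crop=224):
    """Return one augmented view: a random crop, horizontally flipped half the time."""
    h, w, _ = img.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    patch = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]                        # horizontal flip
    return patch

img = rng.random((256, 256, 3)).astype(np.float32)   # stand-in for a real 256x256 image
print(random_crop_and_flip(img).shape)                # (224, 224, 3)
```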

(5) Use GPUs to accelerate training. Two GTX 580 GPUs were used; thanks to the GPU's strong parallel computing capability, the training time was shortened by dozens of times. Even so, training still took six days.

Implementing AlexNet in Caffe

Training data (figure from the original post, not reproduced)
Test data (figure from the original post, not reproduced)
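The original post presents the Caffe network definition and the training/test data as screenshots, which are not reproduced here. As a rough sketch only, assuming the standard pycaffe interface and a solver file named solver.prototxt (a hypothetical file name) that points at the AlexNet network definition, a training run could be driven like this:

```python
import caffe

caffe.set_mode_gpu()                          # AlexNet training relies on GPU acceleration
solver = caffe.SGDSolver('solver.prototxt')   # hypothetical solver file referencing the AlexNet net
solver.solve()                                # run the full training schedule defined in the solver
```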


Source: blog.csdn.net/qq_45654306/article/details/113408893