Study notes: SqueezeNet reaches AlexNet-level accuracy (0.47MB after compression vs. AlexNet's 240MB)

(Focus on the experimental conclusions and the reasoning behind them.)
Three classic lightweight networks:
SqueezeNet: squeeze-and-expand Fire module
MobileNet: depthwise separable convolution
ShuffleNet: channel shuffle
Today's reading: SqueezeNet
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size, Forrest N. Iandola, Song Han, et al., 2016

Innovation

SqueezeNet is a carefully designed lightweight network. Combined with common compression techniques (SVD, pruning, and 8-bit quantization), the model can be compressed further without losing accuracy, which makes it well suited to mobile object detection.
(Training a small model from the start is much more efficient than pruning a large model down to a small one.)
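A minimal NumPy sketch of two of those compression ideas, magnitude pruning and linear 8-bit quantization, is below. It is illustrative only: the prune ratio and the symmetric scale are my assumptions, not the exact Deep Compression pipeline.

```python
import numpy as np

# Toy weight matrix standing in for one layer's weights.
w = np.random.randn(64, 128).astype(np.float32)

# Magnitude pruning: zero out the smallest-magnitude weights.
prune_ratio = 0.5                                   # illustrative ratio
threshold = np.quantile(np.abs(w), prune_ratio)
w_pruned = np.where(np.abs(w) < threshold, 0.0, w)

# Linear (symmetric) 8-bit quantization of the surviving weights.
scale = np.abs(w_pruned).max() / 127.0              # map max |w| onto the int8 range
w_int8 = np.round(w_pruned / scale).astype(np.int8)
w_restored = w_int8.astype(np.float32) * scale      # approximate reconstruction

print("sparsity:", float(np.mean(w_int8 == 0)))
print("max reconstruction error:", float(np.abs(w_restored - w_pruned).max()))
```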

1 Benefits of small models

  • 1 Distributed training is more efficient (less communication between servers)
  • 2 New models can be pushed from the cloud to clients faster (smaller updates)
  • 3 The small storage and memory footprint makes deployment on FPGAs (field-programmable gate arrays) and other embedded hardware feasible

2 Neural network design space exploration

  • 1 Microarchitecture exploration: the dimensions and configuration of each layer,
    which determine the model size and accuracy of the CNN. With the same
    parameter budget, deeper, narrower models tend to beat wider, shallower ones.
  • 2 Macroarchitecture exploration: the end-to-end organization of modules and other layers

3 SqueezeNet strategies: high accuracy with fewer parameters

3.1 Innovation strategy

  • 1 Reduce parameters: replace 3×3 convolution kernels with 1×1 kernels (9× fewer parameters per filter)
  • 2 Reduce parameters: decrease the number of input channels feeding the 3×3 filters with a squeeze layer (1×1 convolutions)
    • Per-layer parameter count: input_channels × kernel_size² × num_filters (see the worked example after this list)
  • 3 Downsample late: keep feature maps large and information-rich
    • Large activation maps late in the network retain more information and improve classification accuracy
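A worked example of that parameter formula, comparing a plain 3×3 layer against a squeeze-then-expand pair with fire2-style channel counts from the paper (96 input channels, 16 squeeze filters, 64 + 64 expand filters); the helper name is mine:

```python
def conv_params(in_channels, kernel_size, num_filters):
    # Per-layer parameter count (biases ignored): in_channels * k^2 * filters
    return in_channels * kernel_size ** 2 * num_filters

# A plain 3x3 layer mapping 96 channels to 128 channels.
plain = conv_params(96, 3, 128)                          # 110,592

# Squeeze to 16 channels, then expand with 64 1x1 and 64 3x3 filters (128 out).
fire = (conv_params(96, 1, 16)                           # squeeze:   1,536
        + conv_params(16, 1, 64)                         # expand1x1: 1,024
        + conv_params(16, 3, 64))                        # expand3x3: 9,216

print(plain, fire)  # 110592 11776 -> roughly a 9x reduction
```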

3.2 Fire Module

Each Fire module has three hyperparameters: the number of squeeze filters s1×1 and the numbers of expand filters e1×1 and e3×3.
1 Squeeze: the M input channels (e.g. 96) are reduced with s1×1 < M filters of 1×1 convolution
2 Expand: 1×1 and 3×3 convolutions are applied in parallel to obtain feature maps with different receptive fields (similar to Inception)
3 Concat: the two expand outputs are spliced together along the channel dimension
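A minimal PyTorch sketch of a Fire module along those lines (the class and argument names are mine, not the authors' reference code):

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """Squeeze with 1x1 convolutions, then expand with parallel 1x1/3x3 branches."""
    def __init__(self, in_channels, squeeze, expand1x1, expand3x3):
        super().__init__()
        # Squeeze: reduce the channel count fed to the expensive 3x3 filters.
        self.squeeze = nn.Conv2d(in_channels, squeeze, kernel_size=1)
        # Expand: 1x1 and 3x3 convolutions in parallel (different receptive fields).
        self.expand1x1 = nn.Conv2d(squeeze, expand1x1, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze, expand3x3, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))
        # Concat: splice the two expand outputs along the channel dimension.
        return torch.cat([self.relu(self.expand1x1(x)),
                          self.relu(self.expand3x3(x))], dim=1)

# fire2-style configuration: 96 in -> squeeze 16 -> 64 + 64 = 128 out.
y = Fire(96, 16, 64, 64)(torch.randn(1, 96, 55, 55))
print(y.shape)  # torch.Size([1, 128, 55, 55])
```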

3.3 SqueezeNet (input 224×224)

Three variants are compared: the one on the left uses no bypass, the one in the middle adds simple bypass (residual) connections around the Fire modules whose input and output channel counts match, and the one on the right uses complex bypass, where a 1×1 convolution on the shortcut matches the channel counts. A sketch of both bypass styles follows.
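This small sketch reuses the Fire class from the previous snippet; the specific layer choices below are illustrative:

```python
# Simple bypass: an element-wise add around a Fire module whose input and output
# channel counts match (in the paper this is possible around fire3/5/7/9).
fire3 = Fire(128, 16, 64, 64)            # 128 in -> 64 + 64 = 128 out
x = torch.randn(1, 128, 55, 55)
y = fire3(x) + x

# Complex bypass: a 1x1 convolution on the shortcut fixes the channel mismatch,
# so the add also works where input and output widths differ (e.g. 96 -> 128).
fire2 = Fire(96, 16, 64, 64)
shortcut = nn.Conv2d(96, 128, kernel_size=1)
x2 = torch.randn(1, 96, 55, 55)
y2 = fire2(x2) + shortcut(x2)
```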

4 Evaluation

One key point: although SqueezeNet is already a small model, it can still be quantized and compressed further with little effect on accuracy.
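As a back-of-the-envelope check on the sizes in the title, model size is roughly parameter count × bits per weight. The sketch below assumes about 1.25M parameters for SqueezeNet and about 60M for AlexNet; the 70% sparsity figure is an illustrative assumption, not the paper's exact compression setting.

```python
def model_size_mb(num_params, bits_per_weight, sparsity=0.0):
    # Idealized size: surviving weights * bits, ignoring index/codebook overhead.
    return num_params * (1.0 - sparsity) * bits_per_weight / 8 / 1e6

print(model_size_mb(60e6, 32))                 # ~240 MB: AlexNet-scale fp32 weights
print(model_size_mb(1.25e6, 32))               # ~5 MB: uncompressed SqueezeNet
print(model_size_mb(1.25e6, 8, sparsity=0.7))  # ~0.4 MB: 8-bit weights plus heavy pruning
```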

Origin blog.csdn.net/weixin_44523062/article/details/105325984