Model pruning study notes 2 - SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Application

SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Application

This paper applies model pruning to YOLOv3. It first shows the typical model pruning pipeline:
[Figure: typical model pruning pipeline]
It then gives the pruning pipeline used for YOLOv3:

[Figure: pruning pipeline of SlimYOLOv3]
What is model pruning? As the word "Narrower" in the paper title suggests, it means reducing the number of channels in the model: unimportant feature channels are removed from each convolutional layer, which requires a proper way to assess the importance of each channel.
In essence, L1 regularization is imposed on the channel scaling factors to achieve channel-level sparsity in the convolutional layers, and the channels with small scaling factors are then pruned to obtain SlimYOLOv3.
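To make "removing a channel" concrete, here is a minimal PyTorch sketch (my own illustration, not code from the paper): dropping one output channel of a convolution also removes the matching BN parameters and the corresponding input channel of the next convolution.

```python
import torch
import torch.nn as nn

# Toy layers standing in for a conv-BN-conv segment of the network.
conv1 = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
bn1   = nn.BatchNorm2d(32)
conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False)

# Suppose channel 7 has a small scaling factor and is pruned.
keep = torch.tensor([c for c in range(32) if c != 7])

pruned_conv1 = nn.Conv2d(16, len(keep), kernel_size=3, padding=1, bias=False)
pruned_conv1.weight.data.copy_(conv1.weight.data[keep])      # slice output channels

pruned_bn1 = nn.BatchNorm2d(len(keep))
pruned_bn1.weight.data.copy_(bn1.weight.data[keep])          # gamma
pruned_bn1.bias.data.copy_(bn1.bias.data[keep])              # beta
pruned_bn1.running_mean.copy_(bn1.running_mean[keep])
pruned_bn1.running_var.copy_(bn1.running_var[keep])

pruned_conv2 = nn.Conv2d(len(keep), 64, kernel_size=3, padding=1, bias=False)
pruned_conv2.weight.data.copy_(conv2.weight.data[:, keep])   # slice input channels
```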

The process in the figure above works as follows: YOLOv3 is first trained with a sparsity penalty, which yields a scaling factor for each channel; channels with small scaling factors are then removed, and the resulting pruned model, SlimYOLOv3, is fine-tuned on the dataset to recover detection accuracy before the next round of sparsity training begins. This procedure is repeated iteratively until the model meets a stopping condition, e.g. the pruning ratio reaches the required level.
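The loop can be summarized as the sketch below. The callables and the target ratio are hypothetical placeholders for the steps named in the figure, not code released with the paper.

```python
from typing import Callable

def iterative_pruning(model,
                      sparsity_train: Callable,   # one round of training with the L1 penalty
                      prune_channels: Callable,   # remove channels with small BN gamma
                      finetune: Callable,         # recover accuracy on the dataset
                      pruned_ratio: Callable,     # fraction of channels removed so far
                      target_ratio: float = 0.9): # illustrative stopping condition
    """Sketch of the pipeline: sparsity training -> pruning -> fine-tuning,
    repeated until the pruning ratio meets the target."""
    while pruned_ratio(model) < target_ratio:
        sparsity_train(model)
        model = prune_channels(model)
        finetune(model)
    return model
```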

In this paper the authors also apply feature fusion by introducing a spatial pyramid pooling (SPP) module, a small modification to YOLOv3. The SPP module consists of four parallel maxpooling layers with kernel sizes 1×1, 5×5, 9×9 and 13×13. It extracts features with different receptive fields and concatenates them along the channel dimension.
[Figure: SPP module structure]
An SPP module is inserted between the fifth and sixth convolutional layers in front of each detection header (counting the layers from the YOLO output layer back toward the input).
Each detection header outputs a tensor of size N × N × (3 × (4 + 1 + C)), where N × N is the feature map size and C is the number of classes.
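A minimal PyTorch sketch of such an SPP module (my own illustration using the kernel sizes listed above; stride 1 and "same" padding keep the spatial size unchanged so the four outputs can be concatenated along the channel dimension):

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Parallel maxpooling with kernel sizes 1/5/9/13, stride 1 and 'same'
    padding; the results are concatenated along the channel dimension."""
    def __init__(self, kernel_sizes=(1, 5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):
        # Output has 4x the input channels, same spatial size.
        return torch.cat([pool(x) for pool in self.pools], dim=1)
```

For C classes each detection header then predicts 3 × (4 + 1 + C) values per cell, e.g. 255 channels for C = 80.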

Sparsity training

In the YOLOv3 network every convolutional layer is followed by a BN layer, except for the convolutional layers that feed directly into the YOLO output layers, which have no BN. The BN layer computes:

$$\hat{x} = \frac{x - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon}}, \qquad y = \gamma \hat{x} + \beta \qquad (1)$$

In Equation (1), $\mu_{\mathcal{B}}$ and $\sigma_{\mathcal{B}}^{2}$ are the mean and variance of the mini-batch, and $\gamma$ and $\beta$ are the trainable scale factor and bias. The scale factor $\gamma$ is used directly to measure the importance of a channel, and an L1 penalty on $\gamma$ drives it toward sparsity. The sparsity training objective is:

$$L = \sum_{(x,\,y)} \mathrm{loss}\big(f(x, W),\, y\big) + \alpha \sum_{\gamma \in \Gamma} f(\gamma)$$

where $f(\gamma) = |\gamma|$ is the L1 norm and $\alpha$ balances the two loss terms. The non-smooth L1 penalty term is optimized with the subgradient method, and the authors set $\alpha = 0.0001$.
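One common way to implement this penalty is a network-slimming-style subgradient update: after the normal backward pass, add $\alpha \cdot \mathrm{sign}(\gamma)$ to the gradient of every BN scale factor. The PyTorch sketch below is my own illustration of that idea, not the authors' released code.

```python
import torch
import torch.nn as nn

def add_l1_sparsity_grad(model, alpha=1e-4):
    """Subgradient of alpha * sum(|gamma|): add alpha * sign(gamma) to the
    gradient of every BatchNorm scale factor after loss.backward()."""
    for module in model.modules():
        if isinstance(module, nn.BatchNorm2d) and module.weight.grad is not None:
            module.weight.grad.add_(alpha * torch.sign(module.weight.data))

# Inside the training loop (sketch):
#   loss = detection_loss(model(images), targets)
#   loss.backward()
#   add_l1_sparsity_grad(model, alpha=1e-4)   # alpha = 0.0001 as in the paper
#   optimizer.step()
```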

Channel pruning

After sparsity training, a global threshold $\hat{\gamma}$ is introduced to decide which feature channels are pruned: it is chosen so that a given percentage of all scale factors fall below it, which controls the overall pruning ratio. In addition, a local safety threshold is introduced to prevent too many channels within a single convolutional layer from being pruned, so that the integrity of the network connections is not destroyed.
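A sketch of how the two thresholds could be computed in PyTorch (my own illustration; the prune ratio and the per-layer minimum keep ratio below are assumed defaults, not values from the paper):

```python
import torch
import torch.nn as nn

def channel_masks(model, prune_ratio=0.5, local_keep_ratio=0.1):
    """Sketch: the global threshold is the prune_ratio-quantile of all |gamma|;
    a local safety threshold keeps at least local_keep_ratio of each layer's channels."""
    bn_layers = [m for m in model.modules() if isinstance(m, nn.BatchNorm2d)]
    all_gamma = torch.cat([m.weight.data.abs() for m in bn_layers])
    global_thr = torch.quantile(all_gamma, prune_ratio)

    masks = []
    for m in bn_layers:
        gamma = m.weight.data.abs()
        min_keep = max(1, int(local_keep_ratio * gamma.numel()))
        # Local safety threshold: the min_keep-th largest |gamma| in this layer.
        local_thr = torch.topk(gamma, min_keep).values.min()
        thr = torch.minimum(global_thr, local_thr)  # never prune past the local limit
        masks.append(gamma >= thr)                  # True = keep this channel
    return masks
```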

Fine-tuning

The pruned model is fine-tuned to recover detection accuracy.

Iterative pruning

Pruning is performed iteratively rather than in a single aggressive pass, to avoid over-pruning the network.

Results and Discussions

[Figure: evaluation results]
As the figure shows, YOLOv3-SPP3 performs better than YOLOv3-SPP1, which indicates that multiple receptive fields help the network extract multi-scale deep features more effectively.
Note: YOLOv3-SPP3 uses three SPP modules, one mounted between the 5th and 6th convolutional layers in front of each of the three detection headers.



Origin blog.csdn.net/c2250645962/article/details/103870240