Some thoughts on model pruning

The reference project is: https://github.com/tanluren/yolov3-channel-and-layer-pruning

  1. The reference paper is very instructive: a scale factor is attached to each BN layer (in practice it is the BN layer's gamma coefficient, which training compresses). The smaller its value, the less important the corresponding channel is to the network, so it can be pruned.
  2. Driving the scale factors toward zero: a regularization term on the scale factors is added to the objective function, so the network is sparsified automatically during training; see the sketch after this list.
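A minimal PyTorch sketch of this scheme (the toy model, `sparsity_lambda`, and the threshold value are assumptions, and the regularizer is taken to be an L1 penalty on the BN gammas, the usual choice for this technique):

```python
import torch
import torch.nn as nn

# Toy conv block standing in for a real backbone such as YOLOv3's.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
)

sparsity_lambda = 1e-4  # assumed penalty strength, not from the reference repo

def bn_gamma_l1(model: nn.Module) -> torch.Tensor:
    """Sum of |gamma| over all BN layers: the sparsity regularizer."""
    return sum(m.weight.abs().sum()
               for m in model.modules() if isinstance(m, nn.BatchNorm2d))

# One training step: the L1 penalty is simply added to the task loss.
x = torch.randn(2, 3, 64, 64)
task_loss = model(x).mean()  # stand-in for the real detection loss
loss = task_loss + sparsity_lambda * bn_gamma_l1(model)
loss.backward()

# After sparsity training, channels whose gamma stayed small can be pruned.
threshold = 0.01  # assumed; often set from a global percentile of all gammas
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        keep = m.weight.abs() > threshold
        print(f"keep {int(keep.sum())}/{keep.numel()} channels")
```

Because whole channels are removed per BN layer rather than individual weights, this style of pruning yields structured speedups without needing special sparse kernels.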

Previous thoughts:

  1. The most important aspects of pruning are the pruning criterion and the pruning ratio, i.e., where to cut and how much to cut
  2. How to ensure that the important weights of the large model are not pruned away

Reading this paper (the lottery ticket hypothesis paper) is enlightening: https://arxiv.org/abs/1803.03635
Thoughts on this paper:

  1. A dense, randomly initialized feedforward network contains sub-networks ("winning tickets") that, when trained in isolation, can reach the accuracy of the original network. This is the core idea proposed by the authors
  2. What makes training the sub-network effective is its initialization: if it is re-initialized randomly, its accuracy falls below that of the original network. The authors therefore propose that the sub-network's initial parameters be taken from the original network's own initial parameters
  3. The sub-network needs fewer training iterations than the original network; a sketch of the procedure follows this list
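A minimal sketch of the one-shot version of this procedure (the toy model, prune fraction, and `train` stub are assumptions, not from the paper): save the initial weights, train, mask out the smallest-magnitude weights, rewind the survivors to their initial values, then retrain.

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(784, 10)  # toy stand-in for the dense network
init_state = copy.deepcopy(model.state_dict())  # remember the initialization

def train(model: nn.Module, steps: int = 100) -> None:
    """Stand-in for the real training loop."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(steps):
        x = torch.randn(32, 784)
        loss = model(x).pow(2).mean()  # dummy objective
        opt.zero_grad()
        loss.backward()
        opt.step()

train(model)  # 1. train the dense network

# 2. Prune the smallest-magnitude weights (here 80% of them, one-shot).
w = model.weight.detach().abs()
mask = (w > torch.quantile(w.flatten(), 0.8)).float()

# 3. Rewind the surviving weights to their ORIGINAL initial values:
#    the sub-network keeps its initialization, not its trained weights.
model.load_state_dict(init_state)
with torch.no_grad():
    model.weight.mul_(mask)

# 4. Retrain the sparse sub-network; masking the gradient keeps the
#    pruned weights at zero under plain SGD.
model.weight.register_hook(lambda g: g * mask)
train(model)
```

The paper's stronger results come from iterative pruning (repeating steps 2 and 3 over several rounds), but the rewind-to-initialization step is the same.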

Some recent thoughts from pruning-related articles:

  1. Training an over-parameterized large model is not necessary to obtain an efficient final small model
  2. To obtain the pruned small model, it is not necessary to identify the "important" parameters of the large model
  3. The structure obtained by pruning is more important than the inherited weights, so the pruning process is really a search over network structures; a sketch of this view follows
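A small sketch of that view (the channel counts and builder below are made up for illustration): once pruning has selected per-layer widths, the network can be rebuilt at exactly those widths and trained from random re-initialization, discarding the inherited weights.

```python
import torch.nn as nn

# Per-layer channel counts that a pruning run might have produced (invented).
pruned_widths = [24, 41, 97]

def build_from_widths(widths, in_ch=3):
    """Rebuild the pruned architecture with fresh, randomly initialized weights."""
    layers, prev = [], in_ch
    for ch in widths:
        layers += [nn.Conv2d(prev, ch, 3, padding=1),
                   nn.BatchNorm2d(ch), nn.ReLU()]
        prev = ch
    return nn.Sequential(*layers)

# The pruned structure is kept; the pruned weights are not. Training this
# model from scratch tests whether the structure alone carries the value.
model = build_from_widths(pruned_widths)
```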
