ICLR 2019 Best Paper: The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

Disclaimer: This is an original post by the blogger; if you repost it, please include a link to the original: https://blog.csdn.net/m0_37263345/article/details/90292633

I. The paper

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

https://arxiv.org/abs/1803.03635

II. Paper notes

1. Abstract: Neural network pruning techniques can reduce the parameter count of a trained network by as much as 90%, cutting storage requirements and improving inference performance without sacrificing accuracy. However, existing experience shows that the sparse architectures produced by pruning are difficult to train from the start, even though training them from scratch would similarly improve training performance.

We find that standard pruning techniques naturally uncover subnetworks that can be trained effectively from their original initialization. Based on these results, we propose the "lottery ticket hypothesis": a dense, randomly-initialized feed-forward network contains subnetworks ("winning tickets") that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations.

"Winning lottery ticket" won "the lottery initialization": their connection has a very efficient training of initial weights. We propose an algorithm to identify winning tickets, and a series of experiments to support the importance of these assumptions and occasional lottery initialized. We found on MNIST and CIFAR10 data sets, the size of the "winning ticket" network is less than fully connected, feed-forward architecture before convolution 10% -20%. Moreover, this "winning ticket" faster, test accuracy rate is also higher than the original network learning speed.

 

2. Two strategies for identifying winning tickets (the first works better)

(a) After each round of pruning, reset the weights of the surviving connections to their original initialization, then continue with the next round of iterative pruning.

(b) After each round of pruning, keep the surviving connections' trained weights and continue training from them, then continue with the next round of iterative pruning.

 

 

In both strategies, once the final round of pruning is complete, the weights of the surviving connections are reset to their original initial values. A minimal sketch of strategy (a) follows.
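Below is a minimal PyTorch sketch of strategy (a). It assumes a user-supplied `train_fn` that trains the network while holding masked weights at zero; all names are illustrative, and this is not the authors' code.

```python
import copy
import torch

def magnitude_mask(model, mask, prune_frac):
    """Layer-wise heuristic: drop the prune_frac lowest-magnitude
    weights that are still alive in each layer."""
    new_mask = {}
    for name, param in model.named_parameters():
        if name not in mask:
            continue
        alive = param.data[mask[name].bool()].abs()   # surviving weights only
        k = int(prune_frac * alive.numel())           # how many to drop this round
        if k == 0:
            new_mask[name] = mask[name].clone()
            continue
        threshold = alive.kthvalue(k).values          # k-th smallest magnitude
        new_mask[name] = (param.data.abs() > threshold).float() * mask[name]
    return new_mask

def find_winning_ticket(model, train_fn, rounds=5, prune_frac=0.2):
    """Iterative pruning, strategy (a): after every round, rewind the
    surviving weights to their original initialization theta_0."""
    init_state = copy.deepcopy(model.state_dict())    # theta_0
    mask = {n: torch.ones_like(p) for n, p in model.named_parameters()
            if "weight" in n}                         # prune weight tensors only
    for _ in range(rounds):
        train_fn(model, mask)               # train with pruned weights held at zero
        mask = magnitude_mask(model, mask, prune_frac)
        model.load_state_dict(init_state)   # strategy (a): reset to theta_0
        # strategy (b) would instead keep the trained weights at this point
    return mask, init_state                 # the winning ticket: structure + init
```

The returned mask plus the saved initialization together define the winning ticket: training it in isolation just means applying the mask to the re-loaded initial weights and running `train_fn` once more.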

 

Pruning strategy:

We use a simple layer-wise pruning heuristic: remove a percentage of the weights with the lowest magnitudes within each layer

Song Han, Jeff Pool, John Tran, and William Dally. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems, pp. 1135–1143, 2015.
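As a tiny standalone illustration of this layer-wise heuristic (NumPy, with made-up numbers), pruning 50% of one layer keeps only its largest-magnitude entries:

```python
import numpy as np

def layer_prune_mask(weights, prune_frac):
    """Keep the (1 - prune_frac) largest-magnitude weights of a single layer."""
    magnitudes = np.abs(weights).ravel()
    k = int(prune_frac * magnitudes.size)      # number of weights to drop
    threshold = np.sort(magnitudes)[k]         # smallest surviving magnitude
    return (np.abs(weights) >= threshold).astype(np.float32)

w = np.array([[ 0.30, -0.05, 0.80],
              [-0.02,  0.60, 0.10]])
print(layer_prune_mask(w, prune_frac=0.5))
# [[1. 0. 1.]
#  [0. 1. 0.]]  -> only the three largest-magnitude weights survive
```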

 

 

3. Key points

(a) Training a pruned architecture from scratch performs worse than retraining the pruned network from its trained weights.

(b) After the network is pruned, the remaining connections (the surviving subnetwork is the winning ticket) perform best when initialized with their original initial weights rather than randomly reinitialized weights (this is also the core experiment of the paper; see the sketch after this list). It shows that the network structure alone cannot explain a winning ticket's effectiveness; the initialization parameters matter as well.

(c) A new, unproven conjecture:

SGD seeks out and trains a well-initialized subset of weights within the network. A dense, randomly-initialized (complete) network is easier to train than a sparse network produced by pruning because the dense network contains many more candidate subnetworks from which training can recover a winning ticket.
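The comparison behind point (b) can be sketched as follows (PyTorch; `model`, `mask`, and `init_state` are assumed to come from a winning-ticket search like the one above, and the mask must still be enforced during training, e.g. re-applied after each optimizer step):

```python
import copy

def build_comparison(model, mask, init_state):
    """Two networks with the same sparse structure but different starting
    weights: the winning ticket (original init) vs. a random-reinit control."""
    ticket = copy.deepcopy(model)
    ticket.load_state_dict(init_state)           # original initialization
    control = copy.deepcopy(model)
    for module in control.modules():             # freshly sampled random weights
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()
    for net in (ticket, control):                # identical sparsity pattern
        for name, param in net.named_parameters():
            if name in mask:
                param.data.mul_(mask[name])
    return ticket, control
```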

 

4. Contributions

 

5. Potential benefits of this idea (speculative)

(a) Improved training strategies and better training results

(b) Better network architecture design

(c) A better theoretical understanding of neural networks

 

6. Limitations and future work

 

In deeper networks, winning tickets are hard to find with iterative pruning unless the network is trained with learning-rate warmup. This part remains to be explored.
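For reference, one common form of learning-rate warmup, a linear ramp over the first k steps, can be expressed with a standard PyTorch scheduler; the exact schedule and numbers the paper uses may differ, so the values below are only illustrative.

```python
import torch

model_params = [torch.zeros(1, requires_grad=True)]   # placeholder parameters
optimizer = torch.optim.SGD(model_params, lr=0.1)      # target learning rate
warmup_steps = 1000
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps))
# Calling scheduler.step() after each training step ramps the LR linearly
# from lr/warmup_steps up to the full 0.1 over the first 1000 steps.
```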

 

7. The experiments are very comprehensive and detailed. From the experiments and references one can see that the research in this area runs deep and is theoretically well grounded, which is perhaps why the paper was able to win the award.

GitHub paper list:

 https://github.com/zhiAung/Paper/blob/master/5%E3%80%81Model%20compression/The%20Lottery%20Ticket%20Hypothesis:%20Finding%20Sparse%2C%20Trainable%20Neural%20Networks.md
