Highway Networks (Training Very Deep Networks, NIPS 2015)

Why Deeper Networks?

The deeper, the better (setting aside computational complexity).
As the ResNet paper puts it: "Recent evidence [40, 43] reveals that network depth is of crucial importance."

Why Are Deeper Networks Harder to Train?

Again quoting the ResNet paper: "An obstacle to answering this question was the notorious problem of vanishing/exploding gradients [14, 1, 8]."
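To make the problem concrete, here is a toy sketch (mine, not from either paper; the width, depth, and 0.1 weight scale are arbitrary demo choices) that backpropagates through a plain stack of sigmoid layers and shows the input gradient collapsing toward zero:

```python
import torch

# Toy demonstration of vanishing gradients in a plain (non-highway) stack.
# Width 64, depth 50, and the 0.1 weight scale are arbitrary demo choices.
torch.manual_seed(0)
width, depth = 64, 50
x = torch.randn(1, width, requires_grad=True)
h = x
for _ in range(depth):
    h = torch.sigmoid(h @ (0.1 * torch.randn(width, width)))
h.sum().backward()
print(x.grad.norm())  # a vanishingly small number, many orders below 1
```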

How To Train Very Deep Networks?

  • Good initialization
  • Local competition may help train deeper networks [20, 21]
  • Skip connections [2, 22, 23, 24] (see the sketch after this list)
  • Multi-stage training [25]
  • Layer-wise training [26, 27]
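Highway Networks themselves belong to the skip-connection family: each layer computes y = T(x) * H(x) + (1 - T(x)) * x, where H is an ordinary nonlinear transform and the sigmoid "transform gate" T learns how much to transform versus simply carry the input. Below is a minimal sketch of one such layer (PyTorch; it assumes equal input and output width, and the ReLU for H and the -2.0 gate bias are illustrative choices consistent with the paper's negative-bias gate initialization):

```python
import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    """One highway layer: y = T(x) * H(x) + (1 - T(x)) * x."""

    def __init__(self, dim, gate_bias=-2.0):
        super().__init__()
        self.transform = nn.Linear(dim, dim)  # H: plain transform
        self.gate = nn.Linear(dim, dim)       # T: transform gate
        # A negative gate bias makes T(x) small at the start of training,
        # so the layer initially behaves close to the identity mapping.
        nn.init.constant_(self.gate.bias, gate_bias)

    def forward(self, x):
        h = torch.relu(self.transform(x))
        t = torch.sigmoid(self.gate(x))
        return t * h + (1.0 - t) * x

# Usage: stack many layers of equal width and train end to end.
block = nn.Sequential(*[HighwayLayer(64) for _ in range(10)])
y = block(torch.randn(8, 64))  # shape (8, 64)
```

Starting each layer near the identity is the key trick: gradients can initially flow through the carry path unimpeded, which is what the paper credits for making very deep stacks trainable with plain SGD.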

From the Highway Networks paper:
[2] C. Szegedy et al. Going deeper with convolutions. In CVPR, 2015.
[20] I. Goodfellow et al. Maxout networks. In ICML, 2013.
[21] R. K. Srivastava et al. Compete to compute. In NIPS, 2013.
[22] T. Raiko et al. Deep learning made easier by linear transformations in perceptrons. In AISTATS, 2012.
[23] A. Graves. Generating sequences with recurrent neural networks. arXiv:1308.0850, 2013.
[24] C.-Y. Lee et al. Deeply-supervised nets. In AISTATS, 2015.
[25] A. Romero et al. FitNets: Hints for thin deep nets. In ICLR, 2015.
[26] J. Schmidhuber. Learning complex, extended sequences using the principle of history compression. Neural Computation, 1992.
[27] G. E. Hinton et al. A fast learning algorithm for deep belief nets. Neural Computation, 2006.


From the ResNet paper:
[40] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[43] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
[1] Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166, 1994.
[8] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In AISTATS, 2010.
[14] S. Hochreiter. Untersuchungen zu dynamischen neuronalen netzen. Diploma thesis, TU Munich, 1991.
