The Three Giants of Deep Learning

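ACM (the Association for Computing Machinery) announced that Yoshua Bengio, Yann LeCun, and Geoffrey Hinton, known as the "three giants of deep learning," jointly received the 2018 Turing Award. It is one of the few years since the award was established in 1966 in which three recipients were honored together. ACM also announced that the award would be formally presented at its annual awards banquet in San Francisco on June 15, 2019, with a prize of US$1 million.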

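The award recognizes the major breakthroughs the three brought to artificial intelligence, breakthroughs that made deep neural networks a critical component of computing. Bengio is a professor at the University of Montreal and scientific director of Mila, Quebec's artificial intelligence institute. Hinton is a vice president and Engineering Fellow at Google, chief scientific adviser of the Vector Institute, and professor emeritus at the University of Toronto. LeCun is a professor at New York University and vice president and chief AI scientist at Facebook.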

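Geoffrey Everest Hinton is a computer scientist and psychologist, often called the "father of neural networks" and a "founding figure of deep learning." He has studied ways of using neural networks for machine learning, memory, perception, and symbol processing, and has published more than 200 papers in these areas. He was one of the researchers who introduced the backpropagation algorithm to the training of multilayer neural networks, and he co-invented the Boltzmann machine. His other contributions to neural networks include distributed representations, time-delay neural networks, mixtures of experts, and Helmholtz machines.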

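Yann LeCun, who uses the Chinese name "杨立昆" (Yang Likun), is a computer scientist known as the "father of convolutional networks." He has made important contributions to convolutional neural networks (CNNs) and image recognition, has published more than 190 papers on topics such as handwriting recognition, image compression, and AI hardware, has developed many deep-learning projects, and holds 14 related US patents. Together with Léon Bottou, Patrick Haffner, and others he created the DjVu image compression technology, and with Léon Bottou he developed Lush, an open-source Lisp-based language described here as even more powerful than Matlab; he is also a skilled Lisp programmer. Backpropagation (BP), the algorithm now commonly used to train artificial neural networks, was proposed in the mid-1980s by scientists including LeCun and his mentor Geoffrey Hinton, the "father of neural networks." LeCun later applied BP to convolutional neural networks at Bell Labs, made the approach practical, and extended it to a variety of image-related tasks.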

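Bengio's paper "A neural probabilistic language model" pioneered neural network language models. Its overall approach influenced and inspired many later papers that apply neural networks to NLP, and it has been widely adopted in industry. Bengio also contributed a detailed analysis of vanishing gradients, a precursor of word2vec, and work on the now-popular field of machine translation.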

Representative papers by Hinton

1. Use of the backpropagation algorithm

Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors[J]. Cognitive modeling, 1988, 5(3): 1.

2. Time-delay neural network (TDNN), an early CNN-style model for speech recognition

Waibel A, Hanazawa T, Hinton G, et al. Phoneme recognition using time-delay neural networks[J]. Backpropagation: Theory, Architectures and Applications, 1995: 35-61.

3. Learning in deep belief networks (DBNs)

Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural computation, 2006, 18(7): 1527-1554.

4. The opening work of deep learning

Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507.

5. t-SNE, a dimensionality-reduction method for data visualization

Maaten L, Hinton G. Visualizing data using t-SNE[J]. Journal of machine learning research, 2008, 9(Nov): 2579-2605.

6. The deep Boltzmann machine (DBM) model

Salakhutdinov R, Hinton G. Deep boltzmann machines[C]//Artificial intelligence and statistics. 2009: 448-455.

7. Use of the ReLU activation function

Nair V, Hinton G E. Rectified linear units improve restricted boltzmann machines[C]//Proceedings of the 27th international conference on machine learning (ICML-10). 2010: 807-814.

8. Training restricted Boltzmann machines (RBMs)

Hinton G E. A practical guide to training restricted Boltzmann machines[M]//Neural networks: Tricks of the trade. Springer, Berlin, Heidelberg, 2012: 599-619.

9. The opening work of deep learning for speech recognition

Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modeling in speech recognition[J]. IEEE Signal processing magazine, 2012, 29.

10. AlexNet, the opening work of deep learning for image recognition

Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012: 1097-1105.

11. A study of weight initialization and momentum optimization

Sutskever I, Martens J, Dahl G, et al. On the importance of initialization and momentum in deep learning[C]//International conference on machine learning. 2013: 1139-1147.

12. Introduction of the Dropout method

Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.

13. Deep learning review by the three giants

LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015, 521(7553): 436.

14. Knowledge distillation

Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint arXiv:1503.02531, 2015.

15. Capsule networks

Sabour S, Frosst N, Hinton G E. Dynamic routing between capsules[C]//Advances in neural information processing systems. 2017: 3856-3866.

Representative papers by Bengio

1. Introduction of the LeNet-5 convolutional neural network: LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.

2. Neural language model for NLP: Bengio Y, Ducharme R, Vincent P, et al. A neural probabilistic language model[J]. Journal of machine learning research, 2003, 3(Feb): 1137-1155.

3. Layer-wise training: Bengio Y, Lamblin P, Popovici D, et al. Greedy layer-wise training of deep networks[C]//Advances in neural information processing systems. 2007: 153-160.

4. Deep architectures for AI: Bengio Y. Learning deep architectures for AI[J]. Foundations and trends® in Machine Learning, 2009, 2(1): 1-127.

5. Introduction of stacked denoising autoencoders: Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion[J]. Journal of machine learning research, 2010, 11(Dec): 3371-3408.

6. Xavier initialization: Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[C]//Proceedings of the thirteenth international conference on artificial intelligence and statistics. 2010: 249-256.

7. Use of the ReLU activation function: Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks[C]//Proceedings of the fourteenth international conference on artificial intelligence and statistics. 2011: 315-323.

8. The Theano framework: Bastien F, Lamblin P, Pascanu R, et al. Theano: new features and speed improvements[J]. arXiv preprint arXiv:1211.5590, 2012.

9. Difficulties in training RNNs: Pascanu R, Mikolov T, Bengio Y. On the difficulty of training recurrent neural networks[C]//International conference on machine learning. 2013: 1310-1318.

10. The Maxout activation function: Goodfellow I J, Warde-Farley D, Mirza M, et al. Maxout networks[J]. arXiv preprint arXiv:1302.4389, 2013.

11. Generative adversarial networks (GANs): Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in neural information processing systems. 2014: 2672-2680.

12. Machine translation: Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.

13. Binary neural networks: Courbariaux M, Bengio Y, David J P. Binaryconnect: Training deep neural networks with binary weights during propagations[C]//Advances in neural information processing systems. 2015: 3123-3131.

14. Deep learning review by the three giants: LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015, 521(7553): 436.

15. Image captioning and attention: Xu K, Ba J, Kiros R, et al. Show, attend and tell: Neural image caption generation with visual attention[C]//International conference on machine learning. 2015: 2048-2057.

16. Deep learning textbook: Goodfellow I, Bengio Y, Courville A. Deep learning[M]. MIT press, 2016.

17. Speech synthesis: Sotelo J, Mehri S, Kumar K, et al. Char2wav: End-to-end speech synthesis[J]. 2017.

Representative papers by Yann LeCun