A Roundup of Deep Learning Papers (updated 2018.4.21)

The faintest ink beats the best memory: I have always kept handwritten study notes and had never written a blog before. I recently had the honor of joining the Zhejiang University Student AI Association, and I am determined to learn AI-related technologies from excellent teachers and senior students while also contributing to the association's operation and growth. Since enrolling in September, driven by research needs and a strong personal interest, I have kept studying machine learning and deep learning. I am now also in charge of the association's deep learning paper archive, a resource meant to make it easier for members to look things up and learn. Handwritten notes are hard to share, so I have started blogging; this is a first attempt, and I ask for your understanding if anything here is off the mark. I hope this blog gives people interested in deep learning some guidance on which papers to read, so they can avoid unnecessary detours. It will be updated irregularly as my own study progresses. In this era of explosively growing deep learning research, published papers are both numerous and messy, so if you spot a mistake please contact me promptly, and if you have better papers to recommend, please let me know as well; I would be very grateful.

Every beginning is hard. The papers originally listed in this blog were compiled mainly from other people's blogs and homepages on CSDN, cnblogs, GitHub, and elsewhere. The current content comes largely from the blog of our association president, Luo Hao, to whom I am grateful; the relevant reference links are given at the end of this post. The papers below are ones I have read, and I will try to write reading notes for every excellent paper among them (still being organized). If you find errors, please point them out.

Fundamentals of Deep Learning

  • Hecht-Nielsen R. Theory of the backpropagation neural network[J]. Neural Networks, 1988, 1(Supplement-1): 445-448.(the backpropagation neural network, BP)[PDF]
  • Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554.(DBN, the starting point of deep learning)[PDF]
  • Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507.(dimensionality reduction with autoencoders)[PDF]
  • Ng A. Sparse autoencoder[J]. CS294A Lecture notes, 2011, 72(2011): 1-19.(the sparse autoencoder)[PDF]
  • Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion[J]. Journal of Machine Learning Research, 2010, 11(Dec): 3371-3408.(stacked denoising autoencoders, SAE)[PDF]
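To make the backpropagation and autoencoder entries above a little more concrete, here is a minimal NumPy sketch of a single-hidden-layer autoencoder trained with hand-written backpropagation. It is not taken from any of the papers; the data, layer sizes, and learning rate are all made up purely for illustration.

```python
import numpy as np

# Minimal single-hidden-layer autoencoder trained with backpropagation.
# Purely illustrative: random data, sigmoid activations, MSE reconstruction loss.
rng = np.random.default_rng(0)
X = rng.random((256, 64))                              # 256 samples, 64-dim inputs
W1 = rng.normal(0, 0.1, (64, 16)); b1 = np.zeros(16)   # encoder weights
W2 = rng.normal(0, 0.1, (16, 64)); b2 = np.zeros(64)   # decoder weights

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(100):
    # forward pass
    H = sigmoid(X @ W1 + b1)                 # hidden code
    X_hat = sigmoid(H @ W2 + b2)             # reconstruction
    loss = np.mean((X_hat - X) ** 2)

    # backward pass (chain rule through the sigmoids and linear layers)
    d_out = 2 * (X_hat - X) / X.size * X_hat * (1 - X_hat)
    dW2 = H.T @ d_out; db2 = d_out.sum(0)
    d_hid = (d_out @ W2.T) * H * (1 - H)
    dW1 = X.T @ d_hid; db1 = d_hid.sum(0)

    # plain gradient-descent update
    lr = 1.0
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final reconstruction MSE:", loss)
```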

The Deep Learning Boom: From AlexNet to Capsules

  • Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems. 2012.(AlexNet)[PDF]
  • Simonyan, Karen, and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv: 1409.1556 (2014).(VGGNet)[PDF]
  • Szegedy, Christian, et al. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. (GoogLeNet)[PDF]
  • Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the Inception Architecture for Computer Vision[J]. arXiv preprint arXiv: 1512.00567, 2015.(Inception-V3)[PDF]
  • He, Kaiming, et al. Deep residual learning for image recognition. arXiv preprint arXiv: 1512.03385 (2015).(ResNet)[PDF]
  • Chollet F. Xception: Deep Learning with Depthwise Separable Convolutions[J]. arXiv preprint arXiv: 1610.02357, 2016.(Xception)[PDF]
  • Huang G, Liu Z, Weinberger K Q, et al. Densely Connected Convolutional Networks[J]. arXiv preprint arXiv: 1608.06993, 2016.(DenseNet)[PDF]
  • Hu J, Shen L, Sun G. Squeeze-and-Excitation Networks[J]. arXiv preprint arXiv: 1709.01507, 2017.(SENet)[PDF]
  • Zhang X, Zhou X, Lin M, et al. ShuffleNet: An extremely efficient convolutional neural network for mobile devices[J]. arXiv preprint arXiv: 1707.01083, 2017.(ShuffleNet)[PDF]
  • Sabour S, Frosst N, Hinton G E. Dynamic routing between capsules[C].Advances in Neural Information Processing Systems. 2017: 3859-3869.(Capsules)[PDF]
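The core idea behind ResNet in the list above is the identity shortcut. Below is a minimal sketch of a basic residual block; PyTorch is my own choice of framework here, and the channel counts and input size are illustrative rather than the configuration from the paper.

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """A basic residual block: output = ReLU(F(x) + x), where F is two 3x3 convs."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                        # shortcut carries the input unchanged
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)    # residual addition before the final ReLU

# quick shape check with an illustrative input
block = BasicResidualBlock(64)
x = torch.randn(1, 64, 32, 32)
print(block(x).shape)   # torch.Size([1, 64, 32, 32])
```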

Highly Useful Tricks in Deep Learning

  • Srivastava N, Hinton G E, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15(1): 1929-1958.(Dropout)[PDF]
  • Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[J]. arXiv preprint arXiv: 1502.03167, 2015.(Batch Normalization)[PDF]
  • Lin M, Chen Q, Yan S. Network In Network[J]. arXiv preprint arXiv: 1312.4400, 2013.(Global average pooling)[PDF]
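As a rough illustration of how the three tricks above are typically dropped into a network, here is a toy classifier: Batch Normalization after each convolution, global average pooling instead of a large fully connected layer, and Dropout before the final classifier. PyTorch, the layer sizes, and the class count are my own assumptions, not taken from the papers.

```python
import torch
import torch.nn as nn

# A tiny classifier combining the three tricks listed above.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),                 # normalize activations per channel
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),            # global average pooling -> one value per channel
    nn.Flatten(),
    nn.Dropout(p=0.5),                  # randomly zero half the features during training
    nn.Linear(64, 10),                  # 10 classes, purely illustrative
)

x = torch.randn(8, 3, 32, 32)
print(model(x).shape)   # torch.Size([8, 10])
```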

Recurrent Neural Networks (RNN)

  • Mikolov T, Karafiát M, Burget L, et al. Recurrent neural network based language model[C]. Interspeech. 2010, 2: 3.(a classic paper combining RNNs with language modeling)[PDF]
  • Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735-1780.(the mathematics behind LSTM)[PDF]
  • Chung J, Gulcehre C, Cho K H, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[J]. arXiv preprint arXiv: 1412.3555, 2014.(the GRU network)[PDF]
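A quick sketch of how the LSTM and GRU layers above are used on a sequence batch in practice; PyTorch is again my own assumption, and all dimensions are toy values.

```python
import torch
import torch.nn as nn

# Gated recurrent layers on a toy sequence batch; shapes and sizes are illustrative.
batch, seq_len, in_dim, hidden = 4, 20, 8, 32
x = torch.randn(batch, seq_len, in_dim)

lstm = nn.LSTM(input_size=in_dim, hidden_size=hidden, batch_first=True)
gru = nn.GRU(input_size=in_dim, hidden_size=hidden, batch_first=True)

out_lstm, (h_n, c_n) = lstm(x)   # LSTM keeps both a hidden state and a cell state
out_gru, h_gru = gru(x)          # GRU merges them into a single hidden state

print(out_lstm.shape, out_gru.shape)   # both: torch.Size([4, 20, 32])
```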

Generative Adversarial Networks (GAN)

  • Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C].Advances in neural information processing systems. 2014: 2672-2680.(GAN)[PDF]
  • Mirza M, Osindero S. Conditional generative adversarial nets[J]. arXiv preprint arXiv: 1411.1784, 2014.(CGAN)[PDF]
  • Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks[J]. arXiv preprint arXiv: 1511.06434, 2015.(DCGAN)[PDF]
  • Denton E L, Chintala S, Fergus R. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks[C].Advances in neural information processing systems. 2015: 1486-1494.(LAPGAN)[PDF]
  • Chen X, Duan Y, Houthooft R, et al. Infogan: Interpretable representation learning by information maximizing generative adversarial nets[C].Advances in Neural Information Processing Systems. 2016: 2172-2180.(InfoGAN)[PDF]
  • Arjovsky M, Chintala S, Bottou L. Wasserstein GAN[J]. arXiv preprint arXiv: 1701.07875, 2017.(WGAN)[PDF]
  • Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[J]. arXiv preprint arXiv: 1703.10593, 2017.(CycleGAN)[PDF]
  • Yi Z, Zhang H, Tan P, et al. DualGAN: Unsupervised Dual Learning for Image-to-Image Translation[J]. arXiv preprint arXiv: 1704.02510, 2017.(DualGAN)[PDF]
  • Isola P, Zhu J Y, Zhou T, et al. Image-to-image translation with conditional adversarial networks[J]. arXiv preprint arXiv: 1611.07004, 2016.(pix2pix)[PDF]
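To show the adversarial game from the original GAN paper in runnable form, here is a deliberately tiny sketch that trains a generator to match a 1-D Gaussian. PyTorch, the network sizes, the toy data distribution, and the commonly used non-saturating generator loss are all my own illustrative choices, not the papers' exact setups.

```python
import torch
import torch.nn as nn

# Minimal GAN training loop on toy 1-D data: D learns to tell real from fake,
# G learns to fool D.
latent_dim = 8
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0          # "real" data: N(2, 0.5^2), made up
    z = torch.randn(64, latent_dim)
    fake = G(z)

    # discriminator step: push D(real) toward 1 and D(fake) toward 0
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # generator step: push D(fake) toward 1 (non-saturating trick)
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()

print("generated sample mean:", G(torch.randn(1000, latent_dim)).mean().item())
```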

Transfer Learning

  • Fei-Fei L, Fergus R, Perona P. One-shot learning of object categories[J]. IEEE transactions on pattern analysis and machine intelligence, 2006, 28(4): 594-611.(One shot learning)[PDF]
  • Larochelle H, Erhan D, Bengio Y. Zero-data learning of new tasks[C]. AAAI. 2008: 646-651.(Zero-shot learning)[PDF]
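The two papers above are specifically about one-shot and zero-shot learning; as the more everyday face of transfer learning, here is a generic fine-tuning sketch that freezes transferred weights and trains only a new head. PyTorch is assumed, and the `backbone` below is just a placeholder network standing in for any pretrained model, not something from the listed papers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Generic fine-tuning sketch: reuse a "pretrained" feature extractor on a new task
# by freezing its weights and training only a small new classification head.
backbone = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad = False            # freeze the transferred weights

head = nn.Linear(64, 5)                # new head for an illustrative 5-class target task
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x, y = torch.randn(32, 128), torch.randint(0, 5, (32,))   # toy target-task data
for _ in range(100):
    optimizer.zero_grad()
    with torch.no_grad():
        features = backbone(x)         # features from the frozen backbone
    loss = F.cross_entropy(head(features), y)
    loss.backward()
    optimizer.step()
print("final loss:", loss.item())
```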

Object Detection

  • Szegedy C, Toshev A, Erhan D. Deep neural networks for object detection[C]. Advances in Neural Information Processing Systems. 2013: 2553-2561.(early deep-learning object detection)[PDF]
  • Girshick, Ross, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.(R-CNN)[PDF]
  • He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C].European Conference on Computer Vision. Springer International Publishing, 2014: 346-361.(SPPNet)[PDF]
  • Girshick R. Fast R-CNN[C]. Proceedings of the IEEE International Conference on Computer Vision. 2015: 1440-1448.(Fast R-CNN)[PDF]
  • Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]. Advances in neural information processing systems. 2015: 91-99.(Faster R-CNN)[PDF]
  • Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788.(YOLO)[PDF]
  • Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C].European Conference on Computer Vision. Springer International Publishing, 2016: 21-37.(SSD)[PDF]
  • Dai J, Li Y, He K, et al. R-FCN: Object detection via region-based fully convolutional networks[C]. Advances in Neural Information Processing Systems. 2016: 379-387.(R-FCN)[PDF]
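Nearly every detector in the list above leans on the Intersection-over-Union (IoU) overlap measure, for matching proposals or anchors to ground-truth boxes and for non-maximum suppression. Here is a minimal sketch in plain Python; the box format and example values are just for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as [x1, y1, x2, y2]."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)        # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou([0, 0, 10, 10], [5, 5, 15, 15]))   # 25 / 175 ≈ 0.143
```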

Semantic Segmentation

  • Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.(the classic FCN)[PDF]
  • Chen L C, Papandreou G, Kokkinos I, et al. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs[J]. arXiv preprint arXiv: 1606.00915, 2016.(DeepLab)[PDF]
  • Zhao H, Shi J, Qi X, et al. Pyramid scene parsing network[J]. arXiv preprint arXiv: 1612.01105, 2016.(PSPNet)[PDF]
  • He K, Gkioxari G, Dollár P, et al. Mask R-CNN[J]. arXiv preprint arXiv: 1703.06870, 2017.(Mask R-CNN)[PDF]
  • Hu R, Dollár P, He K, et al. Learning to Segment Every Thing[J]. arXiv preprint arXiv: 1711.10370, 2017.(an enhanced Mask R-CNN)[PDF]
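The FCN idea from the list above, in miniature: convolutional features are projected to per-class score maps with a 1x1 convolution, upsampled back to input resolution, and trained with a pixel-wise cross-entropy loss. This PyTorch sketch uses made-up sizes and is far smaller than any real segmentation network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes = 3   # illustrative
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),               # downsample by 2
    nn.ReLU(),
    nn.Conv2d(16, num_classes, kernel_size=1),                           # 1x1 conv -> class scores
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),   # back to input resolution
)

images = torch.randn(2, 3, 64, 64)
labels = torch.randint(0, num_classes, (2, 64, 64))   # per-pixel ground-truth classes

logits = net(images)                       # shape (2, num_classes, 64, 64)
loss = F.cross_entropy(logits, labels)     # cross entropy averaged over every pixel
print(logits.shape, loss.item())
```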

Image Compression

  • George Toderici, Sean M. O'Malley, Sung Jin Hwang, Damien Vincent, David Minnen, Shumeet Baluja, Michele Covell, and Rahul Sukthankar. Variable rate image compression with recurrent neural networks. In ICLR, 2016.(a classic paper applying deep learning to image compression; an RNN-based model)[PDF]
  • George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David Minnen, Joel Shor, and Michele Covell. Full resolution image compression with recurrent neural networks. arXiv preprint arXiv: 1608.05148, 2016.(the proposed RNN is the first to surpass JPEG on the Kodak dataset)[PDF]
  • Mohammad Haris Baig, Vladlen Koltun, Lorenzo Torresani. Learning to Inpaint for Image Compression. In NIPS, 2017.[PDF]
  • Feng Jiang, Wen Tao, Shaohui Liu, Jie Ren, Xun Guo, Debin Zhao. An End-to-End Compression Framework Based on Convolutional Neural Networks.(applying CNNs to image compression)[PDF]
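Results in this area are usually compared against JPEG with distortion metrics such as PSNR (alongside MS-SSIM and others). Here is a small NumPy sketch of PSNR; the 8-bit image and its perturbed "decoded" version are made-up example data.

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio in dB, a standard distortion metric for image compression."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(max_val ** 2 / mse)

# toy example: a random 8-bit image and a slightly perturbed "decoded" version
img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
noisy = np.clip(img.astype(np.int16) + np.random.randint(-3, 4, img.shape), 0, 255).astype(np.uint8)
print("PSNR:", psnr(img, noisy), "dB")
```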

Keypoint / Pose Detection

  • Shih-En Wei, Varun Ramakrishna, Takeo Kanade, Yaser Sheikh. Convolutional Pose Machines. CVPR, 2016.(a classic keypoint-detection paper; it placed second in the 2016 MPII pose-estimation challenge, and it is also the model I used in my first Tianchi competition, the FashionAI clothing keypoint localization challenge)[PDF]
  • Alejandro Newell, Kaiyu Yang, and Jia Deng. Stacked Hourglass Networks for Human Pose Estimation.(very well known: multi-scale features and fast; it topped the 2016 MPII pose-estimation challenge and was also used by many teams in the FashionAI Tianchi competition)[PDF]
  • W. Wang, Y. Xu, J. Shen, and S.-C. Zhu. Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification. CVPR, 2018.(a recent major work in the FashionAI area: it proposes two kinds of spatial-relation grammars, a bidirectional convolutional RNN message-passing model, and two attention mechanisms tailored to different tasks; the ideas are quite fancy and the paper is well worth reading)[PDF]
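Both Convolutional Pose Machines and the Stacked Hourglass network regress per-keypoint heatmaps rather than coordinates directly. Here is a small NumPy sketch of the Gaussian heatmap target that this family of methods trains against; the map size, keypoint location, and sigma are arbitrary example values.

```python
import numpy as np

def keypoint_heatmap(height, width, cx, cy, sigma=2.0):
    """2-D Gaussian heatmap centered on a keypoint, used as a regression target
    by heatmap-based pose estimators."""
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

heatmap = keypoint_heatmap(64, 64, cx=20, cy=30)
print(heatmap.shape, heatmap.max(), np.unravel_index(heatmap.argmax(), heatmap.shape))
# (64, 64) 1.0 (30, 20)  -> the peak sits at the annotated keypoint
```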

Reference Links

  • http://blog.csdn.net/qq_21190081/article/details/69564634
  • http://github.com/michuanhaohao/paper
  • http://github.com/RedditSota/state-of-the-art-result-for-machine-learning-problems
  • http://github.com/songrotek/Deep-Learning-Papers-Reading-Roadmap
  • http://github.com/kjw0612/awesome-deep-vision

Reposted from blog.csdn.net/yztsinghua/article/details/80752032