[Computer Science] [2016.09] Uncertainty in Deep Learning

This is a PhD thesis from the University of Cambridge, UK (author: Yarin Gal), 174 pages in total.

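Deep learning has attracted researchers across information engineering, including artificial intelligence, computer vision, and language processing, and has also drawn considerable attention from more traditional sciences such as physics, biology, and manufacturing. Neural networks, image processing tools such as convolutional neural networks, sequence processing models such as recurrent neural networks, and regularisation tools such as dropout are used extensively. In fields such as physics, biology, and manufacturing, however, representing model uncertainty is crucial. As many of these fields have recently shifted towards the use of Bayesian uncertainty, new demands are placed on deep learning.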

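This work develops tools for obtaining practical uncertainty estimates in deep learning, casting recent deep learning tools as Bayesian models without changing either the models or the optimisation. The first part of the thesis develops the theory behind these tools and provides applications and illustrative examples. It ties approximate inference in Bayesian models to dropout and other stochastic regularisation techniques, and assesses the approximations empirically. Building on this connection between modern deep learning and Bayesian modelling, it gives example applications such as active learning of image data and data-efficient deep reinforcement learning. It further demonstrates the practicality of the tools through a survey of recent applications that use the suggested techniques in language applications, medical diagnostics, bioinformatics, image processing, and autonomous driving. The second part of the thesis explores the insights stemming from the link between Bayesian modelling and deep learning, and their theoretical implications. It discusses what determines model uncertainty properties, analyses the approximate inference analytically in the linear case, and theoretically examines various priors such as spike and slab priors.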

Deep learning has attracted tremendous attention from researchers in various fields of information engineering such as AI, computer vision, and language processing [Kalchbrenner and Blunsom, 2013; Krizhevsky et al., 2012; Mnih et al., 2013], but also from more traditional sciences such as physics, biology, and manufacturing [Anjos et al., 2015; Baldi et al., 2014; Bergmann et al., 2014]. Neural networks, image processing tools such as convolutional neural networks, sequence processing models such as recurrent neural networks, and regularisation tools such as dropout, are used extensively. However, fields such as physics, biology, and manufacturing are ones in which representing model uncertainty is of crucial importance [Ghahramani, 2015; Krzywinski and Altman, 2013]. With the recent shift in many of these fields towards the use of Bayesian uncertainty [Herzog and Ostwald, 2013; Nuzzo, 2014; Trafimow and Marks, 2015], new needs arise from deep learning. In this work we develop tools to obtain practical uncertainty estimates in deep learning, casting recent deep learning tools as Bayesian models without changing either the models or the optimisation. In the first part of this thesis we develop the theory for such tools, providing applications and illustrative examples. We tie approximate inference in Bayesian models to dropout and other stochastic regularisation techniques, and assess the approximations empirically. We give example applications arising from this connection between modern deep learning and Bayesian modelling such as active learning of image data and data-efficient deep reinforcement learning. We further demonstrate the tools' practicality through a survey of recent applications making use of the suggested techniques in language applications, medical diagnostics, bioinformatics, image processing, and autonomous driving. In the second part of the thesis we explore the insights stemming from the link between Bayesian modelling and deep learning, and its theoretical implications. We discuss what determines model uncertainty properties, analyse the approximate inference analytically in the linear case, and theoretically examine various priors such as spike and slab priors.
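
The abstract's idea of reading dropout as approximate Bayesian inference is often illustrated by keeping dropout active at prediction time and averaging several stochastic forward passes. The following is only a minimal sketch of that idea, not code from the thesis: the tiny network, its weights (`W1`, `B1`, `W2`), and the function names are hypothetical, and the spread of the samples is used here as a rough indicator of model uncertainty only.

```python
import random
import statistics

# Hypothetical fixed weights for a tiny network:
# 1 input -> 4 ReLU hidden units -> 1 output. Not taken from the thesis.
W1 = [0.5, -1.2, 0.8, 1.5]
B1 = [0.1, 0.0, -0.3, 0.2]
W2 = [1.0, -0.7, 0.4, 0.9]


def forward(x, p_drop, rng):
    """One stochastic forward pass with dropout kept active on the hidden layer."""
    out = 0.0
    for w1, b1, w2 in zip(W1, B1, W2):
        h = max(0.0, w1 * x + b1)          # ReLU hidden unit
        if rng.random() >= p_drop:         # unit survives with probability 1 - p_drop
            out += w2 * h / (1.0 - p_drop)  # inverted-dropout rescaling
    return out


def mc_dropout_predict(x, n_samples=200, p_drop=0.5, seed=0):
    """Average n_samples stochastic passes; return (mean, standard deviation)."""
    rng = random.Random(seed)
    samples = [forward(x, p_drop, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    mean, std = mc_dropout_predict(2.0)
    print(f"predictive mean: {mean:.3f}, sample std: {std:.3f}")
```

With `p_drop=0.0` every pass is identical, so the spread collapses to zero and the mean equals the ordinary deterministic output; a non-zero dropout rate is what makes the repeated passes disagree.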

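1 Introduction: The Importance of Knowing What We Don't Know
2 The Language of Uncertainty
3 Bayesian Deep Learning
4 Uncertainty Quality
5 Applications
6 Deep Insights
7 Future Research
Appendix A KL condition
Appendix B Figures
Appendix C Spike and slab prior KL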

Download link for the original English thesis:

http://page5.dfpan.com/fs/1l8c6j32a2b1a239169/
