[Computer Science] [2015.08] Training Deep Neural Networks for Fast Image Recognition

This is a master's thesis from University College London, UK (author: Zbigniew Wojna), 102 pages long.

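This thesis surveys recent findings in deep learning. Its introductory part covers the latest advances in the field and the building blocks of the best-performing architectures. Driven by a steady stream of state-of-the-art results, deep neural networks have attracted strong interest from both academia and industry for tasks such as object recognition, segmentation, captioning, topic modelling, translation, and speech recognition. It is rare for one generic algorithm to solve problems across so many different domains, so understanding the theoretical foundations and the available research directions is important. The thesis reviews the literature published over the preceding two years and compares it with well-established knowledge.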

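Most of the thesis is a literature review; the experiments explore image recognition models that can be trained quickly. The main limitation of neural machine vision is the long time needed to train models. The transfer learning experiments reuse the same trained network architecture across different data domains. With better architectures, they reach state-of-the-art results on the Salient Object Subitizing problem with less than one hour of training on a mid-range GPU. The thesis also replicates the fastest-learning network on the ILSVRC2014 dataset, the largest image recognition challenge. The modified version reaches 68.3% accuracy on the validation set in fewer than 400,000 iterations, about 3 times fewer than the original network. The implementation uses the Torch framework, which is widely used for deep learning by the research community and by large industry players such as Google and Facebook, which contribute extensively to the open-source implementation.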

This thesis investigates the recent findings in the deep learning area. They form an introduction to PhD studies in this topic and a review of the literature concerning the best performing architectures. There is a large interest from both the research community and industry in using deep neural networks for tasks such as object recognition, segmentation, captioning, topic modelling, translation and speech recognition due to state-of-the-art results. It rarely happens that one generic algorithm is able to solve problems in so many different domains. Therefore, it is important to understand theoretical bases and available directions of research. In the following work, there is a review of the publications that appeared in the last 2 years with a comparison to well-established knowledge. Most of the thesis focuses on the literature overview. The experiments are designed to explore a model for image recognition that can be trained relatively quickly. The main limitation in neural machine vision is due to the overwhelming amount of time taken to train models. In this thesis, there are experiments with transfer learning that use the same trained network architecture for different data domains. They achieve state-of-the-art results in the Salient Object Subitizing problem, exploiting better architectures and requiring less than one hour of training on a mid-class GPU. In this thesis, there are also experiments replicating the fastest learning network on the ILSVRC2014 dataset, the biggest image recognition challenge. The presented modifications achieve 68.3% accuracy on the validation dataset in less than 400000 iterations, which is around 3 times fewer than the original network. The implementation is done in the Torch framework, which is commonly used for deep learning applications by the research community as well as big industry players, such as Google and Facebook, that extensively contribute to the open source implementation.
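
The transfer learning idea mentioned in the abstract (reusing an already trained network on a new data domain) can be illustrated with a minimal sketch. This is not the thesis's Torch code or its actual setup. The "pretrained" weights, the toy task, and all function names below are hypothetical stand-ins chosen for illustration: a frozen feature extractor is kept fixed and only a small new classifier head is trained on the target data, which is one common reason such training can be fast.

```python
import math
import random

# Hypothetical "pretrained" body: fixed weights standing in for a network
# trained earlier on a different source domain. They are never updated here.
PRETRAINED_WEIGHTS = (
    (1.0, 0.0),
    (0.0, 1.0),
    (1.0, 1.0),
    (1.0, -1.0),
)


def frozen_features(x):
    """Map a 2-D input to 4 features using the frozen body."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def train_head(features, labels, lr=0.1, epochs=50):
    """Train only a new logistic-regression head with plain SGD."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # derivative of the log loss with respect to z
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b


def predict(f, w, b):
    """Return the predicted class (0 or 1) for one feature vector."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


# Toy target-domain task (made up): the label is 1 when x0 + x1 > 0.
rng = random.Random(0)
inputs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
labels = [1 if x0 + x1 > 0 else 0 for x0, x1 in inputs]

# Features are computed once because the body stays frozen.
features = [frozen_features(x) for x in inputs]
w, b = train_head(features, labels)

correct = sum(predict(f, w, b) == y for f, y in zip(features, labels))
accuracy = correct / len(labels)
print(f"training accuracy of the new head: {accuracy:.2f}")
```

Only the head's five parameters are updated, so each training step is cheap compared with updating a full network. In a real setting the frozen body would be a deep network trained on a large dataset, and part of it might also be fine-tuned.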

1 Introduction
2 Project Background: Deep Learning Fundamentals
3 Experiments with Transfer Learning
4 ILSVRC2014 Challenge Experiments
5 Conclusions

Download link for the original English text:

http://page5.dfpan.com/fs/2l1c0j82923112b9169/


Reposted from blog.csdn.net/weixin_42825609/article/details/84997707