Playing Atari with Deep Reinforcement Learning: Paper Notes

1. Abstract

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Summary: a single convolutional network, trained with a variant of Q-learning, takes raw pixels as input and outputs a value function estimating future rewards. Applied to seven Atari 2600 games from the Arcade Learning Environment with no per-game adjustment of the architecture or learning algorithm, it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
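
As described in the abstract, the network maps raw pixels to one Q-value per action. Below is a minimal PyTorch sketch of such a network, assuming the preprocessing and layer sizes reported in the full paper (four stacked 84x84 grayscale frames, two convolutional layers, one fully connected layer); the class name `QNetwork` and all hyperparameters here are illustrative, not taken from the abstract itself.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a stack of preprocessed frames to one Q-value per action."""
    def __init__(self, n_actions: int, in_frames: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_frames, 16, kernel_size=8, stride=4),  # 84x84 -> 20x20
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2),         # 20x20 -> 9x9
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256),
            nn.ReLU(),
            nn.Linear(256, n_actions),  # one Q-value per possible action
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Example: a batch of 32 observations, each 4 stacked 84x84 frames.
q_values = QNetwork(n_actions=6)(torch.zeros(32, 4, 84, 84))
print(q_values.shape)  # torch.Size([32, 6])
```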

2. Approach

The paper shows that a convolutional neural network can overcome these challenges and learn successful control policies from raw video data in complex RL environments. The network is trained with a variant of the Q-learning [26] algorithm, using stochastic gradient descent to update the weights. To alleviate the problems of correlated data and non-stationary distributions, an experience replay mechanism [13] randomly samples previous transitions, thereby smoothing the training distribution over many past behaviors.
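
A minimal sketch of the experience replay idea, assuming transitions are stored as `(state, action, reward, next_state, done)` tuples in a fixed-size ring buffer; the class name `ReplayBuffer` and the capacity are illustrative.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions and returns uniformly random minibatches."""
    def __init__(self, capacity: int = 100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

Sampling uniformly at random from this buffer is what breaks the correlation between consecutive frames and averages the update over many past behaviors, rather than over whatever the agent happens to be doing right now.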

3. Goals

Our goal is to create a single neural network agent that is able to successfully learn to play as many of the games as possible.
(it learned from nothing but the video input, the reward and terminal signals, and the set of possible actions—just as a human player would.)

Our goal is to connect a reinforcement learning algorithm to a deep neural network which operates directly on RGB images and efficiently process training data by using stochastic gradient updates.
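
A hedged sketch of what one such stochastic gradient update could look like, using the one-step Q-learning target y = r + γ·max_a' Q(s', a') on a minibatch drawn from the replay buffer; it assumes the `QNetwork` sketch above and that the sampled batch has already been converted to PyTorch tensors. The function name `dqn_update` and the hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

def dqn_update(q_net, optimizer, batch, gamma=0.99):
    """One stochastic gradient step on the squared Q-learning error."""
    states, actions, rewards, next_states, dones = batch
    # Q(s, a) for the actions that were actually taken
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # y = r + gamma * max_a' Q(s', a'), with no bootstrap at terminal states
        q_next = q_net(next_states).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * q_next
    loss = F.mse_loss(q_pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example wiring (illustrative hyperparameters):
# q_net = QNetwork(n_actions=6)
# optimizer = torch.optim.RMSprop(q_net.parameters(), lr=2.5e-4)
# loss = dqn_update(q_net, optimizer, batch)
```

Note that this sketch bootstraps the target from the same network being trained, which is consistent with the original 2013 formulation; the separate target network appears only in later DQN work.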


Reposted from blog.csdn.net/weixin_41913844/article/details/84061899