Visual Reinforcement Learning with Imagined Goals

Category: Programming Languages | 2018-08-19 01:46:07

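The core idea of this paper is to use a Variational Autoencoder (VAE) with a Gaussian distribution to map images into another space, called the latent space. The encoder's output is used as both the state and the goal. Measuring distance between these encodings works better than a Euclidean metric in the original image (pixel) space. The benefits of using a Variational Autoencoder are: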

Provides a space where distances are more meaningful, and thus allows use of a well-structured reward function (e.g., the distance between encodings)
Inputs to the reinforcement learning network are structured (I do not fully understand this point)
New states can be sampled from the VAE's generative model (latent space plus decoder), allowing automated synthetic goal creation during training so that the goal-conditioned policy can practice reaching diverse goals
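To make the first and third points concrete, here is a minimal sketch of a latent-distance reward and of goals sampled from a Gaussian prior. It is an illustration under assumptions, not the paper's implementation: `encode` is a hypothetical stand-in (a random linear projection) for the mean output of a trained VAE encoder, and the 8x8 images and the names `latent_reward` and `sample_imagined_goals` are made up for this example.

```python
import numpy as np

def encode(image, weights):
    """Hypothetical stand-in for a trained VAE encoder mean: flatten, then project."""
    return image.reshape(-1) @ weights

def latent_reward(z_state, z_goal):
    """Reward as the negative Euclidean distance between two latent encodings."""
    return -float(np.linalg.norm(z_state - z_goal))

def sample_imagined_goals(rng, n_goals, latent_dim):
    """Draw goals from a standard Gaussian prior over the latent space."""
    return rng.standard_normal((n_goals, latent_dim))

rng = np.random.default_rng(0)
latent_dim = 4
# Untrained random weights, for illustration only (assumption).
weights = rng.standard_normal((8 * 8, latent_dim)) * 0.1

state_img = rng.random((8, 8))  # fake current observation
goal_img = rng.random((8, 8))   # fake goal image

z_s = encode(state_img, weights)
z_g = encode(goal_img, weights)
reward = latent_reward(z_s, z_g)                    # always <= 0
goals = sample_imagined_goals(rng, 5, latent_dim)   # 5 imagined latent goals
```

The reward is zero only when the state encoding coincides with the goal encoding, so maximizing it pushes the policy toward the goal in latent space rather than in pixel space.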

The algorithm proceeds as follows:

1) State observations are collected by random exploration of the environment (a random policy is used to gather observations).
2) A variational autoencoder is trained on these observations (train the VAE).
3) Latent encodings for each state are obtained from the variational autoencoder (this yields the states and goals in latent space).
4) (goal, state) encodings are sampled from the existing set (sample (s, a, r, s', g) tuples).
5) A reinforcement learning algorithm is trained on the latent encodings (any Q-learning-based method works).
6) Repeat steps 4–5 with the following conditions:
6.1) Periodically retrain the autoencoder on newly generated images (the VAE is retrained intermittently, because the goals shift as different states are visited).
6.2) Generate new goals by feeding goal images through the variational autoencoder (this produces new goals).
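The data flow of the steps above can be sketched as follows. This is a toy outline under loud assumptions, not the authors' code: random vectors stand in for images, a PCA projection (`fit_encoder`) is swapped in for actual VAE training, goals are drawn from a standard Gaussian, and the RL update in step 5 is left as a comment. All function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, LATENT_DIM, ACT_DIM = 16, 2, 2

def collect_random_observations(n):
    """Step 1: stand-in for random exploration; returns fake flattened images."""
    return rng.random((n, OBS_DIM))

def fit_encoder(observations):
    """Steps 2 and 6.1: PCA used as a stand-in for VAE training (assumption)."""
    mean = observations.mean(axis=0)
    _, _, vt = np.linalg.svd(observations - mean, full_matrices=False)
    return mean, vt[:LATENT_DIM].T

def encode(obs, encoder):
    """Step 3: map observations to latent encodings."""
    mean, proj = encoder
    return (obs - mean) @ proj

def sample_batch(buffer, encoder, batch_size):
    """Step 4: sample (s, a, r, s', g) tuples, all expressed in latent space."""
    batch = []
    for i in rng.integers(len(buffer), size=batch_size):
        obs, action, next_obs = buffer[i]
        z, z_next = encode(obs, encoder), encode(next_obs, encoder)
        goal = rng.standard_normal(LATENT_DIM)          # imagined goal
        reward = -float(np.linalg.norm(z_next - goal))  # latent distance
        batch.append((z, action, reward, z_next, goal))
    return batch

observations = collect_random_observations(200)
encoder = fit_encoder(observations)
buffer = []  # stores raw observations so they can be re-encoded after retraining

for iteration in range(3):
    new_obs = collect_random_observations(50)
    actions = rng.uniform(-1.0, 1.0, size=(49, ACT_DIM))
    buffer.extend((new_obs[t], actions[t], new_obs[t + 1]) for t in range(49))
    batch = sample_batch(buffer, encoder, 32)
    # Step 5: an off-policy update (e.g. a Q-learning variant) would use `batch` here.
    observations = np.vstack([observations, new_obs])
    encoder = fit_encoder(observations)  # step 6.1: periodic retraining
```

The buffer keeps raw observations rather than latent codes so that stored transitions do not go stale when the encoder is retrained; encodings are recomputed at sampling time with the current encoder.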

https://towardsdatascience.com/ai-research-deep-dive-visual-reinforcement-learning-with-imagined-goals-862115d122a6

Reposted from blog.csdn.net/liyaohhh/article/details/81807105
