Direct State-to-Action Mapping for High DOF Robots Using ELM

Goal

Related Work

When generating a general state-to-action mapping for a high-dimensional system, optimizing the mapping on a trial-and-error basis is limited, because a single trajectory (i.e., one episode) visits only a very small part of the state space.
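As a rough illustration of this coverage argument (not taken from the paper), the sketch below assumes each state dimension is discretized into a fixed number of bins and computes an upper bound on the fraction of grid cells that one episode can touch. The function name `max_coverage_fraction`, the bin count, and the episode length are hypothetical choices made only for this example.

```python
def max_coverage_fraction(num_dims: int, bins_per_dim: int, steps: int) -> float:
    """Upper bound on the fraction of grid cells a single episode can visit.

    Assumption: the state space is a grid with bins_per_dim cells along each
    of num_dims dimensions, and the episode visits at most one new cell per step.
    """
    total_cells = bins_per_dim ** num_dims
    return min(steps, total_cells) / total_cells


# Hypothetical numbers: 10 bins per dimension, 1000 steps per episode.
for num_dims in (2, 6, 12):
    fraction = max_coverage_fraction(num_dims, bins_per_dim=10, steps=1000)
    print(f"{num_dims} dims: at most {fraction:.1e} of the cells visited")
```

Under these assumptions the bound shrinks exponentially with the number of dimensions, which is why a policy tuned from individual episodes sees so little of a high-DOF robot's state space.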

To address this issue, methods of learning a policy based on optimized trajectories have been

Contribution


Reposted from blog.csdn.net/weixin_42018112/article/details/88299703