Reinforcement Learning Algorithms

DQN: suited to discrete, low-dimensional action spaces

DDPG: Deep Deterministic Policy Gradient, which can be used for deep reinforcement learning problems with continuous action spaces

Q-learning: suited to discrete, low-dimensional action spaces


1. Basic reinforcement learning algorithms

  •  Markov decision process (MDP)
  •  Policy iteration
  •  Value iteration
  •  Generalized policy iteration
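
As a rough illustration of value iteration from the list above, here is a minimal sketch on a made-up two-state MDP. The transition table P, the discount GAMMA, and the function names are assumptions chosen for the example, not taken from any particular source.

```python
# Hypothetical 2-state, 2-action MDP, invented purely for illustration.
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(V, s, a, gamma=GAMMA):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-10, gamma=GAMMA):
    """Apply the Bellman optimality backup until values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update
        if delta < tol:
            break
    # Read off the greedy policy from the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration()
print(V, policy)
```

Policy iteration would instead alternate a full policy evaluation with a greedy improvement step; value iteration folds both into the single max backup shown here.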

2. Reinforcement learning methods based on value functions

  • Reinforcement learning based on Monte Carlo methods
  • Reinforcement learning based on temporal-difference (TD) methods
  • Reinforcement learning based on value functions (DQN, Q-learning, Double Q-learning)
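
To make the value-function family above more concrete, the following is a minimal sketch of tabular Q-learning, a temporal-difference method. The five-state corridor environment, the step function, and all hyperparameters are hypothetical and exist only for this example.

```python
import random

N_STATES = 5      # states 0..4; state 4 is terminal (assumed toy corridor)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical dynamics: reward 1.0 only when the last state is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                best = max(Q[s])
                a = rng.choice([x for x in ACTIONS if Q[s][x] == best])
            s2, r, done = step(s, a)
            # Off-policy TD target: bootstrap from the best next action.
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q


Q = q_learning()
greedy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(greedy)
```

DQN keeps the same TD target but replaces the table Q with a neural network, which is why it is usually described as suited to discrete, low-dimensional action spaces: the max over actions has to be enumerable.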

3. Reinforcement learning methods based on direct policy search

  •  Reinforcement learning based on policy gradients (Actor-Critic, A3C)
  •  Reinforcement learning based on trust region policy optimization (TRPO)
  •  Reinforcement learning based on deterministic policies
  •  Reinforcement learning based on guided policy search (ADMM)
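
For the policy-gradient entry above, a minimal sketch of the basic REINFORCE update may help. It uses a softmax policy on a hypothetical two-armed bandit; the REWARDS table, the learning rate, and the function names are assumptions for illustration, and no baseline or critic is included.

```python
import math
import random

REWARDS = (0.0, 1.0)  # hypothetical deterministic payoff of each arm


def softmax(prefs):
    """Numerically stable softmax over action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # one preference parameter per arm
    for _ in range(steps):
        pi = softmax(theta)
        a = 0 if rng.random() < pi[0] else 1  # sample an action from the policy
        r = REWARDS[a]
        # For a softmax policy, d log pi(a) / d theta_k = 1[k == a] - pi[k].
        for k in range(len(theta)):
            grad_log = (1.0 if k == a else 0.0) - pi[k]
            theta[k] += lr * r * grad_log  # gradient ascent on expected reward
    return softmax(theta)


pi = reinforce_bandit()
print(pi)
```

Actor-critic methods such as A3C follow the same gradient direction but replace the raw reward r with an advantage estimate from a learned value function, which reduces the variance of the update.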

4. Reinforcement learning research and frontiers

  •  Inverse reinforcement learning
  •  Combining policy gradient methods with value functions
  •  Value function networks
  •  Model-based reinforcement learning: PILCO and its extensions

 


Origin blog.csdn.net/lxlong89940101/article/details/90476096