Reinforcement Learning Notes (3)

Sarsa & Q-Learning

 

 

Looking at the formulas, the difference between Q-Learning and Sarsa seems to lie in how the learning target is obtained during the update. In Sarsa, Q(s', a') is obtained the same way as Q(s, a): both come from the epsilon-greedy behavior policy.

Q-Learning's idea is instead to look directly at all actions available in s' and take the largest value as the source of learning, i.e. max_{a'} Q(s', a').
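For reference, the standard forms of the two update rules (learning rate α, discount γ) are:

```latex
% Sarsa: a' is the action the epsilon-greedy policy actually picks for s'
Q(s,a) \leftarrow Q(s,a) + \alpha \bigl[ r + \gamma\, Q(s',a') - Q(s,a) \bigr]

% Q-Learning: the target takes the maximum over all actions available at s'
Q(s,a) \leftarrow Q(s,a) + \alpha \bigl[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \bigr]
```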

For Q-Learning:

We are in s, choose a, and end up in s'. The update then uses the existing q_table entries for s' and learns from the action a' with the largest value, which may be the fastest way to push toward the goal. But that a' is not kept: when s' becomes the new s, the next action is chosen by the policy again and may well not be the a' that was used in the update.
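A minimal sketch of that step, assuming the q_table is a dict of dicts (state -> action -> value) and the epsilon-greedy choice described above; the helper name q_learning_step is only for illustration:

```python
import random

def q_learning_step(q_table, s, a, r, s_next, alpha=0.1, gamma=0.9, epsilon=0.1):
    """One Q-Learning update: the target uses the max over actions at s',
    but the action actually taken next is re-chosen by epsilon-greedy."""
    # learning target: greedy value of the next state
    # (assumes s_next has an entry in q_table, terminal states initialized to 0)
    target = r + gamma * max(q_table[s_next].values())
    q_table[s][a] += alpha * (target - q_table[s][a])

    # behavior: the next action is picked again and may differ from the argmax above
    if random.random() < epsilon:
        a_next = random.choice(list(q_table[s_next].keys()))
    else:
        a_next = max(q_table[s_next], key=q_table[s_next].get)
    return a_next
```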

For Sarsa:

We are in s, choose a, and end up in s'. The a' that the policy picks for the next round is used directly as the learning content to update the current value, so the next state and the next action are already fixed.
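The matching Sarsa step under the same assumed q_table layout; note that a' is drawn from the policy first, used in the target, and then returned as the action to actually execute:

```python
import random

def sarsa_step(q_table, s, a, r, s_next, alpha=0.1, gamma=0.9, epsilon=0.1):
    """One Sarsa update: a' is picked by the policy first,
    used in the learning target, and then actually taken."""
    # choose a' with the same epsilon-greedy policy that chose a
    if random.random() < epsilon:
        a_next = random.choice(list(q_table[s_next].keys()))
    else:
        a_next = max(q_table[s_next], key=q_table[s_next].get)

    # the target uses exactly the action that will be executed next
    target = r + gamma * q_table[s_next][a_next]
    q_table[s][a] += alpha * (target - q_table[s][a])
    return a_next
```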

The result is that Q-Learning heads straight for the goal: with a single terminal reward, the value clearly propagates backward from the goal, one step at a time. Sarsa, by contrast, feels its way forward; the experience from the first episode may not be usable right away and only pays off in a later one, so it ends up trying more paths before gradually converging on the optimal solution.
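As a toy illustration of that backward propagation (a minimal sketch, assuming a five-state corridor with a single reward at the right end), training with Q-Learning and printing the per-state values after each episode shows the values spreading back from the goal, roughly one state per episode:

```python
import random

# toy corridor: states 0..4, reward 1 only for reaching state 4 (terminal)
N, ACTIONS = 5, ("left", "right")
q = {s: {a: 0.0 for a in ACTIONS} for s in range(N)}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

def choose(q_s):
    """epsilon-greedy with random tie-breaking among equally good actions"""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    best = max(q_s.values())
    return random.choice([a for a, v in q_s.items() if v == best])

for episode in range(10):
    s = 0
    while s != N - 1:
        a = choose(q[s])
        s_next = max(s - 1, 0) if a == "left" else s + 1
        r = 1.0 if s_next == N - 1 else 0.0
        # Q-Learning target: greedy value of s_next (the terminal state stays 0)
        q[s][a] += alpha * (r + gamma * max(q[s_next].values()) - q[s][a])
        s = s_next
    # the best value per state creeps backward from the goal over the episodes
    print(episode, [round(max(q[s].values()), 2) for s in range(N)])
```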
