DL Study Notes [22]: Reinforcement Learning

It is said that to understand reinforcement learning, one must first understand the Markov property.


The Markov property

Given the current state of a process (the present), its future evolution (the future) does not depend on how it arrived there (the past).
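A minimal sketch of this idea, using an assumed two-state weather chain: the next state is sampled from probabilities that depend only on the current state, never on the earlier trajectory.

```python
import random

# Toy two-state Markov chain. The transition probabilities below are
# illustrative assumptions, not taken from the notes.
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def step(state):
    """Sample the next state given only the current one (Markov property)."""
    r = random.random()
    cumulative = 0.0
    for nxt, p in P[state].items():
        cumulative += p
        if r < cumulative:
            return nxt
    return nxt  # guard against floating-point rounding

random.seed(0)
state = "sunny"
trajectory = [state]
for _ in range(5):
    state = step(state)       # only `state` is passed in; history is irrelevant
    trajectory.append(state)
print(trajectory)
```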


Markov processes are divided into three types according to whether their time and state parameters are continuous or discrete:

  1. Discrete time and discrete state: a Markov chain
  2. Continuous time and continuous state: a Markov process
  3. Continuous time and discrete state: a continuous-time Markov chain


n-step transition probability matrix:

P(n) = P(n-1) P(1) = P(n-2) P(1) P(1) = ... = P(1)^n

So the probability of moving from one state to another in n steps can be read off the n-th power of the one-step transition matrix.
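A quick sketch of the identity above for a two-state chain (the transition numbers are made up for illustration): multiplying the one-step matrix by itself gives the two-step matrix P(2) = P(1)^2.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# One-step transition matrix P(1) for an assumed two-state chain.
P1 = [[0.9, 0.1],
      [0.5, 0.5]]

# Two-step transition matrix: P(2) = P(1) * P(1).
P2 = matmul(P1, P1)
print(P2)
# Each row of P(2) still sums to 1, as any transition matrix must.
```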


Hidden Markov Model

Suppose three dice have 4, 6, and 8 sides respectively. From an observed sequence of numbers between 1 and 8 (the visible states), we want to infer the sequence of dice that were used (the hidden sequence). The classic problems are:

  1. Direct multiplication - multiply probabilities along each hidden path to find the probability of generating the observed sequence
  2. Cracking the dice sequence step by step - start counting from the first observation and accumulate probabilities forward (the forward algorithm), or start from the last state and work back toward the front (the backward algorithm); both are used to compute the total probability of all hidden paths that could have produced the sequence
  3. Viterbi algorithm - used to compute the most likely hidden state sequence behind the visible states
  4. Baum-Welch algorithm - used to learn the model parameters; too complicated, I haven't studied it yet
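The Viterbi step above can be sketched for this exact dice setup. This assumes fair dice, a uniform prior over which die is picked first, and a uniform 1/3 transition between dice (the notes don't specify these, so they are illustrative assumptions):

```python
# Viterbi decoding for the dice example: the die chosen each round is the
# hidden state, the number rolled is the observation.
dice = {"d4": 4, "d6": 6, "d8": 8}
states = list(dice)

def emission(state, obs):
    """P(obs | state): a fair k-sided die shows each of 1..k with prob 1/k."""
    k = dice[state]
    return 1.0 / k if 1 <= obs <= k else 0.0

def viterbi(observations):
    """Return the most likely hidden dice sequence for the rolls seen."""
    n = len(states)
    # delta[s] = probability of the best hidden path ending in state s.
    delta = {s: (1.0 / n) * emission(s, observations[0]) for s in states}
    back = []  # back-pointers for path reconstruction
    for obs in observations[1:]:
        new_delta, pointers = {}, {}
        for s in states:
            # With uniform transitions, the best predecessor is simply
            # the state with the largest delta so far.
            best_prev = max(states, key=lambda p: delta[p])
            new_delta[s] = delta[best_prev] * (1.0 / n) * emission(s, obs)
            pointers[s] = best_prev
        delta, back = new_delta, back + [pointers]
    # Trace back from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))

print(viterbi([1, 2, 7]))  # a 7 can only have come from the 8-sided die
```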


Reinforcement learning

The following tutorials are great; I'm recording them here first and will add my own understanding later.

Epsilon-greedy

http://blog.csdn.net/zjq2008wd/article/details/52860654
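A minimal sketch of epsilon-greedy action selection, using an assumed 3-arm bandit with fixed value estimates: with probability epsilon pick a random action (explore), otherwise pick the action with the highest estimated value (exploit).

```python
import random

def epsilon_greedy(q_values, epsilon):
    """Pick an action index from estimated action values q_values."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                       # explore
    return max(range(len(q_values)), key=q_values.__getitem__)       # exploit

random.seed(0)
q = [0.2, 0.8, 0.5]  # illustrative value estimates; arm 1 is best
choices = [epsilon_greedy(q, epsilon=0.1) for _ in range(1000)]
# With epsilon = 0.1, roughly 90% of the picks should be the greedy arm.
print(choices.count(1) / len(choices))
```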

Q-learning

http://blog.csdn.net/zjq2008wd/article/details/52767692
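A tabular Q-learning sketch on a tiny made-up environment (a five-state corridor with a reward at the far end; all states, rewards, and hyperparameters are illustrative assumptions). The key line is the update target, which bootstraps off the best action in the next state:

```python
import random

N_STATES, GOAL = 5, 4                      # states 0..4, start at 0, goal at 4
ALPHA, GAMMA, EPSILON, EPISODES = 0.1, 0.9, 0.1, 500

def step(state, action):
    """Deterministic corridor dynamics: 0 = left, 1 = right."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
for _ in range(EPISODES):
    state, done = 0, False
    while not done:
        # Epsilon-greedy behaviour policy.
        if random.random() < EPSILON:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        nxt, reward, done = step(state, action)
        # Q-learning update: target bootstraps off the best next action.
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = nxt

# The learned greedy policy should be "always go right".
policy = [0 if q[0] > q[1] else 1 for q in Q[:GOAL]]
print(policy)
```

Note the off-policy character: the agent behaves epsilon-greedily, but the update always uses `max(Q[nxt])`, the value of the greedy action.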

Neural networks and reinforcement learning

http://www.cnblogs.com/Leo_wl/p/5852010.html




Origin: blog.csdn.net/Sun7_She/article/details/70482259