Anyone who has studied random processes will have encountered Markov chains. A Markov chain is a statistical model used to describe and predict processes in the real world, such as forecasting stock-market trends, dialogue prediction, and poetry generation. Since Markov chains are so widely used, it is worth getting to know them well.
Illustration of a Markov chain
In fact, a Markov chain can be regarded as a relatively simple probabilistic graphical model: the nodes all live in the same graph, connected by one-way or two-way edges.
Let's look at a simpler example.
Each node represents a random variable to be analyzed, and the arrows between nodes indicate the transition relationships between the random variables.
A Markov chain has three core elements: the state space, memorylessness, and the transition matrix.
The state space is the set of states a node can take. It is not difficult to see from the figure that a Markov chain is a random process that moves from one state to another within the state space.
Memorylessness means that the distribution of the future state depends only on the present state and not on the past, as expressed by the formula:
$P(s_t \mid s_{t-1}, s_{t-2}, \dots, s_1) = P(s_t \mid s_{t-1})$
At each step of a Markov chain, the system either switches from one state to another according to a probability distribution (i.e., the transition equation) or remains where it is. The transition equation encodes the mapping between different states, and this mapping is usually expressed as probabilities.
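The step-by-step transitions described above can be sketched in a few lines of Python. The two-state "weather" chain and its probabilities below are hypothetical examples, not from the original text; the point is only that each step samples the next state from the current state's row of transition probabilities.

```python
import random

# Hypothetical two-state chain: 0 = "sunny", 1 = "rainy".
# trans[i][j] is the probability of moving from state i to state j.
trans = [
    [0.9, 0.1],  # sunny -> sunny 0.9, sunny -> rainy 0.1
    [0.5, 0.5],  # rainy -> sunny 0.5, rainy -> rainy 0.5
]

def step(state):
    """Sample the next state using only the current state (memorylessness)."""
    return 0 if random.random() < trans[state][0] else 1

random.seed(0)
state = 0
path = [state]
for _ in range(10):
    state = step(state)   # the past trajectory is never consulted
    path.append(state)
print(path)  # a sample trajectory of states
```

Note that `step` never looks at `path`: the sampled future depends only on the present state, which is exactly the memorylessness property.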
After intuitively understanding the Markov chain from the picture, if you want to delve into its principle and apply it, you need to use the mathematical expression of the Markov chain.
Mathematical expression of Markov chain
State vector
$X^{(n)} = (X_1^{(n)}, X_2^{(n)}, X_3^{(n)}, \dots, X_k^{(n)})$
The Markov chain has k possible states. Each element of the state vector corresponds to one state, and its value is a probability: the likelihood of being in that state at the current step. Because the state vector covers all possibilities, its elements sum to 1.
Transition matrix
$$P = \begin{pmatrix} p_{11} & p_{21} & \dots & p_{k1} \\ p_{12} & p_{22} & \dots & p_{k2} \\ \vdots & \vdots & & \vdots \\ p_{1k} & p_{2k} & \dots & p_{kk} \end{pmatrix}$$
where $p_{ij}$ is the probability of transitioning from state $i$ to state $j$. Then, by matrix multiplication and the evolution law of the Markov chain, we get:

$X^{(n+1)} = P X^{(n)}$

Therefore, applying this recursion along the chain: $X^{(n)} = P^n X^{(0)}$
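The evolution rule can be checked numerically. This sketch reuses the hypothetical two-state chain from before (the matrix values are illustrative, not from the original text); it uses the row-vector convention $x_{n+1} = x_n P$ with a row-stochastic $P$, which is simply the transpose of the column form above.

```python
import numpy as np

# Hypothetical row-stochastic transition matrix: P[i, j] = prob(i -> j).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

x0 = np.array([1.0, 0.0])                 # start surely in state 0
x3 = x0 @ np.linalg.matrix_power(P, 3)    # X^(3) = X^(0) P^3 in one shot

# Stepping three times with X^(n+1) = X^(n) P gives the same distribution.
x = x0.copy()
for _ in range(3):
    x = x @ P
print(np.allclose(x, x3))  # True
print(x3.sum())            # still a probability vector, sums to 1
```

This confirms that chaining the one-step rule is equivalent to applying the n-th matrix power to the initial distribution.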
Since the state transition at each moment depends only on the previous state, the transition probability matrix between the states fully determines the Markov chain model.
Steady-state distribution of Markov chains
When observed for long enough, some Markov chains settle into a stable distribution. This is called the steady-state phenomenon, and the resulting distribution is the steady-state (stationary) distribution of the chain. Not all Markov chains have a steady-state distribution; the following conditions must be met:
- Recurrent: every state can be returned to, usually via a loop
- Aperiodic: returns to a state are not restricted to multiples of some fixed period
- Irreducible: every pair of states is mutually reachable