"Probability and Statistics": State Transitions and a First Acquaintance with the Markov Chain

Recalling two important stochastic processes.

In the overview of random processes, we mentioned two very typical and important kinds of stochastic processes. The first kind includes the Bernoulli process and the Poisson process. These processes are memoryless: the future state does not depend on the past, i.e. each new "success" or "arrival" is independent of the process's history.

The other kind is just the opposite: the future depends on the past, and in a certain sense the past can be used to predict the future. The core example, and our subject here, is the Markov process, which has deep and wide applications in many fields.

Discrete-time Markov chains

The three elements of a Markov chain

A Markov chain describes state transitions occurring over time, and Markov chains are accordingly divided into discrete-time and continuous-time chains. We first consider the discrete-time Markov chain, whose state changes occur at fixed, discrete time points.

A discrete-time Markov chain has three core concepts: the discrete time index, the state space, and the transition probabilities.

In a discrete-time Markov chain, we usually use \(n\) to denote the time and \(X_n\) to denote the state of the chain at that time. All possible states of the chain form a set \(S\), which we call the state space of the discrete-time Markov chain.

Based on this state space, we can discuss the second concept of Markov chains: the transition probability. It is described as follows: when the current state of the discrete-time Markov chain is i, the probability that the next state equals j is a fixed number, which we denote \(P_{ij}\).

The transition probability \(P_{ij}\) can naturally be expressed as a conditional probability:

\(P_{ij} = P(X_{n+1} = j| X_{n} = i), i,j ∈S\)

Here is a discrete-time Markov chain with three states; its transition probabilities are as follows:

The figure shows the three states of Komeiji Satori: every day she is in one of them, either staying at home, going out to exercise, or going out to eat. This is Satori's state space. If Satori stays at home today, then tomorrow she stays home with probability 0.2, goes out to eat with probability 0.6, and goes out to exercise with probability 0.2; these are some of her transition probabilities.

Markov property

If the discrete time index, the state space, and the transition probabilities are the constituent elements of a discrete-time Markov chain, then the Markov property is its soul.

The property is described as follows: as long as the Markov chain is in state i at time n, then no matter what happened before, the probability that the chain moves to state j at time n + 1 is always the same transition probability \(P_{ij}\).

We can explain this with the figure above. Say Satori is out eating today. Regardless of whether she exercised or stayed home yesterday, the probability that she goes out to exercise tomorrow is the same, 0.3, and the probability that she continues eating out is also the same, 0.6. In short: the next state depends only on the current state and is unrelated to the earlier history.
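To make the Markov property concrete, here is a minimal simulation sketch of Satori's chain. The text only gives the "home" row and two entries of the "eat" row; the missing "eat" entry is inferred from the row summing to 1, and the entire "exercise" row is a made-up placeholder, not something stated in the text.

```python
import random

# Transition rows keyed by today's state. The "exercise" row is
# hypothetical (not given in the text); the "home" entry of the
# "eat" row is inferred from the fact that each row must sum to 1.
P = {
    "home":     {"home": 0.2, "exercise": 0.2, "eat": 0.6},
    "eat":      {"home": 0.1, "exercise": 0.3, "eat": 0.6},
    "exercise": {"home": 0.3, "exercise": 0.4, "eat": 0.3},  # hypothetical
}

def next_state(state, rng):
    """Sample tomorrow's state given only today's state (Markov property)."""
    r = rng.random()
    acc = 0.0
    for s, p in P[state].items():
        acc += p
        if r < acc:
            return s
    return s  # guard against floating-point round-off

rng = random.Random(42)
path = ["home"]
for _ in range(7):
    path.append(next_state(path[-1], rng))
print(path)  # a random week of Satori's days
```

Note that `next_state` looks only at `path[-1]`: the sampler has no access to earlier history, which is exactly the Markov property.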

After all this talk and the little example, the formal statement is in fact just one conditional probability equation:

That is, for any time \(n\), any states \(i, j ∈ S\), and any possible sequence of earlier states \(i_{0}, i_{1}, ..., i_{n-1}\) before time n (i.e. the history), we have:

\(P(X_{n+1} = j|X_{n}=i,X_{n-1}=i_{n-1},...,X_{0}=i_0) = P(X_{n+1} = j|X_n=i) = P_{ij}\)

It is fairly straightforward: the next state \(X_{n+1}\) depends only on the current state \(X_n\).

Transition probabilities and state transition matrix

So the question is: what properties does the transition probability \(P_{ij}\) have? First, \(P_{ij}\) must be non-negative: \(P_{ij} ≥ 0\).

Second, note that \(∑_{j=1}^{m}P_{ij} = 1\) holds for every state i. This is easy to understand from Satori's state diagram above: if she stays home today, then tomorrow she stays home with probability 0.2, goes out to exercise with probability 0.2, and goes out to eat with probability 0.6. However these three probabilities are distributed, one thing is certain: they must sum to 1.

Consider a slightly special case: the value of \(P_{ij}\) when j = i. In Satori's chain this corresponds to staying home today and staying home again tomorrow. Although the state has not changed, we can regard it as a special transition: a transition to itself.

In the state space S, any two states i and j have a transition probability \(P_{ij}\) satisfying \(P_{ij} ≥ 0\) (when state i cannot move to state j, \(P_{ij} = 0\)). We can therefore arrange all the transition probabilities, in the order of the state space, into a two-dimensional matrix whose row-i, column-j element is \(P_{ij}\). This two-dimensional matrix:

is called the transition probability matrix, and it characterizes the corresponding Markov chain.

We continue with the Komeiji Satori example: let state 1 be staying at home, state 2 exercising, and state 3 eating out. The transition matrix of this Markov chain is then a 3×3 two-dimensional matrix.
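The two properties above (non-negativity and rows summing to 1) are easy to check numerically. A small sketch writing Satori's chain as a 3×3 NumPy matrix; as before, row 2 (exercise) is a hypothetical placeholder not given in the text:

```python
import numpy as np

# state 1: stay home, state 2: exercise, state 3: eat out
# row 2 (exercise) is hypothetical; the text does not specify it
P = np.array([[0.2, 0.2, 0.6],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

assert np.all(P >= 0)                  # every P_ij is non-negative
assert np.allclose(P.sum(axis=1), 1)   # every row sums to 1
print("P is a valid (stochastic) transition matrix")
```

A matrix with these two properties is called a stochastic matrix, which is exactly what a transition probability matrix is.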

Summary of the properties of a Markov chain

At this point it is worth pausing to organize the most basic properties of a Markov chain. A Markov chain is determined by the following main features:

  • first, the state set S = {1, 2, 3, ..., m};
  • second, the pairs (i, j) between which a transition can occur, i.e. those with \(P_{ij} > 0\);
  • third, the values of \(P_{ij}\).

All three can ultimately be described by one two-dimensional transition probability matrix.

A Markov chain described by the features above is a sequence of random variables \(X_{0},X_{1},X_{2},...,X_{n}\) taking values in the state space S and satisfying: for any time n, all states i, j ∈ S, and every possible sequence of earlier states \(i_{0},i_{1},...,i_{n-1}\), we have:

\(P(X_{n+1} = j|X_{n}=i,X_{n-1}=i_{n-1},...,X_{0}=i_0) = P(X_{n+1} = j|X_n=i) = P_{ij}\)

One-step arrival and multi-step transitions

The transition probability \(P_{ij}\) we just spent so long introducing gives the probability of going from state i to state j in one step: \(P_{ij} = P(X_{n+1}=j|X_{n}=i)\). Extending this, suppose we move from state i to state j not in one step but in m steps (where m > 1); the corresponding quantity is the m-step transition probability.

Written as a conditional probability, it is:

\(P^{m}(i, j) = P(X_{n+m} = j|X_{n} = i)\)

Here we switch to a different example. Social mobility is a question of wide public concern: strong mobility, in which families at the bottom can move to the middle or even the upper layer of society through their own efforts, is an important driving force keeping a society vital.

Sociology has a famous Markov chain reflecting movement between social classes. Its state space has three states: state 1 is the poverty level, state 2 is the middle class, and state 3 is financial freedom. Its transition matrix is:

\(P = \begin{pmatrix} 0.7 & 0.2 & 0.1 \\ 0.3 & 0.5 & 0.2 \\ 0.2 & 0.4 & 0.4 \end{pmatrix}\)

We will not debate whether these numbers are reasonable or accurate; taking them at face value, \(P_{13}=0.1\) means that if this generation is at the poverty level, the probability that the next generation reaches financial freedom is 0.1, while the probability of remaining in poverty is much larger, 0.7. And if this generation has reached financial freedom, the probability that the next generation is also financially free is considerably higher, 0.4.

Now consider the following question: suppose the grandfather is at the poverty level (state 1). What is the probability that the father is middle class (state 2) and you are financially free (state 3)?

Where to start? Stick closely to the definition and work from the conditional probability expression:

\(P(X_{2}=3,X_{1}=2|X_{0}=1) = \frac{P(X_{2}=3,X_{1}=2,X_{0}=1)}{P(X_{0}=1)}\)

\(= \frac{P(X_{2}=3,X_{1}=2,X_{0}=1)}{P(X_{1}=2,X_{0}=1)} · \frac{P(X_{1}=2,X_{0}=1)}{P(X_{0}=1)}\)

\(= P(X_{2}=3|X_{1}=2,X_{0}=1) · P(X_{1}=2|X_{0}=1)\)

Since this is a Markov chain, it satisfies:

\(P(X_{2}=3|X_{1}=2,X_{0}=1) = P(X_{2}=3|X_{1}=2)\)

Therefore:

\(P(X_{2}=3,X_{1}=2|X_{0}=1) = P(X_{2}=3|X_{1}=2)P(X_{1}=2|X_{0}=1)\)

In the transition probability matrix, this is the probability of moving from state 1 to state 2 times the probability of moving from state 2 to state 3:

\(P_{12}P_{23} = 0.2·0.2 = 0.04\)

Next, without specifying the father's state, consider this question: given that the grandfather is at the poverty level (state 1), what is the probability that the grandson is financially free (state 3)?

Here only the grandfather's and the grandson's states are specified, so the father's generation can be in any of the three states: poverty, middle class, or financial freedom. The probability expression is then simple to write:

\(P(X_{2}=3|X_{0}=1) = P(X_{2}=3,X_{1}=1|X_{0}=1) + P(X_{2}=3,X_{1}=2|X_{0}=1) + P(X_{2}=3,X_{1}=3|X_{0}=1)=P_{11}P_{13}+P_{12}P_{23}+P_{13}P_{33}\)

\(=∑_{k=1}^{3}P_{1k}P_{k3}=0.7·0.1+0.2·0.2+0.1·0.4=0.15\)
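The sum over the father's intermediate state can be checked numerically. A small sketch using the mobility matrix given above:

```python
import numpy as np

A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.4, 0.4]])

# two-step probability 1 -> 3, summing over the intermediate state k
two_step = sum(A[0, k] * A[k, 2] for k in range(3))
print(round(two_step, 6))  # 0.15

# it is exactly the (row 1, column 3) entry of A @ A
assert abs(two_step - (A @ A)[0, 2]) < 1e-12
```

The assertion is precisely the linear-algebra observation made next: the sum is a row-by-column dot product, i.e. one entry of the matrix product.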

The specific numerical result is not really what matters; what matters is to look again at the expression itself:

\(P(X_{2}=3|X_{0}=1) = P_{11}P_{13}+P_{12}P_{23}+P_{13}P_{33}\)

Readers familiar with linear algebra should recognize this equality immediately: it is

the dot product of the first row and the third column of the transition matrix. By the rules of matrix multiplication, this result sits exactly in the first row and third column of the product matrix, corresponding precisely to the two-step transition probability from state 1 to state 3.

Now imagine multiplying the transition probability matrix by itself, i.e. taking its square:

The resulting new 3×3 two-dimensional matrix then contains the probabilities of reaching every state from every state in two steps:

import numpy as np

A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.4, 0.4]])


print(A @ A)
"""
[[0.57 0.28 0.15]
 [0.4  0.39 0.21]
 [0.34 0.4  0.26]]
"""

The result shows that the entry in the first row and third column is indeed the probability 0.15 we just computed.

By analogy, to obtain the n-step transition probabilities we simply take the n-th power of the transition matrix above:

import numpy as np

A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.4, 0.4]])


def nth_power(M, n):
    """Multiply M by itself n times, left to right.
    np.linalg.matrix_power(M, n) is the built-in equivalent."""
    result = M
    for _ in range(n - 1):
        result = result @ M
    return result


print(nth_power(A, 3))
"""
[[0.513 0.314 0.173]
 [0.439 0.359 0.202]
 [0.41  0.372 0.218]]
"""
print(nth_power(A, 5))
"""
[[0.47683 0.3353  0.18787]
 [0.46251 0.34373 0.19376]
 [0.45662 0.34708 0.1963 ]]
"""
print(nth_power(A, 10))
"""
[[0.46823165 0.34033969 0.19142866]
 [0.4679919  0.34048014 0.19152797]
 [0.46789259 0.3405383  0.19156911]]
"""
print(nth_power(A, 20))
"""
[[0.46808515 0.34042551 0.19148934]
 [0.46808508 0.34042555 0.19148937]
 [0.46808505 0.34042556 0.19148938]]
"""
print(nth_power(A, 100))
"""
[[0.46808511 0.34042553 0.19148936]
 [0.46808511 0.34042553 0.19148936]
 [0.46808511 0.34042553 0.19148936]]
"""

Clearly, as n grows, the n-step transition matrix converges to:

[[0.46808511 0.34042553 0.19148936]
 [0.46808511 0.34042553 0.19148936]
 [0.46808511 0.34042553 0.19148936]]

Notice that the three elements in every row are identical. This means that no matter whether you are currently poor, middle class, or financially free, after many generations the probability that your descendants land in any particular one of the three classes is fixed, and the largest of the three probabilities is that of ending up in the poverty class. Of course, this is merely a computation from the given numbers; whether it matches sociological reality is not our concern. Still, it does suggest that becoming wealthy has never been easy.
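The limiting row can also be obtained directly, without taking high matrix powers: it is the stationary distribution \(π\) satisfying \(πP = π\), i.e. a left eigenvector of the transition matrix for eigenvalue 1. A minimal sketch:

```python
import numpy as np

A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.4, 0.4]])

# a left eigenvector of A is a right eigenvector of A.T
w, v = np.linalg.eig(A.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])  # eigenvector for eigenvalue 1
pi = pi / pi.sum()                            # normalize to sum to 1
print(pi)  # matches the rows of the limit matrix above
```

The result is exactly (22/47, 16/47, 9/47) ≈ (0.46808511, 0.34042553, 0.19148936), the row that every power of A converges to.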

Finally, let us look at the path probability problem derived from the n-step transition probabilities.

Given a Markov chain model, we can compute the probability of any given future state sequence. In particular, we have:

\(P(X_{0}=i_{0},X_{1}=i_{1},...,X_{n}=i_{n})=P(X_{0}=i_{0})P_{i_{0}i_{1}}P_{i_{1}i_{2}}···P_{i_{n-1}i_{n}}\)

Combining the Markov property with the definition of conditional probability, this formula is very simple to justify.

First, by the multiplication rule of conditional probability:

\(P(X_{0}=i_{0},X_{1}=i_{1},...,X_{n}=i_{n})=P(X_{n}=i_{n}|X_{0}=i_{0},X_{1}=i_{1},...,X_{n-1}=i_{n-1})P(X_{0}=i_{0},X_{1}=i_{1},X_{2}=i_{2},...X_{n-1}=i_{n-1})\)

Then, by the Markov property:

\(P(X_{n}=i_{n}|X_{0}=i_{0},X_{1}=i_{1},...,X_{n-1}=i_{n-1})=P(X_{n}=i_{n}|X_{n-1}=i_{n-1}) = P_{i_{n-1}i_{n}}\)

This gives:

\(P(X_{0}=i_{0},X_{1}=i_{1},...,X_{n}=i_{n})= P_{i_{n-1}i_{n}}P(X_{0}=i_{0},X_{1}=i_{1},...,X_{n-1}=i_{n-1})\)

Applying this recursion repeatedly yields the formula we started with:

\(P(X_{0}=i_{0},X_{1}=i_{1},...,X_{n}=i_{n})= P(X_{0}=i_{0})P_{i_{0}i_{1}}P_{i_{1}i_{2}}···P_{i_{n-1}i_{n}}\)

A concrete example makes the path problem easy to understand. Starting from your great-grandfather: great-grandfather poor, grandfather poor, father middle class, you financially free, your son middle class, your grandson poor. That is one path, and a rather tragic one: the classic story of hard-won success that does not survive three generations. Its path probability is \(P(X_{0}=1)P_{11}P_{12}P_{23}P_{32}P_{21}\), where the leftmost factor \(P(X_{0}=1)\) is the probability that the great-grandfather is poor; once that value is specified, the probability of the whole path can be computed.
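A path probability is just a product of matrix entries along the path. A sketch, assuming for illustration that the great-grandfather is poor with probability 1 (the text leaves \(P(X_{0}=1)\) unspecified):

```python
import numpy as np

A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.4, 0.4]])

# path (1-indexed states): poor -> poor -> middle -> free -> middle -> poor
path = [1, 1, 2, 3, 2, 1]
p0 = 1.0  # assumed P(X0 = 1); any initial distribution could be used

prob = p0
for i, j in zip(path, path[1:]):
    prob *= A[i - 1, j - 1]   # multiply P_ij along consecutive pairs
print(round(prob, 6))  # 0.00336
```

The loop multiplies \(P_{11}P_{12}P_{23}P_{32}P_{21} = 0.7 · 0.2 · 0.2 · 0.4 · 0.3\), matching the formula above.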


Origin www.cnblogs.com/traditional/p/12609648.html