数学建模系列(五)——马尔科夫模型

$1.$ 马氏链常规操作

$1.1$ 几个性质

粗暴的了解下马尔科夫几个重要的性质，下一个状态只跟当前的状态相关，与前一个、前两个状态八竿子打不着。比如在时间 $t$ 的状态下， $t+1$ 时刻的状态只跟 $t$ 时刻的转台相关， $t+1$ 时刻的状态与 $t-1$ 时刻的状态无关。换成公式表达就是

P {X_{t + 1} = j | X_{t}, X_{t - 1}, . . ., X_{1}} = P {X_{t + 1} | X_{t}}

$P\{X_{t+1}=j|X_t,X_{t-1},...,X_{1}\}=P\{X_{t+1}|X_t\}$

继续推导上面的公式

P {X_{n + m} = j | X_{n} = i} = p_{i j} (m)

$P\{X_{n+m}=j|X_{n}=i\}=p_{ij}(m)$

什么意思呢，在 $n$ 个状态发生 $i$ 事件的基础上经过 $m$ 个步骤发生 $j$ 事件的概率为 $p_{ij}(m)$

$1.2$ 应用

看完了公式可能云里雾里，来个实例醒醒脑。

状态转移矩阵
	A	B	C
A	0.8	0.6	0.3
B	0.2	0.65	0.47
C	0.3	0.8	0.1

看上面的表格，假设表格的矩阵用 $X$ 表示。 $A$ 转移到 $A$ 的概率为 $0.8$ ， $B$ 转移到 $C$ 的概率为0.47，比如现在来了一个初始状态 $S=\{0.5,0.6,0.7\}$ ，也就是发生 $A$ 的概率为 $0.5$ ，发生 $C$ 的概率为 $0.7$ ，在一次转以后，也就是 $S\cdot X$ ，得到一个 $1\times3$ 的输出矩阵，比如是 $S_1= \{ s_1,s_2,s_3 \}$ ，也就是下一时刻发生 $A$ 的概率为 $s_1$ ，发生 $C$ 的概率为 $s_3$ 。如果还要算在下一时刻的状态，就是 $S_1 \cdot A$ 。想要算几次以后的概率，就乘以几次状态转移矩阵。知乎上看到的一个通俗的理解

$1.3$ python程序

import numpy as np 

def Generating_transfer_matrix():
# 原始数据矩阵
    a = np.array([[1, 2, 3, 4],\
                  [1, 2, 2, 3],\
                  [2, 1, 3, 4],\
                  [3, 2, 2, 2],\
                  [4, 2, 3, 4]])

    dimension = a.shape
    Size = a.size
    a = a.flatten()
    ls = a.tolist()
    Set = set(ls)
    count = len(Set)  #检测有几个不重复的元素
    indexitem = list(Set)
    temp = np.arange(count * count)
    temp = temp.reshape(count, count)
    for i in range(0, count):
        for j in range(0, count):
            temp[i][j] = 0
    # 生成转移矩阵
    for k in range(0, 4):
        for j in range(0, count):
            for i in range(0, Size - 1):
                if (i == 3) or (i % (dimension[1]) == 3):
                    temp[k][j] += 0
                elif (a[i] == indexitem[k]) and (a[i + 1] == indexitem[j]):
                    temp[k][j] += 1

    return (count, temp)

def Computed_probability_matrix():
#生成概率矩阵
    receive = Generating_transfer_matrix()
    count = receive[0]
    temp = receive[1]

    temp.dtype = 'float32'
    dimension1 = temp.shape
    for i in range(0, dimension1[0]):
        sum = temp[i].sum()
        temp[i] = temp[i] / sum

    return temp

# 开始马尔科夫计算
def Markov(i):
    # 设置原始状态
    calculated_matrix = Computed_probability_matrix()
    startstatus = np.array([0.3, 0.4, 0.2, 0.6])
    dimension1 = (calculated_matrix.shape)
    temp1 = np.arange(dimension1[0])
    for k in range(0, i):
        if k == 0:
            temp1 = np.dot(startstatus, calculated_matrix)
        else:
            temp1 = np.dot(temp1, calculated_matrix)

    return temp1

if __name__ == '__main__':
    result = Markov(10)   #计算第十次后的状态
    print("The ith selection bias distribution: " + str(result))

忘了看sk-learn库有没有现成的函数，尴尬，大概浏览了下好像没有。

$1.4$ 扩展

继续上面那个矩阵 $A$ ，我们可以得到转移到状态 $A$ 的方程： $p_1=0.8p_1+0.2p_2+0.3p_3$ ，到状态C的方程 $p_3=0.3p_1+0.47p_2+0.1p_3$ 。可以推出一个方程组

p_{1} = 0.8 p_{1} + 0.2 p_{2} + 0.3 p_{3}

$p_1=0.8p_1+0.2p_2+0.3p_3$ $p_{2} = 0.6 p_{1} + 0.65 p_{2} + 0.8 p_{3}$ $p_2=0.6p_1+0.65p_2+0.8p_3$ $p_{3} = 0.3 p_{1} + 0.47 p_{2} + 0.1 p_{3}$ $p_3=0.3p_1+0.47p_2+0.1p_3$

现在来解上面的方程组，高中生用那种方法，大学生用线性代数的方法，不管怎么样，解出来假设结果是(我不想算) $p_1=0.6,p_2=0.8,p_3=0.7$ ，说明了随着时间的转移， $p_2$ 占有的优势更大，发生事件 $B$ 的几率更大。

$1.5$ 继续—–隐马尔科夫模型(HMM)

在看周志华那本机器学习的时候看见过隐马尔科夫模型，然而实在看不懂，玄学的公式和实际的问题对不上号。先引入两个别人的博客，我也是一路百度看他们的内容才看懂的。这个内容粗暴简单。这个内容详细丰富。

$1.5.1$ 概念引入(隐藏的和实际的是两个概念)：

所有可能的隐藏状态集合： $Q=\{q_1,a_2,...,q_n\}$

所有可能的观测状态集合： $V=\{v_1,v_2,...,v_n\}$

隐藏的状态序列： $I=\{i_1,i_2,...,i_n\}$

实际的观测序列： $O=\{o_1,o_2,...,o_n\}$

隐藏状态转移矩阵(行列数相等)：

\begin{matrix} (1) & A = (\begin{array}{ccc} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{array}) \end{matrix}

$\begin{equation} A= \left( \begin{array}{ccc} A_{11} & A_{12} & A_{13}\\ %第一行元素 A_{21} & A_{22 }& A_{23}\\ %第二行元素 A_{31} & A_{32} & A_{33}\\ %第二行元素 \end{array} \right) \tag{ 1} \end{equation}$

观测概率矩阵(行数等于上面矩阵的列数,列数随意)：

\begin{matrix} (2) & B = (\begin{array}{ccc} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{array}) \end{matrix}

$\begin{equation} B= \left( \begin{array}{ccc} B_{11} & B_{12} & B_{13}\\ %第一行元素 B_{21} & B_{22 }& B_{23}\\ %第二行元素 B_{31} & B_{32} & B_{33}\\ %第二行元素 \end{array} \right) \tag{ 2} \end{equation}$

最开始隐藏状态分布 $\varphi=\{\varphi_1,\varphi_2,...,\varphi_n\}$ ，维度和矩阵 $A$ 的行数相同。

HMM模型搭建完毕， $\lambda=(A,B,\varphi)$

看个实例吧，公式看着晕。

$1.5.2$ 实例说明

看个实例吧，公式看着晕，比如在盒子里面抓球解解闷，假设三个箱子里只有红色和白色的球。隐藏状态 $O$ 是三个箱子，观测状态 $V$ 是白红。

三个箱子，第一个箱子5红5白，第二个箱子4红6白，第三个箱子7红3白。也就是能得到下面的观测概率矩阵：

\begin{matrix} (3) & B = (\begin{array}{ccc} 0.5 & 0.5 \\ 0.6 & 0.4 \\ 0.7 & 0.3 \end{array}) \end{matrix}

$\begin{equation} B= \left( \begin{array}{ccc} 0.5& 0.5 \\ %第一行元素 0.6 & 0.4\\ %第二行元素 0.7& 0.3\\ %第二行元素 \end{array} \right) \tag{ 3} \end{equation}$

按照下面的方法从盒子里抽球，开始的时候，从第一个盒子抽球的概率是0.2，从第二个盒子抽球的概率是0.4，从第三个盒子抽球的概率是0.4。也就是初始隐藏分布为 $\lambda=\{0.2,0.4,0.4\}$ 。

以这个概率抽一次球后，将球放回。然后从当前盒子转移到下一个盒子进行抽球。规则是：如果当前抽球的盒子是第一个盒子，则以0.5的概率仍然留在第一个盒子继续抽球，以0.2的概率去第二个盒子抽球，以0.3的概率去第三个盒子抽球。

如果当前抽球的盒子是第二个盒子，则以0.5的概率仍然留在第二个盒子继续抽球，以0.3的概率去第一个盒子抽球，以0.2的概率去第三个盒子抽球。

如果当前抽球的盒子是第三个盒子，则以0.5的概率仍然留在第三个盒子继续抽球，以0.2的概率去第一个盒子抽球，以0.3的概率去第二个盒子抽球。上面三段，能得到状态转移矩阵(自己意会一下矩阵 $A,B$ 的维度表示啥意思)。

\begin{matrix} (4) & A = (\begin{array}{ccc} 0.5 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0.2 \\ 0.2 & 0.3 & 0.5 \end{array}) \end{matrix}

$\begin{equation} A= \left( \begin{array}{ccc} 0.5 & 0.2 & 0.3\\ %第一行元素 0.3& 0.5& 0.2\\ %第二行元素 0.2& 0.3 & 0.5\\ %第二行元素 \end{array} \right) \tag{ 4} \end{equation}$

如此下去，直到重复三次，得到一个球的颜色的观测序列: $O=\{红，白，红\}$ 。在这个观测状态下，推测最有可能来源于哪个箱子，也就是哪个隐藏状态，也就是要计算状态 $I$ 。

说了半天，终于把那些公式符号都套上了。然后去解决实际问题。

$1.5.3$ 可应用的实际问题

　　　　 1)评估观察序列概率。即给定模型 $\lambda=(A,B,\varphi)$ 和观测序列 $O=\{红，白，红\}$ ，计算在模型 $\lambda$ 下观测序列 $O$ 出现的概率 $P(O|λ)$ 。这个问题是HMM模型三个问题中最简单的。

　　　　 2)模型参数学习问题。即给定观测序列 $O$ ，估计模型 $\lambda=(A,B,\varphi)$ 的参数，使该模型下观测序列的条件概率 $P(O|λ)$ 最大。

　　　　 3)预测问题，也称为解码问题。即给定模型 $\lambda=(A,B,\varphi)$ 和观测序列 $O$ ，求给定观测序列条件下，最可能出现的对应的状态序列，这个问题的求解需要用到基于动态规划的维特比算法。(当时就是用的这个思想做的2018泰迪杯B题)。

　　　　以上问题的求解步骤我看懂了差不多，但是推导太复杂了我也不想写。上面的两个博客提到了求解的步骤，如果有兴趣可以去看看。真的，人家讲的特别详细，据说是看的《统计学习方法》这本书，好像又在骗我买书的样子。

$1.5.4$ 程序(我把解决的三个问题都写在一个程序里面了)

import numpy as np
from hmmlearn import hmm
import math

# 状态序列
states = ["box 1", "box 2", "box3"]
n_states = len(states)

# 观测序列
observations = ["red", "white"]
n_observations = len(observations)

start_probability = np.array([0.2, 0.4, 0.4])

# 状态转移矩阵
transition_probability = np.array([
  [0.5, 0.2, 0.3],
  [0.3, 0.5, 0.2],
  [0.2, 0.3, 0.5]
])

dimension1 = transition_probability.shape
dimension1 = dimension1[0]

# print("数据维度" + str(dimension))
# 观测概率矩阵
emission_probability = np.array([
  [0.5, 0.5],
  [0.4, 0.6],
  [0.7, 0.3]
])

model = hmm.MultinomialHMM(n_components=n_states)
model.startprob_=start_probability
model.transmat_=transition_probability
model.emissionprob_=emission_probability

# 观测序列
seen = np.array([[0,1,0]]).T
dimension = seen.shape
dimension = dimension[0]

logprob, box = model.decode(seen, algorithm="viterbi")

print("解码问题：观测序列下球的顺序：", end = "")
for i in seen:
    if i[0] == 0:
        print(str(observations[0]), end = "  ")
    else:
        print(str(observations[1]), end = "  ")
print("\t")
print("解码问题：最佳状态序列：", end = "")

savestateindex = []
# 保存状态的索引
for i in box:
    if i == 0:
        print(states[i], end = "  ")
        savestateindex.append(states.index(states[i]))
    elif i == 1:
        print(states[i], end = "  ")
        savestateindex.append(states.index(states[i]))
    elif i == 2:
        print(states[i], end = "  ")
        savestateindex.append(states.index(states[i]))

lt = []
temp = np.array([0,0,0])
print("\n")

for k in range(0, dimension):
    ls = []
    if k == 0:
        temp = np.multiply(start_probability.T, emission_probability[:, seen[k][0]])  # 
        ls.append(temp)
        ls.sort()
        print("对应最佳状态序列的发生概率")
        print(ls[-1][savestateindex[0]])

    if ((k < dimension) and (k > 0)):
        for j in range(0, dimension1):
            ls.append(np.multiply(transition_probability[:,j], temp).max() * emission_probability[j, seen[k][0]])
            lt = ls
            if len(lt) == dimension1:
                # print(ls[-1])
                # print(lt)
                temp = np.array(lt)
                count = 1
                state = savestateindex[count]
                count += 1
                if count > dimensoin1:
                    count = 1
                print(temp[state])
        ls.sort()

score = (model.score(seen))
print("\n" + "概率问题：观测序列出现的概率：" + str(math.pow(math.e, score)))

model2 = hmm.MultinomialHMM(n_components=n_states)
# 给定观测序列
X2 = np.array([[0, 1, 1, 1, 1 ,0, 1]])

model2.fit(X2)
print("模型参数学习问题(已经重新定义的观测序列)：")
print ("初始状态" + str(model2.startprob_))
print ("最佳转移状态矩阵" + str(model2.transmat_))
print ("最佳观测概率矩阵" + str(model2.emissionprob_))
print ("对应发生事件的概率：" + str(math.pow(math.e, model2.score(X2))))

数学建模(五)——马氏链模型

数学建模系列(五)——马尔科夫模型

$1.$ 马氏链常规操作

$1.1$ 几个性质

$1.2$ 应用

$1.3$ python程序

$1.4$ 扩展

$1.5$ 继续—–隐马尔科夫模型(HMM)

猜你喜欢

数学建模(五)——马氏链模型

数学建模系列(五)——马尔科夫模型

1. 1. 1.马氏链常规操作

1.1 1.1 1.1几个性质

1.2 1.2 1.2应用

1.3 1.3 1.3 python程序

1.4 1.4 1.4 扩展

1.5 1.5 1.5继续—–隐马尔科夫模型(HMM)

猜你喜欢

$1.$ 马氏链常规操作

$1.1$ 几个性质

$1.2$ 应用

$1.3$ python程序

$1.4$ 扩展

$1.5$ 继续—–隐马尔科夫模型(HMM)