[Hidden Markov Model] Calculating the observation sequence probability: algorithms and worked examples

The hidden Markov model is a probabilistic model of time series. It describes a process in which a hidden Markov chain randomly generates an unobservable sequence of states, and each state then randomly generates an observation, producing the observation sequence. The model is generative: it represents the joint distribution of the state sequence and the observation sequence, but the state sequence is hidden and cannot be observed directly.

Computing the observation sequence probability efficiently requires suitable algorithms.

Model $\lambda=(A,B,\pi)$: $A$ is the state transition probability matrix, $B$ is the observation probability matrix, and $\pi$ is the initial state probability vector.

direct calculation method

The direct calculation method mainly serves to illustrate the idea: it is conceptually feasible but computationally infeasible (the amount of calculation is too large).

Idea:

1. List all possible state sequences $I=(i_1,i_2,\cdots,i_T)$ of length $T$

2. Find the joint probability $P(O,I|\lambda)$ of each state sequence $I$ and the observation sequence $O=(o_1,o_2,\cdots,o_T)$

3. Sum over all possible state sequences to get $P(O|\lambda)$.

Input: hidden Markov model $\lambda=(A,B,\pi)$ and observation sequence $O=(o_1,o_2,\cdots,o_T)$

Output: the probability $P(O|\lambda)$ of the observation sequence $O$

1) Probability of the state sequence $I=(i_1,i_2,\cdots,i_T)$

$P(I\mid\lambda)=\pi_{i_{1}}a_{i_{1}i_{2}}a_{i_{2}i_{3}}\cdots a_{i_{T-1}i_{T}}$

2) For a fixed state sequence $I=(i_1,i_2,\cdots,i_T)$, the probability of the observation sequence $O=(o_1,o_2,\cdots,o_T)$

$ P(O\mid I,\lambda)=b_{i_{1}}(o_{1})b_{i_{2}}(o_{2})\cdots b_{i_{T}}(o_{T}) $

3) The joint probability of $O$ and $I$ appearing simultaneously

\begin{aligned} P(O,I|\lambda)& =P(O|I,\lambda)P(I|\lambda)=\pi_{i_1}b_{i_1}(o_1)a_{i_1i_2}b_{i_2}(o_2)\cdots a_{i_{T-1}i_T}b_{i_T}(o_T) \end{aligned}

4) Sum all possible state sequences$I$ to get the probability of the observation sequence$O$

$ \begin{aligned} P(O|\lambda)& =\sum_{I}P(O\mid I,\lambda)P(I\mid\lambda) =\sum_{i_{1},i_{2},\cdots,i_{T}}\pi_{i_{1}}b_{i_{1}}(o_{1})a_{i_{1}i_{2}}b_{i_{2}}(o_{2})\cdots a_{i_{T-1}i_{T}}b_{i_{T}}(o_{T}) \end{aligned} $

In practice, the computation in step 4 is prohibitively large: it is of order $O(TN^T)$.
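The enumeration in steps 1)–4) can be sketched directly in code. This is a minimal sketch: `direct_prob` is a name chosen here, the model values are taken from the box-and-ball example later in this article, and observations are assumed to be encoded as column indices of $B$ (0 = Red, 1 = White).

```python
import itertools

# Box-and-ball model used in the examples below (0 = Red, 1 = White)
A  = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]  # state transition matrix
B  = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]                  # observation matrix
pi = [0.2, 0.4, 0.4]                                       # initial state distribution

def direct_prob(A, B, pi, obs):
    """Direct method: enumerate all N**T state sequences and sum P(O, I | lambda)."""
    N, T = len(pi), len(obs)
    total = 0.0
    for seq in itertools.product(range(N), repeat=T):
        # pi_{i1} b_{i1}(o_1) a_{i1 i2} b_{i2}(o_2) ... a_{i(T-1) iT} b_{iT}(o_T)
        p = pi[seq[0]] * B[seq[0]][obs[0]]
        for t in range(1, T):
            p *= A[seq[t - 1]][seq[t]] * B[seq[t]][obs[t]]
        total += p
    return total

print(direct_prob(A, B, pi, [0, 1, 0]))  # O = (Red, White, Red); ≈ 0.130218
```

The loop visits all $N^T$ state sequences and does $O(T)$ work for each, which is exactly the $O(TN^T)$ blow-up noted above.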

forward algorithm

Forward probability: given the hidden Markov model $\lambda=(A,B,\pi)$, the forward probability $\alpha_t(i)$ is the probability that the partial observation sequence up to time $t$ is $o_1,o_2,\cdots,o_t$ and the state at time $t$ is $q_i$, denoted $\alpha_t(i)=P(o_1,o_2,\cdots,o_t,i_t=q_i|\lambda)$

Input: hidden Markov model $\lambda=(A,B,\pi)$ and observation sequence $O=(o_1,o_2,\cdots,o_T)$

Output: the probability $P(O|\lambda)$ of the observation sequence $O$

1) Initialization: $\alpha_1(i)=\pi_ib_i(o_1),\quad i=1,2,\cdots,N$

2) Recursion: for $t=1,2,\cdots,T-1$,

\alpha_{t+1}(i)=\left[\sum_{j=1}^N\alpha_t(j)a_{ji}\right]b_i(o_{t+1}),\quad i=1,2,\cdots,N

3) Termination 

P(O|\lambda)=\sum_{i=1}^N\alpha_T(i)

The computation is of order $O(TN^2)$.
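The three steps above translate almost line-for-line into code. A minimal sketch, assuming observations are encoded as column indices of $B$ and using the box-and-ball model from the example below (0 = Red, 1 = White); `forward_prob` is a name chosen here:

```python
def forward_prob(A, B, pi, obs):
    """Forward algorithm, O(T*N^2); obs is a list of observation column indices."""
    N = len(pi)
    # 1) initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    # 2) recursion: alpha_{t+1}(i) = [sum_j alpha_t(j) * a_{ji}] * b_i(o_{t+1})
    for o in obs[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(N)) * B[i][o]
                 for i in range(N)]
    # 3) termination: P(O|lambda) = sum_i alpha_T(i)
    return sum(alpha)

A  = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B  = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
print(forward_prob(A, B, pi, [0, 1, 0]))  # ≈ 0.130218, the example's 0.13022
```

Each of the $T-1$ recursion steps sums over $N$ previous states for each of $N$ current states, giving the $O(TN^2)$ cost stated above.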

Example: box-and-ball model $\lambda=(A,B,\pi)$, state set $Q=\{1,2,3\}$, observation set $V=\{\text{Red},\text{White}\}$

A=\left[\begin{array}{ccc}0.5&0.2&0.3\\0.3&0.5&0.2\\0.2&0.3&0.5\end{array}\right],\quad B=\left[\begin{array}{cc}0.5&0.5\\0.4&0.6\\0.7&0.3\end{array}\right],\quad\pi=\left[\begin{array}{c}0.2\\0.4\\0.4\end{array}\right]

$T=3$, $O=(\text{Red},\text{White},\text{Red})$; use the forward algorithm to find $P(O|\lambda)$

answer:

1) Initial value 

\begin{aligned}\alpha_1(1)&=\pi_1b_1(o_1)=0.10\\\alpha_1(2)&=\pi_2b_2(o_1)=0.16\\\alpha_1(3)&=\pi_3b_3(o_1)=0.28\end{aligned}


$a_{ij}$ is the element in row $i$, column $j$ of $A$; $b_i(o_t)$ is the element in row $i$ of $B$, in the column corresponding to observation $o_t$.

For example, $o_1$ corresponds to Red, the first element of the observation set $V$, and hence to the first column of the observation probability matrix $B$.

2) Recursion

\begin{aligned}\alpha_2(1)&=\left[\sum_{i=1}^3\alpha_1(i)a_{i1}\right]b_1(o_2)=0.154\times0.5=0.077\\\alpha_2(2)&=\left[\sum_{i=1}^3\alpha_1(i)a_{i2}\right]b_2(o_2)=0.184\times0.6=0.1104\\\alpha_2(3)&=\left[\sum_{i=1}^3\alpha_1(i)a_{i3}\right]b_3(o_2)=0.202\times0.3=0.0606\\\alpha_3(1)&=\left[\sum_{i=1}^3\alpha_2(i)a_{i1}\right]b_1(o_3)=0.04187\\\alpha_3(2)&=\left[\sum_{i=1}^3\alpha_2(i)a_{i2}\right]b_2(o_3)=0.03551\\\alpha_3(3)&=\left[\sum_{i=1}^3\alpha_2(i)a_{i3}\right]b_3(o_3)=0.05284\end{aligned}

For instance $b_1(o_3)$: $o_3$ is Red, the first column, so $b_1(o_3)$ is the element in the first row and first column of $B$.

3) Termination

P(O|\lambda)=\sum_{i=1}^3\alpha_3(i)=0.13022

Recursing to $T=3$ and summing the forward probabilities gives $P(O|\lambda)$.

backward algorithm

Backward probability: given the hidden Markov model $\lambda=(A,B,\pi)$, the backward probability $\beta_t(i)$ is the probability that, given the state at time $t$ is $q_i$, the partial observation sequence from $t+1$ to $T$ is $o_{t+1},o_{t+2},\cdots,o_T$; it is denoted

\beta_t(i)=P(o_{t+1},o_{t+2},\cdots,o_T|i_t=q_i,\lambda)

Input: hidden Markov model $\lambda=(A,B,\pi)$ and observation sequence $O=(o_1,o_2,\cdots,o_T)$

Output: the probability $P(O|\lambda)$ of the observation sequence $O$

1) Initialization: $\beta_T(i)=1,\quad i=1,2,\cdots,N$

2) Recursion: for $t=T-1,T-2,\cdots,1$,

\beta_t(i)=\sum_{j=1}^Na_{ij}b_j(o_{t+1})\beta_{t+1}(j),\quad i=1,2,\cdots,N

3) Termination 

P(O|\lambda)=\sum_{i=1}^N\pi_ib_i(o_1)\beta_1(i)

The computation is of order $O(TN^2)$.
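The backward recursion can be sketched the same way as the forward one. A minimal sketch, again assuming observations are encoded as column indices of $B$ and using the model from the example below (0 = Red, 1 = White); `backward_prob` is a name chosen here:

```python
def backward_prob(A, B, pi, obs):
    """Backward algorithm, O(T*N^2); obs is a list of observation column indices."""
    N = len(pi)
    # 1) initialization: beta_T(i) = 1
    beta = [1.0] * N
    # 2) recursion: beta_t(i) = sum_j a_{ij} * b_j(o_{t+1}) * beta_{t+1}(j)
    #    walk the observations o_T, o_{T-1}, ..., o_2 from the end
    for o in reversed(obs[1:]):
        beta = [sum(A[i][j] * B[j][o] * beta[j] for j in range(N))
                for i in range(N)]
    # 3) termination: P(O|lambda) = sum_i pi_i * b_i(o_1) * beta_1(i)
    return sum(pi[i] * B[i][obs[0]] * beta[i] for i in range(N))

A  = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B  = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
print(backward_prob(A, B, pi, [0, 1, 0, 1]))  # ≈ 0.0600908, matching the T=4 example below
```

Note that unlike the forward pass, the first observation $o_1$ enters only at the termination step, weighted by $\pi_i b_i(o_1)$.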

Example: box-and-ball model $\lambda=(A,B,\pi)$,

A=\left[\begin{array}{ccc}0.5&0.2&0.3\\0.3&0.5&0.2\\0.2&0.3&0.5\end{array}\right],\quad B=\left[\begin{array}{cc}0.5&0.5\\0.4&0.6\\0.7&0.3\end{array}\right],\quad\pi=(0.2,0.4,0.4)^\mathrm{T}

$T=4$, $O=(\text{Red},\text{White},\text{Red},\text{White})$; use the backward algorithm to find $P(O|\lambda)$

answer:

1) Initial value 

\beta_4(i)=1,\quad i=1,2,3


The recursion works downward from $T=4$; the initial value of the backward probability is set to 1.

2) Recursion

Recurse until $\beta_1$ is reached.

\begin{aligned}\beta_3(1)&=\sum_{j=1}^3a_{1j}b_j(o_4)\beta_4(j)=0.25+0.12+0.09=0.46\\\beta_3(2)&=\sum_{j=1}^3a_{2j}b_j(o_4)\beta_4(j)=0.15+0.3+0.06=0.51\\\beta_3(3)&=\sum_{j=1}^3a_{3j}b_j(o_4)\beta_4(j)=0.1+0.18+0.15=0.43\end{aligned}

\begin{aligned}\beta_2(1)&=\sum_{j=1}^3a_{1j}b_j(o_3)\beta_3(j)=0.25\times0.46+0.08\times0.51+0.21\times0.43=0.2461\\\beta_2(2)&=\sum_{j=1}^3a_{2j}b_j(o_3)\beta_3(j)=0.15\times0.46+0.2\times0.51+0.14\times0.43=0.2312\\\beta_2(3)&=\sum_{j=1}^3a_{3j}b_j(o_3)\beta_3(j)=0.1\times0.46+0.12\times0.51+0.35\times0.43=0.2577\end{aligned}

\begin{aligned}\beta_1(1)&=\sum_{j=1}^3a_{1j}b_j(o_2)\beta_2(j)=0.25\times0.2461+0.12\times0.2312+0.09\times0.2577=0.112462\\\beta_1(2)&=\sum_{j=1}^3a_{2j}b_j(o_2)\beta_2(j)=0.15\times0.2461+0.3\times0.2312+0.06\times0.2577=0.121737\\\beta_1(3)&=\sum_{j=1}^3a_{3j}b_j(o_2)\beta_2(j)=0.1\times0.2461+0.18\times0.2312+0.15\times0.2577=0.104881\end{aligned}


3) Termination

P(O|\lambda)=\sum_{i=1}^3\pi_ib_i(o_1)\beta_1(i)=0.0600908
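As a sanity check, running the forward recursion on the same $T=4$ sequence should give the same probability as the backward result above, since both compute $P(O|\lambda)$. A self-contained sketch (same 0 = Red, 1 = White encoding as before):

```python
A  = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B  = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
obs = [0, 1, 0, 1]  # Red, White, Red, White
N = len(pi)

# forward pass: alpha_1(i) = pi_i b_i(o_1), then recurse forward over o_2..o_T
alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
for o in obs[1:]:
    alpha = [sum(alpha[j] * A[j][i] for j in range(N)) * B[i][o] for i in range(N)]
p_forward = sum(alpha)

# backward pass: beta_T(i) = 1, then recurse backward over o_T..o_2
beta = [1.0] * N
for o in reversed(obs[1:]):
    beta = [sum(A[i][j] * B[j][o] * beta[j] for j in range(N)) for i in range(N)]
p_backward = sum(pi[i] * B[i][obs[0]] * beta[i] for i in range(N))

print(p_forward, p_backward)  # both ≈ 0.0600908
```

Agreement of the two values is a useful debugging check when implementing either algorithm.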


Origin blog.csdn.net/weixin_73404807/article/details/134555627