Introduction to the EM Algorithm and the HMM, with Detailed Explanations of the Mathematical Formulas

1. EM Algorithm (Expectation-Maximization Algorithm)

The EM algorithm is an iterative method for parameter estimation in probabilistic models with hidden variables. Its basic idea is the following: in a probability model containing hidden variables, if the model parameters are known, then the posterior distribution of the hidden variables is easy to compute; conversely, if the values of the hidden variables are known, then estimating the model parameters is easy, since it reduces to ordinary maximum likelihood. The EM algorithm therefore estimates the parameters by iterating between two steps: the E step (Expectation) and the M step (Maximization).
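
One standard property of this iteration is worth stating explicitly (it is a known fact about EM, not something spelled out in the formulas below): writing $X$ for the observed data and $\theta^{(t)}$ for the parameter estimate after $t$ iterations, each iteration can never decrease the observed-data log-likelihood,

$$
\log P(X|\theta^{(t+1)}) \ge \log P(X|\theta^{(t)}),
$$

so the estimates converge to a stationary point of the likelihood, in practice usually a local maximum.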

2. HMM (Hidden Markov Model)

The HMM is a probabilistic model for time-series data. It consists of a hidden Markov chain over states together with an observation associated with each state. The observation sequence is visible, while the state sequence is not. The basic assumptions of the HMM are that the state at the current moment depends only on the state at the previous moment, and that each observation depends only on the state at the same moment, independently of all other states and observations. These assumptions make the HMM suitable for modeling many kinds of time-series data, in fields such as speech recognition, natural language processing, and bioinformatics.
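
To illustrate these two assumptions, here is a minimal sampling sketch for a small HMM; the two-state, three-symbol parameters are made-up numbers for illustration. The next state is drawn using only the current state, and each observation is drawn using only the state at that moment.

```python
import numpy as np

# Illustrative 2-state, 3-symbol HMM; the numbers are made up for this sketch.
A = np.array([[0.7, 0.3],       # state-transition probabilities a_ij
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],  # emission probabilities b_j(k)
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])       # initial state distribution

def sample(T, rng=np.random.default_rng(0)):
    """Draw (states, observations): states are hidden, observations visible."""
    states, obs = [], []
    s = rng.choice(2, p=pi)                # initial state drawn from pi
    for _ in range(T):
        obs.append(rng.choice(3, p=B[s]))  # emission depends only on s
        states.append(s)
        s = rng.choice(2, p=A[s])          # next state depends only on s
    return states, obs

print(sample(10))
```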

3. Mathematical formulas of the EM algorithm and the HMM:

EM algorithm:

E step: compute the posterior distribution of the hidden variables under the current parameters, and use it to form the Q function, the expected complete-data log-likelihood

$$
Q(\theta,\theta^{(t)}) = E_{Z|X,\theta^{(t)}}\big[\log P(X,Z|\theta)\big]
$$

M step: maximize the Q function

$$
\theta^{(t+1)} = \arg\max_{\theta} Q(\theta,\theta^{(t)})
$$
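
To make the two steps concrete, here is a minimal sketch of EM for a one-dimensional two-component Gaussian mixture, where the hidden variable $Z$ is the component label of each point. Everything in it (the function name `em_gmm`, the synthetic data, the crude initialization) is an illustrative assumption, not something from the text above; for this particular model the M-step maximizer of $Q$ happens to have a closed form.

```python
import numpy as np

def em_gmm(x, n_iter=50):
    """Minimal EM for a 1-D two-component Gaussian mixture."""
    # Crude initial parameters theta^(0): weights, means, variances.
    w = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])

    for _ in range(n_iter):
        # E step: responsibilities r = posterior P(Z | X, theta^(t)).
        dens = (np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
                / np.sqrt(2 * np.pi * var))
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)   # normalize each row

        # M step: closed-form maximizer of the Q function.
        nk = r.sum(axis=0)                  # effective counts per component
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])
print(em_gmm(x))
```

Run on data drawn from two well-separated Gaussians, this should recover weights, means, and variances close to the generating values.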

HMM:

State transition probability matrix

$$
A = [a_{ij}]_{N\times N}
$$

Here $a_{ij}$ is the probability of transitioning from state $i$ to state $j$, and $N$ is the number of states.

Observation probability matrix

$$
B = [b_j(k)]_{N\times M}
$$

Here $b_j(k)$ is the probability of observing symbol $k$ in state $j$, and $M$ is the number of distinct observation symbols.

Initial state probability vector

$$
\pi = [\pi_i]_{1\times N}
$$

Here $\pi_i$ is the probability of being in state $i$ at the initial moment.
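
These three groups of parameters are conventionally collected as $\lambda = (A, B, \pi)$. Since every row of $A$, every row of $B$, and the vector $\pi$ is a probability distribution, each must sum to one:

$$
\sum_{j=1}^{N} a_{ij} = 1, \qquad \sum_{k=1}^{M} b_j(k) = 1, \qquad \sum_{i=1}^{N} \pi_i = 1.
$$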

Forward probability

$$
\alpha_t(i) = P(O_1,O_2,\cdots,O_t,q_t=S_i|\lambda)
$$

Here $O_1, O_2, \cdots, O_t$ are the first $t$ observations, and $S_i$ is state $i$.
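
The definition becomes computable through the standard forward recursion, stated here because the text only defines $\alpha_t(i)$ and not how to obtain it:

$$
\alpha_1(i) = \pi_i\, b_i(O_1), \qquad \alpha_{t+1}(j) = \Big[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\Big]\, b_j(O_{t+1}).
$$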

Backward probability

$$
\beta_t(i) = P(O_{t+1},O_{t+2},\cdots,O_T|q_t=S_i,\lambda)
$$

Here $O_{t+1}, O_{t+2}, \cdots, O_T$ are the last $T-t$ observations, and $S_i$ is state $i$.
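
Likewise, the backward probabilities satisfy the standard backward recursion:

$$
\beta_T(i) = 1, \qquad \beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j).
$$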

Observation sequence probability

$$
P(O|\lambda) = \sum_{i=1}^N \alpha_T(i) = \sum_{i=1}^N \sum_{j=1}^N \alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j), \qquad 1 \le t \le T-1
$$

Here $O$ is the observation sequence, the double sum holds for any cut point $t$, and $\lambda = (A, B, \pi)$ collects the parameters of the HMM.
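
As a sanity check on the formula above, here is a minimal sketch that reuses the made-up two-state parameters from the sampling example. It computes $P(O|\lambda)$ both as $\sum_i \alpha_T(i)$ and by cutting the sequence at each $t$; all the printed values should agree.

```python
import numpy as np

# Same illustrative 2-state, 3-symbol parameters as in the earlier sketch.
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
O = [0, 2, 1, 0]                 # an assumed observation sequence
T, N = len(O), len(pi)

# Forward pass: alpha[t, i] = P(O_1..O_{t+1}, q_{t+1} = S_i | lambda).
alpha = np.zeros((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]

# Backward pass: beta[t, i] = P(O_{t+2}..O_T | q_{t+1} = S_i, lambda).
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])

# P(O|lambda) from alpha alone, then from the alpha/beta cut at every t.
print(alpha[-1].sum())           # sum_i alpha_T(i)
for t in range(T - 1):
    print(alpha[t] @ A @ np.diag(B[:, O[t + 1]]) @ beta[t + 1])
```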
