Machine Learning Day 19: Probabilistic Graphical Models

Probabilistic Graphical Model

A probabilistic graphical model (PGM) is well suited to uncovering latent information in data.

The nodes in a probabilistic graphical model are divided into hidden nodes and observed nodes, and the edges are divided into directed edges and undirected edges. From the perspective of probability theory, nodes correspond to random variables, and edges correspond to dependence or correlation between random variables: a directed edge represents a one-way dependence, and an undirected edge represents a mutual dependence.

Probabilistic graphical models fall into two categories: Bayesian networks and Markov networks. A Bayesian network is represented by a directed graph, and a Markov network is represented by an undirected graph.

Probabilistic graphical models include the naive Bayes model, the maximum entropy model, hidden Markov models, conditional random fields, topic models, and so on.

Joint probability distribution of a Bayesian network

[Figure: Bayesian network (left) and Markov network (right), both over the nodes A, B, C, D]


It can be seen from the figure that B and C are conditionally independent given A. By the definition of conditional probability:

$$P(C \mid A, B) = \frac{P(B, C \mid A)}{P(B \mid A)} = \frac{P(B \mid A)\,P(C \mid A)}{P(B \mid A)} = P(C \mid A)$$


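In the same way, A and D are conditionally independent given B and C, so we get:

$$P(D \mid A, B, C) = \frac{P(A, D \mid B, C)}{P(A \mid B, C)} = \frac{P(A \mid B, C)\,P(D \mid B, C)}{P(A \mid B, C)} = P(D \mid B, C)$$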


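Applying the chain rule and substituting the two conditional independence results $P(C \mid A, B) = P(C \mid A)$ and $P(D \mid A, B, C) = P(D \mid B, C)$ gives the joint probability:

$$P(A, B, C, D) = P(A)\,P(B \mid A)\,P(C \mid A, B)\,P(D \mid A, B, C) = P(A)\,P(B \mid A)\,P(C \mid A)\,P(D \mid B, C)$$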
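The factorization can be checked numerically. The following is a minimal sketch, assuming binary variables and a Bayesian network with edges A→B, A→C, B→D, C→D. The conditional probability tables and the helper names (`joint`, `prob`, `c_given_ab`, `d_given_abc`) are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical conditional probability tables for binary variables.
# The numbers are illustrative only.
P_A = {0: 0.6, 1: 0.4}
P_B_GIVEN_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # indexed [a][b]
P_C_GIVEN_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # indexed [a][c]
P_D_GIVEN_BC = {                                           # indexed [(b, c)][d]
    (0, 0): {0: 0.8, 1: 0.2},
    (0, 1): {0: 0.4, 1: 0.6},
    (1, 0): {0: 0.3, 1: 0.7},
    (1, 1): {0: 0.1, 1: 0.9},
}


def joint(a, b, c, d):
    """P(A,B,C,D) = P(A) P(B|A) P(C|A) P(D|B,C)."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_A[a][c] * P_D_GIVEN_BC[(b, c)][d]


def prob(**fixed):
    """Probability of a partial assignment, obtained by summing the joint."""
    total = 0.0
    for a, b, c, d in product((0, 1), repeat=4):
        values = {"a": a, "b": b, "c": c, "d": d}
        if all(values[name] == value for name, value in fixed.items()):
            total += joint(a, b, c, d)
    return total


def c_given_ab(c, a, b):
    """P(C=c | A=a, B=b), computed from the joint."""
    return prob(a=a, b=b, c=c) / prob(a=a, b=b)


def d_given_abc(d, a, b, c):
    """P(D=d | A=a, B=b, C=c), computed from the joint."""
    return prob(a=a, b=b, c=c, d=d) / prob(a=a, b=b, c=c)
```

For every assignment, `c_given_ab(c, a, b)` should match `P_C_GIVEN_A[a][c]` regardless of `b`, and `d_given_abc(d, a, b, c)` should match `P_D_GIVEN_BC[(b, c)][d]` regardless of `a`, which is what the two conditional independence statements say.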


Joint probability distribution of a Markov network

In a Markov network, the joint probability distribution is defined as follows:

$$P(x) = \frac{1}{Z} \prod_{Q \in C} \varphi_Q(x_Q)$$

where C is the set of maximal cliques in the graph, $Z = \sum_{x} \prod_{Q \in C} \varphi_Q(x_Q)$ is the normalization factor that ensures P(x) is a properly defined probability, and $\varphi_Q$ is the potential function corresponding to clique Q. A potential function is non-negative and should take larger values on variable assignments with higher probability. The exponential function is one example:

$$\varphi_Q(x_Q) = e^{-H_Q(x_Q)}$$

where $H_Q(x_Q)$ is a real-valued function of the variables in clique Q.

Consider a subset of all the nodes $x = \{x_1, x_2, \dots, x_n\}$ in the graph. If every two nodes in the subset are connected by an edge, the nodes of the subset form a clique. If adding any other node to the subset means it no longer forms a clique, the subset is called a maximal clique.


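In the Markov network in the figure, (A,B), (A,C), (B,D) and (C,D) each form a clique, and each is also a maximal clique. The joint probability can therefore be expressed as

$$P(A, B, C, D) = \frac{1}{Z}\,\varphi_1(A, B)\,\varphi_2(A, C)\,\varphi_3(B, D)\,\varphi_4(C, D)$$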
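The statement about maximal cliques can be checked by brute force. This is a rough sketch, assuming the undirected graph has exactly the edges A-B, A-C, B-D and C-D. The names `NODES`, `EDGES`, `is_clique` and `maximal_cliques` are hypothetical helpers, and enumerating all subsets is only practical for very small graphs.

```python
from itertools import combinations

# Assumed structure of the Markov network: four nodes, four undirected edges.
NODES = ["A", "B", "C", "D"]
EDGES = {frozenset(edge) for edge in [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]}


def is_clique(subset):
    """A subset is a clique if every pair of its nodes is joined by an edge."""
    return all(frozenset(pair) in EDGES for pair in combinations(subset, 2))


def maximal_cliques(nodes):
    """Enumerate all cliques, then keep those not contained in a larger clique."""
    cliques = [
        set(subset)
        for size in range(1, len(nodes) + 1)
        for subset in combinations(nodes, size)
        if is_clique(subset)
    ]
    # `c < other` is the proper-subset test on Python sets.
    return [c for c in cliques if not any(c < other for other in cliques)]
```

Single nodes are also cliques under this definition, but each of them is contained in an edge, so only the four edges remain after filtering. No triple qualifies because the graph contains no triangle.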


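If the exponential function $\varphi_Q(x_Q) = e^{-H_Q(x_Q)}$ is used as the potential function for each of the four maximal cliques, the exponents add up:

$$H(A, B, C, D) = H_1(A, B) + H_2(A, C) + H_3(B, D) + H_4(C, D)$$

and we get

$$P(A, B, C, D) = \frac{1}{Z}\, e^{-H(A, B, C, D)}$$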
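To make the normalization factor concrete, here is a minimal sketch that computes Z by enumerating all assignments of four binary variables. The pairwise energy form in `energy` and the `ALPHA` and `BETA` values are assumptions made for illustration and are not taken from the text above. Any real-valued function could play the role of the exponent.

```python
import math
from itertools import product

# Maximal cliques of the assumed Markov network.
CLIQUES = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]

# Hypothetical parameters, chosen only for illustration.
ALPHA = {("A", "B"): -1.0, ("A", "C"): 0.5, ("B", "D"): -0.3, ("C", "D"): 0.8}
BETA = {"A": 0.1, "B": -0.2, "C": 0.0, "D": 0.4}


def energy(clique, x):
    """Assumed energy of one clique: a pairwise term plus a term per node."""
    u, v = clique
    return ALPHA[clique] * x[u] * x[v] + BETA[u] * x[u] + BETA[v] * x[v]


def potential(clique, x):
    """Exponential potential: the exponential of the negated clique energy."""
    return math.exp(-energy(clique, x))


def unnormalized(x):
    """Product of the potentials of all maximal cliques."""
    return math.prod(potential(clique, x) for clique in CLIQUES)


def all_assignments():
    """Yield every assignment of 0/1 values to A, B, C, D."""
    for values in product((0, 1), repeat=4):
        yield dict(zip("ABCD", values))


# Normalization factor: sum of the unnormalized product over all assignments.
Z = sum(unnormalized(x) for x in all_assignments())


def joint(x):
    """P(A,B,C,D) as the normalized product of clique potentials."""
    return unnormalized(x) / Z
```

Because the potentials are exponentials, `unnormalized(x)` equals `math.exp(-total)` where `total` is the sum of the four clique energies, which mirrors the single-exponential form of the joint. Exhaustive enumeration grows as 2^n, so this approach is only meant for toy examples.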

Origin blog.51cto.com/15069488/2578569