Probability theory and basics of probabilistic graphical models

1. Basics of Probability Theory

The difference between posterior probability and conditional probability:
A posterior probability is a kind of conditional probability. What distinguishes it from other conditional probabilities is that its target event is restricted to the value of a hidden variable (broadly speaking, a hidden variable is something that cannot be observed directly but affects the state of the system and its observable output), while its condition is the observed result. In a general conditional probability, the condition and the target event can be arbitrary and need not be related to each other.

Bayes' formula computes a posterior probability from a prior probability.
For example, to distinguish an ordinary conditional probability from a posterior probability:
1) If, before going out, we hear on the news that there was a traffic accident on the road today and we want to compute the probability of a traffic jam, that is a conditional probability: P(traffic jam | traffic accident). This reasons from cause to effect.
2) If we have already gone out and run into a traffic jam, and we want to compute the probability that the jam was caused by a traffic accident, that is called the posterior probability (it is still a conditional probability, but it is customarily given this name): P(traffic accident | traffic jam). This reasons from effect back to cause.
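
As a minimal numeric sketch of the two directions above (all probabilities are made up, not taken from the text), Bayes' formula recovers P(traffic accident | traffic jam) from the prior P(traffic accident) and the conditional probability P(traffic jam | traffic accident):

```python
# A minimal sketch of Bayes' formula for the traffic example; all numbers are made up.
p_accident = 0.05             # prior P(traffic accident)
p_jam_given_accident = 0.80   # conditional probability P(traffic jam | traffic accident)
p_jam_given_no_accident = 0.20

# Total probability: P(jam) = P(jam|accident)P(accident) + P(jam|no accident)P(no accident)
p_jam = (p_jam_given_accident * p_accident
         + p_jam_given_no_accident * (1 - p_accident))

# Bayes' formula: P(accident | jam) = P(jam | accident) * P(accident) / P(jam)
p_accident_given_jam = p_jam_given_accident * p_accident / p_jam
print(f"P(traffic accident | traffic jam) = {p_accident_given_jam:.3f}")
```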

Insert image description here
The difference between random events and random variables (A, B, C denote events; X, Y, Z denote random variables):
Random event: tossing a coin and the result is heads.
Random variable: the outcome of a random phenomenon represented by a variable. For example, the coin landing heads is represented by X = 0 and landing tails by X = 1.
Insert image description here

Sample space: a random experiment is an experiment on, or observation of, a random phenomenon. Each possible outcome of a random experiment is called a sample point; the sample space is the set of all sample points.
Insert image description here

The probabilities over the entire sample space sum to 1; that is, the probabilities of all possible values of a random variable sum to 1.
Insert image description here
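
As a small sketch of this normalization requirement, using a fair six-sided die as the random experiment:

```python
# A small sketch: the sample space of a fair six-sided die, with probabilities summing to 1.
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]                       # all sample points
prob = {outcome: Fraction(1, 6) for outcome in sample_space}

# The probabilities of all possible values of the random variable sum to 1.
assert sum(prob.values()) == 1
```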

2. Three major formulas of conditional probability

The condition in a conditional probability represents an observed variable, where an observed variable is one whose value has already been determined. The symbols A and B in the conditional probability formulas all denote events.

0. Definition of conditional probability:
Insert image description here

1. Multiplication formula

Insert image description here
2. Total probability formula
Insert image description here

3. Bayes' formula

Insert image description here
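
As a sketch tying the three formulas together (the joint table over two binary events A and B below is made up), the definition of conditional probability, the multiplication formula, the total probability formula, and Bayes' formula can all be checked numerically:

```python
# A sketch with a made-up joint distribution over two binary events A and B.
# p_joint[(a, b)] = P(A=a, B=b), where 1 means the event occurred.
p_joint = {(1, 1): 0.12, (1, 0): 0.18, (0, 1): 0.28, (0, 0): 0.42}

# Definition of conditional probability: P(A|B) = P(AB) / P(B)
p_B = p_joint[(1, 1)] + p_joint[(0, 1)]
p_A_given_B = p_joint[(1, 1)] / p_B

# Multiplication formula: P(AB) = P(A|B) * P(B)
assert abs(p_joint[(1, 1)] - p_A_given_B * p_B) < 1e-12

# Total probability formula: P(A) = P(A|B)P(B) + P(A|not B)P(not B)
p_notB = 1 - p_B
p_A_given_notB = p_joint[(1, 0)] / p_notB
p_A = p_A_given_B * p_B + p_A_given_notB * p_notB

# Bayes' formula: P(B|A) = P(A|B) * P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A
print(f"P(A|B) = {p_A_given_B:.3f}, P(A) = {p_A:.3f}, P(B|A) = {p_B_given_A:.3f}")
```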

3. Independence

1. Independence
Insert image description here

2. Conditional independence, the most important concept in the representation of probabilistic graphical models:
Insert image description here

Insert image description here

The figure below shows that, conditioned on C, the two variables A and B are conditionally independent; that is, once we know that the event of staying in bed has occurred, whether the person stayed up late no longer affects whether he is late. From this example, using conditional independence together with the chain rule for joint probability distributions, we can derive the factored form of the joint probability that appears in the definition of a Bayesian network.
Insert image description here
If any of the conditions in the figure below holds, then the variables X and Y are conditionally independent given Z (the second equation means that the probability of X given both Y and Z equals the probability of X given Z alone):
Insert image description here
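
As a sketch of the stay-up-late / late / stay-in-bed example (all conditional probability tables below are made up), conditional independence of A and B given C means that the joint conditional P(A, B | C) factors as P(A|C) P(B|C), which can be checked assignment by assignment:

```python
# A sketch checking conditional independence of A and B given C, with made-up numbers.
# A = stayed up late, B = is late, C = stayed in bed; 1 means the event occurred.
import itertools

p_C = {1: 0.3, 0: 0.7}                                      # P(C)
p_A_given_C = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.4, 0: 0.6}}    # P(A|C), keyed by c then a
p_B_given_C = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}    # P(B|C), keyed by c then b

# Build the joint P(A, B, C) from the factorization P(C) P(A|C) P(B|C).
p_joint = {(a, b, c): p_C[c] * p_A_given_C[c][a] * p_B_given_C[c][b]
           for a, b, c in itertools.product([0, 1], repeat=3)}

# Check that P(A, B | C) == P(A|C) P(B|C) for every assignment.
for a, b, c in p_joint:
    p_ab_given_c = p_joint[(a, b, c)] / p_C[c]
    assert abs(p_ab_given_c - p_A_given_C[c][a] * p_B_given_C[c][b]) < 1e-12
print("A and B are conditionally independent given C")
```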

4. Three concepts commonly used in probabilistic graphical models, corresponding to the three inference tasks of probabilistic graphical models

Insert image description here
1. Joint probability distribution: when the system or object (X) we are interested in contains N variables (that is, N random variables), the joint probability distribution gives, for every combination of values these variables can take, the corresponding probability value:
Insert image description here
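
As a sketch (with made-up numbers), a joint probability distribution over N binary variables can be stored explicitly as a table with one probability per joint assignment, i.e. 2^N entries:

```python
# A sketch: a joint probability distribution over N = 3 binary variables X1, X2, X3,
# stored as one probability per joint assignment (2**N entries in total).
import itertools

assignments = list(itertools.product([0, 1], repeat=3))
weights = [1, 2, 3, 4, 4, 3, 2, 1]                    # made-up weights, then normalized
p_joint = {a: w / sum(weights) for a, w in zip(assignments, weights)}

print(len(p_joint))            # 8 = 2**3 joint assignments
print(sum(p_joint.values()))   # 1.0 (up to floating-point rounding)
```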

2. Marginal probability: the marginal probability considers only some of the variables in a joint probability distribution; to obtain it, the variables we are not interested in must be summed out (marginalized). The formula marked by the red line in the figure below sums over the two variables D and G to marginalize them out; see the formula below for details:
Insert image description here
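
As a sketch of marginalization (with a made-up joint table, standing in for the distribution in the figure), computing the marginal of one variable means summing the joint over every value of the variables we are not interested in:

```python
# A sketch of marginalization: sum out the variables we are not interested in.
# Made-up joint P(X, D, G) over three binary variables; D and G are marginalized out.
import itertools

assignments = list(itertools.product([0, 1], repeat=3))
weights = [3, 1, 2, 2, 1, 3, 4, 4]                    # made-up weights, then normalized
p_joint = {a: w / sum(weights) for a, w in zip(assignments, weights)}

# Marginal P(X): sum the joint over every value of D and G.
p_X = {0: 0.0, 1: 0.0}
for (x, d, g), p in p_joint.items():
    p_X[x] += p

print(p_X)                     # the marginal distribution of X
print(sum(p_X.values()))       # 1.0 (up to floating-point rounding)
```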

3. Maximum posterior probability state: the assignment of values to the variables that maximizes the value of the joint probability distribution.
Insert image description here
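
As a sketch (again with a made-up joint table), the maximum posterior probability state is simply the joint assignment with the largest probability, here found by brute force:

```python
# A sketch: the maximum posterior probability state of a made-up joint table,
# found by brute force over all joint assignments.
import itertools

assignments = list(itertools.product([0, 1], repeat=3))
weights = [1, 5, 2, 2, 3, 1, 4, 2]                    # made-up weights, then normalized
p_joint = {a: w / sum(weights) for a, w in zip(assignments, weights)}

map_state = max(p_joint, key=p_joint.get)
print(map_state, p_joint[map_state])   # the assignment that maximizes the joint probability
```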

5. Flow of probabilistic influence

Observed variable: a variable whose value has been determined.
Hidden variable: a variable whose value is unknown.
Insert image description here

6. Representation, inference, and learning of probabilistic graphical models

1. Representation of probabilistic graphical models: graphical models are an effective tool for modeling problems involving uncertainty. The problem to be solved is how to define a joint probability distribution on a graph, how to express that joint distribution as a product of local factors, and why it can be expressed in this form.
2. Inference in probabilistic graphical models: inference is carried out on top of the representation, that is, the structure and parameters of the graphical model are already given, and we need to perform certain inference computations on the model. Computing a marginal probability means that, knowing the joint probability distribution, we compute the marginal probability of some of the variables. Computing the maximum posterior probability state means that, knowing the joint probability distribution, we find the assignment x that maximizes the value of that joint distribution.
3. Learning of probabilistic graphical models (only for complex probabilistic models that lack expert knowledge do we need to learn the structure and parameters of the model): the problem to be solved is, given training data, to learn from the data the structure of the graphical model (the structure determines which nodes the model has and which nodes are connected by edges) and its parameters (the weights of the edges in the graphical model). D in the figure below denotes a set of M training samples, each of which is an N-dimensional vector.

Insert image description here

The figure below shows the number of parameters of a Bayesian network:
Insert image description here
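
As a sketch (the network structure A -> C, B -> C, C -> D below is hypothetical, not necessarily the one in the figure), the number of free parameters of a Bayesian network over discrete variables can be counted node by node as (number of values - 1) times the number of parent configurations:

```python
# A sketch: counting the free parameters of a Bayesian network over binary variables.
# The structure is hypothetical: A -> C, B -> C, C -> D.
parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
n_values = {v: 2 for v in parents}          # every variable is binary here

def n_parameters(parents, n_values):
    """Each node needs (k - 1) free probabilities per configuration of its parents."""
    total = 0
    for node, pa in parents.items():
        parent_configs = 1
        for p in pa:
            parent_configs *= n_values[p]
        total += (n_values[node] - 1) * parent_configs
    return total

print(n_parameters(parents, n_values))      # 1 + 1 + 4 + 2 = 8 free parameters
# A full joint table over the same four binary variables would need 2**4 - 1 = 15.
```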
