【Artificial Intelligence】Uncertainty, Prior/Posterior Probability, Probability Density, Bayes' Rule, Naive Bayes, Maximum Likelihood Estimation

Uncertainty

Uncertainty and Rational Decision Making

Basic Probability Notation

Prior Probability (Unconditional Probability) / Posterior Probability (Conditional Probability)

Random Variables

Probability Density

Joint Probability Distribution

Axioms

Full Joint Distribution

Probability Calculus

Independence

Bayes' Rule

Example 1

You have two envelopes to choose from. One envelope contains a red ball (worth $100) and one black ball; the other envelope contains two black balls (worth nothing).

You choose an envelope at random, then draw a ball from that envelope at random, and it turns out to be black.

At this point, you may choose to switch to the other envelope. The question is: should you switch or not?


Let $E$ denote the envelope: $E = 1$ is the envelope containing the red ball, $E = 2$ is the envelope containing only black balls, i.e. $1 = (R, B)$ and $2 = (B, B)$.

Let $B$ be the event of drawing a black ball.

Bayes' rule:

$$P(E \mid B) = \frac{P(B \mid E)\,P(E)}{P(B)}$$

We want to compare $P(E = 1 \mid B)$ with $P(E = 2 \mid B)$.

Probability of drawing a black ball from each envelope: $P(B \mid E = 1) = 0.5$ and $P(B \mid E = 2) = 1$.

The two envelopes are equally likely to be picked: $P(E = 1) = P(E = 2) = 0.5$.

The probability of drawing a black ball is the marginal of $B$ over the values of $E$:

$$P(B) = P(B \mid E = 1)\,P(E = 1) + P(B \mid E = 2)\,P(E = 2) = (0.5)(0.5) + (1)(0.5) = 0.75$$

Given that a black ball was drawn, the probability that this is the red-ball envelope:

$$P(E = 1 \mid B) = \frac{P(B \mid E = 1)\,P(E = 1)}{P(B)} = \frac{(0.5)(0.5)}{0.75} = \frac{1}{3}$$

Given that a black ball was drawn, the probability that this is the all-black envelope:

$$P(E = 2 \mid B) = \frac{P(B \mid E = 2)\,P(E = 2)}{P(B)} = \frac{(1)(0.5)}{0.75} = \frac{2}{3}$$

By calculation, after drawing a black ball the probability that we hold envelope 1 is 1/3 and the probability that we hold envelope 2 is 2/3. Switching therefore raises the chance of getting the red ball from 1/3 to 2/3, so we should switch.
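
These numbers are easy to check in code. Below is a minimal Python sketch (variable names are mine) that computes the posteriors exactly via Bayes' rule and then confirms the switching advantage with a Monte Carlo simulation:

```python
import random

# Exact posteriors via Bayes' rule
p_E1, p_E2 = 0.5, 0.5                   # prior: each envelope equally likely
p_B_given_E1, p_B_given_E2 = 0.5, 1.0   # P(black ball | envelope)

p_B = p_B_given_E1 * p_E1 + p_B_given_E2 * p_E2   # marginal P(B) = 0.75
print(p_B_given_E1 * p_E1 / p_B)   # P(E=1 | B) = 1/3
print(p_B_given_E2 * p_E2 / p_B)   # P(E=2 | B) = 2/3

# Monte Carlo check: among trials where a black ball is drawn,
# how often does switching win the red ball?
wins = trials = 0
for _ in range(100_000):
    envelope = random.choice([("R", "B"), ("B", "B")])
    ball = random.choice(envelope)
    if ball == "B":                       # condition on having drawn black
        trials += 1
        if envelope == ("B", "B"):        # switching wins iff we hold (B, B)
            wins += 1
print(wins / trials)   # ≈ 2/3
```

Conditioning on the observed black ball is the key step: among only those trials in which a black ball is drawn, switching wins about two thirds of the time.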


Example 2

A doctor administers a test that is 99% reliable: 99% of sick people test positive, and 99% of healthy people test negative. The doctor estimates that 1% of the entire population is sick.

For a patient who tests positive, what is the probability that they are actually sick?


We can use Bayes' theorem to calculate the conditional probability that the patient is sick. Let $S$ be the event that the patient is sick and $T$ the event of a positive test result. The probability we want is:

$$P(S \mid T) = \frac{P(T \mid S)\,P(S)}{P(T)}$$

where $P(T \mid S)$ is the probability of a positive test result given that the patient is sick, $P(S)$ is the prior probability of being sick, and $P(T)$ is the overall probability of a positive test result.

According to the data given in the problem:

$P(T \mid S) = 0.99$

$P(S) = 0.01$

$$P(T) = P(T \mid S)\,P(S) + P(T \mid \overline{S})\,P(\overline{S})$$

where $\overline{S}$ denotes the event that the patient is not sick.

According to the test's reliability, $P(T \mid \overline{S}) = 1 - P(\overline{T} \mid \overline{S}) = 1 - 0.99 = 0.01$, so

$$P(T) = P(T \mid S)\,P(S) + P(T \mid \overline{S})\,P(\overline{S}) = (0.99)(0.01) + (0.01)(0.99) = 0.0198$$

Substituting into Bayes' formula, the probability that the patient is sick:

$$P(S \mid T) = \frac{(0.99)(0.01)}{0.0198} \approx 0.50$$

Therefore, a patient who tests positive has only about a 50% chance of actually being sick.
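
As a quick sanity check, here is a small Python sketch that reproduces this result (the function name and its parameters are my own); the loop at the end shows how strongly the posterior depends on the prior, i.e. on the disease prevalence:

```python
def posterior_sick(prior, sensitivity, specificity):
    """Return P(sick | positive test) via Bayes' rule."""
    p_pos_and_sick = sensitivity * prior                  # P(T, S)
    p_pos_and_healthy = (1 - specificity) * (1 - prior)   # P(T, not-S)
    return p_pos_and_sick / (p_pos_and_sick + p_pos_and_healthy)

print(posterior_sick(prior=0.01, sensitivity=0.99, specificity=0.99))  # ≈ 0.50

# The same 99%-reliable test gives very different answers as prevalence varies:
for prior in (0.001, 0.01, 0.1):
    print(prior, round(posterior_sick(prior, 0.99, 0.99), 3))
```

For a rarer disease (0.1% prevalence), the same test leaves only about a 9% chance that a positive patient is actually sick, which is why the base rate matters so much when interpreting test results.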


Using Bayes' Rule: Combining Evidence

Naive Bayes

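Naive Bayes applies Bayes' rule under a "naive" assumption: the features are conditionally independent given the class, so $P(C \mid x_1, \dots, x_n) \propto P(C)\prod_i P(x_i \mid C)$. Below is a minimal sketch of such a classifier for categorical features (toy data, function names of my own choosing, and add-one smoothing for unseen values; not a reference implementation):

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (features, label) pairs with categorical features."""
    label_counts = Counter(label for _, label in examples)
    # feature_counts[label][i][value] = how often feature i took `value` in class `label`
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in examples:
        for i, value in enumerate(features):
            feature_counts[label][i][value] += 1
    return label_counts, feature_counts

def predict(features, label_counts, feature_counts):
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_label in label_counts.items():
        # log P(C) + sum_i log P(x_i | C), with add-one smoothing
        score = math.log(n_label / total)
        for i, value in enumerate(features):
            seen = feature_counts[label][i]
            score += math.log((seen[value] + 1) / (n_label + len(seen) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data: (outlook, temperature) -> play?
data = [(("sunny", "hot"), "no"), (("sunny", "cool"), "yes"),
        (("rainy", "cool"), "yes"), (("rainy", "hot"), "no")]
model = train(data)
print(predict(("sunny", "cool"), *model))   # -> "yes"
```

Working in log space avoids underflow when many per-feature probabilities are multiplied together, the same trick used for the log-likelihood in the next section.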

Maximum Likelihood Estimation

Maximum likelihood estimation (MLE) is a commonly used parameter estimation method: given known sample data, it estimates a model's parameters by choosing the parameter values that maximize the probability of the observed data.

Specifically, in maximum likelihood estimation, we assume that the sample data come from a certain probability distribution, but the parameters of the distribution are unknown. Our goal is to estimate these parameters from the sample data such that the distribution best explains the observed data.

Suppose we have a sample set $X = \{x_1, x_2, \dots, x_n\}$, where each sample is an observation from some distribution $f(x \mid \theta)$ whose parameter $\theta$ is unknown. We want to find the value of $\theta$ that maximizes the joint probability density of the sample set, $L(X \mid \theta)$. This joint probability density can be expressed as:

$$L(X \mid \theta) = \prod_{i=1}^n f(x_i \mid \theta)$$

Our goal is to find the value of $\theta$ that maximizes $L(X \mid \theta)$. The maximum likelihood estimate can therefore be written as:

$$\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} L(X \mid \theta)$$

To avoid numerical underflow in computation, we often take the logarithm of the likelihood, which gives:

$$\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} \log L(X \mid \theta) = \arg\max_{\theta} \sum_{i=1}^n \log f(x_i \mid \theta)$$
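
As a concrete example, the numpy sketch below (simulated data; the names are my own) fits a Gaussian by maximum likelihood, first using the closed-form estimates and then by maximizing the log-likelihood directly:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1_000)   # simulated observations

# For a Gaussian, the maximizer of the likelihood has a closed form:
mu_hat = data.mean()
sigma_hat = data.std()    # note: the MLE of the variance divides by n, not n - 1
print(mu_hat, sigma_hat)

# Equivalently, maximize sum_i log f(x_i | theta) directly
# (a coarse grid search over mu, just to illustrate the definition):
def log_likelihood(mu, sigma):
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (data - mu) ** 2 / (2 * sigma**2))

mus = np.linspace(1.0, 3.0, 401)
best_mu = max(mus, key=lambda m: log_likelihood(m, sigma_hat))
print(best_mu)   # agrees with mu_hat up to the grid resolution
```

The grid search and the closed form agree, which is the point of the definition: the closed-form estimates are exactly the values at which the (log-)likelihood is maximized.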

Maximum likelihood estimation is simple to compute and has a solid theoretical foundation, and it is widely used in statistics, machine learning, signal processing, and other fields.

Summary

The following is a compilation of important formulas in probability theory:


1. Conditional probability:

For events $A$ and $B$, the conditional probability $P(A \mid B)$ is the probability that $A$ occurs given that $B$ has occurred:

$$P(A \mid B) = \frac{P(A, B)}{P(B)}$$


2. Multiplication rule:

For events $A$ and $B$, the joint probability $P(A, B)$ is the probability that $A$ and $B$ occur together:

$$P(A, B) = P(A \mid B)\,P(B)$$


3. Chain rule:

For multiple events $A, B, C, D$, the joint probability $P(A, B, C, D)$ factorizes as:

$$P(A, B, C, D) = P(A \mid B, C, D)\,P(B \mid C, D)\,P(C \mid D)\,P(D)$$


4. Conditional chain rule:

For events $A$ and $B$, conditioning everything on a third event $C$:

$$P(A, B \mid C) = P(A \mid B, C)\,P(B \mid C)$$


5. Bayes' theorem:

Bayes' theorem computes a posterior probability from a prior probability and a conditional probability, and underlies classification, prediction, and other tasks:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$


6. Conditional Bayes' theorem:

For events $A$ and $B$, conditioned on a third event $C$:

$$P(A \mid B, C) = \frac{P(B \mid A, C)\,P(A \mid C)}{P(B \mid C)}$$


7. Addition rule / total probability:

For events $A$ and $B$:

$$P(A) = P(A, B) + P(A, \neg B) = P(A \mid B)\,P(B) + P(A \mid \neg B)\,P(\neg B)$$


These formulas are central to probability theory and apply to problems in statistics, machine learning, signal processing, finance, medicine, and other fields. Proficiency with them helps us better understand and solve practical problems.
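
All of these identities can be verified numerically. The sketch below (helper names are my own) builds a random joint distribution over three binary events and checks formulas 2 through 7; formula 1 is the definition that the `Pcond` helper implements:

```python
import itertools
import random

rng = random.Random(0)
outcomes = list(itertools.product([0, 1], repeat=3))   # all (A, B, C) assignments
weights = [rng.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}   # random joint P(A, B, C)

IDX = {"A": 0, "B": 1, "C": 2}

def P(**fixed):
    """Marginal probability of the named events, e.g. P(A=1, B=0)."""
    return sum(p for o, p in joint.items()
               if all(o[IDX[k]] == v for k, v in fixed.items()))

def Pcond(target, given):
    """P(target | given), each a dict like {"A": 1} -- this is formula 1."""
    return P(**target, **given) / P(**given)

a, b, c, not_b = {"A": 1}, {"B": 1}, {"C": 1}, {"B": 0}

# 2. Multiplication rule: P(A, B) = P(A | B) P(B)
assert abs(P(**a, **b) - Pcond(a, b) * P(**b)) < 1e-12
# 3. Chain rule (three-event form): P(A, B, C) = P(A | B, C) P(B | C) P(C)
assert abs(P(**a, **b, **c) - Pcond(a, {**b, **c}) * Pcond(b, c) * P(**c)) < 1e-12
# 4. Conditional chain rule: P(A, B | C) = P(A | B, C) P(B | C)
assert abs(Pcond({**a, **b}, c) - Pcond(a, {**b, **c}) * Pcond(b, c)) < 1e-12
# 5. Bayes' theorem: P(A | B) = P(B | A) P(A) / P(B)
assert abs(Pcond(a, b) - Pcond(b, a) * P(**a) / P(**b)) < 1e-12
# 6. Conditional Bayes: P(A | B, C) = P(B | A, C) P(A | C) / P(B | C)
assert abs(Pcond(a, {**b, **c}) - Pcond(b, {**a, **c}) * Pcond(a, c) / Pcond(b, c)) < 1e-12
# 7. Total probability: P(A) = P(A | B) P(B) + P(A | not-B) P(not-B)
assert abs(P(**a) - (Pcond(a, b) * P(**b) + Pcond(a, not_b) * P(**not_b))) < 1e-12
print("all identities verified")
```

Because every identity follows from the definition of conditional probability, the checks pass for any joint distribution with non-zero marginals, not just this randomly generated one.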
