Statistics (3) Expectation
This chapter includes:
- 3.1 Expectation of random variables
- 3.2 Properties of expectations
- 3.3 Variance and Covariance
- 3.4 Expectations and variances of several important random variables
- 3.5 Conditional expectations
- 3.6 Moment Generating Function
Since some translated terms may not convey the original meaning well, the key terms of this chapter are listed here for reference:
1. Expectation
2. Variance
3. Covariance
4. Mean
5. First moment
6. Integration by parts
7. Law of the unconscious statistician
8. Moment
9. Standard deviation
10. Sample mean
11. Sample variance
12. Correlation
13. Conditional expectation
14. The rule of iterated expectations
15. Conditional variance
16. Hierarchical model
17. Moment generating function
18. Laplace transform
3.1 Expectation of random variables
The expectation, or mean, of a random variable X is its average value.
3.1 Definition
The expected value, or mean, or first moment of the random variable $X$ is defined as
$$\mathbb{E}(X) = \int x \, dF(x) = \begin{cases} \sum_x x f(x) & \text{if } X \text{ is discrete} \\ \int x f(x)\, dx & \text{if } X \text{ is continuous,} \end{cases}$$
assuming the sum (or integral) is well defined. We use the following notations interchangeably for the expected value of $X$:
$$\mathbb{E}(X) = \mathbb{E}X = \int x\, dF(x) = \mu = \mu_X.$$
Expectation is a one-number summary of a distribution. Think of $\mathbb{E}(X)$ as the average of a large number of independent draws $X_1, X_2, \dots$ from the distribution; the fact that the sample average converges to $\mathbb{E}(X)$ is a theorem, the law of large numbers, which will be introduced in Chapter 5.
The notation $\int x\, dF(x)$ needs some explanation. We use it simply as a convenient unifying notation so that we do not have to write $\sum_x x f(x)$ for discrete random variables and $\int x f(x)\, dx$ for continuous ones, but you should know that this notation has a precise meaning that is covered in real analysis courses.
To ensure that the definition makes sense, we say that $\mathbb{E}(X)$ exists if $\int |x| \, dF_X(x) < \infty$. Otherwise, we say the expectation does not exist.
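To make the definition concrete, here is a minimal Python sketch comparing the exact expectation of a discrete and a continuous random variable with a Monte Carlo average of independent draws; the distribution choices and sample size are illustrative assumptions, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # number of simulated draws (illustrative choice)

# Discrete example: X takes values 0, 1, 2 with probabilities 1/4, 1/2, 1/4.
values = np.array([0.0, 1.0, 2.0])
probs = np.array([0.25, 0.5, 0.25])
exact_discrete = np.sum(values * probs)          # sum_x x f(x) = 1.0
mc_discrete = rng.choice(values, size=n, p=probs).mean()

# Continuous example: X ~ Uniform(0, 1), so E(X) = integral of x dx over [0, 1] = 1/2.
exact_continuous = 0.5
mc_continuous = rng.uniform(0.0, 1.0, size=n).mean()

print(f"discrete:   exact={exact_discrete:.3f}  monte carlo={mc_discrete:.3f}")
print(f"continuous: exact={exact_continuous:.3f}  monte carlo={mc_continuous:.3f}")
```

This also foreshadows the law of large numbers mentioned above: the Monte Carlo averages approach the exact values as the number of draws grows.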
3.2 Example
Suppose $X \sim \mathrm{Bernoulli}(p)$. Then $\mathbb{E}(X) = \sum_x x f(x) = 0 \cdot (1 - p) + 1 \cdot p = p$.
3.3 Example
Toss a fair coin twice. Let X be the number of heads. Then
$$\mathbb{E}(X) = \sum_x x f_X(x) = 0 \cdot \tfrac{1}{4} + 1 \cdot \tfrac{1}{2} + 2 \cdot \tfrac{1}{4} = 1.$$
3.4 Example
Suppose $X \sim \mathrm{Uniform}(-1, 3)$. Then
$$\mathbb{E}(X) = \int x f_X(x)\, dx = \frac{1}{4}\int_{-1}^{3} x \, dx = 1.$$
3.5 Example
To review, a random variable with a Cauchy distribution has density function
$$f_X(x) = \frac{1}{\pi (1 + x^2)}.$$
Using integration by parts (with $u = x$ and $v = \tan^{-1} x$), we get
$$\int |x| \, dF(x) = \frac{2}{\pi} \int_0^{\infty} \frac{x \, dx}{1 + x^2} = \left[ x \tan^{-1} x \right]_0^{\infty} - \int_0^{\infty} \tan^{-1} x \, dx = \infty.$$
Therefore the mean does not exist. If you simulate many draws from the Cauchy distribution and take the average, you will find that the average never settles down. This is because the Cauchy distribution has heavy tails, so extreme observations are common.
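As a hedged illustration of this instability, the following Python sketch compares the running sample mean of Cauchy draws with that of Normal draws (the sample sizes, seeds, and the Normal comparison are choices made here, not taken from the text): the Normal running mean stabilizes near its expectation while the Cauchy running mean keeps jumping.

```python
import numpy as np

rng = np.random.default_rng(1)

def running_mean_checkpoints(draws, checkpoints):
    """Return the running sample mean evaluated at the given sample sizes."""
    cumulative = np.cumsum(draws)
    return {n: cumulative[n - 1] / n for n in checkpoints}

checkpoints = [10, 1_000, 100_000, 1_000_000]
n_max = max(checkpoints)

normal_draws = rng.standard_normal(n_max)   # mean exists (equals 0)
cauchy_draws = rng.standard_cauchy(n_max)   # mean does not exist

normal_means = running_mean_checkpoints(normal_draws, checkpoints)
cauchy_means = running_mean_checkpoints(cauchy_draws, checkpoints)
print("n        normal mean   cauchy mean")
for n in checkpoints:
    print(f"{n:<8d} {normal_means[n]:>11.3f} {cauchy_means[n]:>13.3f}")
```

Rerunning with different seeds shows the Cauchy column wandering arbitrarily while the Normal column hugs 0.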
From now on, whenever we discuss expectations, we will assume that they exist.
Suppose $Y = r(X)$. How do we compute $\mathbb{E}(Y)$? One way is to find $f_Y(y)$ and then compute $\mathbb{E}(Y) = \int y f_Y(y)\, dy$. But there is an easier way.
3.6 Theorem
(The law of the unconscious statistician.) Suppose $Y = r(X)$. Then
$$\mathbb{E}(Y) = \mathbb{E}(r(X)) = \int r(x)\, dF_X(x).$$
This result is intuitive. Consider playing a game in which we draw $X$ at random and then I pay you $Y = r(X)$. Your average income is $r(x)$ times the probability that $X = x$, summed (or integrated) over $x$. Here is a special case. Let $A$ be an event and let $r(x) = I_A(x)$, where $I_A(x) = 1$ if $x \in A$ and $I_A(x) = 0$ if $x \notin A$. Then
$$\mathbb{E}(I_A(X)) = \int I_A(x) f_X(x)\, dx = \int_A f_X(x)\, dx = \mathbb{P}(X \in A).$$
In other words, probability is a special case of expectation.
3.7 Example
Suppose $X \sim \mathrm{Uniform}(0, 1)$ and let $Y = r(X) = e^X$. Then
$$\mathbb{E}(Y) = \int_0^1 e^x f(x)\, dx = \int_0^1 e^x \, dx = e - 1.$$
Alternatively, you could find $f_Y(y)$, which turns out to be $f_Y(y) = 1/y$ for $1 < y < e$. Then
$$\mathbb{E}(Y) = \int_1^e y f_Y(y)\, dy = \int_1^e 1 \, dy = e - 1.$$
3.8 Example
Take a stick of unit length and break it at a point chosen uniformly at random. Let $Y$ be the length of the longer piece. What is the mean of $Y$? If $X \sim \mathrm{Uniform}(0, 1)$ is the break point, then $Y = r(X) = \max\{X, 1 - X\}$, so $r(x) = 1 - x$ for $0 < x < 1/2$ and $r(x) = x$ for $1/2 \le x < 1$. Hence
$$\mathbb{E}(Y) = \int r(x)\, dF(x) = \int_0^{1/2} (1 - x)\, dx + \int_{1/2}^{1} x \, dx = \frac{3}{4}.$$
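A quick way to sanity-check such computations is to apply the law of the unconscious statistician by simulation: average $r(X)$ over many draws of $X$ instead of deriving the distribution of $Y$. The sketch below (the sample size and the event $A$ are arbitrary illustrative choices) checks the stick example and the indicator special case.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, size=500_000)  # draws of the break point X ~ Uniform(0, 1)

# E(r(X)) estimated by averaging r over draws of X (no need to find f_Y).
longer_piece = np.maximum(x, 1.0 - x)
print("E(longer piece) ~", longer_piece.mean(), " exact = 0.75")

# Indicator special case: E(I_A(X)) = P(X in A), here A = [0, 0.3].
indicator = (x <= 0.3).astype(float)
print("E(I_A(X))       ~", indicator.mean(), " exact = 0.30")
```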
Multi-variable functions are handled in the same way. For example:
3.9 Example
Assume $(X, Y)$ has a jointly uniform distribution on the unit square, and let $Z = r(X, Y) = X^2 + Y^2$. Then
$$\mathbb{E}(Z) = \int\!\!\int r(x, y)\, dF(x, y) = \int_0^1\!\!\int_0^1 (x^2 + y^2)\, dx\, dy = \frac{1}{3} + \frac{1}{3} = \frac{2}{3}.$$
The $k$th moment of $X$ is defined as $\mathbb{E}(X^k)$, assuming that $\mathbb{E}(|X|^k) < \infty$.
3.10 Theorem
If the $k$th moment exists and $j < k$, then the $j$th moment also exists.
Proof:
$$\mathbb{E}|X^j| = \int |x|^j f_X(x)\, dx = \int_{|x| \le 1} |x|^j f_X(x)\, dx + \int_{|x| > 1} |x|^j f_X(x)\, dx \le \int_{|x| \le 1} f_X(x)\, dx + \int_{|x| > 1} |x|^k f_X(x)\, dx \le 1 + \mathbb{E}(|X|^k) < \infty.$$
The $k$th central moment is defined as $\mathbb{E}\left[(X - \mu)^k\right]$.
3.2 Properties of expectations
3.11 Theorem
If $X_1, \dots, X_n$ are random variables and $a_1, \dots, a_n$ are constants, then
$$\mathbb{E}\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i \mathbb{E}(X_i).$$
3.12 Example
Suppose $X \sim \mathrm{Binomial}(n, p)$. What is the mean of $X$? We could try to use the definition directly:
$$\mathbb{E}(X) = \sum_x x f_X(x) = \sum_{x=0}^{n} x \binom{n}{x} p^x (1 - p)^{n - x},$$
but this sum is not easy to evaluate. Instead, note that $X = \sum_{i=1}^n X_i$, where $X_i = 1$ if the $i$th coin toss is heads and $X_i = 0$ if it is tails. Then $\mathbb{E}(X_i) = p$, and therefore
$$\mathbb{E}(X) = \mathbb{E}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \mathbb{E}(X_i) = np.$$
3.13 Theorem
Suppose $X_1, \dots, X_n$ are independent random variables. Then
$$\mathbb{E}\left(\prod_{i=1}^n X_i\right) = \prod_{i=1}^n \mathbb{E}(X_i).$$
Note: The above summation rule does not require the random variables to be independent. However, the product rule requires the random variables to be independent.
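The following sketch (the distributions and constants are arbitrary choices used only for illustration) checks both rules by simulation: linearity of expectation holds regardless of dependence, while the product rule here relies on independence.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500_000

# Two independent random variables with known means.
x = rng.exponential(scale=2.0, size=n)   # E(X) = 2
y = rng.uniform(0.0, 1.0, size=n)        # E(Y) = 0.5

# Linearity: E(3X + 4Y) = 3 E(X) + 4 E(Y) = 8.
print("E(3X + 4Y) ~", (3 * x + 4 * y).mean(), " exact = 8.0")

# Product rule for independent variables: E(XY) = E(X) E(Y) = 1.
print("E(XY)      ~", (x * y).mean(), " exact = 1.0")

# Without independence the product rule fails: take Z = X, so E(XZ) = E(X^2) > E(X)E(Z).
z = x
print("E(XZ)      ~", (x * z).mean(), " vs E(X)E(Z) =", x.mean() * z.mean())
```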
3.3 Variance and Covariance
The variance measures the "spread" of a distribution, that is, how concentrated or dispersed it is.
3.14 Definition
Let $X$ be a random variable with mean $\mu$. The variance of $X$, denoted $\sigma^2$, $\sigma_X^2$, $\mathbb{V}(X)$, or $\mathbb{V}X$, is defined as
$$\sigma^2 = \mathbb{E}\left[(X - \mu)^2\right] = \int (x - \mu)^2 \, dF(x),$$
assuming this expectation exists. The standard deviation is $\mathrm{sd}(X) = \sqrt{\mathbb{V}(X)}$, also denoted $\sigma$ or $\sigma_X$.
3.15 Theorem
Assuming the variance is well defined, it has the following properties:
1. $\mathbb{V}(X) = \mathbb{E}(X^2) - \mu^2$.
2. If $a$ and $b$ are constants, then $\mathbb{V}(aX + b) = a^2 \mathbb{V}(X)$.
3. If $X_1, \dots, X_n$ are independent and $a_1, \dots, a_n$ are constants, then
$$\mathbb{V}\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i^2 \mathbb{V}(X_i).$$
3.16 Example
Suppose $X \sim \mathrm{Binomial}(n, p)$, and write $X = \sum_{i=1}^n X_i$ as before, where $X_i = 1$ if the $i$th toss is heads and $X_i = 0$ otherwise, so that $\mathbb{P}(X_i = 1) = p$, $\mathbb{P}(X_i = 0) = 1 - p$, and the $X_i$ are independent. Recall that $\mathbb{E}(X_i) = p$.
Now $\mathbb{E}(X_i^2) = 1^2 \cdot p + 0^2 \cdot (1 - p) = p$, so $\mathbb{V}(X_i) = \mathbb{E}(X_i^2) - p^2 = p - p^2 = p(1 - p)$.
Therefore $\mathbb{V}(X) = \mathbb{V}\left(\sum_i X_i\right) = \sum_i \mathbb{V}(X_i) = np(1 - p)$.
Note: if $p = 0$ or $p = 1$, then $\mathbb{V}(X) = 0$.
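A small simulation (the parameter values are chosen arbitrarily here) confirms that the sum-of-indicators argument gives mean $np$ and variance $np(1-p)$ for the binomial.

```python
import numpy as np

rng = np.random.default_rng(4)
n_trials, p, n_sims = 30, 0.2, 200_000

# Each row is one experiment: n_trials independent Bernoulli(p) coin tosses.
tosses = rng.random((n_sims, n_trials)) < p
x = tosses.sum(axis=1)            # X = number of heads, Binomial(n_trials, p)

print("mean:     simulated", x.mean(), " exact", n_trials * p)
print("variance: simulated", x.var(), " exact", n_trials * p * (1 - p))
```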
If $X_1, \dots, X_n$ are random variables, then the sample mean is defined as
$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i,$$
and the sample variance is defined as
$$S_n^2 = \frac{1}{n - 1} \sum_{i=1}^n \left(X_i - \bar{X}_n\right)^2.$$
3.17 Theorem
Suppose $X_1, \dots, X_n$ are independent and identically distributed random variables with $\mu = \mathbb{E}(X_i)$ and $\sigma^2 = \mathbb{V}(X_i)$. Then
$$\mathbb{E}(\bar{X}_n) = \mu, \qquad \mathbb{V}(\bar{X}_n) = \frac{\sigma^2}{n}, \qquad \mathbb{E}(S_n^2) = \sigma^2.$$
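The sketch below (the distribution, sample size, and number of replications are arbitrary illustrative choices) checks these three facts by repeating the experiment many times: the average of $\bar{X}_n$ matches $\mu$, its variance matches $\sigma^2/n$, and the average of the $1/(n-1)$ sample variance matches $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(5)
n, n_reps = 10, 100_000
mu, sigma2 = 3.0, 4.0                      # true mean and variance

samples = rng.normal(mu, np.sqrt(sigma2), size=(n_reps, n))
xbar = samples.mean(axis=1)                # sample mean of each replication
s2 = samples.var(axis=1, ddof=1)           # sample variance with the 1/(n-1) factor

print("E(Xbar) ~", xbar.mean(), " exact", mu)
print("V(Xbar) ~", xbar.var(), " exact", sigma2 / n)
print("E(S^2)  ~", s2.mean(), " exact", sigma2)
```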
If X and Y are random variables, then the covariance and correlation between X and Y measure how strong the linear relationship between X and Y is.
3.18 Definition
Let $X$ and $Y$ be random variables with means $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$. The covariance between $X$ and $Y$ is defined as
$$\mathrm{Cov}(X, Y) = \mathbb{E}\left[(X - \mu_X)(Y - \mu_Y)\right],$$
and the correlation is defined as
$$\rho = \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$
3.19 Theorem
The covariance satisfies
$$\mathrm{Cov}(X, Y) = \mathbb{E}(XY) - \mathbb{E}(X)\,\mathbb{E}(Y).$$
The correlation satisfies
$$-1 \le \rho(X, Y) \le 1.$$
If $Y = aX + b$ for constants $a$ and $b$, then $\rho(X, Y) = 1$ when $a > 0$ and $\rho(X, Y) = -1$ when $a < 0$.
If $X$ and $Y$ are independent, then $\mathrm{Cov}(X, Y) = \rho(X, Y) = 0$; in general, the converse does not hold.
3.20 Theorem
$$\mathbb{V}(X + Y) = \mathbb{V}(X) + \mathbb{V}(Y) + 2\,\mathrm{Cov}(X, Y), \qquad \mathbb{V}(X - Y) = \mathbb{V}(X) + \mathbb{V}(Y) - 2\,\mathrm{Cov}(X, Y).$$
More generally, for multiple random variables $X_1, \dots, X_n$:
$$\mathbb{V}\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i^2 \mathbb{V}(X_i) + 2 \sum_{i < j} a_i a_j\, \mathrm{Cov}(X_i, X_j).$$
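Two points above are worth checking numerically: zero covariance does not imply independence, and the variance of a sum picks up a covariance term. The sketch below demonstrates both; the construction $Y = X^2$ with a symmetric $X$ is a standard counterexample chosen here for illustration, not something from the text.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500_000

x = rng.standard_normal(n)
y = x ** 2                      # Y is a deterministic function of X, so clearly dependent

# Cov(X, X^2) = E(X^3) - E(X) E(X^2) = 0 for symmetric X, despite the dependence.
cov_xy = np.cov(x, y)[0, 1]
print("Cov(X, X^2) ~", cov_xy, " (close to 0, yet X and X^2 are dependent)")

# V(X + Y) = V(X) + V(Y) + 2 Cov(X, Y); compare the two sides empirically.
lhs = (x + y).var()
rhs = x.var() + y.var() + 2 * cov_xy
print("V(X + Y)    ~", lhs, " vs V(X) + V(Y) + 2Cov ~", rhs)
```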
3.4 Expectations and variances of several important random variables
The following table lists the expectations and variances of several important random variables:

| Distribution | Mean | Variance |
| --- | --- | --- |
| Point mass at $a$ | $a$ | $0$ |
| $\mathrm{Bernoulli}(p)$ | $p$ | $p(1-p)$ |
| $\mathrm{Binomial}(n, p)$ | $np$ | $np(1-p)$ |
| $\mathrm{Geometric}(p)$ | $1/p$ | $(1-p)/p^2$ |
| $\mathrm{Poisson}(\lambda)$ | $\lambda$ | $\lambda$ |
| $\mathrm{Uniform}(a, b)$ | $(a+b)/2$ | $(b-a)^2/12$ |
| $\mathrm{Normal}(\mu, \sigma^2)$ | $\mu$ | $\sigma^2$ |
| $\mathrm{Exponential}(\beta)$ | $\beta$ | $\beta^2$ |
| $\mathrm{Gamma}(\alpha, \beta)$ | $\alpha\beta$ | $\alpha\beta^2$ |
| $\mathrm{Beta}(\alpha, \beta)$ | $\alpha/(\alpha+\beta)$ | $\alpha\beta\big/\big((\alpha+\beta)^2(\alpha+\beta+1)\big)$ |
| $t_\nu$ | $0$ (if $\nu > 1$) | $\nu/(\nu-2)$ (if $\nu > 2$) |
| $\chi^2_p$ | $p$ | $2p$ |
| $\mathrm{Multinomial}(n, p)$ | $np$ | see below |
| Multivariate Normal$(\mu, \Sigma)$ | $\mu$ | $\Sigma$ |
We have previously derived the expectation and variance of the binomial distribution. Derivations for the other distributions are left to the exercises at the end of the chapter.
The last two entries in the table are multivariate models. These involve a random vector $X$ of the form
$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_k \end{pmatrix}.$$
The mean of a random vector $X$ is defined as
$$\mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_k \end{pmatrix} = \begin{pmatrix} \mathbb{E}(X_1) \\ \vdots \\ \mathbb{E}(X_k) \end{pmatrix}.$$
The variance-covariance matrix $\Sigma$ is defined as
$$\mathbb{V}(X) = \Sigma = \begin{pmatrix} \mathbb{V}(X_1) & \mathrm{Cov}(X_1, X_2) & \cdots & \mathrm{Cov}(X_1, X_k) \\ \mathrm{Cov}(X_2, X_1) & \mathbb{V}(X_2) & \cdots & \mathrm{Cov}(X_2, X_k) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(X_k, X_1) & \mathrm{Cov}(X_k, X_2) & \cdots & \mathbb{V}(X_k) \end{pmatrix}.$$
If $X \sim \mathrm{Multinomial}(n, p)$, then $\mathbb{E}(X) = np = n(p_1, \dots, p_k)$ and $\mathbb{V}(X) = \Sigma$, where $\Sigma_{ii} = np_i(1 - p_i)$ and $\Sigma_{ij} = -np_i p_j$ for $i \ne j$.
To see this, note that the marginal distribution of any single component of the vector is $X_i \sim \mathrm{Binomial}(n, p_i)$, so $\mathbb{E}(X_i) = np_i$ and $\mathbb{V}(X_i) = np_i(1 - p_i)$. Note also that $X_i + X_j \sim \mathrm{Binomial}(n, p_i + p_j)$, so $\mathbb{V}(X_i + X_j) = n(p_i + p_j)(1 - p_i - p_j)$. On the other hand, using the formula for the variance of a sum, $\mathbb{V}(X_i + X_j) = \mathbb{V}(X_i) + \mathbb{V}(X_j) + 2\,\mathrm{Cov}(X_i, X_j) = np_i(1 - p_i) + np_j(1 - p_j) + 2\,\mathrm{Cov}(X_i, X_j)$.
Equating the two expressions for $\mathbb{V}(X_i + X_j)$ and solving, we find $\mathrm{Cov}(X_i, X_j) = -np_i p_j$.
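The negative covariance between multinomial counts can be checked directly by simulation; the sketch below (the cell probabilities, $n$, and the number of simulations are illustrative choices) compares the empirical covariance of two counts with $-np_ip_j$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 50, np.array([0.2, 0.3, 0.5])
counts = rng.multinomial(n, p, size=200_000)   # each row is one Multinomial(n, p) draw

# Empirical covariance between the first two counts vs the formula -n p_i p_j.
emp_cov = np.cov(counts[:, 0], counts[:, 1])[0, 1]
print("Cov(X_1, X_2) ~", emp_cov, " exact", -n * p[0] * p[1])

# Marginal variance of the first count vs n p_1 (1 - p_1).
print("V(X_1)        ~", counts[:, 0].var(), " exact", n * p[0] * (1 - p[0]))
```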
Finally, there is a lemma for finding the mean and variance of linear transformations of a multivariate random vector, which can be very useful in some situations.
3.21 Lemma
If $a$ is a vector and $X$ is a random vector with mean $\mu$ and variance $\Sigma$, then
$$\mathbb{E}(a^T X) = a^T \mu \quad \text{and} \quad \mathbb{V}(a^T X) = a^T \Sigma a.$$
If $A$ is a matrix, then
$$\mathbb{E}(AX) = A\mu \quad \text{and} \quad \mathbb{V}(AX) = A \Sigma A^T.$$
3.5 Conditional Expectation
Suppose that $X$ and $Y$ are random variables. When $Y = y$, what is the mean of $X$? The answer is that we compute the mean of $X$ as before, but in the definition of expectation we replace $f_X(x)$ with the conditional density $f_{X|Y}(x \mid y)$.
3.22 Definition
The conditional expectation of $X$ given $Y = y$ is defined as
$$\mathbb{E}(X \mid Y = y) = \begin{cases} \sum_x x\, f_{X|Y}(x \mid y) & \text{in the discrete case} \\ \int x\, f_{X|Y}(x \mid y)\, dx & \text{in the continuous case.} \end{cases}$$
If $r(x, y)$ is a function of $x$ and $y$, then
$$\mathbb{E}(r(X, Y) \mid Y = y) = \begin{cases} \sum_x r(x, y)\, f_{X|Y}(x \mid y) & \text{in the discrete case} \\ \int r(x, y)\, f_{X|Y}(x \mid y)\, dx & \text{in the continuous case.} \end{cases}$$
Warning: there is a subtle point here. Although $\mathbb{E}(X \mid Y = y)$ is a number, $\mathbb{E}(X \mid Y)$ is a function of $Y$. Before we observe $Y$ we do not know its value, so $\mathbb{E}(X \mid Y)$ is a random variable. In other words, $\mathbb{E}(X \mid Y)$ is the random variable whose value is $\mathbb{E}(X \mid Y = y)$ when $Y = y$. Similarly, $\mathbb{E}(r(X, Y) \mid Y)$ is the random variable whose value is $\mathbb{E}(r(X, Y) \mid Y = y)$ when $Y = y$. This point is easy to get confused about, so let us look at an example.
3.23 Example
Suppose we draw $X \sim \mathrm{Uniform}(0, 1)$. After observing $X = x$, we draw $Y \mid X = x \sim \mathrm{Uniform}(x, 1)$. Intuitively, we expect that $\mathbb{E}(Y \mid X = x) = (1 + x)/2$. In fact,
$$\mathbb{E}(Y \mid X = x) = \int y\, f_{Y|X}(y \mid x)\, dy = \int_x^1 \frac{y}{1 - x}\, dy = \frac{1 + x}{2}.$$
Therefore, $\mathbb{E}(Y \mid X) = (1 + X)/2$. Note that $\mathbb{E}(Y \mid X) = (1 + X)/2$ is a random variable whose value is the number $\mathbb{E}(Y \mid X = x) = (1 + x)/2$ once $X = x$ is observed.
3.24 Theorem (The Rule of Iterated Expectations)
For random variables $X$ and $Y$, assuming the expectations exist, we have
$$\mathbb{E}\left[\mathbb{E}(Y \mid X)\right] = \mathbb{E}(Y) \quad \text{and} \quad \mathbb{E}\left[\mathbb{E}(X \mid Y)\right] = \mathbb{E}(X).$$
More generally, for any function $r(x, y)$:
$$\mathbb{E}\left[\mathbb{E}(r(X, Y) \mid X)\right] = \mathbb{E}(r(X, Y)).$$
Proof:
We prove the first equation in the continuous case. Using the definition of conditional expectation and the fact that $f(x, y) = f(y \mid x) f_X(x)$,
$$\mathbb{E}\left[\mathbb{E}(Y \mid X)\right] = \int \mathbb{E}(Y \mid X = x) f_X(x)\, dx = \int\!\!\int y\, f(y \mid x)\, dy\, f_X(x)\, dx = \int\!\!\int y\, f(y \mid x) f_X(x)\, dy\, dx = \int\!\!\int y\, f(x, y)\, dy\, dx = \mathbb{E}(Y).$$
3.25 Example
Consider Example 3.23. How can we compute $\mathbb{E}(Y)$? One method is to find the joint density $f(x, y)$ and then compute $\mathbb{E}(Y) = \int\!\!\int y f(x, y)\, dx\, dy$. An easier method takes two steps. First, we already know that $\mathbb{E}(Y \mid X) = (1 + X)/2$. Thus,
$$\mathbb{E}(Y) = \mathbb{E}\left[\mathbb{E}(Y \mid X)\right] = \mathbb{E}\left(\frac{1 + X}{2}\right) = \frac{1 + \mathbb{E}(X)}{2} = \frac{1 + 1/2}{2} = \frac{3}{4}.$$
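Under the reading of Example 3.23 used above (first $X$ uniform on $(0, 1)$, then $Y$ uniform on $(x, 1)$), the rule of iterated expectations is easy to verify by simulating the two stages; the sketch below (sample size and the conditioning window near $x = 0.4$ are arbitrary choices) checks both the conditional mean and the overall mean $3/4$.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 500_000

x = rng.uniform(0.0, 1.0, size=n)        # stage 1: X ~ Uniform(0, 1)
y = rng.uniform(x, 1.0)                  # stage 2: Y | X = x ~ Uniform(x, 1)

# E(Y | X close to 0.4) should be about (1 + 0.4) / 2 = 0.7.
near = np.abs(x - 0.4) < 0.01
print("E(Y | X ~ 0.4) ~", y[near].mean(), " exact 0.7")

# Iterated expectations: E(Y) = E[(1 + X)/2] = 3/4.
print("E(Y)           ~", y.mean(), " exact 0.75")
```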
3.26 Definition
The conditional variance is defined as
$$\mathbb{V}(Y \mid X = x) = \int \big(y - \mu(x)\big)^2 f(y \mid x)\, dy,$$
where $\mu(x) = \mathbb{E}(Y \mid X = x)$.
3.27 Theorem
For random variables $X$ and $Y$:
$$\mathbb{V}(Y) = \mathbb{E}\left[\mathbb{V}(Y \mid X)\right] + \mathbb{V}\left[\mathbb{E}(Y \mid X)\right].$$
3.28 Example
Randomly select a county from the US, and then randomly select $n$ people from that county. Let $X$ be the number of people in the sample who have a certain disease. If $Q$ denotes the proportion of people in the county with the disease, then $Q$ is also a random variable, because it varies from county to county. Given $Q = q$, we have $X \mid Q = q \sim \mathrm{Binomial}(n, q)$. Therefore $\mathbb{E}(X \mid Q = q) = nq$ and $\mathbb{V}(X \mid Q = q) = nq(1 - q)$. Assume that the random variable $Q$ obeys a uniform distribution $\mathrm{Uniform}(0, 1)$. A distribution constructed in stages like this is called a hierarchical model and can be written as:
$$Q \sim \mathrm{Uniform}(0, 1), \qquad X \mid Q = q \sim \mathrm{Binomial}(n, q).$$
Now $\mathbb{E}(X) = \mathbb{E}\left[\mathbb{E}(X \mid Q)\right] = \mathbb{E}(nQ) = n\,\mathbb{E}(Q) = n/2$. Let us calculate the variance of $X$.
Now $\mathbb{V}(X) = \mathbb{E}\left[\mathbb{V}(X \mid Q)\right] + \mathbb{V}\left[\mathbb{E}(X \mid Q)\right]$; let us calculate these two terms.
First,
$$\mathbb{E}\left[\mathbb{V}(X \mid Q)\right] = \mathbb{E}\left[nQ(1 - Q)\right] = n \int_0^1 q(1 - q)\, dq = \frac{n}{6}.$$
Next,
$$\mathbb{V}\left[\mathbb{E}(X \mid Q)\right] = \mathbb{V}(nQ) = n^2\, \mathbb{V}(Q) = n^2 \int_0^1 \left(q - \tfrac{1}{2}\right)^2 dq = \frac{n^2}{12}.$$
Therefore
$$\mathbb{V}(X) = \frac{n}{6} + \frac{n^2}{12}.$$
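The staged structure of a hierarchical model makes it natural to simulate stage by stage; the following sketch (the value of $n$ and the number of simulations are chosen arbitrarily) checks the mean $n/2$ and the variance $n/6 + n^2/12$ derived above.

```python
import numpy as np

rng = np.random.default_rng(9)
n, n_sims = 20, 500_000

q = rng.uniform(0.0, 1.0, size=n_sims)   # stage 1: Q ~ Uniform(0, 1), one county per simulation
x = rng.binomial(n, q)                   # stage 2: X | Q = q ~ Binomial(n, q)

print("E(X) ~", x.mean(), " exact", n / 2)
print("V(X) ~", x.var(), " exact", n / 6 + n**2 / 12)
```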
3.6 Moment Generating Function
Now we define the moment generating function, which is used for finding moments, for finding the distribution of sums of random variables, and in the proofs of certain theorems.
3.29 Definition
The moment generating function (MGF), or Laplace transform, of $X$ is defined by
$$\psi_X(t) = \mathbb{E}\left(e^{tX}\right) = \int e^{tx}\, dF(x),$$
where $t$ varies over the real numbers.
In what follows, we assume that the MGF is well defined for all $t$ in some open interval around $t = 0$.
When the MGF is well defined, it can be shown that the operations of differentiation and taking expectation can be interchanged. This leads to
$$\psi'(0) = \left[\frac{d}{dt} \mathbb{E}\left(e^{tX}\right)\right]_{t=0} = \mathbb{E}\left[\frac{d}{dt} e^{tX}\right]_{t=0} = \mathbb{E}\left[X e^{tX}\right]_{t=0} = \mathbb{E}(X).$$
By taking $k$ derivatives we conclude that $\psi^{(k)}(0) = \mathbb{E}(X^k)$. This gives us a way to calculate the moments of a distribution.
3.30 Example
Suppose $X \sim \mathrm{Exp}(1)$. For any $t < 1$ we get
$$\psi_X(t) = \mathbb{E}\left(e^{tX}\right) = \int_0^{\infty} e^{tx} e^{-x}\, dx = \int_0^{\infty} e^{(t - 1)x}\, dx = \frac{1}{1 - t}.$$
If $t \ge 1$, the integral diverges. Therefore $\psi_X(t) = 1/(1 - t)$ for $t < 1$. Now $\psi'(0) = 1$ and $\psi''(0) = 2$.
Therefore, $\mathbb{E}(X) = 1$ and $\mathbb{V}(X) = \mathbb{E}(X^2) - \mu^2 = 2 - 1 = 1$.
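Moments can also be recovered numerically from an MGF by differentiating at $0$; the sketch below uses central finite differences on the closed form $1/(1-t)$ from the example above (the step size, seed, and sample size are arbitrary choices) and also compares a Monte Carlo estimate of the MGF with the closed form.

```python
import numpy as np

def psi(t):
    """MGF of the Exp(1) distribution from the example, valid for t < 1."""
    return 1.0 / (1.0 - t)

h = 1e-4
first_deriv = (psi(h) - psi(-h)) / (2 * h)               # ~ psi'(0) = E(X)
second_deriv = (psi(h) - 2 * psi(0.0) + psi(-h)) / h**2  # ~ psi''(0) = E(X^2)

print("E(X)   ~", first_deriv, " exact 1")
print("E(X^2) ~", second_deriv, " exact 2")

# Monte Carlo check of the MGF itself at one point, t = 0.5.
rng = np.random.default_rng(10)
x = rng.exponential(scale=1.0, size=500_000)
print("psi(0.5) ~", np.exp(0.5 * x).mean(), " exact", psi(0.5))
```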
3.31 Lemma
The MGF has the following properties:
1. If $Y = aX + b$, then $\psi_Y(t) = e^{bt}\, \psi_X(at)$.
2. If $X_1, \dots, X_n$ are independent and $Y = \sum_i X_i$, then $\psi_Y(t) = \prod_i \psi_i(t)$, where $\psi_i$ is the MGF of $X_i$.
3.32 Example
Suppose $X$ obeys the binomial distribution $\mathrm{Binomial}(n, p)$. We know that $X = \sum_{i=1}^n X_i$, where $\mathbb{P}(X_i = 1) = p$ and $\mathbb{P}(X_i = 0) = 1 - p$. Now
$$\psi_i(t) = \mathbb{E}\left(e^{X_i t}\right) = p e^t + (1 - p) = p e^t + q, \qquad \text{where } q = 1 - p.$$
Therefore,
$$\psi_X(t) = \prod_{i=1}^n \psi_i(t) = \left(p e^t + q\right)^n.$$
Recall from earlier that if $X$ and $Y$ have the same distribution function, we write $X \stackrel{d}{=} Y$.
3.33 Theorem
Let $X$ and $Y$ be random variables. If $\psi_X(t) = \psi_Y(t)$ for all $t$ in an open interval around $0$, then $X \stackrel{d}{=} Y$.
3.34 Example
Suppose $X_1 \sim \mathrm{Binomial}(n_1, p)$ and $X_2 \sim \mathrm{Binomial}(n_2, p)$ are independent. Let $Y = X_1 + X_2$. Then we get
$$\psi_Y(t) = \psi_1(t)\, \psi_2(t) = \left(p e^t + q\right)^{n_1} \left(p e^t + q\right)^{n_2} = \left(p e^t + q\right)^{n_1 + n_2},$$
which we recognize as the moment generating function of a $\mathrm{Binomial}(n_1 + n_2, p)$ distribution. Because the moment generating function characterizes the distribution (there cannot exist a random variable with a different distribution but the same moment generating function), we conclude that $Y \sim \mathrm{Binomial}(n_1 + n_2, p)$.
3.35 Example
Suppose $Y_1$ obeys the Poisson distribution $\mathrm{Poisson}(\lambda_1)$, $Y_2$ obeys the Poisson distribution $\mathrm{Poisson}(\lambda_2)$, and the two are independent. The moment generating function of $Y = Y_1 + Y_2$ is
$$\psi_Y(t) = \psi_1(t)\, \psi_2(t) = e^{\lambda_1 (e^t - 1)}\, e^{\lambda_2 (e^t - 1)} = e^{(\lambda_1 + \lambda_2)(e^t - 1)},$$
which is the moment generating function of a $\mathrm{Poisson}(\lambda_1 + \lambda_2)$ distribution. Therefore, we have shown that the sum of two independent Poisson random variables has a Poisson distribution.
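This closure property is easy to check empirically; the sketch below (the rates and sample size are arbitrary choices) compares the empirical distribution of $Y_1 + Y_2$ with direct draws from $\mathrm{Poisson}(\lambda_1 + \lambda_2)$.

```python
import numpy as np

rng = np.random.default_rng(11)
lam1, lam2, n = 2.0, 3.5, 500_000

y_sum = rng.poisson(lam1, n) + rng.poisson(lam2, n)   # sum of independent Poissons
y_direct = rng.poisson(lam1 + lam2, n)                # Poisson(lam1 + lam2) directly

# Compare means, variances, and a few point probabilities of the two samples.
print("means:     ", y_sum.mean(), y_direct.mean(), " exact", lam1 + lam2)
print("variances: ", y_sum.var(), y_direct.var(), " exact", lam1 + lam2)
for k in (3, 5, 8):
    print(f"P(Y = {k}):", (y_sum == k).mean(), (y_direct == k).mean())
```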
End of this chapter
Untranslated: Appendix, homework