Chapter 5 (Limit Theorems): Markov and Chebyshev Inequalities

This post is a set of reading notes on *Introduction to Probability*.

Limit Theorems

  • In this chapter, we discuss some fundamental issues related to the asymptotic behavior of sequences of random variables. Our principal context involves a sequence $X_1, X_2, \ldots$ of independent identically distributed random variables with mean $\mu$ and variance $\sigma^2$.
  • Let
    $$S_n = X_1 + \cdots + X_n$$
    be the sum of the first $n$ of them. Limit theorems are mostly concerned with the properties of $S_n$ and related random variables as $n$ becomes very large.
    $$\mathrm{var}(S_n) = \mathrm{var}(X_1) + \cdots + \mathrm{var}(X_n) = n\sigma^2$$
    Thus, the distribution of $S_n$ spreads out as $n$ increases, and cannot have a meaningful limit.
  • The situation is different if we consider the sample mean
    $$M_n = \frac{X_1 + \cdots + X_n}{n} = \frac{S_n}{n}$$
    We have
    $$E[M_n] = \mu, \qquad \mathrm{var}(M_n) = \frac{\sigma^2}{n}$$
    In particular, the variance of $M_n$ decreases to zero as $n$ increases, so the bulk of the distribution of $M_n$ becomes concentrated near $\mu$.
    • This phenomenon is the subject of certain laws of large numbers, which generally assert that the sample mean $M_n$ (a random variable) converges to the true mean $\mu$ (a number), in a precise sense.
    • These laws provide a mathematical basis for the loose interpretation of an expectation $E[X] = \mu$ as the average of a large number of independent samples drawn from the distribution of $X$.
  • We will also consider a quantity which is intermediate between $S_n$ and $M_n$:
    $$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}$$
    It can be seen that
    $$E[Z_n] = 0, \qquad \mathrm{var}(Z_n) = 1$$
    Since the mean and the variance of $Z_n$ remain unchanged as $n$ increases, its distribution neither spreads out nor shrinks to a point. The central limit theorem is concerned with the asymptotic shape of the distribution of $Z_n$, and asserts that it approaches the standard normal distribution.
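
To get a feel for these three scaling behaviors, here is a small simulation sketch (my own, not from the book; the exponential distribution and all variable names are arbitrary choices) that estimates the variances of $S_n$, $M_n$, and $Z_n$ for increasing $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0                      # mean and standard deviation of Exponential(1)

for n in [10, 100, 1000]:
    # 10000 independent realizations of X_1 + ... + X_n
    X = rng.exponential(scale=1.0, size=(10000, n))
    S_n = X.sum(axis=1)
    M_n = S_n / n
    Z_n = (S_n - n * mu) / (sigma * np.sqrt(n))
    print(f"n={n:5d}  var(S_n)={S_n.var():9.1f}  "
          f"var(M_n)={M_n.var():.5f}  var(Z_n)={Z_n.var():.3f}")

# var(S_n) grows roughly like n*sigma^2, var(M_n) shrinks like sigma^2/n,
# and var(Z_n) stays near 1, matching the formulas above.
```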

Markov Inequality

If a random variable $X$ can only take nonnegative values, then
$$P(X \geq a) \leq \frac{E[X]}{a}, \qquad \text{for all } a > 0$$

  • Loosely speaking, it asserts that if a *nonnegative* random variable has a small mean, then the probability that it takes a large value must also be small.

  • To justify the Markov inequality, let us fix a positive number $a$ and consider the random variable $Y_a$ defined by
    $$Y_a = \begin{cases} 0, & \text{if } X < a \\ a, & \text{if } X \geq a \end{cases}$$
    It is seen that the relation $Y_a \leq X$ always holds and, therefore,
    $$E[X] \geq E[Y_a] = a\,P(X \geq a)$$
    from which the Markov inequality follows.
    For example, if $X$ is uniformly distributed on $[0, 4]$, so that $E[X] = 2$, the Markov inequality gives $P(X \geq 2) \leq 1$, $P(X \geq 3) \leq 2/3$, and $P(X \geq 4) \leq 1/2$, while the exact probabilities are $1/2$, $1/4$, and $0$, respectively.
  • We see that the bounds provided by the Markov inequality can be quite loose.
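
The looseness is easy to see numerically. Below is a minimal Monte Carlo sketch (my own illustration, not from the text) comparing the Markov bound $E[X]/a$ with the estimated tail probability for a uniform random variable on $[0, 4]$:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 4.0, size=1_000_000)   # nonnegative random variable with E[X] = 2
EX = X.mean()

for a in [2.0, 3.0, 4.0]:
    exact = (X >= a).mean()                  # Monte Carlo estimate of P(X >= a)
    markov = EX / a                          # Markov bound E[X]/a
    print(f"a={a}:  P(X >= a) ~ {exact:.3f}   Markov bound = {markov:.3f}")

# The bound always holds, but it is loose: for a = 2 it gives 1.0 against a true value of 0.5.
```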

Chebyshev Inequality

If $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, then
$$P(|X - \mu| \geq c) \leq \frac{\sigma^2}{c^2}, \qquad \text{for all } c > 0$$

  • Loosely speaking, it asserts that if a random variable has small variance, then the probability that it takes a value far from its mean is also small.
  • Note that the Chebyshev inequality does not require the random variable to be nonnegative.
  • An alternative form of the Chebyshev inequality is obtained by letting $c = k\sigma$, where $k$ is positive, which yields
    $$P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$$
    Thus, the probability that a random variable takes a value more than $k$ standard deviations away from its mean is at most $1/k^2$.

  • To justify the Chebyshev inequality, we consider the nonnegative random variable $(X - \mu)^2$ and apply the Markov inequality with $a = c^2$. We obtain
    $$P\bigl((X - \mu)^2 \geq c^2\bigr) \leq \frac{E[(X - \mu)^2]}{c^2} = \frac{\sigma^2}{c^2}$$
    The event $(X - \mu)^2 \geq c^2$ is identical to the event $|X - \mu| \geq c$, so this is exactly the Chebyshev inequality.
  • For a similar derivation that bypasses the Markov inequality, assume for simplicity that $X$ is a continuous random variable, introduce the function
    $$g(x) = \begin{cases} 0, & \text{if } |x - \mu| < c \\ c^2, & \text{if } |x - \mu| \geq c \end{cases}$$
    note that $(x - \mu)^2 \geq g(x)$ for all $x$, and write
    $$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\, dx \geq \int_{-\infty}^{\infty} g(x) f_X(x)\, dx = c^2 P(|X - \mu| \geq c)$$

  • The Chebyshev inequality tends to be more powerful than the Markov inequality (the bounds that it provides are more accurate), because it also uses information on the variance of $X$.
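
A quick comparison of the two bounds on the same tail event (again a sketch of my own; the exponential distribution is an arbitrary choice with $\mu = \sigma^2 = 1$, and the Chebyshev bound is applied via $P(X \geq a) \leq P(|X - \mu| \geq a - \mu)$ for $a > \mu$):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.exponential(scale=1.0, size=1_000_000)   # Exponential(1): mu = 1, sigma^2 = 1
mu, var = 1.0, 1.0

for a in [2.0, 3.0, 5.0]:
    exact = (X >= a).mean()                       # Monte Carlo estimate of P(X >= a)
    markov = mu / a                               # Markov: P(X >= a) <= E[X]/a
    chebyshev = var / (a - mu) ** 2               # Chebyshev: P(|X - mu| >= a - mu) <= var/(a - mu)^2
    print(f"a={a}:  exact ~ {exact:.4f}   Markov = {markov:.3f}   Chebyshev = {chebyshev:.3f}")

# For larger a the Chebyshev bound is tighter than the Markov bound here,
# though both sit far above the true (exponentially small) tail.
```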

Example 5.3. Upper Bounds in the Chebyshev Inequality.

  • When $X$ is known to take values in a range $[a, b]$, we claim that $\boldsymbol{\sigma^2 \leq (b - a)^2/4}$. Thus, if $\sigma^2$ is unknown, we may use the bound $(b - a)^2/4$ in place of $\sigma^2$ in the Chebyshev inequality, and obtain
    $$P(|X - \mu| \geq c) \leq \frac{(b - a)^2}{4c^2}, \qquad \text{for all } c > 0$$
  • To verify our claim, note that for any constant $\gamma$, we have
    $$E[(X - \gamma)^2] = E[X^2] - 2E[X]\gamma + \gamma^2$$
    and the above quadratic is minimized when $\gamma = E[X]$. It follows that
    $$\sigma^2 = E\bigl[(X - E[X])^2\bigr] \leq E[(X - \gamma)^2], \qquad \text{for all } \gamma$$
    By letting $\gamma = (a + b)/2$, we obtain
    $$\sigma^2 \leq E\left[\left(X - \frac{a + b}{2}\right)^2\right] = E[(X - a)(X - b)] + \frac{(b - a)^2}{4} \leq \frac{(b - a)^2}{4}$$
    where the last inequality holds because $(X - a)(X - b) \leq 0$ whenever $X$ takes values in $[a, b]$. The bound is satisfied with equality when $X$ is the random variable that takes the two extreme values $a$ and $b$ with equal probability $1/2$.
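
A small numerical check of the claim (my own sketch; the particular distributions are arbitrary examples supported on $[0, 1]$):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b = 0.0, 1.0
bound = (b - a) ** 2 / 4                          # claimed maximum variance: 0.25

uniform = rng.uniform(a, b, size=1_000_000)       # spread evenly over [a, b]
u_shaped = rng.beta(0.5, 0.5, size=1_000_000)     # pushes mass toward the endpoints
two_point = rng.choice([a, b], size=1_000_000)    # extreme values with probability 1/2 each

for name, x in [("uniform", uniform), ("beta(0.5, 0.5)", u_shaped), ("two-point", two_point)]:
    print(f"{name:15s} var = {x.var():.4f}   bound = {bound}")

# The two-point distribution attains the bound; the other distributions fall below it.
```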

Problem 2. The Chernoff bound.
The Chernoff bound is a powerful tool that relies on the transform associated with a random variable, and provides bounds on the probabilities of certain tail events.

  • (a) Show that the inequality
    $$P(X \geq a) \leq e^{-sa} M(s)$$
    holds for every $a$ and every $s \geq 0$, where $M(s) = E[e^{sX}]$ is the transform associated with the random variable $X$, assumed to be finite in a small open interval containing $s = 0$.
  • (b) Show that the inequality
    $$P(X \leq a) \leq e^{-sa} M(s)$$
    holds for every $a$ and every $s \leq 0$.
  • (c) Show that the inequality
    $$P(X \geq a) \leq e^{-\phi(a)}$$
    holds for every $a$, where
    $$\phi(a) = \max_{s \geq 0}\bigl(sa - \ln M(s)\bigr)$$
    (A numerical illustration of $\phi(a)$ is sketched right after this problem statement.)
  • (d) Show that if $a > E[X]$, then $\phi(a) > 0$.
  • (e) Apply the result of part (c) to obtain a bound for $P(X \geq a)$, for the case where $X$ is a standard normal random variable and $a > 0$.
  • (f) Let $X_1, X_2, \ldots$ be independent random variables with the same distribution as $X$. Show that for any $a > E[X]$, we have
    $$P\left(\frac{1}{n}\sum_{i=1}^n X_i \geq a\right) \leq e^{-n\phi(a)}$$
    so that the probability that the sample mean exceeds the mean by a certain amount decreases exponentially with $n$.
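
Before working through the solution, here is a sketch of how $\phi(a)$ can be computed in practice (my own illustration; the exponential distribution, whose transform is $M(s) = 1/(1 - s)$ for $s < 1$, is an arbitrary choice and not part of the problem):

```python
import numpy as np

# Chernoff exponent phi(a) = max_{s >= 0} (s*a - ln M(s)) for an Exponential(1)
# random variable, approximated by a grid search over s in [0, 1).
def phi(a, s_grid=np.linspace(0.0, 0.999, 10_000)):
    log_M = -np.log(1.0 - s_grid)                # ln M(s) = -ln(1 - s) for this distribution
    return np.max(s_grid * a - log_M)

for a in [1.5, 2.0, 3.0]:
    # closed form for this distribution: phi(a) = a - 1 - ln(a), attained at s = 1 - 1/a
    print(f"a={a}:  grid phi(a) = {phi(a):.4f}   closed form = {a - 1 - np.log(a):.4f}")
```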

SOLUTION

  • (a) Given some $a$ and $s \geq 0$, consider the random variable $Y_a$ defined by
    $$Y_a = \begin{cases} 0, & \text{if } X < a \\ e^{sa}, & \text{if } X \geq a \end{cases}$$
    It is seen that the relation
    $$Y_a \leq e^{sX}$$
    always holds. Thus,
    $$M(s) = E[e^{sX}] \geq E[Y_a] = P(X \geq a)\, e^{sa}$$
    from which we obtain
    $$P(X \geq a) \leq e^{-sa} M(s)$$
  • (b) We define $Y_a$ by
    $$Y_a = \begin{cases} e^{sa}, & \text{if } X \leq a \\ 0, & \text{if } X > a \end{cases}$$
    Since $s \leq 0$, the relation
    $$Y_a \leq e^{sX}$$
    always holds. Thus,
    $$M(s) = E[e^{sX}] \geq E[Y_a] = P(X \leq a)\, e^{sa}$$
    from which we obtain
    $$P(X \leq a) \leq e^{-sa} M(s)$$
  • (c) Since the inequality from part (a) is valid for every $s \geq 0$, we obtain
    $$\begin{aligned}P(X \geq a) &\leq \min_{s \geq 0}\bigl(e^{-sa} M(s)\bigr) \\ &= \min_{s \geq 0} e^{-(sa - \ln M(s))} \\ &= e^{-\max_{s \geq 0}(sa - \ln M(s))} \\ &= e^{-\phi(a)}\end{aligned}$$
  • (d) For $s = 0$, we have
    $$sa - \ln M(s) = 0$$
    since $M(0) = 1$. Furthermore,
    $$\frac{d}{ds}\bigl(sa - \ln M(s)\bigr)\bigg|_{s=0} = a - \frac{1}{M(s)} \cdot \frac{d}{ds} M(s)\bigg|_{s=0} = a - E[X] > 0$$
    Since the function $sa - \ln M(s)$ is zero and has a positive derivative at $s = 0$, it must be positive when $s$ is positive and small. It follows that the maximum $\phi(a)$ of the function $sa - \ln M(s)$ over all $s \geq 0$ is also positive.
  • (e) For a standard normal random variable $X$, we have $M(s) = e^{s^2/2}$. Therefore, $sa - \ln M(s) = sa - s^2/2$, which is maximized over $s \geq 0$ at $s = a$ (recall that $a > 0$), giving $\phi(a) = a^2/2$. Thus,
    $$P(X \geq a) \leq e^{-a^2/2}$$
  • (f) Let $Y = X_1 + \cdots + X_n$. Using the result of part (c), we have
    $$P\left(\frac{1}{n}\sum_{i=1}^n X_i \geq a\right) = P(Y \geq na) \leq e^{-\phi_Y(na)}$$
    where, since the transform of a sum of $n$ independent copies of $X$ is $M_Y(s) = \bigl(M(s)\bigr)^n$,
    $$\begin{aligned}\phi_Y(na) &= \max_{s \geq 0}\bigl(nsa - \ln M_Y(s)\bigr) \\ &= \max_{s \geq 0}\bigl(nsa - \ln M(s)^n\bigr) \\ &= n\max_{s \geq 0}\bigl(sa - \ln M(s)\bigr) \\ &= n\phi(a)\end{aligned}$$
    Thus,
    $$P\left(\frac{1}{n}\sum_{i=1}^n X_i \geq a\right) \leq e^{-n\phi(a)}$$
    Note that when $a > E[X]$, part (d) asserts that $\phi(a) > 0$, so the probability of interest decreases exponentially with $n$.
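
A minimal Monte Carlo check of this exponential decay (my own sketch, not from the text): with standard normal samples, part (e) gives $\phi(a) = a^2/2$, so the bound on the sample-mean tail is $e^{-na^2/2}$.

```python
import numpy as np

rng = np.random.default_rng(4)
a = 0.5                          # threshold above E[X] = 0
phi_a = a ** 2 / 2               # Chernoff exponent for the standard normal (part (e))

for n in [10, 20, 40]:
    # 300,000 realizations of the sample mean of n standard normal variables
    M_n = rng.standard_normal((300_000, n)).mean(axis=1)
    estimate = (M_n >= a).mean()
    bound = np.exp(-n * phi_a)
    print(f"n={n:2d}  P(sample mean >= {a}) ~ {estimate:.2e}   bound e^(-n*phi) = {bound:.2e}")

# Both the estimate and the bound fall off exponentially in n, with the bound always above the estimate.
```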

Problem 3. Jensen inequality.
A twice differentiable real-valued function $f$ of a single variable is called convex if its second derivative $(d^2 f/dx^2)(x)$ is nonnegative for all $x$ in its domain of definition.

  • (a) Show that if $f$ is twice differentiable and convex, then the first-order Taylor approximation of $f$ is an underestimate of the function, that is,
    $$f(a) + (x - a)\frac{df}{dx}(a) \leq f(x)$$
    for every $a$ and $x$.
  • (b) Show that if $f$ has the property in part (a), and if $X$ is a random variable, then
    $$f(E[X]) \leq E[f(X)]$$

SOLUTION

  • (a) Since $f$ is convex, its derivative $df/dx$ is nondecreasing, so $\frac{df}{dx}(t) \geq \frac{df}{dx}(a)$ for $t \geq a$ (the case $x < a$ is handled symmetrically). Hence
    $$f(x) = f(a) + \int_a^x \frac{df}{dx}(t)\, dt \geq f(a) + \int_a^x \frac{df}{dx}(a)\, dt = f(a) + (x - a)\frac{df}{dx}(a)$$
  • (b) Since the inequality from part (a) is valid for every possible value $x$ of the random variable $X$, we obtain
    $$f(a) + (X - a)\frac{df}{dx}(a) \leq f(X)$$
    We now choose $a = E[X]$ and take expectations, to obtain
    $$f(E[X]) + (E[X] - E[X])\frac{df}{dx}(E[X]) \leq E[f(X)]$$
    and therefore
    $$f(E[X]) \leq E[f(X)]$$
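
As a final sanity check (my own sketch; the uniform distribution and the two convex functions are arbitrary choices), the Jensen inequality $f(E[X]) \leq E[f(X)]$ can be verified numerically:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(0.0, 1.0, size=1_000_000)   # bounded support keeps E[f(X)] finite

for name, f in [("x^2", np.square), ("exp", np.exp)]:
    lhs = f(X.mean())                        # f(E[X]), with E[X] estimated by the sample mean
    rhs = f(X).mean()                        # Monte Carlo estimate of E[f(X)]
    print(f"f = {name:4s}:  f(E[X]) ~ {lhs:.3f}   E[f(X)] ~ {rhs:.3f}")

# In both cases f(E[X]) <= E[f(X)], as the Jensen inequality asserts.
```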

Reposted from blog.csdn.net/weixin_42437114/article/details/113944888