UA MATH564 Probability Theory VI: Foundations of Mathematical Statistics 4, the t Distribution


Definition of the t distribution

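Suppose $X$ and $Y$ are independent, with $X \sim N(\delta,1)$ and $Y\sim \chi^2_n$. Then
$$Z = \frac{X}{\sqrt{Y/n}}\sim t_{n,\delta},$$
where $n$ is called the degrees of freedom and $\delta$ the noncentrality parameter. The case $\delta=0$ is the central t distribution, usually called simply the t distribution. Below we compute the CDF and PDF of the t distribution and present several commonly used properties.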
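The definition can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the reference value $E[Z] = \delta\sqrt{n/2}\,\Gamma(\frac{n-1}{2})/\Gamma(\frac{n}{2})$ (valid for $n>1$) is not part of the notes but follows from the independence of $X$ and $Y$.

```python
import math
import random

def sample_noncentral_t(n, delta, rng):
    """Draw one Z = X / sqrt(Y / n) with X ~ N(delta, 1) and Y ~ chi^2_n."""
    x = rng.gauss(delta, 1.0)
    # chi^2_n is a Gamma distribution with shape n/2 and scale 2
    y = rng.gammavariate(n / 2.0, 2.0)
    return x / math.sqrt(y / n)

def simulate_mean(n, delta, size=50_000, seed=0):
    """Monte Carlo estimate of E[Z]."""
    rng = random.Random(seed)
    return sum(sample_noncentral_t(n, delta, rng) for _ in range(size)) / size

def theoretical_mean(n, delta):
    """E[Z] = E[X] * E[1 / sqrt(Y / n)], valid for n > 1."""
    log_ratio = math.lgamma((n - 1) / 2.0) - math.lgamma(n / 2.0)
    return delta * math.sqrt(n / 2.0) * math.exp(log_ratio)

print(simulate_mean(10, 1.0), theoretical_mean(10, 1.0))
```

The two printed numbers should agree up to Monte Carlo error.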

Probability density of the t distribution

Denote the probability density of $Z$ by $s(x|n,\delta)$ and its distribution function by $S(x|n,\delta)$.

Lemma 1. Suppose $X$ and $Y$ are independent with probability densities $g(x)$ and $h(y)$ respectively, $Y>0$ a.s., and $Z=X/Y$. Write $G$ for the distribution function of $X$. Then
$$f_Z(z) = \int_0^{\infty} t\,g(tz)h(t)\,dt,\quad F_Z(z) = \int_0^{\infty} G(tz)h(t)\,dt,\quad \forall z \in \mathbb{R}.$$
Proof.
$$\begin{aligned} F_Z(z) &= P(Z \le z) = P\left(\frac{X}{Y} \le z\right) = P(X \le zY) = \iint_{x \le zy} g(x)h(y)\,dx\,dy \\ &= \int_{0}^{\infty} h(y)\,dy \int_{-\infty}^{zy} g(x)\,dx = \int_{0}^{\infty} G(zy)h(y)\,dy. \end{aligned}$$
Differentiating under the integral sign, with $\frac{\partial}{\partial z}G(zy) = y\,g(zy)$,
$$f_Z(z) = F_Z'(z) = \int_{0}^{\infty} \frac{\partial}{\partial z}G(zy)\,h(y)\,dy = \int_{0}^{\infty} y\,g(zy)h(y)\,dy.$$

Q.E.D.
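As a quick sanity check of Lemma 1, take a case with a known answer (an illustrative assumption, not an example from the notes): if $X$ and $Y$ are independent $\mathrm{Exp}(1)$ variables, then $Z=X/Y$ has $f_Z(z) = 1/(1+z)^2$ and $F_Z(z) = z/(1+z)$ for $z \ge 0$. The sketch below evaluates the two integrals of the lemma with a plain trapezoid rule; the helper names are hypothetical.

```python
import math

def trapezoid(f, a, b, steps):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

def ratio_pdf(z, upper=50.0, steps=50_000):
    """f_Z(z) = int_0^inf t g(tz) h(t) dt with g(x) = h(x) = exp(-x), z >= 0."""
    return trapezoid(lambda t: t * math.exp(-t * z) * math.exp(-t), 0.0, upper, steps)

def ratio_cdf(z, upper=50.0, steps=50_000):
    """F_Z(z) = int_0^inf G(tz) h(t) dt with G(x) = 1 - exp(-x), z >= 0."""
    return trapezoid(lambda t: (1.0 - math.exp(-t * z)) * math.exp(-t), 0.0, upper, steps)

for z in (0.5, 1.0, 3.0):
    print(z, ratio_pdf(z), 1.0 / (1.0 + z) ** 2, ratio_cdf(z), z / (1.0 + z))
```

The infinite upper limit is truncated at 50, where the integrands are negligible for this choice of densities.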

Next we consider the distributions of $X$ and $\sqrt{Y/n}$ separately. Since $X \sim N(\delta,1)$,
$$g(x) = \frac{1}{\sqrt{2\pi}} \exp \left(-\frac{(x-\delta)^2}{2} \right) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2+\delta^2}{2}} e^{\delta x}.$$

As in our treatment of the chi-square distribution, expand $e^{\delta x}$ as a power series:
$$g(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2+\delta^2}{2}}\sum_{i=0}^{\infty} \frac{(\delta x)^i}{i!}.$$

Since $Y \sim \chi^2_n$, let $W=\sqrt{Y/n}$ and write $H_W$ and $h_W$ for its distribution function and density. For $w>0$,
$$\begin{aligned} H_W(w) &= P(W \le w) = P(\sqrt{Y/n} \le w) = P(Y \le nw^2) = F_Y(nw^2), \\ h_W(w) &= H_W'(w) = 2nw\,f_Y(nw^2) = 2nw \frac{(1/2)^{n/2}}{\Gamma(n/2)}(nw^2)^{\frac{n}{2}-1}e^{-nw^2/2} \\ &= \frac{n^{\frac{n}{2}}w^{n-1}}{2^{\frac{n}{2}-1}\Gamma(\frac{n}{2})}e^{-\frac{nw^2}{2}}. \end{aligned}$$
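The density of $W$ can be sanity-checked numerically: it should integrate to 1, and $E[W^2] = E[Y]/n = 1$. The sketch below uses a midpoint rule on a truncated interval; the function names and the truncation point are illustrative assumptions.

```python
import math

def w_pdf(w, n):
    """Density of W = sqrt(Y / n) with Y ~ chi^2_n, evaluated at w > 0."""
    log_h = ((n / 2) * math.log(n) + (n - 1) * math.log(w)
             - (n / 2 - 1) * math.log(2) - math.lgamma(n / 2)
             - n * w * w / 2)
    return math.exp(log_h)

def w_moment(k, n, upper=8.0, steps=20_000):
    """Midpoint-rule approximation of E[W^k] on (0, upper)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        w = (j + 0.5) * h  # midpoints avoid evaluating log(0)
        total += w ** k * w_pdf(w, n)
    return total * h

for n in (1, 4, 10):
    print(n, w_moment(0, n), w_moment(2, n))
```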
Now apply Lemma 1, with $h_W$ in the role of $h$, to compute the density of the t distribution:
$$\begin{aligned} s(z|n,\delta) &= \int_{0}^{\infty} y\,g(zy)h_W(y)\,dy=\int_{0}^{\infty} y\,\frac{n^{\frac{n}{2}}y^{n-1}}{2^{\frac{n}{2}-1}\Gamma(\frac{n}{2})}e^{-\frac{ny^2}{2}}\frac{1}{\sqrt{2\pi}} e^{-\frac{(zy)^2+\delta^2}{2}}\sum_{i=0}^{\infty} \frac{(\delta zy)^i}{i!}\,dy \\ &= \frac{n^{\frac{n}{2}}}{2^{\frac{n}{2}-1}\Gamma(\frac{n}{2})}\frac{1}{\sqrt{2\pi}}e^{-\frac{\delta^2}{2}}\sum_{i=0}^{\infty}\frac{\delta ^i z^i}{i!}\int_{0}^{\infty} y^{i+n}e^{-\frac{n+z^2}{2}y^2}\, dy. \end{aligned}$$

The remaining integral is evaluated with the Gamma-function technique, substituting $u = \frac{n+z^2}{2}y^2$:
$$\begin{aligned} \int_{0}^{\infty} y^{i+n}e^{-\frac{n+z^2}{2}y^2}\, dy &= \frac{1}{n+z^2} \int_{0}^{\infty} y^{i+n-1}e^{-\frac{n+z^2}{2}y^2}\, d\left( \frac{n+z^2}{2}y^2\right) \\ &= \frac{2^{\frac{i+n-1}{2}}}{(n+z^2)^{\frac{i+n+1}{2}}} \int_{0}^{\infty} \left( \frac{n+z^2}{2}y^2\right) ^{\frac{i+n-1}{2}}e^{-\frac{n+z^2}{2}y^2}\, d\left( \frac{n+z^2}{2}y^2\right) \\ &=\frac{2^{\frac{i+n-1}{2}}}{(n+z^2)^{\frac{i+n+1}{2}}}\, \Gamma\left(\frac{n+i+1}{2}\right). \end{aligned}$$

Substituting into the density and simplifying:
$$s(z|n,\delta) = \frac{n^{\frac{n}{2}}}{\sqrt{\pi}\,\Gamma(\frac{n}{2})} \frac{e^{-\frac{\delta^2}{2}}}{(n+z^2)^{\frac{n+1}{2}}}\sum_{i=0}^{\infty} \Gamma\left(\frac{n+i+1}{2}\right)\frac{(\delta z)^i}{i!} \left( \frac{2}{n+z^2} \right)^{\frac{i}{2}}.$$

When $\delta=0$, only the $i=0$ term of the series remains, and
$$s(z|n,0) = \frac{\Gamma(\frac{n+1}{2})}{\sqrt{n\pi}\,\Gamma(\frac{n}{2})}\left(1+\frac{z^2}{n} \right)^{-\frac{n+1}{2}}.$$
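The series density, whose $i$-th coefficient is $\Gamma(\frac{n+i+1}{2})\frac{(\delta z)^i}{i!}\left(\frac{2}{n+z^2}\right)^{i/2}$, can be evaluated by truncating the sum. The sketch below is one possible implementation under stated assumptions: the function names and the cutoff `terms=100` are hypothetical, and the truncation is only intended for moderate $|\delta|$, since for $\delta z<0$ the series alternates and loses accuracy when $|\delta|$ is large.

```python
import math

def central_t_pdf(z, n):
    """Closed-form density s(z | n, 0)."""
    log_c = math.lgamma((n + 1) / 2) - math.lgamma(n / 2) - 0.5 * math.log(n * math.pi)
    return math.exp(log_c - (n + 1) / 2 * math.log1p(z * z / n))

def noncentral_t_pdf(z, n, delta, terms=100):
    """Truncated series for s(z | n, delta)."""
    q = n + z * z
    u = delta * z * math.sqrt(2.0 / q)  # the i-th term carries the factor u**i
    series = math.exp(math.lgamma((n + 1) / 2))  # i = 0 term
    if u != 0.0:
        log_abs_u = math.log(abs(u))
        for i in range(1, terms):
            # work in logs so large Gamma values and factorials do not overflow
            log_mag = math.lgamma((n + i + 1) / 2) + i * log_abs_u - math.lgamma(i + 1)
            sign = -1.0 if (u < 0.0 and i % 2 == 1) else 1.0
            series += sign * math.exp(log_mag)
    log_pref = ((n / 2) * math.log(n) - 0.5 * math.log(math.pi) - math.lgamma(n / 2)
                - delta * delta / 2 - (n + 1) / 2 * math.log(q))
    return math.exp(log_pref) * series

def total_mass(n, delta, lo=-30.0, hi=30.0, step=0.02):
    """Riemann-sum approximation of the integral of the density over [lo, hi]."""
    count = int(round((hi - lo) / step)) + 1
    return sum(noncentral_t_pdf(lo + k * step, n, delta) for k in range(count)) * step

print(noncentral_t_pdf(0.7, 5, 0.0), central_t_pdf(0.7, 5))
print(total_mass(5, 1.0))
```

With $\delta=0$ the series collapses to the closed form, and the total mass should be close to 1.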

Properties of the t distribution

Property 1: If $X_1,\cdots,X_n \sim_{iid} N(\mu,\sigma^2)$, then
$$Z = \frac{\sqrt{n}(\bar{X}-b)}{\sqrt{\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})^2}} \sim t_{n-1,\delta},$$
where $\delta = \frac{\sqrt{n}(\mu-b)}{\sigma}$.
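For concreteness, the one-sample statistic and its noncentrality parameter can be computed as follows. This is a minimal sketch with hypothetical function names; `statistics.stdev` is the sample standard deviation with the $n-1$ divisor, which matches the denominator of $Z$.

```python
import math
import statistics

def one_sample_t(xs, b):
    """Z = sqrt(n) * (mean - b) / S, with S the (n - 1)-divisor sample std."""
    n = len(xs)
    return math.sqrt(n) * (statistics.mean(xs) - b) / statistics.stdev(xs)

def one_sample_delta(n, mu, b, sigma):
    """Noncentrality parameter delta = sqrt(n) * (mu - b) / sigma."""
    return math.sqrt(n) * (mu - b) / sigma

# Toy data: mean 3 and S^2 = 2.5, so Z = sqrt(5) * (3 - 2) / sqrt(2.5)
print(one_sample_t([1, 2, 3, 4, 5], 2))
print(one_sample_delta(5, mu=3.0, b=2.0, sigma=1.0))
```

When $b=\mu$ the noncentrality parameter is zero and $Z$ follows the central $t_{n-1}$ distribution, which is the setting of the usual one-sample t test.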

Property 2: If $X_1,\cdots,X_m \sim_{iid} N(a,\sigma^2)$ and $Y_1,\cdots,Y_n \sim_{iid} N(b,\sigma^2)$, with all of these variables mutually independent, then
$$Z = \sqrt{\frac{mn(m+n-2)}{m+n}}\frac{\bar{X}-\bar{Y}-c}{\sqrt{\sum_{i=1}^m(X_i-\bar{X})^2+\sum_{j=1}^n (Y_j - \bar{Y})^2}} \sim t_{m+n-2,\delta},$$

where $\delta = \sqrt{\frac{mn}{m+n}}\frac{a-b-c}{\sigma}$.
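The two-sample statistic can be sketched the same way. The function names below are hypothetical; `pooled_form` is included only to illustrate that the statistic coincides with the familiar pooled-variance form $(\bar X-\bar Y-c)/\bigl(S_p\sqrt{1/m+1/n}\bigr)$, where $S_p^2$ is the total sum of squares divided by $m+n-2$.

```python
import math

def sum_sq(values):
    """Sum of squared deviations from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def two_sample_t(xs, ys, c=0.0):
    """Two-sample statistic Z under a common (unknown) variance."""
    m, n = len(xs), len(ys)
    diff = sum(xs) / m - sum(ys) / n - c
    scale = math.sqrt(m * n * (m + n - 2) / (m + n))
    return scale * diff / math.sqrt(sum_sq(xs) + sum_sq(ys))

def pooled_form(xs, ys, c=0.0):
    """Same statistic written with the pooled variance estimate S_p^2."""
    m, n = len(xs), len(ys)
    sp2 = (sum_sq(xs) + sum_sq(ys)) / (m + n - 2)
    return (sum(xs) / m - sum(ys) / n - c) / math.sqrt(sp2 * (1 / m + 1 / n))

print(two_sample_t([1, 2, 3], [2, 4, 6]), pooled_form([1, 2, 3], [2, 4, 6]))
```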

Property 3: If $X_n \sim t_{n,\delta}$, then $X_n \to_d N(\delta,1)$ as $n\to\infty$.

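Properties 1 and 2 are both straightforward. Property 1 is the basis of the t test for the mean of a single normal population, and Property 2 is the basis of the t test for the means of two normal populations. For Property 1, rewrite $Z$ as
$$Z = \frac{\sqrt{n}(\bar{X}-b)}{\sqrt{\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})^2}} = \frac{\frac{\sqrt{n}(\bar{X}-b)}{\sigma}}{\sqrt{\frac{1}{n-1}\sum_{i=1}^n\left(\frac{X_i-\mu}{\sigma}-\frac{\bar{X}-\mu}{\sigma}\right)^2}}.$$

The numerator satisfies $\frac{\sqrt{n}(\bar{X}-b)}{\sigma} \sim N\left(\frac{\sqrt{n}(\mu-b)}{\sigma},1\right)$ and is independent of the denominator. Foundations of Mathematical Statistics 1 already showed that the sum of squares in the denominator follows $\chi^2_{n-1}$, so the denominator has the form $\sqrt{\chi^2_{n-1}/(n-1)}$ required by the definition. The proof of Property 2 is similar. We now prove Property 3: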

Proof (definition plus the law of large numbers).
By definition, $X_n$ has the same distribution as $Z/\sqrt{\frac{1}{n}\sum_{i=1}^n Y_i^2}$, where $Z \sim N(\delta,1)$ is independent of all the $Y_i$ and $Y_1,\cdots,Y_n \sim _{iid} N(0,1)$. By the weak law of large numbers,
$$\frac{1}{n}\sum_{i=1}^n Y_i^2 \to_P E[Y_1^2] = 1.$$

The denominator therefore converges to 1 in probability, and by Slutsky's theorem, as $n\to \infty$, $X_n \to_d Z \sim N(\delta,1)$.
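Property 3 can be illustrated numerically in the central case $\delta=0$, where the density has a closed form: the largest gap between the $t_n$ density and the standard normal density shrinks as $n$ grows. The sketch below covers only this special case, and the grid and function names are illustrative assumptions.

```python
import math

def central_t_pdf(z, n):
    """Closed-form density of the central t distribution with n degrees of freedom."""
    log_c = math.lgamma((n + 1) / 2) - math.lgamma(n / 2) - 0.5 * math.log(n * math.pi)
    return math.exp(log_c - (n + 1) / 2 * math.log1p(z * z / n))

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def max_gap(n, lo=-6.0, hi=6.0, steps=1200):
    """Largest absolute difference between the two densities on a grid."""
    h = (hi - lo) / steps
    return max(abs(central_t_pdf(lo + k * h, n) - normal_pdf(lo + k * h))
               for k in range(steps + 1))

for n in (1, 10, 100, 1000):
    print(n, max_gap(n))
```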


Reposted from blog.csdn.net/weixin_44207974/article/details/105340065