Summary of Common Probability Distributions

Copyright notice: this is the author's original post; do not repost without permission. https://blog.csdn.net/crazy_scott/article/details/82818042

Discrete

Bernoulli distribution

  • pmf
    • f_X(x) = P(X=x) =\left\{\begin{aligned}(1-p)^{1-x}p^x & \quad \text{for } x = 0 \text{ or } 1\\ 0 & \quad\text{otherwise}\end{aligned}\right.
  • expectation
    • E(X) = p
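As a quick numerical check (a sketch added to these notes, with an arbitrary choice of p), the Bernoulli pmf and its expectation can be evaluated directly in Python:

```python
def bernoulli_pmf(x, p):
    """P(X = x) for X ~ Bernoulli(p); zero outside {0, 1}."""
    if x in (0, 1):
        return (1 - p) ** (1 - x) * p ** x
    return 0.0

p = 0.3
# E(X) = sum over the support of x * P(X = x), which should equal p
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
```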

Binomial distribution

  • pmf
    • f_X(k) = P(X=k) =\left\{\begin{aligned}C_n^kp^k(1-p)^{n-k} & \quad \text{for } k=0,1,\ldots,n\\ 0 & \quad\text{otherwise}\end{aligned}\right.
  • expectation
    • E(X) = np
  • variance
    • var(X) = np(1-p)
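The mean and variance formulas can be verified against the pmf (a sketch with illustrative n and p, not part of the original notes):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    if 0 <= k <= n:
        return comb(n, k) * p**k * (1 - p) ** (n - k)
    return 0.0

n, p = 10, 0.4
# Mean and variance computed from the pmf should match np and np(1-p)
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
var = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))
```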

Geometric distribution

  • pmf
    • f_X(k) = P(X=k) =\left\{\begin{aligned}p(1-p)^{k-1} & \quad \text{for } k=1,2,3,\ldots\\ 0 & \quad\text{otherwise}\end{aligned}\right.
  • expectation
    • E(X) = \frac{1}{p}
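A truncated sum over the pmf (a sketch with an arbitrary p) approximates the mean 1/p to high accuracy, since the tail decays geometrically:

```python
def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k, k = 1, 2, 3, ..."""
    return p * (1 - p) ** (k - 1) if k >= 1 else 0.0

p = 0.25
# Truncated sum of k * P(X = k); the tail beyond k = 2000 is negligible
mean = sum(k * geometric_pmf(k, p) for k in range(1, 2000))
```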

Negative binomial distribution

  • The negative binomial distribution arises as a generalization of the geometric distribution.

  • Suppose that a sequence of independent trials, each with probability of success p, is performed until there are r successes in all.

    • the r-th success occurs on trial k exactly when the first k-1 trials contain r-1 successes, followed by a success, so the probability is C_{k-1}^{r-1} p^{r-1}(1-p)^{(k-1)-(r-1)} \cdot p
  • pmf

    • f_X(k) = P(X=k) =\left\{\begin{aligned}C_{k-1}^{r-1}p^r(1-p)^{k-r} & \quad \text{for } k=r,r+1,r+2,\ldots\\ 0 & \quad\text{otherwise}\end{aligned}\right.
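A minimal sketch (illustrative r and p) confirming that the pmf sums to 1 and that r = 1 recovers the geometric distribution:

```python
from math import comb

def neg_binomial_pmf(k, r, p):
    """P(X = k): the r-th success occurs on trial k, k = r, r+1, ..."""
    if k >= r:
        return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)
    return 0.0

r, p = 3, 0.5
# The pmf should sum to 1; the tail beyond k = 200 is negligible here
total = sum(neg_binomial_pmf(k, r, p) for k in range(r, 200))
# With r = 1 this reduces to the geometric pmf p(1-p)^(k-1)
```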

Hypergeometric distribution

  • Suppose that an urn contains n balls, of which r are black and n-r are white. Let X denote the number of black balls drawn when taking m balls without replacement.
  • pmf
    • f_X(k) = P(X=k) =\left\{\begin{aligned}\frac{C_r^kC_{n-r}^{m-k}}{C_n^m} & \quad \max(0,\, m-(n-r))\le k \le \min(m, r)\\ 0 & \quad\text{otherwise}\end{aligned}\right.
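A sketch with an illustrative urn (n = 20 balls, r = 7 black, m = 5 drawn) checking that the pmf sums to 1 and that the mean equals m·r/n:

```python
from math import comb

def hypergeom_pmf(k, n, r, m):
    """P of k black balls when drawing m from n balls (r black), no replacement."""
    if max(0, m - (n - r)) <= k <= min(m, r):
        return comb(r, k) * comb(n - r, m - k) / comb(n, m)
    return 0.0

n, r, m = 20, 7, 5
total = sum(hypergeom_pmf(k, n, r, m) for k in range(m + 1))
# The hypergeometric mean is m * r / n
mean = sum(k * hypergeom_pmf(k, n, r, m) for k in range(m + 1))
```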

Poisson distribution

  • can be derived as the limit of a binomial distribution as the number of trials approaches infinity and the probability of success on each trial approaches zero, with np = \lambda held fixed; \lambda is the expected number of successes
  • pmf
    • P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}, \quad k = 0,1,2,\ldots
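The limiting relationship can be checked numerically (a sketch with arbitrary λ and k; the large n is just an approximation of the limit):

```python
from math import exp, factorial, comb

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k / factorial(k) * exp(-lam)

# Binomial(n, lam/n) approaches Poisson(lam) as n grows with np = lam fixed
lam, k, n = 2.0, 3, 100000
binom = comb(n, k) * (lam / n) ** k * (1 - lam / n) ** (n - k)
```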

Continuous

Uniform distribution

  • A uniform r.v. on the interval [a, b] is a model for what we mean when we say "choose a number at random between a and b".
  • pdf
    • f_X(x) = \left\{\begin{aligned}\frac{1}{b-a} & \quad a\le x \le b\\ 0 & \quad\text{otherwise}\end{aligned}\right.
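A tiny sketch (illustrative interval [2, 5]) verifying by a midpoint Riemann sum that the constant density integrates to 1:

```python
def uniform_pdf(x, a, b):
    """Density of the uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

# Midpoint Riemann sum of the density over [a, b] should be (about) 1
a, b, n = 2.0, 5.0, 3000
dx = (b - a) / n
total = sum(uniform_pdf(a + (i + 0.5) * dx, a, b) * dx for i in range(n))
```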

Exponential distribution

  • The exponential distribution is often used to model lifetimes or waiting times, in which context it is conventional to replace x by t.
  • pdf
    • f_X(x) = \left\{\begin{aligned}\lambda e^{-\lambda x} & \quad x\ge 0\\ 0 & \quad\text{otherwise}\end{aligned}\right.
  • cdf (obtained by integrating the pdf)
    • F_X(x) = \left\{\begin{aligned}1-e^{-\lambda x} & \quad x\ge 0\\ 0 & \quad\text{otherwise}\end{aligned}\right.
  • expectation
    • E(X) = \frac{1}{\lambda}
  • variance
    • var(X) = \frac{1}{\lambda^2}
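A midpoint-rule integration (a sketch with an arbitrary λ) can sanity-check that the mean of an Exponential(λ) r.v. is 1/λ:

```python
from math import exp

lam = 0.5

def exp_pdf(x):
    """Density of Exponential(lam) for x >= 0."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

# Midpoint-rule approximation of E(X) = integral of x * f(x) dx;
# truncating at x = 60 leaves a negligible tail for lam = 0.5
n, upper = 60000, 60.0
dx = upper / n
mean = sum((i + 0.5) * dx * exp_pdf((i + 0.5) * dx) * dx for i in range(n))
```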

property (Poisson)

  • if X and Y are independent Poisson r.v.s with parameters \theta_1 and \theta_2, then X+Y\sim Poisson(\theta_1+\theta_2)
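The additivity property can be checked by convolving the two pmfs at a point (a sketch with illustrative parameters):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k / factorial(k) * exp(-lam)

t1, t2 = 1.5, 2.5
k = 4
# P(X + Y = k) for independent X, Y is the convolution of their pmfs;
# it should equal the Poisson(t1 + t2) pmf at k
conv = sum(poisson_pmf(j, t1) * poisson_pmf(k - j, t2) for j in range(k + 1))
```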

Gamma distribution

  • pdf
    • g(t) = \left\{\begin{aligned}\frac{\lambda^\alpha}{\Gamma(\alpha)}t^{\alpha-1}e^{-\lambda t} & \quad t\ge 0\\ 0 & \quad\text{otherwise}\end{aligned}\right.
  • \Gamma(x) = \int_0^\infty u^{x-1}e^{-u}\,du, \quad x>0
  • expectation
    • E(X) = \frac{\alpha}{\lambda}
  • variance
    • Var(X) = \frac{\alpha}{\lambda^2}
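A sketch of the gamma density using Python's `math.gamma` (an arbitrary λ; this also checks that α = 1 collapses to the exponential density, as noted below):

```python
from math import exp, gamma

def gamma_pdf(t, alpha, lam):
    """Gamma(alpha, lam) density; reduces to Exponential(lam) when alpha = 1."""
    if t < 0:
        return 0.0
    return lam**alpha / gamma(alpha) * t ** (alpha - 1) * exp(-lam * t)

lam = 0.7
# With alpha = 1: Gamma(1) = 1 and t**0 = 1, so the density is lam * e^(-lam t)
value = gamma_pdf(2.0, 1, lam)
```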

Property

  • Note that if \alpha = 1, the gamma density coincides with the exponential density.
  • derivation
    • \because \Gamma(\alpha) = \int_0^\infty x^{\alpha-1}e^{-x}\,dx
    • substituting x = \lambda t \to \Gamma(\alpha) = \lambda^\alpha \int_0^\infty t^{\alpha-1}e^{-\lambda t}\,dt
    • \therefore \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty t^{\alpha-1}e^{-\lambda t}\,dt = 1
    • \therefore g(t) = \frac{\lambda^\alpha}{\Gamma(\alpha)}t^{\alpha-1}e^{-\lambda t} integrates to 1, so it is a valid density
  • \alpha is called a shape parameter for the gamma density
  • Varying \alpha changes the shape of the density
  • \lambda is called a scale parameter
  • Varying \lambda corresponds to changing the units of measurement and does not affect the shape of the density
  • how to understand gamma? For integer \alpha, a Gamma(\alpha, \lambda) r.v. is the waiting time until the \alpha-th event in a Poisson process with rate \lambda, i.e. a sum of \alpha independent Exponential(\lambda) r.v.s

Normal distribution

  • pdf
    • f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty
  • \mu is the mean
  • \sigma is the standard deviation
  • If X \sim N(\mu, \sigma^2) and Y = aX + b, then Y \sim N(a\mu+b,\, a^2\sigma^2)
    • in particular, if X \sim N(\mu,\sigma^2), then Z = \frac{X-\mu}{\sigma}\sim N(0,1)
  • if X and Y are jointly normal (for example, independent normals), then aX+bY \sim N(a\mu_X+b\mu_Y,\, a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y)
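A Monte Carlo sketch (illustrative parameters; a sanity check, not a proof) of the linear-transformation rule Y = aX + b having mean a\mu + b:

```python
import random

random.seed(0)
mu, sigma = 3.0, 2.0
a, b = 0.5, 1.0
# Sample Y = a*X + b with X ~ N(mu, sigma^2); the sample mean should be
# close to a*mu + b = 2.5
samples = [a * random.gauss(mu, sigma) + b for _ in range(200000)]
mean = sum(samples) / len(samples)
```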

property

  • if X, Y \sim N(0,1) are independent, then U = \frac{X}{Y} is a Cauchy r.v. (lec3)
    • f_U(u) = \frac{1}{\pi (u^2+1)}

Exponential family

  • A family of pdfs or pmfs is called an exponential family if it can
    be expressed as:
    • p(x,\theta) = H(x)\exp(\theta^T \phi(x) - A(\theta))
    • H(x) is the base measure, \phi(x) is the sufficient statistic, and A(\theta) is the log-normalizer that ensures the density sums (or integrates) to 1
  • It is very helpful for modeling heterogeneous data in the era of big data.
  • The Bernoulli, Gaussian, Binomial, Poisson, Exponential, Weibull, Laplace, Gamma, Beta, Multinomial, and Wishart distributions all belong to the exponential family
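As a concrete instance, the Bernoulli pmf can be rewritten in this form (a sketch; the parameterization θ = log(p/(1-p)), φ(x) = x, A(θ) = log(1 + e^θ), H(x) = 1 is the standard one):

```python
from math import exp, log

def bernoulli_expfam(x, p):
    """Bernoulli(p) pmf evaluated via its exponential-family form:
    theta = log(p/(1-p)), phi(x) = x, A(theta) = log(1 + e^theta), H(x) = 1."""
    theta = log(p / (1 - p))       # natural parameter
    A = log(1 + exp(theta))        # log-normalizer
    return exp(theta * x - A)

# For x in {0, 1} this matches the usual pmf (1-p)^(1-x) * p^x
p = 0.3
```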

Property

  • E(X) = E(E(X|Y))
    • intuition: computing the expectation within each group first and then averaging over groups gives the same result as taking the expectation directly
  • Var(X) = E(Var(X|Y)) + Var(E(X|Y))
    • intuition: the expected within-group variance plus the between-group variance
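A small worked sketch of both laws (a hypothetical two-group setup: Y ~ Bernoulli(0.4), X|Y=0 ~ Uniform(0,1), X|Y=1 ~ Uniform(2,4)):

```python
p = 0.4                                  # P(Y = 1)
e_x_given = {0: 0.5, 1: 3.0}             # E(X|Y): midpoints of the uniforms
var_within = {0: 1 / 12, 1: 4 / 12}      # Var(X|Y): (b-a)^2 / 12 per group

# Law of total expectation: E(X) = E(E(X|Y)) = 0.6*0.5 + 0.4*3.0 = 1.5
e_x = (1 - p) * e_x_given[0] + p * e_x_given[1]

# Law of total variance: within-group part + between-group part
var_between = (1 - p) * (e_x_given[0] - e_x) ** 2 + p * (e_x_given[1] - e_x) ** 2
var_x = (1 - p) * var_within[0] + p * var_within[1] + var_between
```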
