Exponential distribution family and generalized linear models

1. Exponential distribution family

1.1 Definition

  The exponential family of distributions (also called the exponential distribution family) is not the same thing as the exponential distribution. The exponential family does not refer to one specific distribution; it is a general term for the class of distributions that share a particular form. In probability and statistics, a probability distribution is said to belong to the exponential family if it can be written in the following form:
$$p(y;\theta)=b(y)\exp\left(\eta(\theta)T(y)-A(\theta)\right)$$
Here $\eta$ is the natural parameter of the distribution; $T(y)$ is the sufficient statistic, usually $T(y)=y$; $A(\theta)$ is the log partition function, and $e^{-A(\theta)}$ acts as a normalizing factor that ensures the probability density integrates to 1 over the random variable $y$. Once $T$, $A$, and $b$ are fixed, a family of distributions is determined, with $\eta$ as its parameter.

Commonly used distributions such as normal distribution, Bernoulli distribution, exponential distribution, Poisson distribution, and gamma distribution all belong to the exponential distribution family.
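
As a quick check that the exponential distribution is itself one member of the family, here is a short worked sketch. The rate symbol $\lambda$ and the support $y \ge 0$ are notation assumed here; they do not appear in the original text.

```latex
\begin{align}
p(y;\lambda) &= \lambda e^{-\lambda y}
             = \exp\left(-\lambda y + \log\lambda\right), \qquad y \ge 0 \\
b(y) &= 1,\quad \eta(\lambda) = -\lambda,\quad T(y) = y,\quad
A(\lambda) = -\log\lambda = -\log(-\eta)
\end{align}
```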

1.2 Bernoulli distribution

The Bernoulli distribution with parameter $\theta$ can be rewritten as follows:
$$\begin{aligned} p(y;\theta)&=\theta^y(1-\theta)^{1-y} \\ &=\exp\left(y\log\theta+(1-y)\log(1-\theta)\right) \\ &=\exp\left(\log\frac{\theta}{1-\theta}\,y+\log(1-\theta)\right) \end{aligned}$$
Comparing this with the general exponential family form gives:
$$\begin{aligned} b(y)&=1 \\ \eta(\theta)&=\log\frac{\theta}{1-\theta} \\ T(y)&=y \\ A(\theta)&=-\log(1-\theta)=\log\left(1+e^{\eta(\theta)}\right) \end{aligned}$$
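
The decomposition above can be checked numerically. The following is a minimal sketch, not part of the original article: the function names `bernoulli_pmf` and `bernoulli_expfam` are made up for illustration, and only the Python standard library is used.

```python
import math

def bernoulli_pmf(y, theta):
    """Standard Bernoulli pmf: theta^y * (1 - theta)^(1 - y)."""
    return theta ** y * (1.0 - theta) ** (1 - y)

def bernoulli_expfam(y, theta):
    """The same pmf written as b(y) * exp(eta * T(y) - A)."""
    b = 1.0                                   # b(y) = 1
    eta = math.log(theta / (1.0 - theta))     # natural parameter (log-odds)
    T = y                                     # sufficient statistic T(y) = y
    A = math.log(1.0 + math.exp(eta))         # log partition function
    return b * math.exp(eta * T - A)

# Both forms should agree for every outcome and several parameter values.
for theta in (0.1, 0.5, 0.9):
    for y in (0, 1):
        assert math.isclose(bernoulli_pmf(y, theta), bernoulli_expfam(y, theta))
```

The loop only compares the two forms; it does not estimate anything from data.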

1.3 Gaussian distribution

For a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, we have:
$$\begin{aligned} p(y;\mu,\sigma)&=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(y-\mu)^2}{2\sigma^2}} \\ &=\frac{1}{\sqrt{2\pi}}e^{\eta(\mu,\sigma)T(y)-\log\sigma-\frac{\mu^2}{2\sigma^2}} \end{aligned}$$
Matching terms with the general form (here $\eta$ and $T(y)$ are two-component vectors and $\eta T(y)$ denotes their inner product) gives:
$$\begin{aligned} b(y)&=\frac{1}{\sqrt{2\pi}} \\ \eta(\mu,\sigma)&=\left[\frac{\mu}{\sigma^2},\,-\frac{1}{2\sigma^2}\right] \\ T(y)&=[y,\,y^2] \\ A(\mu,\sigma)&=\frac{\mu^2}{2\sigma^2}+\log\sigma \end{aligned}$$
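
The same kind of numerical check works for the Gaussian case. This is a sketch under the decomposition stated above; the function names are hypothetical and only the standard library is used.

```python
import math

def gaussian_pdf(y, mu, sigma):
    """Standard Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def gaussian_expfam(y, mu, sigma):
    """The same density written as b(y) * exp(<eta, T(y)> - A)."""
    b = 1.0 / math.sqrt(2.0 * math.pi)
    eta = (mu / sigma ** 2, -1.0 / (2.0 * sigma ** 2))   # natural parameters
    T = (y, y ** 2)                                      # sufficient statistics
    A = mu ** 2 / (2.0 * sigma ** 2) + math.log(sigma)   # log partition function
    inner = eta[0] * T[0] + eta[1] * T[1]                # inner product <eta, T(y)>
    return b * math.exp(inner - A)

# The two forms should agree at arbitrary points.
for y, mu, sigma in [(0.0, 0.0, 1.0), (0.7, 1.5, 2.0), (-3.0, 0.5, 0.3)]:
    assert math.isclose(gaussian_pdf(y, mu, sigma), gaussian_expfam(y, mu, sigma))
```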

1.4 Other exponential family distributions

  • Multinomial distribution, used to model multi-class classification problems;
  • Poisson distribution, used to model counts, such as the number of visitors to a website or the number of customers in a store;
  • Gamma distribution and exponential distribution, used to model time intervals, such as waiting times;
  • Beta distribution and Dirichlet distribution, used to model distributions over probabilities;
  • Wishart distribution, used to model distributions over covariance matrices.

2. Generalized linear model (GLM)

The well-known linear regression and logistic regression models are both GLMs: linear regression assumes a Gaussian distribution for $y$, and logistic regression assumes a Bernoulli distribution. At first glance, however, it is not obvious why this is so.

2.1 Three assumptions

  • Given the independent variable $x$ and the parameters $\theta$, the dependent variable $y$ follows a distribution in the exponential family.
  • Given $x$, the ultimate goal is to find the expectation of $T(y)$, that is, $E[T(y)\mid x]$.
  • The natural parameter $\eta$ is a linear function of the independent variable $x$, that is, $\eta=\theta^Tx$.

A generalized linear model extends the standard linear model by fitting the conditional mean (expectation) of $y$, given $x$ and the parameters $\theta$, under the assumption that $y$ follows a distribution in the exponential family.

2.2 Bernoulli distribution

The Bernoulli distribution applies to binary classification problems, so we choose $p(y\mid x;\theta)\sim \mathrm{Bernoulli}(\phi)$. The mean of $\mathrm{Bernoulli}(\phi)$ is $\phi$, which is also the only parameter of the distribution in its exponential family form. According to the derivation in section 1.2:
$$\begin{aligned} h_\theta(x)&=E[y\mid x;\theta] \\ &=\phi \end{aligned}$$

$$\begin{aligned} \eta&=\log\frac{\phi}{1-\phi} \\ &=\theta^Tx \end{aligned}$$
Solving for $\phi$ gives:
$$\begin{aligned} h_\theta(x)=\phi&=\frac{1}{1+e^{-\eta}} \\ &=\frac{1}{1+e^{-\theta^Tx}} \end{aligned}$$
This is exactly the hypothesis function of logistic regression, which is why logistic regression corresponds to the assumption that $y$ follows a Bernoulli distribution.
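
The logistic hypothesis derived above can be sketched in a few lines of Python. The parameter vector `theta` and input `x` below are made-up illustrative values, not data from the article, and the first entry of `x` is assumed to be a constant 1 for the intercept.

```python
import math

def sigmoid(eta):
    """Mean of a Bernoulli distribution as a function of its natural parameter."""
    return 1.0 / (1.0 + math.exp(-eta))

def logistic_hypothesis(theta, x):
    """h_theta(x) = 1 / (1 + exp(-theta^T x)), using the GLM assumption eta = theta^T x."""
    eta = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(eta)

theta = [0.5, -1.0, 2.0]   # hypothetical parameters
x = [1.0, 2.0, 0.5]        # hypothetical input; x[0] = 1 acts as the intercept term
phi = logistic_hypothesis(theta, x)

# Mapping phi back through the log-odds recovers eta = theta^T x = -0.5.
log_odds = math.log(phi / (1.0 - phi))
assert math.isclose(log_odds, -0.5)
```

The final check illustrates that the sigmoid and the log-odds function are inverses of each other, which is the step used in the derivation.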

2.3 Gaussian distribution

For a Gaussian distribution, the mean of $y$ is the parameter $\mu$. According to the derivation in section 1.3:
$$h_\theta(x)=E[y\mid x;\theta]=\mu=\eta=\theta^Tx \quad (\text{assuming } \sigma=1)$$
This is exactly the hypothesis function of linear regression, which is why linear regression corresponds to the assumption that $y$ follows a Gaussian distribution.
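
One consequence of the Gaussian assumption with $\sigma=1$ is that the negative log-likelihood differs from half the sum of squared errors only by a constant, so maximizing the likelihood is the same as least squares. The sketch below illustrates this; the data `xs`, `ys` and the parameter vectors are made-up values, and the function names are hypothetical.

```python
import math

def predict(theta, x):
    """Linear regression hypothesis: h_theta(x) = theta^T x."""
    return sum(t * xi for t, xi in zip(theta, x))

def neg_log_likelihood(theta, xs, ys):
    """Negative log-likelihood under y | x ~ N(theta^T x, 1)."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu = predict(theta, x)
        total += 0.5 * math.log(2.0 * math.pi) + 0.5 * (y - mu) ** 2
    return total

def half_sse(theta, xs, ys):
    """Half the sum of squared errors, the usual least-squares objective."""
    return 0.5 * sum((y - predict(theta, x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: each x starts with 1.0 for the intercept.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ys = [0.1, 0.9, 2.2]
theta_a = [0.0, 1.0]
theta_b = [0.5, 0.5]

# The gap between the two objectives is the same constant for any theta.
const = 0.5 * len(ys) * math.log(2.0 * math.pi)
for theta in (theta_a, theta_b):
    gap = neg_log_likelihood(theta, xs, ys) - half_sse(theta, xs, ys)
    assert math.isclose(gap, const)
```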

3. GLM modeling process

  • Depending on the problem, choose a distribution in the exponential family as the assumption about $y$.
  • Compute the natural parameter $\eta$ of this distribution. In fact $\eta=\eta(w^T)$, where $w^T$ is the actual parameter of the distribution (for example $\phi$ for the Bernoulli distribution) and $\eta$ is simply a link function that takes $w^T$ as its argument.
  • Compute the expectation of the distribution and express it in terms of $\eta$; for example, $y=\phi=\frac{1}{1+e^{-\eta}}$ for the Bernoulli distribution above.
  • Substitute $\eta=\theta^Tx$ according to the GLM assumption; the result is the GLM model.
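
As an illustration of the four steps above, here is a sketch that applies them to the Poisson distribution mentioned in section 1.4 (the model usually called Poisson regression). The rate symbol $\lambda$ is notation assumed here; it does not appear in the original text.

```latex
\begin{align}
p(y;\lambda) &= \frac{\lambda^{y}e^{-\lambda}}{y!}
             = \frac{1}{y!}\exp\left(y\log\lambda-\lambda\right) \\
b(y) &= \frac{1}{y!},\quad \eta=\log\lambda,\quad T(y)=y,\quad A=\lambda=e^{\eta} \\
h_\theta(x) &= E[y\mid x;\theta]=\lambda=e^{\eta}=e^{\theta^{T}x}
\end{align}
```

The first line rewrites the distribution in exponential family form, the second reads off the natural parameter, and the third expresses the mean in terms of $\eta$ and substitutes $\eta=\theta^Tx$.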

4. Summary

  • The exponential family is defined by $p(y;\theta)=b(y)\exp\left(\eta(\theta)T(y)-A(\theta)\right)$.
  • Commonly used distributions such as normal distribution, Bernoulli distribution, exponential distribution, Poisson distribution, and gamma distribution all belong to the exponential distribution family.
  • A generalized linear model extends the standard linear model by fitting the conditional mean (expectation) of $y$, given $x$ and the parameters $\theta$, under the assumption that $y$ follows a distribution in the exponential family.

This article is only used as a personal learning record, not for commercial use, thank you for your understanding and cooperation.

Reference: https://shangzhih.github.io/zhi-shu-fen-bu-zu-he-yan-yi-xian-xing-hui-gui.html

Origin blog.csdn.net/weixin_44852067/article/details/130048600