First, an overview of regression analysis

1. Relationships between variables

Deterministic phenomena (functional relationships): for example, the perimeter of a rectangle, which is fully determined by its side lengths.

Non-deterministic phenomena (statistical correlation): for example, height and weight.

2. Correlation analysis and regression analysis

Correlation analysis: measures the direction and strength of the association between two (or more) variables, expressed using the correlation coefficient.

Regression analysis: given that a correlation already exists, it studies the causal relationship. The variables do not have equal status (one is the cause, the other the result), and changes in the independent variable are used to predict how the dependent variable moves.

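For example:

(1) Playing basketball makes you taller.

Wrong. In reality, taller people choose to play basketball; this is reverse causality.

(2) People with high social status live longer.

Wrong. People with high social status receive better medical care, and it is the better medical care that leads to a longer life.

Tip: a precondition for causality is temporal order (the cause must come before the effect).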

3. Correlation analysis: linear or nonlinear (Note: when the relationship is nonlinear, the correlation coefficient can be 0)

Linear correlation:

Two variables: covariance, correlation coefficient

More than two variables: partial correlation coefficients, multiple correlation coefficient
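
As a minimal sketch of the two-variable case, the snippet below computes the sample covariance and the Pearson correlation coefficient by hand. The data are made up for illustration; the second series is an exact but nonlinear function of X, which shows why a correlation coefficient of 0 does not mean "no relationship".

```python
import math

def covariance(xs, ys):
    """Sample covariance, using the n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation coefficient: cov(X, Y) / (sd(X) * sd(Y))."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data, chosen only for illustration.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_linear = [2.0 * v + 1.0 for v in x]   # exact linear relationship
y_square = [v ** 2 for v in x]          # exact but nonlinear relationship

r_linear = correlation(x, y_linear)
r_square = correlation(x, y_square)
print("linear relationship:    r =", r_linear)
print("quadratic relationship: r =", r_square)
```

Here Y = X² is completely determined by X, yet the linear correlation coefficient is zero because the positive and negative halves of X cancel out.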

Second, the population regression function (PRF)

Given the explanatory variable X, the locus of the expected value of the dependent variable Y is called the population regression line (more generally, the population regression curve). The corresponding function is

                                              E(Y|X)=f(X)

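The simplest form of f is a linear function. Its intercept and slope are the linear regression coefficients, as shown below; read as a consumption function, β0 represents autonomous consumption and β1 represents the marginal propensity to consume.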

                                            E(Y|X)=β0+β1X

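Terminology: the dependent variable Y is also called the explained variable, predicted variable, regressand, or response variable; the independent variable X is also called the explanatory variable, predictor variable, regressor, or control variable.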

Third, the random error term

Subtracting the conditional mean from an actual value of Y gives the deviation:

                                                μ=Y-E(Y|X)

then

                                                Y=E(Y|X)+μ

The explained variable therefore consists of two parts: a systematic part that is determined once X is given, and a remaining part that comes from random factors and reflects the variable's own fluctuation.
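
The decomposition Y = E(Y|X) + μ can be sketched with a small simulation. Everything numeric here is an assumption made for illustration: the parameters `BETA0` and `BETA1`, the error standard deviation `SIGMA`, the choice of normally distributed errors, and the value X = 50.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

BETA0, BETA1 = 10.0, 0.6   # hypothetical population parameters
SIGMA = 2.0                # hypothetical standard deviation of the error term

def conditional_mean(x):
    """Population regression function E(Y|X) = β0 + β1·X."""
    return BETA0 + BETA1 * x

x_value = 50.0

# Draw many Y values at the same X: systematic part plus a random error.
ys = [conditional_mean(x_value) + random.gauss(0.0, SIGMA) for _ in range(10_000)]

# Recover the error term as the deviation of each Y from its conditional mean.
mus = [y - conditional_mean(x_value) for y in ys]

mean_y = sum(ys) / len(ys)
mean_mu = sum(mus) / len(mus)
print(f"E(Y|X=50) = {conditional_mean(x_value):.2f}, sample mean of Y = {mean_y:.2f}")
print(f"sample mean of the error term = {mean_mu:.3f}")
```

Individual values of Y scatter around E(Y|X), while the deviations average out to roughly zero.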

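What the random error term represents:

1. Unknown influencing factors

2. Incomplete data

3. Many minor factors

4. Data observation (measurement) errors

5. Model specification errors

6. Intrinsic randomness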

Fourth, the sample regression function (SRF)

We must first understand a basic fact: the population is always unknown. Population parameters such as the mean and variance cannot be observed directly, so the purpose of econometrics is to infer the population from a sample.

Fit a straight line to the scatter of sample points, and use this sample regression line as an approximation of the population regression line.

For example, for a linear regression:

                                          Ŷi=β̂0+β̂1Xi

Note the notation for estimates: they always wear a hat (^).

Key distinction:

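** Population regression function (PRF):

                                          E(Y|Xi)=β0+β1Xi             (i=1,2,...,n)

PRF stochastic form:

                                          Yi=β0+β1Xi+μi

** Sample regression function (SRF):

                                          Ŷi=β̂0+β̂1Xi

SRF stochastic form:

                                          Yi=β̂0+β̂1Xi+ei

One is a regression on the population data, the other a regression on the sample data. In both cases, adding the corresponding error (the random error μi for the PRF, the residual ei for the SRF) to the regression result recovers the actual value.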
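
The difference between the PRF and the SRF can be sketched by simulation. The "true" parameters, the range of X, and the normal error distribution below are all assumptions chosen for illustration, and `draw_sample` and `fit_line` are hypothetical helper names. In practice the population parameters are unknown; they are fixed here only so that the sample estimates have something to be compared with.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

BETA0, BETA1 = 10.0, 0.6   # hypothetical "true" population parameters

def draw_sample(n):
    """Draw n points from Y = β0 + β1·X + μ, with μ ~ N(0, 2²)."""
    xs = [random.uniform(20.0, 100.0) for _ in range(n)]
    ys = [BETA0 + BETA1 * x + random.gauss(0.0, 2.0) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Least-squares intercept and slope of a straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

# Each sample yields its own SRF; all of them approximate the same PRF.
estimates = [fit_line(*draw_sample(30)) for _ in range(3)]
for b0_hat, b1_hat in estimates:
    print(f"SRF: Y-hat = {b0_hat:.2f} + {b1_hat:.3f} X    (PRF: 10.00 + 0.600 X)")

# Within one sample, fitted value plus residual gives back the observed value.
xs, ys = draw_sample(30)
b0_hat, b1_hat = fit_line(xs, ys)
residuals = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]
rebuilt = [b0_hat + b1_hat * x + e for x, e in zip(xs, residuals)]
```

The hatted coefficients change from sample to sample, while β0 and β1 stay fixed; that is the sense in which the SRF only estimates the PRF.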

Fifth, model assumptions

Assumption 1: The model is correctly specified (correct variables, correct functional form).

Assumption 2: X is deterministic (a non-random variable).

Assumption 3: X takes at least two different values, and its sample variance converges as the sample size increases. (If X took only a single constant value, the problem would no longer be meaningful. The convergence requirement is there to avoid the spurious regression problem in time series.)

Assumption 4: The error term has zero mean, constant variance, and zero covariance. (Zero mean: taken over the whole population, the expected error is 0. Constant variance: the degree of dispersion is the same at every point. Zero covariance: for two different sample points the covariance, and hence the correlation coefficient, is 0, which rules out the possibility that the information in two sample points is related.)

Assumption 5: The random error term is uncorrelated with the explanatory variable (this can be derived from the preceding assumptions).

Assumption 6: The random error term is normally distributed (mean 0, variance σ²). (Optional: the preceding results still hold without this assumption.)

Sixth, parameter estimation

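1. Ordinary least squares (OLS) parameter estimation (the most commonly used)

Principle: make the total error as small as possible → the least-squares idea, that is, minimize the residual sum of squares, which turns estimation into an optimization problem.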

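Result:

                                          β̂1=Σ(Xi-X̄)(Yi-Ȳ)/Σ(Xi-X̄)²,    β̂0=Ȳ-β̂1X̄

or, writing xi=Xi-X̄ and yi=Yi-Ȳ for the deviations from the sample means:

                                          β̂1=Σxiyi/Σxi²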

Distinction: an estimator is a function of random variables;

         an estimate is a specific numerical value.
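
As a minimal sketch of the least-squares idea for a single explanatory variable, the code below uses the standard closed-form solution (slope = Σ(Xi-X̄)(Yi-Ȳ)/Σ(Xi-X̄)², intercept = Ȳ - slope·X̄) and then checks that nudging the line in any direction increases the residual sum of squares. The data and the function names `ols_fit` and `rss` are made up for illustration.

```python
def ols_fit(xs, ys):
    """Closed-form OLS estimates (intercept, slope) for Y = b0 + b1·X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx          # requires at least two distinct X values
    intercept = mean_y - slope * mean_x
    return intercept, slope

def rss(xs, ys, b0, b1):
    """Residual sum of squares for the line Y = b0 + b1·X."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 6.2, 7.9, 10.1]

b0_hat, b1_hat = ols_fit(x, y)
rss_min = rss(x, y, b0_hat, b1_hat)
print(f"intercept = {b0_hat:.2f}, slope = {b1_hat:.2f}, RSS = {rss_min:.3f}")

# Any other line has a larger residual sum of squares.
perturbed = [
    rss(x, y, b0_hat + d0, b1_hat + d1)
    for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]
]
print("RSS after nudging the line:", [round(v, 3) for v in perturbed])
```

The function `ols_fit` plays the role of the estimator (a rule applied to whatever sample arrives), while the numbers it returns for this particular sample are the estimates.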

2. Maximum likelihood (ML) (use ML when OLS is unsuitable)

Likelihood: refers to probability, or how plausible something is.

Principle: what exists is reasonable (when you observe a particular set of samples, there must be a mechanism behind them that made these the ones to appear in front of you). Maximum likelihood chooses the parameters that give the observed sample the greatest probability; least squares chooses the parameters that make the total error smallest.
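
To make the contrast concrete, here is a small sketch under one specific assumption: the errors are independent and normally distributed with a common variance. In that case the log-likelihood falls as the residual sum of squares rises, so the line that maximizes the likelihood is the OLS line. The data are made up, and the grid search is only a crude stand-in for a proper optimizer.

```python
import math

def ols_fit(xs, ys):
    """OLS intercept and slope for a straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def log_likelihood(xs, ys, b0, b1, sigma2):
    """Log-likelihood of the sample when the errors are i.i.d. N(0, sigma2)."""
    n = len(xs)
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - rss / (2.0 * sigma2)

# Hypothetical sample, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 6.2, 7.9, 10.1]

b0_ols, b1_ols = ols_fit(x, y)
rss_ols = sum((yi - b0_ols - b1_ols * xi) ** 2 for xi, yi in zip(x, y))
sigma2_ml = rss_ols / len(x)   # ML estimate of the error variance (divides by n)

# Evaluate the likelihood on a grid of candidate lines centred on the OLS line.
# Offsets are integer steps: 0.05 per step for the intercept, 0.01 for the slope.
candidates = [(a, b) for a in range(-5, 6) for b in range(-5, 6)]
best_offset = max(
    candidates,
    key=lambda ab: log_likelihood(
        x, y, b0_ols + 0.05 * ab[0], b1_ols + 0.01 * ab[1], sigma2_ml
    ),
)
print("grid point with the highest likelihood (offset from OLS):", best_offset)
```

With a different error distribution the likelihood takes a different form and the two methods generally give different estimates, which is one reason to turn to ML when OLS is unsuitable.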

3. Method of moments (MM) parameter estimation

This method is used less often and is not covered here.

Origin www.cnblogs.com/12421yi/p/12543945.html