Linear Correlation and Regression

Biostatistics ----- Correlation and Regression

Correlation and regression are statistical methods for describing and predicting the relationships between variables.

Correlation refers to a relationship between variables that cannot be expressed exactly by a function: the points do not fall on a one-to-one correspondence but are scattered around a straight line.

The measure of correlation is the correlation coefficient; there is a population correlation coefficient and a sample correlation coefficient. The sign of the correlation coefficient agrees with the sign of the covariance. When the correlation coefficient is 0 there is no linear correlation, but some other nonlinear relationship may still exist:
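
For a sample of n paired observations, the sample correlation coefficient is the usual Pearson form:

$$
r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}
$$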

The correlation coefficient is symmetric ($r_{xy} = r_{yx}$), and changing the coordinate system or the measurement scale does not change the magnitude of r (a property the covariance does not have). It can only show that a linear association exists between two variables, not necessarily a causal relationship. In practice, a hypothesis test is usually performed on the correlation coefficient.
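
As a minimal sketch of computing r and its hypothesis test, the snippet below uses scipy.stats.pearsonr, which returns the sample correlation coefficient along with the p-value for the test of zero correlation; the data are made up purely for illustration:

```python
import numpy as np
from scipy import stats

# Illustrative (made-up) data: age in years vs. systolic blood pressure in mmHg
x = np.array([35, 42, 50, 58, 63, 70, 48, 55, 61, 66], dtype=float)
y = np.array([118, 125, 130, 138, 142, 150, 128, 136, 141, 145], dtype=float)

# Pearson sample correlation coefficient and p-value for H0: rho = 0
r, p_value = stats.pearsonr(x, y)
print(f"r = {r:.4f}, p = {p_value:.4g}")

# The sign of r agrees with the sign of the covariance
print("covariance:", np.cov(x, y)[0, 1])
```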


Regression starts from a set of samples and establishes a regression equation for prediction: the more influential factors are screened out, and the resulting equation is used to predict the dependent variable from the independent variable. The difference from correlation is that in linear correlation x and y have equal status and both are random variables, whereas in regression x is the independent variable (whether or not it is random does not matter) and y is the random variable being explained. Correlation is used to describe a linear association; regression analysis is used both to describe and to predict.

Regression analysis can be simple (one independent variable) or multiple (several independent variables).

In the regression model, x is the independent variable and y is the dependent variable. The error term represents the random factors, i.e., the part of the relationship not explained by the linear portion; the errors are assumed to be independent and normally distributed with zero mean and equal variance.

Note that for each given value of x, the regression equation does not return an individual observation; it returns the mean of all possible values of y at that x.
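
Putting the last two statements together, the simple linear regression model and the regression (mean) function are

$$
y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)\ \text{independent}, \qquad E(y \mid x) = \beta_0 + \beta_1 x .
$$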

The two coefficients are estimated by the method of least squares: the sum of squared deviations is minimized by setting its partial derivatives to zero, which yields the coefficient estimates.
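
Concretely, least squares minimizes $Q(\beta_0, \beta_1) = \sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2$; setting $\partial Q / \partial \beta_0 = \partial Q / \partial \beta_1 = 0$ gives the standard estimates

$$
\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} .
$$

A short numpy sketch of these closed-form estimates, reusing the illustrative data from above:

```python
import numpy as np

def least_squares_fit(x, y):
    """Closed-form least-squares estimates for y = beta0 + beta1 * x."""
    x_bar, y_bar = x.mean(), y.mean()
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

x = np.array([35, 42, 50, 58, 63, 70, 48, 55, 61, 66], dtype=float)
y = np.array([118, 125, 130, 138, 142, 150, 128, 136, 141, 145], dtype=float)
beta0, beta1 = least_squares_fit(x, y)
print(f"y_hat = {beta0:.3f} + {beta1:.3f} * x")
```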


Sum-of-squares decomposition: SST = SSR + SSE. SST is the sum of squared deviations of the observations from their mean, i.e., the total variation; SSR is the sum of squares of y explained by x; SSE is the sum of squared effects on y of factors other than x. The ratio SSR/SST, i.e., the coefficient of determination, can be used to judge how well the regression equation fits; its value is bounded in [0, 1].
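
In symbols, with fitted values $\hat{y}_i$ from the regression equation:

$$
SST = \sum_{i}(y_i - \bar{y})^2, \quad SSR = \sum_{i}(\hat{y}_i - \bar{y})^2, \quad SSE = \sum_{i}(y_i - \hat{y}_i)^2, \quad R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} .
$$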


Testing the regression equation: to determine whether the linear relationship in the regression equation is significant, an F test based on SSR and SSE is used. For simple linear regression, F = (SSR/1) / (SSE/(n - 2)) follows an F(1, n - 2) distribution under the null hypothesis of no linear relationship.

Residual analysis can be used to test whether the model assumptions hold; hypothesis tests (e.g., t tests) can also be applied to the regression coefficients.
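
A minimal end-to-end sketch with statsmodels, again on made-up data; the fitted results expose the F statistic, the coefficient of determination, t tests on the coefficients, and the residuals discussed above:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative (made-up) data
x = np.array([35, 42, 50, 58, 63, 70, 48, 55, 61, 66], dtype=float)
y = np.array([118, 125, 130, 138, 142, 150, 128, 136, 141, 145], dtype=float)

# Fit y = beta0 + beta1 * x by ordinary least squares
X = sm.add_constant(x)            # prepend the intercept column
results = sm.OLS(y, X).fit()

print("coefficients:", results.params)              # beta0_hat, beta1_hat
print("R^2:", results.rsquared)                     # SSR / SST
print("F test:", results.fvalue, "p =", results.f_pvalue)
print("t-test p-values:", results.pvalues)          # per-coefficient tests

# Residuals, for checking the zero-mean / equal-variance / normality assumptions
residuals = results.resid
print("residual mean (should be ~0):", residuals.mean())
```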



Source: www.cnblogs.com/yuanjingnan/p/11668028.html