Dealing with multicollinearity in logistic regression

A high theoretical correlation between explanatory variables does not necessarily mean their observed values are highly correlated. Two explanatory variables may be highly correlated in theory while their observations are not, and vice versa. So multicollinearity is essentially a data issue.

Multicollinearity arises for the following kinds of reasons:

1, the explanatory variables share a common time trend;

2, one explanatory variable is a lag of another, and both tend to follow the same trend;

3, the data were collected from too narrow a base, so some explanatory variables may change together;

4, an approximate linear relationship exists among certain explanatory variables;

Symptoms:

1, coefficient estimates have the wrong sign;

2, some important explanatory variables have low t values, yet R² is not low;

3, when a relatively unimportant explanatory variable is deleted, the regression results change significantly;

Tests:

1, correlation analysis: a correlation coefficient above 0.8 indicates the presence of multicollinearity; a low correlation coefficient, however, does not show that multicollinearity is absent;

2, VIF (variance inflation factor) test;

3, condition number test;
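The three tests above can be sketched with NumPy alone. This is a minimal illustration on synthetic data, not a definitive implementation: the function names, the simulated variables `x1`, `x2`, `x3`, and the choice to scale columns to unit length before computing the condition number are all assumptions made for the example. The VIF of variable j is computed as 1 / (1 - R_j²), where R_j² comes from regressing variable j on the remaining variables.

```python
import numpy as np

def correlation_matrix(X):
    """Pairwise Pearson correlations between the columns of X."""
    return np.corrcoef(X, rowvar=False)

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on all other columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])      # add an intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = 1.0 / (1.0 - r2)
    return out

def condition_number(X):
    """Ratio of largest to smallest singular value of the column-scaled X."""
    X = np.asarray(X, dtype=float)
    Xs = X / np.linalg.norm(X, axis=0)                 # unit-length columns (assumed convention)
    s = np.linalg.svd(Xs, compute_uv=False)
    return s[0] / s[-1]

# Synthetic data (assumed): x2 is almost a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

corr = correlation_matrix(X)
vifs = vif(X)
kappa = condition_number(X)

print("corr(x1, x2):", round(float(corr[0, 1]), 4))
print("VIFs:", np.round(vifs, 2))
print("condition number:", round(float(kappa), 2))
```

Commonly quoted rules of thumb (not taken from this text) flag a VIF above about 10 or a condition number above about 30, but such cut-offs are conventions rather than hard limits.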

Solutions:

1, collect more data;

2, impose certain constraints on the model;

3, delete one or several of the collinear variables;

4, transform the model appropriately;

5, use principal component regression;
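As an illustration of the last remedy, here is a rough sketch of principal component regression using NumPy. For brevity it fits an ordinary least-squares model on the component scores; for a logistic model one would fit the logistic regression on the same scores instead. The helper names `pcr_fit` and `pcr_predict`, the synthetic data, and the choice to keep two components are assumptions for the example.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the first n_components principal components of standardized X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mean = X.mean(axis=0)
    scale = X.std(axis=0, ddof=1)
    Z = (X - mean) / scale                              # standardize each column
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)    # rows of Vt: principal directions
    V = Vt[:n_components].T                             # keep the leading directions
    T = Z @ V                                           # component scores
    A = np.column_stack([np.ones(len(y)), T])
    gamma, *_ = np.linalg.lstsq(A, y, rcond=None)
    beta_std = V @ gamma[1:]                            # back to standardized-variable coefficients
    return gamma[0], beta_std, mean, scale

def pcr_predict(model, X):
    intercept, beta_std, mean, scale = model
    return intercept + ((np.asarray(X, dtype=float) - mean) / scale) @ beta_std

# Synthetic data (assumed): x1 and x2 are nearly collinear, x3 is independent.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + 2.0 * x1 + 2.0 * x2 + 1.0 * x3 + 0.5 * rng.normal(size=n)

model = pcr_fit(X, y, n_components=2)    # drop the smallest-variance direction
beta_std = model[1]
pred = pcr_predict(model, X)
r2 = 1.0 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()

print("standardized coefficients:", np.round(beta_std, 3))
print("in-sample R^2:", round(float(r2), 3))
```

Dropping the low-variance component removes the direction in which x1 and x2 differ, so the two collinear variables end up with almost equal, stable coefficients. The price is some bias whenever the true effect does lie along the dropped direction.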

Principles for handling multicollinearity:

1, multicollinearity is widespread; when it is only mild, no measures need to be taken;

2, severe multicollinearity can generally be spotted from experience or by examining the regression results, for example when coefficient signs are affected or important explanatory variables have very low t values. Take the necessary measures according to the situation;

3, if the model is used only for prediction, then as long as the fit is good, multicollinearity does not need to be dealt with: using a model with multicollinearity for prediction often does not affect the predicted results;
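The last point can be checked with a small simulation. The sketch below repeatedly fits least squares to freshly simulated collinear data and compares how much an individual coefficient varies with how well the fitted model predicts. All settings (sample sizes, noise level `sigma`, the coefficients in `true_beta`, the number of repetitions) are assumptions chosen for illustration, and the conclusion relies on the new data having the same collinearity pattern as the training data.

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = 1.0                                # noise standard deviation (assumed)
true_beta = np.array([1.0, 2.0, 2.0])      # intercept, x1, x2 (assumed)

def make_data(n):
    """Design matrix with intercept and two nearly collinear regressors."""
    x1 = rng.normal(size=n)
    x2 = x1 + 0.05 * rng.normal(size=n)
    A = np.column_stack([np.ones(n), x1, x2])
    y = A @ true_beta + sigma * rng.normal(size=n)
    return A, y

coefs, rmses = [], []
for _ in range(200):
    A_train, y_train = make_data(100)
    A_test, y_test = make_data(1000)       # same collinearity structure as training
    b, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    coefs.append(b)
    rmses.append(np.sqrt(np.mean((y_test - A_test @ b) ** 2)))

coefs = np.array(coefs)
sd_b1 = coefs[:, 1].std()                        # spread of the x1 coefficient alone
sd_sum = (coefs[:, 1] + coefs[:, 2]).std()       # spread of the combined effect
mean_rmse = float(np.mean(rmses))

print("sd of b1:", round(float(sd_b1), 3))
print("sd of b1 + b2:", round(float(sd_sum), 3))
print("mean test RMSE:", round(mean_rmse, 3), "(noise sd is", sigma, ")")
```

The individual coefficient swings widely from sample to sample, while the sum of the two coefficients and the out-of-sample error stay stable, which is why a model used purely for prediction can often tolerate multicollinearity.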

Origin www.cnblogs.com/ivyharding/p/11505475.html