SPSS analysis shows collinearity: how should the analysis proceed?

In linear regression analysis, the independent variables (explanatory variables) are often correlated with one another. This situation is called multicollinearity.

Moderate multicollinearity is not a problem, but severe collinearity can make the results unstable, and the sign of a regression coefficient may even be the opposite of what is actually the case. An independent variable that should be significant turns out not to be, while one that should not be significant appears significant. In such cases the influence of multicollinearity needs to be eliminated.

 

Causes of collinearity

Multicollinearity means that changes in one explanatory variable are driven by changes in another explanatory variable.

 

Independent variables are supposed to be independent of one another, so that the test results tell us which factors have a significant effect on the dependent variable Y and which have none. If there is a strong linear relationship among the independent variables X, the other variables cannot be held fixed, and the true relationship between each X and Y cannot be identified.

Multicollinearity can also arise from:

  • Insufficient data. In some cases, additional data collection may solve the problem.
  • Incorrect use of dummy variables. (For example, if male and female are both entered into the model as dummy variables, collinearity is guaranteed; this is called perfect collinearity.)
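
The dummy-variable case can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up six-person sample (the data and variable names are hypothetical). When an intercept and both dummies are present, the columns sum to each other and the design matrix loses rank.

```python
import numpy as np

# Hypothetical data: 1 = male, 0 = female for six respondents.
male = np.array([1, 0, 1, 1, 0, 0], dtype=float)
female = 1.0 - male
intercept = np.ones_like(male)

# Design matrix with an intercept and BOTH dummies: male + female == intercept.
X_both = np.column_stack([intercept, male, female])
# Design matrix with an intercept and ONE dummy (female is the reference group).
X_one = np.column_stack([intercept, male])

rank_both = np.linalg.matrix_rank(X_both)
rank_one = np.linalg.matrix_rank(X_one)

# Rank below the column count means the columns are perfectly collinear.
print("both dummies: rank", rank_both, "of", X_both.shape[1], "columns")
print("one dummy:    rank", rank_one, "of", X_one.shape[1], "columns")
```

Dropping one of the two dummies restores full column rank, which is why one category is normally kept as the reference group.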

 

Indicators for detecting collinearity

1. Variance inflation factor (VIF)

There are several ways to detect multicollinearity. The most commonly used is the VIF value from the regression analysis: the larger the VIF, the more severe the multicollinearity. A VIF greater than 10 (or 5 under a stricter standard) is generally taken to indicate a serious collinearity problem in the model.
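
For readers who want to see where the number comes from, the VIF of variable j is 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing variable j on all the other independent variables. Below is a minimal sketch of that calculation, assuming NumPy and simulated data; the function name `vif` and the data are illustrative and are not SPSSAU's implementation.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    result = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        result.append(1.0 / (1.0 - r2))
    return np.array(result)

# Simulated example: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)

vifs = vif(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 2))
```

In this simulated data the first two VIFs are far above 10 while the third stays close to 1.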

2. Tolerance

Tolerance is sometimes used as the criterion instead. Tolerance = 1 / VIF, so a tolerance greater than 0.1 (0.2 under the stricter standard) indicates no collinearity problem. VIF and tolerance correspond one to one, so either indicator can be used.
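
The correspondence between the two indicators is simple arithmetic. Here is a small sketch in plain Python; the helper names `tolerance` and `is_collinear` are made up for illustration.

```python
def tolerance(vif):
    """Tolerance is the reciprocal of VIF."""
    return 1.0 / vif

def is_collinear(vif, strict=False):
    """Apply the VIF cutoff: 10 by default, 5 under the strict rule."""
    return vif > (5.0 if strict else 10.0)

# VIF = 10 corresponds to tolerance 0.1, and VIF = 5 to tolerance 0.2.
for v in (2.0, 5.0, 8.0, 10.0, 20.0):
    print(v, tolerance(v), is_collinear(v), is_collinear(v, strict=True))
```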

3. Correlation coefficient

Alternatively, run a correlation analysis directly on the independent variables and check the correlation coefficients and their significance. If an independent variable is significantly correlated with other independent variables, multicollinearity may be present.
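
As a rough illustration of this check, the sketch below computes the correlation matrix of three simulated independent variables with NumPy. The data are made up, and the significance test is omitted; only the coefficients are shown.

```python
import numpy as np

# Simulated independent variables: x2 depends strongly on x1, x3 does not.
rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=300)
x3 = rng.normal(size=300)
X = np.column_stack([x1, x2, x3])

# rowvar=False: each column is a variable, each row an observation.
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))
```

A large off-diagonal entry, such as the one between x1 and x2 here, is the signal to look at more closely.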

 

Methods for handling multicollinearity

Multicollinearity is common. If it is not severe (VIF < 5), no special treatment is usually needed. When it is severe, consider the following methods:

 

1. Manually remove a collinear variable

Run a correlation analysis first. If the correlation coefficient between two independent variables (explanatory variables) is greater than 0.7, remove one of them and then rerun the regression. This is the most direct approach, but sometimes we do not want to drop a variable from the model, in which case other methods should be considered.
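
The removal rule can be sketched as follows, assuming NumPy. The function `drop_correlated`, the variable names, and the data are hypothetical, and always keeping the earlier variable of a correlated pair is an arbitrary choice made for this sketch; in practice the decision should rest on subject knowledge.

```python
import numpy as np

def drop_correlated(X, names, threshold=0.7):
    """Keep a column only if |r| with every already kept column is <= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]

# Hypothetical data: x2 is roughly 2 * x1, x3 is unrelated to both.
x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
x2 = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.1, 14.2, 15.9])
x3 = np.array([5, 1, 4, 2, 6, 3, 5, 2], dtype=float)
X = np.column_stack([x1, x2, x3])

X_reduced, kept_names = drop_correlated(X, ["x1", "x2", "x3"])
print(kept_names)
```

The regression is then rerun on `X_reduced` only.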

 

2. Stepwise regression

Let the software select and remove independent variables automatically; stepwise regression drops the collinear ones. The drawback is that the algorithm may remove a variable you wanted to keep. If that happens, ridge regression is the better choice.

 

 

Use Path: SPSSAU> Advanced Method> stepwise regression
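
To give a feel for what automatic selection does, here is a simplified forward-selection sketch with NumPy. It is an assumption-laden illustration: it uses AIC as the entry criterion and never removes a variable once added, whereas the criterion SPSSAU actually applies is not stated here. The function names and data are made up.

```python
import numpy as np

def aic(y, X):
    """AIC of an OLS fit of y on X (X already contains the intercept column)."""
    n = len(y)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(((y - X @ coef) ** 2).sum())
    return n * np.log(rss / n) + 2 * X.shape[1]

def forward_select(y, X):
    """Greedy forward selection: repeatedly add the column that lowers AIC most."""
    n, p = X.shape
    selected = []
    current = np.ones((n, 1))  # start from the intercept-only model
    best = aic(y, current)
    while len(selected) < p:
        scores = [(aic(y, np.column_stack([current, X[:, j]])), j)
                  for j in range(p) if j not in selected]
        score, j = min(scores)
        if score >= best:  # no candidate improves the model, so stop
            break
        best = score
        selected.append(j)
        current = np.column_stack([current, X[:, j]])
    return selected

# Simulated data: x2 nearly duplicates x1; y depends on x1 and x3.
rng = np.random.default_rng(4)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
y = 2.0 * x1 + 3.0 * x3 + rng.normal(size=200)

selected = forward_select(y, np.column_stack([x1, x2, x3]))
print("selected column indices:", selected)
```

Once one of the two near-duplicate columns is in the model, the other adds little and is usually left out, which is also how a variable you care about can end up excluded.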

 

3. Increase the sample size

Increasing the sample size is one way to address collinearity, but it is often impractical because collecting more samples costs time and money.

 

4. Ridge regression

The first and second methods are the ones used most in practice. The problem is that in some studies certain independent variables are important and must not be removed. In that case you can use ridge regression in SPSSAU, which is currently the most effective way to deal with collinearity.

 

Use Path: SPSSAU> Advanced Method> ridge regression
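
The idea behind ridge regression is to add a penalty k to the diagonal of X'X before solving, which shrinks and stabilizes the coefficients while keeping every variable in the model. The sketch below shows this with NumPy on simulated data. The function `ridge_fit` is illustrative, and the value of k is on the scale of X'X for standardized predictors; it is not claimed to match the K value reported by SPSSAU.

```python
import numpy as np

def ridge_fit(X, y, k):
    """Ridge coefficients on standardized predictors; k = 0 gives ordinary least squares."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Xs.shape[1]
    # Solve (Xs'Xs + k I) beta = Xs'y.
    return np.linalg.solve(Xs.T @ Xs + k * np.eye(p), Xs.T @ yc)

# Simulated data: x1 and x2 are nearly collinear and both matter equally.
rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.05, size=100)
y = x1 + x2 + rng.normal(size=100)
X = np.column_stack([x1, x2])

beta_ols = ridge_fit(X, y, 0.0)
beta_ridge = ridge_fit(X, y, 10.0)
print("OLS:  ", np.round(beta_ols, 3))
print("ridge:", np.round(beta_ridge, 3))
```

With the penalty, the two coefficients are pulled toward each other instead of splitting the shared effect erratically.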

 

Principles

1. Multicollinearity is common, and mild multicollinearity needs no action. A VIF greater than 10 indicates very serious collinearity that must be dealt with; a VIF below 5 requires no treatment; a VIF between 5 and 10 should be judged case by case.

2. If the model is used only for prediction, multicollinearity does not need to be addressed as long as the fit is good, because it usually does not affect the predictions.
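
The second principle can be illustrated with a small simulation, assuming NumPy and made-up data. The model that includes a nearly collinear variable has unstable individual coefficients, yet its fit is almost identical to that of the model without it.

```python
import numpy as np

def fit_r2(X, y):
    """Fit OLS with an intercept and return (coefficients, R-squared)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return coef, r2

# Simulated data: x2 is nearly collinear with x1; y depends on x1 only.
rng = np.random.default_rng(3)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)
y = 3.0 * x1 + rng.normal(size=200)

coef_full, r2_full = fit_r2(np.column_stack([x1, x2]), y)
coef_reduced, r2_reduced = fit_r2(x1.reshape(-1, 1), y)
print("with x2:   ", np.round(coef_full, 3), round(float(r2_full), 4))
print("without x2:", np.round(coef_reduced, 3), round(float(r2_reduced), 4))
```

This sketch compares in-sample fit only; it assumes that new data would show the same relationship between x1 and x2.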

 

 


Origin www.cnblogs.com/spssau/p/11458375.html