[Solving multiple regression equations with SAS] Multiple regression analysis with PROC REG: multivariate quadratic regression

Example

Suppose Y is correlated with x_1, x_2, x_3, and consider the quadratic regression model

y=\beta_0+\beta_1 x_1+\beta_2 x_2+\beta_3 x_3+\beta_4 x_1^{2}+\beta_5 x_2^{2}+\beta_6 x_3^{2}+\beta_7 x_1 x_2+\beta_8 x_1 x_3+\beta_9 x_2 x_3+\varepsilon

Eight sets of observations are available; the raw values of x_1, x_2, x_3 and y appear as the first three columns and the last column of the data set entered below.

 Data preprocessing

Transform the data set as the quadratic model requires: for each observation, compute the squared terms x_1^{2}, x_2^{2}, x_3^{2} and the cross products x_1x_2, x_1x_3, x_2x_3, and store them together with x_1, x_2, x_3 and y in a new data set (a sketch of this step is shown below).
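A minimal sketch of this step as a SAS DATA step, assuming the raw observations already sit in a data set named raw with variables x1-x3 and y (the name raw is hypothetical; in the program below the transformed values are typed in directly):

data d1;
  set raw;           /* hypothetical raw data set containing x1-x3 and y */
  x4 = x1**2;        /* x1 squared */
  x5 = x2**2;        /* x2 squared */
  x6 = x3**2;        /* x3 squared */
  x7 = x1*x2;        /* x1*x2 cross product */
  x8 = x1*x3;        /* x1*x3 cross product */
  x9 = x2*x3;        /* x2*x3 cross product */
run;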

Basic syntax

PROC REG DATA=data-set;
  MODEL dependent-variable = list-of-independent-variables </ options>;
  <RESTRICT linear-equality-restrictions-on-the-parameters;>
RUN;
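The RESTRICT statement is optional. Purely as an illustration (it is not used in this example), forcing two coefficients to be equal could be written as:

proc reg data=d1;
  model y = x1 x2 x3;
  restrict x1 = x2;   /* linear equality constraint: the coefficients of x1 and x2 are equal */
run;
quit;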

 SAS code

The following program reads the preprocessed data set and fits the model:

data d1;
  /* x1-x3: original predictors; x4-x6: their squares; x7-x9: cross products; y: response */
  input x1-x9 y;
  cards;
38 47.5 23 1444 2256.25 529 1805.00 1805.00 1092.50 66.0
41 21.3 17 1681  453.69 289  873.30  873.30  362.10 43.0
34 36.5 21 1156 1332.25 441 1241.00 1241.00  766.50 36.0
35 18.0 14 1225  324.00 196  630.00  630.00  252.00 23.0
31 29.5 11  961  870.25 121  914.50  914.50  324.50 27.0
34 14.2  9 1156  201.64  81  482.80  482.80  127.80 14.0
29 21.0  4  841  441.00  16  609.00  609.00   84.00 12.0
32 10.0  8 1024  100.00  64  320.00  320.00   80.00  7.6
;
proc print;   /* list the processed data set */
run;
proc reg data=d1;
  /* stepwise selection; SLE/SLS are the significance levels for a variable to enter/stay */
  model y = x1-x9
        / selection=stepwise sle=0.05 sls=0.05;
run;
proc reg data=d1;
  /* refit using only the variables retained by the stepwise search */
  model y = x4 x7;
run;
quit;

Stepwise selection

Step 1

Step 2

Selection stops when every variable remaining in the model is significant at the 0.0500 level and no other variable meets the 0.0500 significance level for entry into the model.

The stepwise procedure retains x4 and x7, which correspond to x_1^{2} and x_1x_2.

Parameter Estimation

So the quadratic regression equation is:

\hat{Y}=-30.0098+0.02672x_1^{2}+0.03130x_1 x_2
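As a quick check, the fitted equation can be applied back to the data set; this is a minimal sketch (the names pred, yhat and resid are ours, and the coefficients are the rounded values reported above):

data pred;
  set d1;
  yhat  = -30.0098 + 0.02672*x4 + 0.03130*x7;   /* x4 = x1**2, x7 = x1*x2 */
  resid = y - yhat;                             /* raw residual */
run;
proc print data=pred;
  var y yhat resid;
run;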

The parameter estimates table gives not only the regression coefficients but also, for each coefficient, the result of testing H_0:\beta_j=0 (the significance probability, i.e. the p-value).
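For reference, the test reported for each coefficient is the usual t test of H_0:\beta_j=0 (this formula is standard, not shown in the post):

t_j=\hat{\beta}_j/\mathrm{se}(\hat{\beta}_j)

with its two-sided p-value evaluated on the error degrees of freedom (here 8-2-1=5).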

For example, with \alpha =0.05, a term whose p-value is below \alpha makes a significant contribution to the regression equation, while terms (the constant or any independent variable) with p-values \geq \alpha do not. To obtain the optimal regression equation, the least important independent variables should be removed from the equation, and the regression refit with the remaining variables and tested again. This is the purpose of variable screening.
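One way to inspect these p-values programmatically is to capture the parameter-estimates table with ODS OUTPUT; a sketch (the data set name pe is ours; ParameterEstimates and Probt are PROC REG's ODS table and p-value column names):

ods output ParameterEstimates=pe;   /* capture the table printed by PROC REG */
proc reg data=d1;
  model y = x4 x7;
run;
quit;
proc print data=pe;
  var Variable Estimate StdErr tValue Probt;   /* Probt holds the p-value of each t test */
run;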

Analysis of variance

Regression sum of squares: U = 2597.3352

Residual sum of squares: Q = 27.1798

Mean square error: MSE = 27.1798/5 = 5.4360

The mean square error is an estimate of the error variance \sigma^{2} in the model.

The test statistic is F = 238.90 with a significance probability (p-value) below 0.0001, so the fitted model is highly significant and explains the main part of the total variation in this set of data.
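For completeness, the F statistic can be reproduced from the sums of squares above, with 2 model and 5 error degrees of freedom:

F=\frac{U/2}{Q/5}=\frac{2597.3352/2}{27.1798/5}=\frac{1298.6676}{5.4360}\approx 238.9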

Regression statistics

Coefficient of determination: R^{2}=0.9896

Multiple correlation coefficient: R=\sqrt{0.9896}\approx 0.9948

Estimate of the standard deviation \sigma: Root MSE = 2.3315
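These statistics follow directly from the analysis-of-variance quantities above:

R^{2}=\frac{U}{U+Q}=\frac{2597.3352}{2624.5150}\approx 0.9896,\qquad \text{Root MSE}=\sqrt{MSE}=\sqrt{5.4360}\approx 2.3315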

Fit diagnostics for y

Residuals by regressors for y
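These panels are the standard ODS Graphics output of PROC REG; a minimal sketch of requesting them explicitly (DIAGNOSTICS and RESIDUALS are standard plot-request keywords):

ods graphics on;
proc reg data=d1 plots(only)=(diagnostics residuals);
  model y = x4 x7;   /* fit-diagnostics panel and residual-by-regressor plots for this model */
run;
quit;
ods graphics off;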


Original post: blog.csdn.net/weixin_73404807/article/details/133952854