Ridge Regression with sklearn

'''
    Ridge Regression:
            Ordinary linear regression fits a linear model with the least-squares method (here, via
            gradient descent), minimizing the loss function to find the optimal model parameters.
            In that process every training sample, including the small number of abnormal samples,
            affects the final model parameters to an equal degree; an outlier cannot be identified
            and discounted during training. For this reason, ridge regression adds a regularization
            term to the loss function that is minimized iteratively, which restricts the model
            parameters and limits how far the model bends to match abnormal samples, thereby
            improving the accuracy of the fit on the majority of normal samples.
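            Stated as a formula (a standard textbook form of the ridge loss, added here for
            reference; the notation is not from the original post): for weights w and bias b,
                J(w, b) = sum_i (y_i - (w . x_i + b))^2 + alpha * ||w||^2
            where alpha >= 0 is the regularization strength passed to lm.Ridge; alpha = 0 reduces
            to ordinary least squares.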
    Purpose of ridge regression:
            1> Ordinary linear regression cannot recognize or avoid the effect of abnormal samples
            on the model parameters, so the prediction results are biased toward the abnormal
            samples. Ridge regression lets you set a regularization strength that reduces the
            effect of abnormal samples on the model parameters, so the predictions favor the
            normal samples and the accuracy of the fit improves.
            2> After the regularization term is added, the R2 score on the training data will
            certainly be lower than that of ordinary linear regression, because ordinary linear
            regression minimizes the loss over all samples, whereas ridge regression deliberately
            reduces the weight of abnormal samples in the calculation in order to avoid their
            influence on the prediction; its minimum loss is therefore larger than the minimum
            loss of ordinary linear regression. (A comparison sketch follows the example below.)

  The greater the regularization strength, the stronger the generalization (and the less the model bends toward individual samples). A sketch of how to choose this strength follows the output below.
related API:
        import sklearn.linear_model as lm
        # create a model
        model = lm.Ridge(regularization_strength, fit_intercept=whether to train the intercept, max_iter=maximum number of iterations)
        # train the model
        # input:  the sample matrix, represented as a two-dimensional array
        # output: the final result for each sample
        model.fit(input, output)
        # predict outputs
        # the input array is two-dimensional: each row is a sample, each column is a feature
        result = model.predict(array)
Example: load the data file abnormal.txt and train a ridge-regression model on it.
'''
import sklearn.linear_model as lm
import numpy as np
import matplotlib.pyplot as mp
import sklearn.metrics as sm

x, y = np.loadtxt('./ml_data/abnormal.txt', delimiter=',', unpack=True, usecols=(0, 1))
# reshape the input into a two-dimensional array: one row per sample, one column per feature
x = x.reshape(-1, 1)
model = lm.Ridge(150, fit_intercept=True, max_iter=1000)
model.fit(x, y)
# feed the samples x into the model to obtain the predicted y
pred_y = model.predict(x)

# print the evaluation metrics of the model
print('mean absolute error:', sm.mean_absolute_error(y, pred_y))
print('mean squared error:', sm.mean_squared_error(y, pred_y))
print('median absolute error:', sm.median_absolute_error(y, pred_y))
print('R2 score:', sm.r2_score(y, pred_y))

# draw the figure
mp.figure('Linear Regression', facecolor='lightgray')
mp.title('Linear Regression', fontsize=16)
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
mp.xlabel('x')
mp.ylabel('y')
mp.scatter(x, y, s=60, marker='o', c='dodgerblue', label='Points')
mp.plot(x, pred_y, c='orangered', label='LR Line')
mp.tight_layout()
mp.legend()
mp.show()
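To make point 2> in the docstring concrete, here is a minimal comparison sketch. It is not part of the original post: it assumes the same ./ml_data/abnormal.txt file, fits an ordinary linear regression and the ridge model on the same data, and prints both training R2 scores; with outliers present, the ridge score is expected to be the lower one.

import numpy as np
import sklearn.linear_model as lm
import sklearn.metrics as sm

# same data file as in the example above (assumption)
x, y = np.loadtxt('./ml_data/abnormal.txt', delimiter=',', unpack=True, usecols=(0, 1))
x = x.reshape(-1, 1)

ols = lm.LinearRegression()   # ordinary least squares: every sample weighs equally
ols.fit(x, y)
ridge = lm.Ridge(150)         # ridge: the penalty term limits chasing the outliers
ridge.fit(x, y)

print('OLS   R2:', sm.r2_score(y, ols.predict(x)))
print('Ridge R2:', sm.r2_score(y, ridge.predict(x)))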

output:
mean absolute error: 1.0717908951634179
mean squared error: 3.7362971803503267
median absolute error: 0.696470799282414
R2 score: 0.44530850891980656
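
How much regularization strength is appropriate depends on the data. As a rough illustration (a sketch added here, not part of the original post), sklearn's RidgeCV can pick the strength by cross-validation from a list of candidate values:

import numpy as np
import sklearn.linear_model as lm

x, y = np.loadtxt('./ml_data/abnormal.txt', delimiter=',', unpack=True, usecols=(0, 1))
x = x.reshape(-1, 1)

# try several candidate strengths; RidgeCV keeps the one with the best
# cross-validated score and stores it in model.alpha_
model = lm.RidgeCV(alphas=[0.1, 1, 10, 100, 150, 300])
model.fit(x, y)
print('chosen regularization strength:', model.alpha_)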


Origin: www.cnblogs.com/yuxiangyang/p/11183051.html