The road to machine learning: Python linear regression, overfitting, and L1/L2 regularization

git:https://github.com/linyi0604/MachineLearning

Regularization:
Improves the generalization ability of the model on unseen data
Avoids overfitting the parameters
Common regularization methods:
Add a penalty term on the parameters to the objective function
Reduce the influence of any single parameter on the result

L1 regularization: lasso
Adds an L1-norm penalty term on the weight vector to the linear regression objective function.

f = w * x^n + b + k * ||w||1

x is the input sample feature
w is the learned parameter of each feature
n is the polynomial degree
b is the bias (intercept)
||w||1 is the L1 norm of the feature parameters, used as the penalty term
k is the strength of the penalty
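
As a concrete illustration, the L1 penalty is just the sum of the absolute values of the weights. A minimal sketch (the weight values w and strength k below are made up for illustration):

import numpy as np

w = np.array([0.5, -2.0, 0.0, 3.0])   # hypothetical learned weights
k = 1.0                               # hypothetical penalty strength

l1_penalty = k * np.abs(w).sum()      # k * ||w||1 = 5.5
print(l1_penalty)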

L2 regularization: ridge
Adds an L2-norm penalty term on the weight vector to the linear regression objective function.

f = w * x^n + b + k * ||w||2

x is the input sample feature
w is the learned parameter of each feature
n is the polynomial degree
b is the bias (intercept)
||w||2 is the L2 norm of the feature parameters, used as the penalty term
k is the strength of the penalty
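
The L2 penalty uses the Euclidean norm of the weights instead. Note that scikit-learn's Ridge actually penalizes the squared L2 norm. A minimal sketch with the same hypothetical weights:

import numpy as np

w = np.array([0.5, -2.0, 0.0, 3.0])   # hypothetical learned weights
k = 1.0                               # hypothetical penalty strength

l2_norm = np.sqrt((w ** 2).sum())     # ||w||2 ≈ 3.64
l2_penalty = k * (w ** 2).sum()       # Ridge penalizes k * ||w||2^2 = 13.25
print(l2_norm, l2_penalty)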


The following simulation uses a 4th-degree polynomial model to predict cake price from cake diameter. The unregularized model overfits, so the two regularization methods are then applied for learning and prediction.
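
Before the full script, here is a quick illustration (not from the original post) of what PolynomialFeatures does: it expands a single feature x into the polynomial features [1, x, x^2, x^3, x^4].

from sklearn.preprocessing import PolynomialFeatures

poly4 = PolynomialFeatures(degree=4)
# A single input feature x becomes [1, x, x^2, x^3, x^4]
print(poly4.fit_transform([[2]]))   # [[ 1.  2.  4.  8. 16.]]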


from sklearn.linear_model import LinearRegression, Lasso, Ridge
# Import the polynomial feature generator
from sklearn.preprocessing import PolynomialFeatures

# Prepare training data: cake diameter -> cake price
x_train = [[6], [8], [10], [14], [18]]
y_train = [[7], [9], [13], [17.5], [18]]
# Prepare test data
x_test = [[6], [8], [11], [16]]
y_test = [[8], [12], [15], [18]]

# Generate 4th-degree polynomial features
poly4 = PolynomialFeatures(degree=4)   # 4th-degree polynomial feature generator
x_train_poly4 = poly4.fit_transform(x_train)

# Build the unregularized model and predict
regressor_poly4 = LinearRegression()
regressor_poly4.fit(x_train_poly4, y_train)
x_test_poly4 = poly4.transform(x_test)
print("4th-degree polynomial model prediction score:", regressor_poly4.score(x_test_poly4, y_test))   # 0.8095880795746723

# L1-norm regularized linear model (lasso) for learning and prediction
lasso_poly4 = Lasso()
lasso_poly4.fit(x_train_poly4, y_train)
print("L1 regularization prediction score:", lasso_poly4.score(x_test_poly4, y_test))   # 0.8388926873604382

# L2-norm regularized linear model (ridge) for learning and prediction
ridge_poly4 = Ridge()
ridge_poly4.fit(x_train_poly4, y_train)
print("L2 regularization prediction score:", ridge_poly4.score(x_test_poly4, y_test))   # 0.8374201759366456
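
The difference between the two penalties shows up in the learned coefficients: lasso drives some of them exactly to zero, while ridge only shrinks them toward zero. A minimal sketch, continuing from the script above:

# Lasso produces sparse coefficients: some entries are exactly 0.0,
# so the corresponding polynomial terms are dropped entirely.
print(lasso_poly4.coef_)
# Ridge shrinks all coefficients toward 0 but keeps them nonzero.
print(ridge_poly4.coef_)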

 

Comparing the test scores, both regularized models (0.839 for lasso, 0.837 for ridge) generalize noticeably better than the unregularized 4th-degree polynomial model (0.810).
