git:https://github.com/linyi0604/MachineLearning
Regularization:
Improves the generalization ability of the model on unseen data
Avoids overfitting of the parameters
Common approach to regularization:
Add a penalty term on the parameters to the objective function
to reduce the influence of any single parameter on the result
L1 regularization: lasso
appends an L1-norm penalty term to the objective function of linear regression:
f = w * x^n + b + k * ||w||1
x is the input sample feature
w is the learned parameter for each feature
n is the degree of the polynomial
b is the bias (intercept)
||w||1 is the L1 norm of the feature parameters, used as the penalty term
k is the strength of the penalty
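As a quick sketch of the L1 penalty's effect (toy data; the alpha=0.5 value, which plays the role of k above, is an assumption, not from the source): lasso drives the coefficients of uninformative features exactly to zero, producing a sparse model.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Toy data: y depends only on the first of three features
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = 3.0 * X[:, 0] + 0.1 * rng.randn(100)

lasso = Lasso(alpha=0.5)  # alpha plays the role of the penalty strength k
lasso.fit(X, y)
print(lasso.coef_)  # the two noise features get a coefficient of exactly 0
```

This sparsity is what distinguishes lasso from ridge: it effectively performs feature selection as part of the fit.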
L2 regularization: ridge
appends an L2-norm penalty term to the objective function of linear regression:
f = w * x^n + b + k * ||w||2
x is the input sample feature
w is the learned parameter for each feature
n is the degree of the polynomial
b is the bias (intercept)
||w||2 is the L2 norm of the feature parameters, used as the penalty term
k is the strength of the penalty

The following simulation uses a 4th-degree polynomial model to predict cake price from cake diameter. The unregularized model overfits, so both regularization methods are then used for learning and prediction.
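Before the full simulation, a minimal sketch of ridge's behavior on the same kind of toy data (the alpha=10.0 strength is an assumed value): unlike lasso, the L2 penalty shrinks every coefficient toward zero but does not usually force any of them to be exactly zero.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Toy data: y depends only on the first of three features
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = 3.0 * X[:, 0] + 0.1 * rng.randn(100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # alpha plays the role of the penalty strength k

print(ols.coef_)    # unpenalized coefficients
print(ridge.coef_)  # shrunk toward zero, but none exactly zero
```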
from sklearn.linear_model import LinearRegression, Lasso, Ridge
# Import the polynomial feature generator
from sklearn.preprocessing import PolynomialFeatures

# Prepare training data: cake diameters and prices
x_train = [[6], [8], [10], [14], [18]]
y_train = [[7], [9], [13], [17.5], [18]]
# Prepare test data
x_test = [[6], [8], [11], [16]]
y_test = [[8], [12], [15], [18]]

# Fit a 4th-degree polynomial regression model
poly4 = PolynomialFeatures(degree=4)  # 4th-degree polynomial feature generator
x_train_poly4 = poly4.fit_transform(x_train)
# Build the model and predict
regressor_poly4 = LinearRegression()
regressor_poly4.fit(x_train_poly4, y_train)
x_test_poly4 = poly4.transform(x_test)
print("4th-degree linear model prediction score:", regressor_poly4.score(x_test_poly4, y_test))  # 0.8095880795746723

# L1-norm regularized linear model for learning and prediction
lasso_poly4 = Lasso()
lasso_poly4.fit(x_train_poly4, y_train)
print("L1 regularization prediction score:", lasso_poly4.score(x_test_poly4, y_test))  # 0.8388926873604382

# L2-norm regularized linear model for learning and prediction
ridge_poly4 = Ridge()
ridge_poly4.fit(x_train_poly4, y_train)
print("L2 regularization prediction score:", ridge_poly4.score(x_test_poly4, y_test))  # 0.8374201759366456
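To see where the improvement comes from, one can refit the three models and compare their learned coefficients (a sketch using scikit-learn's default Lasso and Ridge settings): lasso zeroes out some polynomial terms entirely, while ridge only shrinks them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.preprocessing import PolynomialFeatures

# Same cake-diameter training data as above
x_train = [[6], [8], [10], [14], [18]]
y_train = [7, 9, 13, 17.5, 18]

poly4 = PolynomialFeatures(degree=4)
x_train_poly4 = poly4.fit_transform(x_train)

coefs = {}
for name, model in [("plain", LinearRegression()),
                    ("lasso", Lasso()),
                    ("ridge", Ridge())]:
    model.fit(x_train_poly4, y_train)
    coefs[name] = np.ravel(model.coef_)
    print(name, coefs[name])  # lasso's vector is sparse; ridge's is dense but small
```

The smaller, sparser coefficient vectors are exactly the penalty term at work: the model can no longer contort itself to pass through every training point.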
Comparing the scores shows that both regularized models generalize noticeably better on the test data than the unregularized 4th-degree model.