Article directory
- 1.1 Introduction to Glmnet
- 1.2 Glmnet mathematical representation
- 1.3 Comparison of Glmnet multiple regression methods
- 1.4 Glmnet code principle
- 1.5 Glmnet installation and loading
- 1.6 Glmnet regression use
- 1.7 Analysis of Glmnet regression results
- 1.8 Visualization of Glmnet regression results
- 1.9 Glmnet model evaluation method
- 1.10 Glmnet selects the best model
- 1.11 Glmnet prediction
1.1 Introduction to Glmnet
Glmnet is a package for fitting generalized linear and similar models via penalized maximum likelihood. The regularization path for the lasso or elastic net penalty is computed over a grid of values (on the log scale) of the regularization parameter lambda. The algorithm is very fast and can exploit sparsity in the input matrix x.
It fits linear, logistic, multinomial, and Poisson regression models, among others. It can also fit multi-response linear regression models, generalized linear models for custom families, and lasso regression models. The package also provides methods for prediction, plotting, and cross-validation.
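Other model families are requested via the family argument. A quick sketch on toy simulated data (the data here are made up for illustration, and glmnet must already be loaded, see section 1.5):
set.seed(2)
x_toy <- matrix(rnorm(100 * 10), 100, 10)   # 100 observations, 10 predictors
y_binary <- rbinom(100, 1, 0.5)             # binary response
fit_logit <- glmnet(x_toy, y_binary, family = "binomial")  # logistic regression
y_counts <- rpois(100, lambda = 2)          # count response
fit_pois <- glmnet(x_toy, y_counts, family = "poisson")    # Poisson regression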
1.2 Glmnet mathematical representation
Glmnet solves the following problem:
$$\min_{\beta_0,\beta}\ \frac{1}{N}\sum_{i=1}^{N}w_{i}\,l(y_{i},\beta_0+\beta^{T}x_{i})+\lambda\left[(1-\alpha)\|\beta\|_{2}^{2}/2+\alpha\|\beta\|_{1}\right]$$
over a grid of lambda values covering the entire range of possible solutions. Here $l(y_i,\eta_i)$ is the negative log-likelihood contribution of observation i; in the Gaussian case, for example, it is $\frac{1}{2}(y_i-\eta_i)^2$.
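To make the formula concrete, here is a minimal R sketch (our own illustration, not part of glmnet) that evaluates this objective in the Gaussian case with unit weights:
# Evaluate the elastic-net objective for a given beta0 and beta
# (Gaussian loss l(y_i, eta_i) = (1/2)(y_i - eta_i)^2, all weights w_i = 1)
elastic_net_objective <- function(x, y, beta0, beta, lambda, alpha) {
  n <- length(y)
  loss <- sum(0.5 * (y - beta0 - x %*% beta)^2) / n
  penalty <- lambda * ((1 - alpha) * sum(beta^2) / 2 + alpha * sum(abs(beta)))
  loss + penalty
}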
1.3 Comparison of Glmnet multiple regression methods
Elastic net regression is controlled by α, which bridges the gap between lasso regression (α=1, default) and ridge regression (α=0). The parameter λ controls the overall strength of the penalty.
| α | Regression method |
| --- | --- |
| α = 1 (default) | lasso regression |
| α = 0 | ridge regression |
| α ∈ (0, 1) | elastic net regression |
Ridge regression is known to shrink the coefficients of correlated predictors toward each other, while lasso regression tends to select some of them and discard the rest; that is, lasso regression performs variable selection.
Elastic net regression combines the strengths of both: if predictors are correlated in groups, α = 0.5 tends to select or drop whole groups of features together. α is a higher-level parameter, and the user may pre-select a value or try several different values. One use of the alpha parameter is numerical stability; for example, an elastic net with α close to 1 behaves much like lasso regression but removes the degeneracies and erratic behavior caused by extremely correlated predictors.
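A minimal sketch comparing the three penalties on simulated data (x_sim and y_sim are made up for illustration):
set.seed(1)
x_sim <- matrix(rnorm(100 * 20), 100, 20)         # 100 observations, 20 predictors
y_sim <- x_sim[, 1:5] %*% rep(1, 5) + rnorm(100)  # only the first 5 predictors matter
fit_ridge <- glmnet(x_sim, y_sim, alpha = 0)    # ridge: shrinks coefficients, none exactly zero
fit_lasso <- glmnet(x_sim, y_sim, alpha = 1)    # lasso: drives some coefficients exactly to zero
fit_enet  <- glmnet(x_sim, y_sim, alpha = 0.5)  # elastic net: a compromise between the two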
1.4 Glmnet code principle
The glmnet algorithm uses cyclic coordinate descent: it optimizes the objective function over each parameter in turn while the others are held fixed, and cycles repeatedly until convergence. The package also utilizes strong rules to efficiently restrict the active set. Thanks to efficient updates and techniques such as warm starts and active-set convergence, the algorithm is very fast.
The code can handle sparse input matrix formats, as well as range restrictions on coefficients. The core code of glmnet is a set of Fortran subroutines, which makes its execution very fast.
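To illustrate the idea (a toy sketch of our own in R, not glmnet's actual Fortran implementation), each coordinate step for the lasso has a closed-form soft-thresholding update:
# Soft-thresholding operator: the closed-form solution of each coordinate step
soft_threshold <- function(z, gamma) sign(z) * pmax(abs(z) - gamma, 0)

# Toy cyclic coordinate descent for the lasso objective
# (1/(2n)) * sum((y - x %*% beta)^2) + lambda * sum(abs(beta))
lasso_cd <- function(x, y, lambda, n_cycles = 100) {
  n <- nrow(x)
  beta <- rep(0, ncol(x))
  for (cycle in seq_len(n_cycles)) {
    for (j in seq_along(beta)) {
      r_j <- y - x[, -j, drop = FALSE] %*% beta[-j]  # partial residual without predictor j
      z_j <- crossprod(x[, j], r_j) / n              # univariate least-squares coefficient
      beta[j] <- soft_threshold(z_j, lambda) / (crossprod(x[, j]) / n)
    }
  }
  beta
}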
1.5 Glmnet installation and loading
Like all other R packages, installing glmnet requires only one line of code.
install.packages("glmnet")
Selecting a nearby CRAN mirror in RStudio can speed up the installation of the package.
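For example (the mirror URL below is just one possibility; any nearby CRAN mirror works):
install.packages("glmnet", repos = "https://mirrors.tuna.tsinghua.edu.cn/CRAN/")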
Then load the package:
library(glmnet)
1.6 Glmnet regression use
Load the example dataset that ships with the glmnet package for demonstration:
data("MultinomialExample")
By default, with no parameters adjusted, glmnet fits a Gaussian (least squares) regression. We first need to decide which regression method to use, such as lasso regression, ridge regression, or elastic net:
fit <- glmnet(data_x, data_y, alpha = 0.5, standardize = TRUE)
- alpha = 0: ridge regression, which performs no variable selection
- alpha = 1: lasso regression, which performs variable selection
- alpha in (0, 1): elastic net regression
fit <- glmnet(MultinomialExample$x, MultinomialExample$y, alpha = 0.5, standardize = TRUE)
standardize = TRUE standardizes the predictors so that differences in scale (units) do not influence the penalty.
fit is an object of class glmnet that contains all the relevant information about the fitted model for further use. Users are not encouraged to extract its components directly; instead, the glmnet package provides methods on the fit object such as plot, print, coef, and predict, which let us carry out these tasks more efficiently.
1.7 Analysis of Glmnet regression results
Print the fit results:
print(fit)
> # print the fit results
> print(fit)
Call: glmnet(x = MultinomialExample$x, y = MultinomialExample$y, alpha = 0, standardize = TRUE)
Df %Dev Lambda
1 30 0.00 267.300
2 30 0.28 243.600
3 30 0.30 221.900
4 30 0.33 202.200
5 30 0.37 184.200
6 30 0.40 167.900
7 30 0.44 153.000
8 30 0.48 139.400
9 30 0.53 127.000
10 30 0.58 115.700
11 30 0.64 105.400
12 30 0.70 96.060
13 30 0.76 87.530
14 30 0.84 79.750
15 30 0.92 72.670
16 30 1.00 66.210
17 30 1.10 60.330
18 30 1.20 54.970
19 30 1.32 50.090
20 30 1.44 45.640
21 30 1.58 41.580
22 30 1.73 37.890
23 30 1.89 34.520
24 30 2.07 31.460
25 30 2.26 28.660
26 30 2.47 26.120
27 30 2.69 23.800
28 30 2.94 21.680
29 30 3.21 19.760
30 30 3.50 18.000
31 30 3.81 16.400
32 30 4.15 14.940
33 30 4.52 13.620
34 30 4.92 12.410
35 30 5.34 11.300
36 30 5.80 10.300
37 30 6.29 9.385
38 30 6.82 8.552
39 30 7.38 7.792
40 30 7.98 7.100
41 30 8.62 6.469
42 30 9.29 5.894
43 30 10.01 5.371
44 30 10.76 4.893
45 30 11.55 4.459
46 30 12.38 4.063
47 30 13.25 3.702
48 30 14.15 3.373
49 30 15.08 3.073
50 30 16.04 2.800
- From left to right, it shows the number of nonzero coefficients (Df), the percentage of (null) deviance explained (%Dev), and the value of lambda (Lambda).
- By default, glmnet fits the model for 100 values of lambda (this sequence can be tuned, as sketched after this list).
- If %Dev does not change sufficiently from one lambda to the next, glmnet considers the fit complete and stops early to save computation.
- For brevity, the printout here has been truncated to show only part of the sequence.
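A brief sketch of tuning the lambda sequence with glmnet's nlambda and lambda.min.ratio arguments (the values here are chosen purely for illustration):
# Ask for a longer, finer lambda sequence than the default 100 values
fit_long <- glmnet(MultinomialExample$x, MultinomialExample$y, alpha = 0.5, nlambda = 200, lambda.min.ratio = 1e-4)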
1.8 Visualization of Glmnet regression results
We can visualize the regression results with plot()
plot(fit)
Elastic net regression:
Lasso regression:
Each curve corresponds to one variable and shows the path of its coefficient against the ℓ1 norm of the whole coefficient vector as λ varies. The axis above indicates the number of nonzero coefficients at the current λ, which is the effective degrees of freedom (df) of the lasso. Users may also wish to annotate the curves; this can be achieved by setting label = TRUE in the plot command.
Ridge regression:
It can be clearly observed that under ridge regression the number of nonzero coefficients never decreases, i.e. no explanatory variables are screened out; in contrast, as lambda changes in the elastic net and lasso fits, some coefficients shrink exactly to zero and those variables drop out of the model.
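Besides label = TRUE, plot accepts an xvar argument that changes what is shown on the x-axis; for example:
plot(fit, xvar = "lambda", label = TRUE)  # coefficient paths against log(lambda), curves labeled
plot(fit, xvar = "dev")                   # paths against the fraction of deviance explained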
We can obtain the model coefficients at one or more values of λ within the range of the sequence with the following code.
Get the coefficient estimates at λ = 0.1:
coef(fit, s = 0.1)
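Note that coef returns the coefficients in a sparse matrix format; a dense matrix can be obtained for easier inspection, e.g.:
cf <- coef(fit, s = 0.1)  # sparse matrix of coefficients at lambda = 0.1
as.matrix(cf)             # convert to an ordinary dense matrix for viewing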
1.9 Glmnet model evaluation method
The glmnet function returns a sequence of models for the user to choose from. In many cases, users may prefer the software to choose one of them. Cross-validation is probably the simplest and most widely used method for this task. cv.glmnet is the main cross-validation function here, along with various supporting methods such as plotting and prediction.
Ten-fold cross-validation for model evaluation:
cvfit <- cv.glmnet(data_x, data_y, nfolds = 10)
cv.glmnet returns a cv.glmnet object, a list containing all the ingredients of the cross-validation fit. As with glmnet, direct use of this object's components is discouraged; instead, use the functions the package provides.
Plot showing the effect of hyperparameters:
plot(cvfit)
This plots the cross-validation curve (dashed red line) along with upper and lower standard deviation curves along the lambda sequence (error bars). Two special values along the lambda sequence are indicated by vertical dashed lines.
1.10 Glmnet selects the best model
Print the best hyperparameters to help the user choose:
lambda.min is the value of λ at which the cross-validation error of the model is smallest; lambda.1se gives the most regularized model whose cross-validation error is within one standard error of the minimum.
print(cvfit$lambda.min)
print(cvfit$lambda.1se)
> print(cvfit$lambda.min)
[1] 0.02866132
> print(cvfit$lambda.1se)
[1] 0.07266689
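These two special lambda values can be passed directly as s to coef (and to predict); for example:
coef(cvfit, s = "lambda.min")  # coefficients at the lambda minimizing CV error
coef(cvfit, s = "lambda.1se")  # coefficients of the most regularized model within 1 SE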
1.11 Glmnet prediction
Set the seed and randomly generate new data:
set.seed(29)
nx <- matrix(rnorm(500 * 30), 500, 30)
Predict with the trained model:
predict(fit, newx = nx, s = c(0.1, 0.05))
> predict(fit, newx = nx, s = c(0.1, 0.05))
s1 s2
[1,] 2.7309311 2.7732605
[2,] 2.2859731 2.2954917
[3,] 2.2992034 2.3208074
[4,] 1.6894255 1.6735020
[5,] 2.5623190 2.5941302
[6,] 1.5733710 1.5388561
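The cross-validated model can be used for prediction in the same way, passing the selected lambda as s (a quick sketch reusing nx from above):
predict(cvfit, newx = nx, s = "lambda.min")  # predictions at the CV-selected lambda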