Regularized Linear Models
1. Ridge Regression (also known as Tikhonov regularization)

- Linear regression + L2 regularization

Ridge regression is a regularized version of linear regression: an L2 regularization term is added to the cost function of ordinary linear regression, so that the model fits the data while keeping the weights as small as possible.

objective function = loss function + regularization term

Ridge regression cost function:

J(θ) = MSE(θ) + α · (1/2) · Σᵢ₌₁ⁿ θᵢ²

where:

- α = 0: ridge regression degenerates to ordinary linear regression
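As a minimal sketch of the above, ridge regression with scikit-learn's `Ridge`, where the `alpha` parameter is the regularization strength α (the toy data below is invented for illustration):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Invented toy data: y ≈ 1 + 0.5x plus a little noise
rng = np.random.RandomState(42)
X = 2 * rng.rand(100, 1)
y = 1 + 0.5 * X[:, 0] + 0.1 * rng.randn(100)

# alpha is the regularization strength; alpha=0 would recover plain linear regression
ridge = Ridge(alpha=1.0)
ridge.fit(X, y)
print(ridge.coef_, ridge.intercept_)
```

With `alpha > 0` the fitted slope comes out slightly smaller than the unregularized least-squares slope, which is the "weights as small as possible" effect described above.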
2. Lasso Regression

- Linear regression + L1 regularization

Lasso regression is another regularized version of linear regression: its regularization term is the ℓ1 norm of the weight vector.

Lasso regression cost function:

J(θ) = MSE(θ) + α · Σᵢ₌₁ⁿ |θᵢ|
【note】

- The Lasso regression cost function is non-differentiable at θᵢ = 0.
- Solution: at θᵢ = 0, replace the gradient with a subgradient vector, for example:

g(θ, J) = ∇θ MSE(θ) + α · (sign(θ₁), sign(θ₂), …, sign(θₙ))ᵀ, where sign(θᵢ) = −1 if θᵢ < 0, 0 if θᵢ = 0, and 1 if θᵢ > 0
Lasso regression has a very important property: it tends to completely eliminate the weights of unimportant features.

For example, when α is relatively large, a high-order polynomial model degenerates to a linear one: the weights of the high-order polynomial features are set to zero.

In other words, Lasso regression automatically performs feature selection and outputs a sparse model (only a few of the weights are non-zero).
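A small sketch of this sparsity property, using scikit-learn's `Lasso` on invented data where only the first of five features actually drives the target:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Invented data: 5 features, but only the first one is informative
rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = 3.0 * X[:, 0] + 0.1 * rng.randn(100)

lasso = Lasso(alpha=0.5)
lasso.fit(X, y)
print(lasso.coef_)  # the weights of the four useless features are driven to exactly 0
```

Note that the surviving weight is also shrunk (soft-thresholded) toward zero, so it comes out somewhat below the true value of 3.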
3. Elastic Net

- A fusion of the previous two: ridge regression + Lasso regression

Elastic net is a compromise between ridge regression and Lasso regression, controlled by a mix ratio r:

- r = 0: elastic net becomes ridge regression
- r = 1: elastic net becomes Lasso regression

Elastic net cost function:

J(θ) = MSE(θ) + r · α · Σᵢ₌₁ⁿ |θᵢ| + (1 − r)/2 · α · Σᵢ₌₁ⁿ θᵢ²
In general, plain (unregularized) linear regression should be avoided; the model should get at least some regularization. So how do we choose among these regularization methods?
Summary:

- Common default: ridge regression
- If you suspect that only a small portion of the features are actually useful:
  - Elastic Net
  - Lasso
- Of the two, elastic net is generally the more widely used, because Lasso behaves erratically when the number of features is greater than the number of training samples, or when several features are strongly correlated.
api:

from sklearn.linear_model import Ridge, ElasticNet, Lasso
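A sketch with `ElasticNet` on invented data. Note that in scikit-learn the mix ratio r is called `l1_ratio`, and the penalty is parameterized as `alpha * l1_ratio * ‖θ‖₁ + 0.5 * alpha * (1 − l1_ratio) * ‖θ‖₂²`:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Invented data: y depends on the first two of three features
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.randn(100)

# l1_ratio plays the role of the mix ratio r:
# l1_ratio=0 -> pure L2 penalty (ridge-like), l1_ratio=1 -> pure L1 penalty (Lasso)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5)
enet.fit(X, y)
print(enet.coef_)
```

The recovered weights come out close to (2, −1, 0), mildly shrunk by the combined L1/L2 penalty.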
4. Early Stopping

Strictly speaking, this is not a regularization term added to the cost function; rather, early stopping is a way of regularizing iterative learning algorithms.

The idea: stop training as soon as the validation error reaches its minimum.
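A sketch of the idea with scikit-learn's `SGDRegressor` (the data and hyperparameters below are invented for illustration): train one epoch at a time using `warm_start=True` and record the epoch at which the validation error is lowest. `SGDRegressor` also offers a built-in `early_stopping=True` option that does this internally.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Invented noisy quadratic data
rng = np.random.RandomState(42)
X = 6 * rng.rand(200, 1) - 3
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.randn(200)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5,
                                                  random_state=42)

# warm_start=True: each call to fit() continues from the previous weights,
# so every call adds exactly one more training epoch
sgd = SGDRegressor(max_iter=1, tol=None, warm_start=True,
                   learning_rate="constant", eta0=0.0005, random_state=42)

best_val_error, best_epoch = float("inf"), 0
for epoch in range(500):
    sgd.fit(X_train, y_train)                       # one more epoch
    val_error = mean_squared_error(y_val, sgd.predict(X_val))
    if val_error < best_val_error:                  # new validation minimum
        best_val_error, best_epoch = val_error, epoch

print(best_epoch, best_val_error)
```

In practice you would stop (or roll back to the saved weights) once the validation error has not improved for some number of epochs, rather than training all 500 epochs and looking back.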