We won't say much about ordinary least squares itself. What follows introduces some more advanced variants of the least squares method.
- Regularized least squares
When ordinary least squares is used for regression analysis, it often runs into overfitting: the model performs well on the training set but poorly on the test set. In that case a regularization term needs to be added to the least squares objective. There are two common types of regularization.
L2 regularization (Ridge regression):

$$J(w) = \sum_{i=1}^{n} \left( y_i - w^{T} x_i \right)^2 + \lambda \|w\|_2^2$$

L1 regularization (Lasso regression):

$$J(w) = \sum_{i=1}^{n} \left( y_i - w^{T} x_i \right)^2 + \lambda \|w\|_1$$
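As a minimal sketch of how the L2 penalty changes the solution: ridge regression still has a closed form, while the L1 penalty is not differentiable at zero and is usually solved iteratively (e.g. by coordinate descent). The function name `ridge_fit` and the regularization weight `lam` below are illustrative choices, not from the original text:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y.

    Adding lam * I shrinks the weights and also makes the normal
    equations solvable even when X^T X is singular.
    """
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Tiny usage example on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
w_true = np.array([1.0, -2.0, 0.0, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.standard_normal(100)
print(ridge_fit(X, y, lam=0.5))  # close to w_true, slightly shrunk toward zero
```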
Regularization can also be explained from a probabilistic perspective: adding a regularization term is equivalent to placing a prior distribution on the parameters $w$. If the prior is a Gaussian distribution, we get L2 regularization; if it is a Laplace distribution, we get L1 regularization. By restricting the parameter space, regularization controls the complexity of the model and thereby prevents overfitting.
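To make that argument concrete, here is the one-line MAP derivation (with $\lambda$ absorbing the noise and prior scale parameters):

$$\hat{w} = \arg\max_{w}\; p(y \mid X, w)\, p(w) = \arg\min_{w}\; \left[ -\log p(y \mid X, w) - \log p(w) \right]$$

With Gaussian noise the first term is the squared error; a Gaussian prior $p(w) \propto e^{-\|w\|_2^2 / 2\tau^2}$ makes the second term $\lambda \|w\|_2^2$ (ridge), while a Laplace prior $p(w) \propto e^{-\|w\|_1 / b}$ makes it $\lambda \|w\|_1$ (lasso).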
- Damped least squares method (Levenberg–Marquardt algorithm, LMA)
The least squares method we usually apply fits a linear model $y = w^{T}x$, but for a nonlinear function we need the damped least squares method, which is essentially an iterative solution process. The basic idea is to linearize the nonlinear function with a Taylor expansion.

Let the model be $y = f(x, c)$, where $x$ is the variable and $c$ is the vector of parameters to be fitted. Given data points $(x_i, y_i)$, we want to find a set of parameters $c$ such that:

$$\min_{c} \sum_{i=1}^{m} \left( y_i - f(x_i, c) \right)^2$$

Taylor-expanding $f$ around the current parameter estimate and keeping only the first-order term, we get:

$$f(x_i, c + \Delta c) \approx f(x_i, c) + \sum_{j} J_{ij}\, \Delta c_j$$

where $J$ is the Jacobian matrix:

$$J_{ij} = \frac{\partial f(x_i, c)}{\partial c_j}$$

Thereby, writing the residual vector as $r_i = y_i - f(x_i, c)$, the step $\Delta c$ can be solved from the damped normal equations

$$\left( J^{T} J + \lambda I \right) \Delta c = J^{T} r$$

where $\lambda$ is the damping factor, and the parameters are updated iteratively, $c \leftarrow c + \Delta c$, until $\|\Delta c\|$ (or the decrease in the squared error) falls below a tolerance.
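Below is a minimal, self-contained sketch of this iteration, using the standard LMA heuristic of shrinking the damping factor after a successful step and growing it after a failed one. The function names and the example model (an exponential decay) are illustrative assumptions, not from the original text:

```python
import numpy as np

def levenberg_marquardt(f, jac, x, y, c0, lam=1e-3, tol=1e-8, max_iter=100):
    """Minimal LMA sketch: fit y ~ f(x, c) by iterating
    (J^T J + lam * I) dc = J^T r, with adaptive damping lam."""
    c = np.asarray(c0, dtype=float)
    for _ in range(max_iter):
        r = y - f(x, c)                   # residuals at current parameters
        J = jac(x, c)                     # Jacobian of f w.r.t. c, shape (m, n)
        A = J.T @ J + lam * np.eye(len(c))
        dc = np.linalg.solve(A, J.T @ r)  # damped normal equations
        c_new = c + dc
        # Accept the step only if it reduces the squared error.
        if np.sum((y - f(x, c_new)) ** 2) < np.sum(r ** 2):
            c, lam = c_new, lam * 0.5     # good step: relax damping
        else:
            lam *= 2.0                    # bad step: increase damping, retry
        if np.linalg.norm(dc) < tol:
            break
    return c

# Example: fit an exponential decay y = c0 * exp(-c1 * x).
f = lambda x, c: c[0] * np.exp(-c[1] * x)
jac = lambda x, c: np.column_stack([np.exp(-c[1] * x),
                                    -c[0] * x * np.exp(-c[1] * x)])
x = np.linspace(0, 4, 50)
rng = np.random.default_rng(0)
y = 2.5 * np.exp(-1.3 * x) + 0.02 * rng.standard_normal(x.size)
print(levenberg_marquardt(f, jac, x, y, c0=[1.0, 1.0]))  # approx [2.5, 1.3]
```

When $\lambda$ is small the step approaches a Gauss–Newton step; when $\lambda$ is large it approaches a short gradient-descent step, which is what makes the method robust far from the optimum.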