Least Squares, Weighted Least Squares, and Iteratively Reweighted Least Squares (with code) [linear least squares solving]



Original Link: Address

Personal Notes:
This article covers least squares, weighted least squares, and iteratively reweighted least squares. With implementation in mind, it mainly presents the derivation results, code, and some practical applications. Articles and materials consulted for the derivations are listed at the end.

A recommended video derivation: using matrix calculus (partial derivatives) to obtain $x = (A^TA)^{-1}A^TB$: Matrix Multiplication Derivation Video

One: Least Squares Method (OLS)

1: Overview

The least squares method is a mathematical optimization technique. It finds the best-fitting function for the data by minimizing the sum of squared errors. Unknown parameters are easily obtained with least squares, and the sum of squared errors between the fitted and actual data is minimized. For example, suppose the objective function $y = a_0 + a_1x + a_2x^2$ has been chosen, the pairs $(x, y)$ are known measured values, $x$ is the independent variable, $y$ is the dependent variable, and $a_0, a_1, a_2$ are three unknown parameters. Three equations would normally be needed to form a system that determines the three unknowns uniquely. In practice, however, we usually estimate the unknown parameters from an overdetermined system (more equations than unknowns). This is where the least squares method comes in to find the optimal solution. The algebraic and matrix solutions are given below; the matrix solution is recommended (it is very convenient).

2: Algebraic formula

The idea of the least squares method is to minimize the sum of the squared differences between the observed values and the values predicted by the model:

$$\min_{a_0,\ldots,a_m} \; \sum_{i=1}^{n} \big(y_i - f(x_i)\big)^2$$
Example: the most basic and most common case of curve fitting is fitting a straight line. Suppose the functional relationship between $x$ and $y$ is the linear function $y = f(a_0, a_1) = a_0 + a_1x$. Algebraic derivation: the sum of squared residuals is

$$F(a_0, a_1) = \sum_{i=1}^{n} \big(y_i - a_0 - a_1 x_i\big)^2$$

Take the partial derivatives of $F$ with respect to the unknown parameters $a_0$ and $a_1$ and set them to zero:

$$\frac{\partial F}{\partial a_0} = -2\sum_{i=1}^{n}\big(y_i - a_0 - a_1 x_i\big) = 0, \qquad \frac{\partial F}{\partial a_1} = -2\sum_{i=1}^{n}\big(y_i - a_0 - a_1 x_i\big)x_i = 0$$

Organize into a system of equations
$$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i, \qquad a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$

Then simplify to:
$$a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \big(\sum x_i\big)^2}, \qquad a_0 = \bar{y} - a_1 \bar{x}$$
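
As a quick check of these closed-form expressions, here is a minimal standalone C++ sketch (plain C++ rather than the Eigen-based code used later in this article; the function name is illustrative):

#include <vector>
#include <cstddef>

// Closed-form simple linear regression for y = a0 + a1*x:
// a1 = (n*Σxy - Σx*Σy) / (n*Σx² - (Σx)²),  a0 = ȳ - a1*x̄
void fitLineClosedForm(const std::vector<double>& x, const std::vector<double>& y,
                       double& a0, double& a1)
{
    const std::size_t n = x.size();
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    a0 = (sy - a1 * sx) / n;   // equivalent to ȳ - a1*x̄
}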

3: Matrix form (recommended)

Example: again the most basic case, straight line fitting. Suppose the functional relationship between $x$ and $y$ is the linear function $y = f(a_0, a_1) = a_0 + a_1x$. In matrix form this is $Ax = B$, and we solve for the parameter vector $x$:

$$A = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad x = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}, \quad B = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

A link to the full derivation is given at the end. The result is:

$$x = (A^TA)^{-1}A^TB$$

For a univariate polynomial function:

$$y = a_0 + a_1x + a_2x^2 + \cdots + a_mx^m$$

where $m$ is the order of the polynomial and $n$ is the number of sampling points, the sum of squared differences between the discrete points and the polynomial is $F(a_0, a_1, \ldots, a_m)$:

$$F(a_0, a_1, \ldots, a_m) = \sum_{i=1}^{n}\Big(y_i - \sum_{j=0}^{m} a_j x_i^j\Big)^2$$

The univariate polynomial case has the same matrix form as the linear case, $Ax = B$:

$$A = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{bmatrix}, \quad x = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix}, \quad B = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad x = (A^TA)^{-1}A^TB$$

3.1: Implementation code

/* Ordinary least squares: Ax = B
 * (A^T * A) * x = A^T * B
 * x = (A^T * A)^-1 * A^T * B
 */
Array<double,Dynamic,1> GlobleFunction::leastSquares(Matrix<double,Dynamic,Dynamic> A, Matrix<double,Dynamic,1> B)
{
    // Number of rows and columns of A
    int rows = A.rows();
    int col = A.cols();

    // Transpose of A
    Matrix<double,Dynamic,Dynamic> AT;
    AT.resize(col,rows);

    // Solution vector x
    Array<double,Dynamic,1> x;
    x.resize(col,1);

    // Transpose
    AT = A.transpose();

    // x = (A^T * A)^-1 * A^T * B
    x = ((AT * A).inverse()) * (AT * B);
    return x;
}

Matrix is the matrix class of the Eigen library; Eigen is used here to make the linear algebra convenient.
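
A short usage sketch (the data values are made up for illustration): fit $y = a_0 + a_1x$ by filling $A$ with a column of ones and a column of $x$ values. The final line applies the same formula as leastSquares above.

#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;

int main()
{
    // Five sample points (illustrative values)
    double xs[] = {0, 1, 2, 3, 4};
    double ys[] = {1.1, 2.9, 5.2, 7.1, 8.8};

    // Build A = [1, x_i] and B = [y_i] for the model y = a0 + a1*x
    Matrix<double, Dynamic, Dynamic> A(5, 2);
    Matrix<double, Dynamic, 1> B(5);
    for (int i = 0; i < 5; ++i) {
        A(i, 0) = 1.0;
        A(i, 1) = xs[i];
        B(i) = ys[i];
    }

    // x = (A^T * A)^-1 * A^T * B, as in leastSquares above
    Matrix<double, Dynamic, 1> x = (A.transpose() * A).inverse() * (A.transpose() * B);
    std::cout << "a0 = " << x(0) << ", a1 = " << x(1) << std::endl;
    return 0;
}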

Two: Weighted Least Squares (WLS)

The weighted least squares method is a mathematical optimization technique that weights the original model so that it becomes a new model without heteroskedasticity, and then estimates its parameters with ordinary least squares. (Baidu Encyclopedia)

1: Introducing the diagonal weight matrix W

On top of the least squares formulation, introduce a diagonal matrix $W$ that gives each data point its own weight. The weighted system $WAx = WB$ is solved in the least squares sense, giving the normal equations

$$(A^T W^T W A)\,x = A^T W^T W B$$

$W^TW$ squares each weight on the diagonal, which eliminates negative values. The solution is

$$x = (A^T W^T W A)^{-1} A^T W^T W B$$

1.1: Implementation code

/* Weighted least squares (WLS); W is a diagonal matrix
 * W²(Ax - B) = 0
 * W²Ax = W²B
 * (A^T * W^T * W * A) * x = A^T * W^T * W * B
 * x = (A^T * W^T * W * A)^-1 * A^T * W^T * W * B
 */
Array<double,Dynamic,1> GlobleFunction::reweightedLeastSquares(Matrix<double,Dynamic,Dynamic> A, Matrix<double,Dynamic,1> B,Array<double,Dynamic,1> vectorW)
{
    // Number of rows and columns of A
    int rows = A.rows();
    int col = A.cols();

    // If vectorW is empty, default to all-ones weights (W = identity)
    if(vectorW.isZero())
    {
        vectorW.resize(rows,1);
        for(int i=0;i<rows;++i)
        {
            vectorW(i,0) = 1;
        }
    }

    // Transpose of A
    Matrix<double,Dynamic,Dynamic> AT;
    AT.resize(col,rows);

    // Solution vector x
    Array<double,Dynamic,1> x;
    x.resize(col,1);

    // W and its transpose
    Matrix<double,Dynamic,Dynamic> WT,W;
    W.resize(rows,rows);
    WT.resize(rows,rows);

    // Build the diagonal matrix from the weight vector
    W = vectorW.matrix().asDiagonal();
    // Transposes
    WT = W.transpose();
    AT = A.transpose();

    // x = (A^T * W^T * W * A)^-1 * A^T * W^T * W * B
    x = ((AT * WT * W * A).inverse()) * (AT * WT * W * B);
    return x;
}

As above, Matrix is the matrix class of the Eigen library, used here for convenient linear algebra.
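
A short usage sketch (the weight values are illustrative, and the call is written as if from inside GlobleFunction, the way the IRLS code below calls it): points believed reliable get larger weights, suspected outliers smaller ones.

// One weight per data row of A; a larger weight gives that point more influence
Array<double, Dynamic, 1> w(5);
w << 1.0, 1.0, 1.0, 1.0, 0.1;   // down-weight the 5th point (suspected outlier)

// Solves x = (A^T W^T W A)^-1 A^T W^T W B with W = diag(w)
Array<double, Dynamic, 1> x = reweightedLeastSquares(A, B, w);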

Three: Iterative Reweighted Least Squares (IRLS)

1: Iteratively reweighted least squares (IRLS, also called iteratively weighted least squares) is used to solve certain optimization problems whose objective function has a $p$-norm form (Wikipedia):

$$\underset{x}{\operatorname{arg\,min}} \; \|Ax - b\|_p^p = \underset{x}{\operatorname{arg\,min}} \; \sum_{i=1}^{n} \big|a_i^T x - b_i\big|^p$$

A fitted model can match the known data overall, but some data points lie far from the bulk of the data, and when they take part in least squares they have a large effect on the estimated parameters. The parameters therefore need to be optimized: give the distant data smaller weights (so they do not look important and their influence stays small) and give the nearby data larger weights (large influence). Iteratively reweighted least squares sets up a weighted least squares problem and iterates it to estimate the optimum.

Each step of the iteration solves a weighted least squares problem of the form:

$$x^{(k+1)} = \underset{x}{\operatorname{arg\,min}} \; \sum_{i=1}^{n} w_i^{(k)} \big|a_i^T x - b_i\big|^2, \qquad w_i^{(k)} = \big|a_i^T x^{(k)} - b_i\big|^{p-2}$$

2: The following iterative method comes from the paper Burrus, C.S. (2014), Iterative Reweighted Least Squares (the link appears in the C++ code below). Each iteration forms the residual $e = Ax - b$, computes weights $w = |e|^{(p_k-2)/2}$ (where the homotopy value $p_k$ moves gradually from 2 toward the target $p$), solves a weighted least squares problem, and, for $p > 2$, applies the partial Newton update $x \leftarrow q\,x_1 + (1-q)\,x$ with $q = 1/(p_k-1)$.
MATLAB code 1:

% m-file IRLS0.m to find the optimal solution to Ax=b
% minimizing the L_p norm ||Ax-b||_p, using basic IRLS.
% csb 11/10/2012
function x = IRLS0(A,b,p,KK)
if nargin < 4, KK=10; end;
x = pinv(A)*b; 					% Initial L_2 solution
E = [];
for k = 1:KK 					% Iterate
	e = A*x - b; 				% Error vector
	w = abs(e).^((p-2)/2); 		% Error weights for IRLS
	W = diag(w/sum(w)); 		% Normalize weight matrix
	WA = W*A; 					% apply weights
	x = (WA'*WA)\(WA'*W)*b; 	% weighted L_2 sol.
	ee = norm(e,p); E = [E ee]; % Error at each iteration
end
plot(E)

MATLAB code 2:

% m-file IRLS1.m to find the optimal solution to Ax=b
% minimizing the L_p norm ||Ax-b||_p, using IRLS.
% Newton iterative update of solution, x, for M > N.
% For 2<p<infty, use homotopy parameter K = 1.01 to 2
% For 0<p<2, use K = approx 0.7 - 0.9
% csb 10/20/2012
function x = IRLS1(A,b,p,K,KK)
if nargin < 5, KK=10; end;
if nargin < 4, K = 1.5; end;
if nargin < 3, p = 10; end;
pk = 2; % Initial homotopy value
x = pinv(A)*b; 						% Initial L_2 solution
E = [];
for k = 1:KK 						% Iterate
	if p >= 2, pk = min([p, K*pk]);	    % Homotopy change of p
	else pk = max([p, K*pk]); end
	e = A*x - b; 						% Error vector
	w = abs(e).^((pk-2)/2); 			% Error weights for IRLS
	W = diag(w/sum(w)); 				% Normalize weight matrix
	WA = W*A; 							% apply weights
	x1 = (WA'*WA)\(WA'*W)*b;		    % weighted L_2 sol.
	q = 1/(pk-1); 						% Newton's parameter
	if p > 2, x = q*x1 + (1-q)*x; nn=p; % partial update for p>2
	else x = x1; nn=2; end 				% no partial update for p<2
	ee = norm(e,nn); E = [E ee]; 		% Error at each iteration
end
plot(E)

C++ code:


/* Iteratively reweighted least squares (IRLS); W holds the weights, p is the norm
 * e = Ax - B
 * W = e^((p-2)/2)
 * W²(Ax - B) = 0
 * W²Ax = W²B
 * (A^T * W^T * W * A) * x = A^T * W^T * W * B
 * x = (A^T * W^T * W * A)^-1 * A^T * W^T * W * B
 * Reference paper: https://www.semanticscholar.org/paper/Iterative-Reweighted-Least-Squares-%E2%88%97-Burrus/9b9218e7233f4d0b491e1582c893c9a099470a73
 */
Array<double,Dynamic,1> GlobleFunction::iterativeReweightedLeastSquares(Matrix<double,Dynamic,Dynamic> A, Matrix<double,Dynamic,1> B,double p,int kk)
{
    /* Partial update: x(k) = q*x1(k) + (1-q)*x(k-1)
     * q = 1 / (p-1)
     */
    // Number of rows and columns of A
    int rows = A.rows();
    int col = A.cols();

    double pk = 2;          // initial homotopy value
    double K = 1.5;         // homotopy rate

    double epsilon = 10e-9; // ε, convergence threshold
    double delta = 10e-15;  // δ, lower bound on residuals
    Array<double,Dynamic,1> x,_x,x1,e,w;
    x.resize(col,1);
    _x.resize(col,1);
    x1.resize(col,1);
    e.resize(rows,1);
    w.resize(rows,1);

    // Initial x: ordinary least squares (diagonal weights W = 1)
    x = reweightedLeastSquares(A,B);

    // Iterate, at most kk times
    for(int i=0;i<kk;++i)
    {
        // Keep the previous x to test for convergence at the end of the step
        _x = x;

        // Homotopy: move pk gradually from 2 toward p
        if(p>=2)
        {
            pk = qMin(p,K*pk);
        }
        else
        {
            pk = qMax(p,K*pk);
        }

        // Residual
        e = (A * x.matrix()) - B;
        // Absolute value of the residual (alternatively: e.cwiseAbs() or e.array().abs().matrix())
        e = e.abs();
        // Clamp residuals below delta to delta (avoids zero weights)
        for(int j=0;j<e.rows();++j)
        {
            e(j,0) = qMax(delta,e(j,0));
        }
        // Raise each residual to the power (pk-2)/2; the homotopy value pk is used, as in the MATLAB code
        w = e.pow(pk/2.0-1);
        w = w / w.sum();

        // Weighted least squares step
        x1 = reweightedLeastSquares(A,B,w);

        double q = 1 / (pk-1);
        if(p>2)
        {
            // Partial (Newton-style) update for p > 2
            x = x1*q + x*(1-q);
        }
        else
        {
            // Full update for p <= 2
            x = x1;
        }
        // Converged to the required precision: stop early
        if((x-_x).abs().sum()<epsilon)
        {
            return x;
        }
    }
    return x;
}

The C++ implementation essentially follows the MATLAB code, with slight improvements (a lower bound δ on the residuals and an early convergence test); it draws on Wikipedia and Burrus, C.S. (2014), Iterative Reweighted Least Squares.
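
A short usage sketch (the parameter values are illustrative): a norm $p$ closer to 1 makes the fit more robust to outliers, and kk caps the number of iterations.

// Robust fit: minimize ||Ax - B||_p with p = 1.1, at most 50 iterations.
// The closer p is to 1, the less influence large residuals (outliers) have.
Array<double, Dynamic, 1> x = iterativeReweightedLeastSquares(A, B, 1.1, 50);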

Four: Application

The following applications all solve overdetermined systems: the number of data points is greater than the number of unknown parameters.

1: Circle fitting (algorithm: iteratively reweighted least squares)

1: Using plain least squares, the external noise still has a fairly large influence on the fit. Below, the fit is refined with the iteratively reweighted least squares algorithm.
(figure: least squares circle fit)
2: Iteratively reweighted least squares
(figures: the fitted circle after the 1st, 2nd, 3rd, 4th, ..., and 20th iterations)
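
The article does not list its circle-fitting code. One common way to pose circle fitting as a linear least squares problem (a sketch under that assumption, not necessarily the author's exact formulation) is to expand $(x-a)^2 + (y-b)^2 = r^2$ into $2ax + 2by + c = x^2 + y^2$ with $c = r^2 - a^2 - b^2$, which is linear in $a$, $b$, $c$:

#include <Eigen/Dense>
#include <cmath>
using namespace Eigen;

// Fit a circle with center (a,b) and radius r to points (xs[i], ys[i])
// by solving [2x 2y 1] * [a b c]^T = x^2 + y^2 in the least squares sense.
void fitCircle(const VectorXd& xs, const VectorXd& ys,
               double& a, double& b, double& r)
{
    const int n = xs.size();
    MatrixXd A(n, 3);
    VectorXd B(n);
    for (int i = 0; i < n; ++i) {
        A(i, 0) = 2.0 * xs(i);
        A(i, 1) = 2.0 * ys(i);
        A(i, 2) = 1.0;
        B(i)    = xs(i) * xs(i) + ys(i) * ys(i);
    }
    // Ordinary least squares via the normal equations
    Vector3d sol = (A.transpose() * A).ldlt().solve(A.transpose() * B);
    a = sol(0);
    b = sol(1);
    r = std::sqrt(sol(2) + a * a + b * b);   // since c = r^2 - a^2 - b^2
}

The IRLS variant then reweights the rows of this same linear system using the residuals at each iteration, as in the code above.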

2: Straight line fitting (algorithm: iteratively reweighted least squares)

1: $y = a_0 + a_1x$. The figure below uses plain least squares; the noisy data far below the line has a large influence on the fit of the overall dense data.
(figure: least squares line fit)
1.2: Using iteratively reweighted least squares
(figures: the fitted line after the 1st, ..., and 100th iterations)
After $n$ iterations, the noise has essentially no effect on the overall fit, and the parameters obtained at this point are good.

3: Curve fitting (algorithm: least squares)

1: Least squares curve fitting: finding the polynomial degree that fits best.
Least squares is a common method for curve fitting, and the choice of fitting function is very important. The fitting function is the curve that passes through the data points so as to match them as well as possible; a poor choice leads to overfitting or underfitting.
(figures: examples of over- and under-fitting)
Degree-1 (linear) fit: (figure)
Degree-2 (quadratic) fit: (figure)
Degree-3 (cubic) fit: (figure)
Degree-4 (quartic) fit: (figure)
Degree-5 (quintic) fit: (figure)
Degree-6 fit: (figure)
Degree-7 fit: (figure)
Degree-8 fit: (figure)
Degree-9 fit: (figure)
The degree-5 polynomial fits the data best; beyond that, the fits become increasingly overfit.
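
A sketch of how the degree-$m$ fits above can be produced with the matrix form of section 3 (a minimal Eigen-based example; the function name is illustrative):

#include <Eigen/Dense>
#include <cmath>
using namespace Eigen;

// Build the Vandermonde matrix A (A(i,j) = x_i^j for j = 0..m) and solve
// A * a = y for the polynomial coefficients a_0..a_m by least squares.
VectorXd fitPolynomial(const VectorXd& x, const VectorXd& y, int m)
{
    const int n = x.size();
    MatrixXd A(n, m + 1);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j <= m; ++j)
            A(i, j) = std::pow(x(i), j);
    // a = (A^T A)^-1 A^T y, as in section 3
    return (A.transpose() * A).ldlt().solve(A.transpose() * y);
}

Calling this with m = 1 through 9 reproduces the sequence of fits shown above.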

4: N-point calibration (including 9-point calibration) (algorithm: least squares)

9-point calibration finds the relationship between pixel coordinates and world coordinates in machine vision.
(figure: halcon calibration example) It can be seen that the halcon operator vector_to_hom_mat2d uses the least squares method to compute the transformation matrix. The external algorithm [2] in the figure is implemented with the least squares algorithm from this article; the internal algorithm [1] is implemented by computing partial derivatives. This article implements N-point calibration.

Five: Summary

1: Tools: mainly Qt + the Eigen library + the QCustomPlot class.
The Eigen library is used for matrix and algebraic computation.
The QCustomPlot class is used for plotting and data visualization.

2: The complete code above has been uploaded to GitHub

3: References
Least Squares: Algebraic Derivation
Least Squares: Matrix Derivation
What is Least Squares?
Understanding absolute-value regularization
Why the robust least squares learning algorithm works well
Understanding the least squares method from the perspective of maximum likelihood


Original article: blog.csdn.net/weixin_43763292/article/details/127839786