Least-squares solutions and the best least-squares solution of compatible/incompatible non-homogeneous linear equations

First, the concept of compatible and incompatible non-homogeneous linear equations:

(1) The necessary and sufficient condition for the linear system Ax=b to be solvable is rank(A,b)=rank(A). In that case, Ax=b is called a compatible (consistent) non-homogeneous linear system of equations;

(2) For Ax=b, if rank(A,b)≠rank(A), that is, b∉R(A), the system Ax=b has no solution. In that case, Ax=b is called an incompatible (inconsistent) non-homogeneous linear system of equations.
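
The rank test above can be tried numerically. The following is only a minimal sketch, assuming NumPy is available; the helper name is_compatible, the matrix A and the two right-hand sides are made up for illustration, and the explicit tolerance is an assumption chosen so that rounding noise is not counted as rank.

```python
import numpy as np

def is_compatible(A, b, tol=1e-10):
    """Return True when rank(A, b) == rank(A), i.e. when Ax = b is solvable."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    augmented = np.hstack([A, b])  # the augmented matrix (A, b)
    rank_A = np.linalg.matrix_rank(A, tol=tol)
    rank_aug = np.linalg.matrix_rank(augmented, tol=tol)
    return bool(rank_A == rank_aug)

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

b_in_range = np.array([1.0, 2.0, 3.0])  # 1*column1 + 2*column2, so b is in R(A)
b_outside = np.array([1.0, 2.0, 0.0])   # not a combination of the columns of A

print(is_compatible(A, b_in_range))
print(is_compatible(A, b_outside))
```

The second right-hand side is the kind of b that the rest of the post deals with: no exact solution exists, so an approximate one has to be chosen.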

With the terminology in place, we can begin.


Question: Suppose x_1, x_2, ..., x_n and y satisfy a linear relationship, where x_1, x_2, ..., x_n are n independent variables and y is the dependent variable, so that y = a_1 x_1 + ... + a_n x_n. There are s groups of observations in total, as follows:
[Figure: table of the s groups of observations of x_1, ..., x_n and y]
Find a_1, a_2, ..., a_n.

The solution process is as follows:

First of all, a system of equations can be constructed from the observations:
[Figure: the linear system in the unknowns a_1, ..., a_n, one equation per observation]
This is a system of linear equations. If it has a solution, we are done, but it is very likely that it has no solution.

When there is no solution, only an approximate solution can be obtained. If the error of the approximate solution is required to be as small as possible, an error function f(a_1, ..., a_n) can be constructed:
[Figure: the approximate solution and the error function f]
The task then reduces to finding the minimum of f.

The conventional method is to take the partial derivatives and set them to 0, that is, ∂f/∂a_i = 0 for i=1,2,...,n. Solving these n equations gives a_1, a_2, ..., a_n.
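
As a rough numerical sketch of this conventional approach: the original figure is not available, so the snippet assumes that f is the sum of squared errors over the s observations. It then checks that all partial derivatives vanish at the minimiser. The data X, y and the helper names f and grad_f are hypothetical, made up for illustration.

```python
import numpy as np

# Hypothetical data: s = 4 observations of n = 2 independent variables.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.9])

def f(a):
    """Assumed error function: sum of squared errors over the s observations."""
    residual = X @ a - y
    return float(residual @ residual)

def grad_f(a):
    """Vector of the partial derivatives df/da_i for i = 1..n."""
    return 2.0 * X.T @ (X @ a - y)

# np.linalg.lstsq returns a minimiser of ||X a - y||, hence of f.
a_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(a_hat)
print(grad_f(a_hat))  # numerically zero at the minimiser

# Moving away from a_hat can only increase f.
a_shifted = a_hat + np.array([0.1, -0.05])
print(f(a_hat), f(a_shifted))
```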

Below, the projection theory of inner product spaces is used instead to solve the incompatible system Ax=b.

Let the s×n matrix A be:
[Figure: the matrix A of observed values of the independent variables]
Write A=[α_1,α_2,…,α_n], where α_i is the i-th column vector of A, i=1,2,…,n. Let the vector b=[y_1,y_2,…,y_s]^T and x=[a_1,a_2,…,a_n]^T. Then the system is Ax=b, that is,
[Figure: the system Ax=b written out]
Let W=span{α_1,α_2,…,α_n}, with α_i∈R^s, i=1,2,…,n. Then W is a linear subspace of the s-dimensional vector space R^s, and b∈R^s. If the vector b lies in the subspace W, the system has a solution; otherwise it has no solution, which is the following situation:
[Figure: the vector b lying outside the subspace W]
Set the vector y=Ax. The problem now becomes: find x such that the distance d(b, Ax) between the vector Ax and the vector b is smallest. Among all vectors in the subspace W, the projection of b onto W is the one closest to b, so when y=Ax is the projection of b onto W, x is an approximate solution of Ax=b.

Since y is now the projection of b onto W, we have b−y⊥W, that is, b−y⊥span{α_1,α_2,…,α_n}, so b−y⊥α_i for i=1,2,…,n.
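
The projection argument can be checked on a small case. This is only a sketch with a made-up A and b; it uses np.linalg.lstsq to obtain x, and then verifies that b−y is orthogonal to every column of A and that a few randomly perturbed vectors in W are no closer to b.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 0.0])  # not in W = span of the columns of A

x, *_ = np.linalg.lstsq(A, b, rcond=None)
y = A @ x                      # projection of b onto W

# b - y is orthogonal to every column alpha_i, so all inner products vanish.
inner_products = A.T @ (b - y)
print(inner_products)

# Other vectors of W (obtained by perturbing x) are at least as far from b.
rng = np.random.default_rng(0)
best = np.linalg.norm(b - y)
others = [np.linalg.norm(b - A @ (x + rng.normal(size=2))) for _ in range(5)]
print(best, min(others))
```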

In terms of the inner product: <α_i,b−Ax>=0, so <α_i,b>=<α_i,Ax>=<α_i,a_1α_1+…+a_nα_n>, and therefore <α_i,b>=a_1<α_i,α_1>+…+a_n<α_i,α_n>, where i=1,2,…,n. Written as a matrix equation:
[Figure: the n×n matrix with entries <α_i,α_j>, multiplied by x, equals the vector with entries <α_i,b>]
that is, G(α_1,α_2,…,α_n)·x=G(α_1,α_2,…,α_n;b), where G(α_1,α_2,…,α_n) is the Gram matrix of α_1,α_2,…,α_n, and G(α_1,α_2,…,α_n;b) is the co-Gram matrix of α_1,α_2,…,α_n and b.

Note that G(α_1,α_2,…,α_n) is not necessarily invertible, because the columns α_1,α_2,…,α_n of A are not necessarily linearly independent. When they are linearly independent, x=G(α_1,α_2,…,α_n)^{-1}·G(α_1,α_2,…,α_n;b) directly gives the solution x=[a_1,a_2,…,a_n]^T. But whether or not the columns are linearly independent, the following conclusions hold.
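
For the linearly independent case, the Gram system can be solved directly. A minimal sketch with a made-up A and b follows; the variable names G and g are just labels for the Gram matrix and the co-Gram vector, and np.linalg.solve is used instead of forming the inverse explicitly.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])      # columns alpha_1, alpha_2 are linearly independent
b = np.array([1.0, 2.0, 0.0])

G = A.T @ A   # Gram matrix: G[i, j] = <alpha_i, alpha_j>
g = A.T @ b   # co-Gram vector: g[i] = <alpha_i, b>

x = np.linalg.solve(G, g)  # valid only because G is invertible here
print(x)

# The residual b - Ax is orthogonal to every column of A.
print(A.T @ (b - A @ x))
```

If the columns were linearly dependent, np.linalg.solve would fail or be unreliable on the singular G, which is why the post goes on to argue that the system is nevertheless always solvable.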

First, the system G(α_1,α_2,…,α_n)·x=G(α_1,α_2,…,α_n;b), that is, A^H Ax=A^H b, is still a non-homogeneous linear system. The relationship between the rank of its coefficient matrix A^H A and the rank of its augmented matrix [A^H A, A^H b] is discussed below.

First of all, rank(A^H A)≤rank(A^H A,A^H b), since appending columns to a matrix can never decrease its rank.

Moreover, rank(A^H A,A^H b)=rank[A^H (A,b)]≤rank(A^H), since the rank of a product of two matrices is at most the rank of either factor.

Over the real numbers, rank(A)=rank(A^H)=rank(A^T).

Everything below is over the real numbers, but A^H is written in place of A^T.

Any solution x of Ax=0 clearly satisfies A^H Ax=0, that is, every solution of Ax=0 is a solution of A^H Ax=0. Conversely, if x is a solution of A^H Ax=0, left-multiplying both sides of A^H Ax=0 by the conjugate transpose x^H of x gives x^H A^H Ax=0, that is, <Ax,Ax>=0, so Ax=0. Hence every solution of A^H Ax=0 is also a solution of Ax=0. In summary, A^H Ax=0 and Ax=0 have the same solution set, and since both coefficient matrices have n columns, their ranks are equal: rank(A^H A)=rank(A).
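
The equality rank(A^H A)=rank(A) can be observed on a deliberately rank-deficient example. This is a sketch only: the matrix A, the null-space vector z and the helper rank (with an explicit tolerance, an assumption to keep rounding noise from counting as rank) are made up for illustration.

```python
import numpy as np

def rank(M, tol=1e-10):
    """Numerical rank with an explicit tolerance."""
    return int(np.linalg.matrix_rank(M, tol=tol))

# Third column = first + second, so the columns are linearly dependent.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])

rank_A = rank(A)
rank_AtA = rank(A.T @ A)
print(rank_A, rank_AtA)

# z solves Ax = 0, and therefore also A^T A x = 0.
z = np.array([1.0, 1.0, -1.0])
print(A @ z, A.T @ A @ z)
```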

Since rank(A)=rank(A^H), we have rank(A^H A)=rank(A)=rank(A^H). Also, because rank(A^H A,A^H b)=rank[A^H (A,b)]≤rank(A^H)=rank(A)=rank(A^H A), we get rank(A^H A,A^H b)≤rank(A^H A). Together with rank(A^H A)≤rank(A^H A,A^H b), it follows that rank(A^H A)=rank(A^H A,A^H b).

That is, for the system A^H Ax=A^H b, the rank of the coefficient matrix A^H A equals the rank of the augmented matrix [A^H A, A^H b], so A^H Ax=A^H b always has a solution. In other words, G(α_1,α_2,…,α_n)·x=G(α_1,α_2,…,α_n;b) always has a solution, and any solution x of this system is an approximate solution of Ax=b, which solves the problem posed at the beginning of this post.
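
To see this conclusion at work, the sketch below takes a made-up A with linearly dependent columns and a b outside R(A), checks that the coefficient and augmented ranks of A^T Ax=A^T b agree, and exhibits two different solutions with the same residual. The names AtA, Atb, x1 and x2 and the explicit tolerances are assumptions for illustration.

```python
import numpy as np

def rank(M, tol=1e-10):
    """Numerical rank with an explicit tolerance."""
    return int(np.linalg.matrix_rank(M, tol=tol))

# Third column = first + second; b is not in R(A), so Ax = b is incompatible.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 0.0, 1.0])

AtA = A.T @ A
Atb = A.T @ b

rank_coeff = rank(AtA)
rank_aug = rank(np.column_stack([AtA, Atb]))
print(rank_coeff, rank_aug)

# AtA is singular, so one solution is obtained with lstsq rather than solve.
x1, *_ = np.linalg.lstsq(AtA, Atb, rcond=1e-10)
# Adding a null-space vector of A gives a second solution.
x2 = x1 + np.array([1.0, 1.0, -1.0])

print(np.linalg.norm(A @ x1 - b), np.linalg.norm(A @ x2 - b))
```

Both x1 and x2 solve the system and give the same distance to b, which is what motivates asking for a "best" one among them.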

Can we do better than just any solution? And what makes one solution better than another?

(1) For a compatible system Ax=b, the solution with the smallest modulus (2-norm) among all solutions x is called the minimum-norm solution of Ax=b, where the 2-norm of x is ||x||=sqrt(x^H x).

(2) For incompatible systems, we still want a "solution", and we require it to be a least-squares solution, or better, the best least-squares solution of the system. Both notions are explained below.

Suppose A∈C^{m×n} and b∈C^m. If an n-dimensional column vector x_0 satisfies ||Ax_0−b||_2≤||Ax−b||_2 for every n-dimensional column vector x, then x_0 is called a least-squares solution of the system Ax=b. If μ is a least-squares solution and ||μ||≤||x_0|| holds for every least-squares solution x_0, then μ is called the best least-squares solution, or the minimum-norm least-squares solution.
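
The difference between a least-squares solution and the best one shows up as soon as A has a nontrivial null space. The sketch below uses a made-up A and b; it relies on the documented behaviour of np.linalg.lstsq, which returns the minimiser of smallest 2-norm when several minimisers exist. The cutoff rcond=1e-10 is an assumption to keep the zero singular value from being treated as nonzero.

```python
import numpy as np

# Third column = first + second, so least-squares solutions are not unique.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 0.0, 1.0])

# mu: the minimum-norm least-squares solution.
mu, *_ = np.linalg.lstsq(A, b, rcond=1e-10)

# z is in the null space of A, so x0 = mu + z has the same residual as mu.
z = np.array([1.0, 1.0, -1.0])
x0 = mu + z

res_mu = np.linalg.norm(A @ mu - b)
res_x0 = np.linalg.norm(A @ x0 - b)
print(res_mu, res_x0)                          # equal residuals
print(np.linalg.norm(mu), np.linalg.norm(x0))  # mu has the smaller norm
```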

Theorem: Suppose A∈C^{m×n} and B∈C^{n×m}. Then the following two propositions are equivalent:

(1) For any given b∈C^m, x=Bb is a least-squares solution of Ax=b;

(2) (AB)^H=AB and ABA=A.

The theorem shows that x=Bb is a least-squares solution of Ax=b whenever B is a generalized inverse A^- of A (that is, ABA=A) that also satisfies (AB)^H=AB.
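
One way to get a feel for the theorem is to build a B that satisfies condition (2) and check that x=Bb solves A^T Ax=A^T b. The construction B = A^+ + (I − A^+ A)Z with an arbitrary Z is not from the post; it is an assumption chosen for illustration because it satisfies (2) without being the Moore-Penrose inverse itself. A^+ is computed with np.linalg.pinv, with an explicit cutoff rcond=1e-10.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 0.0, 1.0])

A_plus = np.linalg.pinv(A, rcond=1e-10)

rng = np.random.default_rng(0)
Z = rng.normal(size=(3, 4))                 # arbitrary n x m matrix
B = A_plus + (np.eye(3) - A_plus @ A) @ Z   # hypothetical B satisfying (2)

# Condition (2); in the real case the conjugate transpose is the transpose.
ab_symmetric = np.allclose((A @ B).T, A @ B)
aba_equals_a = np.allclose(A @ B @ A, A)
print(ab_symmetric, aba_equals_a)

# Proposition (1): x = Bb solves A^T A x = A^T b, so it is a least-squares solution.
x = B @ b
solves_normal_equations = np.allclose(A.T @ A @ x, A.T @ b)
print(solves_normal_equations)
```

Such an x is a least-squares solution but in general not the one of smallest norm.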

Finally, x=A^+ b is the best least-squares solution of Ax=b, where A^+ is the plus inverse (Moore-Penrose inverse) of the matrix A.
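
In NumPy the plus inverse is available as np.linalg.pinv. The sketch below, on a made-up incompatible system with linearly dependent columns, compares x=A^+ b with the minimum-norm result of np.linalg.lstsq and with another least-squares solution; the explicit cutoff rcond=1e-10 is an assumption.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 0.0, 1.0])

x_best = np.linalg.pinv(A, rcond=1e-10) @ b          # x = A^+ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=1e-10)     # minimum-norm least squares

print(x_best)
print(x_lstsq)

# Another least-squares solution: shift by a null-space vector of A.
x_other = x_best + np.array([1.0, 1.0, -1.0])
print(np.linalg.norm(x_best), np.linalg.norm(x_other))
```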


To sum up:

(1) For a compatible non-homogeneous linear system Ax=b, the best solution is the minimum-norm solution.

(2) For an incompatible non-homogeneous linear system Ax=b, the best solution is the best least-squares solution x=A^+ b.
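
For case (1), a small sketch of a compatible system with infinitely many solutions follows. It uses the standard fact, not stated in the post, that A^+ b is also the minimum-norm solution when Ax=b is compatible. The matrix A, the vector b and the particular solution x_p are made up for illustration.

```python
import numpy as np

# Two equations, three unknowns: compatible, with infinitely many solutions.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 2.0])

x_min = np.linalg.pinv(A) @ b      # minimum-norm solution
x_p = np.array([2.0, 0.0, 2.0])    # another exact solution

print(A @ x_min, A @ x_p)          # both reproduce b
print(np.linalg.norm(x_min), np.linalg.norm(x_p))
```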


Origin blog.csdn.net/qq_40061206/article/details/114105804