Least squares solutions of the linear equations AX = b and AX = 0, and of nonlinear equations (turning equation solving into optimization problems)

1. The least squares solution of the non-homogeneous linear equation system AX=b

[Figure: the linear system AX = b, where A is a p×q coefficient matrix (p equations in q unknowns)]
An overdetermined system generally has no exact solution: it contains more constraints than unknowns, and some of its equations contradict one another, i.e. the conditions they describe are incompatible and cannot all hold at the same time (for example, the two equations x = 1 and x = 2 cannot both be satisfied). Solving an overdetermined system is therefore really a fitting problem. The basic idea is to minimize the sum of the squared errors of all the equations and take that minimizer as the optimal solution.

Full column rank of the matrix A means that the q unknowns in x are independent of one another (no column of A can be written as a linear combination of the other columns), so all q unknowns genuinely have to be determined; we cannot claim that finding q−1 of them already determines all q.

Now consider the second case in which the columns of matrix A have full rank: p = q. Here A is square, and since its columns have full rank it is a full-rank square matrix, so its determinant is nonzero. In this case the equation Ax = b is easy to solve: multiplying both sides on the left by the inverse of A gives the unique solution x = A⁻¹b.
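As a quick illustration of this square, full-rank case (the 2×2 system below is made up for the example and is not from the original post):

```python
import numpy as np

# A made-up square, full-rank system A x = b (p = q = 2).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Since det(A) != 0, the solution is unique: x = A^{-1} b.
# np.linalg.solve is preferred over forming the inverse explicitly.
x = np.linalg.solve(A, b)
print(x)  # [0.8 1.4]
```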

What we really care about is the least squares solution, which is the third case with the columns of matrix A of full rank: p > q. Here p > q means that the constraints given are more numerous than the unknowns being asked for. In this case we define an energy function E(x) = ‖Ax − b‖², which measures how far Ax is from b: the closer the total error Ax − b is to zero, the better. The x that minimizes E(x) is called the least squares solution. The two vertical bars denote the modulus of a vector, i.e. the L2 norm.
[Figure: the least squares objective E(x) and solution method 1, the normal-equation solution]
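Since the figure itself is not reproduced here, the following is the textbook derivation of what "solution method 1" refers to (the standard normal-equation result, written with right-hand side b as in the section heading):

$$
E(x) = \|Ax - b\|_2^2, \qquad
\nabla E(x) = 2A^{\mathsf{T}}(Ax - b) = 0
\;\Rightarrow\; A^{\mathsf{T}}A\,x = A^{\mathsf{T}}b
\;\Rightarrow\; x^{*} = (A^{\mathsf{T}}A)^{-1}A^{\mathsf{T}}b .
$$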

Therefore, for the linear least squares problem, as long as AᵀA is non-singular, it can be solved with solution method 1 above, i.e. the normal equation (full column rank of A already guarantees this: full column rank of A ⇒ AᵀA has full rank ⇒ AᵀA is invertible, i.e. non-singular). Whether AᵀA is invertible thus comes down to whether A is a full-rank matrix (PS: whether or not A is square, its column rank always equals its row rank, so full rank can be judged from either the columns or the rows). If A is not full rank, the constraints are insufficient and this method fails; if AᵀA is invertible, the problem has a unique solution! (PS: a non-singular matrix is usually also called an invertible matrix; the two are equivalent concepts and mean that the determinant of the square matrix is not equal to zero.)
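A small numerical sketch of this case (the 3×2 system below is made up for illustration, not taken from the original post):

```python
import numpy as np

# Overdetermined system (p = 3 equations, q = 2 unknowns); columns of A are full rank.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Solution method 1: the normal equation x = (A^T A)^{-1} A^T b.
x_normal = np.linalg.inv(A.T @ A) @ A.T @ b

# np.linalg.lstsq solves the same least squares problem, but more stably
# (it avoids explicitly inverting A^T A).
x_lstsq, residual, rank, svals = np.linalg.lstsq(A, b, rcond=None)

print(x_normal)  # ≈ [0.6667, 0.5]
print(x_lstsq)   # same values
```

In practice the lstsq/QR route is preferred over explicitly forming (AᵀA)⁻¹, but both compute the same unique least squares solution when A has full column rank.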

2. The least squares solution of the homogeneous linear equation system AX=0

[Figure: the homogeneous linear system AX = 0]

Here, too, what we care about is the least squares solution, i.e. the third case p > q with the columns of matrix A of full rank. If we apply solution method 1 above, take the partial derivative with respect to x and set it to zero, we find that the resulting unknown vector x is just the zero vector. In most cases, however, we are not interested in the zero solution; what we want is a non-zero solution. So we must add a constraint on X (typically the unit-norm constraint ‖X‖ = 1) and look for the X that satisfies the constraint while making ‖AX‖² as small as possible. This turns the problem into a constrained least squares problem.

[Figure: the constrained least squares problem min ‖AX‖² subject to a constraint on X, and its solution]
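The figure is not reproduced here, but the standard result for this constrained problem (assuming the usual unit-norm constraint ‖X‖ = 1) is that the minimizer is the eigenvector of AᵀA with the smallest eigenvalue, equivalently the right singular vector of A associated with its smallest singular value. A minimal NumPy sketch with a made-up matrix:

```python
import numpy as np

# Made-up overdetermined homogeneous system A x = 0 (p = 4, q = 3).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.1, 6.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.1]])

# SVD: A = U diag(s) Vt.  The row of Vt belonging to the smallest singular
# value minimizes ||A x||^2 over all unit vectors x (||x|| = 1).
U, s, Vt = np.linalg.svd(A)
x = Vt[-1]                        # last right singular vector, already unit norm

print(np.linalg.norm(x))          # 1.0
print(np.linalg.norm(A @ x))      # smallest achievable ||A x|| under ||x|| = 1
```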

3. Least squares solution of nonlinear equations

For problems solved by nonlinear least squares, the unknowns in X enter the equations through nonlinear functions. Because a matrix multiplying X can only express a linear transformation of X, it cannot represent such nonlinear transformations, so the problem can no longer be written in the form AX = b. (PS: the linear transformation of a matrix can describe rotation, scaling, projection and other such transformations of a vector.)
[Figure: the nonlinear least squares problem formulation]
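By way of illustration (the model, the data, and the use of SciPy are all assumptions on my part, not from the original post), such a problem is handed to an iterative solver that minimizes the sum of squared residuals starting from an initial guess:

```python
import numpy as np
from scipy.optimize import least_squares

# Made-up data roughly following y = a * exp(b * t); a and b enter nonlinearly,
# so the residuals cannot be written as a matrix times the unknowns.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 2.7, 3.6, 4.9, 6.7])

def residuals(params):
    a, b = params
    return a * np.exp(b * t) - y   # one residual per equation

# least_squares iteratively minimizes sum(residuals**2) from an initial guess.
result = least_squares(residuals, x0=np.array([1.0, 0.1]))
print(result.x)     # fitted (a, b)
print(result.cost)  # 0.5 * sum of squared residuals at the solution
```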

Reference: least squares solution to a system of linear equations


Origin: blog.csdn.net/Rolandxxx/article/details/126585851