7.4 The Singular Value Decomposition

This post is a set of reading notes on *Linear Algebra and Its Applications*.

As we know, not all matrices can be factored as $A = PDP^{-1}$ with $D$ diagonal. However, a factorization $A = QDP^{-1}$ is possible for any $m\times n$ matrix $A$! A special factorization of this type, called the singular value decomposition, is one of the most useful matrix factorizations in applied linear algebra.

The singular value decomposition is based on the following property of the ordinary diagonalization that can be imitated for rectangular matrices: the absolute values of the eigenvalues of a symmetric matrix $A$ measure the amounts that $A$ stretches or shrinks certain vectors (the eigenvectors). If $A\boldsymbol x =\lambda \boldsymbol x$ and $\|\boldsymbol x\| = 1$, then

$$\|A\boldsymbol x\| = \|\lambda\boldsymbol x\| = |\lambda|\,\|\boldsymbol x\| = |\lambda|$$
If $\lambda_1$ is the eigenvalue with the greatest magnitude, then a corresponding unit eigenvector $\boldsymbol v_1$ identifies a direction in which the stretching effect of $A$ is greatest. This description of $\boldsymbol v_1$ and $|\lambda_1|$ has an analogue for rectangular matrices that will lead to the singular value decomposition.

EXAMPLE 1
If $A=\begin{bmatrix}4& 11& 14\\8& 7& -2\end{bmatrix}$, then the linear transformation $\boldsymbol x \mapsto A\boldsymbol x$ maps the unit sphere $\{\boldsymbol x:\|\boldsymbol x\|= 1\}$ in $\mathbb R^3$ onto an ellipse in $\mathbb R^2$, shown in Figure 1. Find a unit vector $\boldsymbol x$ at which the length $\|A\boldsymbol x\|$ is maximized, and compute this maximum length.

[Figure 1: The transformation $\boldsymbol x\mapsto A\boldsymbol x$ maps the unit sphere in $\mathbb R^3$ onto an ellipse in $\mathbb R^2$.]
SOLUTION
Observe that

$$\|A\boldsymbol x\|^2 = (A\boldsymbol x)^T(A\boldsymbol x) = \boldsymbol x^TA^TA\boldsymbol x = \boldsymbol x^T(A^TA)\boldsymbol x\qquad(1)$$
Also, $A^TA$ is a symmetric matrix. So the problem now is to maximize the quadratic form $\boldsymbol x^T(A^TA)\boldsymbol x$ subject to the constraint $\|\boldsymbol x\| = 1$. By Theorem 6 in Section 7.3, the maximum value is the greatest eigenvalue $\lambda_1$ of $A^TA$. Also, the maximum value is attained at a unit eigenvector of $A^TA$ corresponding to $\lambda_1$.

$$A^TA = \begin{bmatrix}4 & 8\\ 11 & 7\\ 14 & -2\end{bmatrix}\begin{bmatrix}4& 11& 14\\8& 7& -2\end{bmatrix} = \begin{bmatrix}80 & 100 & 40\\ 100 & 170 & 140\\ 40 & 140 & 200\end{bmatrix}$$
The eigenvalues of $A^TA$ are $\lambda_1 = 360$, $\lambda_2 = 90$, and $\lambda_3 = 0$. Corresponding unit eigenvectors are, respectively,

$$\boldsymbol v_1=\begin{bmatrix}1/3\\2/3\\2/3\end{bmatrix},\qquad \boldsymbol v_2=\begin{bmatrix}-2/3\\-1/3\\2/3\end{bmatrix},\qquad \boldsymbol v_3=\begin{bmatrix}2/3\\-2/3\\1/3\end{bmatrix}$$

The maximum value of $\|A\boldsymbol x\|^2$ is $360$, attained when $\boldsymbol x$ is the unit vector $\boldsymbol v_1$. Hence the maximum value of $\|A\boldsymbol x\|$ is $\sqrt{360}=6\sqrt{10}$.
The vector $A\boldsymbol v_1=\begin{bmatrix}18\\6\end{bmatrix}$ is a point on the ellipse in Figure 1 farthest from the origin.
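
As a numerical check, here is a minimal NumPy sketch (my own addition, not from the book). `numpy.linalg.eigh` is appropriate because $A^TA$ is symmetric; note that it returns eigenvalues in ascending order:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0, 7.0, -2.0]])

# Eigen-decomposition of the symmetric matrix A^T A.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)  # ascending eigenvalues
print(eigvals)                              # approx [0, 90, 360]

v1 = eigvecs[:, -1]                         # unit eigenvector for lambda_1 = 360
print(np.linalg.norm(A @ v1) ** 2)          # approx 360, the max of ||Ax||^2
```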

The Singular Values of an $m\times n$ Matrix

Let $A$ be an $m\times n$ matrix. Then $A^TA$ is symmetric and can be orthogonally diagonalized. Let $\{\boldsymbol v_1,\dots,\boldsymbol v_n\}$ be an orthonormal basis for $\mathbb R^n$ consisting of eigenvectors of $A^TA$, and let $\lambda_1,\dots,\lambda_n$ be the associated eigenvalues of $A^TA$. Then, for $1\leq i\leq n$,

$$\|A\boldsymbol v_i\|^2 = (A\boldsymbol v_i)^TA\boldsymbol v_i = \boldsymbol v_i^TA^TA\boldsymbol v_i = \boldsymbol v_i^T(\lambda_i\boldsymbol v_i) = \lambda_i\qquad(2)$$

(since $\boldsymbol v_i$ is a unit vector).
So the eigenvalues of $A^TA$ are all nonnegative. By renumbering, if necessary, we may assume that the eigenvalues are arranged so that

$$\lambda_1\geq\lambda_2\geq\cdots\geq\lambda_n\geq 0$$
The singular values of $A$ are the square roots of the eigenvalues of $A^TA$, denoted by $\sigma_1,\dots,\sigma_n$ and arranged in decreasing order; that is, $\sigma_i=\sqrt{\lambda_i}$ for $1\leq i\leq n$. By equation (2), the singular values of $A$ are the lengths of the vectors $A\boldsymbol v_1,\dots,A\boldsymbol v_n$.
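
The following small sketch (my own, not from the book) confirms that the singular values reported by `numpy.linalg.svd` are the square roots of the eigenvalues of $A^TA$, using the matrix from Example 1:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0, 7.0, -2.0]])

sing_vals = np.linalg.svd(A, compute_uv=False)          # decreasing order
eig_vals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # decreasing order

print(sing_vals)              # approx [18.974, 9.487] = [6*sqrt(10), 3*sqrt(10)]
print(np.sqrt(eig_vals[:2]))  # the same values
```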

THEOREM 9
Suppose $\{\boldsymbol v_1,\dots,\boldsymbol v_n\}$ is an orthonormal basis of $\mathbb R^n$ consisting of eigenvectors of $A^TA$, arranged so that the corresponding eigenvalues of $A^TA$ satisfy $\lambda_1\geq\cdots\geq\lambda_n$, and suppose $A$ has $r$ nonzero singular values. Then $\{A\boldsymbol v_1,\dots,A\boldsymbol v_r\}$ is an orthogonal basis for $\mathrm{Col}\,A$, and $\mathrm{rank}\,A=r$.
PROOF
For $i\neq j$,

$$(A\boldsymbol v_i)^T(A\boldsymbol v_j) = \boldsymbol v_i^TA^TA\boldsymbol v_j = \boldsymbol v_i^T(\lambda_j\boldsymbol v_j) = \lambda_j\boldsymbol v_i^T\boldsymbol v_j = 0$$
Thus $\{A\boldsymbol v_1,\dots,A\boldsymbol v_n\}$ is an orthogonal set. Furthermore, since the lengths of the vectors $A\boldsymbol v_1,\dots,A\boldsymbol v_n$ are the singular values of $A$, and since there are $r$ nonzero singular values, $A\boldsymbol v_i\neq\boldsymbol 0$ if and only if $1\leq i\leq r$. So $A\boldsymbol v_1,\dots,A\boldsymbol v_r$ are linearly independent vectors, and they are in $\mathrm{Col}\,A$. Finally, for any $\boldsymbol y=A\boldsymbol x$ in $\mathrm{Col}\,A$, we can write $\boldsymbol x=c_1\boldsymbol v_1+\cdots+c_n\boldsymbol v_n$, and

$$\boldsymbol y=A\boldsymbol x=c_1A\boldsymbol v_1+\cdots+c_rA\boldsymbol v_r+c_{r+1}A\boldsymbol v_{r+1}+\cdots+c_nA\boldsymbol v_n=c_1A\boldsymbol v_1+\cdots+c_rA\boldsymbol v_r+\boldsymbol 0+\cdots+\boldsymbol 0$$
Thus $\boldsymbol y$ is in $\mathrm{Span}\{A\boldsymbol v_1,\dots,A\boldsymbol v_r\}$, which shows that $\{A\boldsymbol v_1,\dots,A\boldsymbol v_r\}$ is an (orthogonal) basis for $\mathrm{Col}\,A$. Hence $\mathrm{rank}\,A=\dim\mathrm{Col}\,A=r$.

EXAMPLE 2
Let $A$ be the matrix in Example 1. Since the eigenvalues of $A^TA$ are $360$, $90$, and $0$, the singular values of $A$ are
$$\sigma_1=\sqrt{360}=6\sqrt{10},\qquad \sigma_2=\sqrt{90}=3\sqrt{10},\qquad \sigma_3=0$$
The first singular value $\sigma_1$ of an $m\times n$ matrix $A$ is the maximum of $\|A\boldsymbol x\|$ over all unit vectors. This maximum value is attained at a unit eigenvector $\boldsymbol v_1$ of $A^TA$ corresponding to the greatest eigenvalue $\lambda_1$ of $A^TA$. The second singular value is the maximum of $\|A\boldsymbol x\|$ over all unit vectors orthogonal to $\boldsymbol v_1$, and this maximum is attained at the second unit eigenvector $\boldsymbol v_2$.

The Singular Value Decomposition (SVD)

The decomposition of $A$ involves an $m\times n$ "diagonal" matrix $\Sigma$ of the form

$$\Sigma=\begin{bmatrix}D & 0\\ 0 & 0\end{bmatrix}\qquad(3)$$
where $D$ is an $r\times r$ diagonal matrix for some $r$ not exceeding the smaller of $m$ and $n$. (If $r$ equals $m$ or $n$ or both, some or all of the zero matrices do not appear.)

THEOREM 10 (The Singular Value Decomposition)
Let $A$ be an $m\times n$ matrix with rank $r$. Then there exists an $m\times n$ matrix $\Sigma$ as in (3) for which the diagonal entries in $D$ are the first $r$ singular values of $A$, $\sigma_1\geq\sigma_2\geq\cdots\geq\sigma_r>0$, and there exist an $m\times m$ orthogonal matrix $U$ and an $n\times n$ orthogonal matrix $V$ such that
$$A=U\Sigma V^T$$

Any factorization $A=U\Sigma V^T$, with $U$ and $V$ orthogonal and $\Sigma$ as in (3), is called a singular value decomposition (or SVD) of $A$. The matrices $U$ and $V$ are not uniquely determined by $A$, but the diagonal entries of $\Sigma$ are necessarily the singular values of $A$. The columns of $U$ in such a decomposition are called left singular vectors of $A$, and the columns of $V$ are called right singular vectors of $A$.

PROOF
Let $\lambda_i$ and $\boldsymbol v_i$ be as in Theorem 9, so that $\{A\boldsymbol v_1,\dots,A\boldsymbol v_r\}$ is an orthogonal basis for $\mathrm{Col}\,A$. Normalize each $A\boldsymbol v_i$ to obtain an orthonormal basis $\{\boldsymbol u_1,\dots,\boldsymbol u_r\}$, where

$$\boldsymbol u_i=\frac{A\boldsymbol v_i}{\|A\boldsymbol v_i\|}=\frac{1}{\sigma_i}A\boldsymbol v_i\qquad(1\leq i\leq r)$$

and

$$A\boldsymbol v_i=\sigma_i\boldsymbol u_i\qquad(1\leq i\leq r)\qquad(4)$$
Now extend $\{\boldsymbol u_1,\dots,\boldsymbol u_r\}$ to an orthonormal basis $\{\boldsymbol u_1,\dots,\boldsymbol u_m\}$ of $\mathbb R^m$, and let

$$U=\begin{bmatrix}\boldsymbol u_1 & \boldsymbol u_2 & \cdots & \boldsymbol u_m\end{bmatrix},\qquad V=\begin{bmatrix}\boldsymbol v_1 & \boldsymbol v_2 & \cdots & \boldsymbol v_n\end{bmatrix}$$
By construction, $U$ and $V$ are orthogonal matrices. Also, from (4),

$$AV=\begin{bmatrix}A\boldsymbol v_1 & \cdots & A\boldsymbol v_r & \boldsymbol 0 & \cdots & \boldsymbol 0\end{bmatrix}=\begin{bmatrix}\sigma_1\boldsymbol u_1 & \cdots & \sigma_r\boldsymbol u_r & \boldsymbol 0 & \cdots & \boldsymbol 0\end{bmatrix}$$
Let $D$ be the diagonal matrix with diagonal entries $\sigma_1,\dots,\sigma_r$, and let $\Sigma$ be as in (3) above. Then

$$U\Sigma=\begin{bmatrix}\boldsymbol u_1 & \cdots & \boldsymbol u_m\end{bmatrix}\begin{bmatrix}D & 0\\ 0 & 0\end{bmatrix}=\begin{bmatrix}\sigma_1\boldsymbol u_1 & \cdots & \sigma_r\boldsymbol u_r & \boldsymbol 0 & \cdots & \boldsymbol 0\end{bmatrix}=AV$$
Since $V$ is an orthogonal matrix,

$$U\Sigma V^T=AVV^T=A$$

The next two examples focus attention on the internal structure of a singular value decomposition. An efficient and numerically stable algorithm for this decomposition would use a different approach. See the Numerical Note at the end of the section.

EXAMPLE 3
Use the results of Examples 1 and 2 to construct a singular value decomposition of $A=\begin{bmatrix}4& 11& 14\\8& 7& -2\end{bmatrix}$.

SOLUTION
A construction can be divided into three steps.

Step 1. Find an orthogonal diagonalization of $A^TA$.
Step 2. Set up $V$ and $\Sigma$. Arrange the eigenvalues of $A^TA$ in decreasing order. In Example 1, the eigenvalues are already listed in decreasing order: $360$, $90$, and $0$. The corresponding unit eigenvectors, $\boldsymbol v_1$, $\boldsymbol v_2$, and $\boldsymbol v_3$, are the right singular vectors of $A$.

$$V=\begin{bmatrix}\boldsymbol v_1 & \boldsymbol v_2 & \boldsymbol v_3\end{bmatrix}=\begin{bmatrix}1/3 & -2/3 & 2/3\\ 2/3 & -1/3 & -2/3\\ 2/3 & 2/3 & 1/3\end{bmatrix}$$
The square roots of the eigenvalues are the singular values:

$$\sigma_1=\sqrt{360}=6\sqrt{10},\qquad \sigma_2=\sqrt{90}=3\sqrt{10},\qquad \sigma_3=0$$

The nonzero singular values are the diagonal entries of $D$. The matrix $\Sigma$ is the same size as $A$, with $D$ in its upper left corner and zeros elsewhere:

$$D=\begin{bmatrix}6\sqrt{10} & 0\\ 0 & 3\sqrt{10}\end{bmatrix},\qquad \Sigma=\begin{bmatrix}6\sqrt{10} & 0 & 0\\ 0 & 3\sqrt{10} & 0\end{bmatrix}$$
Step 3. Construct $U$. When $A$ has rank $r$, the first $r$ columns of $U$ are the normalized vectors obtained from $A\boldsymbol v_1,\dots,A\boldsymbol v_r$. In this example, $A$ has two nonzero singular values, so $\mathrm{rank}\,A=2$. Recall that $\|A\boldsymbol v_1\|=\sigma_1$ and $\|A\boldsymbol v_2\|=\sigma_2$. Thus

$$\boldsymbol u_1=\frac{1}{\sigma_1}A\boldsymbol v_1=\frac{1}{6\sqrt{10}}\begin{bmatrix}18\\6\end{bmatrix}=\begin{bmatrix}3/\sqrt{10}\\ 1/\sqrt{10}\end{bmatrix},\qquad \boldsymbol u_2=\frac{1}{\sigma_2}A\boldsymbol v_2=\frac{1}{3\sqrt{10}}\begin{bmatrix}3\\-9\end{bmatrix}=\begin{bmatrix}1/\sqrt{10}\\ -3/\sqrt{10}\end{bmatrix}$$
Note that $\{\boldsymbol u_1,\boldsymbol u_2\}$ is already a basis for $\mathbb R^2$. Thus no additional vectors are needed for $U$. The singular value decomposition of $A$ is

$$A=U\Sigma V^T=\begin{bmatrix}3/\sqrt{10} & 1/\sqrt{10}\\ 1/\sqrt{10} & -3/\sqrt{10}\end{bmatrix}\begin{bmatrix}6\sqrt{10} & 0 & 0\\ 0 & 3\sqrt{10} & 0\end{bmatrix}\begin{bmatrix}1/3 & 2/3 & 2/3\\ -2/3 & -1/3 & 2/3\\ 2/3 & -2/3 & 1/3\end{bmatrix}$$
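
The three-step construction can be mirrored in code. The sketch below (my own; variable names are mine) rebuilds the SVD of the Example 3 matrix from an orthogonal diagonalization of $A^TA$, rather than calling a library SVD routine:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0, 7.0, -2.0]])
m, n = A.shape

# Step 1: orthogonally diagonalize A^T A.
lam, P = np.linalg.eigh(A.T @ A)            # ascending eigenvalues

# Step 2: reorder to decreasing eigenvalues; the columns of V are the v_i.
idx = np.argsort(lam)[::-1]
lam, V = lam[idx], P[:, idx]
sigma = np.sqrt(np.clip(lam, 0.0, None))    # singular values
r = int(np.sum(sigma > 1e-10))              # number of nonzero singular values
Sigma = np.zeros((m, n))
Sigma[:r, :r] = np.diag(sigma[:r])

# Step 3: the first r columns of U are A v_i / sigma_i.
U = np.column_stack([A @ V[:, i] / sigma[i] for i in range(r)])
# Here r = m = 2, so U needs no additional columns.

print(np.allclose(U @ Sigma @ V.T, A))      # True
```

The eigenvectors returned by `eigh` may differ from the book's $\boldsymbol v_i$ by a sign, but the corresponding columns of $U$ flip sign with them, so the product $U\Sigma V^T$ still equals $A$.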
EXAMPLE 4
Find a singular value decomposition of

$$A=\begin{bmatrix}1 & -1\\ -2 & 2\\ 2 & -2\end{bmatrix}$$
SOLUTION
The eigenvalues of $A^TA$ are $18$ and $0$, with corresponding unit eigenvectors

$$\boldsymbol v_1=\begin{bmatrix}1/\sqrt 2\\ -1/\sqrt 2\end{bmatrix},\qquad \boldsymbol v_2=\begin{bmatrix}1/\sqrt 2\\ 1/\sqrt 2\end{bmatrix}$$

These unit vectors form the columns of $V$:

$$V=\begin{bmatrix}\boldsymbol v_1 & \boldsymbol v_2\end{bmatrix}=\begin{bmatrix}1/\sqrt 2 & 1/\sqrt 2\\ -1/\sqrt 2 & 1/\sqrt 2\end{bmatrix}$$

The singular values are $\sigma_1=\sqrt{18}=3\sqrt 2$ and $\sigma_2=0$. Since there is only one nonzero singular value, the matrix $D$ may be written as a single number; that is, $D=3\sqrt 2$. The matrix $\Sigma$ is the same size as $A$, with $D$ in its upper left corner:

$$\Sigma=\begin{bmatrix}3\sqrt 2 & 0\\ 0 & 0\\ 0 & 0\end{bmatrix}$$
To construct $U$, first construct $A\boldsymbol v_1$ and $A\boldsymbol v_2$:

$$A\boldsymbol v_1=\begin{bmatrix}2/\sqrt 2\\ -4/\sqrt 2\\ 4/\sqrt 2\end{bmatrix}=\begin{bmatrix}\sqrt 2\\ -2\sqrt 2\\ 2\sqrt 2\end{bmatrix},\qquad A\boldsymbol v_2=\begin{bmatrix}0\\0\\0\end{bmatrix}$$

As a check, $\|A\boldsymbol v_1\|=\sigma_1=3\sqrt 2$; and $A\boldsymbol v_2=\boldsymbol 0$ because $\sigma_2=0$.
The only column found for $U$ so far is

$$\boldsymbol u_1=\frac{1}{\sigma_1}A\boldsymbol v_1=\frac{1}{3\sqrt 2}\begin{bmatrix}\sqrt 2\\ -2\sqrt 2\\ 2\sqrt 2\end{bmatrix}=\begin{bmatrix}1/3\\ -2/3\\ 2/3\end{bmatrix}$$
The other columns of $U$ are found by extending the set $\{\boldsymbol u_1\}$ to an orthonormal basis for $\mathbb R^3$. In this case, we need two orthonormal vectors $\boldsymbol u_2$ and $\boldsymbol u_3$ that are orthogonal to $\boldsymbol u_1$. Each vector must satisfy $\boldsymbol u_1^T\boldsymbol x=0$, which is equivalent to the equation $x_1-2x_2+2x_3=0$. A basis for the solution set of this equation is

$$\boldsymbol w_1=\begin{bmatrix}2\\1\\0\end{bmatrix},\qquad \boldsymbol w_2=\begin{bmatrix}-2\\0\\1\end{bmatrix}$$
Apply the Gram–Schmidt process (with normalizations) to $\{\boldsymbol w_1,\boldsymbol w_2\}$, and obtain

$$\boldsymbol u_2=\begin{bmatrix}2/\sqrt 5\\ 1/\sqrt 5\\ 0\end{bmatrix},\qquad \boldsymbol u_3=\begin{bmatrix}-2/\sqrt{45}\\ 4/\sqrt{45}\\ 5/\sqrt{45}\end{bmatrix}$$

Finally, set $U=\begin{bmatrix}\boldsymbol u_1 & \boldsymbol u_2 & \boldsymbol u_3\end{bmatrix}$, take $\Sigma$ and $V$ from above, and write

$$A=U\Sigma V^T=\begin{bmatrix}1/3 & 2/\sqrt 5 & -2/\sqrt{45}\\ -2/3 & 1/\sqrt 5 & 4/\sqrt{45}\\ 2/3 & 0 & 5/\sqrt{45}\end{bmatrix}\begin{bmatrix}3\sqrt 2 & 0\\ 0 & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix}1/\sqrt 2 & -1/\sqrt 2\\ 1/\sqrt 2 & 1/\sqrt 2\end{bmatrix}$$

Another way to find $\boldsymbol u_2$ and $\boldsymbol u_3$ is to note that $\{\boldsymbol u_1\}$ is an orthonormal basis for $\mathrm{Col}\,A$, so $\boldsymbol u_2$ and $\boldsymbol u_3$ must form an orthonormal basis for $(\mathrm{Col}\,A)^\perp=\mathrm{Nul}\,A^T$.
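
The Gram–Schmidt step of Example 4 is short enough to spell out numerically. This is my own sketch of that step, not the book's code:

```python
import numpy as np

# Basis {w1, w2} for the solution set of x1 - 2*x2 + 2*x3 = 0.
w1 = np.array([2.0, 1.0, 0.0])
w2 = np.array([-2.0, 0.0, 1.0])

u2 = w1 / np.linalg.norm(w1)      # normalize w1
w2_perp = w2 - (w2 @ u2) * u2     # subtract the component of w2 along u2
u3 = w2_perp / np.linalg.norm(w2_perp)

print(u2)   # [2/sqrt(5), 1/sqrt(5), 0]
print(u3)   # [-2/sqrt(45), 4/sqrt(45), 5/sqrt(45)]
```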


The next few exercises show some interesting facts.

EXERCISE
How are the singular values of $A$ and $A^T$ related?
SOLUTION
$A^T=(U\Sigma V^T)^T=V\Sigma^TU^T$. This is an SVD of $A^T$ because $V$ and $U$ are orthogonal matrices and $\Sigma^T$ is an $n\times m$ "diagonal" matrix. Since $\Sigma$ and $\Sigma^T$ have the same nonzero diagonal entries, $A$ and $A^T$ have the same nonzero singular values.
[Note: If $A$ is $2\times n$, then $AA^T$ is only $2\times 2$ and its eigenvalues may be easier to compute (by hand) than the eigenvalues of $A^TA$.]

EXERCISE 17
Show that if $A$ is square, then $|\det A|$ is the product of the singular values of $A$.
SOLUTION
$$|\det A|=|\det(U\Sigma V^T)|=|\det U|\cdot|\det\Sigma|\cdot|\det V^T|=1\cdot\det\Sigma\cdot 1=\det\Sigma=\sigma_1\sigma_2\cdots\sigma_n$$

(Here $|\det U|=|\det V^T|=1$ because $U$ and $V$ are orthogonal, and $\det\Sigma\geq 0$ because the diagonal entries of $\Sigma$ are the nonnegative singular values.)
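
A quick numerical spot-check of this identity (my own sketch, on an arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

print(abs(np.linalg.det(A)))                        # |det A|
print(np.prod(np.linalg.svd(A, compute_uv=False)))  # product of singular values
```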

EXERCISE 19
$A$ is an $m\times n$ matrix with a singular value decomposition $A=U\Sigma V^T$, where $U$ is an $m\times m$ orthogonal matrix, $\Sigma$ is an $m\times n$ "diagonal" matrix with $r$ positive entries and no negative entries, and $V$ is an $n\times n$ orthogonal matrix. Show that the columns of $V$ are eigenvectors of $A^TA$, the columns of $U$ are eigenvectors of $AA^T$, and the diagonal entries of $\Sigma$ are the singular values of $A$.
SOLUTION
[Hint: Use the SVD to compute $A^TA$ and $AA^T$.]

$$A^TA=(U\Sigma V^T)^T(U\Sigma V^T)=V\Sigma^TU^TU\Sigma V^T=V(\Sigma^T\Sigma)V^T$$

Since $\Sigma^T\Sigma$ is an $n\times n$ diagonal matrix, this orthogonally diagonalizes $A^TA$: the columns of $V$ are eigenvectors of $A^TA$, with eigenvalues the diagonal entries $\sigma_i^2$ of $\Sigma^T\Sigma$, so the diagonal entries of $\Sigma$ are the singular values of $A$. Similarly, $AA^T=U(\Sigma\Sigma^T)U^T$, so the columns of $U$ are eigenvectors of $AA^T$.

EXERCISE 20
Show that if $P$ is an orthogonal $m\times m$ matrix, then $PA$ has the same singular values as $A$.

EXERCISE 22
Show that if $A$ is an $n\times n$ positive definite matrix, then an orthogonal diagonalization $A=PDP^T$ is a singular value decomposition of $A$.

EXERCISE 23
Let $U=\begin{bmatrix}\boldsymbol u_1 & \cdots & \boldsymbol u_m\end{bmatrix}$ and $V=\begin{bmatrix}\boldsymbol v_1 & \cdots & \boldsymbol v_n\end{bmatrix}$, where the $\boldsymbol u_i$ and $\boldsymbol v_i$ are as in Theorem 10. Show that

$$A=\sigma_1\boldsymbol u_1\boldsymbol v_1^T+\sigma_2\boldsymbol u_2\boldsymbol v_2^T+\cdots+\sigma_r\boldsymbol u_r\boldsymbol v_r^T$$
SOLUTION
By the column–row expansion of a matrix product,

$$A=U\Sigma V^T=(U\Sigma)V^T=\begin{bmatrix}\sigma_1\boldsymbol u_1 & \cdots & \sigma_r\boldsymbol u_r & \boldsymbol 0 & \cdots & \boldsymbol 0\end{bmatrix}\begin{bmatrix}\boldsymbol v_1^T\\ \vdots\\ \boldsymbol v_n^T\end{bmatrix}=\sigma_1\boldsymbol u_1\boldsymbol v_1^T+\cdots+\sigma_r\boldsymbol u_r\boldsymbol v_r^T$$
This expansion generalizes the spectral decomposition in Section 7.1.
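
The rank-one expansion is easy to verify numerically. In this sketch (my own), the matrix from Example 1 is rebuilt from its $r=2$ nonzero singular values:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0, 7.0, -2.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))

# Sum of the rank-one matrices sigma_i * u_i * v_i^T.
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
print(np.allclose(A_sum, A))   # True
```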

EXERCISE 25
Let $T:\mathbb R^n\to\mathbb R^m$ be a linear transformation. Describe how to find a basis $\mathcal B$ for $\mathbb R^n$ and a basis $\mathcal C$ for $\mathbb R^m$ such that the matrix for $T$ relative to $\mathcal B$ and $\mathcal C$ is an $m\times n$ "diagonal" matrix.
SOLUTION
Consider the SVD for the standard matrix of $T$, say, $A=U\Sigma V^T$. Let $\mathcal B=\{\boldsymbol v_1,\dots,\boldsymbol v_n\}$ and $\mathcal C=\{\boldsymbol u_1,\dots,\boldsymbol u_m\}$ be bases constructed from the columns of $V$ and $U$, respectively. Observe that, since the columns of $V$ are orthonormal, $V^T\boldsymbol v_j=\boldsymbol e_j$, where $\boldsymbol e_j$ is the $j$th column of the $n\times n$ identity matrix. To find the matrix of $T$ relative to $\mathcal B$ and $\mathcal C$, compute

$$T(\boldsymbol v_j)=A\boldsymbol v_j=U\Sigma V^T\boldsymbol v_j=U\Sigma\boldsymbol e_j=\sigma_j\boldsymbol u_j$$

(where $\sigma_j=0$ for $j>r$, since $\Sigma\boldsymbol e_j$ is the $j$th column of $\Sigma$).

So $[T(\boldsymbol v_j)]_{\mathcal C}=\sigma_j\boldsymbol e_j$. The discussion at the beginning of Section 5.4 shows that the "diagonal" matrix $\Sigma$ is the matrix of $T$ relative to $\mathcal B$ and $\mathcal C$.

EXERCISE
Prove that any $n\times n$ matrix $A$ admits a polar decomposition of the form $A=PQ$, where $P$ is an $n\times n$ positive semidefinite matrix with the same rank as $A$ and where $Q$ is an $n\times n$ orthogonal matrix.
SOLUTION
[Hint: Use a singular value decomposition, $A=U\Sigma V^T$, and observe that $A=(U\Sigma U^T)(UV^T)$, where $U\Sigma U^T$ is a symmetric matrix.]
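
Following the hint, a polar decomposition can be computed directly from an SVD. A minimal sketch (my own; the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [1.0, 3.0]])

U, s, Vt = np.linalg.svd(A)
P = U @ np.diag(s) @ U.T   # positive semidefinite, same rank as A
Q = U @ Vt                 # product of orthogonal matrices, hence orthogonal

print(np.allclose(P @ Q, A))            # True: A = PQ
print(np.allclose(Q @ Q.T, np.eye(2)))  # True: Q is orthogonal
```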

Applications of the Singular Value Decomposition

The SVD is often used to estimate the rank of a matrix, as noted above. Several other numerical applications are described briefly below, and an application to image processing is presented in Section 7.5.


EXAMPLE 5 (The Condition Number)
Most numerical calculations involving an equation $A\boldsymbol x=\boldsymbol b$ are as reliable as possible when the SVD of $A$ is used. The two orthogonal matrices $U$ and $V$ do not affect lengths of vectors or angles between vectors. Any possible instabilities in numerical calculations are identified in $\Sigma$. If the singular values of $A$ are extremely large or small, roundoff errors are almost inevitable, but an error analysis is aided by knowing the entries in $\Sigma$ and $V$.
If $A$ is an invertible $n\times n$ matrix, then the ratio $\sigma_1/\sigma_n$ of the largest and smallest singular values gives the condition number of $A$. Exercises 41–43 in Section 2.3 showed how the condition number affects the sensitivity of a solution of $A\boldsymbol x=\boldsymbol b$ to changes (or errors) in the entries of $A$. (Actually, a "condition number" of $A$ can be computed in several ways, but the definition given here is widely used for studying $A\boldsymbol x=\boldsymbol b$.)
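
In NumPy this ratio is what `numpy.linalg.cond` computes by default (the 2-norm condition number); the sketch below (mine, with an arbitrary matrix) checks the two against each other:

```python
import numpy as np

A = np.array([[4.0, 11.0],
              [8.0, 7.0]])

s = np.linalg.svd(A, compute_uv=False)
print(s[0] / s[-1])       # sigma_1 / sigma_n
print(np.linalg.cond(A))  # same value: NumPy's default 2-norm condition number
```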


EXAMPLE 6 (Bases for Fundamental Subspaces)
Given an SVD for an $m\times n$ matrix $A$, let $\boldsymbol u_1,\dots,\boldsymbol u_m$ be the left singular vectors, $\boldsymbol v_1,\dots,\boldsymbol v_n$ the right singular vectors, and $\sigma_1,\dots,\sigma_n$ the singular values, and let $r$ be the rank of $A$. By Theorem 9,

$$\{\boldsymbol u_1,\dots,\boldsymbol u_r\}\qquad(5)$$

is an orthonormal basis for $\mathrm{Col}\,A$.

Recall that $(\mathrm{Col}\,A)^\perp=\mathrm{Nul}\,A^T$. Hence

$$\{\boldsymbol u_{r+1},\dots,\boldsymbol u_m\}\qquad(6)$$

is an orthonormal basis for $\mathrm{Nul}\,A^T$.

Since $\|A\boldsymbol v_i\|=\sigma_i$ for $1\leq i\leq n$, and $\sigma_i$ is $0$ if and only if $i>r$, the vectors $\boldsymbol v_{r+1},\dots,\boldsymbol v_n$ span a subspace of $\mathrm{Nul}\,A$ of dimension $n-r$. By the Rank Theorem, $\dim\mathrm{Nul}\,A=n-\mathrm{rank}\,A=n-r$. It follows that

$$\{\boldsymbol v_{r+1},\dots,\boldsymbol v_n\}\qquad(7)$$

is an orthonormal basis for $\mathrm{Nul}\,A$.

$(\mathrm{Nul}\,A)^\perp=\mathrm{Col}\,A^T=\mathrm{Row}\,A$. Hence, from $(7)$,

$$\{\boldsymbol v_1,\dots,\boldsymbol v_r\}\qquad(8)$$

is an orthonormal basis for $\mathrm{Row}\,A$.

Figure 4 summarizes $(5)$–$(8)$, but shows the orthogonal basis $\{\sigma_1\boldsymbol u_1,\dots,\sigma_r\boldsymbol u_r\}$ for $\mathrm{Col}\,A$ instead of the normalized basis, to remind you that $A\boldsymbol v_i=\sigma_i\boldsymbol u_i$ for $1\leq i\leq r$.

[Figure 4: The four fundamental subspaces and the action of $A$.]
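
The bases $(5)$–$(8)$ can be read directly off a full SVD. Here is a sketch of my own using the Example 4 matrix, which has rank $r=1$:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])

U, s, Vt = np.linalg.svd(A)   # full SVD: U is 3x3, Vt is 2x2
r = int(np.sum(s > 1e-10))

col_A  = U[:, :r]     # (5) orthonormal basis for Col A
nul_At = U[:, r:]     # (6) orthonormal basis for Nul A^T
nul_A  = Vt[r:, :].T  # (7) orthonormal basis for Nul A
row_A  = Vt[:r, :].T  # (8) orthonormal basis for Row A

print(np.allclose(A @ nul_A, 0))     # True: columns of nul_A lie in Nul A
print(np.allclose(A.T @ nul_At, 0))  # True: columns of nul_At lie in Nul A^T
```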

The four fundamental subspaces and the concept of singular values provide the final statements of the Invertible Matrix Theorem.

THEOREM (The Invertible Matrix Theorem, concluded)
Let $A$ be an $n\times n$ matrix. Then the following statements are each equivalent to the statement that $A$ is an invertible matrix.
u. $(\mathrm{Col}\,A)^\perp=\{\boldsymbol 0\}$
v. $(\mathrm{Nul}\,A)^\perp=\mathbb R^n$
w. $\mathrm{Row}\,A=\mathbb R^n$
x. $A$ has $n$ nonzero singular values.


EXAMPLE 7 (Reduced SVD and the Pseudoinverse of $A$)
When $\Sigma$ contains rows or columns of zeros, a more compact decomposition of $A$ is possible. Using the notation established above, let $r=\mathrm{rank}\,A$, and partition $U$ and $V$ into submatrices whose first blocks contain $r$ columns:

$$U=\begin{bmatrix}U_r & U_{m-r}\end{bmatrix},\qquad V=\begin{bmatrix}V_r & V_{n-r}\end{bmatrix}$$

Then $U_r$ is $m\times r$ and $V_r$ is $n\times r$. (To simplify notation, we consider $U_{m-r}$ or $V_{n-r}$ even though one of them may have no columns.) Then partitioned matrix multiplication shows that

$$A=\begin{bmatrix}U_r & U_{m-r}\end{bmatrix}\begin{bmatrix}D & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix}V_r^T\\ V_{n-r}^T\end{bmatrix}=U_rDV_r^T$$
This factorization of $A$ is called a reduced singular value decomposition of $A$. Since the diagonal entries in $D$ are nonzero, $D$ is invertible. The following matrix is called the pseudoinverse (also, the Moore–Penrose inverse) of $A$:

$$A^+=V_rD^{-1}U_r^T$$
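
Both the reduced SVD and the pseudoinverse are easy to form from a library SVD. A sketch (my own; `np.linalg.pinv` serves as a cross-check, since it is also built on the SVD):

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0],
              [8.0, 7.0, -2.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))

Ur, Vr, D = U[:, :r], Vt[:r, :].T, np.diag(s[:r])
A_plus = Vr @ np.linalg.inv(D) @ Ur.T          # A^+ = V_r D^{-1} U_r^T

print(np.allclose(Ur @ D @ Vr.T, A))           # reduced SVD reproduces A
print(np.allclose(A_plus, np.linalg.pinv(A)))  # agrees with NumPy's pinv
```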

The next Supplementary exercises explore some of the properties of the reduced singular value decomposition and the pseudoinverse.

Supplementary EXERCISE 12
Verify the properties of $A^+$:
a. For each $\boldsymbol y$ in $\mathbb R^m$, $AA^+\boldsymbol y$ is the orthogonal projection of $\boldsymbol y$ onto $\mathrm{Col}\,A$.
b. For each $\boldsymbol x$ in $\mathbb R^n$, $A^+A\boldsymbol x$ is the orthogonal projection of $\boldsymbol x$ onto $\mathrm{Row}\,A$.
c. $AA^+A=A$ and $A^+AA^+=A^+$.

Supplementary EXERCISE 13
Suppose the equation $A\boldsymbol x=\boldsymbol b$ is consistent, and let $\boldsymbol x^+=A^+\boldsymbol b$. By Exercise 23 in Section 6.3, there is exactly one vector $\boldsymbol p$ in $\mathrm{Row}\,A$ such that $A\boldsymbol p=\boldsymbol b$. The following steps prove that $\boldsymbol x^+=\boldsymbol p$ and that $\boldsymbol x^+$ is the minimum length solution of $A\boldsymbol x=\boldsymbol b$.
a. Show that $\boldsymbol x^+$ is in $\mathrm{Row}\,A$.
b. Show that $\boldsymbol x^+$ is a solution of $A\boldsymbol x=\boldsymbol b$.
c. Show that if $\boldsymbol u$ is any solution of $A\boldsymbol x=\boldsymbol b$, then $\|\boldsymbol x^+\|\leq\|\boldsymbol u\|$, with equality only if $\boldsymbol u=\boldsymbol x^+$.
SOLUTION
a. $\boldsymbol x^+=A^+\boldsymbol b=V_rD^{-1}U_r^T\boldsymbol b$. Since the columns of $V_r$ form an orthonormal basis for $\mathrm{Row}\,A$, $\boldsymbol x^+$ is a linear combination of the columns of $V_r$. Thus $\boldsymbol x^+$ is in $\mathrm{Row}\,A$.
b. $A\boldsymbol x^+=AA^+\boldsymbol b=AA^+A\boldsymbol p=A\boldsymbol p=\boldsymbol b$, using $A\boldsymbol p=\boldsymbol b$ and the property $AA^+A=A$ from Exercise 12(c).
c. If $\boldsymbol u$ is any solution of $A\boldsymbol x=\boldsymbol b$, then $\boldsymbol x^+=A^+\boldsymbol b=A^+A\boldsymbol u$, which by Exercise 12(b) is the orthogonal projection of $\boldsymbol u$ onto $\mathrm{Row}\,A$. Hence $\|\boldsymbol x^+\|\leq\|\boldsymbol u\|$, with equality only if $\boldsymbol u=\boldsymbol x^+$.

Supplementary EXERCISE 14
Given any $\boldsymbol b$ in $\mathbb R^m$, adapt Exercise 13 to show that $A^+\boldsymbol b$ is the least-squares solution of minimum length.
SOLUTION
[Hint: Consider the equation $A\boldsymbol x=\hat{\boldsymbol b}$, where $\hat{\boldsymbol b}$ is the orthogonal projection of $\boldsymbol b$ onto $\mathrm{Col}\,A$.]


EXAMPLE 8 (Least-Squares Solution)
Given the equation $A\boldsymbol x=\boldsymbol b$, use the pseudoinverse of $A$ to define

$$\hat{\boldsymbol x}=A^+\boldsymbol b=V_rD^{-1}U_r^T\boldsymbol b$$
Then,

$$A\hat{\boldsymbol x}=AA^+\boldsymbol b=U_rDV_r^T\,V_rD^{-1}U_r^T\,\boldsymbol b=U_rU_r^T\boldsymbol b\qquad(\text{because }V_r^TV_r=I_r)$$
$U_rU_r^T\boldsymbol b$ is the orthogonal projection $\hat{\boldsymbol b}$ of $\boldsymbol b$ onto $\mathrm{Col}\,A$. Thus $\hat{\boldsymbol x}$ is a least-squares solution of $A\boldsymbol x=\boldsymbol b$. In fact, this $\hat{\boldsymbol x}$ has the smallest length among all least-squares solutions of $A\boldsymbol x=\boldsymbol b$. See Supplementary Exercise 14.
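
The sketch below (my own; $A$ and $\boldsymbol b$ are arbitrary) checks this against `numpy.linalg.lstsq`, which also returns the minimum-norm least-squares solution for rank-deficient systems:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [-2.0, 2.0],
              [2.0, -2.0]])
b = np.array([1.0, 0.0, 1.0])

x_hat = np.linalg.pinv(A) @ b                    # x_hat = A^+ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimum-norm LS solution

print(np.allclose(x_hat, x_lstsq))  # True
```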


NUMERICAL NOTE
Examples 1–4 and the exercises illustrate the concept of singular values and suggest how to perform calculations by hand. In practice, the computation of $A^TA$ should be avoided, since any errors in the entries of $A$ are squared in the entries of $A^TA$. There exist fast iterative methods that produce the singular values and singular vectors of $A$ accurately to many decimal places.

Reprinted from blog.csdn.net/weixin_42437114/article/details/108997195