7.1 Diagonalization of symmetric matrices

This post is a set of reading notes on *Linear algebra and its applications*.

Diagonalization of symmetric matrices

A symmetric matrix is a matrix $A$ such that $A^T = A$. Such a matrix is necessarily square.

To begin the study of symmetric matrices, it is helpful to review the diagonalization process of Section 5.3.

THEOREM 1
If $A$ is symmetric, then any two eigenvectors from different eigenspaces are orthogonal.
PROOF
Let $\boldsymbol v_1$ and $\boldsymbol v_2$ be eigenvectors that correspond to distinct eigenvalues, say, $\lambda_1$ and $\lambda_2$.

$$\lambda_1\boldsymbol v_1\cdot\boldsymbol v_2=(\lambda_1\boldsymbol v_1)^T\boldsymbol v_2=(A\boldsymbol v_1)^T\boldsymbol v_2=\boldsymbol v_1^TA^T\boldsymbol v_2=\boldsymbol v_1^T(A\boldsymbol v_2)=\boldsymbol v_1^T(\lambda_2\boldsymbol v_2)=\lambda_2\boldsymbol v_1\cdot\boldsymbol v_2$$
Hence $(\lambda_1-\lambda_2)\boldsymbol v_1\cdot\boldsymbol v_2=0$.
Since $\lambda_1\neq\lambda_2$, $\boldsymbol v_1\cdot\boldsymbol v_2=0$.
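As a quick sanity check, here is a minimal numpy sketch of Theorem 1; the $2\times 2$ matrix is an arbitrary choice of mine, not from the book:

```python
import numpy as np

# An arbitrary symmetric matrix for the check
A = np.array([[1., 2.],
              [2., 1.]])

lam, V = np.linalg.eigh(A)   # eigh is numpy's eigensolver for symmetric matrices

# The eigenvalues -1 and 3 are distinct, so by Theorem 1 the
# corresponding eigenvectors must be orthogonal.
v1, v2 = V[:, 0], V[:, 1]
print(lam)                        # [-1.  3.]
print(np.isclose(v1 @ v2, 0.0))   # True
```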

An $n\times n$ matrix $A$ is said to be orthogonally diagonalizable if there are an orthogonal matrix $P$ (with $P^{-1}=P^T$) and a diagonal matrix $D$ such that

$$A=PDP^T=PDP^{-1}\qquad(1)$$
Such a diagonalization requires $n$ linearly independent and orthonormal eigenvectors.

If $A$ is orthogonally diagonalizable, then

$$A^T=(PDP^T)^T=P^{TT}D^TP^T=PDP^T=A$$
Thus $A$ is symmetric! Theorem 2 below shows that, conversely, every symmetric matrix is orthogonally diagonalizable. The main idea for a proof will be given after Theorem 3.
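A minimal numpy sketch of this direction; the random construction of $P$ and $D$ is my own, just to exercise the identity above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an arbitrary orthogonal P via the QR factorization of a random
# matrix, and an arbitrary diagonal D.
P, _ = np.linalg.qr(rng.standard_normal((4, 4)))
D = np.diag(rng.standard_normal(4))

A = P @ D @ P.T                          # orthogonally diagonalizable by construction
print(np.allclose(A, A.T))               # True: A is forced to be symmetric
print(np.allclose(P.T @ P, np.eye(4)))   # True: P^{-1} = P^T
```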

THEOREM 2
An $n\times n$ matrix $A$ is orthogonally diagonalizable if and only if $A$ is a symmetric matrix.
This theorem is rather amazing, because the work in Chapter 5 would suggest that it is usually impossible to tell when a matrix is diagonalizable. But this is not the case for symmetric matrices.

EXAMPLE 3
Orthogonally diagonalize the matrix

$$A=\begin{bmatrix}3&-2&4\\-2&6&2\\4&2&3\end{bmatrix},$$
whose characteristic equation is
$$0=-\lambda^3+12\lambda^2-21\lambda-98=-(\lambda-7)^2(\lambda+2)$$
SOLUTION

The usual calculations produce a basis for each eigenspace:
$$\lambda=7:\ \boldsymbol v_1=\begin{bmatrix}1\\0\\1\end{bmatrix},\ \boldsymbol v_2=\begin{bmatrix}-1/2\\1\\0\end{bmatrix};\qquad\lambda=-2:\ \boldsymbol v_3=\begin{bmatrix}-1\\-1/2\\1\end{bmatrix}$$

Although $\boldsymbol v_1$ and $\boldsymbol v_2$ are linearly independent, they are not orthogonal. Subtract from $\boldsymbol v_2$ its projection onto $\boldsymbol v_1$ to produce an orthogonal set:

$$\boldsymbol z_2=\boldsymbol v_2-\frac{\boldsymbol v_2\cdot\boldsymbol v_1}{\boldsymbol v_1\cdot\boldsymbol v_1}\boldsymbol v_1=\begin{bmatrix}-1/2\\1\\0\end{bmatrix}-\frac{-1/2}{2}\begin{bmatrix}1\\0\\1\end{bmatrix}=\begin{bmatrix}-1/4\\1\\1/4\end{bmatrix}$$
Then $\{\boldsymbol v_1,\boldsymbol z_2\}$ is an orthogonal set in the eigenspace for $\lambda=7$. (Note that $\boldsymbol z_2$ is a linear combination of the eigenvectors $\boldsymbol v_1$ and $\boldsymbol v_2$, so $\boldsymbol z_2$ is in the eigenspace.)

Normalize $\boldsymbol v_1$ and $\boldsymbol z_2$ to obtain the following orthonormal basis for the eigenspace for $\lambda=7$:

$$\boldsymbol u_1=\begin{bmatrix}1/\sqrt2\\0\\1/\sqrt2\end{bmatrix},\qquad\boldsymbol u_2=\begin{bmatrix}-1/\sqrt{18}\\4/\sqrt{18}\\1/\sqrt{18}\end{bmatrix}$$
An orthonormal basis for the eigenspace for $\lambda=-2$ is

$$\boldsymbol u_3=\frac{1}{\|\boldsymbol v_3\|}\boldsymbol v_3=\begin{bmatrix}-2/3\\-1/3\\2/3\end{bmatrix}$$
By Theorem 1, $\boldsymbol u_3$ is orthogonal to the other eigenvectors $\boldsymbol u_1$ and $\boldsymbol u_2$. Hence $\{\boldsymbol u_1,\boldsymbol u_2,\boldsymbol u_3\}$ is an orthonormal set. Let

$$P=\begin{bmatrix}\boldsymbol u_1&\boldsymbol u_2&\boldsymbol u_3\end{bmatrix}=\begin{bmatrix}1/\sqrt2&-1/\sqrt{18}&-2/3\\0&4/\sqrt{18}&-1/3\\1/\sqrt2&1/\sqrt{18}&2/3\end{bmatrix},\qquad D=\begin{bmatrix}7&0&0\\0&7&0\\0&0&-2\end{bmatrix}$$
Then $P$ orthogonally diagonalizes $A$, and $A=PDP^{-1}$.
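The whole example can be checked with a short numpy sketch, using the matrix and the hand-computed eigenvectors from above:

```python
import numpy as np

# The matrix and the hand-computed eigenvectors from Example 3
A = np.array([[3., -2., 4.],
              [-2., 6., 2.],
              [4., 2., 3.]])
v1 = np.array([1., 0., 1.])      # lambda = 7
v2 = np.array([-0.5, 1., 0.])    # lambda = 7
v3 = np.array([-1., -0.5, 1.])   # lambda = -2

# Gram-Schmidt step: z2 = v2 - proj_{v1} v2 = (-1/4, 1, 1/4)
z2 = v2 - (v2 @ v1) / (v1 @ v1) * v1

# Normalize to get the orthonormal columns of P
u1, u2, u3 = (v / np.linalg.norm(v) for v in (v1, z2, v3))
P = np.column_stack([u1, u2, u3])
D = np.diag([7., 7., -2.])

print(np.allclose(P.T @ P, np.eye(3)))   # True: P is orthogonal
print(np.allclose(P @ D @ P.T, A))       # True: A = P D P^{-1} = P D P^T
```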

The Spectral Theorem

The set of eigenvalues of a matrix $A$ is sometimes called the *spectrum* of $A$, and the following description of the eigenvalues is called a *spectral theorem*.

THEOREM 3 (The Spectral Theorem for Symmetric Matrices)
An $n\times n$ symmetric matrix $A$ has the following properties:
a. $A$ has $n$ real eigenvalues, counting multiplicities.
b. The dimension of the eigenspace for each eigenvalue $\lambda$ equals the multiplicity of $\lambda$ as a root of the characteristic equation.
c. The eigenspaces are mutually orthogonal, in the sense that eigenvectors corresponding to different eigenvalues are orthogonal.
d. $A$ is orthogonally diagonalizable.

  • Part (a) follows from the Supplementary Exercises in Section 5.5.
  • Part (b) follows easily from part (d).
  • Part (c) is Theorem 1.
  • Because of (a), a proof of (d) can be found in the Appendix: proof of Theorem 3 (d).

Spectral Decomposition

Suppose $A=PDP^{-1}$, where the columns of $P$ are orthonormal eigenvectors $\boldsymbol u_1,...,\boldsymbol u_n$ of $A$ and the corresponding eigenvalues $\lambda_1,...,\lambda_n$ are in the diagonal matrix $D$. Then, since $P^{-1}=P^T$,

$$A=PDP^T=\begin{bmatrix}\boldsymbol u_1&\cdots&\boldsymbol u_n\end{bmatrix}\begin{bmatrix}\lambda_1&&0\\&\ddots&\\0&&\lambda_n\end{bmatrix}\begin{bmatrix}\boldsymbol u_1^T\\\vdots\\\boldsymbol u_n^T\end{bmatrix}=\begin{bmatrix}\lambda_1\boldsymbol u_1&\cdots&\lambda_n\boldsymbol u_n\end{bmatrix}\begin{bmatrix}\boldsymbol u_1^T\\\vdots\\\boldsymbol u_n^T\end{bmatrix}$$
Using the column-row expansion of the product, we can write
$$A=\lambda_1\boldsymbol u_1\boldsymbol u_1^T+\lambda_2\boldsymbol u_2\boldsymbol u_2^T+\cdots+\lambda_n\boldsymbol u_n\boldsymbol u_n^T\qquad(2)$$
This representation of $A$ is called a spectral decomposition of $A$ because it breaks up $A$ into pieces determined by the spectrum (eigenvalues) of $A$.

  • Each term in (2) is an $n\times n$ matrix of rank 1. For example, every column of $\lambda_1\boldsymbol u_1\boldsymbol u_1^T$ is a multiple of $\boldsymbol u_1$.
  • Furthermore, each matrix $\boldsymbol u_j\boldsymbol u_j^T$ is a projection matrix in the sense that for each $\boldsymbol x$ in $\mathbb R^n$, the vector $(\boldsymbol u_j\boldsymbol u_j^T)\boldsymbol x$ is the orthogonal projection of $\boldsymbol x$ onto the subspace spanned by $\boldsymbol u_j$ (see the numerical sketch after this list).
    PROOF
    $(\boldsymbol u_j\boldsymbol u_j^T)\boldsymbol x=\boldsymbol u_j(\boldsymbol u_j^T\boldsymbol x)=(\boldsymbol u_j^T\boldsymbol x)\boldsymbol u_j=(\boldsymbol u_j\cdot\boldsymbol x)\boldsymbol u_j$ ($\boldsymbol u_j^T\boldsymbol x$ is a scalar). Since $\boldsymbol u_j$ is a unit vector, this is the orthogonal projection of $\boldsymbol x$ onto $\boldsymbol u_j$.
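A numpy sketch of the decomposition (2) and of the projection property, reusing the matrix from Example 3:

```python
import numpy as np

A = np.array([[3., -2., 4.],
              [-2., 6., 2.],
              [4., 2., 3.]])

lam, U = np.linalg.eigh(A)   # columns of U: orthonormal eigenvectors of A

# Equation (2): A is the sum of the rank-1 pieces lambda_j u_j u_j^T
pieces = [lam[j] * np.outer(U[:, j], U[:, j]) for j in range(len(lam))]
print(np.allclose(sum(pieces), A))       # True

# Each u_j u_j^T is a projection matrix: (u_j u_j^T) x = (u_j . x) u_j
x = np.array([1., 2., 3.])
u = U[:, 0]
print(np.allclose(np.outer(u, u) @ x, (u @ x) * u))   # True
```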

EXERCISE
Let $A$ be an $n\times n$ symmetric matrix of rank $r$. Explain why the spectral decomposition of $A$ represents $A$ as the sum of $r$ rank 1 matrices.
SOLUTION
Since $A$ is symmetric, the Spectral Theorem says the dimension of the eigenspace for $\lambda=0$, which is exactly $\mathrm{Nul}\,A$, equals the multiplicity of $0$ as an eigenvalue. Because $\dim\mathrm{Nul}\,A=n-r$, exactly $n-r$ of the eigenvalues $\lambda_1,...,\lambda_n$ are zero, and the corresponding terms of (2) vanish. The decomposition therefore reduces to a sum of the $r$ rank 1 matrices $\lambda_j\boldsymbol u_j\boldsymbol u_j^T$ with $\lambda_j\neq0$.
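A small numerical illustration; the rank-2 symmetric matrix below is my own choice:

```python
import numpy as np

# A symmetric 3x3 matrix of rank r = 2 (its first two rows are equal)
B = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [0., 0., 2.]])

lam, U = np.linalg.eigh(B)
print(np.round(lam, 10))     # [0. 2. 2.]: exactly n - r = 1 zero eigenvalue

# Only the r = 2 terms with nonzero eigenvalue survive in (2)
terms = [lam[j] * np.outer(U[:, j], U[:, j])
         for j in range(3) if abs(lam[j]) > 1e-12]
print(len(terms))                    # 2
print(np.allclose(sum(terms), B))    # True
```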

Appendix: proof of Theorem 3 (d)

The Schur factorization of an $n\times n$ matrix $A$ is a factorization of the form $A=URU^T$, where $U$ is an orthogonal matrix and $R$ is an $n\times n$ upper triangular matrix.

THEOREM
Let $A$ be an $n\times n$ matrix with $n$ real eigenvalues, counting multiplicities, denoted by $\lambda_1,...,\lambda_n$. Then $A$ admits a (real) Schur factorization.
PROOF
Parts (a) and (b) show the key ideas in the proof. The rest of the proof amounts to repeating (a) and (b) for successively smaller matrices, and then piecing together the results.
a. Let $\boldsymbol u_1$ be a unit eigenvector corresponding to $\lambda_1$, let $\boldsymbol u_2,...,\boldsymbol u_n$ be any other vectors such that $\{\boldsymbol u_1,...,\boldsymbol u_n\}$ is an orthonormal basis for $\mathbb R^n$, and then let $U=\begin{bmatrix}\boldsymbol u_1&\boldsymbol u_2&\cdots&\boldsymbol u_n\end{bmatrix}$. The first column of $U^TAU$ is $U^TA\boldsymbol u_1=\lambda_1U^T\boldsymbol u_1=\lambda_1\boldsymbol e_1$, where $\boldsymbol e_1$ is the first column of the $n\times n$ identity matrix (the last equality holds because $\boldsymbol u_i^T\boldsymbol u_1$ is $1$ for $i=1$ and $0$ otherwise).
b. Part (a) implies that $U^TAU$ has the form shown below.

$$U^TAU=\begin{bmatrix}\lambda_1&\boldsymbol x^T\\\boldsymbol 0&A_1\end{bmatrix},\qquad\text{where }A_1\text{ is }(n-1)\times(n-1)$$

Since
$$\begin{aligned}\det(U^TAU-\lambda I)&=\det(U^TAU-\lambda U^TU)=\det(U^T(A-\lambda I)U)\\&=\det(U^T)\det(A-\lambda I)\det(U)=\det(U^{-1})\det(A-\lambda I)\det(U)=\det(A-\lambda I),\end{aligned}$$
the characteristic polynomials of $U^TAU$ and $A$ are the same. Thus $U^TAU$ and $A$ have the same eigenvalues, which shows that the eigenvalues of $A_1$ are $\lambda_2,...,\lambda_n$.
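A quick numerical check of steps (a) and (b), reusing the matrix from Example 3 with $\lambda_1=7$; the QR-based completion of $\boldsymbol u_1$ to an orthonormal basis is one convenient choice of mine:

```python
import numpy as np

A = np.array([[3., -2., 4.],
              [-2., 6., 2.],
              [4., 2., 3.]])
u1 = np.array([1., 0., 1.]) / np.sqrt(2.)   # unit eigenvector for lambda_1 = 7

# Extend u1 to an orthonormal basis of R^3: the QR factorization of a
# full-rank matrix whose first column is u1 gives an orthogonal U whose
# first column spans u1 (the sign does not matter for the block form).
U, _ = np.linalg.qr(np.column_stack([u1, np.eye(3)[:, :2]]))

B = U.T @ A @ U
print(np.round(B, 10))               # first column is 7 * e_1
print(np.linalg.eigvals(B[1:, 1:]))  # A_1 has the remaining eigenvalues 7, -2
```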

Similar to (a), let $\boldsymbol u_2'$ be a unit eigenvector of $A_1$ corresponding to $\lambda_2$, let $\boldsymbol u_3',...,\boldsymbol u_n'$ be any other vectors such that $\{\boldsymbol u_2',...,\boldsymbol u_n'\}$ is an orthonormal basis for $\mathbb R^{n-1}$, and then let $U_1=\begin{bmatrix}\boldsymbol u_2'&\boldsymbol u_3'&\cdots&\boldsymbol u_n'\end{bmatrix}$. As before, the first column of $U_1^TA_1U_1$ is $\lambda_2\boldsymbol e_1'$, where $\boldsymbol e_1'$ is the first column of the $(n-1)\times(n-1)$ identity matrix. So $U_1^TA_1U_1$ has a form similar to $U^TAU$.

Suppose $U^TAU=\begin{bmatrix}\lambda_1&\boldsymbol x^T\\\boldsymbol 0&A_1\end{bmatrix}$; then
$$\begin{aligned}\begin{bmatrix}1&\boldsymbol 0\\\boldsymbol 0&U_1^T\end{bmatrix}U^TAU\begin{bmatrix}1&\boldsymbol 0\\\boldsymbol 0&U_1\end{bmatrix}&=\begin{bmatrix}1&\boldsymbol 0\\\boldsymbol 0&U_1^T\end{bmatrix}\begin{bmatrix}\lambda_1&\boldsymbol x^T\\\boldsymbol 0&A_1\end{bmatrix}\begin{bmatrix}1&\boldsymbol 0\\\boldsymbol 0&U_1\end{bmatrix}\\&=\begin{bmatrix}\lambda_1&\boldsymbol x^TU_1\\\boldsymbol 0&U_1^TA_1U_1\end{bmatrix}\end{aligned}$$

Let $U'=U\begin{bmatrix}1&\boldsymbol 0\\\boldsymbol 0&U_1\end{bmatrix}$; then $U'$ is an orthogonal matrix, being a product of orthogonal matrices. So
$$U'^TAU'=\begin{bmatrix}\lambda_1&*&*&\cdots&*\\0&\lambda_2&*&\cdots&*\\0&0&&&\\\vdots&\vdots&&A_2&\\0&0&&&\end{bmatrix}$$

Continuing this process, we finally obtain a (real) Schur factorization of $A$.
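Assuming scipy is available, scipy.linalg.schur computes a real Schur factorization directly. A sketch; the test matrix is a random similarity transform of an upper triangular matrix, my own choice to force real eigenvalues without symmetry:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)

# A matrix with n real eigenvalues but no symmetry: similar to an upper
# triangular T0, so its eigenvalues are the (real) diagonal of T0.
T0 = np.triu(rng.standard_normal((4, 4)))
S = rng.standard_normal((4, 4))          # generically invertible
A = S @ T0 @ np.linalg.inv(S)

# Real Schur factorization A = U R U^T
R, U = schur(A, output='real')

print(np.allclose(A, U @ R @ U.T))       # True
print(np.allclose(U.T @ U, np.eye(4)))   # True: U is orthogonal
print(np.allclose(R, np.triu(R)))        # True: with distinct real eigenvalues,
                                         # R is genuinely upper triangular
```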


With the theorem above, the proof of Theorem 3 (d) is quite easy.

Let $A$ be a symmetric matrix. Since $A$ has $n$ real eigenvalues, counting multiplicities, $A$ has a real Schur factorization $A=URU^T$. Then $A^T=UR^TU^T$ and $A=URU^T$, so $R^T=U^TA^TU=U^TAU=R$, which means the upper triangular matrix $R$ is in fact diagonal, with the eigenvalues of $A$ on its main diagonal.

Thus $A=URU^{-1}$, where $U$ is an orthogonal matrix and $R$ is a diagonal matrix. So $A$ is orthogonally diagonalizable.
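Again assuming scipy, a sketch of this last argument: for a symmetric matrix, the computed Schur factor $R$ does come out diagonal.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = M + M.T                      # an arbitrary symmetric matrix

R, U = schur(A, output='real')

# R = R^T forces the upper triangular factor to be diagonal, so the
# Schur factorization of a symmetric matrix is an orthogonal diagonalization.
print(np.allclose(R, np.diag(np.diag(R))))   # True
print(np.allclose(A, U @ R @ U.T))           # True
```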
