Rayleigh Quotient


First, the definition of the Rayleigh quotient:
$$R(A,x) = \frac{x^TAx}{x^Tx}$$
where $A$ is an $n \times n$ symmetric matrix.

It appears frequently in statistical problems, so its properties are recorded here.

Denote the eigenvalues of $A$ and their corresponding eigenvectors by $\lambda_1,\lambda_2,\dots,\lambda_n$ and $v_1,v_2,\dots,v_n$, ordered so that
$$\lambda_{min} = \lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n = \lambda_{max}$$

Then the Rayleigh quotient satisfies:

$$\max_x R(A,x) = \lambda_n, \qquad \min_x R(A,x) = \lambda_1$$

The proof is as follows.

Since $A$ is a symmetric matrix, there exists an orthogonal matrix $U$ giving the eigendecomposition $A = U\Gamma U^T$,
where $\Gamma = \mathrm{diag}(\lambda_1,\lambda_2,\dots,\lambda_n)$ is the diagonal matrix formed from the eigenvalues of $A$.

With this, the original expression becomes

$$\begin{aligned} R(A,x) & = \frac{x^TU\Gamma U^Tx}{x^Tx} \\ & = \frac{(U^Tx)^T\Gamma(U^Tx)}{x^Tx} \end{aligned}$$

Let $y = U^Tx$.

Then we have

$$\begin{aligned} R(A,x) & = \frac{y^T\Gamma y}{x^Tx} \\ & = \frac{\sum^n_{i=1}\lambda_i|y_i|^2}{\sum^n_{i=1}|x_i|^2} \end{aligned}$$

By the ordering of the $\lambda_i$, we get:

$$\lambda_1\sum\limits_{i=1}^{n}|y_i|^2 \leq \sum\limits^n_{i=1}\lambda_i|y_i|^2 \leq \lambda_n\sum\limits_{i=1}^{n}|y_i|^2$$

Hence:

$$\frac{\lambda_1\sum\limits_{i=1}^{n}|y_i|^2}{\sum^n_{i=1}|x_i|^2} \leq R(A,x) \leq \frac{\lambda_n\sum\limits_{i=1}^{n}|y_i|^2}{\sum^n_{i=1}|x_i|^2}$$

Next we show that

$$\sum\limits_{i=1}^{n}|y_i|^2 = \sum^n_{i=1}|x_i|^2$$

First,

$$y_i = \sum\limits_{j=1}^{n}u_{ji}x_j$$
$$\Rightarrow |y_i|^2 = \sum\limits_{j=1}^{n}\sum\limits_{k=1}^{n}x_ju_{ji}u_{ki}x_k$$
$$\Rightarrow \sum\limits_{i=1}^{n}|y_i|^2 = \sum\limits_{j=1}^{n}\sum\limits_{k=1}^{n}x_j\left(\sum\limits_{i=1}^{n}u_{ji}u_{ki}\right)x_k$$

Since $U$ is an orthogonal matrix, $UU^T = I$, so $\sum\limits_{i=1}^{n}u_{ji}u_{ki} = I_{jk}$.

When $j\neq k$, $I_{jk} = 0$; when $j = k$, $I_{jk} = 1$.

So we have
$$\sum\limits_{i=1}^{n}|y_i|^2 = \sum^n_{i=1}|x_i|^2$$
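
As a quick numerical sanity check of this norm-preservation identity, here is a minimal numpy sketch (the matrix size and random seed are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random orthogonal matrix U, obtained from a QR decomposition.
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))
x = rng.standard_normal(5)
y = U.T @ x

# The orthogonal change of coordinates preserves the squared norm.
print(np.allclose(np.sum(y**2), np.sum(x**2)))  # True
```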

At this point we have shown that

$$\lambda_1 \leq R(A,x) \leq \lambda_n$$

holds.

When $x = v_1$, $R(A,x) = \lambda_1$; when $x = v_n$, $R(A,x) = \lambda_n$, so both bounds are attained.
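
The bound and its attainment are easy to verify numerically. Below is a minimal numpy sketch; the matrix size, seed, and the helper name `rayleigh` are illustrative choices, not from the original post:

```python
import numpy as np

rng = np.random.default_rng(0)

def rayleigh(A, x):
    """Rayleigh quotient R(A, x) = x^T A x / x^T x."""
    return (x @ A @ x) / (x @ x)

# A random symmetric matrix (size chosen arbitrarily for illustration).
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2

# np.linalg.eigh returns eigenvalues in ascending order and orthonormal eigenvectors.
lam, V = np.linalg.eigh(A)

# The quotient attains lambda_1 at v_1 and lambda_n at v_n ...
print(np.isclose(rayleigh(A, V[:, 0]), lam[0]))    # True
print(np.isclose(rayleigh(A, V[:, -1]), lam[-1]))  # True

# ... and stays inside [lambda_1, lambda_n] for arbitrary nonzero x.
samples = [rayleigh(A, rng.standard_normal(n)) for _ in range(1000)]
print(lam[0] <= min(samples) and max(samples) <= lam[-1])  # True
```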

$R(A,x)$ is invariant under rescaling of $x$:

Let $x' = cx$ with $c \neq 0$; then $R(A,x') = \frac{x'^TAx'}{x'^Tx'} = \frac{c^2\,x^TAx}{c^2\,x^Tx} = R(A,x)$.
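
A one-line check of this scale invariance (again just an illustrative numpy sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
x = rng.standard_normal(n)

# Rescaling x by any nonzero constant leaves the quotient unchanged.
print(all(np.isclose(rayleigh(A, c * x), rayleigh(A, x)) for c in (0.1, -3.0, 42.0)))  # True
```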

We can also approach the problem with Lagrange multipliers. By scale invariance we may assume $x^Tx = 1$ and look for the extrema of $R(A,x) = x^TAx$ under this constraint:
$$L(x,\lambda) = x^TAx - \lambda(x^Tx - 1)$$
Differentiating with respect to $x$ and setting the derivative to zero gives

$$\Rightarrow Ax-\lambda x = 0$$

This is exactly the defining equation of an eigenpair of $A$, so the same result as before follows immediately.

Next, the generalized Rayleigh quotient:

$$R(A,B,x) = \frac{x^TAx}{x^TBx}$$

From the Lagrange multiplier point of view:

$$Ax-\lambda Bx = 0$$
$$\Rightarrow B^{-1}Ax = \lambda x$$

That is, the extrema of $R(A,B,x)$ are attained at the eigenvectors of $B^{-1}A$.
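
This can also be checked numerically. The sketch below (with an arbitrarily constructed symmetric $A$, a positive definite $B$, and a hypothetical helper `gen_rayleigh`) confirms that the quotient evaluated at each eigenvector of $B^{-1}A$ reproduces the corresponding eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

def gen_rayleigh(A, B, x):
    """Generalized Rayleigh quotient R(A, B, x) = x^T A x / x^T B x."""
    return (x @ A @ x) / (x @ B @ x)

# Random symmetric A and symmetric positive definite B (illustrative construction).
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
N = rng.standard_normal((n, n))
B = N @ N.T + n * np.eye(n)

# Each eigenpair of B^{-1} A reproduces its eigenvalue through the quotient,
# so in particular the max/min of R(A, B, x) sit at the extreme eigenvectors.
lam, V = np.linalg.eig(np.linalg.inv(B) @ A)
print(all(np.isclose(gen_rayleigh(A, B, V[:, i].real), lam[i].real) for i in range(n)))  # True
```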

Alternatively, we can convert it into the ordinary Rayleigh quotient form.

Let $y = B^{\frac{1}{2}}x$; then
$$R(A,B,x) = \frac{y^T(B^{-\frac{1}{2}})^TA(B^{-\frac{1}{2}})y}{y^Ty}$$

That is, its values lie within the interval spanned by the eigenvalues of the matrix $(B^{-\frac{1}{2}})^TAB^{-\frac{1}{2}}$.

Moreover, $B^{-1}A$ has the same eigenvalues as $(B^{-\frac{1}{2}})^TAB^{-\frac{1}{2}}$; the only difference is that the corresponding eigenvectors differ by the same fixed transformation (a factor of $B^{\frac{1}{2}}$).
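
A small numpy sketch to confirm that the two matrices share the same spectrum (here $B^{-\frac{1}{2}}$ is built from the eigendecomposition of $B$; the construction of $A$ and $B$ is again arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
N = rng.standard_normal((n, n))
B = N @ N.T + n * np.eye(n)

# Build B^{-1/2} from the eigendecomposition of B (B is symmetric positive definite,
# so its inverse square root is symmetric as well).
d, Q = np.linalg.eigh(B)
B_inv_sqrt = Q @ np.diag(d ** -0.5) @ Q.T

# The spectra of B^{-1} A and of B^{-1/2} A B^{-1/2} coincide.
ev1 = np.sort(np.linalg.eigvals(np.linalg.inv(B) @ A).real)
ev2 = np.linalg.eigvalsh(B_inv_sqrt @ A @ B_inv_sqrt)  # already ascending
print(np.allclose(ev1, ev2))  # True
```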

NOTE:
Taking the eigendecomposition of $(B^{-\frac{1}{2}})^TA(B^{-\frac{1}{2}})$ as $VSV^T$, we get
$$y^T(B^{-\frac{1}{2}})^TA(B^{-\frac{1}{2}})y = y^TVSV^Ty = Z^TSZ$$
where $Z = V^Ty$ and $S$ is the diagonal matrix whose entries are the eigenvalues.
Then $Z^TSZ = \sum\limits_{i=1}^{n}\lambda_iz_{i}^2$, and the constraint becomes $Z^TZ = y^Ty = 1$,
so the result above follows in the same way as before.


Reposted from blog.csdn.net/qq_37920823/article/details/90028307