Data Mining - Matrix Decomposition: the SVD

Although the QR decomposition is useful and numerically stable, it has a drawback: it provides only one set of orthonormal vectors, a basis for the column space of the original matrix A.
The SVD, by contrast, provides orthonormal bases for both the column space and the row space of the original matrix.
[Figure: the SVD, A = UΣVᵀ]
The column vectors of the matrices U and V are the singular vectors; the diagonal entries of the middle diagonal matrix Σ are the singular values.

[Figure: full and thin SVD]
The figure above shows two forms of the SVD: the first is the full SVD, the second the thin SVD (also known as the economy SVD).

% MATLAB functions
[U,S,V] = svd(A)   % the first form: full SVD
[U,S,V] = svd(A,0) % the second form: thin (economy) SVD
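For readers working in Python rather than MATLAB, a minimal sketch of the same two factorizations with NumPy (an assumed dependency; `full_matrices` switches between the full and thin forms):

```python
import numpy as np

A = np.random.default_rng(0).normal(size=(5, 3))

# Full SVD: U is 5x5, S holds 3 singular values, Vt is 3x3
U, S, Vt = np.linalg.svd(A, full_matrices=True)

# Thin (economy) SVD: U is only 5x3
U_thin, S_thin, Vt_thin = np.linalg.svd(A, full_matrices=False)

# Both forms reconstruct A exactly (up to rounding)
A_full = U[:, :3] @ np.diag(S) @ Vt
A_thin = U_thin @ np.diag(S_thin) @ Vt_thin
print(np.allclose(A, A_full), np.allclose(A, A_thin))  # True True
```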

Unitary invariance of the matrix 2-norm: multiplying a matrix A by a unitary matrix U (an orthogonal matrix, in the real case) leaves the 2-norm unchanged.
(Note: this can be proved from ‖A‖₂² = λ_max(AᵀA) together with the facts that a similarity transformation changes neither the trace nor the eigenvalues of a matrix.)
This gives the following conclusion; the parts marked in color are the key points.
[Figure]
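This invariance is easy to check numerically; a small sketch with NumPy (an assumed dependency), using a random orthogonal Q obtained from a QR factorization:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))

# Build an orthogonal matrix Q via the QR factorization of a random matrix
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))

# The 2-norm (largest singular value) is unchanged by orthogonal multiplication
n_A = np.linalg.norm(A, 2)
n_QA = np.linalg.norm(Q @ A, 2)
print(abs(n_A - n_QA) < 1e-10)  # True
```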
The SVD gives orthonormal bases for the range (column space) and the null space of the original matrix A, as shown in the figure below.
[Figure: bases for the range and null space of A]
The next figure shows that the singular values on the diagonal of the middle matrix Σ reveal the rank of the original matrix A: the rank equals the number of nonzero singular values.
[Figure]
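A quick numerical illustration with NumPy (an assumed dependency): the rank can be read off as the number of nonzero singular values.

```python
import numpy as np

# A rank-2 matrix built as the sum of two outer products
u1, u2 = np.array([1., 0., 0., 0.]), np.array([0., 1., 0., 0.])
v1, v2 = np.array([1., 1., 1.]), np.array([1., -1., 0.])
A = np.outer(u1, v1) + np.outer(u2, v2)

# Count the singular values above a small numerical tolerance
S = np.linalg.svd(A, compute_uv=False)
rank = int(np.sum(S > 1e-10))
print(rank)  # 2
```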
The SVD can also be used to denoise a matrix.
Let A = A0 + N, where A is what we actually observe, A0 is the true noise-free data, and N is (relatively small) noise. We can use the SVD to remove the noise and approximately recover the original matrix A0. The method is to compute the SVD of A and discard the small singular values together with their corresponding singular vectors, as in the figure below. This is known as the truncated SVD, a low-rank approximate decomposition; in the figure, A_k is the approximation of the original matrix A.
[Figure: truncated SVD]
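A sketch of truncated-SVD denoising with NumPy (an assumed dependency; the rank-2 data and the noise level here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# True data A0 of rank 2, plus small noise N
A0 = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 30))
N = 1e-3 * rng.normal(size=(50, 30))
A = A0 + N

# Truncated SVD: keep only the k largest singular values/vectors
k = 2
U, S, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] * S[:k] @ Vt[:k]

# Truncation discards most of the noise energy
err_before = np.linalg.norm(A - A0)
err_after = np.linalg.norm(A_k - A0)
print(err_after < err_before)  # True
```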
The 2-norm error of the rank-k approximation is the (k+1)-th singular value:
‖A − A_k‖₂ = σ_{k+1}
The Frobenius-norm error of the rank-k approximation is:
‖A − A_k‖_F = (σ_{k+1}² + ⋯ + σ_r²)^{1/2}
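Both error formulas can be verified numerically; a NumPy sketch (an assumed dependency):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 4))
U, S, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] * S[:k] @ Vt[:k]

# 2-norm error equals sigma_{k+1} (index k, 0-based)
print(np.isclose(np.linalg.norm(A - A_k, 2), S[k]))  # True

# Frobenius error equals sqrt(sigma_{k+1}^2 + ... + sigma_r^2)
print(np.isclose(np.linalg.norm(A - A_k, 'fro'),
                 np.sqrt(np.sum(S[k:] ** 2))))  # True
```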
The following lemma gives the definition of the matrix inner product:
⟨A, B⟩ = tr(AᵀB)
The left and right singular vectors from the SVD show that the matrix A can be written as a sum of rank-1 matrices:
A = σ₁u₁v₁ᵀ + σ₂u₂v₂ᵀ + ⋯ + σ_r u_r v_rᵀ
Moreover, these rank-1 matrices are mutually orthogonal under the matrix inner product:
⟨u_i v_jᵀ, u_k v_lᵀ⟩ = (u_iᵀu_k)(v_jᵀv_l)
The inner product of two such rank-1 matrices is nonzero only when i = k and j = l, in which case it equals 1.
The following figure illustrates the result of the low-rank approximation:
[Figure]
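A NumPy sketch (an assumed dependency) of the rank-1 expansion and the orthonormality of its terms:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 3))
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# A as a sum of rank-1 matrices sigma_i * u_i * v_i^T
A_sum = sum(S[i] * np.outer(U[:, i], Vt[i]) for i in range(len(S)))
print(np.allclose(A, A_sum))  # True

# The rank-1 terms u_i v_i^T are orthonormal under <X, Y> = tr(X^T Y)
E0 = np.outer(U[:, 0], Vt[0])
E1 = np.outer(U[:, 1], Vt[1])
print(np.isclose(np.trace(E0.T @ E0), 1.0),
      np.isclose(np.trace(E0.T @ E1), 0.0))  # True True
```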
Solving least-squares problems with the SVD:

  1. The full-rank case
    [Figure]
    The proof is as follows:
    [Figures: proof]
  2. The case where the original matrix A is rank-deficient (number of rows > number of columns)
    In this case the solution is not unique, but there is a unique minimum-norm solution:
    [Figures]
    The proof is as follows:
    [Figure]
  3. Solving underdetermined systems with the SVD (number of rows < number of columns)
    First, the definition of an underdetermined system:
    [Figure]
    Conclusion:
    [Figure]
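For the rank-deficient case, the minimum-norm least-squares solution can be sketched via the SVD pseudoinverse, inverting only the nonzero singular values (NumPy is an assumed dependency; the tolerance `tol` is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(5)

# Rank-deficient least-squares problem: A is 6x4 but has rank 3
A = rng.normal(size=(6, 3)) @ rng.normal(size=(3, 4))
b = rng.normal(size=6)

# x = V * Sigma^+ * U^T * b, zeroing the reciprocals of tiny singular values
U, S, Vt = np.linalg.svd(A, full_matrices=False)
tol = 1e-10
S_inv = np.array([1.0 / s if s > tol else 0.0 for s in S])
x = Vt.T @ (S_inv * (U.T @ b))

# Matches NumPy's minimum-norm least-squares solver
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ref))  # True
```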
% MATLAB
[U,S,V] = svds(A,K) % partial SVD for large sparse matrices:
                    % computes the K largest singular values and singular vectors.

A drawback of the SVD is its relatively high cost and its poor reusability under updates (for example, when new rows or columns are added to the matrix). In that case one can consider the complete orthogonal decomposition instead:
[Figure]
If we use the truncated SVD, how do we decide how many singular values and singular vectors to keep (that is, how do we determine k)?
[Figure]
Using the reduced-rank model:
[Figures]
Use case:
For the term-document matrix created earlier, we now compute the distance from the query vectors q₁, q₂ to each rank-k low-rank approximation.
[Figure]
The green text shows the computation using the reduced-rank model above:
[Figures]
For different values of k, the degree to which these two query vectors are approximated (measured by relative error) gives the following results:
[Figure]
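This kind of experiment can be sketched as follows (NumPy is an assumed dependency; the term-document matrix and query here are made-up stand-ins for the ones in the lecture):

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical 10-term x 8-document matrix and a query vector
A = rng.random(size=(10, 8))
q = rng.random(size=10)

def rank_k_approx(A, k):
    """Truncated SVD: best rank-k approximation of A."""
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * S[:k] @ Vt[:k]

# The relative error of the approximation shrinks as k grows
errs = [np.linalg.norm(A - rank_k_approx(A, k)) / np.linalg.norm(A)
        for k in (2, 4, 8)]
print(errs[0] > errs[1] > errs[2])  # True

# Cosine similarity of the query against each document column of A_k
A_k = rank_k_approx(A, 4)
cos = (q @ A_k) / (np.linalg.norm(q) * np.linalg.norm(A_k, axis=0))
print(cos.shape)  # (8,)
```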



Origin blog.csdn.net/qq_43448491/article/details/103035828