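Singular Value Decomposition (SVD)

A powerful tool for extracting information: it simplifies data, removes noise, and improves algorithm results.

With an SVD implementation, we can represent the original data set with a much smaller one. Doing so effectively removes noise and redundant information. SVD is a powerful dimensionality-reduction tool: it can be used to approximate a matrix and extract its important features. Keeping 80%~90% of the matrix's energy is enough to retain the important features and drop the noise.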
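To make the 80%~90% energy rule concrete, here is a minimal sketch. It assumes that "energy" means the sum of squared singular values, which is a common convention; the helper name `rank_for_energy` and the synthetic data are made up for illustration.

```python
import numpy as np

def rank_for_energy(singular_values, threshold=0.9):
    """Return the smallest k whose top-k singular values keep `threshold` of the energy."""
    # Assumption: energy is the sum of squared singular values
    energy = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    # Small tolerance guards against floating-point round-off at the threshold
    return int(np.searchsorted(cumulative, threshold - 1e-12) + 1)

# Hypothetical data: a rank-2 signal plus a little noise
rng = np.random.default_rng(0)
signal = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 10))
data = signal + 0.01 * rng.normal(size=(20, 10))

sigma = np.linalg.svd(data, compute_uv=False)  # singular values only, descending
k = rank_for_energy(sigma, threshold=0.9)
print("singular values kept:", k, "of", len(sigma))
```

Because the signal has rank 2, one or two singular values should be enough to cross the 90% mark here; the remaining ones mostly carry noise.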

Applications of SVD

This section first covers possible uses of SVD; the next section covers the underlying concepts.

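Latent Semantic Indexing

SVD has been around for well over a hundred years, but only in recent decades have we found more uses for it in computing. One of the earliest applications of SVD was information retrieval. Methods that use SVD this way are called Latent Semantic Indexing (LSI) or Latent Semantic Analysis (LSA).

In LSI, a matrix is built from documents and words. Applying SVD to this matrix produces a set of singular values. These singular values represent concepts or topics in the documents, which can be used for more efficient document retrieval.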
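The idea can be sketched on a toy term-document matrix. Everything below (the terms, the counts, the choice of k = 2, and the cosine scoring) is a hypothetical illustration rather than a reference LSI implementation.

```python
import numpy as np

# Hypothetical term-document count matrix: rows are terms, columns are documents.
# Documents 0 and 1 are about the sea; documents 2 and 3 are about politics.
terms = ["ship", "boat", "ocean", "vote", "election"]
A = np.array([
    [1, 1, 0, 0],  # ship
    [1, 0, 0, 0],  # boat
    [1, 1, 0, 0],  # ocean
    [0, 0, 1, 1],  # vote
    [0, 0, 1, 0],  # election
], dtype=float)

U, sigma, VT = np.linalg.svd(A, full_matrices=False)

k = 2                      # keep the two strongest "concepts"
U_k = U[:, :k]
doc_vectors = U_k.T @ A    # each column is a document in concept space

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Project a one-word query ("boat") into the same concept space
query = np.zeros(len(terms))
query[terms.index("boat")] = 1.0
query_vector = U_k.T @ query

scores = [cosine(query_vector, doc_vectors[:, j]) for j in range(A.shape[1])]
print(np.round(scores, 3))
```

Document 1 never contains the word "boat", yet it scores as high as document 0 because both load on the same concept, while the politics documents score near zero.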

See the blog post on topic models: https://blog.csdn.net/Shingle_/article/details/81989090

Data compression and dimensionality reduction

For example, image compression.
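A minimal sketch of SVD-based image compression, using a synthetic 32x32 array in place of a real image. The helper names `compress` and `decompress` are invented for this example.

```python
import numpy as np

def compress(image, k):
    """Keep only the top-k singular triplets of a 2-D array."""
    U, sigma, VT = np.linalg.svd(image, full_matrices=False)
    return U[:, :k], sigma[:k], VT[:k, :]

def decompress(U_k, sigma_k, VT_k):
    """Rebuild the rank-k approximation from the stored factors."""
    return (U_k * sigma_k) @ VT_k

# Hypothetical grayscale "image": a bright rectangle on a dark background
image = np.zeros((32, 32))
image[8:24, 12:20] = 1.0

U_k, sigma_k, VT_k = compress(image, k=1)
approx = decompress(U_k, sigma_k, VT_k)

stored = U_k.size + sigma_k.size + VT_k.size
print("numbers stored:", stored, "instead of", image.size)
print("max reconstruction error:", np.abs(approx - image).max())
```

This particular image has rank 1, so a single singular triplet reproduces it up to round-off. Real images need a larger k, and the approximation is then lossy.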

Matrix factorization with SVD


$\mathrm{Data}_{m\times n}=U_{m\times m}\,\Sigma_{m\times n}\,V_{n\times n}^{T}$

import numpy as np
# 2x2 example matrix
A = np.array([[4, 0], [3, -5]])
# Sigma is returned as a 1-D array of singular values; VT is V transposed
U, Sigma, VT = np.linalg.svd(A)
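Note that `np.linalg.svd` returns the singular values as a 1-D array, not as a matrix. A short sketch of how to rebuild the diagonal matrix and check the factorization on the same example:

```python
import numpy as np

A = np.array([[4, 0], [3, -5]])
U, Sigma, VT = np.linalg.svd(A)

# Turn the 1-D array of singular values into a diagonal matrix
S = np.diag(Sigma)

# Multiplying the three factors should give back A (up to round-off)
A_rebuilt = U @ S @ VT
print(np.allclose(A_rebuilt, A))

# U and VT are orthogonal: their transposes act as inverses
print(np.allclose(U.T @ U, np.eye(2)), np.allclose(VT @ VT.T, np.eye(2)))

print(Sigma)  # approximately sqrt(40) and sqrt(10)
```

For a non-square m x n matrix, `np.diag(Sigma)` alone does not have the right shape; the singular values would need to be placed into an m x n matrix of zeros instead.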

Computing the SVD

Step 1. Compute the transpose A^T and the product A^TA.
Step 2. Determine the eigenvalues of A^TA and sort them in descending order of absolute value. Take their square roots to obtain the singular values of A.
Step 3. Construct the diagonal matrix S by placing the singular values in descending order along its diagonal. Compute its inverse, S^-1 (this requires every singular value to be nonzero).
Step 4. Using the ordered eigenvalues from Step 2, compute the eigenvectors of A^TA. Place these eigenvectors along the columns of V and compute its transpose, V^T.
Step 5. Compute U as U = AVS^-1. To check the result, multiply out the full SVD, A = USV^T.
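The five steps can be followed by hand in NumPy. This is a sketch for the 2x2 example matrix used earlier and assumes A has full rank so that S can be inverted; in practice `np.linalg.svd` is the better choice.

```python
import numpy as np

A = np.array([[4.0, 0.0], [3.0, -5.0]])

# Step 1: transpose and A^T A
AtA = A.T @ A

# Step 2: eigenvalues of A^T A, sorted in descending order
# (eigh is for symmetric matrices and returns ascending eigenvalues)
eigvals, eigvecs = np.linalg.eigh(AtA)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
singular_values = np.sqrt(eigvals)

# Step 3: diagonal matrix S and its inverse (assumes no zero singular value)
S = np.diag(singular_values)
S_inv = np.linalg.inv(S)

# Step 4: eigenvectors of A^T A as the columns of V
V = eigvecs
VT = V.T

# Step 5: U = A V S^-1, then check A = U S V^T
U = A @ V @ S_inv
A_rebuilt = U @ S @ VT

print(singular_values)           # approximately sqrt(40) and sqrt(10)
print(np.allclose(A_rebuilt, A))
```

The signs of the columns of U and V may differ from what `np.linalg.svd` returns; singular vectors are only determined up to a sign flip.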

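U and V are orthogonal matrices; S is a diagonal matrix.

Compared with PCA: PCA works with the eigenvalues of a matrix to find the important features of the data. The singular values are the square roots of the (nonzero) eigenvalues of Data * Data^T.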
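A quick numerical check of this relationship on random data. The data set is hypothetical, and the PCA part assumes the usual formulation in which PCA takes the eigenvalues of the sample covariance matrix of mean-centered data.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(6, 3))   # hypothetical: 6 samples, 3 features

# Singular values of Data
sigma = np.linalg.svd(data, compute_uv=False)

# Eigenvalues of Data * Data^T (6x6, rank 3): keep the 3 largest
eig_ddt = np.linalg.eigvalsh(data @ data.T)[::-1][:3]
sqrt_eigs = np.sqrt(np.clip(eig_ddt, 0, None))   # clip tiny negative round-off
print(np.allclose(sigma, sqrt_eigs))

# PCA link (assumption: PCA on the sample covariance of centered data)
centered = data - data.mean(axis=0)
cov_eigs = np.linalg.eigvalsh(np.cov(centered, rowvar=False))[::-1]
sigma_centered = np.linalg.svd(centered, compute_uv=False)
print(np.allclose(cov_eigs, sigma_centered ** 2 / (data.shape[0] - 1)))
```

Both checks should print `True`: the singular values match the square roots of the eigenvalues, and the covariance eigenvalues used by PCA are the squared singular values of the centered data divided by n - 1.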

https://cs.fit.edu/~dmitra/SciComp/Resources/singular-value-decomposition-fast-track-tutorial.pdf

http://www.ce.yildiz.edu.tr/personal/banud/file/1201/latent-semantic-indexing-fast-track-tutorial.pdf

Machine Learning in Action, Section 14.1, p. 253

Introduction to Linear Algebra, Section 6.7, p. 364