Singular Value Decomposition (SVD) can be regarded as a generalization of the eigenvalue decomposition of square matrices, applicable to matrices of arbitrary shape.
For a matrix $A\in \mathbb{R}^{m\times n}$, assuming without loss of generality that $m\geq n$, singular value decomposition seeks:
$A=U\Sigma V^T$
where $U$ and $V$ are orthogonal matrices of order $m$ and $n$ respectively, whose column vectors are called the left and right singular vectors, and $\Sigma$ is an $m\times n$ diagonal matrix whose non-negative main-diagonal entries are arranged in descending order, called the singular value matrix. As shown below:
If $\Sigma$ has rank $r$, the zero rows and columns of the matrix can be omitted to obtain a more compact form:
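As a sketch of the two forms, NumPy's `np.linalg.svd` returns the full decomposition by default, and `full_matrices=False` drops the columns of $U$ beyond the first $n$ (the "thin" form; a rank-$r$ compact form would truncate further at $r$):

```python
import numpy as np

A = np.arange(12, dtype=float).reshape(4, 3)  # m=4 >= n=3

# Full SVD: U is 4x4, Sigma is 4x3, V^T is 3x3
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Thin SVD: only the first n columns of U are kept
Uc, sc, Vtc = np.linalg.svd(A, full_matrices=False)
print(U.shape, Uc.shape)  # (4, 4) (4, 3)

# Both forms reconstruct A exactly
Sigma = np.zeros((4, 3))
np.fill_diagonal(Sigma, s)
print(np.allclose(A, U @ Sigma @ Vt))          # True
print(np.allclose(A, Uc @ np.diag(sc) @ Vtc))  # True
```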
A singular value decomposition always exists, which can be proved constructively by building the matrices $U,\Sigma,V$; for a detailed proof, see Li Hang's *Statistical Learning Methods*, or refer directly to the more concise and clear calculation method. In short: compute the eigenvalues of $AA^T$ and $A^TA$ together with the corresponding orthogonal eigenvector matrices, and form the singular value matrix from the square roots of the eigenvalues.
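This calculation can be checked numerically: the eigenvalues of $A^TA$ are the squared singular values, so their square roots should match what a library SVD routine returns (a minimal sketch using NumPy on a random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# Eigen-decompose the symmetric matrix A^T A: its eigenvalues are the
# squared singular values, its eigenvectors the right singular vectors
evals, V = np.linalg.eigh(A.T @ A)

# eigh returns eigenvalues in ascending order; reverse for descending
sigma = np.sqrt(evals[::-1])

# Compare against the library routine
s_ref = np.linalg.svd(A, compute_uv=False)
print(np.allclose(sigma, s_ref))  # True
```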
Geometric meaning: referring to the first figure above, for a vector $x$, the transformation $Ax=U\Sigma V^Tx$ can be understood as first applying the rotation given by the orthogonal matrix $V^T$, then the scaling given by $\Sigma$ (which also maps the result into $m$-dimensional space), and finally the rotation given by $U$.
The singular values usually decay quickly, so keeping only the first few largest singular values and discarding the smaller ones achieves matrix compression.
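A minimal sketch of this truncation, assuming a matrix built to have rapidly decaying singular values. By the Eckart–Young theorem, the truncated SVD is the best rank-$k$ approximation in Frobenius norm, and the approximation error equals the norm of the discarded singular values:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 20)) @ rng.standard_normal((20, 80))  # rank <= 20

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 5  # keep only the 5 largest singular values
A_k = (U[:, :k] * s[:k]) @ Vt[:k]

# The Frobenius error equals the norm of the dropped singular values
err = np.linalg.norm(A - A_k)
print(np.isclose(err, np.linalg.norm(s[k:])))  # True
```

Storing $U_k$, $s_k$, and $V_k^T$ requires $k(m+n+1)$ numbers instead of $mn$, which is where the compression comes from.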
The inverse, left inverse, right inverse, and pseudo-inverse of a matrix can all be obtained through singular value decomposition, see here . The inverse exists only for full-rank square matrices, the left inverse only for matrices of full column rank, and the right inverse only for matrices of full row rank; the pseudo-inverse provides an approximate inverse when neither the rows nor the columns are of full rank. The pseudo-inverse cannot fully undo the action of the original matrix, so information is lost. When the rows are of full rank the pseudo-inverse coincides with a right inverse, and when the columns are of full rank it coincides with a left inverse.
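A sketch of the pseudo-inverse via SVD, on an assumed small example with full column rank: invert the nonzero singular values, transpose the layout, and the result agrees with NumPy's `np.linalg.pinv` and acts as a left inverse:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.],
              [3., 5.]])  # 3x2, full column rank

# Pseudo-inverse A^+ = V Sigma^+ U^T, inverting only nonzero singular values
U, s, Vt = np.linalg.svd(A, full_matrices=False)
s_inv = np.where(s > 1e-10, 1.0 / s, 0.0)  # guard against zero singular values
A_pinv = Vt.T @ np.diag(s_inv) @ U.T

print(np.allclose(A_pinv, np.linalg.pinv(A)))  # True
# Full column rank: the pseudo-inverse is a left inverse
print(np.allclose(A_pinv @ A, np.eye(2)))      # True
```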