[Bioinformatics] Singular value decomposition (SVD)

Table of contents

1. Singular value decomposition (SVD)

2. Python implementation

1. Call the library function np.linalg.svd()

2. Custom implementation

3. Link prediction with SVD


1. Singular value decomposition (SVD)

        The core idea of SVD is to keep the most important information in a matrix by approximating it with a matrix of lower rank, which achieves both dimensionality reduction and denoising of the data.
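
As a rough illustration of the denoising idea, the sketch below builds a synthetic rank-2 matrix, adds Gaussian noise, and keeps only the two largest singular values. The matrix sizes, the noise level, and all variable names are assumptions chosen for the demonstration; they do not come from the original article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic low-rank "signal": a 50 x 40 matrix of rank 2 (assumed sizes)
signal = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 40))

# Observed data = signal + small Gaussian noise (assumed noise level 0.1)
noisy = signal + 0.1 * rng.normal(size=signal.shape)

# Truncated SVD: keep only the k largest singular values
k = 2
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
denoised = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius-norm error relative to the clean signal, before and after truncation
err_noisy = np.linalg.norm(noisy - signal)
err_denoised = np.linalg.norm(denoised - signal)
print(f"error before truncation: {err_noisy:.3f}")
print(f"error after truncation:  {err_denoised:.3f}")
```

In this setup the truncated reconstruction should be closer to the clean signal than the noisy input, because most of the noise lies in the directions that are discarded.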

ChatGPT:

        SVD (Singular Value Decomposition) is a commonly used matrix decomposition method that can decompose a matrix into the product of three matrices: U, Σ and V^T. Here, U and V are orthogonal matrices and Σ is a diagonal matrix. The detailed process of SVD decomposition is as follows:

        Suppose we have an m×n matrix A, where m represents the number of rows and n represents the number of columns. The goal of SVD decomposition is to find three matrices U, Σ and V^T such that: A = UΣV^T

        Here U is an m×m orthogonal matrix, Σ is an m×n (rectangular) diagonal matrix, and V^T is the transpose of an n×n orthogonal matrix V. Orthogonality means U^TU = I and V^TV = I, where I is the identity matrix of the matching size.

The diagonal entries of Σ are called the singular values, usually denoted σ. They are non-negative and ordered from largest to smallest; the nonzero ones can be written σ₁ ≥ σ₂ ≥ ⋯ ≥ σᵣ > 0, where r is the rank of matrix A.
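
The shapes and properties stated above can be checked numerically. The following is a minimal sketch using an arbitrary 2×3 matrix (an assumption for illustration); it builds the m×n matrix Σ from the 1-D array that NumPy returns and tests orthogonality, the ordering of the singular values, and the link to the rank.

```python
import numpy as np

# Arbitrary 2 x 3 example matrix (m = 2, n = 3)
A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])
m, n = A.shape

# full_matrices=True (the default) returns U as m x m and Vt as n x n
U, s, Vt = np.linalg.svd(A)

# Build the m x n "diagonal" matrix Sigma from the 1-D array of singular values
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, Vt.shape)
print(np.allclose(U.T @ U, np.eye(m)))      # U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(n)))    # V is orthogonal
print(np.allclose(U @ Sigma @ Vt, A))       # A = U Sigma V^T
print(np.all(s[:-1] >= s[1:]))              # singular values are non-increasing
print(np.sum(s > 1e-10) == np.linalg.matrix_rank(A))  # nonzero count equals rank
```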

The steps of SVD decomposition are as follows:

  1. Compute the transpose A^T and form the product AA^T, which is an m×m symmetric matrix.

  2. Perform an eigenvalue decomposition of AA^T to obtain its eigenvalues and the corresponding eigenvectors. Arrange the eigenvalues from largest to smallest.

  3. Construct the orthogonal matrix U from these eigenvectors. The columns of U are the unit eigenvectors of AA^T, ordered by decreasing eigenvalue.

  4. Form the product A^TA, which is an n×n symmetric matrix.

  5. Perform an eigenvalue decomposition of A^TA to obtain its eigenvalues and the corresponding eigenvectors. Arrange the eigenvalues from largest to smallest.

  6. Construct the orthogonal matrix V from these eigenvectors. The columns of V are the unit eigenvectors of A^TA, ordered by decreasing eigenvalue.

  7. Compute the singular values from the eigenvalues. Each singular value is the square root of the corresponding eigenvalue: σᵢ = √λᵢ. (AA^T and A^TA share the same nonzero eigenvalues.)

  8. Construct the m×n diagonal matrix Σ from the singular values.

Finally, the three factors together give the decomposition A = UΣV^T.
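
The reason the steps above work can be seen by substituting A = UΣV^T into the two products and using U^TU = I and V^TV = I:

```latex
\begin{aligned}
A A^{T} &= U \Sigma V^{T} V \Sigma^{T} U^{T} = U \left(\Sigma \Sigma^{T}\right) U^{T}, \\
A^{T} A &= V \Sigma^{T} U^{T} U \Sigma V^{T} = V \left(\Sigma^{T} \Sigma\right) V^{T}.
\end{aligned}
```

Both ΣΣ^T and Σ^TΣ are diagonal with entries σᵢ², so the columns of U and V are eigenvectors of AA^T and A^TA, and the eigenvalues are the squared singular values. Multiplying A = UΣV^T by V on the right also gives AV = UΣ, that is, A vᵢ = σᵢ uᵢ. This relation ties the sign of each uᵢ to the sign of vᵢ, which matters because an eigenvalue decomposition on its own fixes eigenvectors only up to sign.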

2. Python implementation

1. Call the library function np.linalg.svd()

In Python, you can use the NumPy library to implement SVD decomposition. Here is a sample code:

import numpy as np

# Define a matrix A
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Perform the SVD with NumPy's svd function
U, s, Vt = np.linalg.svd(A)

# Print the factors
print("Matrix U:")
print(U)
print("Singular values:")
print(s)
print("Transpose of matrix V:")
print(Vt)

        Running the code above prints the SVD of matrix A. Here U is an orthogonal matrix, s is a one-dimensional array containing the singular values of A in descending order, and Vt is the transpose of V.
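
Because s comes back as a 1-D array rather than a matrix, multiplying the factors back together needs np.diag(s) or broadcasting. The sketch below shows this for the same matrix A, and also the full_matrices=False option, which returns the reduced factors for non-square input. The 4×2 matrix B is an assumed example.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
U, s, Vt = np.linalg.svd(A)

# s is 1-D, so turn it into a diagonal matrix before multiplying
A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A_rebuilt, A))

# For a non-square matrix, full_matrices=False gives the reduced ("economy") SVD
B = np.arange(8, dtype=float).reshape(4, 2)   # assumed 4 x 2 example
Ub, sb, Vtb = np.linalg.svd(B, full_matrices=False)
print(Ub.shape, sb.shape, Vtb.shape)          # (4, 2) (2,) (2, 2)
print(np.allclose((Ub * sb) @ Vtb, B))        # Ub * sb scales each column of Ub
```

Note that this A has rank 2, so its third singular value is zero up to floating-point round-off.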

2. Custom implementation

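import numpy as np

def svd_decomposition(A):
    A = np.asarray(A, dtype=float)

    # A A^T (m x m) and A^T A (n x n) are both symmetric
    AAT = np.dot(A, A.T)
    ATA = np.dot(A.T, A)

    # eigh is intended for symmetric matrices: it returns real eigenvalues
    # (in ascending order) and orthonormal eigenvectors
    eigenvalues_U, eigenvectors_U = np.linalg.eigh(AAT)
    eigenvalues_V, eigenvectors_V = np.linalg.eigh(ATA)

    # Indices that sort the eigenvalues from largest to smallest
    sorted_indices_U = np.argsort(eigenvalues_U)[::-1]
    sorted_indices_V = np.argsort(eigenvalues_V)[::-1]

    # Singular values are the square roots of the eigenvalues; clip tiny
    # negative round-off values to zero before taking the root
    k = min(A.shape)
    singular_values = np.sqrt(np.clip(eigenvalues_U[sorted_indices_U][:k], 0, None))

    # Columns of U are the eigenvectors of A A^T
    U = eigenvectors_U[:, sorted_indices_U]

    # Columns of V are the eigenvectors of A^T A
    V = eigenvectors_V[:, sorted_indices_V]

    # Eigenvectors are only defined up to sign: flip columns of V so that
    # u_i^T A v_i >= 0 (sufficient when the nonzero singular values are distinct)
    signs = np.sign(np.sum(U[:, :k] * np.dot(A, V[:, :k]), axis=0))
    signs[signs == 0] = 1
    V[:, :k] *= signs

    # Return V^T to match the convention of np.linalg.svd
    return U, singular_values, V.T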

# Define a matrix A
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Call the custom SVD function
U, s, Vt = svd_decomposition(A)

# Print the factors
print("Matrix U:")
print(U)
print("Singular values:")
print(s)
print("Transpose of matrix V:")
print(Vt)

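        The custom svd_decomposition function follows the eigenvalue-based procedure described in section 1: it computes the eigenvalues and eigenvectors of AA^T (for U) and of A^TA (for V), and takes the square roots of the eigenvalues as the singular values. Because eigenvectors are determined only up to sign, the columns of U and V must have consistent signs for UΣV^T to reproduce A, so it is worth checking the result against np.linalg.svd. In practice the library routine is also the more accurate choice.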
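
One way to sanity-check an eigenvalue-based implementation is to compare it with np.linalg.svd. The sketch below is a variant of the idea, not the function above: it takes V from A^TA only and derives the columns of U from u_i = A v_i / σ_i, which keeps the signs consistent by construction. The helper name svd_via_eigh and the tolerance tol are hypothetical, and the helper returns only the components with nonzero singular values.

```python
import numpy as np

def svd_via_eigh(A, tol=1e-10):
    """Reduced SVD from the eigendecomposition of A^T A (illustrative helper)."""
    A = np.asarray(A, dtype=float)
    # eigh handles symmetric matrices; eigenvalues come back in ascending order
    eigvals, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]
    # Drop eigenvalues that are zero up to round-off (relative threshold tol)
    keep = eigvals > tol * eigvals.max()
    s = np.sqrt(eigvals[keep])
    V = V[:, keep]
    # u_i = A v_i / sigma_i keeps the signs of U and V consistent
    U = (A @ V) / s
    return U, s, V.T

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
U, s, Vt = svd_via_eigh(A)
s_ref = np.linalg.svd(A, compute_uv=False)

print(np.allclose(s, s_ref[:len(s)]))        # singular values agree with NumPy
print(np.allclose(U @ np.diag(s) @ Vt, A))   # the factors multiply back to A
```

Forming A^TA squares the singular values, so very small ones are lost to round-off; this is one reason the sketch thresholds the eigenvalues and why a dedicated SVD routine is usually preferred.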

3. Link prediction with SVD

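import numpy as np

# Define the adjacency matrix A that describes the network structure
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])

# Perform the singular value decomposition (the third factor is V^T)
U, S, Vt = np.linalg.svd(A)

# Keep the k largest singular values and the corresponding singular vectors
k = 2  # number of singular values to keep
U_k = U[:, :k]
S_k = np.diag(S[:k])
Vt_k = Vt[:k, :]

# Link prediction: rank-k reconstruction of the adjacency matrix
A_pred = U_k @ S_k @ Vt_k

# Round to two decimal places
A_pred = np.round(A_pred, decimals=2)

# Print the link prediction result
print("Link prediction result:")
print(A_pred)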

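        First, an adjacency matrix A is defined, and the np.linalg.svd function is used to perform the singular value decomposition. Next, the k largest singular values and the corresponding singular vectors are kept (k = 2 in the example) and multiplied back together to give the predicted adjacency matrix A_pred, which is then printed. The entries of A_pred act as scores: a large value at a position where A has a 0 suggests a likely missing link. Note that this particular A has rank 2 (rows 1 and 3 are identical, as are rows 2 and 4), so keeping k = 2 singular values reproduces A exactly; scores for unobserved links only differ from zero when k is smaller than the rank of the matrix.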
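
To see nonzero scores for unlinked pairs, the matrix needs a rank larger than k. The sketch below uses an assumed toy network (two triangles joined by one edge) and ranks the node pairs that are not yet connected by their entry in the rank-k reconstruction. The graph, the choice k = 3, and the variable names are illustrative assumptions, not part of the original example.

```python
import numpy as np

# Assumed toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

U, s, Vt = np.linalg.svd(A)

# k = 3 is chosen because s[2] > s[3] for this graph; if s[k-1] == s[k],
# the rank-k approximation is not unique
k = 3
scores = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Collect the node pairs that are not currently linked and rank them by score
candidates = [(i, j, scores[i, j])
              for i in range(n) for j in range(i + 1, n)
              if A[i, j] == 0]
candidates.sort(key=lambda t: t[2], reverse=True)

for i, j, score in candidates:
    print(f"({i}, {j}): {score:.3f}")
```

Pairs at the top of the list are the ones the low-rank model scores highest as potential links; how meaningful these scores are depends on the network and on the choice of k.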

Origin blog.csdn.net/m0_63834988/article/details/133083925