Machine Learning Notes - Singular Value Decomposition (SVD) (1)

1. Rotation or scaling

1. Overview

        Eigendecomposition can only be applied to square matrices. For non-square matrices, we can use singular value decomposition (SVD) instead.

        With SVD, you decompose a matrix into three matrices. We treat these new matrices as sub-transformations of space: instead of performing the full transformation in one step, we break it down into three steps.

        Finally, we'll apply SVD to image processing, seeing the effect of SVD on an example image.

        We decompose A into three matrices (instead of the two used in eigendecomposition):

        A = U D V^T

         Matrices U, D and V have the following properties:

        U and V are orthogonal matrices: U^T U = I and V^T V = I.

        D is a diagonal matrix (all 0s except the diagonal). However, D does not have to be square.

        The columns of U are called the left singular vectors of A, and the columns of V are called the right singular vectors of A. The values along the diagonal of D are the singular values of A.

        Here are the dimensions of the decomposition: if A is m x n, then U is m x m, D is m x n and V is n x n.

         The diagonal matrix of singular values is therefore not necessarily square; it has the same shape as A.

        Here is the example provided in the NumPy documentation: for a 9 x 6 matrix, create a zero matrix of the same shape as A and fill its diagonal with the singular values s returned by np.linalg.svd:

smat = np.zeros((9, 6), dtype=complex)
smat[:6, :6] = np.diag(s)
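The snippet above assumes s already exists. As a self-contained sketch based on the NumPy documentation example (the matrix values here are random, purely for illustration), the full decomposition and reconstruction look like this:

```python
import numpy as np

# A hypothetical 9 x 6 complex matrix (random values, for illustration only)
rng = np.random.default_rng(0)
a = rng.normal(size=(9, 6)) + 1j * rng.normal(size=(9, 6))

# U: (9, 9), s: (6,), Vh: (6, 6)
U, s, Vh = np.linalg.svd(a, full_matrices=True)

# Rectangular "diagonal" matrix with the same shape as a
smat = np.zeros((9, 6), dtype=complex)
smat[:6, :6] = np.diag(s)

# The three factors reconstruct the original matrix
print(np.allclose(a, U @ smat @ Vh))  # True
```

Note that NumPy returns Vh (the conjugate transpose of V), so the product U @ smat @ Vh directly gives back a.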

        The intuition behind singular value decomposition requires some background on matrix transformations. The examples below show how 2D square matrices transform space. A matrix A can be seen as a linear transformation, and this transformation can be broken down into three sub-transformations: 1. rotation, 2. rescaling, 3. rotation. These three steps correspond to the three matrices U, D and V: applying A to a vector first rotates it with V^T, then rescales it with D, then rotates it again with U.

        You can see the detailed steps in the Wikipedia article on SVD: https://en.wikipedia.org/wiki/Singular_value_decomposition

Every matrix can be seen as a linear transformation

        You can think of a matrix as a specific linear transformation: when you apply the matrix to a vector or another matrix, you apply that linear transformation to it.

2. Example 1

        Take the matrix that scales each coordinate by 2:

        A = | 2  0 |
            | 0  2 |

        Apply it to a vector v = (x, y):

        w = Av = (2x, 2y) = 2v

         We see from the calculation above that applying the matrix just doubles each coordinate of the vector. Here is a graphical representation of v and its transform w:

Applying the matrix to a vector multiplies each coordinate by 2
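As a quick numerical check (the sample vector here is arbitrary, chosen just for illustration):

```python
import numpy as np

A = np.array([[2, 0],
              [0, 2]])   # the scaling matrix from example 1
v = np.array([1, 3])     # an arbitrary sample vector

w = A @ v                # apply the linear transformation
print(w)                 # [2 6] -- each coordinate doubled
```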

3. Example 2

         To visualize the linear transformation associated with a matrix, we can also draw the unit circle and see how the matrix transforms it. The unit circle represents the tips of all unit vectors (vectors of length 1).

unit circle

         You can then apply a matrix to all these unit vectors to see what kind of deformation it will produce.

        

Multiply each coordinate of the unit circle by 2

         We can see that the matrix doubles the size of the circle. However, in some transformations the change applied to the x-coordinate is not the same as the change applied to the y-coordinate. Let's look at what that means graphically.
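The claim that the circle doubles in size can be verified numerically: every point of the unit circle is mapped to a point at distance 2 from the origin (a minimal sketch, sampling 100 points of the circle):

```python
import numpy as np

A = np.array([[2, 0],
              [0, 2]])

# 100 points on the unit circle, stored as columns of a 2 x 100 array
theta = np.linspace(0, 2 * np.pi, 100)
circle = np.vstack([np.cos(theta), np.sin(theta)])

transformed = A @ circle

# Every transformed point lies at distance 2 from the origin
norms = np.linalg.norm(transformed, axis=0)
print(np.allclose(norms, 2.0))  # True
```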

4. Example 3

         Apply the following matrix transformation to the unit circle:

        A = | 3  0 |
            | 0  2 |

This time the matrix does not rescale each coordinate with the same weight: x is scaled by 3 and y by 2

         We can check this with the equations associated with this matrix transformation. Suppose the coordinates of the new circle (after transformation) are x' and y'. The relationship between the old coordinates (x, y) and the new coordinates (x', y') is:

        x' = 3x,  y' = 2y

         We also know that the equation of the unit circle is x^2 + y^2 = 1 (the norm of a unit vector is 1). By substitution, we end up with:

        (x'/3)^2 + (y'/2)^2 = 1

         We can check that this equation corresponds to our transformed circle. Let's start by drawing the original circle. Its equation is y = ±sqrt(1 - x^2):

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

x = np.linspace(-1, 1, 100000)
y = np.sqrt(1-(x**2))
plt.plot(x, y, sns.color_palette().as_hex()[0])
plt.plot(x, -y, sns.color_palette().as_hex()[0])
plt.xlim(-1.5, 1.5)
plt.ylim(-1.5, 1.5)
plt.show()

         Now let's add the circle we get after the matrix transformation. From the equation above, it is defined by y' = ±2·sqrt(1 - (x'/3)^2):

x1 = np.linspace(-3, 3, 100000)
y1 = 2*np.sqrt(1-((x1/3)**2))
plt.plot(x, y, sns.color_palette().as_hex()[0])
plt.plot(x, -y, sns.color_palette().as_hex()[0])
plt.plot(x1, y1, sns.color_palette().as_hex()[1])
plt.plot(x1, -y1, sns.color_palette().as_hex()[1])
plt.xlim(-4, 4)
plt.ylim(-4, 4)
plt.show()
The unit circle and its image under the matrix: an ellipse with semi-axes 3 and 2

         Note that these examples use diagonal matrices (all zeros except the diagonal). The general rule is that a transformation associated with a diagonal matrix only rescales each coordinate; it never rotates. This is the first element for understanding SVD.
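The substituted ellipse equation derived above can be verified directly: applying the diagonal matrix to points of the unit circle produces coordinates satisfying (x'/3)^2 + (y'/2)^2 = 1 (a minimal sketch):

```python
import numpy as np

A = np.array([[3, 0],
              [0, 2]])     # the rescaling matrix from example 3

theta = np.linspace(0, 2 * np.pi, 100)
circle = np.vstack([np.cos(theta), np.sin(theta)])  # unit circle points

xp, yp = A @ circle         # transformed coordinates x', y'

# The image of the unit circle satisfies the ellipse equation
print(np.allclose((xp / 3) ** 2 + (yp / 2) ** 2, 1.0))  # True
```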

5. Example 4

        Matrices with nonzero off-diagonal entries can produce a rotation. Rotation is easiest to describe with trigonometric functions: consider the matrix

        R = | cos θ  -sin θ |
            | sin θ   cos θ |

This matrix rotates a vector or a matrix counterclockwise by an angle θ. Let's start with a vector u with coordinates x = 0 and y = 1, and a vector v with coordinates x = 1 and y = 0. The rotated vectors are u' = Ru and v' = Rv.

The unit vectors rotated counterclockwise, θ = 45°

         First, let's create a function plotVectors() to plot vectors:

import numpy as np
import matplotlib.pyplot as plt

def plotVectors(vecs, cols, alpha=1):
    """Plot a set of 2D vectors as arrows from the origin."""
    plt.figure()
    plt.axvline(x=0, color='#A9A9A9', zorder=0)
    plt.axhline(y=0, color='#A9A9A9', zorder=0)

    for i in range(len(vecs)):
        x = np.concatenate([[0, 0], vecs[i]])
        plt.quiver([x[0]],
                   [x[1]],
                   [x[2]],
                   [x[3]],
                   angles='xy', scale_units='xy', scale=1, color=cols[i],
                   alpha=alpha)

        Plot the vectors u and v

orange = '#FF9A13'
blue = '#1190FF'

u = [0,1]
v = [1,0]

plotVectors([u, v], cols=[blue, blue])

plt.xlim(-1.5, 1.5)
plt.ylim(-1.5, 1.5)

plt.text(-0.25, 0.2, r'$\vec{u}$', color=blue, size=18)
plt.text(0.4, -0.25, r'$\vec{v}$', color=blue, size=18)
plt.show()

        They are the basis vectors of our space. We now compute their transformation under the rotation matrix with θ = 45°:

        u' = Ru = (-sin 45°, cos 45°)
        v' = Rv = (cos 45°, sin 45°)

         Let's draw these new vectors to check that they are the basis vectors rotated by 45°.

u1 = [-np.sin(np.radians(45)), np.cos(np.radians(45))]
v1 = [np.cos(np.radians(45)), np.sin(np.radians(45))]

plotVectors([u1, v1], cols=[orange, orange])
plt.xlim(-1.5, 1.5)
plt.ylim(-1.5, 1.5)

plt.text(-0.7, 0.1, r"$\vec{u'}$", color=orange, size=18)
plt.text(0.4, 0.1, r"$\vec{v'}$", color=orange, size=18)
plt.show()

         The NumPy functions np.sin and np.cos take their input in radians. We can convert an angle from degrees to radians with np.radians().
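The rotation can also be checked without plotting: the rotation matrix is orthogonal, so it preserves lengths and only rotates (a minimal sketch):

```python
import numpy as np

theta = np.radians(45)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([0, 1])
v = np.array([1, 0])

# Rotating the basis vectors gives the u' and v' plotted above
print(R @ u)  # [-0.70710678  0.70710678]
print(R @ v)  # [0.70710678 0.70710678]

# A rotation matrix is orthogonal (R^T R = I), so it preserves lengths
print(np.allclose(R.T @ R, np.eye(2)))  # True
```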

        We can also transform a circle. We'll take a rescaled circle so we can see the effect of the rotation.

x = np.linspace(-3, 3, 100000)
y = 2*np.sqrt(1-((x/3)**2))

# rotate the upper half of the ellipse by 45 degrees
x1 = x*np.cos(np.radians(45)) - y*np.sin(np.radians(45))
y1 = x*np.sin(np.radians(45)) + y*np.cos(np.radians(45))

# rotate the lower half (-y) of the ellipse
x1_neg = x*np.cos(np.radians(45)) - -y*np.sin(np.radians(45))
y1_neg = x*np.sin(np.radians(45)) + -y*np.cos(np.radians(45))

# the semi-axis vectors (lengths 2 and 3), rotated by 45 degrees
u1 = [-2*np.sin(np.radians(45)), 2*np.cos(np.radians(45))]
v1 = [3*np.cos(np.radians(45)), 3*np.sin(np.radians(45))]

plotVectors([u1, v1], cols=['#FF9A13', '#FF9A13'])

plt.plot(x, y, '#1190FF')
plt.plot(x, -y, '#1190FF')

plt.plot(x1, y1, '#FF9A13')
plt.plot(x1_neg, y1_neg, '#FF9A13')

plt.xlim(-4, 4)
plt.ylim(-4, 4)
plt.show()

         We can see that the circle has been rotated by 45°. We chose the vector lengths from the rescaling weights of example 3 (factors 3 and 2) to match the semi-axes of the ellipse.

2. Conclusion

        We saw how to transform vectors and matrices by rotating or scaling them. SVD can be seen as decomposing a complex transformation into three simpler ones (rotation, rescaling, rotation).

        Note that we only used square matrices in this section. SVD also works for non-square matrices, but it is harder to visualize the transformations associated with them. For example, a 3 x 2 matrix maps 2D space to 3D space.
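The rotate-rescale-rotate view can be checked numerically on a small example (the 2 x 2 matrix below is a hypothetical one, chosen only for illustration):

```python
import numpy as np

# A hypothetical 2 x 2 matrix, treated as one linear transformation
A = np.array([[3., 1.],
              [1., 2.]])

U, s, Vh = np.linalg.svd(A)

# Applying A = rotate/reflect with Vh, rescale with diag(s), rotate/reflect with U
x = np.array([1., 1.])
print(np.allclose(A @ x, U @ np.diag(s) @ Vh @ x))  # True

# U and Vh are orthogonal, so each one is a pure rotation or reflection
print(np.allclose(U.T @ U, np.eye(2)))    # True
print(np.allclose(Vh @ Vh.T, np.eye(2)))  # True
```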

         In the next article we will use SVD for matrix factorization.


Origin blog.csdn.net/bashendixie5/article/details/124266777