[Linear Algebra] whitening matrix
For an arbitrary data matrix X, the covariance matrix is in general not diagonal. Whitening applies a whitening matrix A so that the covariance matrix of Y = A * X becomes diagonal.
Note first that Y = A * X holds when the samples of X are arranged as columns; if the samples are arranged as rows, then Y = X * A.
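As a minimal numeric check of this definition (a sketch using random data, with samples arranged as columns; the eigendecomposition route here anticipates the derivation in the next section):

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 500))        # 2 variables, 500 samples (columns)
X -= X.mean(axis=1, keepdims=True)       # center each variable

C = np.cov(X)                            # generally NOT diagonal
D, E = np.linalg.eigh(C)                 # eigenvalues D, eigenvectors E
A = np.diag(1.0 / np.sqrt(D)).dot(E.T)   # one valid whitening matrix
Y = A.dot(X)                             # Y = A * X
print(np.round(np.cov(Y), 6))            # ~ identity: diagonal covariance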
1. Mathematical derivation
Goal: whitening removes the correlation between signal components. Let V be the whitening matrix; applying the linear transformation V to the centered data X yields a new signal whose components are uncorrelated and have unit variance.
The procedure for solving V (a check that it works follows the list):
1. Compute A, the dot product of X and the transpose of X: A = X * X^T.
2. Solve for the eigenvalues D and the eigenvectors E of A.
3. Construct the diagonal matrix D2 (values running from top-left to bottom-right), whose diagonal elements are the values of D.
4. The whitening matrix V is the dot product of the inverse square root of D2 and the transpose of E: V = D2^(-1/2) * E^T.
5. The whitened X is the dot product of V and X: Y = V * X.
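Why this V whitens the data (a short check, using the decomposition A = E * diag(D) * E^T from step 2, with E orthogonal):

Y * Y^T = V * (X * X^T) * V^T
        = D2^(-1/2) * E^T * E * diag(D) * E^T * E * D2^(-1/2)
        = I

so the components of Y = V * X are indeed uncorrelated, with diagonal (identity) covariance up to the 1/(n-1) normalization.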
Example, where X is a 2 × 500 matrix (the original fragment is wrapped here as a function so it runs as-is):

import numpy as np

def whiten(X):
    # Center each variable (row) of X
    X_mean = X.mean(axis=-1)
    X -= X_mean[:, np.newaxis]
    # Whiten
    A = np.dot(X, X.transpose())      # A = X * X^T
    D, E = np.linalg.eig(A)           # eigenvalues D, eigenvectors E
    # D2 = diag(D)^(-1), then square-root its diagonal, giving diag(D)^(-1/2)
    D2 = np.linalg.inv(np.array([[D[0], 0.0], [0.0, D[1]]], np.float32))
    D2[0, 0] = np.sqrt(D2[0, 0]); D2[1, 1] = np.sqrt(D2[1, 1])
    V = np.dot(D2, E.transpose())     # V = diag(D)^(-1/2) * E^T
    return np.dot(V, X), V
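A minimal usage sketch (assuming the wrapper name whiten above; random data stands in for a real signal):

import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 500))   # 2 x 500, as in the example
Y, V = whiten(X)
print(np.round(Y.dot(Y.T), 6))      # ~ identity, as derived above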
Deep Learning Primer --- PCA and Whitening: Python implementation
The post "Deep Learning Primer --- PCA and Whitening" has already set out the principles of PCA and whitening in full; this post adds the Python implementation. The code is thoroughly commented.
# Implement PCA
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Load the 2 x 45 data set from pcaData.txt
dataSet = []
with open('/notebooks/pcaData.txt', 'r') as file:
    for text in file:
        tt = text.strip().split()
        line = []
        for t in tt:
            line.append(float(t))
        dataSet.append(line)
dataSet = np.array(dataSet)
dataSet.shape  # (2, 45)

# Plot the original data
plt.figure(1)
plt.scatter(dataSet[0, :], dataSet[1, :])
plt.title("origin data")

# Compute the covariance matrix sigma and the eigenvector matrix u
sigma = dataSet.dot(dataSet.T) / dataSet.shape[1]
print(sigma.shape)  # (2, 2)
[u, s, v] = np.linalg.svd(sigma)
print(u.shape)  # (2, 2)

# Plot the two principal component directions
plt.figure(2)
plt.plot([0, u[0, 0]], [0, u[1, 0]])
plt.plot([0, u[0, 1]], [0, u[1, 1]])
plt.scatter(dataSet[0, :], dataSet[1, :])

# Rotate the data into the PCA basis, without reducing dimensionality
xRot = u.T.dot(dataSet)
xRot.shape  # (2, 45)

# Plot the rotated data
plt.figure(3)
plt.scatter(xRot[0, :], xRot[1, :])
plt.title('xRot')

k = 1  # reduce the dimensionality to 1
# PCA dimensionality reduction: xRot[0:k, :] is the reduced data;
# the remaining components are zeroed so that the reconstruction
# below really uses only the first k components
xRot[0:k, :] = u[:, 0:k].T.dot(dataSet)
xRot[k:, :] = 0

# Reconstruct the data from the reduced representation
xHat = u.dot(xRot)
print(xHat.shape)
plt.figure(4)
plt.scatter(xHat[0, :], xHat[1, :])
plt.title('xHat')

# PCA whitening: compute xPCAWhite and plot the results.
# epsilon is a regularizer: it prevents numerical instability/blow-up
# when an eigenvalue in s is close to zero; the elementwise result is
# turned back into a diagonal matrix with np.diag
epsilon = 1e-5
xPCAWhite = np.diag(1. / np.sqrt(s + epsilon)).dot(u.T.dot(dataSet))
plt.figure(5)
plt.scatter(xPCAWhite[0, :], xPCAWhite[1, :])
plt.title('xPCAWhite')

# ZCA whitening: rotate the PCA-whitened data back with u
xZCAWhite = u.dot(np.diag(1. / np.sqrt(s + epsilon))).dot(u.T.dot(dataSet))
plt.figure(6)
plt.scatter(xZCAWhite[0, :], xZCAWhite[1, :])
plt.title('xZCAWhite')
plt.show()
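As a quick check (a sketch reusing the variables from the script above), both whitened signals should have a covariance matrix close to the identity; ZCA whitening differs from PCA whitening only by the extra rotation back through u:

# Covariance of the whitened data should be close to the identity
covPCA = xPCAWhite.dot(xPCAWhite.T) / dataSet.shape[1]
covZCA = xZCAWhite.dot(xZCAWhite.T) / dataSet.shape[1]
print(np.round(covPCA, 4))  # ~ identity matrix
print(np.round(covZCA, 4))  # ~ identity matrix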
The contents of the data file pcaData.txt:
-6.7644914e-01 -6.3089308e-01 -4.8915202e-01 -4.8005424e-01 -3.7842021e-01 -3.3788391e-01 -3.2023528e-01 -3.1108837e-01 -2.3145555e-01 -1.9623727e-01 -1.5678926e-01 -1.4900779e-01 -1.0861557e-01 -1.0506308e-01 -8.0899829e-02 -7.1157518e-02 -6.3251073e-02 -2.6007219e-02 -2.2553443e-02 -5.8489047e-03 -4.3935323e-03 -1.7309716e-03 7.8223728e-03 7.5386969e-02 8.6608396e-02 9.6406046e-02 1.0331683e-01 1.0531131e-01 1.1493296e-01 1.3052813e-01 1.6626253e-01 1.7901863e-01 1.9267343e-01 1.9414427e-01 1.9770003e-01 2.3043613e-01 3.2715844e-01 3.2737163e-01 3.2922364e-01 3.4869293e-01 3.7500704e-01 4.2830153e-01 4.5432503e-01 5.4422436e-01 6.6539963e-01
-4.4722050e-01 -7.4778067e-01 -3.9074344e-01 -5.6036362e-01 -3.4291940e-01 -1.3832158e-01 1.2360939e-01 -3.3934986e-01 -8.2868433e-02 -2.4759514e-01 -1.0914760e-01 4.2243921e-01 -5.2329327e-02 -2.0126541e-01 1.3016657e-01 1.2293321e-01 -3.4787750e-01 -1.4584897e-01 -1.0559656e-01 -5.4200847e-02 1.6915422e-02 -1.1069762e-01 9.0859816e-02 1.5269096e-01 -9.4416463e-02 1.5116385e-01 -1.3540126e-01 2.4592698e-01 5.1087447e-02 2.4583340e-01 -5.9535372e-02 2.9704742e-01 1.0168115e-01 1.4258649e-01 1.0662592e-01 3.1698532e-01 6.1577841e-01 4.3911172e-01 2.7156501e-01 1.3572389e-01 3.1918066e-01 1.5122962e-01 3.4979047e-01 6.2316971e-01 5.2018811e-01