Data Cleaning and Feature Selection → PCA → 2. Hands-on Practice



  • Purpose of the program
Read a 2*45 data set; rotate (project) it onto the basis formed by the eigenvectors of its covariance matrix; reduce its dimensionality; and apply PCA whitening,
visualizing each step along the way.
  • Summary
    • To find the eigenvectors of the data's covariance matrix we use the singular value decomposition; because the covariance matrix is symmetric, its left singular vectors are exactly the eigenvectors we project onto (see the svd call in Step 1a).
    • Rotating the data should be understood as a projection onto the new basis (see xRot = u'*x in Step 1b).
    • The meaning of xHat = u(:,1:k)*xRot in Step 2: viewed as a linear representation, xRot holds the coordinates of the data along the first principal direction, while xHat holds the same points expressed in the standard basis. See Linear Algebra → Matrices → Matrix Rank → Matrix Multiplication.
    • In PCA whitening, some eigenvalues (singular values) can be numerically close to 0, so the scaling step would divide by a value near 0. In practice we therefore regularize the scaling by adding a small constant ε to each eigenvalue, i.e. xPCAwhite,i = xRot,i / sqrt(λi + ε) (see the bsxfun line in Step 3).
    • To draw a vector (x, y) with plot: plot([0, x], [0, y]) (see the plot calls in Step 1a).

Download pcaData.txt — see 七牛云 (Qiniu Cloud) → picturetemp → 机器学习工程师 → pca实战 → pcaData.txt
%% Step 0: Load data
% We have provided the code to load data from pcaData.txt into x.
% x is a 2 * 45 matrix, where the kth column x(:,k) corresponds to
% the kth data point. Here we provide the code to load the natural image data into x
% (the "natural image" wording here has long been puzzling; the file is just 2-D points).
% You do not need to change the code below.
x = load('pcaData.txt','-ascii');
figure(1);
scatter(x(1, :), x(2, :));
title('Raw data'); % the raw, unprocessed data
%%================================================================
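
Note that Step 1a below computes the covariance as x*x'/m without subtracting the mean; this is only valid because pcaData.txt is already (approximately) zero-mean. A hedged sketch of the centering step one would otherwise add here:

x = bsxfun(@minus, x, mean(x, 2)); % subtract the per-feature mean (not needed for this data set)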
%% Step 1a: Implement PCA to obtain U
% Implement PCA to obtain the rotation matrix U, which is the eigenbasis
% of sigma, the covariance matrix.
% -------------------- YOUR CODE HERE --------------------
u = zeros(size(x, 1)); % You need to compute this
sigm = x*x'./size(x, 2); % covariance matrix (the data are assumed zero-mean)
[u,s] = svd(sigm);
% --------------------------------------------------------
hold on
plot([0 u(1,1)], [0 u(2,1)]);%The eigenvector corresponding to the largest eigenvalue
plot([0 u(1,2)], [0 u(2,2)]);%The eigenvector corresponding to the second largest eigenvalue
hold off
%%================================================================
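
Since sigm is symmetric positive semi-definite, svd and eig should agree on it, up to sign and ordering of the columns. A minimal sanity-check sketch (not part of the original exercise; eig does not guarantee a descending order, so we sort):

[v, d] = eig(sigm);                       % eigendecomposition of the same matrix
[lambda, idx] = sort(diag(d), 'descend'); % sort eigenvalues to match svd's order
v = v(:, idx);                            % reorder the eigenvectors accordingly
disp(max(abs(diag(s) - lambda)));         % ~0: svd and eig give the same spectrum
disp(abs(sum(u .* v)));                   % each entry ~1: columns parallel up to sign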
%% Step 1b: Compute xRot, the projection on to the eigenbasis
% Now, compute xRot by projecting the data on to the basis defined
% by U. Visualize the points by performing a scatter plot.
% -------------------- YOUR CODE HERE --------------------
xRot = zeros(size(x));
xRot = u'*x;
% --------------------------------------------------------
% Visualise the rotated data: the point cloud should now be axis-aligned,
% since the rotation maps the principal directions onto the coordinate axes.
figure(2);
scatter(xRot(1, :), xRot(2, :));
title('xRot');
%%================================================================
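
To confirm that the rotation decorrelates the coordinates, one can also display the covariance matrix of xRot, which should be nearly diagonal. A minimal sketch (not part of the original exercise):

covRot = xRot * xRot' / size(xRot, 2); % covariance after rotation
figure; imagesc(covRot); colorbar;     % off-diagonal entries should be ~0
title('Covariance of xRot');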
%% Step 2: Reduce the number of dimensions from 2 to 1. 
% Compute xRot again (this time projecting to 1 dimension).
% Then, compute xHat by projecting the xRot back onto the original axes
% to see the effect of dimension reduction
% -------------------- YOUR CODE HERE --------------------
k = 1; % Use k = 1, i.e. project the data onto the first eigenvector
xHat = zeros(size(x)); % You need to compute this
xRot = u(:,1:k)'*x;
xHat = u(:,1:k)*xRot;
% --------------------------------------------------------
figure(3);
scatter(xHat(1, :), xHat(2, :));
title('xHat');
%%================================================================
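
To quantify what the 2-to-1 reduction loses, compare xHat against x. A small sketch (retained and avgError are illustrative names, not part of the exercise):

lambda = diag(s);
retained = sum(lambda(1:k)) / sum(lambda); % fraction of the total variance kept
avgError = mean(sum((x - xHat).^2, 1));    % mean squared reconstruction error
fprintf('variance retained: %.4f, avg error: %.4f\n', retained, avgError);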
%% Step 3: PCA Whitening
% Compute xPCAWhite and plot the results.
epsilon = 1e-5; % 1*10^(-5)
% -------------------- YOUR CODE HERE --------------------
xPCAWhite = zeros(size(x)); % You need to compute this
xRot = u'*x; % rotate the full data again (2 x 45)
xPCAWhite = bsxfun(@rdivide,xRot,sqrt(diag(s)+epsilon));
% --------------------------------------------------------
figure(4);
scatter(xPCAWhite(1, :), xPCAWhite(2, :));
title('xPCAWhite');
%%================================================================
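
As a final check, the covariance of the whitened data should be close to the identity matrix; with epsilon > 0 the diagonal entries stay slightly below 1. A minimal sketch:

covWhite = xPCAWhite * xPCAWhite' / size(xPCAWhite, 2);
disp(covWhite); % should be approximately eye(2)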





Reposted from www.cnblogs.com/LeisureZhao/p/9754579.html