Use PCA for coordinate system conversion
PCA is a commonly used method of dimensionality reduction. The step that actually reduces the dimension is:
- Keep only the eigenvectors that belong to the k largest eigenvalues.
If we skip this step and keep all of the eigenvectors, no dimensionality reduction is performed; the data is only converted to a new coordinate system.
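The difference between keeping k axes and keeping all of them can be sketched with base MATLAB functions only. This is a minimal illustration, not the internals of the pca function: the sample data, the mixing matrix, and the choice k = 1 are assumptions made for the example, and the subtraction `X - mean(X, 1)` assumes a MATLAB version with implicit expansion (R2016b or later).

```matlab
% Minimal PCA sketch with base functions (illustrative data, hypothetical k)
X  = randn(500, 2) * [1 0.6; 0 0.8] + [1 2];   % correlated 2-D sample data
Xc = X - mean(X, 1);                           % center the data
C  = cov(Xc);                                  % covariance matrix
C  = (C + C') / 2;                             % enforce exact symmetry for eig
[V, D] = eig(C);                               % eigenvectors and eigenvalues
[lambda, idx] = sort(diag(D), 'descend');      % order by decreasing eigenvalue
V = V(:, idx);

k = 1;                                         % hypothetical number of axes to keep
Y_reduced = Xc * V(:, 1:k);                    % first k axes: dimensionality reduction
Y_rotated = Xc * V;                            % all axes: coordinate conversion only

% A pure coordinate conversion leaves the distance of each point to the mean unchanged
err = max(abs(sum(Y_rotated.^2, 2) - sum(Xc.^2, 2)));
fprintf('max change in squared distance: %g\n', err);
```

With k equal to the full dimension, `Y_rotated` has the same size as the input and differs from it only by a shift and an orthogonal change of axes.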
Specific steps
1. First, generate data from a two-dimensional Gaussian distribution
MATLAB code:
mul = [1 2];                      % mean of the distribution
SIGMA = [1 0.81; 0.81 1];         % covariance matrix
data1 = mvnrnd(mul,SIGMA,500);    % 500 samples, one per row
plot(data1(:,1),data1(:,2),'*');
axis equal
2. Use PCA to select the coordinate axes
The new coordinate axes are chosen to be mutually orthogonal, with the variance of the data along each axis as large as possible.
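clear; clc; close all;
mul = [1 2];                        % mean of the distribution
SIGMA = [1 0.81; 0.81 1];           % covariance matrix
data1 = mvnrnd(mul,SIGMA,500);
[pc,score,latent] = pca(data1);     % columns of pc are the new axes
figure(1)
plot(data1(:,1),data1(:,2),'*');
axis equal                          % applied after plot, which would otherwise reset it
hold on
% draw the two principal axes starting from the distribution mean (1,2)
quiver(1,2,pc(1,1),pc(2,1),5)
quiver(1,2,pc(1,2),pc(2,2),5)
% the same data expressed in the new coordinate system
plot(score(:,1),score(:,2),'.')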
In this way, a new coordinate system can be established.
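The properties of the new coordinate system can be checked numerically. The sketch below assumes the pca function from the Statistics and Machine Learning Toolbox with its default settings (centered data); the sample data is illustrative and is not the data1 used above.

```matlab
% Check what pca returns (illustrative data, default pca options assumed)
X = randn(500, 2) * [1 0.6; 0 0.8] + [1 2];       % correlated 2-D sample data
[pc, score, latent] = pca(X);

orth_err = norm(pc' * pc - eye(2));               % the axes are orthonormal
proj_err = norm(score - (X - mean(X, 1)) * pc);   % score is the centered data in the new axes
var_err  = norm(var(score)' - latent);            % latent is the variance along each axis
fprintf('orthogonality %g, projection %g, variance %g\n', orth_err, proj_err, var_err);
```

All three values should be close to zero, and the entries of `latent` are returned in decreasing order, so the first axis carries the largest variance.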
3. Main steps of the .m code
- Generate random data from a two-dimensional Gaussian distribution
- Call the pca function
- Draw the principal-axis vectors on the plot
Using PCA for multi-dimensional dimensionality reduction and evaluating the result
When PCA reduces the dimensionality, the local manifold structure of the data is sometimes lost, which leads to poor results.
1. Generate data
First, define a function that generates points scattered along a circle:
% Generate a series of points near a circle of radius r1
function [x1,y1] = creat_circle(r1, r1_ratio, sita_ratio)
sita = 0:0.05:2*pi;                       % candidate angles
all_num = numel(sita);
n = floor(sita_ratio*all_num);            % number of points to keep
% random subset of the candidate angles
sita_p = sita(randperm(all_num, n));
% random inward radial offset, at most r1*r1_ratio
r_p = rand(1,n)*r1*r1_ratio;
r1_p = r1 - r_p;
x1 = r1_p.*cos(sita_p);
y1 = r1_p.*sin(sita_p);
scatter(x1,y1)
Then run the following code:
% Build the coordinate points
clear;clc;close all;
[x1,y1] = creat_circle(3,0.05,0.95);
[x2,y2] = creat_circle(5,0.05,0.95);
[x3,y3] = creat_circle(9,0.05,0.95);
num = size(x1);
% Add a noisy third coordinate to each ring
z1 = normrnd(5,1,1,num(1,2))+x1;
z2 = wgn(1,num(1,2),1)+4+y2;      % wgn requires the Communications Toolbox
z3 = rand(1,num(1,2))+2+x3;
% Plot the rings in 2-D
figure(1)
scatter(x1,y1,'r')
hold on
scatter(x2,y2,'b')
scatter(x3,y3,'g')
% Plot the rings in 3-D
figure(2)
scatter3(x1,y1,z1,'r')
hold on
scatter3(x2,y2,z2,'b');
scatter3(x3,y3,z3,'g');
After generating the points, we can look at how they are distributed.
Viewed from another angle, the pattern becomes visible. We would like to preserve this structure after dimensionality reduction.
In practice, however, reducing the data to 2 dimensions with PCA does not preserve it well, so the dimensionality reduction is not effective here.
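One simple way to put a number on the quality of the reduction is the fraction of variance kept by the retained components, together with the reconstruction error. The sketch below is only an illustration under stated assumptions: it builds its own ring data with a hypothetical noisy height z rather than reusing creat_circle, and it assumes the pca function from the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: reduce 3-D ring data to 2-D and measure what is kept (illustrative data)
t = linspace(0, 2*pi, 120)';               % angles on each ring
r = [3 5 9];                               % ring radii, matching the example above
X = [];
for i = 1:numel(r)
    x = r(i) * cos(t);
    y = r(i) * sin(t);
    z = x + randn(size(t));                % hypothetical noisy height, not the author's z
    X = [X; x, y, z];                      %#ok<AGROW>
end

[pc, score, latent] = pca(X);
Y = score(:, 1:2);                         % 2-D representation
explained = cumsum(latent) / sum(latent);  % cumulative fraction of variance
X_back = Y * pc(:, 1:2)' + mean(X, 1);     % map the 2-D points back to 3-D
recon_err = mean(sum((X - X_back).^2, 2)); % mean squared reconstruction error
fprintf('variance kept by 2 components: %.3f\n', explained(2));
fprintf('mean squared reconstruction error: %.3f\n', recon_err);
```

A high retained variance only says that the discarded direction carried little spread. It does not by itself show whether the rings remain separated in the 2-D result, so a scatter plot of `Y` colored by ring is still worth inspecting.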