Two tasks: the steps of principal component analysis (PCA), and code that applies it. You may write the code in any programming language you are familiar with.

Principal component analysis (PCA) is a statistical dimension-reduction technique that condenses many indicators into a few comprehensive ones, called principal components, which retain most of the information in the original variables and are usually expressed as linear combinations of them. The idea of PCA is to map an n-dimensional feature space onto an m-dimensional one (m < n), where the m new features are mutually orthogonal and are called the principal components. These m dimensions are constructed anew; they are not obtained by simply dropping n - m of the original n dimensions. The core idea of PCA is to project the data along the directions of largest variance, where the data points are easiest to distinguish.
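
In symbols (a standard formulation of this idea, stated here for reference rather than taken from the original post): given samples $x_1,\dots,x_n$ with mean $\bar{x}$, the first principal component direction $w$ solves

$$
\max_{\|w\|_2=1} \; w^{\top} S\, w,
\qquad
S=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^{\top},
$$

and the maximizer is the eigenvector of the covariance matrix $S$ with the largest eigenvalue; the later components are the eigenvectors belonging to the next-largest eigenvalues.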

Steps of PCA dimensionality reduction:

1. Center the data: we generally take each row as one feature and compute the mean of each feature; subtracting the mean from the original data gives the new, centered data;

2. Compute the covariance matrix of the features;

3. Compute the eigenvalues and eigenvectors of the covariance matrix;

4. Sort the eigenvalues in descending order together with their corresponding eigenvectors, select the leading principal components, and form the projection matrix;

5. Project the data with the projection matrix to obtain the dimensionality-reduced data.
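
A minimal MATLAB sketch of these five steps, using the built-in cov and eig rather than the hand-rolled version given later; the data is the toy set from the example below, and the choice m = 1 is an illustrative assumption:

% Toy data: 10 samples, 2 features, one sample per row
X = [2.5 2.4; 0.5 0.7; 2.2 2.9; 1.9 2.2; 3.1 3.0; 2.3 2.7; 2 1.6; 1 1.1; 1.5 1.6; 1.1 0.9];
m = 1;                             % number of components to keep (assumed)
B = X - mean(X);                   % 1. center each feature
S = cov(B);                        % 2. covariance matrix (2 x 2)
[V,L] = eig(S);                    % 3. eigenvalues and eigenvectors
[~,idx] = sort(diag(L),'descend'); % 4. sort eigenvalues, largest first
P = V(:,idx(1:m));                 % projection matrix from the top m eigenvectors
Y = B*P;                           % 5. reduced data (10 x 1)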

PCA applications:

  1. Principal component classification

    After principal component analysis, a scatter plot of the component scores shows that samples with different characteristics cluster together, so the samples can be classified (see the sketch after this list).

  2. Principal component regression

    When collinearity appears among multiple independent variables, principal component regression can overcome this shortcoming of classical regression (see the sketch after this list).
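
A rough MATLAB sketch of both applications on synthetic data (the data, the variable names, and the two-component choice are illustrative assumptions, not from the original post):

rng(1);                              % reproducible synthetic data
n=50;
x1=randn(n,1);
x2=0.95*x1+0.05*randn(n,1);          % nearly collinear with x1
x3=randn(n,1);
Xdata=[x1 x2 x3];                    % n samples x 3 variables
y=3*x1+2*x3+0.1*randn(n,1);          % response for the regression part
B=Xdata-mean(Xdata);                 % center the data
[V,L]=eig((B'*B)/(n-1));             % eigen-decomposition of the covariance
[~,idx]=sort(diag(L),'descend');
V=V(:,idx);
% 1. Classification view: samples plotted in the space of the first two PCs
scores=B*V(:,1:2);
figure; scatter(scores(:,1),scores(:,2));
xlabel('PC1'); ylabel('PC2');
% 2. Principal component regression: regress y on the leading PC scores
Z=B*V(:,1:2);                        % scores replace the collinear predictors
gamma=[ones(n,1) Z]\y;               % least-squares fit in PC space
beta=V(:,1:2)*gamma(2:end);          % coefficients mapped back to the original variables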

MATLAB code implementation

Steps:

1. Center the data

2. Compute the covariance matrix of the features

3. Find the eigenvalues and eigenvectors of the covariance matrix

4. Sort the eigenvalues in descending order, take the k largest, and form the projection matrix P whose columns are the corresponding eigenvectors

5. Project the sample points onto the selected k eigenvectors

clear all;
close all;

% Test data: 10 samples with 2 features each (one sample per row)
X=[2.5 2.4;
0.5 0.7;
2.2 2.9;
1.9 2.2;
3.1 3.0;
2.3 2.7;
2 1.6;
1 1.1;
1.5 1.6;
1.1 0.9;];
X=X';   % transpose so that each row is a feature, each column a sample

% Alternative test data sets:
% X=[74 87 84 88 74 86 69 73 64;
%    85 83 83 77 69 84 74 85 84;
%    83 91 89 85 87 86 83 86 85;
%    69 100 82 96 84 82 97 98 76;
%    97 48 89 36 46 53 88 89 97;
%    59 98 93 94 98 100 79 83 61;];
% X=X';
% X=[2 0 -1.4;
%    2.2 0.2 -1.5;
%    2.4 0.1 -1;
%    1.9 0 -1.2;];
% X=X';

[a,b]=size(X);          % a = number of features, b = number of samples
M=sum(X,2)/b;           % mean of each feature (row means, a x 1)
B=zeros(a,b);
for i=1:b
    B(:,i)=X(:,i)-M;    % 1. center the data
end
% B=zscore(X')';        % alternative: standardize each feature instead
S=1/(b-1)*B*B';         % 2. covariance matrix of the features (a x a)

[vector,value]=eig(S);  % 3. eigenvectors and eigenvalues of S
vector
value=diag(value)       % keep the eigenvalues as a column vector
varine=sum(value);      % total variance
[value_sort,subscript]=sort(value,'descend');   % 4. sort eigenvalues, largest first
value_sort=value_sort/varine;                   % proportion of variance per component

compare=0;
k=0;                    % number of components kept (renamed from sign, which shadows a built-in)
for i=1:a
    if compare<0.9      % keep components until 90% of the variance is explained
        k=k+1;
        compare=compare+value_sort(i);
    end
end

P=zeros(a,k);
for i=1:k
    P(:,i)=vector(:,subscript(i));   % projection matrix from the top k eigenvectors
end
P
Y=P'*B                  % 5. project the centered data (k x b reduced data)
D=zeros(k,k);
for i=1:k
    D(i,i)=value(subscript(i));      % retained eigenvalues on the diagonal
end
D
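
As a sanity check (not part of the original script), the built-in pca function from the Statistics and Machine Learning Toolbox should recover the same directions up to sign, given one sample per row:

[coeff,score,latent]=pca(X');   % X' puts samples back in rows
coeff                           % columns should match P up to sign
latent                          % should match the eigenvalues in D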

Origin blog.csdn.net/ManWen_Li/article/details/104383678