Eigenvalues and eigenvectors (concluded)

III. Application examples of eigenvalues and eigenvectors

1. Principal component analysis (Principal Component Analysis, PCA)

(1) Variance, covariance, correlation coefficient, and covariance matrix

    Variance: $\operatorname{Var}(X)=\dfrac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$

    Covariance: $\operatorname{Cov}(X,Y)=\dfrac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$

    ** The variance of a single variable measures its degree of dispersion, while the covariance measures the degree of correlation (closeness) between two variables: the larger the covariance in absolute value, the more closely the two variables vary together; the closer the covariance is to zero, the more independent the two variables are.

    The correlation coefficient: $\rho_{XY}=\dfrac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y}$, where $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$.

    ** Covariance and the correlation coefficient can both measure the degree of correlation between two variables, but covariance does not eliminate the dimension (units of measurement), so covariances of different variable pairs cannot be compared directly; the correlation coefficient eliminates the dimension, so the degree of correlation can be compared across different variable pairs.

    Covariance matrix: if there are two variables $X$ and $Y$, the covariance matrix is

$$C=\begin{pmatrix}\operatorname{Cov}(X,X) & \operatorname{Cov}(X,Y)\\ \operatorname{Cov}(Y,X) & \operatorname{Cov}(Y,Y)\end{pmatrix}.$$

    The covariance matrix describes the relationships between the variables in a sample.
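A minimal NumPy sketch of these quantities (the data values are invented purely for illustration, not from the original post):

```python
import numpy as np

# Hypothetical toy data, made up only to exercise the formulas above.
x = np.array([2.1, 2.5, 3.6, 4.0, 4.8])
y = np.array([8.0, 10.0, 12.5, 14.0, 16.0])

var_x = x.var(ddof=1)                            # sample variance, divides by n - 1
cov_xy = np.cov(x, y)[0, 1]                      # sample covariance of x and y
rho = cov_xy / (x.std(ddof=1) * y.std(ddof=1))   # correlation coefficient

print(var_x, cov_xy, rho)
print(np.cov(x, y))               # the full 2 x 2 covariance matrix of (x, y)
print(np.corrcoef(x, y)[0, 1])    # built-in correlation; should match rho
```

Note that `np.cov` normalizes by n - 1 by default, matching the sample definitions above.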

(2) The idea and algorithm of principal component analysis

  The idea behind principal component analysis is dimensionality reduction: many variables are converted into a few composite variables (the principal components). Each principal component is a linear combination of the original variables, the principal components are uncorrelated with one another, and together they reflect most of the information in the original variables without overlap. PCA is a linear transformation that maps the data to a new coordinate system such that the projection of the data with the largest variance falls on the first coordinate (called the first principal component), the projection with the second largest variance on the second coordinate (the second principal component), and so on. Because it retains the directions of largest variance, principal component analysis is often used to reduce the dimensionality of a data set while preserving the features that contribute most to its variance.

  Suppose an object is described by $p$ variables, denoted $X_1, X_2, \dots, X_p$. These $p$ variables form a $p$-dimensional random vector $X = (X_1, X_2, \dots, X_p)$, and $n$ samples form an $n \times p$ matrix $A$. The principal components are solved for as follows:

  Step 1: compute the covariance matrix $B$ of $A$;

  Step 2: solve the eigenvalue problem of the covariance matrix $B$, obtaining the eigenvalues arranged in decreasing order $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p$. Let $\Lambda$ be the diagonal matrix whose diagonal entries are these eigenvalues, and let $U$ be the matrix whose columns are the corresponding eigenvectors, so that $B = U \Lambda U^{T}$. The key point here is that $U$ collects the eigenvectors of the symmetric matrix $B$: each column can be regarded as a basis vector, and applying $B$ to one of these basis vectors only stretches it, the stretch factor being the corresponding eigenvalue.
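A minimal NumPy sketch of this step, using randomly generated toy data (the original post contains no code):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))     # toy data: n = 100 samples, p = 3 variables

B = np.cov(A, rowvar=False)       # step 1: the p x p covariance matrix of A

# Step 2: eigh suits the symmetric matrix B; it returns eigenvalues in
# ascending order, so both outputs are reversed to get decreasing order.
eigvals, U = np.linalg.eigh(B)
eigvals, U = eigvals[::-1], U[:, ::-1]

Lam = np.diag(eigvals)            # the diagonal eigenvalue matrix

print(np.allclose(B, U @ Lam @ U.T))                   # B = U Lam U^T
print(np.allclose(B @ U[:, 0], eigvals[0] * U[:, 0]))  # B stretches each eigenvector
```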

  Step 3: select the number of principal components. According to the sizes of the eigenvalues, the larger eigenvalues are taken as principal components and their corresponding eigenvectors are kept as the basis vectors; how many to keep is judged from the actual situation. Typically, an eigenvalue greater than 1 can be regarded as a principal component. A sketch that strings the three steps together follows.
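Putting the steps together, a minimal sketch of the whole procedure (the `pca` helper and the toy data are hypothetical, assumed here only for illustration; a production implementation would typically use a library such as scikit-learn):

```python
import numpy as np

def pca(A, k):
    """Project the n x p data matrix A onto its first k principal components."""
    A_centered = A - A.mean(axis=0)           # center each variable
    B = np.cov(A_centered, rowvar=False)      # step 1: covariance matrix
    eigvals, U = np.linalg.eigh(B)            # step 2: eigendecomposition
    eigvals, U = eigvals[::-1], U[:, ::-1]    # eigenvalues in decreasing order
    return A_centered @ U[:, :k], eigvals     # step 3: keep the top k eigenvectors

# Toy data: the third variable is nearly a copy of the first, so two
# principal components capture almost all of the variance.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
A = np.column_stack([x, rng.normal(size=200), x + 0.05 * rng.normal(size=200)])

scores, eigvals = pca(A, k=2)
print(scores.shape)                # (200, 2)
print(eigvals / eigvals.sum())     # proportion of variance of each component
```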

 


Origin www.cnblogs.com/rswss/p/11441046.html