Dimensionality reduction
Represent the original high-dimensional features with a low-dimensional vector to avoid the curse of dimensionality.
Dimensionality reduction methods
Principal component analysis
Linear discriminant analysis
Isometric mapping (Isomap)
Locally linear embedding (LLE)
Laplacian eigenmaps
Locality preserving projections (LPP)
PCA maximum variance theory
The high-dimensional vectors of the original data usually contain redundancy and noise. Principal component analysis (PCA) is the most classic dimensionality reduction method; it is linear, unsupervised, and global.
PCA requires two things: a definition of what the principal components are, and a procedure for extracting them.
For example, suppose a set of data points in three-dimensional space all lie on a single plane. Describing them with x, y, and z coordinates takes three dimensions. If we instead set up a coordinate system on the plane itself and describe the points with only x and y coordinates, two dimensions are enough and no information is lost. This completes a dimensionality reduction from three dimensions to two.
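The plane example can be checked numerically. The following is a minimal NumPy sketch, not part of the original text: the plane (spanned by the hypothetical vectors `u` and `w`), the number of points, and the random seed are all assumptions chosen for illustration, and the plane is assumed to pass through the origin so that no centering is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: a plane through the origin, spanned by two orthonormal vectors.
u = np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)
w = np.array([0.0, 1.0, 0.0])
basis = np.stack([u, w], axis=1)      # shape (3, 2)

# 100 points on the plane, each a linear combination of u and w.
coords_2d = rng.normal(size=(100, 2))
points_3d = coords_2d @ basis.T       # shape (100, 3), stored with x, y, z

# The data use three coordinates but only have rank 2.
rank = np.linalg.matrix_rank(points_3d)

# Reduce to 2-D coordinates in the plane, then map back to 3-D.
reduced = points_3d @ basis           # shape (100, 2)
restored = reduced @ basis.T          # shape (100, 3)
max_error = np.abs(restored - points_3d).max()

print(rank, reduced.shape, max_error)
```

The reconstruction error is at the level of floating-point rounding, which is what "no loss of data" means here. With real data the points only lie approximately on a plane, and the plane is not known in advance; finding it is the job of PCA.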
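For a given set of data points $\{v_1, v_2, \dots, v_m\}$, where all vectors are column vectors, the centered data are expressed as $\{x_1, x_2, \dots, x_m\} = \{v_1 - \mu, v_2 - \mu, \dots, v_m - \mu\}$, where $\mu = \frac{1}{m} \sum_{i=1}^{m} v_i$.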
PCA solution method
Center the sample data.
Compute the sample covariance matrix.
Perform eigenvalue decomposition on the covariance matrix, and sort the eigenvalues from largest to smallest.
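Take the eigenvectors $\omega_1, \omega_2, \dots, \omega_d$ corresponding to the d largest eigenvalues. Each centered n-dimensional sample $x_i$ is then mapped to d dimensions through the following mapping:
$$x_i' = \begin{bmatrix} \omega_1^{T} x_i \\ \omega_2^{T} x_i \\ \vdots \\ \omega_d^{T} x_i \end{bmatrix}$$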
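The solution steps above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the function name `pca_eig` is hypothetical, samples are stored as rows of a matrix (the transpose of the column-vector convention in the text), the covariance matrix is normalised by the number of samples rather than by that number minus one, and the test data are synthetic.

```python
import numpy as np

def pca_eig(samples, d):
    """PCA by eigendecomposition of the covariance matrix.

    samples: array of shape (m, n), one n-dimensional sample per row.
    d: number of principal components to keep.
    Returns (projected, components, eigvals).
    """
    # Step 1: center the data by subtracting the per-feature mean.
    mean = samples.mean(axis=0)
    centered = samples - mean

    # Step 2: sample covariance matrix (assumption: normalised by m).
    cov = centered.T @ centered / samples.shape[0]

    # Step 3: eigendecomposition. eigh is for symmetric matrices and returns
    # eigenvalues in ascending order, so reorder them from largest to smallest.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Step 4: keep the eigenvectors of the d largest eigenvalues and project.
    components = eigvecs[:, :d]          # shape (n, d)
    projected = centered @ components    # shape (m, d)
    return projected, components, eigvals

# Synthetic data: 200 samples in 3-D with very different spread per axis.
rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])

projected, components, eigvals = pca_eig(samples, d=2)
print(projected.shape)  # (200, 2)
```

Each row of `projected` holds the d inner products of a centered sample with the selected eigenvectors. With this normalisation, the variance of each projected column equals the corresponding eigenvalue, which is why sorting the eigenvalues identifies the directions that retain the most variance.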