Machine Learning Day 11: Dimensionality Reduction

Dimensionality reduction

Dimensionality reduction uses a low-dimensional vector to represent the original high-dimensional features, avoiding the curse of dimensionality.

Dimensionality reduction methods

  • Principal component analysis (PCA)

  • Linear discriminant analysis (LDA)

  • Isometric mapping (Isomap)

  • Locally linear embedding (LLE)

  • Laplacian eigenmaps

  • Locality preserving projections (LPP)
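
Most of these methods are available off the shelf. As a quick orientation, here is a minimal sketch assuming scikit-learn (a library choice of mine; the post names none), reducing a toy dataset to two dimensions with each method. Locality preserving projections has no scikit-learn implementation, so it is omitted.

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.manifold import Isomap, LocallyLinearEmbedding, SpectralEmbedding

    X, y = load_iris(return_X_y=True)

    X_pca = PCA(n_components=2).fit_transform(X)       # linear, unsupervised, global
    X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # linear, supervised
    X_iso = Isomap(n_components=2).fit_transform(X)    # nonlinear, preserves geodesic distances
    X_lle = LocallyLinearEmbedding(n_components=2).fit_transform(X)  # preserves local linear structure
    X_lap = SpectralEmbedding(n_components=2).fit_transform(X)       # Laplacian eigenmaps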

PCA maximum variance theory

The high-dimensional vectors of the original data contain redundancy and noise. Principal Component Analysis (PCA) is the most classic dimensionality reduction method; it is linear, unsupervised, and global. PCA must first define what a principal component is, and then design a procedure to extract the principal components.
For example, if a series of data points all lie on a plane within three-dimensional space, then representing them with (x, y, z) coordinates requires three dimensions. But if we set up coordinates on the plane itself and represent each point with (x, y) alone, only two dimensions are needed and no information is lost. In this way we have completed a dimensionality reduction, from three dimensions to two.
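
A minimal sketch of this plane example, assuming NumPy and scikit-learn (my choice of tools; the post gives no code), confirming that two components recover such data exactly:

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)

    # 200 points that lie exactly on a 2-D plane embedded in 3-D space:
    # each point is a combination of two fixed direction vectors.
    coeffs = rng.normal(size=(200, 2))
    basis = np.array([[1.0, 0.0, 1.0],
                      [0.0, 1.0, 1.0]])   # spans the plane
    points_3d = coeffs @ basis            # shape (200, 3)

    # Two principal components capture the plane with no loss.
    pca = PCA(n_components=2)
    points_2d = pca.fit_transform(points_3d)
    reconstructed = pca.inverse_transform(points_2d)

    print(points_2d.shape)                        # (200, 2)
    print(np.allclose(points_3d, reconstructed))  # True: no information lost
    print(pca.explained_variance_ratio_.sum())    # 1.0: all variance retained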
For a given set of data points $\{v_1, v_2, \ldots, v_n\}$, where all vectors are column vectors, the centered data are expressed as $\{x_1, x_2, \ldots, x_n\}$, where $x_i = v_i - \mu$ and $\mu = \frac{1}{n}\sum_{i=1}^{n} v_i$ is the sample mean.
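
In the standard maximum variance formulation (stated here as it is commonly presented, not verbatim from the post), PCA seeks a unit direction $w$ that maximizes the variance of the projected, centered data. Since the projection of $x_i$ onto $w$ is $w^{\top} x_i$, the projected variance is $\frac{1}{n}\sum_{i=1}^{n} (w^{\top} x_i)^2 = w^{\top} \Sigma w$, and the objective is

$$\max_{w}\; w^{\top} \Sigma w \quad \text{s.t.} \quad w^{\top} w = 1, \qquad \Sigma = \frac{1}{n}\sum_{i=1}^{n} x_i x_i^{\top}.$$

The Lagrangian condition $\Sigma w = \lambda w$ shows that the optimal directions are eigenvectors of the covariance matrix $\Sigma$, and the variance attained equals the corresponding eigenvalue, which is why the procedure below sorts eigenvalues from large to small.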

PCA solution method

  1. Center the sample data by subtracting the sample mean from every sample.

  2. Find the sample covariance matrix.

  3. Perform eigenvalue decomposition on the covariance matrix, and sort the eigenvalues from large to small.

  4. Take the eigenvectors $w_1, w_2, \ldots, w_d$ corresponding to the $d$ largest eigenvalues.

  5. Map each $n$-dimensional sample $x_i$ down to $d$ dimensions through the mapping $x_i' = \left[\, w_1^{\top} x_i,\; w_2^{\top} x_i,\; \ldots,\; w_d^{\top} x_i \,\right]^{\top}$.
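
A from-scratch sketch of these five steps, assuming NumPy (my own illustration, not code from the post):

    import numpy as np

    def pca(V, d):
        """Reduce samples to d dimensions following the steps above.

        V: array of shape (num_samples, n), one n-dimensional sample per row.
        Returns the projected samples, shape (num_samples, d).
        """
        # 1. Center the data.
        X = V - V.mean(axis=0)
        # 2. Sample covariance matrix (n x n).
        cov = X.T @ X / X.shape[0]
        # 3. Eigendecomposition; eigh suits symmetric matrices and
        #    returns eigenvalues in ascending order, so reverse the order.
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]   # largest eigenvalue first
        # 4. Keep the eigenvectors of the d largest eigenvalues.
        W = eigvecs[:, order[:d]]
        # 5. Project: x_i' = [w_1^T x_i, ..., w_d^T x_i]^T for every sample.
        return X @ W

    # Usage: 100 random 5-dimensional samples reduced to 2 dimensions.
    V = np.random.default_rng(0).normal(size=(100, 5))
    print(pca(V, 2).shape)   # (100, 2)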


Origin: blog.51cto.com/15069488/2578598