Principal component analysis (PCA) is used for dimensionality reduction. When data has many dimensions, some directions account for most of its variance while others account for very little. PCA finds the important directions so that the rest can be dropped, which can greatly reduce the amount of computation.
The central idea of PCA can be summarized as one center and two basic points:
One center: reconstruction of the original feature space.
Two basic points: maximum projection variance and minimum reconstruction distance.
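The two basic points describe the same optimum from two sides. The following is a minimal sketch, assuming made-up 2-D data and the hypothetical helper names `projection_variance` and `reconstruction_distance`, that checks this numerically: for any unit direction u, the projection variance and the mean squared reconstruction distance add up to the total variance, so maximizing one minimizes the other.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2-D data, stretched along the first axis and then centered.
X = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])
X = X - X.mean(axis=0)


def projection_variance(X, u):
    """Mean squared length of the projections of the rows of X onto unit vector u."""
    return np.mean((X @ u) ** 2)


def reconstruction_distance(X, u):
    """Mean squared distance between each row of X and its projection onto u."""
    X_hat = np.outer(X @ u, u)
    return np.mean(np.sum((X - X_hat) ** 2, axis=1))


# Total variance of the centered data (mean squared norm of the samples).
total = np.mean(np.sum(X ** 2, axis=1))

# Try unit directions at 1-degree steps over a half circle.
angles = np.linspace(0.0, np.pi, 180, endpoint=False)
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)
variances = np.array([projection_variance(X, u) for u in directions])
distances = np.array([reconstruction_distance(X, u) for u in directions])

# For every direction the two quantities sum to the same constant.
print(np.allclose(variances + distances, total))
```

Because the sum is constant, the direction with the largest projection variance is also the one with the smallest reconstruction distance.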
---------------------------------------------------------------------------------------------------------------------------------
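The minimum reconstruction distance is derived as follows.
Before reconstruction ($x_n$ is a centered sample, i.e. the mean has already been subtracted):
$$x_n = \sum_{i=1}^{d} a_{ni} u_i, \qquad a_{ni} = x_n^T u_i$$
Here $x_n$ is the original point, expressed as the sum of $d$ vectors (one per dimension). The $u_1, \dots, u_d$ form an orthonormal basis, and $a_{ni}$ is the length of the component of $x_n$ along $u_i$; multiplying each $u_i$ by $a_{ni}$ and summing reconstructs the original sample. The sum can be split into two groups of vectors:
$$x_n = \sum_{i=1}^{M} a_{ni} u_i + \sum_{i=M+1}^{d} a_{ni} u_i$$
PCA retains the first group and discards the second.
[Figure: a single point projected onto u_m, and all points projected onto the purple line (two dimensions reduced to one)]
After reconstruction, only the retained part remains:
$$\hat{x}_n = \sum_{i=1}^{M} a_{ni} u_i$$
The cost of reconstruction is the distance between each sample before and after reconstruction, which is to be minimized. Subtracting the two expressions leaves only the discarded part:
$$J = \frac{1}{N}\sum_{n=1}^{N} \lVert x_n - \hat{x}_n \rVert^2 = \frac{1}{N}\sum_{n=1}^{N}\sum_{i=M+1}^{d} (x_n^T u_i)^2 = \sum_{i=M+1}^{d} u_i^T S u_i$$
Here $S = \frac{1}{N}\sum_{n=1}^{N} x_n x_n^T$ is the covariance matrix of the centered samples.
Then the loss function is:
$$J = \sum_{i=M+1}^{d} u_i^T S u_i, \quad \text{s.t. } u_i^T u_i = 1$$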
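Using Lagrange multipliers to minimize each term $u_i^T S u_i$ under the constraint $u_i^T u_i = 1$:
(1) The Lagrangian:
$$L(u_i, \lambda_i) = u_i^T S u_i + \lambda_i (1 - u_i^T u_i)$$
(2) Take the derivative with respect to $u_i$ and set it to zero:
$$\frac{\partial L}{\partial u_i} = 2 S u_i - 2 \lambda_i u_i = 0$$
so:
$$S u_i = \lambda_i u_i$$
(3) Result: $u_i$ is an eigenvector of $S$, and $\lambda_i$ is the corresponding eigenvalue. Substituting back, each term is $u_i^T S u_i = \lambda_i$, so the loss is the sum of the eigenvalues of the discarded directions, and it is smallest when the directions with the smallest eigenvalues are the ones discarded.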
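The result can be checked numerically. The sketch below assumes made-up 5-dimensional data and the 1/N convention for the covariance matrix; the variable names (`U_keep`, `J`, `discarded`) are illustrative only. It verifies that the eigenvectors returned by `np.linalg.eigh` satisfy S u = λ u, and that the mean squared reconstruction distance after keeping M components equals the sum of the discarded eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: N = 200 samples, d = 5 dimensions with different spreads.
X = rng.normal(size=(200, 5)) * np.array([4.0, 2.0, 1.0, 0.5, 0.1])
X = X - X.mean(axis=0)             # centered samples x_n
N, d = X.shape
S = X.T @ X / N                    # covariance matrix S (1/N convention)

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigvals, U = np.linalg.eigh(S)

M = 2                              # number of components to keep (assumed)
U_keep = U[:, -M:]                 # the M eigenvectors with the largest eigenvalues
X_hat = X @ U_keep @ U_keep.T      # reconstruction from the kept components only

# Mean squared reconstruction distance versus the sum of discarded eigenvalues.
J = np.mean(np.sum((X - X_hat) ** 2, axis=1))
discarded = eigvals[: d - M].sum()

print(np.allclose(S @ U, U * eigvals))   # S u_i = lambda_i u_i for every column
print(np.isclose(J, discarded))
```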
---------------------------------------------------------------------------------------------------------------------------------
Then the steps of PCA are:
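1. Compute the mean and center the data (subtract the mean from every sample)
2. Calculate the covariance matrix
3. Eigendecomposition of the covariance matrix
The decomposition has the form $S = U \Lambda U^T$, where the columns of $U$ are the eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues.
4. Sort the columns of U by their eigenvalues, from largest to smallest
5. Select the first M eigenvectors to form the projection matrix
6. Project the centered data onto the selected eigenvectors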
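The six steps can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name `pca`, the matrix name `U_M`, the 1/N covariance convention, and the sample data are all assumptions made for the example.

```python
import numpy as np


def pca(X, M):
    """Project the rows of X (N samples, d features) onto the top M principal components."""
    # 1. Compute the mean and center the data.
    mean = X.mean(axis=0)
    Xc = X - mean
    # 2. Covariance matrix (1/N convention; 1/(N-1) gives the same eigenvectors).
    S = Xc.T @ Xc / Xc.shape[0]
    # 3. Eigendecomposition; eigh is intended for symmetric matrices such as S.
    eigvals, U = np.linalg.eigh(S)
    # 4. Sort the columns of U by eigenvalue, largest first.
    order = np.argsort(eigvals)[::-1]
    eigvals, U = eigvals[order], U[:, order]
    # 5. Keep the first M eigenvectors as the projection matrix (named U_M here).
    U_M = U[:, :M]
    # 6. Project the centered data onto the selected eigenvectors.
    Z = Xc @ U_M
    return Z, U_M, eigvals, mean


# Hypothetical data: 300 samples, 4 features with very different spreads.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4)) * np.array([5.0, 2.0, 1.0, 0.2])
Z, U_M, eigvals, mean = pca(X, M=2)
print(Z.shape)
```

The variance of each column of `Z` equals the corresponding eigenvalue, which is why sorting by eigenvalue in step 4 keeps the directions that carry the most variance.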