Principal component analysis (PCA) is a dimensionality reduction algorithm that converts multiple indicators into a few principal components. These principal components are linear combinations of the original variables and are uncorrelated with each other. Together they retain most of the information in the original data.
Used when the research question involves multiple variables that are strongly correlated with each other.
Not suitable for evaluation models.
Idea
The first principal component can generally be regarded as a comprehensive (overall) component.
Calculation steps (to be written in the paper)
- Calculate the correlation coefficient matrix R of the sample data matrix X
- Compute eigenvalues and eigenvectors of R
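- Calculate the contribution rate and cumulative contribution rate of each principal component (keep the leading components whose cumulative contribution rate exceeds 80%)
- Interpret the selected principal components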
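The calculation steps above can be sketched in code. This is only a minimal sketch in Python with NumPy, not the MATLAB code these notes refer to. The function name pca_correlation, the threshold parameter, and the demo data are assumptions made for illustration. Rows of X are taken to be samples and columns to be indicators.

```python
import numpy as np

def pca_correlation(X, threshold=0.80):
    """PCA on the correlation matrix of X (rows = samples, columns = indicators)."""
    X = np.asarray(X, dtype=float)

    # Step 1: correlation coefficient matrix R of the indicators.
    R = np.corrcoef(X, rowvar=False)

    # Step 2: eigenvalues and eigenvectors of the symmetric matrix R.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # eigh returns ascending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 3: contribution rate and cumulative contribution rate.
    contrib = eigvals / eigvals.sum()
    cum_contrib = np.cumsum(contrib)

    # Smallest number of components whose cumulative rate reaches the threshold.
    k = min(int(np.searchsorted(cum_contrib, threshold)) + 1, len(eigvals))

    # Principal component scores: standardized data projected on the first k eigenvectors.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    scores = Z @ eigvecs[:, :k]
    return eigvals, eigvecs[:, :k], cum_contrib, k, scores

# Made-up demo data: 30 samples, 4 indicators, two of them strongly correlated.
rng = np.random.default_rng(0)
demo = rng.normal(size=(30, 4))
demo[:, 1] = demo[:, 0] + 0.1 * rng.normal(size=30)
eigvals, loadings, cum_contrib, k, scores = pca_correlation(demo)
print("components kept:", k)
print("cumulative contribution rate:", np.round(cum_contrib, 3))
```

The columns of loadings are the eigenvectors used for the interpretation step: the indicators with large absolute weights in a column suggest what that component represents. The sign of each eigenvector is arbitrary, so only the relative pattern of the weights should be interpreted.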
Example of interpretation of principal components
Principal component analysis for clustering
First use MATLAB code to compute and select the principal components, then import the principal component scores into SPSS. Run a hierarchical cluster analysis first to draw the dendrogram, and decide from the dendrogram how many categories to use. Then run the cluster analysis again, entering the chosen number of categories under the "Save" option, and finally plot the classification results in SPSS.
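The same workflow can be sketched in a single script. The notes use MATLAB and SPSS, so the Python version below (NumPy plus SciPy's hierarchical clustering) is a substitute for illustration, not the original procedure. The function name cluster_on_components, the choice of Ward linkage, and the demo data are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_on_components(X, n_components, n_clusters):
    """Hierarchical clustering of samples using their principal component scores."""
    X = np.asarray(X, dtype=float)

    # Principal component scores from the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    top = np.argsort(eigvals)[::-1][:n_components]
    scores = Z @ eigvecs[:, top]

    # First pass: build the merge tree (the data behind the dendrogram).
    tree = linkage(scores, method="ward")   # Ward linkage is an assumed choice

    # Second pass: cut the tree into the chosen number of categories.
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return scores, tree, labels

# Made-up demo data: two well-separated groups of 10 samples, 3 indicators each.
rng = np.random.default_rng(0)
demo = np.vstack([rng.normal(0.0, 1.0, size=(10, 3)),
                  rng.normal(10.0, 1.0, size=(10, 3))])
scores, tree, labels = cluster_on_components(demo, n_components=2, n_clusters=2)
print(labels)
```

The tree returned here can be passed to scipy.cluster.hierarchy.dendrogram to draw the dendrogram before deciding on n_clusters, which mirrors the two-pass procedure described above.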
Principal component analysis for regression
First use MATLAB code to compute and select the principal components, then import the principal component scores into Stata to run the regression, test for heteroskedasticity, and so on.
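The regression step can be sketched as follows. The notes use MATLAB and Stata, so this Python version with NumPy is only an illustration of the idea: regress the response on the principal component scores instead of on the correlated original variables. The function name pc_regression and the demo data are assumptions, and the heteroskedasticity test itself is not implemented here.

```python
import numpy as np

def pc_regression(X, y, n_components):
    """Ordinary least squares of y on the first n_components principal component scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Principal component scores from the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    top = np.argsort(eigvals)[::-1][:n_components]
    scores = Z @ eigvecs[:, top]

    # Design matrix with an intercept column, then least squares.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    # Residuals are what a heteroskedasticity test would examine afterwards.
    residuals = y - A @ coef
    return coef, residuals, scores

# Made-up demo data: 40 samples, 3 indicators, response with added noise.
rng = np.random.default_rng(0)
demo_X = rng.normal(size=(40, 3))
demo_y = 1.0 + 2.0 * demo_X[:, 0] - demo_X[:, 1] + 0.1 * rng.normal(size=40)
coef, residuals, scores = pc_regression(demo_X, demo_y, n_components=2)
print("intercept and component coefficients:", np.round(coef, 3))
```

Because the scores are uncorrelated with each other, the multicollinearity present in the original variables does not carry over to this regression. The coefficients apply to the components, so they need to be mapped back through the eigenvectors if an equation in the original variables is wanted.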