PCA (Principal Component Analysis) and FA (Factor Analysis)

1. Different principles

The basic principle of principal component analysis: using the idea of dimensionality reduction (a linear transformation), multiple indicators are converted into a few uncorrelated composite indicators (principal components) while losing little information. Each principal component is a linear combination of the original variables, and the principal components are uncorrelated with one another, which gives them some advantages over the original variables. A common rule of thumb is that the retained principal components should together keep more than 90% of the information (variance) of the original variables. The aim is to simplify the structure of the system and capture the essence of the problem.
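The description above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper name `pca` and the synthetic data are assumptions made for the example. It shows that the component scores are linear combinations of the centered variables, that they are mutually uncorrelated, and how much variance each one carries.

```python
import numpy as np

def pca(X):
    """Hypothetical helper: PCA by eigendecomposition of the sample covariance."""
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                    # each column is one principal component
    return eigvals, eigvecs, scores

# Synthetic data (illustrative only): three variables driven by one latent source
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X = latent @ np.array([[1.0, 0.8, 0.6]]) + 0.3 * rng.normal(size=(500, 3))

eigvals, eigvecs, scores = pca(X)
explained = eigvals / eigvals.sum()          # share of total variance per component
score_cov = np.cov(scores, rowvar=False)     # diagonal: components are uncorrelated
```

The covariance matrix of the scores is diagonal, with the eigenvalues on the diagonal, which is the "uncorrelated composite indicators" property stated above.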

The basic principle of factor analysis: also using the idea of dimensionality reduction, it starts from the internal dependency structure of the correlation matrix of the original variables and expresses variables with intricate relationships as a linear combination of a few common factors plus a specific factor that affects only one variable. In other words, it extracts from the data a few common factors that explain the variables. (Factor analysis can be seen as an extension of principal component analysis; compared with principal component analysis, it is more concerned with describing the correlations among the original variables.)

2. Different directions of the linear representation

In principal component analysis, each principal component is expressed as a linear combination of the original variables.

In factor analysis, each original variable is expressed as a linear combination of the common factors (plus a specific factor).
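Written out for p observed variables and m common factors, the two directions look roughly as follows. The symbols Y_i, a_{ij}, l_{ij}, F_j, \mu_i and \varepsilon_i are notation introduced here for illustration and do not appear in the original text.

```latex
% PCA: each component is a linear combination of the variables
Y_i = a_{i1} X_1 + a_{i2} X_2 + \cdots + a_{ip} X_p, \qquad i = 1, \dots, p

% FA: each variable is a linear combination of the common factors plus a specific factor
X_i = \mu_i + l_{i1} F_1 + l_{i2} F_2 + \cdots + l_{im} F_m + \varepsilon_i, \qquad i = 1, \dots, p, \quad m \le p
```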

3. Different assumptions 

Principal component analysis: no model assumptions are required.

Factor analysis: some assumptions are required. The common factors are uncorrelated with one another, the specific factors are uncorrelated with one another, and the common factors are uncorrelated with the specific factors.
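One consequence of these assumptions, when the common factors are also scaled to unit variance, is that the covariance matrix of the variables splits into a common part and a diagonal specific part. The simulation below is a sketch under stated assumptions: the loading matrix `L` and the specific variances `psi` are made-up illustrative values, and normal distributions are used only for convenience.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, m = 200_000, 4, 2

# Hypothetical loadings and specific variances (illustrative values only)
L = np.array([[0.9, 0.0],
              [0.8, 0.2],
              [0.1, 0.7],
              [0.0, 0.6]])
psi = np.array([0.19, 0.32, 0.50, 0.64])

F = rng.normal(size=(n, m))                  # common factors: uncorrelated, unit variance
E = rng.normal(size=(n, p)) * np.sqrt(psi)   # specific factors: independent of each other and of F
X = F @ L.T + E                              # each variable = common part + specific part

sample_cov = np.cov(X, rowvar=False)
implied_cov = L @ L.T + np.diag(psi)         # covariance implied by the assumptions
max_abs_diff = np.abs(sample_cov - implied_cov).max()
```

With a large sample, `sample_cov` should be close to `implied_cov`; the off-diagonal entries come entirely from the common factors.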

4. Different solution methods 

Solving for the principal components: start from the covariance matrix (when the covariance matrix is known) or from the correlation matrix (when the correlation matrix R is known); when these are unknown, their sample estimates are used. The only method used is the principal component method, that is, eigendecomposition of that matrix.

Solving for the factor loadings: several methods are available, including the principal component method, the principal axis factor method, the maximum likelihood method, the least squares method, and the alpha factor extraction method.
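As an illustration of the first of these, the principal component method takes the leading eigenvectors of R and scales each by the square root of its eigenvalue. The following is a minimal sketch; the function name `pc_loadings` and the example matrix are assumptions made for the example.

```python
import numpy as np

def pc_loadings(R, m):
    """Hypothetical helper: principal component method for factor loadings."""
    eigvals, eigvecs = np.linalg.eigh(R)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]         # indices of the m largest
    return eigvecs[:, order] * np.sqrt(eigvals[order])

# Illustrative correlation matrix for three variables
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

loadings = pc_loadings(R, m=1)                    # one common factor
communalities = (loadings ** 2).sum(axis=1)       # variance explained by the factor
specific_var = 1.0 - communalities                # what is left for the specific factors
full = pc_loadings(R, m=3)                        # keeping all factors reproduces R exactly
```

The other methods listed above estimate the same loading matrix under different criteria and will generally give different numbers.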

5. Different variability of principal components and factors

Principal component analysis: for a given covariance or correlation matrix whose eigenvalues are distinct, the principal components are fixed and unique (up to sign).

Factor analysis: the factors are not fixed; rotating them yields different factors.
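The reason rotation is allowed can be checked numerically: multiplying the loading matrix by any orthogonal matrix changes the individual loadings but leaves the common part of the covariance unchanged. The loading values and the 30 degree angle below are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical loading matrix: 4 variables, 2 common factors
L = np.array([[0.9, 0.0],
              [0.8, 0.2],
              [0.1, 0.7],
              [0.0, 0.6]])

theta = np.deg2rad(30.0)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # orthogonal rotation matrix

L_rot = L @ Q                                      # rotated loadings
same_common_cov = np.allclose(L @ L.T, L_rot @ L_rot.T)   # fit to the data is unchanged
same_loadings = np.allclose(L, L_rot)                      # but the factors themselves differ
communalities = (L ** 2).sum(axis=1)
communalities_rot = (L_rot ** 2).sum(axis=1)       # communalities are also preserved
```

Rotation criteria such as varimax pick one particular `Q` from this family to make the loadings easier to interpret.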

6. Number of factors and number of principal components 

Principal component analysis: the number of principal components is fixed. In general, p variables give p principal components, and they differ only in how much information each one explains. In practice, only the first few principal components are retained, chosen with the help of a scree plot.

Factor analysis: the number of factors must be specified by the analyst (SPSS and SAS set it automatically according to certain criteria, for example retaining the factors whose eigenvalue is greater than 1). Specifying a different number of factors gives different results.
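The eigenvalue-greater-than-1 rule and the numbers behind a scree plot can be sketched as follows. The correlation matrix is a made-up example with two clearly separated pairs of variables, chosen so that the eigenvalues are easy to verify by hand.

```python
import numpy as np

# Illustrative correlation matrix: variables 1-2 and 3-4 form two separate pairs
R = np.array([[1.0, 0.8, 0.0, 0.0],
              [0.8, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.6],
              [0.0, 0.0, 0.6, 1.0]])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]     # 1.8, 1.6, 0.4, 0.2
n_retained = int((eigvals > 1.0).sum())            # eigenvalue-greater-than-1 rule -> 2
cum_explained = np.cumsum(eigvals) / eigvals.sum() # 0.45, 0.85, 0.95, 1.00
```

A scree plot is these eigenvalues plotted against their rank; the analyst looks for the point where the curve flattens.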

7. Different emphasis in explanation:

Principal component analysis: the focus is on explaining the total variance of the variables.

Factor analysis: the focus is on explaining the covariance between the variables.

8. Algorithmic differences: 

Principal component analysis: the diagonal elements of the covariance matrix used are the variances of the variables.

Factor analysis: the diagonal elements of the covariance matrix used are not the variances of the variables but the corresponding communalities (the part of each variable's variance that is explained by the common factors).
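The difference in the diagonal can be made concrete. In the sketch below, PCA decomposes the correlation matrix as is, while the factor-analysis side replaces the diagonal with communality estimates before decomposing. Using the squared multiple correlation (SMC) as the initial communality estimate is an assumption made here, as is common in principal axis factoring; the original text does not say which estimate is used.

```python
import numpy as np

# Illustrative correlation matrix for three variables
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# PCA: the diagonal holds the full (standardized) variances, all equal to 1
pca_eigvals = np.linalg.eigvalsh(R)

# FA: put communality estimates on the diagonal instead (SMC is an assumed choice)
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)                   # off-diagonal correlations are untouched
fa_eigvals = np.linalg.eigvalsh(R_reduced)
```

Because the reduced matrix has a smaller trace, the factors account only for the shared variance, not the total variance.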

9. Different advantages:

Principal component analysis: 
First: if you only want to turn the existing variables into a few new variables (which carry almost all the information in the original variables) for use in subsequent analysis, principal component analysis is sufficient, although factor analysis can generally be used as well;
second: by computing a composite principal component score, objective economic phenomena can be evaluated in a systematic way;
third: in applications it emphasizes the comprehensive evaluation of information contribution and influence;
fourth: it is widely applicable. Principal component analysis does not require the data to come from a normally distributed population. It rests on matrix operations, matrix diagonalization, and the spectral decomposition of a matrix, so principal component dimensionality reduction can be applied to any multidimensional problem.

Factor analysis: first, rotation can be used to make the factors easier to interpret, so factor analysis has the advantage over principal component analysis in interpretability; second, factor analysis does not select among the original variables but recombines the information in them to find the common factors that influence the variables, thereby simplifying the data.

10. Different application scenarios (several common combinations):

Principal component analysis:

Principal component analysis + discriminant analysis: suitable when there are many variables but few records;

Principal component analysis + multiple regression analysis: principal component analysis can help determine whether collinearity is present and can be used to deal with it;

Principal component analysis + cluster analysis: possible, although in this combination factor analysis usually plays to its strengths better.
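For the regression combination, one way the idea can be carried out is principal component regression: inspect the eigenvalues of the correlation matrix for signs of collinearity, then regress on the leading component scores only. The sketch below uses synthetic data, and the choice of `k = 2` components is an assumption made for this example rather than a general recommendation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)        # nearly collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 1.0 * x3 + 0.1 * rng.normal(size=n)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize the predictors
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# A very small eigenvalue (large ratio) signals collinearity among the predictors
condition_number = eigvals[0] / eigvals[-1]

k = 2                                       # drop the near-zero-variance component
scores = Z @ eigvecs[:, :k]                 # uncorrelated regressors
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ coef
r2 = 1.0 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

Because the retained scores are uncorrelated, the regression coefficients on them are stable even though `x1` and `x2` are almost identical.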

Factor analysis:

First, factor analysis + multiple regression analysis: factor analysis can be used to solve the collinearity problem;
second, factor analysis can be used to explore the latent structure among variables;
third, factor analysis + cluster analysis: factor analysis can be used to find the clustering variables, thereby reducing their number;
in addition, factor analysis can also be used to confirm the underlying structure.

