Pearson, Kendall, Spearman correlation

 First, Pearson correlation

In statistics, the Pearson correlation coefficient, also known as the Pearson product-moment correlation coefficient (PPMCC or PCC for short), measures the linear correlation between two variables X and Y. Its value lies between -1 and 1.

It was developed by Karl Pearson from a similar but slightly different idea proposed by Francis Galton in the 1880s.

Definition

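The Pearson correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations:

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y}$$

The formula above defines the population correlation coefficient, commonly denoted by the lowercase Greek letter $\rho$. Estimating the covariance and the standard deviations from a sample gives the sample Pearson correlation coefficient, commonly denoted by the lowercase letter $r$:

$$r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$

$r$ can also be estimated from the mean of the products of the standard scores of the sample points, which gives an expression equivalent to the formula above:

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{s_X}\right)\left(\frac{Y_i-\bar{Y}}{s_Y}\right)$$

where $\frac{X_i-\bar{X}}{s_X}$, $\bar{X}$ and $s_X$ are the standard score, the sample mean and the sample standard deviation of the sample $X_i$, respectively.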
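
To make the definition concrete, here is a minimal Python sketch that computes the sample coefficient in two ways: as the covariance divided by the product of the standard deviations, and as the averaged product of standard scores. The function names `pearson_r` and `pearson_r_zscores` are illustrative choices, not taken from any library, and the sample standard deviation is assumed to use n - 1 in its denominator.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson r: covariance over the product of standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of (products of) deviations; the 1/(n-1) factors cancel in the ratio.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a standard deviation is zero")
    return sxy / math.sqrt(sxx * syy)


def pearson_r_zscores(xs, ys):
    """Equivalent form: sum of products of standard scores divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (assumption: n - 1 denominator).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)
```

For the made-up paired data `[1, 2, 3, 4, 5]` and `[2, 1, 4, 3, 5]`, both functions give a value of about 0.8, and they agree up to floating-point rounding.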

The strength of the correlation is usually judged from the absolute value of the correlation coefficient:

0.8-1.0 very strong correlation
0.6-0.8 strong correlation
0.4-0.6 moderate correlation
0.2-0.4 weak correlation
0.0-0.2 very weak correlation or no correlation
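
The bands above can be turned into a small lookup, sketched here in Python. The function name `correlation_strength` and the short labels are illustrative. Because the listed ranges share their endpoints, this sketch assumes that a boundary value such as 0.8 belongs to the stronger band, and that the bands apply to the absolute value of the coefficient.

```python
def correlation_strength(r):
    """Map a correlation coefficient to the rule-of-thumb band listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)  # the sign only gives the direction of the relationship
    # Assumption: a shared endpoint (0.8, 0.6, ...) counts toward the stronger band.
    if magnitude >= 0.8:
        return "very strong"
    if magnitude >= 0.6:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    if magnitude >= 0.2:
        return "weak"
    return "very weak or none"
```

These cut-offs are a convention for quick reading rather than a statistical test; they say nothing about significance.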

Conditions of Use

The correlation coefficient is defined only when the standard deviations of both variables are non-zero. The Pearson correlation coefficient is suitable when:

(1) The two variables are linearly related and both are continuous data.

(2) The populations of the two variables are normally distributed, or have a near-normal unimodal distribution.

(3) The observations of the two variables are paired, and each pair of observations is independent of the others.

 Second, Kendall correlation

Definition of the Kendall coefficient: n objects of the same kind are ranked by one particular attribute, and their order under another attribute is usually scrambled. The Kendall coefficient is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs n(n-1)/2.

Writing P for the number of concordant pairs, so that (with no ties) the number of discordant pairs is n(n-1)/2 - P:

$$R = \frac{P - \left(\frac{n(n-1)}{2} - P\right)}{\frac{n(n-1)}{2}} = \frac{4P}{n(n-1)} - 1$$
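
The formula can be checked with a direct pair count. The following Python sketch (the name `kendall_r` is illustrative) counts the concordant pairs P by brute force and then applies the formula above. It assumes there are no tied values, so that every pair that is not concordant is discordant.

```python
from itertools import combinations


def kendall_r(xs, ys):
    """Kendall coefficient by counting concordant pairs (assumes no ties)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    total_pairs = n * (n - 1) // 2
    concordant = 0  # this is P in the formula
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        # A pair is concordant when both attributes order the two objects the same way.
        if (x1 - x2) * (y1 - y2) > 0:
            concordant += 1
    discordant = total_pairs - concordant  # valid only without ties
    return (concordant - discordant) / total_pairs
```

For the made-up rankings `[1, 2, 3, 4, 5]` and `[2, 1, 4, 3, 5]`, 8 of the 10 pairs are concordant, so the result is (8 - 2) / 10 = 0.6, which matches 4P/(n(n-1)) - 1. The double loop is O(n²), which is fine for small samples.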

Applicability

The Kendall correlation coefficient places the same requirements on the data as the Spearman correlation coefficient.

 Third, Spearman correlation

The Spearman correlation coefficient is a non-parametric measure of the dependence between two variables. It assesses how well the relationship between two statistical variables can be described by a monotonic function. If no data value is repeated and the two variables are perfectly monotonically related, the Spearman correlation coefficient is +1 or -1.

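The Spearman correlation coefficient is defined as the Pearson correlation coefficient between the rank variables. For a sample of size n, the n raw data values $X_i, Y_i$ are converted to ranks $x_i, y_i$, and the correlation coefficient $\rho$ is

$$\rho = \frac{\sum_{i}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i}(x_i-\bar{x})^2\,\sum_{i}(y_i-\bar{y})^2}}$$

or, when no ranks are tied, with $d_i = x_i - y_i$,

$$\rho = 1 - \frac{6\sum_{i} d_i^2}{n(n^2-1)}$$

Each raw value is assigned a rank according to its average position in the descending order of the whole data set.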
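
The rank conversion and the coefficient can be sketched in plain Python as follows. The helper names are illustrative. `average_ranks` ranks in descending order and gives tied values the average of their positions, as described above. `spearman_rho` applies the Pearson formula to the ranks, and `spearman_rho_no_ties` uses the standard shortcut 1 - 6Σd²/(n(n²-1)), which agrees with it only when there are no ties.

```python
import math


def average_ranks(values):
    """Descending ranks (largest value gets rank 1); ties share their average position."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho_no_ties(xs, ys):
    """Shortcut formula based on rank differences; assumes no tied values."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

For example, the made-up data `[10, 20, 30, 40, 50]` and `[1, 8, 27, 64, 125]` are perfectly monotonic but not linear, so `spearman_rho` returns 1 (up to rounding) even though the Pearson coefficient of the raw values is below 1.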

 

 Fourth, choosing among the three correlation coefficients

http://www.datasoldier.net/archives/716


 

Extended:
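Covariance is used in probability theory and statistics to measure how two variables vary together, that is, their joint deviation from their means.
For two real random variables X and Y with expected values E[X] and E[Y], the covariance Cov(X, Y) between X and Y is defined as:

$$\operatorname{Cov}(X,Y) = E\big[(X - E[X])(Y - E[Y])\big] = E[XY] - E[X]\,E[Y]$$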
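
As a small illustration, the sketch below computes the covariance of paired data in Python, treating the observed pairs as the whole population with equal probability (so it divides by n; a sample estimate would divide by n - 1 instead). It uses the standard definition Cov(X, Y) = E[(X - E[X])(Y - E[Y])] and also the equivalent form E[XY] - E[X]E[Y]. The function names are illustrative.

```python
def covariance(xs, ys):
    """Population covariance: mean of (x - E[X]) * (y - E[Y]) over the pairs."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty paired samples of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def covariance_shortcut(xs, ys):
    """Equivalent form: E[XY] - E[X] * E[Y]."""
    n = len(xs)
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return mean_xy - (sum(xs) / n) * (sum(ys) / n)
```

A positive result means the two variables tend to lie on the same side of their means, a negative result means they tend to lie on opposite sides.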
Expectation: the mathematical expectation (or mean, also simply called the expectation) of a random variable is the sum, over every possible outcome of a trial, of the outcome multiplied by its probability.
Let C be a constant and let X and Y be two random variables. The following are important properties of the mathematical expectation:
1. E[C] = C
2. E[CX] = C E[X]
3. E[X + Y] = E[X] + E[Y]
4. When X and Y are independent, E[XY] = E[X] E[Y]
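
The standard properties of expectation (constant, scaling, additivity, and the product rule for independent variables) can be verified exactly on a small discrete example. In the Python sketch below, the distributions `X` (a fair coin) and `Y` (a fair die) and the helper `expectation` are hypothetical choices made for illustration. The joint distribution is built as a product, which is what independence means here.

```python
from fractions import Fraction
from itertools import product


def expectation(dist, f=lambda v: v):
    """E[f(V)] for a discrete distribution given as {outcome: probability}."""
    return sum(f(v) * p for v, p in dist.items())


# Hypothetical example distributions, using exact fractions.
X = {0: Fraction(1, 2), 1: Fraction(1, 2)}      # fair coin
Y = {k: Fraction(1, 6) for k in range(1, 7)}    # fair six-sided die

# Independence: the joint probability is the product of the marginals.
joint = {(x, y): px * py
         for (x, px), (y, py) in product(X.items(), Y.items())}

C = 3
e_x = expectation(X)                                   # E[X]
e_y = expectation(Y)                                   # E[Y]
e_const = expectation(X, lambda v: C)                  # E[C]
e_scaled = expectation(X, lambda v: C * v)             # E[CX]
e_sum = expectation(joint, lambda v: v[0] + v[1])      # E[X + Y]
e_prod = expectation(joint, lambda v: v[0] * v[1])     # E[XY]
```

With these distributions, `e_const == C`, `e_scaled == C * e_x`, `e_sum == e_x + e_y` and `e_prod == e_x * e_y` all hold exactly. The last equality relies on independence and can fail for dependent variables.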
The standard deviation, often also called the mean square deviation, is the square root of the arithmetic mean of the squared deviations from the mean, and is denoted σ:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2}$$
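
A minimal Python sketch of this definition follows. The name `population_std` is illustrative, and the function divides by N, matching the "arithmetic mean of the squared deviations" wording (a sample estimate would typically divide by N - 1).

```python
import math


def population_std(values):
    """Square root of the mean squared deviation from the mean."""
    if not values:
        raise ValueError("need at least one value")
    n = len(values)
    mean = sum(values) / n
    mean_squared_deviation = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(mean_squared_deviation)
```

For the made-up data `[2, 4, 4, 4, 5, 5, 7, 9]`, the mean is 5, the mean squared deviation is 4, and the standard deviation is 2.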

 


Origin www.cnblogs.com/icase/p/11244591.html