Conquer Statistics 07|What is covariance? What is the correlation coefficient?

This article introduces covariance and the correlation coefficient.

Contents of this article

Covariance

Covariance describes the relationship between variables

Covariance VS correlation coefficient

Variance VS Covariance

Correlation coefficient

The correlation coefficient quantifies the strength of the correlation

The p-value and the amount of data indicate how reliable the correlation coefficient is

Reference


Covariance

Covariance describes the relationship between variables

  • Covariance is mainly used to describe the following three types of relationships between variables:

Positive correlation: as in the figure above, the expression of Gene X and the expression of Gene Y in the same cell are positively correlated.

In this case, the covariance is positive.

Negative correlation: as in the figure above, the expression of Gene X in the same cell is negatively correlated with the expression of Gene Y.

In this case, the covariance is negative.


No correlation: there is no trend between the expression of Gene X and the expression of Gene Y.

In this case, the covariance is zero, or close to zero for real, noisy data.
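
The three cases above can be checked with a few lines of Python. This is only a minimal sketch: the `covariance` helper uses the usual sample formula (dividing by n - 1), and the expression values are made up for illustration rather than taken from the figures.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical expression values of Gene X in five cells.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical expression values of Gene Y for the three kinds of relationship.
rising = [2.0, 4.0, 6.0, 8.0, 10.0]     # Gene Y increases with Gene X
falling = [10.0, 8.0, 6.0, 4.0, 2.0]    # Gene Y decreases as Gene X increases
no_trend = [4.0, 2.0, 3.0, 2.0, 4.0]    # Gene Y shows no overall trend

cov_rising = covariance(gene_x, rising)      # positive
cov_falling = covariance(gene_x, falling)    # negative
cov_no_trend = covariance(gene_x, no_trend)  # zero for this symmetric example

print(cov_rising, cov_falling, cov_no_trend)
```

Only the sign is meaningful here. The size of the first two values depends on the units of the data, which is the limitation discussed in the next section.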


Covariance VS correlation coefficient

  • Another use of covariance is as an intermediate step in other statistical calculations, such as the correlation coefficient.

As shown in the figure above, covariance can describe the type of correlation between variables (positive, negative, or none), but it is very sensitive to the scale of the data values, so it cannot be used to describe the degree of correlation between variables (for a positive correlation, for example, it does not tell us the slope).

The correlation coefficient is not sensitive to the scale of the data values and can be used to describe the degree of correlation between variables.
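
A small Python sketch can illustrate the difference in sensitivity. It assumes the standard sample covariance and the Pearson correlation coefficient, and the data are hypothetical. Both variables are rescaled by a factor of 10, as if the same measurements were reported in different units.

```python
import math

def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical expression values.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_y = [2.0, 1.0, 4.0, 3.0, 5.0]

# The same measurements expressed on a scale 10 times larger.
gene_x_scaled = [10 * x for x in gene_x]
gene_y_scaled = [10 * y for y in gene_y]

cov_original = covariance(gene_x, gene_y)
cov_scaled = covariance(gene_x_scaled, gene_y_scaled)   # 100 times cov_original
r_original = correlation(gene_x, gene_y)
r_scaled = correlation(gene_x_scaled, gene_y_scaled)    # unchanged by the rescaling

print(cov_original, cov_scaled)
print(r_original, r_scaled)
```

The relationship between the two genes is identical in both versions of the data, yet the covariance changes by a factor of 100 while the correlation coefficient stays the same.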


Variance VS Covariance

Variance is a special case of covariance: it describes the relationship between a variable and itself, which is easy to see from the formula in the figure below.
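
For reference, the relationship can be written out using the standard sample formulas. The notation here is an assumption, since the original figure is not reproduced: x_i and y_i are the paired observations, x̄ and ȳ are their means, and n is the number of pairs.

```latex
\operatorname{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
\qquad
\operatorname{Var}(X) = \operatorname{Cov}(X, X) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
```

Replacing Y with X in the covariance formula turns the product of two deviations into a squared deviation, which is exactly the variance.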


Correlation coefficient

The formula for the correlation coefficient is shown in the figure above. It depends on the covariance of the two variables and the variance of each variable, which is why the previous section described covariance as an intermediate step in calculating the correlation coefficient.
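
Written out, and assuming the figure shows the usual Pearson correlation coefficient, the formula takes the following form, where Cov(X, Y) is the covariance of the two variables and Var(X), Var(Y) are their variances:

```latex
r = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)}\,\sqrt{\operatorname{Var}(Y)}}
```

Dividing by the two standard deviations cancels the units of X and Y, which is why the result does not depend on the scale of the data.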

The correlation coefficient quantifies the strength of the correlation

Weak correlation corresponds to a correlation coefficient with a smaller absolute value.

Strong correlation corresponds to a correlation coefficient with a larger absolute value.

The correlation coefficient takes values in the range [-1, 1]: -1 means the strongest negative correlation, 1 means the strongest positive correlation, and 0 means no linear correlation.
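
The range and the meaning of its endpoints can be illustrated with a short Python sketch. The `pearson_r` helper is a hypothetical name for a plain implementation of the Pearson correlation coefficient, and the data points are invented for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]

r_perfect_positive = pearson_r(x, [3.0, 5.0, 7.0, 9.0, 11.0])  # all points on a rising line
r_perfect_negative = pearson_r(x, [9.0, 7.0, 5.0, 3.0, 1.0])   # all points on a falling line
r_strong = pearson_r(x, [1.0, 3.0, 2.0, 5.0, 4.0])             # points close to a rising line
r_weak = pearson_r(x, [3.0, 1.0, 4.0, 2.0, 3.0])               # only a slight upward tendency

print(r_perfect_positive, r_perfect_negative, r_strong, r_weak)
```

The two perfectly linear data sets give the extreme values 1 and -1, while the noisier data sets give values in between, with the weaker relationship closer to 0.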

The p-value and the amount of data indicate how reliable the correlation coefficient is

For a given correlation coefficient, the more data collected, the smaller the p-value and the more reliable the estimated correlation coefficient.
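
The effect of sample size on the p-value can be sketched in Python. This example is an approximation chosen for illustration, not necessarily the method behind the original figures: it tests the null hypothesis of zero correlation using the Fisher z-transformation with a normal approximation. The function name `approx_p_value` and the values of r and n are hypothetical.

```python
import math

def approx_p_value(r, n):
    """Approximate two-sided p-value for the null hypothesis of zero correlation.

    Assumption of this sketch: Fisher z-transformation with a normal
    approximation, which requires n > 3 and -1 < r < 1.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))

r = 0.5  # hypothetical observed correlation coefficient

p_small_sample = approx_p_value(r, 10)    # 10 data points
p_large_sample = approx_p_value(r, 100)   # 100 data points

print(p_small_sample, p_large_sample)
```

With the same observed correlation, the larger sample gives a much smaller p-value, so the observed correlation is much less likely to be a product of chance.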


Reference

Origin blog.csdn.net/qq_21478261/article/details/112069370