Covariance measures whether two variables tend to deviate from their respective means in the same direction at the same time.
If the two variables are positively correlated, then in the covariance formula most of the summation terms, one for each sample pair (Xi, Yi), are positive; that is, the two values deviate from their respective means in the same direction. Some pairs deviate in opposite directions, but they are fewer, so when there are many samples the sum is positive. The picture below is very intuitive. The following is reproduced from: http://blog.csdn.net/wuhzossibility/article/details/8087863
In probability theory, the relationship between two random variables X and Y generally falls into one of the following three cases:
When the joint distribution of X and Y is as shown in the figure above, we can see that, roughly, the larger X is, the larger Y is, and the smaller X is, the smaller Y is. This case is called "positive correlation".
When the joint distribution of X and Y is as shown in the figure above, we can see that, roughly, the larger X is, the smaller Y is, and the smaller X is, the larger Y is. This case is called "negative correlation".
How can these three cases be expressed with a single number?
In region (1) of the figure, X-EX>0 and Y-EY>0, so (X-EX)(Y-EY)>0;
In region (2) of the figure, X-EX<0 and Y-EY>0, so (X-EX)(Y-EY)<0;
In region (3) of the figure, X-EX<0 and Y-EY<0, so (X-EX)(Y-EY)>0;
In region (4) of the figure, X-EX>0 and Y-EY<0, so (X-EX)(Y-EY)<0.
When X and Y are positively correlated, most of their distribution lies in regions (1) and (3) and only a small part in regions (2) and (4), so on average E[(X-EX)(Y-EY)]>0.
When X and Y are negatively correlated, most of their distribution lies in regions (2) and (4) and only a small part in regions (1) and (3), so on average E[(X-EX)(Y-EY)]<0.
When X and Y are uncorrelated, the positive products from regions (1) and (3) and the negative products from regions (2) and (4) balance out, so on average E[(X-EX)(Y-EY)]=0.
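The region argument above can be checked with a small simulation. The following is only a rough sketch: the function name `quadrant_shares`, the synthetic data, and the noise scale 0.5 are all assumptions made for illustration and do not come from the original article. It counts how many sample points fall in regions (1) and (3), where the product of deviations is positive, versus regions (2) and (4), where it is negative.

```python
import random

def quadrant_shares(xs, ys):
    """Return the share of points in regions (1)+(3) and in regions (2)+(4)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    same = 0      # regions (1) and (3): both deviations have the same sign
    opposite = 0  # regions (2) and (4): the deviations have opposite signs
    for x, y in zip(xs, ys):
        product = (x - mean_x) * (y - mean_y)
        if product > 0:
            same += 1
        elif product < 0:
            opposite += 1
    n = len(xs)
    return same / n, opposite / n

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.gauss(0, 1) for _ in range(10_000)]
noise = [rng.gauss(0, 1) for _ in range(10_000)]

# Hypothetical data sets, one per case described in the text.
ys_pos = [x + 0.5 * e for x, e in zip(xs, noise)]    # tends to rise with x
ys_neg = [-x + 0.5 * e for x, e in zip(xs, noise)]   # tends to fall as x rises
ys_unc = noise                                        # generated independently of x

for label, ys in [("positive", ys_pos), ("negative", ys_neg), ("uncorrelated", ys_unc)]:
    same, opposite = quadrant_shares(xs, ys)
    print(f"{label:>12}: regions (1)+(3) = {same:.2f}, regions (2)+(4) = {opposite:.2f}")
```

With data like this, the positively correlated set should put most of its points in regions (1) and (3), the negatively correlated set in regions (2) and (4), and the independent set should split roughly evenly.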
So we can define a numerical feature that represents the relationship between X and Y, namely the covariance:
cov(X, Y) = E[(X-EX)(Y-EY)]
When cov(X, Y)>0, it means that X and Y are positively correlated;
When cov(X, Y)<0, it indicates that X and Y are negatively correlated;
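When cov(X, Y)=0, it means that X and Y are uncorrelated, that is, they have no linear relationship (this alone does not guarantee that they are independent).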
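As a minimal sketch of how the sign of the covariance behaves on actual numbers, the snippet below averages the products of deviations over a few tiny data sets. The function name `sample_covariance` and the data are made up for illustration. It divides by n to mirror the expectation in the definition; many libraries divide by n - 1 instead, which changes the magnitude but not the sign.

```python
def sample_covariance(xs, ys):
    """Average of (x - mean_x) * (y - mean_y) over all sample pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Hypothetical data, chosen only to show the three signs.
xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 5, 4, 5]      # tends to rise with x
ys_down = [5, 4, 3, 2, 1]    # falls as x rises
ys_curve = [4, 1, 0, 1, 4]   # y = (x - 3)^2: depends on x, but not linearly

print(sample_covariance(xs, ys_up))     # positive
print(sample_covariance(xs, ys_down))   # negative
print(sample_covariance(xs, ys_curve))  # zero
```

The last data set is fully determined by x yet has zero covariance, which is why a covariance of zero only rules out a linear relationship.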
That's what covariance means.