Covariance matrix and correlation coefficient matrix

Foreword

  This post introduces variance, covariance, and the correlation coefficient, then builds on them to explain the covariance matrix and the correlation coefficient matrix, with worked examples.

1. Variance, covariance and correlation coefficient

  In "Probability Theory and Mathematical Statistics", variance measures the degree of dispersion of a single random variable $X$, denoted $DX$, and is computed as:

$$DX = E(X-EX)^2 = EX^2 - (EX)^2$$

For a sample, the corresponding estimate is:

$$\sigma^2(x) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

  That is, variance = (expectation of the square) minus (square of the expectation).
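The identity above is easy to check numerically. A minimal sketch with NumPy (the data values are made up for illustration): computing `mean(x**2) - mean(x)**2` matches NumPy's population variance, while the sample formula with the $n-1$ denominator corresponds to `ddof=1`.

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Variance as "expectation of the square minus the square of the expectation"
var_identity = np.mean(x**2) - np.mean(x)**2

# NumPy's population variance (ddof=0, divide by n) matches the identity
assert np.isclose(var_identity, np.var(x))

# The unbiased sample variance divides by n-1 instead (ddof=1)
sample_var = np.var(x, ddof=1)
```

Note the two denominators: the expectation identity with plain sample means yields the divide-by-$n$ estimate, while the formula in the text divides by $n-1$ to remove bias.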

  Covariance measures how similarly two random variables $X$ and $Y$ vary, denoted $Cov(X, Y)$, and is defined as:

$$Cov(X,Y) = E[(X-EX)\cdot(Y-EY)] = E(XY) - EX \cdot EY$$

The sample estimate is:

$$\sigma(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

  Reading the formula directly: the covariance takes each variable's deviation from its own expectation, multiplies the two deviations, and then takes the expectation of that product. When one variable is above its expectation whenever the other also tends to be above its expectation (the two variables move in the same direction), the covariance is positive. Conversely, when one variable tends to be above its expectation while the other is below its own, the covariance is negative.
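The sign behavior can be demonstrated with synthetic data. A minimal sketch (variable names and noise levels are assumptions for illustration): `y_same` moves with `x`, `y_opp` moves against it.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

y_same = 2 * x + rng.normal(scale=0.1, size=1000)   # same trend as x
y_opp = -2 * x + rng.normal(scale=0.1, size=1000)   # opposite trend to x

# Sample covariance: mean of the product of deviations from each mean
def cov(a, b):
    return np.mean((a - a.mean()) * (b - b.mean()))

same_sign = cov(x, y_same) > 0   # same trend -> positive covariance
opp_sign = cov(x, y_opp) < 0     # opposite trend -> negative covariance
```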

  The correlation coefficient, also called the Pearson correlation coefficient, measures the degree of linear correlation between two random variables $X$ and $Y$, denoted $\rho_{XY}$. It is computed as:

$$\rho_{XY} = \frac{Cov(X,Y)}{\sqrt{DX}\sqrt{DY}}$$

  If $\rho_{XY} > 0$, $X$ and $Y$ are positively correlated;
  if $\rho_{XY} < 0$, $X$ and $Y$ are negatively correlated;
  if $\rho_{XY} = 0$, $X$ and $Y$ are uncorrelated (note that uncorrelated is weaker than independent: independence implies $\rho_{XY} = 0$, but the converse does not hold in general);
  if $\rho_{XY} = \pm 1$, $X$ and $Y$ are perfectly linearly related.

  The correlation coefficient can also be regarded as a special, standardized covariance: dividing by the two standard deviations removes the effect of each variable's units and range of variation, so $\rho_{XY}$ reflects only how similarly the two variables change per unit.
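The claim that standardization removes the effect of units can be checked directly. A minimal sketch (the height/weight numbers are made up for illustration): rescaling heights from metres to centimetres multiplies the covariance by 100 but leaves the correlation coefficient unchanged.

```python
import numpy as np

heights_m = np.array([1.60, 1.65, 1.70, 1.75, 1.80])
weights_kg = np.array([55.0, 60.0, 66.0, 70.0, 78.0])

# Pearson correlation: covariance divided by both standard deviations
def corr(a, b):
    ca = a - a.mean()
    cb = b - b.mean()
    return (ca @ cb) / np.sqrt((ca @ ca) * (cb @ cb))

r = corr(heights_m, weights_kg)
# Changing units (metres -> centimetres) does not change rho,
# even though the covariance itself grows by the same factor of 100
r_cm = corr(heights_m * 100, weights_kg)
```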

2. Covariance matrix

  In practice, we rarely describe an object with only one or two dimensions. For example, to characterize a neural network model we may consider its size, accuracy, inference time, and other dimensions. When analyzing such multidimensional data, the pairwise degree of correlation between dimensions is described by the covariance matrix: the off-diagonal elements are the covariances between pairs of dimensions, and the elements on the main diagonal are the variances of the data along each dimension.
  The expression for the covariance matrix is:

$$\Sigma = \begin{bmatrix} \sigma(x_1, x_1) & \dots & \sigma(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \sigma(x_n, x_1) & \dots & \sigma(x_n, x_n) \end{bmatrix}$$
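NumPy computes this matrix directly with `np.cov`, which by default expects one row per dimension and uses the $n-1$ denominator. A minimal sketch with made-up data:

```python
import numpy as np

# Three observations of a 2-dimensional variable, one row per dimension
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 6.0, 8.0]])

S = np.cov(data)  # 2x2 covariance matrix

# Diagonal entries are the per-dimension (sample) variances
assert np.isclose(S[0, 0], np.var(data[0], ddof=1))
assert np.isclose(S[1, 1], np.var(data[1], ddof=1))

# The matrix is symmetric: sigma(x_i, x_j) == sigma(x_j, x_i)
assert np.isclose(S[0, 1], S[1, 0])
```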

3. Correlation coefficient matrix

  As the name implies, this is the matrix of pairwise correlation coefficients, also called the correlation matrix, and every element of the matrix lies in the range [-1, 1].
  The expression of the correlation coefficient matrix is:

$$C = \begin{bmatrix} \rho(x_1, x_1) & \dots & \rho(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \rho(x_n, x_1) & \dots & \rho(x_n, x_n) \end{bmatrix} = \begin{bmatrix} 1 & \dots & \rho(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \rho(x_n, x_1) & \dots & 1 \end{bmatrix}$$
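NumPy's `np.corrcoef` builds this matrix, and its relation to the covariance matrix (each covariance divided by the product of the two standard deviations) can be verified numerically. A minimal sketch with made-up data:

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 4.0, 5.0, 9.0],
                 [8.0, 6.0, 4.0, 2.0]])

C = np.corrcoef(data)   # 3x3 correlation coefficient matrix
S = np.cov(data)        # corresponding covariance matrix

# Main diagonal is all ones; every entry lies in [-1, 1]
assert np.allclose(np.diag(C), 1.0)
assert np.all((C >= -1.0) & (C <= 1.0))

# Each entry is the covariance divided by the product of std. deviations
d = np.sqrt(np.diag(S))
assert np.allclose(C, S / np.outer(d, d))
```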

Origin blog.csdn.net/qq_42730750/article/details/122600973