The most commonly used matrix and vector derivative formulas in machine learning


1 Trace of a matrix

Matrix trace: For a square matrix A of order n, the trace of A is the sum of the elements on its main diagonal, that is, tr(A) = Σ_{i=1}^{n} a_ii. Properties of the trace:

(1) tr(A^T) = tr(A);

(2) tr(A + B) = tr(A) + tr(B);

(3) tr(AB) = tr(BA);

(4) tr(ABC) = tr(BCA) = tr(CAB).
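The four identities can be checked numerically. The sketch below uses plain Python lists and small integer matrices chosen only for illustration; the helper names (matmul, transpose, trace, add) are assumptions of this example, not part of the original text. It also shows that property (4) covers cyclic permutations only: tr(ACB) is in general a different number.

```python
# Numerical check of the trace identities on small integer matrices.
# All helpers and test matrices are illustrative assumptions.

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
C = [[2, -1], [1, 3]]

prop1 = trace(transpose(A)) == trace(A)               # (1)
prop2 = trace(add(A, B)) == trace(A) + trace(B)       # (2)
prop3 = trace(matmul(A, B)) == trace(matmul(B, A))    # (3)

abc = trace(matmul(matmul(A, B), C))
bca = trace(matmul(matmul(B, C), A))
cab = trace(matmul(matmul(C, A), B))
prop4 = abc == bca == cab                             # (4)

# A non-cyclic reordering is not covered by property (4).
acb = trace(matmul(matmul(A, C), B))

print(prop1, prop2, prop3, prop4, abc, acb)
```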


2 Properties of the determinant

Properties of the determinant: Suppose A and B are square matrices of order n, and c is a constant. Then:

(1) |cA| = c^n |A|;

(2) |A^T| = |A|;

(3) |AB| = |A| · |B|;

(4) If A is an invertible matrix, then |A^{-1}| = 1/|A|;

(5) |A^k| = |A|^k for any positive integer k.
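As a quick sanity check, the sketch below verifies the five properties exactly for the 2×2 case (n = 2). The helpers det2, matmul2 and inv2 are hypothetical names introduced for this example, the matrices are arbitrary, and the check is limited to order 2 for brevity. Property (5) is checked for the third power.

```python
from fractions import Fraction

def det2(M):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul2(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Exact inverse of a 2x2 matrix using rational arithmetic."""
    d = Fraction(det2(M))
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]

n = 2                  # order of the matrices
c = 3                  # scalar constant
A = [[1, 2], [3, 4]]   # det(A) = -2
B = [[0, 5], [6, 7]]   # det(B) = -30

cA = [[c * a for a in row] for row in A]
At = [list(col) for col in zip(*A)]
A_cubed = matmul2(matmul2(A, A), A)

checks = {
    "scaling": det2(cA) == c ** n * det2(A),               # (1)
    "transpose": det2(At) == det2(A),                      # (2)
    "product": det2(matmul2(A, B)) == det2(A) * det2(B),   # (3)
    "inverse": det2(inv2(A)) == Fraction(1, det2(A)),      # (4)
    "power": det2(A_cubed) == det2(A) ** 3,                # (5)
}
print(checks)
```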


3 The derivative of a vector with respect to a scalar and the derivative of a scalar with respect to a vector

The derivative of a vector α with respect to a scalar x, and the derivative of a scalar x with respect to a vector α, are both vectors. Their i-th components are, respectively:

(1) (∂α/∂x)_i = ∂α_i/∂x;

(2) (∂x/∂α)_i = ∂x/∂α_i.
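Both definitions are component-wise, so they can be illustrated with finite differences. In the sketch below, the vector function alpha(x) and the scalar function s(a) are hypothetical examples chosen for this illustration (s stands in for the scalar in definition (2)). The derivatives are estimated by central differences and compared with the hand-computed components.

```python
import math

def alpha(x):
    """Hypothetical vector function of a scalar: alpha(x) = (x^2, sin x, 3x)."""
    return [x * x, math.sin(x), 3.0 * x]

def d_alpha_dx(x, h=1e-6):
    """Central-difference estimate of the vector with components d(alpha_i)/dx."""
    plus, minus = alpha(x + h), alpha(x - h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]

def s(a):
    """Hypothetical scalar function of a vector: s(a) = a_1 * a_2 + a_3^2."""
    return a[0] * a[1] + a[2] ** 2

def ds_da(a, h=1e-6):
    """Central-difference estimate of the vector with components ds/d(a_i)."""
    grad = []
    for i in range(len(a)):
        up = list(a)
        dn = list(a)
        up[i] += h
        dn[i] -= h
        grad.append((s(up) - s(dn)) / (2 * h))
    return grad

# Definition (1): derivative of a vector with respect to a scalar.
x0 = 0.5
numeric_vec = d_alpha_dx(x0)
exact_vec = [2 * x0, math.cos(x0), 3.0]

# Definition (2): derivative of a scalar with respect to a vector.
a0 = [1.0, 2.0, 3.0]
numeric_grad = ds_da(a0)
exact_grad = [a0[1], a0[0], 2 * a0[2]]

print(numeric_vec, exact_vec)
print(numeric_grad, exact_grad)
```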


4 The derivative of a matrix with respect to a scalar and the derivative of a scalar with respect to a matrix

The derivative of a matrix A with respect to a scalar x, and the derivative of a scalar x with respect to a matrix A, are both matrices. Their elements in the i-th row and j-th column are, respectively:

(1) (∂A/∂x)_ij = ∂A_ij/∂x;

(2) (∂x/∂A)_ij = ∂x/∂A_ij.
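As a small worked example, both definitions can be applied element by element. The matrix A(x) and the scalar function s(A) below are chosen purely for illustration and do not come from the original text.

```latex
% Illustrative 2x2 example; A(x) and s(A) are assumptions made for this sketch.
A(x) = \begin{pmatrix} x & x^2 \\ 1 & e^x \end{pmatrix}
\quad\Longrightarrow\quad
\frac{\partial A}{\partial x} = \begin{pmatrix} 1 & 2x \\ 0 & e^x \end{pmatrix}

% Scalar function of a matrix: the sum of squared entries.
s(A) = \sum_{i,j} A_{ij}^2
\quad\Longrightarrow\quad
\left(\frac{\partial s}{\partial A}\right)_{ij} = \frac{\partial s}{\partial A_{ij}} = 2A_{ij},
\qquad
\frac{\partial s}{\partial A} = 2A
```

In both cases the result has the same shape as the matrix A.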


5 The derivative of the function f(x) with respect to the vector x

The derivative of the function f(x) with respect to the vector x: Assuming that the function f(x) is differentiable with respect to each element of the vector x (twice differentiable for the second derivative), then:

(1) The first derivative of f(x) with respect to the vector x is a vector, the gradient, whose i-th component is: (∇f(x))_i = ∂f(x)/∂x_i;

(2) The second derivative of f(x) with respect to the vector x is a square matrix called the Hessian matrix, whose element in the i-th row and j-th column is: (∇^2 f(x))_ij = ∂^2 f(x)/(∂x_i ∂x_j).
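To make the gradient and the Hessian concrete, the sketch below estimates both by central differences for a hypothetical function f(x) = x_1^2 x_2 + x_2^3, chosen only for this example, and compares them with the hand-derived values at the point (1, 2). The function names and step sizes are assumptions of this sketch.

```python
def f(x):
    """Hypothetical test function f(x) = x_1^2 * x_2 + x_2^3."""
    return x[0] ** 2 * x[1] + x[1] ** 3

def step(n, i, h):
    """Vector of length n with h in position i and zeros elsewhere."""
    return [h if k == i else 0.0 for k in range(n)]

def shifted(x, d):
    """Return x + d, component by component."""
    return [xi + di for xi, di in zip(x, d)]

def gradient(func, x, h=1e-5):
    """Central-difference estimate of (grad f)_i = df/dx_i."""
    n = len(x)
    return [(func(shifted(x, step(n, i, h))) - func(shifted(x, step(n, i, -h))))
            / (2 * h) for i in range(n)]

def hessian(func, x, h=1e-4):
    """Central-difference estimate of H_ij = d^2 f / (dx_i dx_j)."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            pp = func(shifted(shifted(x, step(n, i, h)), step(n, j, h)))
            pm = func(shifted(shifted(x, step(n, i, h)), step(n, j, -h)))
            mp = func(shifted(shifted(x, step(n, i, -h)), step(n, j, h)))
            mm = func(shifted(shifted(x, step(n, i, -h)), step(n, j, -h)))
            H[i][j] = (pp - pm - mp + mm) / (4 * h * h)
    return H

x0 = [1.0, 2.0]
grad_numeric = gradient(f, x0)
hess_numeric = hessian(f, x0)

# Hand-derived values: grad = (2*x1*x2, x1^2 + 3*x2^2), H = [[2*x2, 2*x1], [2*x1, 6*x2]].
grad_exact = [4.0, 13.0]
hess_exact = [[4.0, 2.0], [2.0, 12.0]]

print(grad_numeric)
print(hess_numeric)
```

The estimated Hessian is symmetric up to rounding, as expected for a function with continuous second partial derivatives.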


6 Derivatives of vectors and matrices satisfy the product rule

Derivatives of vectors and matrices satisfy the product rule. Here α is a constant vector, and in (2) the variable x is a scalar:

(1) ∂(x^T α)/∂x = ∂(α^T x)/∂x = α;

(2) ∂(AB)/∂x = (∂A/∂x) · B + A · (∂B/∂x).
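Both rules follow from writing the products in components and applying the ordinary scalar rules. A short derivation, assuming α does not depend on x in (1) and that x is a scalar in (2):

```latex
% (1): expand the inner product and differentiate with respect to x_i (alpha constant).
x^T \alpha = \alpha^T x = \sum_k x_k \alpha_k
\;\Longrightarrow\;
\left(\frac{\partial\, x^T \alpha}{\partial x}\right)_i
= \frac{\partial}{\partial x_i} \sum_k x_k \alpha_k = \alpha_i

% (2): apply the scalar product rule to each entry of AB (x is a scalar).
(AB)_{ij} = \sum_k A_{ik} B_{kj}
\;\Longrightarrow\;
\frac{\partial (AB)_{ij}}{\partial x}
= \sum_k \frac{\partial A_{ik}}{\partial x} B_{kj}
+ \sum_k A_{ik} \frac{\partial B_{kj}}{\partial x}
= \left(\frac{\partial A}{\partial x} B + A \frac{\partial B}{\partial x}\right)_{ij}
```

The order of the factors in (2) must be preserved, because matrix multiplication is not commutative.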


7 Derivative of the inverse matrix

The derivative of the inverse matrix: for an invertible matrix A that depends on a scalar x, ∂A^{-1}/∂x = -A^{-1} · (∂A/∂x) · A^{-1}.
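This formula can be obtained by differentiating A · A^{-1} = I with the product rule and solving for ∂A^{-1}/∂x. The sketch below checks it numerically for a hypothetical 2×2 matrix function A(x), comparing a central-difference derivative of the inverse with the right-hand side of the formula. The names A, dA, inv2 and matmul are assumptions of this example.

```python
def A(x):
    """Hypothetical matrix function of a scalar x (invertible near x = 0.3)."""
    return [[1.0 + x, x], [x * x, 2.0]]

def dA(x):
    """Element-wise derivative of A(x) with respect to x."""
    return [[1.0, 1.0], [2.0 * x, 0.0]]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

x0, h = 0.3, 1e-6

# Left-hand side: central difference of the inverse, element by element.
inv_plus, inv_minus = inv2(A(x0 + h)), inv2(A(x0 - h))
numeric = [[(inv_plus[i][j] - inv_minus[i][j]) / (2 * h) for j in range(2)]
           for i in range(2)]

# Right-hand side: -A^{-1} (dA/dx) A^{-1}.
A_inv = inv2(A(x0))
formula = [[-v for v in row] for row in matmul(matmul(A_inv, dA(x0)), A_inv)]

max_err = max(abs(numeric[i][j] - formula[i][j])
              for i in range(2) for j in range(2))
print(max_err)
```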


8 Derivative formulas for the trace of a matrix

Derivative formulas for the trace of a matrix:

(1) ∂tr(AB)/∂A_ij = B_ji;

(2) ∂tr(AB)/∂A = B^T;

(3) ∂tr(A^T B)/∂A = B;

(4) ∂tr(A)/∂A = I, where I is the identity matrix;

(5) ∂tr(A B A^T)/∂A = A · (B + B^T).
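These formulas can be checked by perturbing one entry A_ij at a time, which is exactly what the element-wise definition of ∂x/∂A prescribes. The sketch below does this for formulas (2), (3) and (5) with arbitrary 2×2 matrices. The helper names, including grad_wrt_matrix, are hypothetical and introduced only for this example.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def grad_wrt_matrix(func, A, h=1e-6):
    """Central-difference estimate of the matrix with entries d func / d A_ij."""
    G = [[0.0] * len(A[0]) for _ in A]
    for i in range(len(A)):
        for j in range(len(A[0])):
            up = [row[:] for row in A]
            dn = [row[:] for row in A]
            up[i][j] += h
            dn[i][j] -= h
            G[i][j] = (func(up) - func(dn)) / (2 * h)
    return G

def max_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 5.0], [6.0, 7.0]]

# (2) d tr(AB) / dA = B^T
err2 = max_diff(grad_wrt_matrix(lambda M: trace(matmul(M, B)), A), transpose(B))

# (3) d tr(A^T B) / dA = B
err3 = max_diff(grad_wrt_matrix(lambda M: trace(matmul(transpose(M), B)), A), B)

# (5) d tr(A B A^T) / dA = A (B + B^T)
B_sym = [[b + bt for b, bt in zip(rb, rbt)] for rb, rbt in zip(B, transpose(B))]
err5 = max_diff(
    grad_wrt_matrix(lambda M: trace(matmul(matmul(M, B), transpose(M))), A),
    matmul(A, B_sym),
)

print(err2, err3, err5)
```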


9 The chain rule for derivatives

The chain rule: The chain rule is an important tool for computing derivatives of composite functions. If f(x) = g(h(x)), then:

∂f(x)/∂x = (∂g(h(x))/∂h(x)) · (∂h(x)/∂x)
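For scalar functions the chain rule reads f'(x) = g'(h(x)) · h'(x). A minimal sketch for the scalar case, using the hypothetical choice g(u) = sin(u) and h(x) = x^2 (neither comes from the original text), compares the chain-rule value with a central-difference estimate:

```python
import math

def h(x):
    """Inner function h(x) = x^2 (illustrative choice)."""
    return x * x

def g(u):
    """Outer function g(u) = sin(u) (illustrative choice)."""
    return math.sin(u)

def f(x):
    """Composite function f(x) = g(h(x))."""
    return g(h(x))

def f_prime_chain(x):
    """Chain rule: g'(h(x)) * h'(x) = cos(x^2) * 2x."""
    return math.cos(h(x)) * 2.0 * x

def f_prime_numeric(x, step=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + step) - f(x - step)) / (2 * step)

x0 = 0.7
chain_value = f_prime_chain(x0)
numeric_value = f_prime_numeric(x0)
print(chain_value, numeric_value)
```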


10 The most commonly used matrix derivative formulas

The most commonly used matrix derivative formulas:

(1) ∂(x^T A x)/∂x = (A + A^T) x;

(2) The original post shows a further formula here as an image, which is not available. According to the author, it is a special case of formula (1) and is correct only when W is a symmetric matrix, that is, W^T = W. In that case formula (1) reduces to ∂(x^T W x)/∂x = 2Wx.
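The role of the symmetry condition can be seen numerically. The sketch below, with arbitrarily chosen matrices and hypothetical helper names, compares a finite-difference gradient of the quadratic form x^T M x with the general expression (M + M^T) x from formula (1) and with the shortcut 2Mx, which agrees only when M is symmetric.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def quad(M, v):
    """Quadratic form v^T M v."""
    return sum(vi * wi for vi, wi in zip(v, matvec(M, v)))

def numeric_grad(M, v, h=1e-6):
    """Central-difference gradient of v^T M v with respect to v."""
    grad = []
    for i in range(len(v)):
        up = list(v)
        dn = list(v)
        up[i] += h
        dn[i] -= h
        grad.append((quad(M, up) - quad(M, dn)) / (2 * h))
    return grad

def general_formula(M, v):
    """(M + M^T) v, formula (1)."""
    Mt = [list(col) for col in zip(*M)]
    return [a + b for a, b in zip(matvec(M, v), matvec(Mt, v))]

def symmetric_shortcut(M, v):
    """2 M v, valid only when M is symmetric."""
    return [2.0 * a for a in matvec(M, v)]

def close(u, v, tol=1e-5):
    return all(abs(a - b) < tol for a, b in zip(u, v))

x = [1.0, 2.0]
A = [[1.0, 2.0], [3.0, 4.0]]   # not symmetric
W = [[2.0, 1.0], [1.0, 3.0]]   # symmetric

general_ok = close(numeric_grad(A, x), general_formula(A, x))
shortcut_fails = not close(numeric_grad(A, x), symmetric_shortcut(A, x))
shortcut_ok_symmetric = close(numeric_grad(W, x), symmetric_shortcut(W, x))

print(general_ok, shortcut_fails, shortcut_ok_symmetric)
```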


END

Origin blog.csdn.net/qq_40061206/article/details/113797236