【Mathematics】【Matrix】Trace and related properties

Many mathematical properties are hard to keep in memory and end up being re-derived every time they are needed. To stop wasting time this way, I have decided to write each one up thoroughly whenever I use it, so that the notes help me later and can benefit readers as well.
Everything in this article has been carefully derived and checked, but my ability is limited and mistakes are inevitable. If you find a problem, please point it out. Thank you!

1. Definition of trace

In linear algebra, the sum of the elements on the main diagonal of an order-$n$ square matrix (i.e. an $n \times n$ matrix) ${\bf A}$ is called the trace of ${\bf A}$, written ${\rm tr}({\bf A})$.

Note that the trace is defined only for square matrices; a matrix that is not square has no trace. In MATLAB, calling trace(A) returns the trace of the matrix A directly, but calling trace on a non-square matrix raises an error stating that the matrix must be square.
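The same behaviour can be sketched outside MATLAB. The Python snippet below is only an illustration under assumptions of my own: the helper `trace`, its list-of-rows input format, and its error message are hypothetical names chosen here, not MATLAB's implementation.

```python
def trace(matrix):
    """Return the sum of the main-diagonal entries of a square matrix.

    `matrix` is a list of rows. A ValueError is raised when the matrix is
    not square, mirroring the MATLAB behaviour described in the text.
    """
    n_rows = len(matrix)
    if any(len(row) != n_rows for row in matrix):
        raise ValueError("matrix must be square")
    return sum(matrix[i][i] for i in range(n_rows))


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(trace(A))  # 1 + 5 + 9 = 15

try:
    trace([[1, 2, 3], [4, 5, 6]])  # 2 x 3, so not square
except ValueError as err:
    print("error:", err)
```

Rejecting non-square input explicitly is a design choice that follows the definition; a routine that silently summed the available diagonal entries would hide the mistake.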

2. Basic properties of trace operations

(1) Transposition does not change the trace: ${\rm tr}({\bf A}^{\rm T}) = {\rm tr}({\bf A})$
(2) The trace is a linear operation: ${\rm tr}(a{\bf A}+b{\bf B}) = a\cdot{\rm tr}({\bf A}) + b\cdot{\rm tr}({\bf B})$
(3) Cyclically permuting the factors of a matrix product does not change the trace, provided every product involved is defined and square:
${\rm tr}({\bf AB})={\rm tr}({\bf BA})$
${\rm tr}({\bf ABC})={\rm tr}({\bf CAB})={\rm tr}({\bf BCA})$
Only cyclic shifts are allowed: in general ${\rm tr}({\bf ABC}) \neq {\rm tr}({\bf ACB})$.
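A quick numerical spot check of the three properties can be sketched as follows. It is not a proof. The helpers `trace`, `transpose`, `matmul`, and `lincomb`, as well as the example matrices, are hypothetical choices made for this illustration; integer entries are used so that the comparisons are exact.

```python
def trace(m):
    """Sum of the main-diagonal entries of a square matrix (list of rows)."""
    return sum(m[i][i] for i in range(len(m)))


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def lincomb(a, x, b, y):
    """Entrywise a*x + b*y for two matrices of the same shape."""
    return [[a * p + b * q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [1, 3]]

# (1) transposition leaves the trace unchanged
assert trace(transpose(A)) == trace(A)

# (2) linearity, here with a = 3 and b = -2
assert trace(lincomb(3, A, -2, B)) == 3 * trace(A) - 2 * trace(B)

# (3) cyclic permutations leave the trace unchanged
assert trace(matmul(A, B)) == trace(matmul(B, A))
abc = trace(matmul(matmul(A, B), C))
cab = trace(matmul(matmul(C, A), B))
bca = trace(matmul(matmul(B, C), A))
assert abc == cab == bca

# a non-cyclic reordering generally gives a different value
acb = trace(matmul(matmul(A, C), B))
print(abc, acb)  # 2 16
```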

3. Common operations combining the trace and partial derivatives

(1) $\frac{\partial\,{\rm tr}({\bf AB})}{\partial {\bf A}} = {\bf B}^{\rm T}$, $\frac{\partial\,{\rm tr}({\bf AB})}{\partial {\bf B}} = {\bf A}^{\rm T}$
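One way to see where result (1) comes from is to write the trace out in components. The sketch below rests on assumptions that the text does not state: ${\bf A}$ is $m \times n$ and ${\bf B}$ is $n \times m$ (so that ${\bf AB}$ is square), the $(i,j)$ entry of $\frac{\partial f}{\partial {\bf A}}$ is taken to be $\frac{\partial f}{\partial a_{ij}}$, and $a_{ij}$, $b_{ji}$ are entry names introduced here for illustration.

```latex
\begin{aligned}
{\rm tr}({\bf AB}) &= \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}\, b_{ji} \\
\left[ \frac{\partial\,{\rm tr}({\bf AB})}{\partial {\bf A}} \right]_{ij}
  &= \frac{\partial\,{\rm tr}({\bf AB})}{\partial a_{ij}}
   = b_{ji}
   = \left[ {\bf B}^{\rm T} \right]_{ij} \\
\left[ \frac{\partial\,{\rm tr}({\bf AB})}{\partial {\bf B}} \right]_{ji}
  &= \frac{\partial\,{\rm tr}({\bf AB})}{\partial b_{ji}}
   = a_{ij}
   = \left[ {\bf A}^{\rm T} \right]_{ji}
\end{aligned}
```

Each entry $a_{ij}$ appears in exactly one term of the double sum, which is why the derivative picks out a single entry of the other matrix.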
(2) $\frac{\partial\,{\rm tr}({\bf A}{\bf A}^{\rm T})}{\partial {\bf A}} = 2{\bf A}$
Proof (the subscript ${\rm const}$ marks a factor that is held fixed while differentiating, as in the product rule): $\frac{\partial\,{\rm tr}({\bf A}{\bf A}^{\rm T})}{\partial {\bf A}}$
$= \frac{\partial\,{\rm tr}({\bf A}_{\rm const}{\bf A}^{\rm T})}{\partial {\bf A}} + \frac{\partial\,{\rm tr}({\bf A}{\bf A}^{\rm T}_{\rm const})}{\partial {\bf A}}$
$= 2\,\frac{\partial\,{\rm tr}({\bf A}{\bf A}^{\rm T}_{\rm const})}{\partial {\bf A}}$ (by property (1) in 2, ${\rm tr}({\bf A}_{\rm const}{\bf A}^{\rm T}) = {\rm tr}({\bf A}{\bf A}^{\rm T}_{\rm const})$, so the two terms are equal)
$= 2{\bf A}$ (by result (1) in 3, with ${\bf A}^{\rm T}_{\rm const}$ in the role of ${\bf B}$)
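Result (2) can also be checked numerically with central finite differences. The snippet below is a sketch under assumptions of my own: the helper names (`f`, `numerical_gradient`, and the matrix utilities), the step size `h`, and the example matrix are hypothetical, and agreement up to rounding error is evidence rather than a proof.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def f(a):
    """The scalar function tr(A A^T)."""
    return trace(matmul(a, transpose(a)))


def numerical_gradient(func, a, h=1e-5):
    """Central-difference estimate of d func / d a[i][j] for every entry."""
    grad = [[0.0] * len(a[0]) for _ in a]
    for i in range(len(a)):
        for j in range(len(a[0])):
            plus = [row[:] for row in a]
            minus = [row[:] for row in a]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad


# A need not be square: A A^T is square for any m x n matrix A.
A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0]]

grad = numerical_gradient(f, A)

# Largest entrywise gap between the estimate and the claimed gradient 2A.
max_err = max(abs(g - 2 * a)
              for grad_row, a_row in zip(grad, A)
              for g, a in zip(grad_row, a_row))
print(max_err)  # expected to be tiny (rounding error only)
```

Because $ {\rm tr}({\bf A}{\bf A}^{\rm T})$ is quadratic in the entries of ${\bf A}$, the central difference has no truncation error, so the remaining gap comes from floating-point rounding alone.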
(3) $\frac{\partial\,{\rm tr}({\bf AB}{\bf A}^{\rm T}{\bf C})}{\partial {\bf A}} = {\bf CAB} + {\bf C}^{\rm T}{\bf A}{\bf B}^{\rm T}$
Proof (as in (2), one occurrence of ${\bf A}$ is differentiated at a time, and the subscript ${\rm const}$ marks the occurrence held fixed): $\frac{\partial\,{\rm tr}({\bf AB}{\bf A}^{\rm T}{\bf C})}{\partial {\bf A}}$
$= \frac{\partial\,{\rm tr}({\bf A}_{\rm const}{\bf B}{\bf A}^{\rm T}{\bf C})}{\partial {\bf A}} + \frac{\partial\,{\rm tr}({\bf AB}{\bf A}^{\rm T}_{\rm const}{\bf C})}{\partial {\bf A}}$
$= \frac{\partial\,{\rm tr}({\bf A}^{\rm T}{\bf C}{\bf A}_{\rm const}{\bf B})}{\partial {\bf A}} + \frac{\partial\,{\rm tr}({\bf AB}{\bf A}^{\rm T}_{\rm const}{\bf C})}{\partial {\bf A}}$ (by property (3) in 2)
$= \frac{\partial\,{\rm tr}({\bf B}^{\rm T}{\bf A}^{\rm T}_{\rm const}{\bf C}^{\rm T}{\bf A})}{\partial {\bf A}} + \frac{\partial\,{\rm tr}({\bf AB}{\bf A}^{\rm T}_{\rm const}{\bf C})}{\partial {\bf A}}$ (by property (1) in 2)
$= ({\bf B}^{\rm T}{\bf A}^{\rm T}{\bf C}^{\rm T})^{\rm T} + ({\bf B}{\bf A}^{\rm T}{\bf C})^{\rm T}$ (by result (1) in 3)
$= {\bf CAB} + {\bf C}^{\rm T}{\bf A}{\bf B}^{\rm T}$
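The final expression ${\bf CAB} + {\bf C}^{\rm T}{\bf A}{\bf B}^{\rm T}$ can be compared with a finite-difference gradient. The snippet below is a sketch under assumptions of my own: the shapes (${\bf A}$ is $2 \times 3$, ${\bf B}$ is $3 \times 3$, ${\bf C}$ is $2 \times 2$), the example values, the step size, and all helper names are hypothetical choices for this illustration.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0]]
B = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]   # deliberately not symmetric
C = [[2, 1],
     [0, 3]]      # deliberately not symmetric


def f(a):
    """The scalar function tr(A B A^T C) for the fixed B and C above."""
    return trace(matmul(matmul(matmul(a, B), transpose(a)), C))


def numerical_gradient(func, a, h=1e-5):
    """Central-difference estimate of d func / d a[i][j] for every entry."""
    grad = [[0.0] * len(a[0]) for _ in a]
    for i in range(len(a)):
        for j in range(len(a[0])):
            plus = [row[:] for row in a]
            minus = [row[:] for row in a]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad


# Closed form from the derivation: C A B + C^T A B^T
analytic = add(matmul(matmul(C, A), B),
               matmul(matmul(transpose(C), A), transpose(B)))
numeric = numerical_gradient(f, A)

max_err = max(abs(n - a)
              for n_row, a_row in zip(numeric, analytic)
              for n, a in zip(n_row, a_row))
print(max_err)  # expected to be tiny (rounding error only)
```

Choosing non-symmetric ${\bf B}$ and ${\bf C}$ matters here, because with ${\bf B}$ and ${\bf C}$ both equal to identity matrices the expression collapses to $2{\bf A}$, which is exactly result (2).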

Origin blog.csdn.net/AbaloneVH/article/details/127404326