Linear Algebra | Mathematics Fundamentals for Machine Learning

Foreword

Linear algebra is a branch of mathematics concerned with vector spaces and linear maps between them. It includes the study of lines, planes, and subspaces, as well as the general properties common to all vector spaces.

This article introduces the core concepts of linear algebra used in machine learning, for readers who want to fill gaps while studying or who need a quick reference.

Linear Algebra

Determinants

1. Expansion of a determinant by a row (or column)

(1) Let $A = (a_{ij})_{n \times n}$. Then:

$$a_{i1}A_{j1} + a_{i2}A_{j2} + \cdots + a_{in}A_{jn} = \begin{cases}|A|, & i = j \\ 0, & i \neq j\end{cases}$$

$$a_{1i}A_{1j} + a_{2i}A_{2j} + \cdots + a_{ni}A_{nj} = \begin{cases}|A|, & i = j \\ 0, & i \neq j\end{cases}$$

From these, $AA^{*} = A^{*}A = |A|E$, where the adjugate matrix is

$$A^{*} = \begin{pmatrix} A_{11} & A_{21} & \ldots & A_{n1} \\ A_{12} & A_{22} & \ldots & A_{n2} \\ \ldots & \ldots & \ldots & \ldots \\ A_{1n} & A_{2n} & \ldots & A_{nn} \end{pmatrix} = (A_{ji}) = (A_{ij})^{T}$$


(2) If $A, B$ are square matrices of order $n$, then $|AB| = |A|\,|B| = |B|\,|A| = |BA|$, but $|A \pm B| = |A| \pm |B|$ does not necessarily hold.

(3) $|kA| = k^{n}|A|$, where $A$ is a square matrix of order $n$.

(4) If $A$ is a square matrix of order $n$, then $|A^{T}| = |A|$; $|A^{-1}| = |A|^{-1}$ (if $A$ is invertible); and $|A^{*}| = |A|^{n-1}$ for $n \geq 2$.

(5) For square matrices $A, B$:

$$\begin{vmatrix} A & O \\ O & B \end{vmatrix} = \begin{vmatrix} A & C \\ O & B \end{vmatrix} = \begin{vmatrix} A & O \\ C & B \end{vmatrix} = |A|\,|B|,$$

but

$$\begin{vmatrix} O & A_{m \times m} \\ B_{n \times n} & O \end{vmatrix} = (-1)^{mn}|A|\,|B|$$

(6) Vandermonde determinant:

$$D_{n} = \begin{vmatrix} 1 & 1 & \ldots & 1 \\ x_{1} & x_{2} & \ldots & x_{n} \\ \ldots & \ldots & \ldots & \ldots \\ x_{1}^{n-1} & x_{2}^{n-1} & \ldots & x_{n}^{n-1} \end{vmatrix} = \prod_{1 \leq j < i \leq n}(x_{i} - x_{j})$$
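The Vandermonde identity is easy to check numerically. The sketch below (NumPy, with arbitrary illustrative node values) compares the determinant computed directly against the pairwise product:

```python
import numpy as np
from itertools import combinations

# Nodes x_1, ..., x_n (arbitrary distinct values chosen for illustration)
x = np.array([1.0, 2.0, 4.0, 7.0])
n = len(x)

# Vandermonde matrix with rows 1, x, x^2, ..., x^{n-1}
V = np.vstack([x**k for k in range(n)])

# Determinant equals the product of (x_i - x_j) over all 1 <= j < i <= n
det_direct = np.linalg.det(V)
det_product = np.prod([x[i] - x[j] for j, i in combinations(range(n), 2)])

print(det_direct, det_product)  # both ≈ 540.0 for these nodes
```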

(7) If $A$ is a square matrix of order $n$ and $\lambda_{i}\ (i = 1, 2, \cdots, n)$ are the $n$ eigenvalues of $A$, then

$$|A| = \prod_{i=1}^{n}\lambda_{i}$$

Matrices

Matrix: a table of $m \times n$ numbers $a_{ij}$ arranged in $m$ rows and $n$ columns,

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$

is called a matrix, abbreviated $A$ or $(a_{ij})_{m \times n}$. If $m = n$, $A$ is called a matrix of order $n$, or a square matrix of order $n$.

Linear operations on matrices

1. Matrix addition

Let $A = (a_{ij})$ and $B = (b_{ij})$ be two $m \times n$ matrices. Then the $m \times n$ matrix $C = (c_{ij})$, where $c_{ij} = a_{ij} + b_{ij}$, is called the sum of $A$ and $B$, denoted $A + B = C$.

2. Scalar multiplication

Let $A = (a_{ij})$ be an $m \times n$ matrix and $k$ a constant. Then the $m \times n$ matrix $(ka_{ij})$ is called the scalar product of $k$ and $A$, denoted $kA$.

3. Matrix multiplication

Let $A = (a_{ij})$ be an $m \times n$ matrix and $B = (b_{ij})$ an $n \times s$ matrix. Then the $m \times s$ matrix $C = (c_{ij})$, where

$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj},$$

is called the product of $A$ and $B$, written $C = AB$.
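The element-wise definition of the product can be checked against NumPy's built-in `@` operator. A minimal sketch with small illustrative matrices:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])        # 2 x 3
B = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])            # 3 x 2

# c_ij = sum_k a_ik * b_kj, computed explicitly from the definition
m, n = A.shape
_, s = B.shape
C = np.zeros((m, s))
for i in range(m):
    for j in range(s):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

assert np.allclose(C, A @ B)  # matches NumPy's built-in product
```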

4. Relationships among $A^{T}$, $A^{-1}$, and $A^{*}$

(1) $(A^{T})^{T} = A$, $(AB)^{T} = B^{T}A^{T}$, $(kA)^{T} = kA^{T}$, $(A \pm B)^{T} = A^{T} \pm B^{T}$

(2) $(A^{-1})^{-1} = A$, $(AB)^{-1} = B^{-1}A^{-1}$, $(kA)^{-1} = \frac{1}{k}A^{-1}$;

however, $(A \pm B)^{-1} = A^{-1} \pm B^{-1}$ does not necessarily hold.

(3) $(A^{*})^{*} = |A|^{n-2}A\ \ (n \geq 3)$, $(AB)^{*} = B^{*}A^{*}$, $(kA)^{*} = k^{n-1}A^{*}\ \ (n \geq 2)$;

however, $(A \pm B)^{*} = A^{*} \pm B^{*}$ does not necessarily hold.

(4) $(A^{-1})^{T} = (A^{T})^{-1}$, $(A^{-1})^{*} = (A^{*})^{-1}$, $(A^{*})^{T} = (A^{T})^{*}$

5. Conclusions about $A^{*}$

(1) $AA^{*} = A^{*}A = |A|E$

(2) $|A^{*}| = |A|^{n-1}\ (n \geq 2)$, $(kA)^{*} = k^{n-1}A^{*}$, $(A^{*})^{*} = |A|^{n-2}A\ (n \geq 3)$

(3) If $A$ is invertible, then $A^{*} = |A|A^{-1}$ and $(A^{*})^{-1} = \frac{1}{|A|}A$.

(4) If $A$ is a square matrix of order $n$, then:

$$r(A^{*}) = \begin{cases} n, & r(A) = n \\ 1, & r(A) = n-1 \\ 0, & r(A) < n-1 \end{cases}$$
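NumPy has no built-in adjugate, but for an invertible matrix the identity $A^{*} = |A|A^{-1}$ from (3) gives one directly. A sketch (the helper `adjugate` and the sample matrix are illustrative; the formula fails for singular $A$):

```python
import numpy as np

def adjugate(A):
    """Adjugate via A* = |A| A^{-1}; valid only when A is invertible."""
    return np.linalg.det(A) * np.linalg.inv(A)

A = np.array([[2., 1.],
              [3., 4.]])
A_star = adjugate(A)

# Verify A A* = A* A = |A| E
assert np.allclose(A @ A_star, np.linalg.det(A) * np.eye(2))
assert np.allclose(A_star @ A, np.linalg.det(A) * np.eye(2))
```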

6. Conclusions about $A^{-1}$

$A$ is invertible $\Leftrightarrow$ there exists $B$ such that $AB = E$; $\Leftrightarrow |A| \neq 0$; $\Leftrightarrow r(A) = n$;

$\Leftrightarrow A$ can be expressed as a product of elementary matrices; $\Leftrightarrow Ax = 0$ has only the zero solution.
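Several of these equivalences can be verified numerically at once. A NumPy sketch (the $2 \times 2$ matrix is an arbitrary invertible example; the smallest singular value tests whether $Ax = 0$ has only the zero solution):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 1.]])
n = A.shape[0]

det_nonzero = not np.isclose(np.linalg.det(A), 0.0)
full_rank   = np.linalg.matrix_rank(A) == n
# Ax = 0 has only the zero solution iff the null space is trivial,
# i.e. the smallest singular value is nonzero
only_zero_solution = np.linalg.svd(A, compute_uv=False).min() > 1e-12

assert det_nonzero == full_rank == only_zero_solution == True
```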

7. Conclusions about the rank of matrices

(1) $r(A)$ = row rank = column rank;

(2) $r(A_{m \times n}) \leq \min(m, n)$;

(3) $A \neq O \Rightarrow r(A) \geq 1$;

(4) $r(A \pm B) \leq r(A) + r(B)$;

(5) Elementary transformations do not change the rank of a matrix;

(6) $r(A) + r(B) - n \leq r(AB) \leq \min(r(A), r(B))$; in particular, if $AB = O$,
then $r(A) + r(B) \leq n$.

(7) If $A^{-1}$ exists, then $r(AB) = r(B)$; if $B^{-1}$ exists, then $r(AB) = r(A)$;

if $r(A_{m \times n}) = n$, then $r(AB) = r(B)$; if $r(B_{n \times s}) = n$, then $r(AB) = r(A)$.

(8) $r(A_{m \times n}) = n \Leftrightarrow Ax = 0$ has only the zero solution.
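The rank inequalities in (6) can be checked on random matrices with `numpy.linalg.matrix_rank`. A sketch (the shapes and seed are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # 4 x 3
B = rng.standard_normal((3, 5))   # 3 x 5, inner dimension n = 3

rA  = np.linalg.matrix_rank(A)
rB  = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)

# Sylvester's inequality (lower bound) and the min upper bound from (6)
assert rA + rB - 3 <= rAB <= min(rA, rB)
```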

8. Block inversion formula

$$\begin{pmatrix} A & O \\ O & B \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & O \\ O & B^{-1} \end{pmatrix}, \qquad \begin{pmatrix} A & C \\ O & B \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & -A^{-1}CB^{-1} \\ O & B^{-1} \end{pmatrix}$$

$$\begin{pmatrix} A & O \\ C & B \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & O \\ -B^{-1}CA^{-1} & B^{-1} \end{pmatrix}, \qquad \begin{pmatrix} O & A \\ B & O \end{pmatrix}^{-1} = \begin{pmatrix} O & B^{-1} \\ A^{-1} & O \end{pmatrix}$$

Here $A$ and $B$ are invertible square matrices.
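The lower-triangular block formula can be verified with `numpy.block`. A sketch (the block sizes and entries are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 1.]])           # 2 x 2, invertible
B = np.array([[3.]])               # 1 x 1, invertible
C = np.array([[1., 0.]])           # 1 x 2
O = np.zeros((2, 1))

M = np.block([[A, O],
              [C, B]])

# Lower-triangular block inverse from the formula above
M_inv = np.block([[np.linalg.inv(A),                          O],
                  [-np.linalg.inv(B) @ C @ np.linalg.inv(A),  np.linalg.inv(B)]])

assert np.allclose(M @ M_inv, np.eye(3))
```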

Vectors

1. Linear representation of vector groups

(1) $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}$ are linearly dependent $\Leftrightarrow$ at least one of the vectors can be linearly represented by the others.

(2) If $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}$ are linearly independent and $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}, \beta$ are linearly dependent, then $\beta$ can be uniquely linearly represented by $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}$.

(3) $\beta$ can be linearly represented by $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}$
$\Leftrightarrow r(\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}) = r(\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}, \beta)$

2. Linear dependence of vector sets

(1) If a subset of the vectors is linearly dependent, the whole set is linearly dependent; if the whole set is linearly independent, every subset is linearly independent.

(2) ① $n$ $n$-dimensional vectors $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}$ are linearly independent $\Leftrightarrow |[\alpha_{1}\ \alpha_{2}\ \cdots\ \alpha_{n}]| \neq 0$; $n$ $n$-dimensional vectors $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}$ are linearly dependent
$\Leftrightarrow |[\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}]| = 0$.

② Any $n + 1$ $n$-dimensional vectors are linearly dependent.

③ If $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}$ are linearly independent, they remain linearly independent after appending components to each vector; conversely, a linearly dependent set of vectors remains linearly dependent after removing components.


4. The relationship between the rank of the vector group and the rank of the matrix

Let $r(A_{m \times n}) = r$. Then the rank $r(A)$ determines the linear dependence of the row and column vector sets of $A$ as follows:

(1) If $r(A_{m \times n}) = r = m$, the row vectors of $A$ are linearly independent.

(2) If $r(A_{m \times n}) = r < m$, the row vectors of $A$ are linearly dependent.

(3) If $r(A_{m \times n}) = r = n$, the column vectors of $A$ are linearly independent.

(4) If $r(A_{m \times n}) = r < n$, the column vectors of $A$ are linearly dependent.
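These four cases can be observed on a single matrix with full row rank but deficient column rank. A NumPy sketch (the matrix is an arbitrary illustrative example):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [0., 1., 4.]])       # 2 x 3

r = np.linalg.matrix_rank(A)
m, n = A.shape

assert r == m   # r = m: the two rows are linearly independent
assert r < n    # r < n: the three columns are linearly dependent

# Indeed, column 3 is a combination of columns 1 and 2: c3 = -5*c1 + 4*c2
assert np.allclose(A[:, 2], -5 * A[:, 0] + 4 * A[:, 1])
```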

5. Basis transformation formula and transition matrix of an $n$-dimensional vector space

Let $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}$ and $\beta_{1}, \beta_{2}, \cdots, \beta_{n}$ be two bases of a vector space $V$. Then the basis transformation formula is:

$$(\beta_{1}, \beta_{2}, \cdots, \beta_{n}) = (\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n})\begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix} = (\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n})C$$

where $C$ is an invertible matrix, called the transition matrix from the basis $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}$ to the basis $\beta_{1}, \beta_{2}, \cdots, \beta_{n}$.

6. Coordinate transformation formula

If the vector $\gamma$ has coordinates $X = (x_{1}, x_{2}, \cdots, x_{n})^{T}$ in the basis $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}$ and coordinates $Y = (y_{1}, y_{2}, \cdots, y_{n})^{T}$ in the basis $\beta_{1}, \beta_{2}, \cdots, \beta_{n}$, that is,

$$\gamma = x_{1}\alpha_{1} + x_{2}\alpha_{2} + \cdots + x_{n}\alpha_{n} = y_{1}\beta_{1} + y_{2}\beta_{2} + \cdots + y_{n}\beta_{n},$$

then the coordinate transformation formulas are $X = CY$ and $Y = C^{-1}X$, where $C$ is the transition matrix from the basis $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}$ to the basis $\beta_{1}, \beta_{2}, \cdots, \beta_{n}$.
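A small change-of-basis computation in $\mathbb{R}^2$ illustrates the formulas (both bases are arbitrary illustrative choices, stored as matrix columns):

```python
import numpy as np

# Two bases of R^2, stored as columns
alpha = np.array([[1., 0.],
                  [0., 1.]])       # standard basis
beta  = np.array([[1., 1.],
                  [0., 1.]])

# Transition matrix C with (beta) = (alpha) C  =>  C = alpha^{-1} beta
C = np.linalg.inv(alpha) @ beta

X = np.array([3., 2.])             # coordinates of gamma in basis alpha
Y = np.linalg.inv(C) @ X           # Y = C^{-1} X: coordinates in basis beta

# The same vector gamma reconstructed from either basis
assert np.allclose(alpha @ X, beta @ Y)
```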

7. Inner product of vectors

$(\alpha, \beta) = a_{1}b_{1} + a_{2}b_{2} + \cdots + a_{n}b_{n} = \alpha^{T}\beta = \beta^{T}\alpha$

8. Gram-Schmidt orthogonalization

If $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}$ are linearly independent, one can construct $\beta_{1}, \beta_{2}, \cdots, \beta_{s}$ that are pairwise orthogonal, with each $\beta_{i}$ a linear combination of $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{i}$ $(i = 1, 2, \cdots, s)$. Normalizing each $\beta_{i}$, $\gamma_{i} = \frac{\beta_{i}}{|\beta_{i}|}$, then gives a set of orthonormal vectors $\gamma_{1}, \gamma_{2}, \cdots, \gamma_{s}$. Here

$$\beta_{1} = \alpha_{1}, \qquad \beta_{2} = \alpha_{2} - \frac{(\alpha_{2}, \beta_{1})}{(\beta_{1}, \beta_{1})}\beta_{1}, \qquad \beta_{3} = \alpha_{3} - \frac{(\alpha_{3}, \beta_{1})}{(\beta_{1}, \beta_{1})}\beta_{1} - \frac{(\alpha_{3}, \beta_{2})}{(\beta_{2}, \beta_{2})}\beta_{2}, \qquad \cdots$$

$$\beta_{s} = \alpha_{s} - \frac{(\alpha_{s}, \beta_{1})}{(\beta_{1}, \beta_{1})}\beta_{1} - \frac{(\alpha_{s}, \beta_{2})}{(\beta_{2}, \beta_{2})}\beta_{2} - \cdots - \frac{(\alpha_{s}, \beta_{s-1})}{(\beta_{s-1}, \beta_{s-1})}\beta_{s-1}$$
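The procedure above translates almost line by line into code. A minimal NumPy sketch (the helper name and the two sample vectors are illustrative):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    ortho = []
    for a in vectors:
        b = a.astype(float)
        for q in ortho:
            b = b - (a @ q) / (q @ q) * q   # subtract projection onto q
        ortho.append(b)
    return [b / np.linalg.norm(b) for b in ortho]

vs = [np.array([1., 1., 0.]), np.array([1., 0., 1.])]
q1, q2 = gram_schmidt(vs)

assert np.isclose(q1 @ q2, 0.0)             # pairwise orthogonal
assert np.isclose(np.linalg.norm(q1), 1.0)  # unit length
```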

9. Orthogonal bases and orthonormal bases

If the vectors in a basis of a vector space are pairwise orthogonal, it is called an orthogonal basis; if in addition each vector in an orthogonal basis is a unit vector, it is called an orthonormal basis.

Systems of linear equations

1. Cramer's rule

For the linear system

$$\begin{cases} a_{11}x_{1} + a_{12}x_{2} + \cdots + a_{1n}x_{n} = b_{1} \\ a_{21}x_{1} + a_{22}x_{2} + \cdots + a_{2n}x_{n} = b_{2} \\ \qquad\cdots\cdots \\ a_{n1}x_{1} + a_{n2}x_{2} + \cdots + a_{nn}x_{n} = b_{n} \end{cases}$$

if the coefficient determinant $D = |A| \neq 0$, then the system has a unique solution

$$x_{1} = \frac{D_{1}}{D},\ x_{2} = \frac{D_{2}}{D},\ \cdots,\ x_{n} = \frac{D_{n}}{D},$$

where $D_{j}$ is the determinant obtained from $D$ by replacing the elements of its $j$-th column with the constant column on the right-hand side of the system.
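Cramer's rule translates directly into code by building each $D_j$ through column replacement. A sketch (the helper name and the $2 \times 2$ system are illustrative; the rule requires $|A| \neq 0$):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule; requires |A| != 0."""
    D = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b               # replace column j with the constant column
        x[j] = np.linalg.det(Aj) / D
    return x

A = np.array([[2., 1.],
              [1., 3.]])
b = np.array([3., 5.])

assert np.allclose(cramer(A, b), np.linalg.solve(A, b))
```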

2. An $n$-th order matrix $A$ is invertible $\Leftrightarrow Ax = 0$ has only the zero solution $\Leftrightarrow$ for every $b$, $Ax = b$ has a unique solution. In general, $r(A_{m \times n}) = n \Leftrightarrow Ax = 0$ has only the zero solution.

3. Necessary and sufficient conditions for a non-homogeneous linear system to be solvable; properties and structure of the solutions

(1) Let $A$ be an $m \times n$ matrix. If $r(A_{m \times n}) = m$, then for $Ax = b$ we have $r(A) = r(A \vdots b) = m$, so $Ax = b$ has a solution.

(2) Let $x_{1}, x_{2}, \cdots, x_{s}$ be solutions of $Ax = b$. Then $k_{1}x_{1} + k_{2}x_{2} + \cdots + k_{s}x_{s}$ is still a solution of $Ax = b$ when $k_{1} + k_{2} + \cdots + k_{s} = 1$; when $k_{1} + k_{2} + \cdots + k_{s} = 0$, it is a solution of $Ax = 0$. In particular, $\frac{x_{1} + x_{2}}{2}$ is a solution of $Ax = b$, and $2x_{3} - (x_{1} + x_{2})$ is a solution of $Ax = 0$.

(3) The non-homogeneous linear system $Ax = b$ has no solution $\Leftrightarrow r(A) + 1 = r(\overline{A}) \Leftrightarrow b$ cannot be linearly represented by the column vectors $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{n}$ of $A$.

4. Fundamental solution systems and general solutions of homogeneous linear systems; the solution space; general solutions of non-homogeneous linear systems

(1) A homogeneous system $Ax = 0$ always has a solution (the zero solution always exists). When it has non-zero solutions, since any linear combination of solution vectors is still a solution of the homogeneous system, the set of all solution vectors of $Ax = 0$ forms a vector space, called the solution space of the system. The dimension of the solution space is $n - r(A)$, and a basis of the solution space is called a fundamental solution system of the homogeneous system.

(2) $\eta_{1}, \eta_{2}, \cdots, \eta_{t}$ is a fundamental solution system of $Ax = 0$ if:

  1. $\eta_{1}, \eta_{2}, \cdots, \eta_{t}$ are solutions of $Ax = 0$;

  2. $\eta_{1}, \eta_{2}, \cdots, \eta_{t}$ are linearly independent;

  3. every solution of $Ax = 0$ can be linearly represented by $\eta_{1}, \eta_{2}, \cdots, \eta_{t}$.

Then $k_{1}\eta_{1} + k_{2}\eta_{2} + \cdots + k_{t}\eta_{t}$ is the general solution of $Ax = 0$, where $k_{1}, k_{2}, \cdots, k_{t}$ are arbitrary constants.
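A fundamental solution system is a basis of the null space, which can be computed from the SVD. A sketch (the helper `null_space_basis` and the rank-1 sample matrix are illustrative; the dimension check follows the $n - r(A)$ formula above):

```python
import numpy as np

def null_space_basis(A, tol=1e-12):
    """Basis of the solution space of Ax = 0 via SVD (returned as columns)."""
    _, s, Vt = np.linalg.svd(A)
    rank = int((s > tol).sum())
    return Vt[rank:].T             # n - r(A) basis vectors

A = np.array([[1., 2., 1.],
              [2., 4., 2.]])       # r(A) = 1, so the solution space has dim 3 - 1 = 2
N = null_space_basis(A)

assert N.shape[1] == 3 - np.linalg.matrix_rank(A)
assert np.allclose(A @ N, 0.0)     # every basis vector solves Ax = 0
```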

Eigenvalues and Eigenvectors of a Matrix

1. Concepts and properties of eigenvalues and eigenvectors of a matrix

(1) Let $\lambda$ be an eigenvalue of $A$. Then $kA$, $aA + bE$, $A^{2}$, $A^{m}$, $f(A)$, $A^{T}$, $A^{-1}$, $A^{*}$ have eigenvalues
$k\lambda$, $a\lambda + b$, $\lambda^{2}$, $\lambda^{m}$, $f(\lambda)$, $\lambda$, $\lambda^{-1}$, $\frac{|A|}{\lambda}$ respectively, with the same corresponding eigenvectors ($A^{T}$ excepted).

(2) If $\lambda_{1}, \lambda_{2}, \cdots, \lambda_{n}$ are the $n$ eigenvalues of $A$, then $\sum_{i=1}^{n}\lambda_{i} = \sum_{i=1}^{n}a_{ii}$ and $\prod_{i=1}^{n}\lambda_{i} = |A|$. Consequently, $|A| \neq 0 \Leftrightarrow A$ has no zero eigenvalue.

(3) If $\lambda_{1}, \lambda_{2}, \cdots, \lambda_{s}$ are $s$ eigenvalues of $A$ with corresponding eigenvectors $\alpha_{1}, \alpha_{2}, \cdots, \alpha_{s}$,

and $\alpha = k_{1}\alpha_{1} + k_{2}\alpha_{2} + \cdots + k_{s}\alpha_{s}$,

then:

$$A^{n}\alpha = k_{1}A^{n}\alpha_{1} + k_{2}A^{n}\alpha_{2} + \cdots + k_{s}A^{n}\alpha_{s} = k_{1}\lambda_{1}^{n}\alpha_{1} + k_{2}\lambda_{2}^{n}\alpha_{2} + \cdots + k_{s}\lambda_{s}^{n}\alpha_{s}$$
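The trace/determinant identities in (2) and the eigenvalue of $A^{-1}$ from (1) can be confirmed with `numpy.linalg.eig`. A sketch (the symmetric $2 \times 2$ matrix is an arbitrary illustrative example):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])
lams, vecs = np.linalg.eig(A)

# Sum of eigenvalues = trace, product of eigenvalues = determinant
assert np.isclose(lams.sum(), np.trace(A))
assert np.isclose(lams.prod(), np.linalg.det(A))

# A^{-1} has eigenvalue 1/lambda with the same eigenvector
v = vecs[:, 0]
assert np.allclose(np.linalg.inv(A) @ v, v / lams[0])
```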

2. Concept and properties of similarity transformation and similarity matrix

(1) If $A \sim B$, then:

  1. $A^{T} \sim B^{T}$, $A^{-1} \sim B^{-1}$, $A^{*} \sim B^{*}$

  2. $|A| = |B|$, $\sum_{i=1}^{n}a_{ii} = \sum_{i=1}^{n}b_{ii}$, $r(A) = r(B)$

  3. $|\lambda E - A| = |\lambda E - B|$ holds for all $\lambda$

3. Necessary and sufficient conditions for a matrix to be diagonalizable by similarity

(1) Let $A$ be a square matrix of order $n$. Then $A$ is diagonalizable $\Leftrightarrow$ for each eigenvalue $\lambda_{i}$ of multiplicity $k_{i}$, we have $n - r(\lambda_{i}E - A) = k_{i}$.

(2) If $A$ is diagonalizable, then from $P^{-1}AP = \Lambda$ we get $A = P\Lambda P^{-1}$, and hence $A^{n} = P\Lambda^{n}P^{-1}$.
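The formula $A^{n} = P\Lambda^{n}P^{-1}$ is the standard trick for computing matrix powers. A NumPy sketch (the matrix and exponent are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])
lams, P = np.linalg.eig(A)         # columns of P are eigenvectors: A P = P Lambda

k = 5
A_k = P @ np.diag(lams**k) @ np.linalg.inv(P)   # A^k = P Lambda^k P^{-1}

assert np.allclose(A_k, np.linalg.matrix_power(A, k))
```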

(3) Important conclusions

  1. If $A \sim B$ and $C \sim D$, then $\begin{bmatrix} A & O \\ O & C \end{bmatrix} \sim \begin{bmatrix} B & O \\ O & D \end{bmatrix}$.

  2. If $A \sim B$, then $f(A) \sim f(B)$ and $|f(A)| = |f(B)|$, where $f(A)$ is a polynomial in the $n$-th order square matrix $A$.

  3. If $A$ is a diagonalizable matrix, then the number of its non-zero eigenvalues (counted with multiplicity) equals $r(A)$.

4. Eigenvalues, eigenvectors and similar diagonal matrices of real symmetric matrices

(1) Similar matrices: Let $A, B$ be two square matrices of order $n$. If there exists an invertible matrix $P$ such that $B = P^{-1}AP$, then $A$ is said to be similar to $B$, written $A \sim B$.

(2) Properties of similar matrices: if $A \sim B$, then:

  1. $A^{T} \sim B^{T}$

  2. $A^{-1} \sim B^{-1}$ (if $A$ and $B$ are invertible)

  3. $A^{k} \sim B^{k}$ ($k$ a positive integer)

  4. $|\lambda E - A| = |\lambda E - B|$, so $A$ and $B$ have the same eigenvalues

  5. $|A| = |B|$, so $A$ and $B$ are either both invertible or both singular

  6. $r(A) = r(B)$ and $|\lambda E - A| = |\lambda E - B|$ together do not imply similarity: $A, B$ are not necessarily similar

Quadratic Forms

1. A quadratic homogeneous function of $n$ variables $x_{1}, x_{2}, \cdots, x_{n}$,

$$f(x_{1}, x_{2}, \cdots, x_{n}) = \sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}x_{i}x_{j}, \quad \text{where } a_{ij} = a_{ji}\ (i, j = 1, 2, \cdots, n),$$

is called an $n$-ary quadratic form, or simply a quadratic form. Let

$$x = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}, \qquad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}.$$

Then the quadratic form $f$ can be written in matrix-vector form as $f = x^{T}Ax$. Here $A$ is called the matrix of the quadratic form; since $a_{ij} = a_{ji}\ (i, j = 1, 2, \cdots, n)$, the matrix of a quadratic form is always symmetric, and quadratic forms correspond one-to-one with symmetric matrices. The rank of $A$ is called the rank of the quadratic form.
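The correspondence between a quadratic form and its symmetric matrix is easy to see in code: each cross term $c\,x_i x_j$ is split symmetrically as $a_{ij} = a_{ji} = c/2$. A sketch (the coefficients and evaluation point are illustrative):

```python
import numpy as np

# Quadratic form f(x1, x2) = x1^2 + 4 x1 x2 + 3 x2^2.
# Split the cross term symmetrically: a12 = a21 = 4 / 2 = 2.
A = np.array([[1., 2.],
              [2., 3.]])

def f(x):
    return x @ A @ x               # f = x^T A x

x = np.array([1., 2.])
assert np.isclose(f(x), 1 + 4*2 + 3*4)   # 1 + 8 + 12 = 21
```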

2. Inertia theorem, standard form and canonical form of quadratic form

(1) Inertia theorem

For any quadratic form, no matter which invertible (congruence) transformation is chosen to reduce it to a standard form containing only square terms, the positive and negative inertia indices are independent of the chosen transformation. This is the inertia theorem.

(2) Standard form

Under an invertible linear substitution $x = Cy$, the quadratic form $f = f(x_{1}, x_{2}, \cdots, x_{n}) = x^{T}Ax$ becomes $f = x^{T}Ax = y^{T}(C^{T}AC)y$.

If $f = \sum_{i=1}^{r}d_{i}y_{i}^{2}$ $(r \leq n)$, this is called a standard form of $f$. Over a general number field the standard form of a quadratic form is not unique; it depends on the substitution, but the number of square terms with non-zero coefficients is uniquely determined by $r(A)$.

(3) Canonical form

Any real quadratic form $f$ can be reduced to the canonical form $f = z_{1}^{2} + z_{2}^{2} + \cdots + z_{p}^{2} - z_{p+1}^{2} - \cdots - z_{r}^{2}$, where $r$ is the rank of $A$, $p$ is the positive inertia index, $r - p$ is the negative inertia index, and the canonical form is unique.

3. Reducing a quadratic form to standard form by orthogonal transformation or by completing squares; positive definiteness of a quadratic form and its matrix

If $A$ is positive definite, then $kA\ (k > 0)$, $A^{T}$, $A^{-1}$, $A^{*}$ are positive definite; $|A| > 0$ and $A$ is invertible; $a_{ii} > 0$ and $|A_{ii}| > 0$.

If $A$ and $B$ are positive definite, then $A + B$ is positive definite, but $AB$ and $BA$ are not necessarily positive definite.

$A$ is positive definite $\Leftrightarrow f(x) = x^{T}Ax > 0$ for all $x \neq 0$

$\Leftrightarrow$ all leading principal minors of $A$ are greater than zero

$\Leftrightarrow$ all eigenvalues of $A$ are greater than zero

$\Leftrightarrow$ the positive inertia index of $A$ is $n$

$\Leftrightarrow$ there exists an invertible matrix $P$ such that $A = P^{T}P$

$\Leftrightarrow$ there exists an orthogonal matrix $Q$ such that

$$Q^{T}AQ = Q^{-1}AQ = \begin{pmatrix} \lambda_{1} & & \\ & \ddots & \\ & & \lambda_{n} \end{pmatrix},$$

where $\lambda_{i} > 0,\ i = 1, 2, \cdots, n$.
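Three of these criteria can be tested numerically side by side: positive eigenvalues, Sylvester's leading-principal-minor test, and the fact that Cholesky factorization succeeds exactly for symmetric positive definite matrices. A sketch (the sample matrix is an arbitrary positive definite example):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])

# Criterion 1: all eigenvalues > 0 (eigvalsh is for symmetric matrices)
eig_pd = bool(np.all(np.linalg.eigvalsh(A) > 0))

# Criterion 2: all leading principal minors > 0 (Sylvester's criterion)
minors_pd = all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, A.shape[0] + 1))

# Criterion 3: Cholesky succeeds iff A is (symmetric) positive definite
try:
    np.linalg.cholesky(A)
    chol_pd = True
except np.linalg.LinAlgError:
    chol_pd = False

assert eig_pd == minors_pd == chol_pd == True
```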

Overall framework

(Mind map: Linear Algebra, operations and properties.)

Reference articles

Basic Concepts of Linear Algebra for Machine Learning Mathematical Foundations of Machine Learning (itdiffer.com)

Linear Algebra in Machine Learning - Zhihu (zhihu.com)

Basic Knowledge of Linear Algebra-Mind Map_Linear Algebra Mind Map_Arrow's Blog-CSDN Blog

Recommended reading

[Mathematical Basis of Machine Learning] (1) Linear Algebra (Part 1)_linear algebra for everyone csdn_Binary Artificial Intelligence Blog-CSDN Blog

[Mathematical Basis of Machine Learning] (2) Linear Algebra (Medium)_Binary Artificial Intelligence Blog-CSDN Blog

[Mathematical basis of machine learning] (3) Linear Algebra (Linear Algebra) (below)_ordered basis line generation_Binary Artificial Intelligence Blog-CSDN Blog

The most complete knowledge points of linear algebra for the postgraduate entrance examination to sort out the mind map - Zhihu (zhihu.com)

LQLab: Coding Learning Writing — LQLab



Origin blog.csdn.net/m0_63748493/article/details/132062009