[Deep Learning Basics] Matrix Operations

Matrix Operations

What is a matrix

A matrix is a set of vectors arranged side by side: the dimension (number of features) of each vector gives the number of rows of the matrix, and the number of vectors gives the number of columns.

For example, an $n \times m$ matrix means that there are $m$ vectors, each with $n$ features:


  • This is a $2\times 3$ matrix, meaning that it consists of 3 vectors with 2 features each:

$$x=\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}$$
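As a small illustration of the column convention above, the following Python sketch places three 2-feature vectors side by side to form a $2\times 3$ matrix. The helper name `columns_to_matrix` and the sample numbers are assumptions made for this example; they do not come from the original article.

```python
def columns_to_matrix(vectors):
    """Place each vector as one column of the resulting matrix (a list of rows)."""
    n_features = len(vectors[0])
    if any(len(v) != n_features for v in vectors):
        raise ValueError("all vectors must have the same number of features")
    # Row i collects the i-th feature of every vector.
    return [[v[i] for v in vectors] for i in range(n_features)]


# Three vectors with two features each (illustrative values).
vectors = [[1, 2], [3, 4], [5, 6]]
x = columns_to_matrix(vectors)

n_rows, n_cols = len(x), len(x[0])
print(x)               # [[1, 3, 5], [2, 4, 6]]
print(n_rows, n_cols)  # 2 3
```

The number of rows equals the number of features, and the number of columns equals the number of vectors.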

Understand operations on matrices and vectors

First, let's look at the product of a square matrix and a vector:

$$w=M\times v=\begin{bmatrix} m_{11} & m_{12}\\ m_{21} & m_{22} \end{bmatrix} \times \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} =\begin{bmatrix} v_1 m_{11}+ v_2 m_{12}\\ v_1 m_{21}+ v_2 m_{22}\end{bmatrix}$$


Intuitively, you can first tip the vector $v$ over to the left so that it lies horizontally, then go down the matrix $M$ one row at a time: multiply the entries of the row by the corresponding entries of $v$ and add the products. Each row processed produces one new entry of the result, so the final product is still a vector. Its number of features equals the number of rows of $M$, and the number of vectors in the result equals the number of vectors in $v$ (here, one).
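The row-by-row procedure described above can be sketched in plain Python. The function name `matvec_by_rows` and the numbers are illustrative assumptions, not part of the original article.

```python
def matvec_by_rows(M, v):
    """Multiply matrix M (a list of rows) by vector v, one row at a time."""
    result = []
    for row in M:
        if len(row) != len(v):
            raise ValueError("row length must match vector length")
        # Multiply matching entries and add them up: one output entry per row.
        result.append(sum(m * x for m, x in zip(row, v)))
    return result


M = [[1, 2],
     [3, 4]]
v = [5, 6]

w = matvec_by_rows(M, v)
print(w)  # [17, 39]
```

The result has one entry per row of `M`, which matches the statement that the feature count of the product equals the number of rows of the matrix.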

Why multiply a matrix by a vector? The short answer: to transform the space in which the vector lives.

The essence of matrix operations: vector space transformation (linear transformation)

In a two-dimensional plane, we use the basis vectors $e_x=(1,0)$ and $e_y=(0,1)$ by default. For example, take the vector $w=5e_x+2e_y$.


Now suppose we switch to a different set of basis vectors to describe this two-dimensional space. The change must preserve the following conditions:

  • The origin position remains unchanged;
  • Parallel lines are still parallel lines after transformation;
  • A straight line is still a straight line after transformation;

A transformation that satisfies these conditions is a linear transformation. It can be pictured as a geometric distortion of the coordinate grid.
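The conditions above can be spot-checked numerically. Multiplying by a matrix keeps the origin fixed and respects addition and scaling, and these algebraic properties are what keep straight lines straight and parallel lines parallel. The sketch below is a minimal check under assumed values: the matrix `M`, the vectors `u` and `v`, and the scalar `k` are chosen only for illustration.

```python
def apply(M, v):
    """Apply the transformation given by matrix M (a list of rows) to vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


M = [[1, 3],
     [-2, 0]]   # illustrative transformation matrix
u = [2, 1]
v = [-1, 2]
k = 3

# 1. The origin stays where it is.
origin_fixed = apply(M, [0, 0]) == [0, 0]

# 2. Transforming a sum equals summing the transformed vectors.
u_plus_v = [a + b for a, b in zip(u, v)]
additive = apply(M, u_plus_v) == [a + b for a, b in zip(apply(M, u), apply(M, v))]

# 3. Transforming a scaled vector equals scaling the transformed vector.
homogeneous = apply(M, [k * a for a in u]) == [k * a for a in apply(M, u)]

print(origin_fixed, additive, homogeneous)  # True True True
```

A spot check on a few vectors is not a proof, but it shows what the conditions mean in terms of arithmetic.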


Any vector in the space can be written as a combination of the basis vectors, for example $v=-1e_x+2e_y$.

After we transform the space, the same coefficients are applied to the transformed basis vectors: ${\hat v}=-1{\hat e}_x+2{\hat e}_y$

$$e_x=(1,0),\quad e_y=(0,1)$$

$${\hat e}_x=(1,-2),\quad {\hat e}_y=(3,0)$$

$${\hat v}=-1{\hat e}_x+2{\hat e}_y=-1(1,-2)+2(3,0)=(5,2)$$

We now write the vectors as column matrices (column vectors):

$$v = -1\begin{bmatrix} 1\\0\end{bmatrix} +2\begin{bmatrix} 0\\1\end{bmatrix} =\begin{bmatrix} -1\\2\end{bmatrix}$$

$${\hat v} = -1\begin{bmatrix} 1\\-2\end{bmatrix} +2\begin{bmatrix} 3\\0\end{bmatrix} =\begin{bmatrix} 5\\2\end{bmatrix}$$
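The two computations above use the same coefficients $(-1, 2)$ with different basis vectors. A minimal Python sketch of that idea follows; the helper name `linear_combination` is an assumption for this example, while the numbers are the ones used in the text.

```python
def linear_combination(coeffs, basis):
    """Return sum(c_k * basis_k), computed component by component."""
    dim = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(dim)]


coeffs = [-1, 2]                         # v = -1*e_x + 2*e_y
standard_basis = [(1, 0), (0, 1)]        # e_x, e_y
transformed_basis = [(1, -2), (3, 0)]    # where e_x and e_y land

v = linear_combination(coeffs, standard_basis)
v_hat = linear_combination(coeffs, transformed_basis)

print(v)      # [-1, 2]
print(v_hat)  # [5, 2]
```

Only the basis changes between the two calls, which is the sense in which the transformation moves the whole space rather than a single vector.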

Generally, we write the transformed basis vectors as the columns of a matrix. For instance, a transformation that sends $e_x$ to $(3,-2)$ and $e_y$ to $(2,1)$ is written as:

$$\begin{bmatrix} 3 & 2\\-2 & 1\end{bmatrix}$$

More generally, using letters for the entries:

$$\begin{bmatrix} a & b\\c & d\end{bmatrix}$$

We take the first column $\begin{bmatrix} a \\c\end{bmatrix}$ as the landing point of the first basis vector, and the second column $\begin{bmatrix} b \\d\end{bmatrix}$ as the landing point of the second basis vector. Applying this transformation to the vector $\begin{bmatrix} x \\y\end{bmatrix}$ gives $\left[\begin{array}{l}a x+b y \\c x+d y\end{array}\right]$.

We define this operation as the multiplication of a matrix by a vector:

$${\color{Red}\left[\begin{array}{ll}a & b \\c & d\end{array}\right]\left[\begin{array}{l}x \\y\end{array}\right]=x\left[\begin{array}{l}a \\c\end{array}\right]+y\left[\begin{array}{l}b \\d\end{array}\right]=\left[\begin{array}{l}a x+b y \\c x+d y\end{array}\right]}$$
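The definition above reads the product as a weighted sum of the matrix columns. The sketch below, with assumed function names and illustrative numbers, computes the product that way and compares it with the row-by-row calculation.

```python
def matvec_by_columns(M, v):
    """Compute x * (first column) + y * (second column) + ... for M times v."""
    n_rows = len(M)
    result = [0] * n_rows
    for j, coeff in enumerate(v):
        # Add coeff times column j of M to the running total.
        for i in range(n_rows):
            result[i] += coeff * M[i][j]
    return result


def matvec_by_rows(M, v):
    """Compute each output entry as the dot product of one row with v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


M = [[3, 2],
     [-2, 1]]
v = [-1, 2]

by_columns = matvec_by_columns(M, v)
by_rows = matvec_by_rows(M, v)
print(by_columns, by_rows)  # [1, 4] [1, 4]
```

Both views give the same numbers. The column view simply makes it explicit that the columns are where the basis vectors land.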

Matrix multiplication

So far we have used a matrix to represent one linear transformation of a vector in two-dimensional space. What should we do if we want to apply several transformations in a row, that is, a composite transformation?

We perform the first transformation, take the transformed vector, and then perform the second transformation on it. The final result should be the same as applying a single composite matrix to the original vector.


We call this composite matrix the product of the two matrices.


How should we read such a product? The matrix on the right is applied first, and then the matrix on the left.
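The right-to-left reading can be checked with a short Python sketch. The two transformations used here (a 90° counterclockwise rotation followed by a shear) and the vector `v` are assumptions chosen for illustration. The composite matrix is built by tracking where each basis vector ends up after both steps.

```python
def apply(M, v):
    """Apply the transformation given by matrix M (a list of rows) to vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


rotation = [[0, -1],
            [1, 0]]    # applied first (the right-hand matrix)
shear = [[1, 1],
         [0, 1]]       # applied second (the left-hand matrix)

# Track where each basis vector lands after both transformations.
col_x = apply(shear, apply(rotation, [1, 0]))
col_y = apply(shear, apply(rotation, [0, 1]))

# The landing points become the columns of the composite matrix.
composite = [[col_x[0], col_y[0]],
             [col_x[1], col_y[1]]]

v = [2, 3]
step_by_step = apply(shear, apply(rotation, v))
in_one_go = apply(composite, v)

print(composite)                # [[1, -1], [1, 0]]
print(step_by_step, in_one_go)  # [-1, 2] [-1, 2]
```

Applying the two matrices one after the other and applying the composite matrix once give the same vector.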

For a general matrix multiplication:

$$\left[\begin{array}{ll}a & b \\c & d\end{array}\right]\left[\begin{array}{ll}e & f \\g & h\end{array}\right]$$

Consider first the basis vector $e_x$. The right-hand matrix sends it to its first column $\begin{bmatrix} e \\g\end{bmatrix}$, which the left-hand matrix then transforms to:

$$\left[\begin{array}{ll}a & b \\c & d\end{array}\right]\left[\begin{array}{l}e \\g\end{array}\right]=\left[\begin{array}{l}ae+bg\\ce+dg\end{array}\right]$$

Similarly, the basis vector $e_y$ first lands on $\begin{bmatrix} f \\h\end{bmatrix}$, and after the second transformation becomes:

$$\left[\begin{array}{ll}a & b \\c & d\end{array}\right]\left[\begin{array}{l}f \\h\end{array}\right]=\left[\begin{array}{l}af+bh\\cf+dh\end{array}\right]$$

Placing these two results side by side as columns gives the matrix product:

$${\color{Red}\left[\begin{array}{ll}a & b \\c & d\end{array}\right]\left[\begin{array}{ll}e & f \\g & h\end{array}\right]=\left[\begin{array}{ll}a e+b g & a f+b h \\c e+d g & c f+d h\end{array}\right]}$$
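The formula above can be checked with a small Python sketch. The function name `matmul` and the numeric entries are illustrative assumptions. The function computes each entry as a row-times-column sum, and the result is compared with the four expressions from the formula.

```python
def matmul(A, B):
    """Multiply matrices A and B, both given as lists of rows."""
    n_inner = len(B)
    if any(len(row) != n_inner for row in A):
        raise ValueError("columns of A must match rows of B")
    n_cols = len(B[0])
    # Entry (i, j) is row i of A times column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n_inner))
             for j in range(n_cols)]
            for i in range(len(A))]


a, b, c, d = 1, 2, 3, 4
e, f, g, h = 5, 6, 7, 8

product = matmul([[a, b], [c, d]], [[e, f], [g, h]])

# The same entries written out exactly as in the 2x2 formula.
expected = [[a * e + b * g, a * f + b * h],
            [c * e + d * g, c * f + d * h]]

print(product)              # [[19, 22], [43, 50]]
print(product == expected)  # True
```

Each column of `product` is the left-hand matrix applied to the corresponding column of the right-hand matrix, which is the column-by-column derivation used in the text.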

References

How to understand matrix operations

The Essence of Linear Algebra - 04 - Combination of Matrix Multiplication and Linear Transformation (bilibili)


Origin blog.csdn.net/weixin_46421722/article/details/127245171