Matrices for machine learning (vectors, matrices, and multiple linear regression)

Matrices in machine learning

When waiting for a bus, we expect everyone to line up in an orderly way. If the people are replaced by numbers, we get numbers lined up in rows and columns; that is a matrix.

Matrix expression

Matrix rows and columns

A matrix is made up of rows and columns (column is often abbreviated as col).

\bigl(\begin{smallmatrix} 1 &2 &3 \\ 4 &5 &6 \\ 7 &8 &9 \end{smallmatrix}\bigr)

In an m\times n matrix, m is the number of rows and n is the number of columns.

Matrix variable name

Matrices are usually named with uppercase letters. Below, a matrix is assigned the variable name A.

A=\bigl(\begin{smallmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{smallmatrix}\bigr)

Common matrix expressions

Other matrix expressions:

\begin{bmatrix} 1 &2 \\ 3 &4 \end{bmatrix}                \begin{vmatrix} 1 & 2\\ 3 & 4 \end{vmatrix}                \begin{Vmatrix} 1 & 2\\ 3 & 4 \end{Vmatrix}

Matrix element expression

Matrix elements are usually written with subscripts, as shown below:

a_{ij}

i is the row number and j is the column number.

\begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}                        \begin{pmatrix} a_{1,1} & \cdots & a_{1,n}\\ \vdots & \ddots & \vdots \\ a_{m,1} & \cdots & a_{m,n} \end{pmatrix}
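
As a small illustrative sketch (not from the original text), numpy lets you read an individual element with row and column indices; note that numpy indexing starts at 0, while the mathematical subscripts above start at 1.

import numpy as np

A = np.matrix([[1, 2, 3],
               [4, 5, 6]])
# a_{1,2} in 1-based mathematical notation is A[0, 1] in numpy's 0-based indexing
print(A[0, 1])   # 2
print(A[1, 2])   # a_{2,3} = 6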

Matrix addition and subtraction

Basic concepts

There are 2 matrices as follows:

A=\begin{pmatrix} a_{1,1} & \cdots &a_{1,n} \\ \vdots & \ddots &\vdots \\ a_{m,1} & \cdots & a_{m,n} \end{pmatrix}                        B=\begin{pmatrix} b_{1,1} & \cdots &b_{1,n} \\ \vdots & \ddots &\vdots \\ b_{m,1} & \cdots & b_{m,n} \end{pmatrix}

Matrix addition or subtraction is equivalent to adding or subtracting elements at the same position, so matrices of different sizes cannot be added or subtracted, as shown below:

A+B=\begin{pmatrix} a_{1,1}+b_{1,1} & \cdots &a_{1,n}+b_{1,n} \\ \vdots & \ddots &\vdots \\ a_{m,1}+b_{m,1} & \cdots & a_{m,n}+b_{m,n} \end{pmatrix}

A-B=\begin{pmatrix} a_{1,1}-b_{1,1} & \cdots &a_{1,n}-b_{1,n} \\ \vdots & \ddots &\vdots \\ a_{m,1}-b_{m,1} & \cdots & a_{m,n}-b_{m,n} \end{pmatrix}

The commutative and associative laws hold for matrix addition and subtraction.

Commutative law: A+B=B+A

Associative law: (A+B)+C=A+(B+C)
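
The following short sketch (added for illustration; the matrix C and the use of np.array_equal are assumptions, not part of the original example) checks both laws numerically:

import numpy as np

A = np.matrix([[1, 2, 3], [4, 5, 6]])
B = np.matrix([[4, 5, 6], [7, 8, 9]])
C = np.matrix([[1, 0, 1], [0, 1, 0]])

# Commutative law: A + B equals B + A
print(np.array_equal(A + B, B + A))                # True
# Associative law: (A + B) + C equals A + (B + C)
print(np.array_equal((A + B) + C, A + (B + C)))    # True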

Python practice

To define a matrix, you can use numpy's matrix() function. Consider the following matrix:

A=\begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{pmatrix}

>>> import numpy as np
>>> A=np.matrix([[1,2,3],[4,5,6]])
>>> A
matrix([[1, 2, 3],
        [4, 5, 6]])
>>>

Applications of matrix addition and subtraction

import numpy as np

A = np.matrix([[1, 2, 3], [4, 5, 6]])
B = np.matrix([[4, 5, 6], [7, 8, 9]])

print('A + B = {}'.format(A + B))
print('A - B = {}'.format(A - B))

The running results are as follows:

A + B = [[ 5  7  9]
 [11 13 15]]
A - B = [[-3 -3 -3]
 [-3 -3 -3]]

Matrix multiplied by a real number

A matrix can be multiplied by a real number by multiplying each matrix element by the real number. The following is an example of multiplying a matrix by a real number k.

kA=\begin{pmatrix} ka_{1,1} & \cdots & ka_{1,n}\\ \vdots & \ddots & \vdots \\ ka_{m,1} & \cdots &ka_{m,n} \end{pmatrix}

The commutative, associative, and distributive laws hold when multiplying a matrix by a real number.

Commutative law: kA=Ak

Associative law: (jk)A=j(kA)

Distributive law: (j+k)A=jA+kA

k(A+B)=kA+kB
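
A quick numerical check of these laws, added here as an illustrative sketch (the values of j and k are arbitrary assumptions):

import numpy as np

A = np.matrix([[1, 2, 3], [4, 5, 6]])
B = np.matrix([[4, 5, 6], [7, 8, 9]])
j, k = 3, 2

print(np.array_equal(k * A, A * k))                  # commutative: kA = Ak
print(np.array_equal((j * k) * A, j * (k * A)))      # associative: (jk)A = j(kA)
print(np.array_equal((j + k) * A, j * A + k * A))    # distributive: (j+k)A = jA + kA
print(np.array_equal(k * (A + B), k * A + k * B))    # distributive: k(A+B) = kA + kB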

The following multiplies the matrix by 2 and by 0.5.

import numpy as np

A = np.matrix([[1, 2, 3], [4, 5, 6]])

print('2 * A   = {}'.format(2 * A))
print('0.5 * A = {}'.format(0.5 * A))

Operation result:

2 * A   = [[ 2  4  6]
 [ 8 10 12]]
0.5 * A = [[0.5 1.  1.5]
 [2.  2.5 3. ]]

Matrix multiplication

An important point: matrix multiplication is only defined when the number of columns of the left matrix equals the number of rows of the right matrix.

Basic rules of multiplication

Suppose matrix A is m\times n and matrix B is n\times p. Then the product AB is an m\times p matrix whose elements are given by:

(AB)_{i,k}=\sum_{j=1}^{n}a_{i,j}b_{j,k}
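
To make the summation concrete, here is an illustrative sketch (the helper function matmul_loops is not part of the original text) that implements the formula with explicit loops and compares the result with numpy's @ operator:

import numpy as np

def matmul_loops(A, B):
    """Multiply an m x n matrix A by an n x p matrix B using the summation formula."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, 'columns of A must equal rows of B'
    C = np.zeros((m, p))
    for i in range(m):
        for k in range(p):
            # (AB)_{i,k} = sum over j of a_{i,j} * b_{j,k}
            C[i, k] = sum(A[i, j] * B[j, k] for j in range(n))
    return C

A = np.array([[1, 0, 2], [-1, 3, 1]])
B = np.array([[3, 1], [2, 1], [1, 0]])
print(matmul_loops(A, B))   # same values as A @ B
print(A @ B)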

Using the numpy module, you can perform matrix multiplication using the * or @ operators.

import numpy as np

A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[5, 6], [7, 8]])
print('A * B = {}'.format(A * B))

C = np.matrix([[1, 0, 2], [-1, 3, 1]])
D = np.matrix([[3, 1], [2, 1], [1, 0]])
print('C @ D = {}'.format(C @ D))

Operation result:

A * B = [[19 22]
 [43 50]]
C @ D = [[5 1]
 [4 2]]

Multiplication examples

The following table shows the quantity of fruit that A and B want to buy:

Name      Banana   Mango   Apple
A         2        3       1
B         3        2       5

The following table shows the prices of fruits in supermarkets and department stores:

Fruit     Supermarket price   Department store price
Banana    30                  50
Mango     60                  80
Apple     50                  60

Calculate how much A and B would each spend if they bought at the supermarket and at the department store.

import numpy as np

A = np.matrix([[2, 3, 1], [3, 2, 5]])
B = np.matrix([[30, 50], [60, 80], [50, 60]])
print('A * B = {}'.format(A * B))

Operation result:

A * B = [[290 400]
 [460 610]]

Assume that the calories of various fruits are as follows:

Fruit     Calories
Banana    30 calories
Mango     50 calories
Apple     20 calories

A and B each eat the following amounts; calculate how many calories each of them consumes.

Name      Banana   Mango   Apple
A         1        2       1
B         2        1       2

The code is as follows:

import numpy as np

A = np.matrix([[1, 2, 1], [2, 1, 2]])
B = np.matrix([[30], [50], [20]])
print('A * B = {}'.format(A * B))

The running results are as follows:

A * B = [[150]
 [150]]

Matrix multiplication rules

For matrix multiplication, the associative and distributive laws hold.

Associative law: A\times B\times C=(A\times B)\times C=A\times (B\times C)

Distributive law: A\times (B-C)=A\times B-A\times C

However, the commutative law does not hold for matrix multiplication; in general:

A\times B\neq B\times A

Verify that A\times B is not equal to B\times A.

The code is as follows:

import numpy as np

A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[5, 6], [7, 8]])
print('A * B = {}'.format(A * B))
print('B * A = {}'.format(B * A))

Operation result:

A * B = [[19 22]
 [43 50]]
B * A = [[23 34]
 [31 46]]

Square matrix

A matrix is a square matrix if its number of rows equals its number of columns.

Identity matrix

If the diagonal elements from the upper left to the lower right of a square matrix are all 1 and all other elements are 0, the matrix is called an identity matrix.

The identity matrix is usually denoted by a capital E or I.

The identity matrix plays the role of the number 1: any matrix multiplied by the identity matrix gives the original matrix.

Verify that the result of multiplying with the identity matrix remains unchanged.

import numpy as np

A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[1, 0], [0, 1]])
print('A * B = {}'.format(A * B))
print('B * A = {}'.format(B * A))

Operation result:

A * B = [[1 2]
 [3 4]]
B * A = [[1 2]
 [3 4]]
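
numpy can also build an identity matrix directly with np.eye() (or np.identity()); the sketch below is illustrative and not part of the original example.

import numpy as np

I = np.eye(3, dtype=int)                         # 3 x 3 identity matrix
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.array_equal(A @ I, A))                  # True: A is unchanged
print(np.array_equal(I @ A, A))                  # True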

Inverse matrix

Basic concepts

Only a square matrix can have an inverse. Multiplying a matrix by its inverse gives the identity matrix E, as follows:

A\times A^{-1}=E        or        A^{-1}\times A=E

For a 2\times 2 matrix A, the inverse matrix formula is as follows:

A=\begin{pmatrix} a_{1,1} &a_{1,2} \\ a_{2,1} &a_{2,2} \end{pmatrix}        A^{-1}=\frac{1}{a_{1,1}a_{2,2}-a_{1,2}a_{2,1}}\begin{pmatrix} a_{2,2} &-a_{1,2} \\ -a_{2,1} & a_{1,1} \end{pmatrix}

A further condition for the inverse to exist is that a_{1,1}a_{2,2}-a_{1,2}a_{2,1} is not equal to 0. The following is an example of a matrix A and its inverse A^{-1}.

A=\begin{pmatrix} 2 &3 \\ 5 &7 \end{pmatrix}                A^{-1}=\frac{1}{14-15}\begin{pmatrix} 7 &-3 \\ -5 & 2 \end{pmatrix}=\begin{pmatrix} -7 &3 \\ 5 &-2 \end{pmatrix}
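
As an illustrative sketch (the helper inverse_2x2 is not from the original text), the 2\times 2 formula above can be written directly and compared with np.linalg.inv:

import numpy as np

def inverse_2x2(A):
    """Invert a 2 x 2 matrix with the explicit formula; the determinant must be nonzero."""
    a, b = A[0, 0], A[0, 1]
    c, d = A[1, 0], A[1, 1]
    det = a * d - b * c
    assert det != 0, 'matrix is not invertible'
    return (1 / det) * np.array([[d, -b], [-c, a]])

A = np.array([[2, 3], [5, 7]])
print(inverse_2x2(A))       # [[-7.  3.] [ 5. -2.]]
print(np.linalg.inv(A))     # same result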

Python practice

Import the numpy module and use the np.linalg.inv() function to calculate the inverse matrix. Note that the product A * A_inv is computed in floating point, so it is rounded before casting to integers.

import numpy as np

A = np.matrix([[2, 3], [5, 7]])
B = np.linalg.inv(A)
print('A_inv = {}'.format(B))
# Round before casting to integers to suppress floating-point error in A * A_inv
print('E     = {}'.format(np.round(A * B).astype(np.int64)))

Operation result:

A_inv = [[-7.  3.]
 [ 5. -2.]]
E     = [[1 0]
 [0 1]]

Solve simultaneous equations using inverse matrices

Suppose there is a simultaneous equation as follows:

3x+2y=5

x+2y=-1

Express the above simultaneous equations using the following matrix.

\begin{pmatrix} 3 &2 \\ 1 &2 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix}=\begin{pmatrix} 5\\ -1 \end{pmatrix}

The inverse of \begin{pmatrix} 3 &2 \\ 1 &2 \end{pmatrix} is \begin{pmatrix} 0.5 &-0.5 \\ -0.25 &0.75 \end{pmatrix}. Multiplying both sides of the equation by this inverse gives the following result.

\begin{pmatrix} 0.5 &-0.5 \\ -0.25 &0.75 \end{pmatrix}\begin{pmatrix} 3 &2 \\ 1 &2 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix}=\begin{pmatrix} 0.5 &-0.5 \\ -0.25 &0.75 \end{pmatrix}\begin{pmatrix} 5\\ -1 \end{pmatrix}

Carrying out the multiplication gives the following result.

\begin{pmatrix} x\\ y \end{pmatrix}=\begin{pmatrix} 3\\ -2 \end{pmatrix}

It can be obtained that the solution to the above simultaneous equations is x=3, y=-2.

Use the inverse matrix concept to verify the above execution results.

import numpy as np

A = np.matrix([[3, 2], [1, 2]])
A_inv = np.linalg.inv(A)
B = np.matrix([[5], [-1]])
print('{}'.format(A_inv * B))

Operation result:

[[ 3.]
 [-2.]]
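
A system like this can also be solved without forming the inverse explicitly. The following alternative sketch (an added illustration, not the method used above) uses np.linalg.solve, which is generally the more accurate way to solve A x = B:

import numpy as np

A = np.matrix([[3, 2], [1, 2]])
B = np.matrix([[5], [-1]])
# Solve A x = B directly instead of computing inv(A) * B
print(np.linalg.solve(A, B))   # [[ 3.] [-2.]]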

Tensor

A tensor is a stacked, multi-axis array structure; the number of axes determines the kind of tensor.

A scalar is a 0-axis tensor, a vector is a 1-axis tensor, a matrix is a 2-axis tensor, and a stack of matrices (3-dimensional data) is a 3-axis tensor.
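
A small illustrative check of the number of axes in each case (an added sketch using numpy's ndim attribute):

import numpy as np

scalar = np.array(5)                        # 0-axis tensor
vector = np.array([1, 2, 3])                # 1-axis tensor
matrix = np.array([[1, 2], [3, 4]])         # 2-axis tensor
tensor3 = np.array([[[1, 2], [3, 4]],
                    [[5, 6], [7, 8]]])      # 3-axis tensor

for t in (scalar, vector, matrix, tensor3):
    print(t.ndim, t.shape)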

Define 3-axis data and use the np.shape() function to list the shape of the data.

import numpy as np

A = np.array([[[1, 2],
               [3, 4]],
              [[5, 6],
               [7, 8]],
              [[9, 10],
               [11, 12]]])

print('{}'.format(A))
print('shape = {}'.format(np.shape(A)))

The result:

[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]
shape = (3, 2, 2)

Transpose matrix

Basic concepts

Transposing a matrix swaps its rows and columns, so an n\times m matrix becomes an m\times n matrix.

If the matrix is A, its transpose is written A^T.

Python practice

To transpose a matrix, you can use the numpy module's transpose() function or the T attribute.

Applications of transposed matrices.

import numpy as np

A = np.array([[0, 2, 4, 6],
              [1, 3, 5, 7]])              
B = A.T
print('{}'.format(B))
C = np.transpose(A)
print('{}'.format(C))

Operation result:

[[0 1]
 [2 3]
 [4 5]
 [6 7]]
[[0 1]
 [2 3]
 [4 5]
 [6 7]]

Rules for transposing matrices 

The transposed matrix can be transposed again to restore the matrix contents.

(A^T)^T=A

Adding matrices and then transposing is equivalent to transposing each matrix and then adding them.

(A+B)^T=A^T+B^T

Multiplying a scalar c by a matrix and then transposing it has the same result as first transposing and then multiplying by a scalar.

(cA)^T=cA^T

Taking the inverse of a transposed matrix is the same as transposing the inverse of the matrix.

(A^T)^{-1}=(A^{-1})^T

Transposing a product of matrices is equivalent to transposing each matrix, reversing their order, and then multiplying.

(AB)^T=B^TA^T
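
A brief numerical check of these transpose rules (an illustrative sketch; the example matrices are assumptions, not from the original):

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
c = 3

print(np.array_equal(A.T.T, A))                              # (A^T)^T = A
print(np.array_equal((A + B).T, A.T + B.T))                  # (A+B)^T = A^T + B^T
print(np.array_equal((c * A).T, c * A.T))                    # (cA)^T = cA^T
print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))   # (A^T)^{-1} = (A^{-1})^T
print(np.array_equal((A @ B).T, B.T @ A.T))                  # (AB)^T = B^T A^T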

Application of transposed matrix

(omitted)

Vectors, matrices, and multiple linear regression

Vector application in linear regression

The simple linear equation is as follows:

y=ax+b

Here x represents the annual number of customer visits and y represents the annual international license sales. If the data is large and has been collected over n years, vectors can be used to express it.

x=(x_1\; x_2\;\cdots\; x_n)        # the subscript denotes the year; x_n is the number of customer visits in year n

y=(y_1\; y_2\;\cdots\; y_n)        # the subscript denotes the year; y_n is the number of international licenses sold in year n

Since substituting each x_n and y_n into y=ax+b produces an error, a subscript can be added to each error term, and the errors can be represented by an error vector \varepsilon:

\varepsilon =(\varepsilon_1\;\varepsilon_2\;\cdots \;\varepsilon_n)

The linear equation now becomes: y=ax+b+\varepsilon

Here the slope a and intercept b are scalars. Since the slope a multiplied by the vector x is an n-dimensional vector, the scalar b must also be changed to a vector, as follows: b=(b_1\;b_2\;\cdots \;b_n)

Rearranging the linear equation:

y=ax+b+\varepsilon

\varepsilon =y-ax-b

Using the least squares method, calculate the sum of squared errors: \sum_{i=1}^{n}\varepsilon _{i}^{2}

Taking the inner product of the error vector \varepsilon with itself gives: \varepsilon \cdot \varepsilon =\sum_{i=1}^{n}\varepsilon _{i}^{2}=\left \| \varepsilon \right \|^2

Minimizing the squared error is therefore equivalent to minimizing the vector inner product: \varepsilon \cdot \varepsilon =(y-ax-b)\cdot (y-ax-b)
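
As a minimal sketch with made-up data (the x and y values below are illustrative assumptions, not from the original), the slope a and intercept b that minimize the squared error can be found with numpy's least-squares solver:

import numpy as np

# Illustrative data: x = yearly customer visits, y = yearly sales (made-up numbers)
x = np.array([10, 20, 30, 40, 50])
y = np.array([12, 25, 31, 44, 52])

# Build the design matrix [x, 1] so that y is modelled as a*x + b
X = np.column_stack([x, np.ones(len(x))])
(a, b), residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print('a = {:.4f}, b = {:.4f}'.format(a, b))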

Vector application in multiple linear regression

In multiple regression, it is customary to use \beta coefficients as the slopes and \beta _0 as the intercept. The general formula for multiple regression can be expressed as: y=\beta _1x_1+\beta _2x_2+\cdots +\beta _nx_n+\beta _0+\varepsilon
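
In matrix form, the coefficients can be estimated with the normal equations \hat{\beta }=(X^TX)^{-1}X^Ty. The sketch below uses made-up data and is only an illustration of this technique, not part of the original text:

import numpy as np

# Made-up data: 5 observations with 2 explanatory variables
X_raw = np.array([[1.0, 2.0],
                  [2.0, 1.0],
                  [3.0, 4.0],
                  [4.0, 3.0],
                  [5.0, 5.0]])
y = np.array([7.0, 6.0, 14.0, 13.0, 18.0])

# Prepend a column of ones so that beta_0 (the intercept) is part of the matrix
X = np.column_stack([np.ones(len(y)), X_raw])

# Normal equations: beta = (X^T X)^{-1} X^T y
beta = np.linalg.inv(X.T @ X) @ X.T @ y
print(beta)   # [beta_0, beta_1, beta_2]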

Matrix application in multiple linear regression

(omitted)

Put intercept into matrix

(omitted)

Simple linear regression

(omitted)


Source: blog.csdn.net/DXB2021/article/details/127196611