Matrices & Python's broadcasting mechanism for matrix operations

 

Python's matrix broadcasting mechanism

I've been studying neural networks recently, and in deep learning we constantly have to operate on matrices.
Recall that when we manipulate arrays (lists), we usually process each element with a for loop. For example:

my_list = [1,2,3,4]
new_list = []
for each in my_list:
    new_list.append(each*2)
print(new_list)  
# output: [2, 4, 6, 8]

For a matrix (a nested list), we need a nested loop:

my_matrix = [[1,2,3,4],
             [5,6,7,8]]
new_matrix = [[],[]]
for i in range(2):
    for j in range(4):
        new_matrix[i].append(my_matrix[i][j]*2)
print(new_matrix)
# output: [[2, 4, 6, 8], [10, 12, 14, 16]]
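As an aside, the nested loop above can be written more compactly as a nested list comprehension (a small sketch of mine, not in the original post), though under the hood it is still an element-by-element Python loop:

my_matrix = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
# Same doubling in one expression; still a Python-level loop, though.
new_matrix = [[x * 2 for x in row] for row in my_matrix]
print(new_matrix)
# output: [[2, 4, 6, 8], [10, 12, 14, 16]]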

In fact, the approach above is very inefficient! With a small amount of data you won't notice, but the matrices we deal with in deep learning are often huge, and looping over such a matrix with a for loop could take hours or even days.

Python has this covered, which brings us to the subject of this article: Python's "broadcasting" mechanism.
First, a note: to define and process matrices in Python, we generally use the numpy library.

The following shows what Python's broadcasting mechanism looks like:

import numpy as np

# First, define a 3x3 matrix A:
A = np.array([[1,2,3], [4,5,6], [7,8,9]])
print("A:\n", A)
print("\nA*2:\n", A*2)    # multiply A by 2 directly
print("\nA+10:\n", A+10)  # add 10 to A directly

Output:

A:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

A*2:
 [[ 2  4  6]
 [ 8 10 12]
 [14 16 18]]

A+10:
 [[11 12 13]
 [14 15 16]
 [17 18 19]]

Next, let's look at matrix + matrix and matrix × matrix:

# Define a 3x1 matrix (at this point you could also call it a vector)
B = np.array([[10], [100], [1000]])
print("\nB:\n", B)
print("\nA+B:\n", A+B)
print("\nA*B:\n", A*B)

Output:

B:
 [[  10]
 [ 100]
 [1000]]

A+B:
 [[  11   12   13]
 [ 104  105  106]
 [1007 1008 1009]]

A*B:
 [[  10   20   30]
 [ 400  500  600]
 [7000 8000 9000]]

As you can see, although A and B have different shapes, one being 3×3 and the other 3×1, in Python we can directly add, subtract, multiply, or divide them.
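Broadcasting works in the other direction as well, and one caveat is worth flagging: * is elementwise multiplication, not matrix multiplication. A quick sketch of my own, reusing the A and B defined above (C is a name I introduce here):

# A 1x3 row vector is broadcast down the rows, just as the 3x1
# column vector B was broadcast across the columns above:
C = np.array([[10, 100, 1000]])
print(A + C)
# [[  11  102 1003]
#  [  14  105 1006]
#  [  17  108 1009]]

# True matrix multiplication uses np.dot or the @ operator instead:
print(A @ B)
# [[3210]
#  [6540]
#  [9870]]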

Seeing this, you probably already have some feel for what "broadcasting" means.
Here is a schematic:

[Figure: schematic of the broadcasting mechanism]


The so-called "broadcasting" means that a number or a vector is "copied" so that it acts on every element of the matrix.
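To make that "copy" concrete (my own illustration, not from the original post), np.tile performs the copy explicitly and gives the same result that broadcasting computes without ever materializing it:

# Copy B across 3 columns so it matches A's 3x3 shape:
B_copied = np.tile(B, (1, 3))
print(B_copied)
# [[  10   10   10]
#  [ 100  100  100]
#  [1000 1000 1000]]

# Adding the explicit copy gives the same result as broadcasting:
print(np.array_equal(A + B, A + B_copied))  # True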

 

With this mechanism, vector and matrix operations become wonderfully convenient!
Once you understand broadcasting, you can perform all kinds of matrix operations with ease.

Using numpy's built-in functions for matrix operations:

numpy has hundreds of built-in mathematical functions, such as np.log(), np.abs(), np.maximum(), and so on. Throw a matrix straight in, and out comes a new matrix!
Example:

print(np.log(A))

The output is a new matrix whose elements are the logs of the corresponding elements of A:

array([[0.        , 0.69314718, 1.09861229],
       [1.38629436, 1.60943791, 1.79175947],
       [1.94591015, 2.07944154, 2.19722458]])
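The same pattern applies to any of numpy's elementwise functions; a couple of quick extra examples of mine, assuming the A defined above:

print(np.sqrt(A))  # elementwise square root
print(np.abs(-A))  # elementwise absolute value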

 

 

Another example is the ReLU activation function common in deep learning, y = max(0, x):

 
[Figure: the ReLU function]

It, too, can operate on a matrix directly:

X = np.array([[1,-2,3,-4], [-9,4,5,6]])
Y = np.maximum(0, X)
print(Y)

We get:

[[1 0 3 0]
 [0 4 5 6]]
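Other activation functions can be built the same way. For instance, here is a sigmoid (my own example, not from the original article) that combines broadcasting with the np.exp function:

def sigmoid(x):
    # 1 / (1 + e^(-x)), computed elementwise over the whole matrix:
    # np.exp is applied element by element, and the 1 + ... and the
    # division are broadcast.
    return 1 / (1 + np.exp(-x))

print(sigmoid(X))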

More numpy mathematical functions can be found in the documentation:
https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html

Defining your own functions to handle matrices

In fact, this is why I wrote this article at all... everything above was just a warm-up ( / ω\ )

Yesterday I ran into a problem: I wanted the derivative of the ReLU function. It is easy to see that for y = max(0, x), the derivative is:
y' = 0 if x < 0
y' = 1 if x > 0
But this y'(x) is not defined in numpy, so you have to build it yourself.
That is, I needed to turn every element of the matrix X that is less than 0 into 0, and every element greater than 0 into 1.
I fiddled with it for a long time and got nowhere, then found this solution on StackOverflow:

def relu_derivative(x):
    x[x < 0] = 0   # every negative element becomes 0
    x[x > 0] = 1   # every positive element becomes 1
    return x

X = np.array([[1,-2,3,-4], [-9,4,5,6]])
print(relu_derivative(X))

Output:

[[1 0 1 0]
 [0 1 1 1]]

It could be that simple!!! (゚Д゚#)
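One caveat worth adding (my own observation, not from the StackOverflow answer): relu_derivative modifies x in place, so X itself is changed after the call. If that matters, np.where gives the same answer without touching the input; a minimal sketch:

def relu_derivative_safe(x):
    # 1 where x > 0, 0 elsewhere; the input is left untouched.
    return np.where(x > 0, 1, 0)

X = np.array([[1,-2,3,-4], [-9,4,5,6]])
print(relu_derivative_safe(X))
# [[1 0 1 0]
#  [0 1 1 1]]
print(X)  # unchanged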

The hardest part of relu_derivative to understand is the x[x>0] business.
So I experimented with it:

X = np.array([[1,-2,3,-4], [-9,4,5,6]])
print(X[X>0])
print(X[X<0])

Output:

[1 3 4 5 6]
[-2 -4 -9]

It pulls out exactly the elements of X that satisfy the condition! So Python can do this with matrices too!

 


I was stunned for quite a while.
It can be understood like this: X[X>0] acts as a "selector" that picks out the elements satisfying the condition, and you can then assign to all of them at once.
In this way, we can define whatever functions we need and use them to update an entire matrix!
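To see the "selector" directly (my own illustration): the condition X > 0 by itself produces a boolean matrix of the same shape, and custom elementwise rules can be built from such masks, for example a clamp function (a name and helper I made up for this sketch):

X = np.array([[1,-2,3,-4], [-9,4,5,6]])
print(X > 0)
# [[ True False  True False]
#  [False  True  True  True]]

def clamp(x, lo=-3, hi=3):
    # Clip every element into [lo, hi], working on a copy so the
    # input matrix is not modified.
    y = x.copy()
    y[y < lo] = lo
    y[y > hi] = hi
    return y

print(clamp(X))
# [[ 1 -2  3 -3]
#  [-3  3  3  3]]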

 

In summary

As you can see, matrix operations with Python and numpy are simply marvelous: convenient and cheap. Actually, there is one point I forgot to make above: a computer performs matrix operations far more efficiently than for loops.
Don't believe it? Let's race them:

import time
import numpy as np

# vectorization vs for loop
# define two arrays a, b:
a = np.random.rand(1000000)
b = np.random.rand(1000000)

# for-loop version:
t1 = time.time()
c = 0
for i in range(1000000):
    c += a[i]*b[i]
t2 = time.time()
print(c)
print("for loop version:"+str(1000*(t2-t1))+"ms")
time1 = 1000*(t2-t1)

# vectorized version:
t1 = time.time()
c = np.dot(a,b)
t2 = time.time()
print(c)
print("vectorization version:"+str(1000*(t2-t1))+"ms")
time2 = 1000*(t2-t1)

print("vectorization is faster than for loop by "+str(time1/time2)+" times!")

Output:

249765.8415288075
for loop version:627.4442672729492ms
249765.84152880745
vectorization version:1.5032291412353516ms
vectorization is faster than for loop by 417.39762093576525 times!

As you can see, the for-loop method and the vectorized method give the same result, but the latter is more than 400 times faster!
Therefore, when the amount of computation is large, we should try to vectorize our data wherever possible, so the computer can do matrix operations instead.
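To bring things full circle (my own closing example): the nested-for-loop doubling from the very beginning of this post collapses into a single vectorized expression:

my_matrix = np.array([[1, 2, 3, 4],
                      [5, 6, 7, 8]])
print(my_matrix * 2)  # broadcasting, no Python loop at all
# [[ 2  4  6  8]
#  [10 12 14 16]]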

Original article: https://www.jianshu.com/p/e26f381f82ad
