Five, NumPy arithmetic operations
1. Element-wise multiplication
import numpy as np
A = np.array([[1, 2], [-1, 4]])
B = np.array([[2, 0], [3, 4]])
print(A)
>>>[[ 1 2]
[-1 4]]
print(B)
>>>[[2 0]
[3 4]]
print(A*B)
>>>[[ 2 0]
[-3 16]]
print(np.multiply(A,B))
>>>[[ 2 0]
[-3 16]]
2. Array and scalar operations
A = np.array([[1, 2], [-1, 4]])
print(A)
>>>[[ 1 2]
[-1 4]]
print(A/2.0)
>>>[[ 0.5 1. ]
[-0.5 2. ]]
print(A*2.0)
>>>[[ 2. 4.]
[-2. 8.]]
3. Dot product (matrix product for 2-D arrays, inner product for 1-D arrays)
X1 = np.array([[1,2],[3,4]])
X2 = np.array([[1,2,3],[4,5,6]])
X3 = np.dot(X1,X2)
print(X3)
>>>[[ 9 12 15]
[19 26 33]]
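To make the difference between the dot product and the element-wise `*` from item 1 concrete, here is a minimal sketch. It reuses `X1` and `X2` from above; the vectors `v` and `w` are illustrative names that do not appear in the original notes.

```python
import numpy as np

X1 = np.array([[1, 2], [3, 4]])        # shape (2, 2)
X2 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

# For 2-D arrays, np.dot, np.matmul and the @ operator all compute the matrix product.
via_dot = np.dot(X1, X2)
via_matmul = np.matmul(X1, X2)
via_operator = X1 @ X2

# For 1-D arrays, np.dot is the ordinary inner product and returns a scalar.
v = np.array([1, 2, 3])   # illustrative vector
w = np.array([4, 5, 6])   # illustrative vector
inner = np.dot(v, w)      # 1*4 + 2*5 + 3*6

print(via_dot.shape)  # (2, 3)
print(inner)          # 32
```

Note that `X1 * X2` would raise an error here, because element-wise multiplication needs compatible shapes, while the matrix product only needs the inner dimensions to agree.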
Six, Reshaping arrays
In matrix and array operations, it is often necessary to merge or expand several vectors or matrices along a given axis, or to change an array's shape. For example, in a convolutional or recurrent neural network, the feature matrix must be flattened before it is passed to a fully connected layer.
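The flattening step mentioned above might look like the following sketch. The batch size, channel count and spatial size are made-up values chosen only for illustration.

```python
import numpy as np

# Hypothetical batch of feature maps: 8 samples, 16 channels, 5x5 spatial size.
feature_maps = np.zeros((8, 16, 5, 5))

# Keep the batch axis and flatten everything else; -1 lets NumPy infer 16*5*5 = 400.
flat = feature_maps.reshape(feature_maps.shape[0], -1)

print(flat.shape)  # (8, 400)
```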
(1) Change shape
1. reshape
Changes the dimensions of a vector without modifying the vector itself; the reshaped result is returned as a new array object.
# reshape
arr = np.arange(10)
print(arr)
>>>[0 1 2 3 4 5 6 7 8 9]
# Reshape arr to 2*5
print(arr.reshape(2,5))
>>>[[0 1 2 3 4]
[5 6 7 8 9]]
# Specify either the number of rows or the number of columns and use -1 for the other
print(arr.reshape(5,-1))
>>>[[0 1]
[2 3]
[4 5]
[6 7]
[8 9]]
print(arr.reshape(-1,5))
>>>[[0 1 2 3 4]
[5 6 7 8 9]]
2. resize
Changes the dimensions of a vector in place, modifying the vector itself.
# resize
arr = np.arange(10)
print(arr)
>>>[0 1 2 3 4 5 6 7 8 9]
# Change the shape of arr to 2 rows and 5 columns
arr.resize(2,5)
print(arr)
>>>[[0 1 2 3 4]
[5 6 7 8 9]]
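The difference between the two methods can be checked directly. This is a small sketch; the variable names `reshaped`, `other` and `result` are illustrative.

```python
import numpy as np

arr = np.arange(10)

# reshape returns a new array object with the requested shape; arr keeps its shape.
reshaped = arr.reshape(2, 5)
shape_after_reshape = arr.shape        # still (10,)

# For a contiguous array like this one the result is a view, so the data is shared.
reshaped[0, 0] = 100
first_element = arr[0]                 # 100, because both use the same buffer

# resize works in place and returns None.
other = np.arange(10)
result = other.resize(2, 5)
shape_after_resize = other.shape       # (2, 5)
```

So "without modifying the vector itself" refers to the shape only: writing into a reshaped view still changes the original data.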
3. T (transpose)
# T (transpose)
arr = np.arange(12).reshape(3,4)
print(arr)
>>>[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
print(arr.T)
>>>[[ 0 4 8]
[ 1 5 9]
[ 2 6 10]
[ 3 7 11]]
4. ravel (flatten)
# ravel (flatten)
arr = np.arange(6).reshape(2,-1)
print(arr)
>>>[[0 1 2]
[3 4 5]]
# Flatten in column-major order
print(arr.ravel('F'))
>>>[0 3 1 4 2 5]
# Flatten in row-major order (the default)
print(arr.ravel())
>>>[0 1 2 3 4 5]
5. flatten
Converts a matrix into a 1-D vector; unlike ravel, it always returns a copy.
# flatten
a = np.floor(10*np.random.random((3,4)))
print(a)
>>>[[0. 0. 6. 8.]
[3. 1. 8. 8.]
[6. 4. 1. 8.]]
print(a.flatten())
>>>[0. 0. 6. 8. 3. 1. 8. 8. 6. 4. 1. 8.]
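Since `ravel` and `flatten` print the same result, the practical difference is in memory sharing. A minimal sketch, using a small integer array instead of the random one above:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

raveled = arr.ravel()      # a view when possible (row-major on a contiguous array)
flattened = arr.flatten()  # always an independent copy

shares_ravel = np.shares_memory(arr, raveled)      # True
shares_flatten = np.shares_memory(arr, flattened)  # False

raveled[0] = 99            # writes through to arr
flattened[1] = -1          # does not affect arr
print(arr)
```

`ravel` is therefore cheaper when read-only access is enough, while `flatten` is the safer choice when the result will be modified.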
6. squeeze
Reduces dimensionality by removing the axes of length 1 from an array.
arr = np.arange(3).reshape(3,1)
print(arr)
>>>[[0]
[1]
[2]]
print(arr.squeeze().shape)
>>>(3,)
arr1 = np.arange(6).reshape(3,1,2,1)
print(arr1.shape)
>>>(3, 1, 2, 1)
print(arr1.squeeze().shape)
>>>(3, 2)
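`squeeze` also accepts an `axis` argument for removing only selected axes, and `np.expand_dims` does the reverse. The sketch below reuses `arr1` from above; the other names are illustrative.

```python
import numpy as np

arr1 = np.arange(6).reshape(3, 1, 2, 1)

# Remove only the axis at position 1; the trailing length-1 axis is kept.
only_axis_1 = arr1.squeeze(axis=1)               # shape (3, 2, 1)

# np.expand_dims goes the other way and inserts a length-1 axis.
restored = np.expand_dims(only_axis_1, axis=1)   # shape (3, 1, 2, 1)

# Squeezing an axis whose length is not 1 raises ValueError.
try:
    arr1.squeeze(axis=0)
    squeeze_failed = False
except ValueError:
    squeeze_failed = True
```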
7. transpose
Permutes the axes of a high-dimensional array, for example to change the order in which the height, width and channel axes of an image are stored.
arr = np.arange(24).reshape(2,3,4)
print(arr.shape)
>>>(2, 3, 4)
print(arr.transpose(1,2,0).shape)
>>>(3, 4, 2)
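As an illustration of what an axis permutation does to individual elements, the sketch below treats the same 2*3*4 array as a hypothetical channels-last image. The image interpretation is an assumption made for this example.

```python
import numpy as np

# Hypothetical image: height 2, width 3, 4 channels, stored channels-last.
image_hwc = np.arange(24).reshape(2, 3, 4)

# Move the channel axis to the front: new axis 0 is old axis 2, and so on.
image_chw = image_hwc.transpose(2, 0, 1)          # shape (4, 2, 3)

# The element at (row, col, channel) is now found at (channel, row, col).
same_value = image_hwc[1, 2, 3] == image_chw[3, 1, 2]

# Reordering the channels themselves is an indexing operation, not a transpose.
reversed_channels = image_hwc[:, :, ::-1]
```

`transpose` only changes which axis comes first; the values along each axis keep their order.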
(2) Merge arrays
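1. append
np.append has an axis parameter that controls whether arrays are merged by row or by column. Each call allocates a new array and copies the data, so it uses a lot of memory.
# Merge one-dimensional arrays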
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.append(a,b)
print(c)
>>>[1 2 3 4 5 6]
# Merge multidimensional arrays
a = np.arange(4).reshape(2,2)
b = np.arange(4).reshape(2,2)
print(np.append(a,b,axis = 0))
>>>[[0 1]
[2 3]
[0 1]
[2 3]]
print(np.append(a,b,axis = 1))
>>>[[0 1 0 1]
[2 3 2 3]]
2. concatenate
a = np.arange(4).reshape(2,2)
b = np.arange(4).reshape(2,2)
print(np.concatenate((a,b),axis = 0))
>>>[[0 1]
[2 3]
[0 1]
[2 3]]
print(np.concatenate((a,b),axis = 1))
>>>[[0 1 0 1]
[2 3 2 3]]
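Because `np.append` copies on every call, a common alternative is to collect the pieces in a Python list and call `np.concatenate` once. The sketch below uses made-up `chunks` to compare the two approaches.

```python
import numpy as np

# Hypothetical chunks arriving one at a time.
chunks = [np.full((2, 3), i) for i in range(4)]

# Repeated np.append: every call allocates a new array and copies all earlier rows.
grown = np.empty((0, 3), dtype=int)
for chunk in chunks:
    grown = np.append(grown, chunk, axis=0)

# Collect first, concatenate once: a single allocation and a single copy.
joined = np.concatenate(chunks, axis=0)

print(grown.shape, joined.shape)  # (8, 3) (8, 3)
```

Both give the same array; the second form simply avoids the repeated copying.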
3. stack
Stacks arrays along a new axis at the specified position; all inputs must have the same shape.
a = np.arange(4).reshape(2,2)
b = np.arange(4,8).reshape(2,2)
print(np.stack((a,b),axis = 0))
>>>[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
print(np.stack((a,b),axis = 1))
>>>[[[0 1]
[4 5]]
[[2 3]
[6 7]]]
4. hstack, vstack and dstack
a = np.arange(4).reshape(2,2)
b = np.arange(4,8).reshape(2,2)
print(np.hstack((a,b)))
>>>[[0 1 4 5]
[2 3 6 7]]
print(np.vstack((a,b)))
>>>[[0 1]
[2 3]
[4 5]
[6 7]]
print(np.dstack((a,b)))
>>>[[[0 4]
[1 5]]
[[2 6]
[3 7]]]
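The stacking functions are easiest to tell apart by the shapes they produce. A short sketch with the same `a` and `b` as above:

```python
import numpy as np

a = np.arange(4).reshape(2, 2)
b = np.arange(4, 8).reshape(2, 2)

shapes = {
    "stack axis=0": np.stack((a, b), axis=0).shape,  # new leading axis
    "stack axis=2": np.stack((a, b), axis=2).shape,  # new trailing axis
    "vstack": np.vstack((a, b)).shape,               # joins along axis 0
    "hstack": np.hstack((a, b)).shape,               # joins along axis 1
    "dstack": np.dstack((a, b)).shape,               # joins along axis 2
}
for name, shape in shapes.items():
    print(name, shape)

# For 2-D inputs, dstack gives the same result as stacking along a new last axis.
same_as_dstack = np.array_equal(np.dstack((a, b)), np.stack((a, b), axis=2))
```

`stack` always adds a dimension, whereas `vstack` and `hstack` keep 2-D inputs two-dimensional.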
5. vsplit and hsplit
Split an array by rows (vsplit) or by columns (hsplit). The list argument gives the indices at which to cut.
a = np.arange(12).reshape(3,4)
print(a)
>>>[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
print(np.vsplit(a,[1,2]))
>>>[array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8, 9, 10, 11]])]
print(np.hsplit(a,[1,2]))
>>>[array([[0],
[4],
[8]]),
array([[1],
[5],
[9]]),
array([[ 2, 3],
[ 6, 7],
[10, 11]])]
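Besides a list of cut points, the split functions also accept an integer number of equal parts. The following sketch shows both forms, along with the more general `np.split` and `np.array_split`.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# A list of indices gives cut points: rows [0:1], [1:2] and [2:].
by_index = np.vsplit(a, [1, 2])

# An integer asks for that many equal parts; 4 columns split into 2 blocks of 2.
halves = np.hsplit(a, 2)

# np.split is the general form; axis chooses the direction.
same_as_hsplit = np.split(a, 2, axis=1)

# An uneven integer split raises ValueError; np.array_split allows it.
uneven = np.array_split(a, 3, axis=1)   # column widths 2, 1, 1
```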
Seven, Batch processing
General steps:
1. Shuffle data randomly
2. Define batch size
3. Batch data
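# Generate 10000 matrices of shape 2*3 (one array of shape 10000*2*3; the first axis is the sample index, the last two axes hold the data)
data_train = np.random.randn(10000,2,3)
# Shuffle the data (along the first axis)
np.random.shuffle(data_train)
# Define the batch size
batch_size = 1000
# Process the data batch by batch; stepping by batch_size keeps the batches from overlapping
for i in range(0,len(data_train),batch_size):
    x_batch_sum = np.sum(data_train[i:i+batch_size])
    print("Batch {}, sum of this batch: {}".format(i//batch_size+1,x_batch_sum))
# Sample output; the values differ on every run because the data is random
>>>Batch 1, sum of this batch: 19.831597230692044
Batch 2, sum of this batch: 25.740728092864465
Batch 3, sum of this batch: 27.055236935523748
Batch 4, sum of this batch: 33.117813938724495
Batch 5, sum of this batch: 25.877849245176712
Batch 6, sum of this batch: 28.51526070095471
Batch 7, sum of this batch: 23.45418417276757
Batch 8, sum of this batch: 29.85505288703944
Batch 9, sum of this batch: 32.270674668876
Batch 10, sum of this batch: 34.17904037831163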
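The three general steps can also be wrapped in a small generator. This is only a sketch: `iterate_batches` is a hypothetical helper, not a NumPy function, and the tiny data set and seed are chosen just to keep the run repeatable.

```python
import numpy as np

def iterate_batches(data, batch_size, rng):
    """Yield shuffled, non-overlapping batches; the last one may be smaller."""
    # Shuffle indices rather than the data, so the caller's array is untouched.
    order = rng.permutation(len(data))
    for start in range(0, len(data), batch_size):
        yield data[order[start:start + batch_size]]

rng = np.random.default_rng(0)          # seeded only to make the run repeatable
data_train = rng.standard_normal((10, 2, 3))

batch_shapes = [batch.shape for batch in iterate_batches(data_train, 4, rng)]
print(batch_shapes)   # [(4, 2, 3), (4, 2, 3), (2, 2, 3)]
```

Indexing with a permuted index array returns copies of the selected samples, which costs memory but leaves the original order available for later epochs.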
Eight, Universal functions (ufunc)
Many ufuncs are implemented in C, which makes them fast.
They are also more flexible than the functions in the math module: math functions operate on scalars, whereas NumPy functions accept vectors and matrices, which reduces the need for explicit loops.
Strictly speaking, only the element-wise functions in the table (sqrt, sin, cos, abs, log, exp) are ufunc objects; the others are ordinary array functions that are commonly used alongside them.
Function | Description |
---|---|
sqrt | Square root of each element |
sin, cos | Trigonometric functions |
abs | Absolute value of each element |
dot | Matrix product (dot product, vector inner product) |
log, log10, log2 | Logarithmic functions |
exp | Exponential function |
cumsum, cumprod | Cumulative sum, cumulative product |
sum | Sum of the elements |
mean | Mean |
median | Median |
std | Standard deviation |
var | Variance |
corrcoef | Correlation coefficient |
a = np.arange(10).reshape(2,5)
b = np.arange(-5,5).reshape(5,2)
print(a)
>>>[[0 1 2 3 4]
[5 6 7 8 9]]
print(b)
>>>[[-5 -4]
[-3 -2]
[-1 0]
[ 1 2]
[ 3 4]]
print(np.sqrt(a))
>>>[[0. 1. 1.41421356 1.73205081 2. ]
[2.23606798 2.44948974 2.64575131 2.82842712 3. ]]
print(np.sin(a))
>>>[[ 0. 0.84147098 0.90929743 0.14112001 -0.7568025 ]
[-0.95892427 -0.2794155 0.6569866 0.98935825 0.41211849]]
print(np.cos(a))
>>>[[ 1. 0.54030231 -0.41614684 -0.9899925 -0.65364362]
[ 0.28366219 0.96017029 0.75390225 -0.14550003 -0.91113026]]
print(np.abs(b))
>>>[[5 4]
[3 2]
[1 0]
[1 2]
[3 4]]
print(np.dot(a,b))
>>>[[ 10 20]
[-15 20]]
print(np.log2(a))
>>>[[ -inf 0. 1. 1.5849625 2. ]
[2.32192809 2.5849625 2.80735492 3. 3.169925 ]]
print(np.exp(a))
>>>[[1.00000000e+00 2.71828183e+00 7.38905610e+00 2.00855369e+01
5.45981500e+01]
[1.48413159e+02 4.03428793e+02 1.09663316e+03 2.98095799e+03
8.10308393e+03]]
print(np.cumsum(a))
>>>[ 0 1 3 6 10 15 21 28 36 45]
print(np.cumprod(a))
>>>[0 0 0 0 0 0 0 0 0 0]
print(np.sum(a))
>>>45
print(np.mean(a))
>>>4.5
print(np.median(a))
>>>4.5
print(np.std(a))
>>>2.8722813232690143
print(np.var(a))
>>>8.25
print(np.corrcoef(a))
>>>[[1. 1.]
[1. 1.]]
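The point made at the start of this section, that NumPy functions replace explicit loops over scalars, can be sketched as follows. The name `values` is illustrative.

```python
import math
import numpy as np

values = np.arange(10).reshape(2, 5)

# math.sqrt accepts one scalar at a time, so a 2-D array needs nested loops.
looped = [[math.sqrt(x) for x in row] for row in values]

# np.sqrt is a ufunc: one call handles the whole array.
vectorized = np.sqrt(values)

print(isinstance(np.sqrt, np.ufunc))   # True
print(isinstance(np.sum, np.ufunc))    # False: np.sum is a reduction built on np.add
```

Both approaches give the same numbers; the vectorized call is shorter and runs its loop in compiled code.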
Nine, Broadcasting mechanism
NumPy's universal functions operate element by element, so the input arrays must have compatible shapes. When the shapes differ, the broadcasting mechanism is applied.
It follows these rules:
(1) All input arrays are aligned with the array that has the most dimensions; shapes that are too short are padded with 1s on the left.
(2) The output shape takes the maximum length along each axis of the input shapes.
(3) An input array can be used in the computation if, along each axis, its length either equals that of the output array or is 1; otherwise an error is raised.
(4) When an input array has length 1 along an axis, its single set of values along that axis is reused for every position.
a = np.arange(0, 40, 10).reshape(4, 1)
b = np.arange(0, 3)
c = a+b
print(a)
>>>[[ 0]
[10]
[20]
[30]]
print(b)
>>>[0 1 2]
print(c)
>>>[[ 0 1 2]
[10 11 12]
[20 21 22]
[30 31 32]]
According to rule 2, the output has shape 4*3.
According to rule 4, a is expanded to:
[[ 0 0 0]
[10 10 10]
[20 20 20]
[30 30 30]]
and b is expanded to:
[[0 1 2]
[0 1 2]
[0 1 2]
[0 1 2]]
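The expanded arrays can be inspected with `np.broadcast_to`, which makes the rules easy to verify. A minimal sketch with the same `a` and `b`; the remaining names are illustrative.

```python
import numpy as np

a = np.arange(0, 40, 10).reshape(4, 1)   # shape (4, 1)
b = np.arange(0, 3)                      # shape (3,)

# Shape the result will have, obtained without computing the sum.
out_shape = np.broadcast(a, b).shape     # (4, 3)

# What each input looks like after broadcasting (read-only views, no data copied).
a_expanded = np.broadcast_to(a, out_shape)
b_expanded = np.broadcast_to(b, out_shape)

# Adding the expanded arrays element by element matches a + b.
matches = np.array_equal(a_expanded + b_expanded, a + b)

# Shapes (4, 3) and (4,) violate rule 3: the last axes are 3 and 4, neither is 1.
try:
    np.ones((4, 3)) + np.ones(4)
    failed = False
except ValueError:
    failed = True
```

Here `a_expanded` repeats each row value across three columns and `b_expanded` repeats `[0 1 2]` on four rows.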