Summary of the NumPy broadcast mechanism

Project GitHub address: bitcarmanlee easy-algorithm-interview-and-practice
Everyone is welcome to star, leave a message, and learn and progress together.

1. The broadcast mechanism

Broadcasting is very common in NumPy. When element-wise numerical operations are performed on ndarrays of different shapes, the smaller ndarray is "broadcast" into a larger one so that its shape matches, allowing two arrays whose shapes would otherwise be incompatible to take part in the same numerical operation.

2. Broadcasting rules

1.All input arrays with ndim smaller than the input array of largest ndim, have 1’s prepended to their shapes.

2.The size in each dimension of the output shape is the maximum of all the input sizes in that dimension.

3.An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension, or has value exactly 1.

4.If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).

In translation:
1. All input arrays are aligned with the shape of the array that has the most dimensions; the missing leading dimensions are filled in with 1.
2. The shape of the output array is the maximum along each axis of the input shapes.
3. An input array can take part in the calculation if the length of each of its axes either equals the length of the corresponding axis of the output or is exactly 1; otherwise an error occurs.
4. When an input array has length 1 along some axis, the first (and only) set of values along that axis is used for all calculations along that axis.
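NumPy exposes these rules directly: np.broadcast_shapes (available since NumPy 1.20) computes the output shape from input shapes, or raises a ValueError when they are incompatible. A quick check of the rules above:

```python
import numpy as np

# Shapes are aligned from the trailing end; missing leading
# dimensions are treated as 1 (rule 1), and the output takes
# the maximum size along each axis (rule 2).
print(np.broadcast_shapes((4, 3), (3,)))       # (4, 3)
print(np.broadcast_shapes((2, 3, 4), (3, 1)))  # (2, 3, 4)

# A pair of sizes that is neither equal nor 1 violates rule 3:
try:
    np.broadcast_shapes((4, 3), (4,))
except ValueError as e:
    print("cannot broadcast:", e)
```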

The explanation above is fairly abstract; let's make sense of it through some examples.

3. Case analysis

First, let's look at the simplest form of broadcasting: an operation between a vector and a scalar.

import numpy as np


def t1():
    array = np.arange(5)
    print("array is: ", array)
    array = array * 4
    print("after broadcast, array is: ", array)

The result of the operation is

array is:  [0 1 2 3 4]
after broadcast, array is:  [ 0  4  8 12 16]

This is a typical element-wise computation: essentially, every element of the array is multiplied by 4. The process can be understood as broadcasting the scalar 4 to an array of shape (5,) that matches array.
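That intermediate "stretched" scalar can be made explicit with np.broadcast_to, which produces the read-only view that the multiplication conceptually uses, a small sketch:

```python
import numpy as np

array = np.arange(5)
# Broadcast the scalar 4 to the same shape as `array`;
# the result is a read-only view, no data is copied.
fours = np.broadcast_to(4, array.shape)
print(fours)          # [4 4 4 4 4]
print(array * fours)  # [ 0  4  8 12 16]  -- same as array * 4
```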

def t2():
    array = np.arange(12).reshape(4, 3)
    print("array is: ", array)

    print(array.mean(0))
    print(array.mean(0).shape)
    print(array.mean(1))
    print(array.mean(1).shape)
    array = array - array.mean(0)
    print("after broadcast, array is: ", array)
    array = array - array.mean(1)
    print("after broadcast, array is: ", array)

The output result is

array is:  [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
[4.5 5.5 6.5]
(3,)
[ 1.  4.  7. 10.]
(4,)
after broadcast, array is:  [[-4.5 -4.5 -4.5]
 [-1.5 -1.5 -1.5]
 [ 1.5  1.5  1.5]
 [ 4.5  4.5  4.5]]
Traceback (most recent call last):
  File "/Users/wanglei/wanglei/code/python/tfpractice/basic/Broadcast.py", line 81, in <module>
    t2()
  File "/Users/wanglei/wanglei/code/python/tfpractice/basic/Broadcast.py", line 29, in t2
    array = array - array.mean(1)
ValueError: operands could not be broadcast together with shapes (4,3) (4,) 

In the code above, mean(0) has shape (3,). When it operates with the array of shape (4, 3), the last dimension of mean(0) is 3, which equals the last dimension of (4, 3), so broadcasting succeeds. But mean(1) has shape (4,), which does not equal the last dimension of (4, 3), so it cannot be broadcast.
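Besides reshaping the result afterwards, the reduction itself can keep the reduced axis with keepdims=True, so the row means come out broadcast-compatible directly, a sketch:

```python
import numpy as np

array = np.arange(12).reshape(4, 3)
# keepdims=True leaves the reduced axis in place with length 1,
# so the result has shape (4, 1) instead of (4,).
row_means = array.mean(1, keepdims=True)
print(row_means.shape)    # (4, 1)
print(array - row_means)  # broadcasts cleanly over the columns
```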

def t3():
    array = np.arange(12).reshape(4, 3)
    print("array is: ", array)
    print(array.mean(0))
    print(array.mean(0).shape)
    print(array.mean(1))
    print(array.mean(1).shape)
    array = array - array.mean(0).reshape(1, 3)
    print("after broadcast, array is: ", array)

The results are as follows

array is:  [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
[4.5 5.5 6.5]
(3,)
[ 1.  4.  7. 10.]
(4,)
after broadcast, array is:  [[-4.5 -4.5 -4.5]
 [-1.5 -1.5 -1.5]
 [ 1.5  1.5  1.5]
 [ 4.5  4.5  4.5]]

def t4():
    array = np.arange(12).reshape(4, 3)
    print("array is: ", array)
    print(array.mean(1).reshape(4, 1))
    array = array - array.mean(1).reshape(4, 1)
    print("after broadcast, array is: ", array)

The results are as follows

array is:  [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
[[ 1.]
 [ 4.]
 [ 7.]
 [10.]]
after broadcast, array is:  [[-1.  0.  1.]
 [-1.  0.  1.]
 [-1.  0.  1.]
 [-1.  0.  1.]]

When mean(1) is reshaped to (4, 1), its first dimension matches the first dimension of the (4, 3) array and its second dimension is 1, so it broadcasts smoothly.
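An equivalent way to insert the length-1 axis is indexing with np.newaxis (an alias for None), which avoids hard-coding the shape in the reshape call:

```python
import numpy as np

array = np.arange(12).reshape(4, 3)
# array.mean(1) has shape (4,); [:, np.newaxis] turns it into (4, 1),
# just like reshape(4, 1) but without spelling out the sizes.
centered = array - array.mean(1)[:, np.newaxis]
print(centered)
```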

def t5():
    array = np.arange(24).reshape(2, 3, 4)
    print("in the beginning, array is: ", array)
    arrayb = np.arange(12).reshape(3, 4)
    print("arrayb is: ", arrayb)
    array = array - arrayb
    print("after broadcast, array is: ", array)

The result of the operation is

in the beginning, array is:  [[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
arrayb is:  [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
after broadcast, array is:  [[[ 0  0  0  0]
  [ 0  0  0  0]
  [ 0  0  0  0]]

 [[12 12 12 12]
  [12 12 12 12]
  [12 12 12 12]]]

The array above has shape (2, 3, 4) and arrayb has shape (3, 4), so broadcasting is equivalent to copying arrayb along the shape[0] dimension.
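This is rule 1 in action: the shape (3, 4) is first treated as (1, 3, 4) and then stretched along the new leading axis. np.broadcast_to makes the intermediate array explicit:

```python
import numpy as np

arrayb = np.arange(12).reshape(3, 4)
# (3, 4) is implicitly padded to (1, 3, 4), then stretched to (2, 3, 4).
stretched = np.broadcast_to(arrayb, (2, 3, 4))
print(stretched.shape)                       # (2, 3, 4)
# Both "copies" along axis 0 are views of the same data:
print((stretched[0] == stretched[1]).all())  # True
```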

def t6():
    array = np.arange(24).reshape(2, 3, 4)
    print("in the beginning, array is: ", array)
    arrayb = np.arange(8).reshape(2, 1, 4)
    print("arrayb is: ", arrayb)
    array = array - arrayb
    print("after broadcast, array is: ", array)


def t7():
    array = np.arange(24).reshape(2, 3, 4)
    print("in the beginning, array is: ", array)
    arrayb = np.arange(6).reshape(2, 3, 1)
    print("arrayb is: ", arrayb)
    array = array - arrayb
    print("after broadcast, array is: ", array)

The result of the operation is

in the beginning, array is:  [[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
arrayb is:  [[[0 1 2 3]]

 [[4 5 6 7]]]
after broadcast, array is:  [[[ 0  0  0  0]
  [ 4  4  4  4]
  [ 8  8  8  8]]

 [[ 8  8  8  8]
  [12 12 12 12]
  [16 16 16 16]]]
in the beginning, array is:  [[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
arrayb is:  [[[0]
  [1]
  [2]]

 [[3]
  [4]
  [5]]]
after broadcast, array is:  [[[ 0  1  2  3]
  [ 3  4  5  6]
  [ 6  7  8  9]]

 [[ 9 10 11 12]
  [12 13 14 15]
  [15 16 17 18]]]

The two cases above broadcast along the shape[1] and shape[2] dimensions respectively: in t6 arrayb has length 1 along axis 1, and in t7 along axis 2.
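Rule 4 from the documentation mentions that the stride is 0 along a broadcast dimension; this can be observed directly on a broadcast view of t6's arrayb:

```python
import numpy as np

arrayb = np.arange(8).reshape(2, 1, 4)
view = np.broadcast_to(arrayb, (2, 3, 4))
# Along the stretched middle axis the stride is 0: stepping there
# revisits the same memory instead of advancing through the buffer.
print(view.strides)
```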

4. Summary

Combining the above examples, we summarize the basic principles of broadcasting:
1. The shapes of the two arrays are compared dimension by dimension, starting from the trailing end.
2. Two dimensions are compatible if they are equal or if one of them is 1.
3. A missing leading dimension is treated as if it were 1.
4. Broadcasting replicates the data along the missing dimensions and along the dimensions of length 1.
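As a self-check, the four principles can be sketched as a small pure-Python function (the helper name broadcast_shape is ours, not NumPy's) that mirrors what NumPy does internally:

```python
def broadcast_shape(a, b):
    """Compute the broadcast result shape of two shapes, or raise."""
    a, b = list(a), list(b)
    # Principle 3: pad the shorter shape with leading 1s.
    while len(a) < len(b):
        a.insert(0, 1)
    while len(b) < len(a):
        b.insert(0, 1)
    out = []
    # Principles 1-2: compare axis by axis; sizes must be equal
    # or one of them must be 1, and the output takes the larger.
    for x, y in zip(a, b):
        if x == y or x == 1 or y == 1:
            out.append(max(x, y))
        else:
            raise ValueError(f"cannot broadcast {x} with {y}")
    return tuple(out)

print(broadcast_shape((2, 3, 4), (3, 4)))  # (2, 3, 4)
print(broadcast_shape((4, 3), (3,)))       # (4, 3)
```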


Origin blog.csdn.net/bitcarmanlee/article/details/108566910