Are you sure you understand NumPy's broadcasting mechanism correctly?

Introduction

NumPy is a fundamental data analysis toolkit in Python that provides a large number of commonly used numerical functions. Most of these functions depend on its core data structure, the ndarray, an N-dimensional array. An important feature of the ndarray is the broadcasting mechanism, and it is precisely this mechanism that makes NumPy's numerical functions richer and more powerful. So the question is: have you understood broadcasting correctly?

This article is excerpted from a detailed introductory NumPy tutorial.

Broadcasting is an important feature of NumPy. It applies when certain numerical calculations are performed on ndarrays (here meaning element-wise calculations between arrays, where elements at corresponding positions are combined one to one by scalar operations, rather than matrix operations in the linear-algebra sense). When the shapes of the arrays are not exactly the same, the broadcasting mechanism automatically expands them to a common shape and then performs the calculation.
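To make the distinction between element-wise operations and matrix operations concrete, here is a minimal sketch. The array values and variable names are my own, chosen only for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
b = np.array([10, 20, 30])     # shape (3,)

# Element-wise addition: b is broadcast across both rows of a.
elementwise_sum = a + b
print(elementwise_sum.shape)   # (2, 3)

# Element-wise multiplication also broadcasts; it is not a matrix product.
elementwise_product = a * b
print(elementwise_product)

# The matrix product, by contrast, follows linear-algebra rules.
matrix_product = a @ b
print(matrix_product.shape)    # (2,)
```

Only the first two operations involve broadcasting; the `@` operator follows the matrix-multiplication rules instead.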

Of course, broadcasting is conditional: it does not happen automatically for any pair of arrays with different shapes. Understanding this condition is the key to understanding the broadcasting mechanism.

To find out what the restrictions are, we turn to NumPy's official documentation. If you open the doc folder in the NumPy source code, you will see a numpy/doc/broadcasting.py file, which consists almost entirely of documentation text. It contains a paragraph stating the condition under which broadcasting is allowed.

The condition is very simple. The comparison starts from the last dimension of the two arrays. If the two sizes are equal, or one of them is 1, that dimension can be broadcast. When the sizes are equal there is of course nothing to broadcast, so strictly speaking broadcasting only ever stretches a dimension from 1 to N. If the current dimension meets the requirement, move one dimension forward in both arrays and continue the comparison until one of the arrays runs out of dimensions. Any dimensions left over in the other array do not matter: they are simply carried over to the result.
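The rule described above can be sketched in a few lines of plain Python. The helper `broadcast_shape` is a hypothetical name of my own, not a NumPy function; it is only meant to mirror the trailing-dimension comparison, and the result is cross-checked against what NumPy itself produces.

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: return the broadcast shape or raise ValueError."""
    result = []
    # Walk both shapes from the last dimension forward.
    # A shape that runs out of dimensions is treated as having size 1 there.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))


shape = broadcast_shape((4, 1, 3), (5, 1))
print(shape)  # (4, 5, 3)

# Cross-check against the shape NumPy actually produces.
numpy_shape = (np.zeros((4, 1, 3)) + np.zeros((5, 1))).shape
print(shape == numpy_shape)  # True
```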

To make this condition concrete, the following cases all satisfy the broadcast condition:
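As an illustrative sketch, the shape pairs below are ones I picked to exercise the rule (treat them as assumptions rather than a verbatim copy of the NumPy file). Each pair is added together and the resulting shape is printed.

```python
import numpy as np

# (A shape, B shape) pairs that are compatible under the trailing-dimension rule.
compatible = [
    ((256, 256, 3), (3,)),
    ((8, 1, 6, 1), (7, 1, 5)),
    ((5, 4), (1,)),
    ((15, 3, 5), (15, 1, 5)),
    ((15, 3, 5), (3, 1)),
]

result_shapes = []
for shape_a, shape_b in compatible:
    result = np.zeros(shape_a) + np.zeros(shape_b)
    result_shapes.append(result.shape)
    print(shape_a, "+", shape_b, "->", result.shape)
```

In every pair, each aligned dimension is either equal in both arrays or has size 1 in one of them, so the addition succeeds.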

The following cases, by contrast, cannot be broadcast:
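A minimal sketch of the failing case, again with shapes chosen by me for illustration: in each pair some aligned dimension differs and neither side is 1, so NumPy raises a `ValueError`.

```python
import numpy as np

incompatible = [
    ((3,), (4,)),          # last dimensions 3 and 4 differ, neither is 1
    ((2, 1), (8, 4, 3)),   # second-to-last dimensions 2 and 4 differ
]

errors = []
for shape_a, shape_b in incompatible:
    try:
        np.zeros(shape_a) + np.zeros(shape_b)
    except ValueError as exc:
        errors.append((shape_a, shape_b))
        print(shape_a, "+", shape_b, "failed:", exc)
```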

The original examples for both cases are taken from the numpy/doc/broadcasting.py file mentioned above. The doc package contains a good deal of other documentation as well, which is very helpful for understanding in depth how NumPy works.

Going further: you may wonder why broadcasting is limited to stretching 1 to N. Would it not also be "reasonable" to broadcast any factor of N (such as N/2 or N/3) to N? I was confused by this too. My understanding is that this is "reasonable" only at the mathematical level, and often stops being reasonable once you consider the meaning behind the arrays. Suppose the sizes of two arrays along the same dimension are 2 and 12. If 2 were broadcast to 12, how should the result be interpreted? Should the values alternate between odd and even positions? And what about broadcasting 3 to 12, or 4 to 12? In the end there is no natural interpretation. That is why NumPy requires that a dimension either be 1 (and be broadcast to N) or be equal in both arrays.
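One way to see the ambiguity is to stretch a length-2 array to length 12 by hand. The sketch below is my own illustration, with made-up values: `np.tile` and `np.repeat` are two equally plausible ways to do it, and they give different results, which is presumably why NumPy does not pick one implicitly.

```python
import numpy as np

short_arr = np.array([1, 2])   # length 2
long_arr = np.zeros(12)        # length 12

# NumPy refuses to guess how 2 should be stretched to 12.
try:
    short_arr + long_arr
    implicit_ok = True
except ValueError:
    implicit_ok = False

# Two equally plausible explicit ways to stretch length 2 to length 12:
tiled = np.tile(short_arr, 6)       # alternating: 1, 2, 1, 2, ...
repeated = np.repeat(short_arr, 6)  # in blocks: six 1s, then six 2s

print(implicit_ok)                      # False
print(np.array_equal(tiled, repeated))  # False
```

If you do need one of these behaviours, stating it explicitly with `np.tile` or `np.repeat` keeps the intent visible in the code.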

In fact, this is not unique to NumPy: tensors in PyTorch and TensorFlow have a similar broadcasting mechanism.

Origin blog.csdn.net/weixin_43841688/article/details/119861034