Disclaimer: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/a19990412/article/details/90745159
Brief
The use case needs no introduction; it is very common.
- The same task on a DataFrame is done differently. Note the contrast.
- Pandas.DataFrame: computing percentages by row (proportions)
Problem
- Suppose we have the data A:
>>> A
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
How do we turn each number into its share of its row? Just divide by the row sums directly?
>>> A / A.sum(axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (4,5) (4,)
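The error comes from how numpy lines shapes up: it compares them from the trailing axis, so (4, 5) against (4,) pits 5 against 4. A minimal sketch of this, assuming the same A as above (the names row_sums, col_sums and col_share are only illustrative):

```python
import numpy as np

A = np.arange(20).reshape(4, 5)

# Broadcasting lines shapes up from the right: (4, 5) against (4,)
# compares 5 with 4, so dividing by the row sums is rejected.
row_sums = A.sum(axis=1)            # shape (4,)
try:
    A / row_sums
    row_division_failed = False
except ValueError:
    row_division_failed = True

# Column sums have shape (5,), which does match the trailing axis,
# so the same expression works for per-column shares.
col_sums = A.sum(axis=0)            # shape (5,)
col_share = A / col_sums            # shape (4, 5)

print(row_division_failed)
print(col_share.sum(axis=0))        # each column's shares add up to 1
```

This is also why the mistake is easy to miss on a square array: the shapes would match and numpy would divide by the wrong sums without raising anything.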
Solution
- numpy generally operates either elementwise on arrays of the same shape, or between a single value (more accurately understood as a vector of length 1) and a vector.
- So we use np.newaxis to turn the row sums into a numpy.array of shape (4, 1), which can be broadcast against A's shape (4, 5).
>>> A / A.sum(axis=1)[:, np.newaxis]
array([[0.        , 0.1       , 0.2       , 0.3       , 0.4       ],
       [0.14285714, 0.17142857, 0.2       , 0.22857143, 0.25714286],
       [0.16666667, 0.18333333, 0.2       , 0.21666667, 0.23333333],
       [0.17647059, 0.18823529, 0.2       , 0.21176471, 0.22352941]])
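np.newaxis is not the only way to get a (4, 1) column of row sums. The sketch below compares it with two other spellings that should give the same result, keepdims=True and reshape(-1, 1); the variable names are only illustrative and are not from the original post.

```python
import numpy as np

A = np.arange(20).reshape(4, 5)

# Three ways to produce a (4, 1) column of row sums before dividing.
via_newaxis = A / A.sum(axis=1)[:, np.newaxis]
via_keepdims = A / A.sum(axis=1, keepdims=True)   # sum keeps the reduced axis
via_reshape = A / A.sum(axis=1).reshape(-1, 1)    # -1 lets numpy infer 4

# Every row of the result should now add up to 1.
print(via_keepdims.sum(axis=1))
```

keepdims=True is often the least error-prone choice, since the sum already comes back in a shape that broadcasts against A.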
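What is A.sum(axis=1)[:, np.newaxis]? It is the vector of row sums turned into a column. During the division, numpy treats that column as if it had been copied once for every column of A.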
>>> A.sum(axis=1)
array([10, 35, 60, 85])
>>> A.sum(axis=1)[:, np.newaxis]
array([[10],
       [35],
       [60],
       [85]])
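To make the "copied many times" picture concrete, the sketch below expands the column explicitly. It assumes np.broadcast_to and np.repeat as stand-ins for what broadcasting does conceptually; numpy does not need to materialize these copies when evaluating A / col.

```python
import numpy as np

A = np.arange(20).reshape(4, 5)
col = A.sum(axis=1)[:, np.newaxis]            # shape (4, 1)

# A read-only view that looks like the column repeated 5 times.
expanded = np.broadcast_to(col, A.shape)      # shape (4, 5)

# A real copy with the column physically repeated along axis 1.
copied = np.repeat(col, A.shape[1], axis=1)   # shape (4, 5)

same_values = np.array_equal(expanded, copied)
same_result = np.array_equal(A / col, A / copied)

print(expanded)
print(same_values, same_result)
```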