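The code here reproduces examples from the Python Data Science Handbook (《Python数据科学手册》).
It comes from a K-Lab project on Heywhale (和鲸科技, formerly Kesci/科赛).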

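Contents

Slow loops
Introducing ufuncs
Exploring ufuncs

Array arithmetic
Absolute value
Trigonometric functions
Exponents and logarithms

Specialized ufuncs
Advanced ufunc features
Aggregates
Outer products
Min, max, and other aggregates
Minimum and maximum
Multidimensional aggregates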

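Universal functions
Computation on NumPy arrays can be very fast or very slow. The key to making it fast is vectorization, which is generally implemented through NumPy's universal functions (ufuncs).

Slow loops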

import numpy as np
np.random.seed(0)

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output
values = np.random.randint(1, 10, size = 5)
compute_reciprocals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

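Introducing ufuncs

For many kinds of operations, NumPy provides a convenient interface to statically typed, compiled routines. This is known as a vectorized operation. Compare the following two results: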

print(compute_reciprocals(values))
print(1.0 / values)

[0.16666667 1.         0.25       0.25       0.125     ]
[0.16666667 1.         0.25       0.25       0.125     ]
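To make the speed difference concrete, here is a rough sketch that times both versions with the standard-library timeit module. The array size and repeat count are arbitrary choices for illustration, and the absolute timings depend on the machine.

```python
import timeit

import numpy as np


def compute_reciprocals(values):
    """Reciprocal of each element, computed with an explicit Python loop."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output


rng = np.random.default_rng(0)
big = rng.integers(1, 100, size=100_000)  # size chosen arbitrarily

# Time a few runs of each approach; numbers vary from machine to machine.
loop_time = timeit.timeit(lambda: compute_reciprocals(big), number=3)
vector_time = timeit.timeit(lambda: 1.0 / big, number=3)

print(f"loop:       {loop_time:.4f} s")
print(f"vectorized: {vector_time:.4f} s")

# Both approaches produce the same values.
same = np.allclose(compute_reciprocals(big), 1.0 / big)
print(same)
```

The loop pays Python's per-element overhead (type checks and function dispatch) on every iteration, while the vectorized expression pushes the whole loop into compiled code.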

Ufuncs are not limited to an array and a scalar; they can also operate between two arrays:

np.arange(5) / np.arange(1, 6)

array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ])

They also work on multidimensional arrays:

x = np.arange(16).reshape((4, 4))
2 ** x

array([[    1,     2,     4,     8],
       [   16,    32,    64,   128],
       [  256,   512,  1024,  2048],
       [ 4096,  8192, 16384, 32768]])

Exploring ufuncs

Array arithmetic

x = np.arange(4)
print("x     =", x)
print("x + 5 =", x + 5)
print("x - 5 =", x - 5)
print("x * 2 =", x * 2)
print("x / 2 =", x / 2)
print("x // 2 =", x // 2)

x     = [0 1 2 3]
x + 5 = [5 6 7 8]
x - 5 = [-5 -4 -3 -2]
x * 2 = [0 2 4 6]
x / 2 = [0.  0.5 1.  1.5]
x // 2 = [0 0 1 1]

There is also a unary ufunc for negation, along with the ** operator for exponentiation and the % operator for modulus:

print("-x     =", -x)
print("x ** 2 =", x ** 2)
print("x % 2  =", x % 2)

-x     = [ 0 -1 -2 -3]
x ** 2 = [0 1 4 9]
x % 2  = [0 1 0 1]

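Wrappers: each of these operators is a thin wrapper around a NumPy function. For example, + wraps np.add: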

np.add(x, 3)

array([3, 4, 5, 6])

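Arithmetic operators implemented in NumPy

Operator	Equivalent ufunc	Description
+	np.add	Addition
-	np.subtract	Subtraction
-	np.negative	Unary negation
*	np.multiply	Multiplication
/	np.divide	Division
//	np.floor_divide	Floor division
**	np.power	Exponentiation
%	np.mod	Modulus (remainder)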
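As a quick check of the table, the following sketch compares each operator with its ufunc counterpart. The dictionary `pairs` and the flag `all_match` are names introduced here only for illustration.

```python
import numpy as np

x = np.arange(4)

# Left: the operator form. Right: the equivalent ufunc call.
pairs = {
    "+": (x + 5, np.add(x, 5)),
    "-": (x - 5, np.subtract(x, 5)),
    "neg": (-x, np.negative(x)),
    "*": (x * 2, np.multiply(x, 2)),
    "/": (x / 2, np.divide(x, 2)),
    "//": (x // 2, np.floor_divide(x, 2)),
    "**": (x ** 2, np.power(x, 2)),
    "%": (x % 2, np.mod(x, 2)),
}

# Every operator result should equal the result of its ufunc.
all_match = all(np.array_equal(a, b) for a, b in pairs.values())
print(all_match)
```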

Absolute value

NumPy arrays also work with Python's built-in absolute value function:

x = np.array([-2, -1, 0, 1, 2])
abs(x)

array([2, 1, 0, 1, 2])

The corresponding NumPy ufunc is np.absolute, which is also available under the alias np.abs:

np.absolute(x)

array([2, 1, 0, 1, 2])

np.abs(x)

array([2, 1, 0, 1, 2])

This ufunc also handles complex data, in which case the absolute value it returns is the magnitude (modulus) of each complex number:

x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
np.abs(x)

array([5., 5., 2., 1.])
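The magnitude of a complex number a + bj is sqrt(a^2 + b^2). As a small sketch, the values above can be reproduced by hand from the real and imaginary parts; the names `z` and `manual` are introduced here for illustration.

```python
import numpy as np

z = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])

# Magnitude computed directly from the real and imaginary parts.
manual = np.sqrt(z.real ** 2 + z.imag ** 2)

print(manual)
print(np.allclose(manual, np.abs(z)))
```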

Trigonometric functions

theta = np.linspace(0, np.pi, 3)
theta

array([0.        , 1.57079633, 3.14159265])

print("theta      =", theta)
print("sin(theta) =", np.sin(theta))
print("cos(theta) =", np.cos(theta))
print("tan(theta) =", np.tan(theta))

theta      = [0.         1.57079633 3.14159265]
sin(theta) = [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) = [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) = [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]
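The values are computed to machine precision, which is why entries that should be exactly zero show up as tiny numbers such as 1.2246468e-16. A minimal sketch of how to compare such results with a tolerance instead of exact equality, using np.isclose and np.allclose with their default tolerances:

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)
sin_theta = np.sin(theta)

# Exact comparison fails because sin(pi) is only approximately zero in floating point.
exactly_zero = bool(sin_theta[2] == 0.0)

# Tolerance-based comparison treats the round-off error as zero.
close_to_zero = bool(np.isclose(sin_theta[2], 0.0))
matches_expected = np.allclose(sin_theta, [0.0, 1.0, 0.0])

print(exactly_zero, close_to_zero, matches_expected)
```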

Exponents and logarithms

x = [1, 2, 3]
print("x    =", x)
print("e^x  =", np.exp(x))
print("2^x  =", np.exp2(x))
print("3^x  =", np.power(3, x))

x    = [1, 2, 3]
e^x  = [ 2.71828183  7.3890561  20.08553692]
2^x  = [2. 4. 8.]
3^x  = [ 3  9 27]

x = [1, 2, 3, 4, 10]
print("x        =", x)
print("ln(x)    =", np.log(x))
print("log2(x)  =", np.log2(x))
print("log10(x) =", np.log10(x))

x        = [1, 2, 3, 4, 10]
ln(x)    = [0.         0.69314718 1.09861229 1.38629436 2.30258509]
log2(x)  = [0.         1.         1.5849625  2.         3.32192809]
log10(x) = [0.         0.30103    0.47712125 0.60205999 1.        ]

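For very small inputs, the specialized versions np.expm1 and np.log1p preserve more precision than computing np.exp(x) - 1 or np.log(1 + x) directly: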

x = [0, 0.001, 0.01, 0.1]
print("exp(x) - 1 =", np.expm1(x))
print("log(1 + x) =", np.log1p(x))

exp(x) - 1 = [0.         0.0010005  0.01005017 0.10517092]
log(1 + x) = [0.         0.0009995  0.00995033 0.09531018]
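The benefit becomes visible for inputs much smaller than those above. The sketch below uses x = 1e-10 (an arbitrary choice) and compares both approaches against the first two terms of the Taylor series, which are accurate to double precision at this scale.

```python
import numpy as np

x = 1e-10

# Reference values from the Taylor series; higher-order terms are
# far below double precision for an input this small.
exp_ref = x + x ** 2 / 2
log_ref = x - x ** 2 / 2

naive_exp = np.exp(x) - 1      # loses digits: exp(x) is rounded near 1.0 first
precise_exp = np.expm1(x)      # computes exp(x) - 1 without that rounding step

naive_log = np.log(1 + x)      # loses digits: 1 + x is rounded first
precise_log = np.log1p(x)      # computes log(1 + x) without forming 1 + x

print("relative error, exp(x) - 1:", abs(naive_exp - exp_ref) / exp_ref)
print("relative error, expm1(x):  ", abs(precise_exp - exp_ref) / exp_ref)
print("relative error, log(1 + x):", abs(naive_log - log_ref) / log_ref)
print("relative error, log1p(x):  ", abs(precise_log - log_ref) / log_ref)
```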

Specialized ufuncs

from scipy import special
# Gamma function (generalized factorial) and related functions
x = [1, 5, 10]
print("gamma(x)     =", special.gamma(x))
print("ln|gamma(x)| =", special.gammaln(x))
print("beta(x, 2)   =", special.beta(x, 2))

gamma(x)     = [1.0000e+00 2.4000e+01 3.6288e+05]
ln|gamma(x)| = [ 0.          3.17805383 12.80182748]
beta(x, 2)   = [0.5        0.03333333 0.00909091]

# Error function, its complement, and its inverse
x = np.array([0, 0.3, 0.7, 1.0])
print("erf(x)    =", special.erf(x))
print("erfc(x)   =", special.erfc(x))
print("erfinv(x) =", special.erfinv(x))

erf(x)    = [0.         0.32862676 0.67780119 0.84270079]
erfc(x)   = [1.         0.67137324 0.32219881 0.15729921]
erfinv(x) = [0.         0.27246271 0.73286908        inf]

Advanced ufunc features

Specifying output

x = np.arange(5)
y = np.empty(5)
print(x)
print(y)
np.multiply(x, 10, out = y)
print(y)

[0 1 2 3 4]
[0.0e+000 4.9e-324 9.9e-324 1.5e-323 2.0e-323]
[ 0. 10. 20. 30. 40.]

y = np.zeros(10)
print(y)
np.power(2, x, out = y[::2])
print(y)

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 1.  0.  2.  0.  4.  0.  8.  0. 16.  0.]
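The point of the out argument is to write results straight into existing memory. The sketch below contrasts it with plain assignment, which produces the same values but first builds a temporary array and then copies it. The names `z`, `w`, and `result` are introduced here for illustration.

```python
import numpy as np

x = np.arange(5)

# The ufunc returns the very array that was passed as out.
y = np.empty(5)
result = np.multiply(x, 10, out=y)
print(result is y)

# Assignment: 2 ** x creates a temporary array, then copies it into the view.
z = np.zeros(10)
z[::2] = 2 ** x

# out=: the results are written directly into the view, with no temporary.
w = np.zeros(10)
np.power(2, x, out=w[::2])

print(np.array_equal(z, w))
```

For arrays this small the difference is negligible; the memory savings only matter for very large arrays.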

Aggregates

x = np.arange(1, 6)
print(x)
np.add.reduce(x)

[1 2 3 4 5]
15

np.multiply.reduce(x)

120

To keep all the intermediate results of the computation, use accumulate instead:

print(np.add.accumulate(x))
print(np.multiply.accumulate(x))

[ 1  3  6 10 15]
[  1   2   6  24 120]
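For these particular cases NumPy also has dedicated functions (np.sum, np.prod, np.cumsum, np.cumprod) that compute the same results. The following sketch checks the correspondence and points out one difference in defaults on a 2-D array; the array `M` is a made-up example.

```python
import numpy as np

x = np.arange(1, 6)

# reduce and accumulate agree with the dedicated aggregation functions.
same_sum = np.add.reduce(x) == np.sum(x)
same_prod = np.multiply.reduce(x) == np.prod(x)
same_cumsum = np.array_equal(np.add.accumulate(x), np.cumsum(x))
same_cumprod = np.array_equal(np.multiply.accumulate(x), np.cumprod(x))
print(same_sum, same_prod, same_cumsum, same_cumprod)

# On a 2-D array the defaults differ: reduce works along axis 0,
# while np.sum collapses every axis unless told otherwise.
M = np.arange(6).reshape(2, 3)
column_totals = np.add.reduce(M)
grand_total = np.sum(M)
print(column_totals, grand_total)
```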

Outer products

x = np.arange(1, 6)
np.multiply.outer(x, x)

array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
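The outer method is available on any binary ufunc, not only np.multiply. As a small sketch, the multiplication table above can also be obtained through broadcasting, and np.add.outer builds the matching addition table; the names `table`, `via_broadcast`, and `sums` are introduced here for illustration.

```python
import numpy as np

x = np.arange(1, 6)

# Multiplication table via the ufunc's outer method.
table = np.multiply.outer(x, x)

# The same table via broadcasting a column vector against a row vector.
via_broadcast = x[:, np.newaxis] * x

# outer works for other binary ufuncs too, e.g. an addition table.
sums = np.add.outer(x, x)

print(np.array_equal(table, via_broadcast))
print(sums)
```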

Min, max, and other aggregates

Summing the values in an array

import numpy as np

L = np.random.random(100)
print(L)
sum(L)

[0.99582517 0.12262206 0.87372235 0.54930356 0.22098344 0.42400088
 0.75555836 0.14429492 0.14954931 0.0836442  0.1971993  0.2737172
 0.80664559 0.12795214 0.74832818 0.15328873 0.64007825 0.56112099
 0.99771693 0.59142874 0.1258379  0.26427913 0.21400439 0.56670611
 0.03711501 0.77855492 0.12333906 0.97831986 0.91493149 0.48018112
 0.64199802 0.72634578 0.76189613 0.73617636 0.27554977 0.51399161
 0.31250207 0.51614311 0.33375313 0.07894331 0.05119731 0.93837673
 0.47768444 0.78235034 0.12059267 0.75252218 0.986168   0.31698481
 0.07241729 0.09302211 0.1062065  0.65226978 0.63679941 0.56501659
 0.50732646 0.74612829 0.551229   0.75045644 0.11738258 0.85625695
 0.14358165 0.48963091 0.5616225  0.20271625 0.48569236 0.08226467
 0.8402376  0.21585936 0.62580422 0.09991539 0.43570458 0.54809679
 0.58970373 0.58213233 0.62527206 0.25535607 0.360616   0.78876727
 0.45002187 0.86374775 0.22482424 0.82505022 0.41668365 0.3928502
 0.68689608 0.43244067 0.70490621 0.01694    0.22488122 0.64832461
 0.24518352 0.51967699 0.62710206 0.20753252 0.75102491 0.00642055
 0.05857505 0.3295187  0.4754157  0.71728071]
46.63620765216888

np.sum(L)

46.63620765216889

big_array = np.random.rand(1000000)
%timeit sum(big_array)
%timeit np.sum(big_array)

96.9 ms ± 527 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
442 µs ± 1.81 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
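Speed is not the only difference: the built-in sum and np.sum are not interchangeable on multidimensional arrays. In the sketch below, with a small made-up array `M`, the built-in iterates over the rows and adds them together, whereas np.sum adds every element.

```python
import numpy as np

M = np.arange(12).reshape(3, 4)

# Built-in sum iterates over the first axis, so it adds the rows
# elementwise and returns one total per column.
builtin_total = sum(M)

# np.sum is aware of array dimensions and sums all elements by default.
numpy_total = np.sum(M)

print(builtin_total)
print(numpy_total)
```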

Minimum and maximum

min(big_array), max(big_array)

(1.890489775835391e-07, 0.9999993031657582)

np.min(big_array), np.max(big_array)

(1.890489775835391e-07, 0.9999993031657582)

%timeit min(big_array)
%timeit np.min(big_array)

75.7 ms ± 575 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
360 µs ± 1.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

print(big_array.min(), big_array.max(), big_array.sum())

1.890489775835391e-07 0.9999993031657582 499766.70505024446

Multidimensional aggregates

M = np.random.random((3, 4))
print(M)

[[0.30954309 0.43679222 0.86953481 0.11957794]
 [0.56586598 0.44348423 0.66370113 0.6035834 ]
 [0.29607204 0.72450252 0.44696634 0.6116325 ]]

M.sum()

6.091256195578881

Find the minimum of each column:

M.min(axis=0)

array([0.29607204, 0.43679222, 0.44696634, 0.11957794])

Find the maximum of each row:

M.max(axis=1)

array([0.86953481, 0.66370113, 0.72450252])
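The axis argument names the dimension that gets collapsed, not the one that is returned: axis=0 collapses the rows and leaves one value per column, while axis=1 collapses the columns and leaves one value per row. A minimal sketch with a small made-up integer array, which makes the results easy to verify by eye:

```python
import numpy as np

M = np.array([[1, 5, 3, 0],
              [4, 2, 8, 6],
              [7, 9, 1, 2]])

col_min = M.min(axis=0)   # collapse the 3 rows: one value per column
row_max = M.max(axis=1)   # collapse the 4 columns: one value per row

# keepdims=True keeps the collapsed axis with length 1,
# which is convenient for broadcasting against the original array.
row_sum = M.sum(axis=1, keepdims=True)

print(col_min, col_min.shape)
print(row_max, row_max.shape)
print(row_sum.shape)
```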

Getting Started with NumPy (3): Universal Functions