Advanced NumPy tutorial: np.where and np.piecewise

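Earlier posts in this series already covered a good deal of NumPy, including the advanced functions select and choose. Readers who need them can find a detailed introduction to both on my blog, or through the WeChat official account linked there. Today's post looks at np.where and np.piecewise, as well as masked arrays.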

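I. np.where() in detail

As the name suggests, this function also performs a kind of lookup: where asks "in which positions", much as select and choose, covered earlier, mean "pick". In practice, where offers a more advanced form of lookup.

1. Definition of where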

numpy.where(condition[, x, y])

If only condition is given, return condition.nonzero().

Parameters:

condition : array_like, bool

When True, yield x, otherwise yield y.

x, y : array_like, optional

Values from which to choose. x, y and condition need to be broadcastable to some shape.

Returns:

out : ndarray or tuple of ndarrays

If both x and y are specified, the output array contains elements of x where condition is True, and elements from y elsewhere.

If only condition is given, return the tuple condition.nonzero(), the indices where condition is True.

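Explanation:

The function takes one required argument, condition. It must be array_like (an ndarray, or something convertible to one, such as a nested list), and its elements are interpreted as True or False.

x and y are optional and must be given together: where the condition is True the result takes the element from x, and where it is False it takes the element from y. Note that condition, x and y must be broadcastable to a common shape.

Return value: either an ndarray or a tuple of ndarrays. With condition alone, a tuple of index arrays is returned; with all three arguments, a single array is returned. Both cases are covered in detail below.

Summary: np.where is a high-level wrapper around conditional selection. With three arguments it is the vectorized version of the ternary expression x if condition else y, taking one boolean array and two value arrays. With condition alone, what it returns is the "position", "index" or "coordinates" of the matching elements. These three words mean the same thing here: put simply, the return value tells you where the elements are.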
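
To make the ternary analogy concrete, here is a minimal sketch. The arrays cond, x and y are made-up sample data, not taken from the original post. It compares an element-by-element x if c else y with the three-argument call:

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
cond = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Pure-Python version: one ternary expression per element
by_hand = np.array([xv if c else yv for c, xv, yv in zip(cond, x, y)])

# Vectorized version: the same selection in a single call
vectorized = np.where(cond, x, y)

print(vectorized)                           # [ 1 20  3 40]
print(np.array_equal(by_hand, vectorized))  # True
```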

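2. Examples with only the condition argument

(1) A one-dimensional condition

import numpy as np

a = np.array([2,4,6,8,10])
a_where=np.where(a > 5)             # returns indices; a > 5 is itself a boolean array
print(a_where)
print(a[np.where(a > 5)])        # equivalent to a[a > 5], i.e. boolean indexing

b=np.array([False,True,False,True,False])
print(np.where(b))

Output:

(array([2, 3, 4], dtype=int64),) # 2, 3 and 4 are the indices of the elements greater than 5 (True), i.e. the 3rd, 4th and 5th elements, returned as an array
[ 6 8 10]
(array([1, 3], dtype=int64),) # the indices of the True elements, i.e. the 2nd and 4th elements, returned as an array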

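(2) A two-dimensional condition

a=np.array([[1,2,3,4,5],[2,3,4,5,6],[3,4,5,6,7],[4,5,6,7,8]])  # 2-D array
a_where=np.where(a>3)
print(a_where)
print(a[a_where])

Output:

(array([0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3], dtype=int64), array([3, 4, 2, 3, 4, 1, 2, 3, 4, 0, 1, 2, 3, 4], dtype=int64))
[4 5 4 5 6 4 5 6 7 4 5 6 7 8]

The first result is a tuple. Its first element is an array holding the row indices of the elements that satisfy the condition: two in row 0, three in row 1, four in row 2 and five in row 3. Its second element is also an array, holding the column indices of those same elements. Reading the two arrays in parallel gives one (row, column) pair per match.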
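
As a small sketch of how the two index arrays line up, they can be zipped into explicit (row, column) pairs. The variable names rows, cols and coords are illustrative only. np.argwhere returns the same coordinates as a single 2-D array:

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5],
              [2, 3, 4, 5, 6],
              [3, 4, 5, 6, 7],
              [4, 5, 6, 7, 8]])

rows, cols = np.where(a > 3)

# Pair up the two index arrays into (row, column) coordinates
coords = list(zip(rows.tolist(), cols.tolist()))
print(coords[:3])   # [(0, 3), (0, 4), (1, 2)]

# np.argwhere gives the same coordinates, one row per matching element
pairs = np.argwhere(a > 3)
print(pairs.shape)  # (14, 2)
```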

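Another example:

b=np.where([[0, 3,2], [0,4, 0]])
print(b)

Output:

(array([0, 0, 1], dtype=int64), array([1, 2, 1], dtype=int64))

The result is again a tuple: the first array holds row indices and the second holds column indices. Note that no filter is written here. The argument to where is the nested list itself, not a boolean expression such as b>1 or b>2. What happens is that the input is converted to booleans first: every 0 is treated as False and every nonzero value (negative values included, not just positive ones) is treated as True. The call is therefore exactly equivalent to the following:

c=np.array([[False,True,True],[False,True,False]])
print(np.where(c))

Output: (array([0, 0, 1], dtype=int64), array([1, 2, 1], dtype=int64)) # identical to the result above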
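
The truthiness rule can be checked directly. The sketch below uses a made-up array d that includes a negative entry, and compares np.where(d) with an explicit d != 0 test:

```python
import numpy as np

# Hypothetical data with a negative value, to show that "nonzero" is what counts
d = np.array([[0, -3, 2],
              [0, 4, 0]])

implicit = np.where(d)        # relies on zero / nonzero truthiness
explicit = np.where(d != 0)   # spells the condition out

print(implicit[0].tolist(), implicit[1].tolist())   # [0, 0, 1] [1, 2, 1]
print(all(np.array_equal(i, e) for i, e in zip(implicit, explicit)))   # True
```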

(3) A higher-dimensional condition, using three dimensions as an example

a=[
    [
        [1,2,3],[2,3,4],[3,4,5]
    ],
    [
        [0,1,2],[1,2,3],[2,3,4]
    ]
]
a=np.array(a)
print(a.shape)  # shape is (2, 3, 3)
a_where=np.where(a>3)
print(a_where)

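Output:

(array([0, 0, 0, 1], dtype=int64),
 array([1, 2, 2, 2], dtype=int64),
 array([2, 1, 2, 2], dtype=int64))

As before, the result is a tuple. Its first element is an array of the indices along the first dimension of the elements that satisfy the condition, its second element holds their indices along the second dimension, and its third element holds their indices along the third dimension.

Summary: used this way, where returns the indices of the elements of an array that satisfy the condition (True). The return value is a tuple, each element of the tuple is an array, and each array holds the indices along one dimension.

When only the condition argument is given, np.where(condition) and condition.nonzero() are exactly equivalent.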
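
A quick check of that equivalence, as a sketch reusing the one-dimensional array from earlier. np.flatnonzero is shown as well, since it returns the flat indices directly instead of a tuple:

```python
import numpy as np

a = np.array([2, 4, 6, 8, 10])
cond = a > 5

from_where = np.where(cond)
from_nonzero = cond.nonzero()

# Both are tuples containing the same index arrays
same = all(np.array_equal(w, n) for w, n in zip(from_where, from_nonzero))
print(same)                   # True

# For a flat list of indices without the tuple wrapper
flat = np.flatnonzero(cond)
print(flat)                   # [2 3 4]
```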

3. Examples with all three arguments: condition, x and y

a = np.arange(10)
a_where=np.where(a,1,-1)
print(a_where)      

a_where_1=np.where(a > 5,1,-1)
print(a_where_1)

b=np.where([[True,False], [True,True]],    # first argument: condition
             [[1,2], [3,4]],               # second argument: x
             [[9,8], [7,6]]                # third argument: y
            )

print(b)

Output:

[-1 1 1 1 1 1 1 1 1 1]
[-1 -1 -1 -1 -1 -1 1 1 1 1]
[[1 8]
 [3 4]]

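In the first result, only the first element is 0, i.e. False, so it is replaced by -1. All the others are nonzero, i.e. True, so they are replaced by 1.

In the second result, the first six elements are False and are replaced by -1; the last four are True and are replaced by 1.

In the third result:

The first value, True, is at index (0,0). For True the result is taken from the second argument, where (0,0) holds 1.

The second value, False, is at index (0,1). For False the result is taken from the third argument, where (0,1) holds 8.

The third value, True, is at index (1,0). For True the result is taken from the second argument, where (1,0) holds 3.

The fourth value, True, is at index (1,1). For True the result is taken from the second argument, where (1,1) holds 4.

Summary: when using three arguments, condition, x and y must either have the same shape or be broadcastable to a common shape, otherwise an error is raised. The result is an ndarray (not a list) whose shape is the broadcast shape of the three inputs. When x and y are scalars or already match condition, that is simply the shape of condition.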
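
The broadcasting rule can be illustrated with a short sketch. The shapes and values below are made up for the example: x is a 1-D array stretched across rows, y is a scalar, and a deliberately incompatible shape triggers the error:

```python
import numpy as np

cond = np.array([[True, False, True],
                 [False, True, False]])   # shape (2, 3)
x = np.array([10, 20, 30])                # shape (3,), broadcast across both rows
y = -1                                    # scalar, broadcast everywhere

out = np.where(cond, x, y)
print(type(out).__name__, out.shape)      # ndarray (2, 3)
print(out)
# [[10 -1 30]
#  [-1 20 -1]]

# Shapes that cannot be broadcast together raise a ValueError
try:
    np.where(cond, np.array([1, 2]), y)   # shape (2,) against (2, 3)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```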

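Summary: the examples above show how useful np.where is. At its core it is a selection operation. Writing the same conditional logic by hand with if-else statements or list comprehensions is typically much slower on large arrays, so np.where is the recommended choice.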
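
For readers who want to measure the difference themselves, here is a rough timing sketch. The array size and repeat count are arbitrary assumptions, and the numbers depend entirely on the machine, so no expected output is shown:

```python
import timeit
import numpy as np

a = np.arange(100_000)   # arbitrary size, chosen only for the comparison

def with_comprehension():
    # Element-by-element Python loop over the array
    return [1 if v > 5 else -1 for v in a]

def with_where():
    # Single vectorized call
    return np.where(a > 5, 1, -1)

# Both approaches produce the same values
results_match = np.array_equal(np.array(with_comprehension()), with_where())

t_loop = timeit.timeit(with_comprehension, number=3)
t_where = timeit.timeit(with_where, number=3)
print(f"same result: {results_match}")
print(f"list comprehension: {t_loop:.4f} s, np.where: {t_where:.4f} s")
```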

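II. np.piecewise in detail

Like where, select and choose, np.piecewise is one of the more advanced functions, and it does a similar job: it selects elements according to a list of conditions and then applies an operation to the elements that satisfy each condition. The operation can be a named function, a lambda expression or a plain scalar, and the results are combined into a new array.

1. Definition of np.piecewise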

numpy.piecewise(x, condlist, funclist, *args, **kw)

Evaluate a piecewise-defined function.

Given a set of conditions and corresponding functions, evaluate each function on the input data wherever its condition is true.

Parameters:

x : ndarray or scalar

The input domain.

condlist : list of bool arrays or bool scalars

Each boolean array corresponds to a function in funclist. Wherever condlist[i] is True, funclist[i](x) is used as the output value.

Each boolean array in condlist selects a piece of x, and should therefore be of the same shape as x.

The length of condlist must correspond to that of funclist. If one extra function is given, i.e. if len(funclist) == len(condlist) + 1, then that extra function is the default value, used wherever all conditions are false.

funclist : list of callables, f(x,*args,**kw), or scalars

Each function is evaluated over x wherever its corresponding condition is True. It should take a 1d array as input and give an 1d array or a scalar value as output. If, instead of a callable, a scalar is provided then a constant function (lambda x: scalar) is assumed.

args : tuple, optional

Any further arguments given to piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., 1, 'a'), then each function is called as f(x, 1, 'a').

kw : dict, optional

Keyword arguments used in calling piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., alpha=1), then each function is called as f(x, alpha=1).

Returns:

out : ndarray

The output is the same shape and type as x and is found by calling the functions in funclist on the appropriate portions of x, as defined by the boolean arrays in condlist. Portions not covered by any condition have a default value of 0.

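Argument 1, x: the data to operate on.

Argument 2, condlist: the list of conditions to test. It may contain several conditions.

Argument 3, funclist: the list of operations. It is matched one-to-one with condlist: wherever a condition is True, the corresponding function is applied.

Return value: an ndarray with exactly the same shape as x (and, per the documentation quoted above, the same type).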
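
Two details from the quoted documentation are easy to miss: an extra entry in funclist acts as the default, and extra positional or keyword arguments are forwarded to every callable. The following is a minimal sketch of both; the helper name scale and its parameters are hypothetical, introduced only for this illustration:

```python
import numpy as np

x = np.arange(10)

# Three entries for two conditions: the last one is the default,
# used where neither condition holds (here x == 4 and x == 5)
out = np.piecewise(x, [x < 4, x >= 6], [-1, 1, lambda v: v * 10])
print(out)    # [-1 -1 -1 -1 40 50  1  1  1  1]

# Hypothetical helper: extra arguments given to piecewise are passed through
def scale(v, factor, offset=0):
    return v * factor + offset

out2 = np.piecewise(x, [x < 4, x >= 6], [scale, scale], 2, offset=1)
print(out2)   # [ 1  3  5  7  0  0 13 15 17 19]
```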

2. np.piecewise examples

(1) Example 1

x = np.arange(0,10)
print(x)
xx=np.piecewise(x, [x < 4, x >= 6], [-1, 1])
print(xx)

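Output:

[0 1 2 3 4 5 6 7 8 9]
[-1 -1 -1 -1 0 0 1 1 1 1]

That is, elements less than 4 are replaced by -1, elements greater than or equal to 6 are replaced by 1, and the rest are filled with the default 0. The replacement and the fill are the "functions" here; they are simply very blunt ones that map everything to the same constant. The code above is in fact exactly equivalent to the following: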

x = np.arange(0,10)

def func1(y):
    return -1

def func2(y):
    return 1
xxx=np.piecewise(x, [x < 4, x >= 6], [func1, func2])
print(xxx)

Output:

[-1 -1 -1 -1 0 0 1 1 1 1] # same as above

(2) Example 2: defining the operation functions

x = np.arange(0,10)

# square the elements
def func1(y):
    return y**2

# multiply the elements by 100
def func2(y):
    return y*100
xxx=np.piecewise(x, [x < 4, x >= 6], [func1, func2])
print(xxx)

Output:

[ 0 1 4 9 0 0 600 700 800 900]
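
One consequence of the output having the same type as x deserves a caution: if x is an integer array and a function returns fractional values, those values are cast back to integers when they are written into the result. The sketch below uses a made-up halve function to show the effect, together with the usual workaround of converting x to float first:

```python
import numpy as np

x_int = np.arange(4)             # integer dtype
x_float = x_int.astype(float)    # same values stored as floats

def halve(v):
    # Hypothetical operation that produces non-integer results
    return v / 2

out_int = np.piecewise(x_int, [x_int >= 0], [halve])
out_float = np.piecewise(x_float, [x_float >= 0], [halve])

print(out_int.dtype.kind, out_int.tolist())       # i [0, 0, 1, 1]
print(out_float.dtype.kind, out_float.tolist())   # f [0.0, 0.5, 1.0, 1.5]
```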

(3) Example 3: using lambda expressions

x = np.arange(0,10)

xxxx=np.piecewise(x, [x < 4, x >= 6], [lambda x:x**2, lambda x:x*100])
print(xxxx)

Output:

[ 0 1 4 9 0 0 600 700 800 900]
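
For comparison, the same result can be reached with the functions discussed earlier. This is a sketch of the correspondence, not a claim that the three are interchangeable in general: np.where and np.select compute x**2 and x*100 over the whole array and then pick values, whereas np.piecewise calls each function only on the elements its condition selects.

```python
import numpy as np

x = np.arange(0, 10)

via_piecewise = np.piecewise(x, [x < 4, x >= 6],
                             [lambda v: v**2, lambda v: v*100])

# Nested three-argument where, with 0 as the fallback value
via_where = np.where(x < 4, x**2, np.where(x >= 6, x*100, 0))

# select takes parallel lists of conditions and already-computed choices
via_select = np.select([x < 4, x >= 6], [x**2, x*100], default=0)

print(via_piecewise.tolist())   # [0, 1, 4, 9, 0, 0, 600, 700, 800, 900]
print(np.array_equal(via_piecewise, via_where),
      np.array_equal(via_piecewise, via_select))   # True True
```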

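Summary: piecewise is quick and convenient, and it is generally much faster than hand-written loops and conditional statements, so it is worth using often.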