[python] pandas df.where explained, plus dropping rows or columns that contain a specific value

- 1. First, straight to the documentation:

  • data.where(cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False, raise_on_error=None)
Docstring:
Return an object of same shape as self and whose corresponding
entries are from self where `cond` is True and otherwise are from
`other`.
# Returns an object with the same shape as the df: where the condition is True the value comes from the df itself, otherwise it comes from other
Parameters
----------
cond : boolean NDFrame, array-like, or callable
    Where `cond` is True, keep the original value. Where
    False, replace with corresponding value from `other`.
    If `cond` is callable, it is computed on the NDFrame and
    should return boolean NDFrame or array. The callable must
    not change input NDFrame (though pandas doesn't check it).

other : scalar, NDFrame, or callable  # the value used to fill positions where cond is False

inplace : boolean, default False
    Whether to perform the operation in place on the data

axis : alignment axis if needed, default None
level : alignment level if needed, default None
errors : str, {'raise', 'ignore'}, default 'raise'
    - ``raise`` : allow exceptions to be raised
    - ``ignore`` : suppress exceptions. On error return original object

try_cast : boolean, default False
    try to cast the result back to the input type (if possible),
raise_on_error : boolean, default True
    Whether to raise on invalid data types (e.g. trying to where on
    strings)

Returns
-------
wh : same type as caller

Notes
-----
Roughly ``df1.where(m, df2)`` is equivalent to
``np.where(m, df1, df2)``.
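
The docstring says that both `cond` and `other` may be callables. A minimal sketch of that form, assuming a reasonably recent pandas; the sample values are made up for illustration:

```python
import pandas as pd

s = pd.Series([1, 5, 3, 8, 2])  # hypothetical sample data

# cond as a callable: it receives the Series and must return a boolean mask.
# other as a callable: it also receives the Series and returns the replacements.
result = s.where(lambda x: x > 2, other=lambda x: x * 10)

print(result.tolist())  # [10, 5, 3, 8, 20]
```

Values greater than 2 are kept; the others are replaced by ten times their original value.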

- 2. Examples

1. pd.Series().where(cond) keeps the values that satisfy cond and replaces the rest with NaN

--------
s = pd.Series(range(5))
s.where(s > 0)
0    NaN
1    1.0
2    2.0
3    3.0
4    4.0
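
The values above print as floats because NaN cannot be stored in an integer column, so the result is upcast. A small sketch of this, which also shows that `where` only masks values and that a follow-up `dropna()` is what actually removes them:

```python
import pandas as pd

s = pd.Series(range(5))       # integer Series
masked = s.where(s > 0)       # 0 fails the condition and becomes NaN
filtered = masked.dropna()    # drop the NaN entry to really remove it

print(masked.dtype)               # float64
print(filtered.index.tolist())    # [1, 2, 3, 4]
```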

2. pd.Series().mask(cond) does the opposite of where: values that satisfy cond are replaced with NaN

s.mask(s > 0)
0    0.0
1    NaN
2    NaN
3    NaN
4    NaN

3. Using the other argument

s.where(s > 1, 10)   # cond = s > 1, other = 10
0    10.0
1    10.0
2    2.0
3    3.0
4    4.0
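
`other` does not have to be a scalar. A sketch with a second Series as the replacement source; `fallback` is a hypothetical name and its values are invented for the example:

```python
import pandas as pd

s = pd.Series(range(5))
fallback = pd.Series([100, 101, 102, 103, 104])  # hypothetical replacement values

# Where s > 1 is False, take the value at the same index label from fallback
result = s.where(s > 1, fallback)

print(result.tolist())  # [100, 101, 2, 3, 4]
```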

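4. df.where starts from df itself: where cond is True it returns df's own value, otherwise the value from other; np.where(cond, x, y) returns x where cond is True and y where it is False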

df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])
m = df % 3 == 0
df.where(m, -df)   # cond = m, other = -df
   A  B
0  0 -1
1 -2  3
2 -4 -5
3  6 -7
4 -8  9
df.where(m, -df) == np.where(m, df, -df)
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True
df.where(m, -df) == df.mask(~m, -df)
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True
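
One practical difference the comparison above hides: `df.where` returns a DataFrame, while `np.where` returns a plain NumPy array without the index or column labels. A short sketch, reusing the same `df` and `m`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])
m = df % 3 == 0

via_pandas = df.where(m, -df)       # DataFrame, keeps index and column labels
via_numpy = np.where(m, df, -df)    # plain ndarray, labels are lost

print(type(via_pandas).__name__)    # DataFrame
print(type(via_numpy).__name__)     # ndarray

# Rewrap the array if a labelled DataFrame is needed again
rewrapped = pd.DataFrame(via_numpy, index=df.index, columns=df.columns)
```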

- 3. Dropping the rows or columns that contain a particular value:

df.where(df != N).dropna(axis=1)   # N is the value to remove; axis=1 drops the columns containing it, axis=0 drops the rows
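
A runnable sketch of this trick on a small made-up frame. The data and the sentinel `N = -1` are assumptions chosen for the example; the last line is an alternative that skips the NaN round trip:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, -1, 6],
    'C': [7, 8, 9],
})
N = -1  # hypothetical sentinel value to get rid of

masked = df.where(df != N)            # every N becomes NaN

without_cols = masked.dropna(axis=1)  # drop columns that contained N
without_rows = masked.dropna(axis=0)  # drop rows that contained N

print(without_cols.columns.tolist())  # ['A', 'C']
print(without_rows.index.tolist())    # [0, 2]

# Alternative: select columns directly with a boolean mask, no NaN involved
alt_cols = df.loc[:, ~(df == N).any(axis=0)]
```

Because `dropna` cannot tell the NaN values created by `where` from NaN values that were already in the data, the first approach also drops rows or columns with pre-existing missing values; the boolean-mask alternative does not.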

Reposted from blog.csdn.net/brucewong0516/article/details/80226990