pandas_cookbook学习(五)

版权声明:https://blog.csdn.net/thfyshz版权所有 https://blog.csdn.net/thfyshz/article/details/83857040

使用剩下值的均值代替此值,注意transform的用法,与apply相区分:
apply返回一个聚类结果,transform分别返回每个处理的结果

In [94]: df = pd.DataFrame({'A' : [1, 1, 2, 2], 'B' : [1, -1, 1, 2]})

In [95]: gb = df.groupby('A')

In [96]: def replace(g):
   ....:    mask = g < 0
   ....:    g.loc[mask] = g[~mask].mean()
   ....:    return g
   ....: 

In [97]: gb.transform(replace)
Out[97]: 
     B
0  1.0
1  1.0
2  1.0
3  2.0

按分组和某一列值分别排序:

In [98]: df = pd.DataFrame({'code': ['foo', 'bar', 'baz'] * 2,
   ....:                    'data': [0.16, -0.21, 0.33, 0.45, -0.59, 0.62],
   ....:                    'flag': [False, True] * 3})
   ....: 

In [99]: code_groups = df.groupby('code')

In [100]: agg_n_sort_order = code_groups[['data']].transform(sum).sort_values(by='data')

In [101]: sorted_df = df.loc[agg_n_sort_order.index]

In [102]: sorted_df
Out[102]: 
  code  data   flag
1  bar -0.21   True
4  bar -0.59  False
0  foo  0.16  False
3  foo  0.45   True
2  baz  0.33  False
5  baz  0.62   True

猜你喜欢

转载自blog.csdn.net/thfyshz/article/details/83857040