pandas_cookbook Study Notes (6)

Copyright notice: all rights reserved by https://blog.csdn.net/thfyshz. Original post: https://blog.csdn.net/thfyshz/article/details/83889291

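Use resample together with apply to run a different function on each resampled bin (the session assumes numpy is imported as np and pandas as pd):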

In [103]: rng = pd.date_range(start="2014-10-07",periods=10,freq='2min')

In [104]: ts = pd.Series(data = list(range(10)), index = rng)

In [105]: def MyCust(x):
   .....:    if len(x) > 2:
   .....:       return x.iloc[1] * 1.234
   .....:    # otherwise return a null value
   .....:    return pd.NaT
   .....: 

In [106]: mhc = {'Mean' : np.mean, 'Max' : np.max, 'Custom' : MyCust}

# Resample into 5-minute bins and apply each function in mhc to every bin
In [107]: ts.resample("5min").apply(mhc)
Out[107]: 
Custom  2014-10-07 00:00:00    1.234
        2014-10-07 00:05:00      NaT
        2014-10-07 00:10:00    7.404
        2014-10-07 00:15:00      NaT
Max     2014-10-07 00:00:00        2
        2014-10-07 00:05:00        4
        2014-10-07 00:10:00        7
        2014-10-07 00:15:00        9
Mean    2014-10-07 00:00:00        1
        2014-10-07 00:05:00      3.5
        2014-10-07 00:10:00        6
        2014-10-07 00:15:00      8.5
dtype: object

In [108]: ts
Out[108]: 
2014-10-07 00:00:00    0
2014-10-07 00:02:00    1
2014-10-07 00:04:00    2
2014-10-07 00:06:00    3
2014-10-07 00:08:00    4
2014-10-07 00:10:00    5
2014-10-07 00:12:00    6
2014-10-07 00:14:00    7
2014-10-07 00:16:00    8
2014-10-07 00:18:00    9
Freq: 2T, dtype: int64
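
The transcript above can also be reproduced as a standalone script. The following is only a minimal sketch under a few assumptions: the helper name second_value_scaled is hypothetical (it mirrors MyCust), it returns np.nan instead of pd.NaT so that the column stays numeric, and the three results are collected column by column instead of passing a dict to apply, whose output layout may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Ten points spaced 2 minutes apart, as in the transcript.
rng = pd.date_range(start="2014-10-07", periods=10, freq="2min")
ts = pd.Series(range(10), index=rng)


def second_value_scaled(x):
    """Hypothetical stand-in for MyCust: scale the second value of a bin."""
    if len(x) > 2:
        # Positional access; x is the Series of values falling in one bin.
        return x.iloc[1] * 1.234
    # Assumption: NaN instead of NaT, so the result stays a float column.
    return np.nan


# Each 5-minute bin holds either 3 or 2 of the 2-minute samples.
resampled = ts.resample("5min")

summary = pd.DataFrame({
    "Mean": resampled.mean(),
    "Max": resampled.max(),
    "Custom": resampled.apply(second_value_scaled),
})
print(summary)
```

The Mean and Max columns should match the values shown in Out[107]; the Custom column holds NaN where the transcript shows NaT, because only the bins starting at 00:00 and 00:10 contain more than two samples.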

Create a new column that holds the size of the group each row belongs to (a value-counts column):

In [109]: df = pd.DataFrame({'Color': 'Red Red Red Blue'.split(),
   .....:                    'Value': [100, 150, 50, 50]}); df
   .....: 
Out[109]: 
  Color  Value
0   Red    100
1   Red    150
2   Red     50
3  Blue     50

In [110]: df['Counts'] = df.groupby(['Color']).transform(len)

In [111]: df
Out[111]: 
  Color  Value  Counts
0   Red    100       3
1   Red    150       3
2   Red     50       3
3  Blue     50       1
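
The same Counts column can be built a little more explicitly. This is a sketch of two possible variants, not the post's own method: the first selects the Value column before calling transform, so the result is a Series instead of a one-column DataFrame. The second, stored under the hypothetical name counts_via_map, looks each colour up in value_counts.

```python
import pandas as pd

df = pd.DataFrame({"Color": "Red Red Red Blue".split(),
                   "Value": [100, 150, 50, 50]})

# Variant 1: pick one column so transform(len) returns a Series
# aligned with the original rows.
df["Counts"] = df.groupby("Color")["Value"].transform(len)

# Variant 2 (hypothetical name): map each colour to its frequency.
counts_via_map = df["Color"].map(df["Color"].value_counts())

print(df)
print(counts_via_map.tolist())
```

Both variants give 3 for every Red row and 1 for the Blue row, as in Out[111].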
