import pandas as pd
import numpy as np
# 'M' is the month-end alias used here; pandas 2.2+ prefers 'ME' and warns on 'M'
ts = pd.Series(np.random.randn(4), index=pd.date_range('1/1/2000', periods=4, freq='M'))
ts
2000-01-31 1.300295
2000-02-29 -1.472891
2000-03-31 1.590408
2000-04-30 0.600093
Freq: M, dtype: float64
1. Shifting the data
ts.shift(2) # shift forward: values move to later dates, the index stays fixed
2000-01-31 NaN
2000-02-29 NaN
2000-03-31 1.300295
2000-04-30 -1.472891
Freq: M, dtype: float64
ts.shift(-2) # shift backward: values move to earlier dates, the index stays fixed
2000-01-31 1.590408
2000-02-29 0.600093
2000-03-31 NaN
2000-04-30 NaN
Freq: M, dtype: float64
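A common use of shifting values is comparing each observation with the previous one. The following is a minimal sketch with made-up values (the `prices` series is hypothetical and not part of the example above); it computes the period-over-period change with `shift(1)` and checks it against the built-in `pct_change()`.

```python
import pandas as pd

# Hypothetical fixed values so the result is reproducible
prices = pd.Series(
    [100.0, 110.0, 99.0, 108.9],
    index=pd.to_datetime(["2000-01-31", "2000-02-29", "2000-03-31", "2000-04-30"]),
)

# shift(1) lines up each value with the previous period's value,
# so the ratio gives the relative change between consecutive periods
pct = prices / prices.shift(1) - 1

# The first entry is NaN because there is no earlier value to compare with
print(pct)

# pandas offers the same calculation directly
builtin = prices.pct_change()
```

The leading NaN is the same gap that `ts.shift(2)` leaves at the start of the series: shifting values never invents data for positions that have no source.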
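2. Shifting the date index
Passing a frequency through the freq parameter shifts the dates in the index instead of the values, so no data is lost
ts.shift(2,freq='M') # shift the index by two month ends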
2000-03-31 1.300295
2000-04-30 -1.472891
2000-05-31 1.590408
2000-06-30 0.600093
Freq: M, dtype: float64
ts.shift(1,freq='3D')
2000-02-03 1.300295
2000-03-03 -1.472891
2000-04-03 1.590408
2000-05-03 0.600093
dtype: float64
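To make the difference between the two kinds of shift concrete, here is a small sketch on a hypothetical three-day series (the name `s` and its values are illustrative only). Shifting the values introduces NaN and drops the last observation, while shifting with `freq` keeps every value and moves the timestamps.

```python
import pandas as pd

# Hypothetical daily series used only for this comparison
s = pd.Series([1.0, 2.0, 3.0], index=pd.date_range("2000-01-01", periods=3, freq="D"))

# Values move down one position; the index is unchanged
shifted_values = s.shift(1)

# Timestamps move forward one day; the values are unchanged
shifted_index = s.shift(1, freq="D")

print(shifted_values)
print(shifted_index)
```

Which form fits depends on whether the result needs to stay aligned with the original index (value shift) or whether every observation should be kept (index shift).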
3. Shifting dates with date offsets
from pandas.tseries.offsets import Day,MonthEnd
from datetime import datetime
now = datetime(2018,7,5)
now+3*Day()
Timestamp('2018-07-08 00:00:00')
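MonthEnd is an anchored offset: the first increment rolls the date forward to the next date that satisfies the frequency rule, and each further increment moves one more month end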
print(now+MonthEnd())
print(now+MonthEnd(2))
2018-07-31 00:00:00
2018-08-31 00:00:00
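The roll-forward behavior depends on whether the starting date already sits on the anchor. The sketch below uses two illustrative dates (the variable names are not from the example above) to show addition and subtraction from a mid-month date and from a month-end date.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

mid_month = pd.Timestamp("2018-07-05")   # not on a month end
on_anchor = pd.Timestamp("2018-07-31")   # already on a month end

# From a mid-month date, one increment lands on the end of the same month
forward_from_mid = mid_month + MonthEnd()

# From a month end, one increment moves to the end of the next month
forward_from_anchor = on_anchor + MonthEnd()

# Subtraction rolls back to the previous month end
backward_from_mid = mid_month - MonthEnd()

print(forward_from_mid, forward_from_anchor, backward_from_mid)
```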
Use the rollforward and rollback methods of an anchored offset to roll a date forward or backward explicitly
print(MonthEnd().rollforward(now))
print(MonthEnd().rollback(now))
2018-07-31 00:00:00
2018-06-30 00:00:00
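Unlike adding an offset, rolling leaves a date untouched when it is already on the anchor. A brief sketch, using an illustrative month-end date:

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

offset = MonthEnd()
on_anchor = pd.Timestamp("2018-07-31")

# A date that is already a month end is returned unchanged by both methods
rolled_forward = offset.rollforward(on_anchor)
rolled_back = offset.rollback(on_anchor)

# is_on_offset reports whether a date satisfies the offset's rule
is_anchor = offset.is_on_offset(on_anchor)

print(rolled_forward, rolled_back, is_anchor)
```

This is what makes rollforward usable as a grouping key: every date within a month, including the month end itself, maps to the same anchor.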
Use an anchored offset with groupby to aggregate the data points that roll forward to the same anchor
ts = pd.Series(np.random.randn(20),index=pd.date_range('1/15/2000',periods=20,freq='4d'))
ts
2000-01-15 -1.029260
2000-01-19 0.060567
2000-01-23 -0.846305
2000-01-27 -0.626401
2000-01-31 -1.070348
2000-02-04 -0.327185
2000-02-08 0.740582
2000-02-12 -0.725505
2000-02-16 -0.107928
2000-02-20 0.623038
2000-02-24 0.125203
2000-02-28 0.860681
2000-03-03 -0.160448
2000-03-07 0.021979
2000-03-11 0.873563
2000-03-15 0.091832
2000-03-19 -1.302709
2000-03-23 -0.854590
2000-03-27 -1.572235
2000-03-31 -0.536468
Freq: 4D, dtype: float64
ts.groupby(MonthEnd().rollforward).mean() # mean of the values that fall in the same month
2000-01-31 -0.702349
2000-02-29 0.169841
2000-03-31 -0.429885
dtype: float64
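The same monthly aggregation can usually be written with `resample`. The sketch below assumes a deterministic stand-in series (`data`, filled with 0 to 19 instead of random numbers) on the same dates, and passes the `MonthEnd()` offset object as the resampling rule so that it does not depend on a particular frequency alias string.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Stand-in data: same dates as above, but fixed values for reproducibility
data = pd.Series(
    np.arange(20, dtype=float),
    index=pd.date_range("2000-01-15", periods=20, freq="4D"),
)

# Group by the month end that each timestamp rolls forward to
by_groupby = data.groupby(MonthEnd().rollforward).mean()

# Resample into month-end bins and average each bin
by_resample = data.resample(MonthEnd()).mean()

print(by_groupby)
print(by_resample)
```

For a series with observations in every month the two results hold the same values; `resample` would additionally emit an empty bin for any month with no data, which the groupby form skips.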