Resampling and frequency conversion:
The resample method: the workhorse function for all kinds of frequency conversion
data.iloc[:,0:5].head()
Out[218]:
              RPT    VAL    ROS    KIL    SHA
DATE
1961-01-01  15.04  14.96  13.17   9.29    NaN
1961-01-02  14.71    NaN  10.83   6.50  12.62
1961-01-03  18.50  16.88  12.33  10.13  11.17
1961-01-04  10.58   6.63  11.75   4.58   4.54
1961-01-05  13.33  13.25  11.42   6.17  10.71
data.resample('M', closed='left').mean().iloc[:,0:5].head()
Note: use .resample(...).mean(); the older form .resample(..., how='mean') is deprecated, as the warning from the following call shows.
data.resample('M', how='mean', closed='right').iloc[:,0:5].head()
__main__:1: FutureWarning: how in .resample() is deprecated
the new syntax is .resample(...).mean()
Out[215]:
                  RPT        VAL        ROS       KIL        SHA
DATE
1961-01-31  14.518276  11.727586  13.322333  7.596000  10.953214
1961-02-28  16.672500  15.218214  14.522500  9.343333  13.791071
1961-03-31  11.022000  11.448387  10.807000  7.298000  10.682903
1961-04-30  10.632333   9.329000   9.984333  5.929333   8.479333
1961-05-31  10.011613   8.890333  10.730645  5.929000   9.528065
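The recommended .resample(...).mean() form can be tried on a small synthetic series. This is only a sketch: the values below are made up and stand in for the wind-speed table, and the rule is passed as a pd.offsets.MonthEnd() object instead of the string 'M', on the assumption that the month-end alias string may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic daily data (made-up values), covering 1961-01-01 .. 1961-03-31.
idx = pd.date_range("1961-01-01", periods=90, freq="D")
df = pd.DataFrame({"RPT": np.arange(90, dtype=float)}, index=idx)

# Month-end resampling; the aggregation is chained as a method call
# rather than passed through the deprecated how= argument.
monthly = df.resample(pd.offsets.MonthEnd()).mean()
print(monthly)
```

Each row of the result is labelled with the last day of its month, matching the month-end labels in the output above.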
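Commonly used parameters are shown below. kind='period' aggregates to periods instead of timestamps ('timestamp' is the other option); by default the result keeps the index type of the input time series. In this example the resample rule is 'M', so with kind='period' the series is aggregated to monthly periods.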
data.resample('M', axis=0, closed='right',label='right', kind='period').mean().iloc[:,0:5].head()
Out[222]:
               RPT        VAL        ROS       KIL        SHA
DATE
1961-01  14.841333  11.988333  13.431613  7.736774  11.072759
1961-02  16.269286  14.975357  14.441481  9.230741  13.852143
1961-03  10.890000  11.296452  10.752903  7.284000  10.509355
1961-04  10.722667   9.427667   9.998000  5.830667   8.435000
1961-05   9.860968   8.850000  10.818065  5.905333   9.490323
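A period-labelled monthly result like the one above can also be obtained without the kind argument, which, to my knowledge, newer pandas releases no longer encourage. The sketch below resamples to month-end timestamps and then relabels the index with to_period. The data is synthetic, and using pd.offsets.MonthEnd() as the rule is an assumption made to stay independent of the alias string.

```python
import numpy as np
import pandas as pd

# Synthetic daily data (made-up values), covering 1961-01-01 .. 1961-03-31.
idx = pd.date_range("1961-01-01", periods=90, freq="D")
df = pd.DataFrame({"RPT": np.arange(90, dtype=float)}, index=idx)

# Aggregate per month, then convert the month-end timestamps to monthly periods.
by_period = df.resample(pd.offsets.MonthEnd()).mean().to_period("M")
print(by_period)
```

The index of by_period is a PeriodIndex, so its labels read 1961-01, 1961-02, and so on, instead of month-end dates.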
Output of help():
resample(self, rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0, on=None, level=None)
    Convenience method for frequency conversion and resampling of time
    series. Object must have a datetime-like index (DatetimeIndex,
    PeriodIndex, or TimedeltaIndex), or pass datetime-like values
    to the on or level keyword.

    Parameters
    ----------
    rule : string
        the offset string or object representing target conversion
    axis : int, optional, default 0
    closed : {'right', 'left'}
        Which side of bin interval is closed. The default is 'left'
        for all frequency offsets except for 'M', 'A', 'Q', 'BM',
        'BA', 'BQ', and 'W' which all have a default of 'right'.
    label : {'right', 'left'}
        Which bin edge label to label bucket with. The default is 'left'
        for all frequency offsets except for 'M', 'A', 'Q', 'BM',
        'BA', 'BQ', and 'W' which all have a default of 'right'.
    convention : {'start', 'end', 's', 'e'}
        For PeriodIndex only, controls whether to use the start or end of
        `rule`
    loffset : timedelta
        Adjust the resampled time labels
    base : int, default 0
        For frequencies that evenly subdivide 1 day, the "origin" of the
        aggregated intervals. For example, for '5min' frequency, base could
        range from 0 through 4. Defaults to 0
    on : string, optional
        For a DataFrame, column to use instead of index for resampling.
        Column must be datetime-like.

        .. versionadded:: 0.19.0
    level : string or int, optional
        For a MultiIndex, level (name or number) to use for
        resampling. Level must be datetime-like.

        .. versionadded:: 0.19.0
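The closed, label, and on parameters described in the docstring are easiest to see on a tiny example. The following is a minimal sketch on made-up minute-level data; the column names "when" and "value" are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Twelve one-minute observations: 00:00 -> 0, 00:01 -> 1, ..., 00:11 -> 11.
idx = pd.date_range("2000-01-01 00:00", periods=12, freq="min")
s = pd.Series(range(12), index=idx)

# Left-closed bins [00:00, 00:05), [00:05, 00:10), ... labelled by the left edge.
left = s.resample("5min", closed="left", label="left").sum()

# Right-closed bins (23:55, 00:00], (00:00, 00:05], ... labelled by the right edge.
right = s.resample("5min", closed="right", label="right").sum()

# on= resamples by a datetime-like column instead of the index.
frame = pd.DataFrame({"when": idx, "value": range(12)})
by_col = frame.resample("5min", on="when")["value"].sum()

print(left)
print(right)
print(by_col)
```

With closed='right' the 00:00 observation falls into a bin of its own that ends at 00:00, so the right-closed result has one more row than the left-closed one.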