pandas: Setting the ewm Parameters

pandas exponentially weighted windows (ewm) and the exponentially weighted moving average (EWMA)

pandas.DataFrame.ewm()

import pandas as pd
import numpy as np

# Sample column containing NaN gaps
df = pd.DataFrame([0.0, np.nan, 1.0, 2.0, np.nan, 3.0])
print(df)

# span=2 corresponds to alpha = 2 / (span + 1) = 2/3
print('span=2,ignore_na=False, adjust= True :\n', df.ewm(span=2, ignore_na=False, adjust=True).mean())
print('\n span=2,ignore_na=True, adjust= True :\n', df.ewm(span=2, ignore_na=True, adjust=True).mean())

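With ignore_na=True, NaN values are skipped when looking back from $y_t$ over the history: weights are assigned by relative position among the valid observations only.

With ignore_na=False (the default), NaN positions still count when the weights are assigned, so each gap pushes the older observations one step further down the decay. The NaN itself contributes nothing to the average, so the weight its position would have carried is simply lost. For example, for [x0, NaN, x2] with adjust=True, the weights of x0 and x2 are $(1-\alpha)^2$ and $1$ when ignore_na=False, but $(1-\alpha)$ and $1$ when ignore_na=True.

ignore_na=True is arguably the more reasonable choice.

In both modes, the EWMA at a NaN time step is computed from the history before it. The result is NaN only when the entire history, including the current point, is NaN. In effect, the averaged result at a NaN position is filled with the previous value.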
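To illustrate the fill-forward behaviour described above, here is a minimal sketch on a made-up series `s` (the data and variable names are assumptions for illustration, not part of the original example). It only inspects where NaN appears in the result and whether NaN positions repeat the previous value.

```python
import numpy as np
import pandas as pd

# Hypothetical series: two leading NaNs, then gaps between valid values
s = pd.Series([np.nan, np.nan, 1.0, np.nan, 3.0, np.nan])
ewma = s.ewm(span=2, ignore_na=True, adjust=True).mean()

# NaN should remain only where no valid observation has been seen yet
nan_mask = ewma.isna().tolist()
print(nan_mask)

# At later NaN positions the previous EWMA value should be carried forward
carried = [
    bool(np.isclose(ewma.iloc[3], ewma.iloc[2])),
    bool(np.isclose(ewma.iloc[5], ewma.iloc[4])),
]
print(carried)
```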

The output of the two print calls on df is:

span=2,ignore_na=False, adjust= True :
           0
0  0.000000
1  0.000000
2  0.900000
3  1.702703
4  1.702703
5  2.828571

 span=2,ignore_na=True, adjust= True :
           0
0  0.000000
1  0.000000
2  0.750000
3  1.615385
4  1.615385
5  2.550000
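The two result tables can be reproduced by hand, which makes the difference between the modes concrete. The sketch below is not pandas' actual implementation; `ewm_mean_adjusted` is a hypothetical helper written for this post that walks backwards from each $y_t$ and either skips NaN positions entirely (ignore_na=True) or lets them advance the decay exponent (ignore_na=False).

```python
import numpy as np
import pandas as pd


def ewm_mean_adjusted(values, alpha, ignore_na):
    """Hand-rolled adjust=True EWMA mimicking the two NaN-handling modes."""
    values = np.asarray(values, dtype=float)
    out = np.full(len(values), np.nan)
    for t in range(len(values)):
        num = den = 0.0
        lag = 0  # exponent of (1 - alpha) for the next observation we meet
        for i in range(t, -1, -1):  # walk backwards from t into the history
            if not np.isnan(values[i]):
                w = (1.0 - alpha) ** lag
                num += w * values[i]
                den += w
                lag += 1
            elif not ignore_na:
                lag += 1  # the NaN still occupies a position in the decay
        if den > 0:
            out[t] = num / den  # stays NaN if no valid value was seen
    return out


x = [0.0, np.nan, 1.0, 2.0, np.nan, 3.0]
alpha = 2.0 / (2 + 1)  # span=2

for ignore_na in (False, True):
    manual = ewm_mean_adjusted(x, alpha, ignore_na)
    reference = (
        pd.Series(x).ewm(span=2, ignore_na=ignore_na, adjust=True).mean().to_numpy()
    )
    print(ignore_na, manual, np.allclose(manual, reference))
```

If the sketch matches pandas, the last column printed is True for both modes. At index 2, for instance, x0 gets weight $(1/3)^2$ with ignore_na=False, giving $1/(1+1/9)=0.9$, and weight $1/3$ with ignore_na=True, giving $1/(1+1/3)=0.75$.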

How is the exponentially weighted moving average (EWMA) computed?

When adjust=False we have $y_0=x_0$ and $y_t=\alpha x_t+(1-\alpha)y_{t-1}$; there is therefore an assumption that $x_0$ is not an ordinary value but rather an exponentially weighted moment of the infinite series up to that point.

adjust=True:

$y_t = \frac{x_t+(1-\alpha)x_{t-1}+(1-\alpha)^2x_{t-2}+\dots+(1-\alpha)^tx_0}{1+(1-\alpha)+(1-\alpha)^2+\dots+(1-\alpha)^t}$

This form accounts for the finite history of the data. If the history extended infinitely far back, the denominator of this expression would become

$\frac{1}{1-(1-\alpha)} = \frac{1}{\alpha}$
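As a rough check of the adjust=True formula, the sketch below evaluates it directly on a small NaN-free array and compares the result with pandas. The data `x` and the helper `adjusted_ewma` are assumptions made up for illustration. It also sums the first 50 terms of the denominator to show it approaching $1/\alpha$.

```python
import numpy as np
import pandas as pd

alpha = 2.0 / 3.0                    # same alpha as span=2
x = np.array([1.0, 2.0, 4.0, 8.0])   # hypothetical NaN-free data


def adjusted_ewma(x, alpha):
    """Direct evaluation of the adjust=True weighted average."""
    y = np.empty(len(x))
    for t in range(len(x)):
        # weights 1, (1-alpha), ..., (1-alpha)^t for x_t, x_{t-1}, ..., x_0
        w = (1.0 - alpha) ** np.arange(t + 1)
        y[t] = np.dot(w, x[t::-1]) / w.sum()
    return y


manual = adjusted_ewma(x, alpha)
reference = pd.Series(x).ewm(alpha=alpha, adjust=True).mean().to_numpy()
print(manual, np.allclose(manual, reference))

# Partial sum of the denominator for a long history, compared with 1/alpha
long_denominator = ((1.0 - alpha) ** np.arange(50)).sum()
print(long_denominator, 1.0 / alpha)
```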

adjust=False:

$y_0=x_0, \quad y_t=(1-\alpha)y_{t-1}+\alpha x_t$

which is equivalent to $y_t=\sum_{i=0}^{t} w_i x_{t-i}$ with weights

$w_i = \begin{cases} \alpha(1-\alpha)^i & \text{if } i<t \\ (1-\alpha)^i & \text{if } i=t \end{cases}$

In this case the history is assumed to be infinitely long: $x_0$ stands in for the exponentially weighted average of everything before it.
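The equivalence between the recursion and the explicit weights for adjust=False can be checked numerically. This is a sketch under assumptions: the data `x` and the helpers `recursive_ewma` and `weighted_ewma` are hypothetical names introduced here, and the input is NaN-free.

```python
import numpy as np
import pandas as pd

alpha = 2.0 / 3.0                    # same alpha as span=2
x = np.array([1.0, 2.0, 4.0, 8.0])   # hypothetical NaN-free data


def recursive_ewma(x, alpha):
    """y_0 = x_0, y_t = (1 - alpha) * y_{t-1} + alpha * x_t."""
    y = np.empty(len(x))
    y[0] = x[0]
    for t in range(1, len(x)):
        y[t] = (1.0 - alpha) * y[t - 1] + alpha * x[t]
    return y


def weighted_ewma(x, alpha):
    """Same quantity written as an explicit weighted sum over the history."""
    y = np.empty(len(x))
    for t in range(len(x)):
        w = alpha * (1.0 - alpha) ** np.arange(t + 1)
        w[t] = (1.0 - alpha) ** t  # the oldest point x_0 keeps all remaining weight
        y[t] = np.dot(w, x[t::-1])  # weights sum to 1, so no normalisation
    return y


rec = recursive_ewma(x, alpha)
wgt = weighted_ewma(x, alpha)
reference = pd.Series(x).ewm(alpha=alpha, adjust=False).mean().to_numpy()
print(rec, np.allclose(rec, wgt), np.allclose(rec, reference))
```

Unlike adjust=True, no denominator is needed here because the weights already sum to 1 for every t.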

References:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.ewm.html
http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-windows
