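Time Series: Transformations Before Modeling

1. Square root (sqrt) transform

First look at the trend of the series. If it follows a quadratic curve, a square root transform can turn it into a linear trend.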

Quadratic Time Series

After the sqrt transform it looks like this:

Log Transformed Exponential Time Series
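To make the idea concrete, here is a minimal sketch on synthetic data (the series below is made up for illustration and is not from the original post). It only checks that a purely quadratic series becomes a straight line after the square root.

```python
import numpy as np

# Synthetic series with a purely quadratic trend: 1, 4, 9, ..., 99**2
t = np.arange(1, 100)
quadratic = t ** 2

# The square root maps the quadratic trend back onto a straight line
transformed = np.sqrt(quadratic)

# A straight line has constant first differences
steps = np.diff(transformed)
print(steps.min(), steps.max())
```

Real data will rarely be this clean, so the differences would only be roughly constant.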

The original data looks like this:

Airline Passengers Dataset Plot

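Apply the sqrt transform:

from pandas import read_csv
from pandas import DataFrame
from numpy import sqrt
from matplotlib import pyplot
# Series.from_csv was removed in pandas 1.0; read_csv with index_col=0 replaces it
series = read_csv('airline-passengers.csv', header=0, index_col=0)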
dataframe = DataFrame(series.values)
dataframe.columns = ['passengers']
dataframe['passengers'] = sqrt(dataframe['passengers'])
pyplot.figure(1)
# line plot
pyplot.subplot(211)
pyplot.plot(dataframe['passengers'])
# histogram
pyplot.subplot(212)
pyplot.hist(dataframe['passengers'])
pyplot.show()

The result looks like this:

Square Root Transform of Airline Passengers Dataset Plot

The trend is still there, though.

2. Log transform

After a log transform, an exponential series should likewise look like this:

Log Transformed Exponential Time Series
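As a small illustration, the sketch below builds a synthetic exponential series (the growth rate 0.05 is an arbitrary assumption, not a value from the post), takes the log, and fits a straight line to confirm that the transformed series is linear in time.

```python
import numpy as np

# Synthetic exponential series; the rate 0.05 is an arbitrary choice
t = np.arange(1, 100)
exponential = np.exp(0.05 * t)

# The log turns exponential growth into a straight line
transformed = np.log(exponential)

# Fit a degree-1 polynomial; the slope recovers the growth rate
slope, intercept = np.polyfit(t, transformed, 1)
print('slope: %.4f, intercept: %.4f' % (slope, intercept))
```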

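Apply the log transform to the same real dataset:

from pandas import read_csv
from pandas import DataFrame
from numpy import log
from matplotlib import pyplot
# Series.from_csv was removed in pandas 1.0; read_csv with index_col=0 replaces it
series = read_csv('airline-passengers.csv', header=0, index_col=0)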
dataframe = DataFrame(series.values)
dataframe.columns = ['passengers']
dataframe['passengers'] = log(dataframe['passengers'])
pyplot.figure(1)
# line plot
pyplot.subplot(211)
pyplot.plot(dataframe['passengers'])
# histogram
pyplot.subplot(212)
pyplot.hist(dataframe['passengers'])
pyplot.show()

 BoxCox Log Transform of Airline Passengers Dataset Plot

The distribution now looks closer to normal. The log transform is a popular choice.

3. Box-Cox transform

 BoxCox Auto Transform of Airline Passengers Dataset Plot

The resulting plot is shown above.

https://machinelearningmastery.com/power-transform-time-series-forecast-data-python/

  • lambda = -1.0 is a reciprocal transform.
  • lambda = -0.5 is a reciprocal square root transform.
  • lambda = 0.0 is a log transform.
  • lambda = 0.5 is a square root transform.
  • lambda = 1.0 is no transform.

    from pandas import read_csv
    from pandas import DataFrame
    from scipy.stats import boxcox
    from matplotlib import pyplot
    # Series.from_csv was removed in pandas 1.0; read_csv with index_col=0 replaces it
    series = read_csv('airline-passengers.csv', header=0, index_col=0)
    dataframe = DataFrame(series.values)
    dataframe.columns = ['passengers']
    dataframe['passengers'] = boxcox(dataframe['passengers'], lmbda=0.0)
    pyplot.figure(1)
    # line plot
    pyplot.subplot(211)
    pyplot.plot(dataframe['passengers'])
    # histogram
    pyplot.subplot(212)
    pyplot.hist(dataframe['passengers'])
    pyplot.show()

    This example uses lambda = 0.0, i.e. a log transform.

  • BoxCox Log Transform of Airline Passengers Dataset Plot
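The lambda values listed above can be checked against the Box-Cox formula directly. The sketch below is an illustration under stated assumptions: `boxcox_manual` is a hypothetical helper written for this example, and the sample values are just a few positive numbers. It compares the hand-written formula with `scipy.stats.boxcox` for each listed lambda.

```python
import numpy as np
from scipy.stats import boxcox

def boxcox_manual(x, lam):
    """Hypothetical helper: log(x) if lam == 0, else (x**lam - 1) / lam."""
    x = np.asarray(x, dtype=float)
    if lam == 0.0:
        return np.log(x)
    return (x ** lam - 1.0) / lam

# A few positive sample values (Box-Cox requires strictly positive data)
x = np.array([112.0, 118.0, 132.0, 129.0, 121.0])

# The hand-written formula agrees with scipy for every listed lambda
for lam in (-1.0, -0.5, 0.0, 0.5, 1.0):
    assert np.allclose(boxcox_manual(x, lam), boxcox(x, lmbda=lam))

# lambda = 0.5 is the square root up to a shift and scale: 2 * (sqrt(x) - 1)
print(boxcox(x, lmbda=0.5))
print(2.0 * (np.sqrt(x) - 1.0))
```

Note that the named transforms hold only up to a shift and scale: lambda = 1.0 returns x - 1 rather than x itself, which leaves the shape of the series unchanged.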

  • Conveniently, boxcox() can also pick a lambda on its own.

  • We can set the lambda parameter to None (the default) and let the function find a statistically tuned value.

    The following example demonstrates this usage, returning both the transformed dataset and the chosen lambda value.


    from pandas import read_csv
    from pandas import DataFrame
    from scipy.stats import boxcox
    from matplotlib import pyplot
    # Series.from_csv was removed in pandas 1.0; read_csv with index_col=0 replaces it
    series = read_csv('airline-passengers.csv', header=0, index_col=0)
    dataframe = DataFrame(series.values)
    dataframe.columns = ['passengers']
    dataframe['passengers'], lam = boxcox(dataframe['passengers'])
    print('Lambda: %f' % lam)
    pyplot.figure(1)
    # line plot
    pyplot.subplot(211)
    pyplot.plot(dataframe['passengers'])
    # histogram
    pyplot.subplot(212)
    pyplot.hist(dataframe['passengers'])
    pyplot.show()

    Lambda: 0.148023


Reposted from blog.csdn.net/kylin_learn/article/details/85308113