Basics of time series analysis in Python

Basic time series operations in Python


Recommended reading

  1. Use Python to complete the basics of time series analysis
  2. A practical case of SPSS establishing a time series multiplication season model
  3. Practical case of building a time series ARIMA model in Python



Import the required libraries

import pandas as pd
import numpy as np
import datetime as dt

Time series types

  • Timestamp
  • Fixed period (period)
  • Interval

Generate time series

  • You can specify the start time, the number of periods, and the frequency
  • H: hour
  • D: day
  • M: month
# Date strings that pandas can parse: '2016 Jul 1', '7/1/2016', '1/7/2016', '2016-07-01', '2016/07/01'
rng = pd.date_range('2016-07-01', periods = 10, freq = '3D')
rng
DatetimeIndex(['2016-07-01', '2016-07-04', '2016-07-07', '2016-07-10',
               '2016-07-13', '2016-07-16', '2016-07-19', '2016-07-22',
               '2016-07-25', '2016-07-28'],
              dtype='datetime64[ns]', freq='3D')
time=pd.Series(np.random.randn(20),
           index=pd.date_range(dt.datetime(2016,1,1),periods=20))
print(time)
2016-01-01   -0.067209
2016-01-02    0.480689
2016-01-03   -0.152052
2016-01-04    0.077139
2016-01-05   -1.775043
2016-01-06   -1.184273
Freq: D, dtype: float64
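
As a minimal sketch of the options listed above (the variable names are illustrative and not part of the original), date_range can be given a start plus a number of periods, or a start and an end, and a number in front of the alias acts as a multiplier. Only the D alias is used here, because recent pandas releases deprecate some of the other spellings (for example H in favor of h, and M in favor of ME for month-end dates).

```python
import pandas as pd

# Fix the start and the number of points; the end date is derived.
by_count = pd.date_range("2016-07-01", periods=4, freq="D")

# Fix both ends; the number of points is derived from the step.
by_range = pd.date_range("2016-07-01", "2016-07-10", freq="3D")

# An integer in front of the alias is a multiplier (here: every 2 days).
every_2d = pd.date_range("2016-07-01", periods=3, freq="2D")

print(by_count)
print(by_range)
print(every_2d)
```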

Filtering with truncate

time.truncate(before='2016-1-10')
2016-01-10   -0.349605
2016-01-11    2.159193
2016-01-12    0.077578
2016-01-13    0.084981
2016-01-14   -0.099995
2016-01-15   -1.327124
2016-01-16    1.352626
Freq: D, dtype: float64
time.truncate(after='2016-1-10')
2016-01-01   -0.067209
2016-01-02    0.480689
2016-01-03   -0.152052
2016-01-04    0.077139
2016-01-05   -1.775043
2016-01-06   -1.184273
2016-01-07   -1.247371
2016-01-08   -0.686737
2016-01-09   -1.787544
2016-01-10   -0.349605
Freq: D, dtype: float64
print(time['2016-01-15'])
-1.3271240245020821
print(time['2016-01-15':'2016-01-20'])
2016-01-15   -1.327124
2016-01-16    1.352626
2016-01-17   -0.075599
2016-01-18    1.026780
2016-01-19   -0.286614
2016-01-20   -0.017546
Freq: D, dtype: float64
# freq='M' yields the month-end dates between the two endpoints
data=pd.date_range('2010-01-01','2011-01-01',freq='M')
print(data)
DatetimeIndex(['2010-01-31', '2010-02-28', '2010-03-31', '2010-04-30',
               '2010-05-31', '2010-06-30', '2010-07-31', '2010-08-31',
               '2010-09-30', '2010-10-31', '2010-11-30', '2010-12-31'],
              dtype='datetime64[ns]', freq='M')
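
The truncate and slicing calls in this section are easier to check on data whose values are known in advance. The sketch below assumes a toy series whose values are simply 0 to 19 (not the random data used above) and shows that both truncate and label slicing keep their boundary dates.

```python
import pandas as pd

# Toy data (an assumption for illustration): value i belongs to day i + 1 of January 2016.
s = pd.Series(range(20), index=pd.date_range("2016-01-01", periods=20, freq="D"))

# truncate keeps the boundary date on whichever side is given.
head = s.truncate(after="2016-01-10")   # 2016-01-01 .. 2016-01-10
tail = s.truncate(before="2016-01-10")  # 2016-01-10 .. 2016-01-20

# Label slicing on a DatetimeIndex includes both endpoints.
window = s["2016-01-15":"2016-01-20"]

# A partial date string selects everything inside that month.
january = s.loc["2016-01"]

print(len(head), len(tail), len(window), len(january))
```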

Common format
[Figure: common time formats]

Timestamp

pd.Timestamp('2016-07-10')
Timestamp('2016-07-10 00:00:00')
# More detail can be specified
pd.Timestamp('2016-07-10 10')
Timestamp('2016-07-10 10:00:00')
pd.Timestamp('2016-07-10 10:15')
Timestamp('2016-07-10 10:15:00')
# How much detail can you add?
t = pd.Timestamp('2016-07-10 10:15')
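
To give a rough answer to the question above: a Timestamp can carry detail down to the nanosecond, and each component is exposed as an attribute. The timestamp string below is an illustrative value, not one from the original.

```python
import pandas as pd

# Illustrative value with fractional seconds down to the nanosecond.
t = pd.Timestamp("2016-07-10 10:15:30.123456789")

# Each component is available as an attribute.
print(t.year, t.month, t.day)       # date part
print(t.hour, t.minute, t.second)   # time part
print(t.microsecond, t.nanosecond)  # sub-second part

# Calendar helpers are available as well.
print(t.day_name())
```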

Time interval

pd.Period('2016-01')
Period('2016-01', 'M')
pd.Period('2016-01-01')
Period('2016-01-01', 'D')
# TIME OFFSETS
pd.Timedelta('1 day')
Timedelta('1 days 00:00:00')
pd.Period('2016-01-01 10:10') + pd.Timedelta('1 day')
Period('2016-01-02 10:10', 'T')
pd.Timestamp('2016-01-01 10:10') + pd.Timedelta('1 day')
Timestamp('2016-01-02 10:10:00')
pd.Timestamp('2016-01-01 10:10') + pd.Timedelta('15 ns')
Timestamp('2016-01-01 10:10:00.000000015')
p1 = pd.period_range('2016-01-01 10:10', freq = '25H', periods = 10)
p2 = pd.period_range('2016-01-01 10:10', freq = '1D1H', periods = 10)
p1
PeriodIndex(['2016-01-01 10:00', '2016-01-02 11:00', '2016-01-03 12:00',
             '2016-01-04 13:00', '2016-01-05 14:00', '2016-01-06 15:00',
             '2016-01-07 16:00', '2016-01-08 17:00', '2016-01-09 18:00',
             '2016-01-10 19:00'],
            dtype='period[25H]', freq='25H')
p2
PeriodIndex(['2016-01-01 10:00', '2016-01-02 11:00', '2016-01-03 12:00',
             '2016-01-04 13:00', '2016-01-05 14:00', '2016-01-06 15:00',
             '2016-01-07 16:00', '2016-01-08 17:00', '2016-01-09 18:00',
             '2016-01-10 19:00'],
            dtype='period[25H]', freq='25H')
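
The identical output for p1 and p2 follows from the two offsets describing the same duration. The sketch below checks this with Timedelta objects, and also shows that a Period is a span with a start and an end rather than a single instant. The variable names are illustrative.

```python
import pandas as pd

# 25 hours and "1 day plus 1 hour" are the same duration.
same_length = pd.Timedelta(hours=25) == pd.Timedelta(days=1, hours=1)
print(same_length)

# Adding a Timedelta shifts a Timestamp by that duration.
shifted = pd.Timestamp("2016-01-01 10:10") + pd.Timedelta(days=1, hours=1)
print(shifted)

# A Period covers a span of time; its bounds are available as Timestamps.
month = pd.Period("2016-01")
print(month.start_time, month.end_time)

# Adding an integer moves the Period by whole units of its frequency.
next_month = month + 1
print(next_month)
```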

Specify index

rng = pd.date_range('2016 Jul 1', periods = 10, freq = 'D')
rng
pd.Series(range(len(rng)), index = rng)
2016-07-01    0
2016-07-02    1
2016-07-03    2
2016-07-04    3
2016-07-05    4
2016-07-06    5
2016-07-07    6
2016-07-08    7
2016-07-09    8
2016-07-10    9
Freq: D, dtype: int64
periods = [pd.Period('2016-01'), pd.Period('2016-02'), pd.Period('2016-03')]
ts = pd.Series(np.random.randn(len(periods)), index = periods)
ts
2016-01   -0.559086
2016-02   -1.021617
2016-03    0.944657
Freq: M, dtype: float64
type(ts.index)
pandas.core.indexes.period.PeriodIndex
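
As an alternative to listing each Period by hand, a PeriodIndex can also be generated with period_range. This is a small sketch with integer values standing in for the random data above; the names are illustrative.

```python
import pandas as pd

# Three consecutive monthly periods starting at January 2016.
months = pd.period_range("2016-01", periods=3, freq="M")
ts = pd.Series(range(3), index=months)
print(ts)

# The index is a PeriodIndex, and values can be looked up by Period.
print(type(ts.index))
february = ts.loc[pd.Period("2016-02")]
print(february)
```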

Converting between timestamps and periods


ts = pd.Series(range(10), pd.date_range('07-10-16 8:00', periods = 10, freq = 'H'))
ts
2016-07-10 08:00:00    0
2016-07-10 09:00:00    1
2016-07-10 10:00:00    2
2016-07-10 11:00:00    3
2016-07-10 12:00:00    4
Freq: H, dtype: int64
ts_period = ts.to_period()
ts_period
2016-07-10 08:00    0
2016-07-10 09:00    1
2016-07-10 10:00    2
2016-07-10 11:00    3
2016-07-10 12:00    4
2016-07-10 13:00    5
2016-07-10 14:00    6
2016-07-10 15:00    7
2016-07-10 16:00    8
2016-07-10 17:00    9
Freq: H, dtype: int64
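# Slicing by period keeps every period that overlaps the range, so the 08:00 period (which contains 08:30) is included
ts_period['2016-07-10 08:30':'2016-07-10 11:45']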
2016-07-10 08:00    0
2016-07-10 09:00    1
2016-07-10 10:00    2
2016-07-10 11:00    3
Freq: H, dtype: int64
# Slicing by timestamp keeps only the points inside the range, so 08:00 is excluded
ts['2016-07-10 08:30':'2016-07-10 11:45']
2016-07-10 09:00:00    1
2016-07-10 10:00:00    2
2016-07-10 11:00:00    3
Freq: H, dtype: int64
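
The conversion also works in the other direction. The sketch below, using a shorter illustrative series, converts timestamps to periods and back with to_timestamp, and shows that passing a coarser frequency to to_period maps several timestamps to the same period. The lowercase h alias is used for the hourly frequency because recent pandas releases deprecate the uppercase spelling.

```python
import pandas as pd

# Four hourly timestamps (a shortened, illustrative version of the series above).
idx = pd.date_range("2016-07-10 08:00", periods=4, freq="h")
ts = pd.Series(range(4), index=idx)

# Timestamps -> periods (the frequency is taken from the index).
as_period = ts.to_period()

# Periods -> timestamps (each period is mapped to its start time).
back = as_period.to_timestamp()
print(back)

# A coarser frequency maps all four hours to the same daily period.
daily = ts.to_period("D")
print(daily)
```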

Data resampling

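  • Resampling converts time data from one frequency to another
  • Downsampling: from a higher frequency to a lower one (for example daily to monthly), which requires an aggregation
  • Upsampling: from a lower frequency to a higher one (for example every 3 days to daily), which introduces missing values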
import pandas as pd
import numpy as np
rng = pd.date_range('1/1/2011', periods=90, freq='D')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts.head()
2011-01-01   -0.225796
2011-01-02    0.890969
2011-01-03   -0.343222
2011-01-04   -0.884985
2011-01-05    0.859801
Freq: D, dtype: float64

Resample

  • By month
ts.resample('M').sum()
2011-01-31   -3.221512
2011-02-28    9.660282
2011-03-31   -0.934169
Freq: M, dtype: float64
  • By day (2-day bins)
ts.resample('2D').sum()
2011-01-01    0.665173
2011-01-03   -1.228207
2011-01-05    1.165821
2011-01-07   -2.507237
Freq: 2D, dtype: float64
  • Calculate the mean
day3Ts = ts.resample('3D').mean()
day3Ts
2011-01-01    0.107317
2011-01-04    0.093612
2011-01-07   -1.156626
2011-01-10   -0.172981
Freq: 3D, dtype: float64
  • Upsampling with resample() followed by asfreq(): the newly created daily slots are left as NaN
print(day3Ts.resample('D').asfreq())
2011-01-01    0.107317
2011-01-02         NaN
2011-01-03         NaN
2011-01-04    0.093612
2011-01-05         NaN
                ...   
2011-03-25         NaN
2011-03-26    0.804057
2011-03-27         NaN
2011-03-28         NaN
2011-03-29   -0.200729
Freq: D, Length: 88, dtype: float64
print(day3Ts.resample('D'))
DatetimeIndexResampler [freq=<Day>, axis=0, closed=left, label=left, convention=start, base=0]
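
The resampling steps above are hard to verify by eye on random data. The following sketch assumes a toy series with the values 0 to 8 (not the article's data), so the 3-day sums and means can be checked by hand, and then upsamples the means back to daily frequency with asfreq to show where the NaN values appear.

```python
import pandas as pd

# Toy data (an assumption for illustration): nine days with values 0..8.
s = pd.Series(range(9), index=pd.date_range("2011-01-01", periods=9, freq="D"))

# Downsampling: each 3-day bin is aggregated; agg computes several statistics at once.
three_day = s.resample("3D").agg(["sum", "mean"])
print(three_day)

# Upsampling: asfreq only changes the frequency, so the new daily slots are NaN.
daily = three_day["mean"].resample("D").asfreq()
print(daily)
```

The upsampled series ends at the last 3-day label (2011-01-07), which is why it is shorter than the original daily series.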

Interpolation method

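Upsampling introduces missing values; use a fill or interpolation method to control how they are filled

  • ffill: fill a missing value with the previous value
  • bfill: fill a missing value with the next value
  • interpolate: fill missing values by linear interpolation between the known values
# Forward-fill at most 2 consecutive missing values
day3Ts.resample('D').ffill(limit=2)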
2011-01-01    0.107317
2011-01-02    0.107317
2011-01-03    0.107317
2011-01-04    0.093612
2011-01-05    0.093612
                ...   
2011-03-25   -0.045712
2011-03-26    0.804057
2011-03-27    0.804057
2011-03-28    0.804057
2011-03-29   -0.200729
Freq: D, Length: 88, dtype: float64
# Back-fill at most 1 missing value per gap; the rest stay NaN
day3Ts.resample('D').bfill(limit=1)
2011-01-01    0.107317
2011-01-02         NaN
2011-01-03    0.093612
2011-01-04    0.093612
2011-01-05         NaN
                ...   
2011-03-25    0.804057
2011-03-26    0.804057
2011-03-27         NaN
2011-03-28   -0.200729
2011-03-29   -0.200729
Freq: D, Length: 88, dtype: float64
day3Ts.resample('D').interpolate("linear")
2011-01-01    0.107317
2011-01-02    0.102749
2011-01-03    0.098180
2011-01-04    0.093612
2011-01-05   -0.323134
                ...   
2011-03-25    0.520801
2011-03-26    0.804057
2011-03-27    0.469128
2011-03-28    0.134200
2011-03-29   -0.200729
Freq: D, Length: 88, dtype: float64
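
The three fill strategies can be compared side by side on a small example. This sketch assumes a toy 3-day series with the values 0, 3 and 6 (not the article's data), so that linear interpolation should recover the integers in between.

```python
import pandas as pd

# Toy data (an assumption for illustration): one value every 3 days.
sparse = pd.Series(
    [0.0, 3.0, 6.0],
    index=pd.date_range("2011-01-01", periods=3, freq="3D"),
)

# Forward fill, at most 1 step: the second day of each gap stays NaN.
forward = sparse.resample("D").ffill(limit=1)

# Backward fill, at most 1 step: the first day of each gap stays NaN.
backward = sparse.resample("D").bfill(limit=1)

# Linear interpolation fills every gap from the neighbouring known values.
linear = sparse.resample("D").interpolate("linear")

print(pd.DataFrame({"ffill": forward, "bfill": backward, "linear": linear}))
```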


Origin blog.csdn.net/qq_45176548/article/details/109588361