Python tutorial (Python learning route): processing time series in detail with the Pandas data analysis library

When analyzing data in Python, you often need to convert and process time and date formats, especially in time-related analysis and data mining. Quantitative trading, for example, looks for patterns in stock price changes within historical data. Python ships with the datetime module for handling time, and NumPy provides corresponding methods. Pandas, the data analysis library of the Python ecosystem, also offers powerful date handling and is a capable tool for processing time series.
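
As a rough illustration of how the three options relate, the sketch below expresses one instant with datetime, NumPy, and Pandas and converts between them. The variable names and the sample date are made up for this example.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# The same instant expressed with each library
py_dt = datetime(2019, 1, 1, 9, 30)
np_dt = np.datetime64('2019-01-01T09:30')
pd_ts = pd.Timestamp('2019-01-01 09:30')

# pd.Timestamp accepts the other two types directly
from_py = pd.Timestamp(py_dt)
from_np = pd.Timestamp(np_dt)
print(from_py == pd_ts, from_np == pd_ts)

# Convert back to the standard-library type
print(pd_ts.to_pydatetime() == py_dt)
```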

1. Generating a sequence of dates

Pandas mainly provides two methods for this, pd.date_range() and pd.period_range(). Both take a start time, an end time, the number of periods to generate, and a frequency (freq='M' for month, 'D' for day, 'W' for week, 'Y' for year, and so on).

The main difference between the two is that pd.date_range() generates a DatetimeIndex, while pd.period_range() generates a PeriodIndex.

The following examples generate monthly, weekly, and hourly sequences with both methods so the results can be compared:

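import pandas as pd

# Note: pandas 2.2 and later prefer freq='ME' (month end) in date_range and warn on 'M'
date_rng = pd.date_range('2019-01-01', freq='M', periods=12)
print(f'month date_range(): {date_rng}')
"""
month date_range():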
DatetimeIndex(['2019-01-31', '2019-02-28', '2019-03-31', '2019-04-30',
'2019-05-31', '2019-06-30', '2019-07-31', '2019-08-31',
'2019-09-30', '2019-10-31', '2019-11-30', '2019-12-31'],
dtype='datetime64[ns]', freq='M')
"""
period_rng = pd.period_range('2019/01/01', freq='M', periods=12)
print(f'month period_range(): {period_rng}')
"""
month period_range():
PeriodIndex(['2019-01', '2019-02', '2019-03', '2019-04', '2019-05', '2019-06',
'2019-07', '2019-08', '2019-09', '2019-10', '2019-11', '2019-12'],
dtype='period[M]', freq='M')
"""
date_rng = pd.date_range('2019-01-01', freq='W-SUN', periods=12)
print(f'week date_range(): {date_rng}')
"""
week date_range():
DatetimeIndex(['2019-01-06', '2019-01-13', '2019-01-20', '2019-01-27',
'2019-02-03', '2019-02-10', '2019-02-17', '2019-02-24',
'2019-03-03', '2019-03-10', '2019-03-17', '2019-03-24'],
dtype='datetime64[ns]', freq='W-SUN')
"""
period_rng = pd.period_range('2019-01-01', freq='W-SUN', periods=12)
print(f'week period_range(): {period_rng}')
"""
week period_range():
PeriodIndex(['2018-12-31/2019-01-06', '2019-01-07/2019-01-13',
'2019-01-14/2019-01-20', '2019-01-21/2019-01-27',
'2019-01-28/2019-02-03', '2019-02-04/2019-02-10',
'2019-02-11/2019-02-17', '2019-02-18/2019-02-24',
'2019-02-25/2019-03-03', '2019-03-04/2019-03-10',
'2019-03-11/2019-03-17', '2019-03-18/2019-03-24'],
dtype='period[W-SUN]', freq='W-SUN')
"""
# Note: pandas 2.2 and later prefer the lowercase alias freq='h' and warn on 'H'
date_rng = pd.date_range('2019-01-01 00:00:00', freq='H', periods=12)
print(f'hour date_range(): {date_rng}')
"""
hour date_range():
DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 01:00:00',
'2019-01-01 02:00:00', '2019-01-01 03:00:00',
'2019-01-01 04:00:00', '2019-01-01 05:00:00',
'2019-01-01 06:00:00', '2019-01-01 07:00:00',
'2019-01-01 08:00:00', '2019-01-01 09:00:00',
'2019-01-01 10:00:00', '2019-01-01 11:00:00'],
dtype='datetime64[ns]', freq='H')
"""
period_rng = pd.period_range('2019-01-01 00:00:00', freq='H', periods=12)
print(f'hour period_range(): {period_rng}')
"""
hour period_range():
PeriodIndex(['2019-01-01 00:00', '2019-01-01 01:00', '2019-01-01 02:00',
'2019-01-01 03:00', '2019-01-01 04:00', '2019-01-01 05:00',
'2019-01-01 06:00', '2019-01-01 07:00', '2019-01-01 08:00',
'2019-01-01 09:00', '2019-01-01 10:00', '2019-01-01 11:00'],
dtype='period[H]', freq='H')
"""
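
The examples above all pass a start time together with periods. As a small sketch of the other parameter combinations mentioned earlier, the snippet below uses start and end instead. The dates and variable names are chosen only for illustration.

```python
import pandas as pd

# start + end + freq: pandas works out how many points fit
days = pd.date_range(start='2019-01-01', end='2019-01-10', freq='D')
print(len(days))  # 10, both endpoints are included

# start + end + periods: evenly spaced points, no freq needed
even = pd.date_range(start='2019-01-01', end='2019-01-05', periods=3)
print(even)

# period_range also accepts start and end
months = pd.period_range(start='2019-01', end='2019-06', freq='M')
print(len(months))  # 6
```

Of the four arguments start, end, periods, and freq, date_range needs exactly three, which is why the second call can leave freq out.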

2. Creating and converting Timestamp objects

A Timestamp object can be created with either pd.Timestamp() or pd.to_datetime(), as follows:

from datetime import datetime

ts = pd.Timestamp(2019, 1, 1)
print(f'pd.Timestamp()-1:{ts}')
# pd.Timestamp()-1:2019-01-01 00:00:00
ts = pd.Timestamp(datetime(2019, 1, 1, hour=0, minute=1, second=1))
print(f'pd.Timestamp()-2:{ts}')
# pd.Timestamp()-2:2019-01-01 00:01:01
ts = pd.Timestamp("2019-1-1 0:1:1")
print(f'pd.Timestamp()-3:{ts}')
# pd.Timestamp()-3:2019-01-01 00:01:01
print(f'pd.Timestamp()-type:{type(ts)}')
# pd.Timestamp()-type:<class 'pandas._libs.tslibs.timestamps.Timestamp'>
# pd.to_datetime(2019, 1, 1) is not supported
dt = pd.to_datetime(datetime(2019, 1, 1, hour=0, minute=1, second=1))
print(f'pd.to_datetime()-1:{dt}')
# pd.to_datetime()-1:2019-01-01 00:01:01
dt = pd.to_datetime("2019-1-1 0:1:1")
print(f'pd.to_datetime()-2:{dt}')
# pd.to_datetime()-2:2019-01-01 00:01:01
print(f'pd.to_datetime()-type:{type(dt)}')
# pd.to_datetime()-type:<class 'pandas._libs.tslibs.timestamps.Timestamp'>
# pd.to_datetime can also build a custom time series from a list
dtlist = pd.to_datetime(["2019-1-1 0:1:1", "2019-3-1 0:1:1"])
print(f'pd.to_datetime()-list:{dtlist}')
# pd.to_datetime()-list:DatetimeIndex(['2019-01-01 00:01:01', '2019-03-01 00:01:01'], dtype='datetime64[ns]', freq=None)
# Convert a timestamp to a monthly Period
pr = ts.to_period('M')
print(f'ts.to_period():{pr}')
# ts.to_period():2019-01
print(f'pd.to_period()-type:{type(pr)}')
# pd.to_period()-type:<class 'pandas._libs.tslibs.period.Period'>
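
Real data often arrives as strings in a fixed layout, sometimes with bad values mixed in. The following is a minimal sketch, with made-up sample strings, of how the format and errors parameters of pd.to_datetime() might be used for that kind of format conversion.

```python
import pandas as pd

raw = ['2019/01/05', '2019/02/30', 'not a date']

# format= spells out the expected layout; errors='coerce' turns
# anything unparseable (including the impossible 30 February) into NaT
parsed = pd.to_datetime(raw, format='%Y/%m/%d', errors='coerce')
print(parsed)

# Format the valid timestamp back into a string with a different layout
print(parsed[0].strftime('%d.%m.%Y'))
```

With the default errors='raise', the same call would stop with an exception at the first bad value instead of returning NaT.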

3. Creating and converting Period objects

# Define a Period
per = pd.Period('2019')
print(f'pd.Period(): {per}')
# pd.Period(): 2019
per_del = pd.Period('2019') - pd.Period('2018')
# An integer can also be added or subtracted directly (in units of the frequency, here years)
print(f'Interval between 2019 and 2018: {per_del}')
# Interval between 2019 and 2018: 1
# (pandas 0.24 and later return a DateOffset here, printed as <YearEnd: month=12>; per_del.n gives the integer 1)
# Convert to a timestamp
print(per.to_timestamp(how='end'))    # 2019-12-31 00:00:00 (pandas 0.24 and later: 2019-12-31 23:59:59.999999999)
print(per.to_timestamp(how='start'))  # 2019-01-01 00:00:00
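
The comment above notes that an integer can be added to or subtracted from a Period directly. A short sketch of that, using a monthly period picked only for illustration, together with the start_time, end_time, and asfreq members of Period:

```python
import pandas as pd

per = pd.Period('2019-03', freq='M')

# Adding or subtracting an integer shifts by that many units of the frequency
next_month = per + 1
prev_year = per - 12
print(next_month, prev_year)

# start_time and end_time give the timestamps that bound the period
print(per.start_time)
print(per.end_time)

# asfreq re-expresses the period at another frequency
first_day = per.asfreq('D', how='start')
print(first_day)
```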

4. Creating Timedelta intervals

# Create a Timedelta interval
print(pd.Timedelta(days=5, minutes=50, seconds=20, milliseconds=10, microseconds=10, nanoseconds=10))
# 5 days 00:50:20.010010010
# Get the current time (pd.datetime was removed in pandas 2.0, so use pd.Timestamp.now())
now = pd.Timestamp.now()
# Calculate the time 50 days after the current time
dt = now + pd.Timedelta(days=50)
print(f'The current time is {now}, and the time 50 days later is {dt}')
# The current time is 2019-06-08 17:59:31.726065, and the time 50 days later is 2019-07-28 17:59:31.726065
# Display only the date
print(dt.strftime('%Y-%m-%d'))  # 2019-07-28
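
A Timedelta also results from subtracting one timestamp from another. The sketch below, with two arbitrary timestamps chosen for illustration, shows that and a few ways to read the interval back out.

```python
import pandas as pd

start = pd.Timestamp('2019-06-08 08:00:00')
end = pd.Timestamp('2019-06-10 20:30:00')

# Subtracting two timestamps yields a Timedelta
gap = end - start
print(gap)

# Whole days, and the full span expressed in seconds or hours
print(gap.days)
print(gap.total_seconds())
hours = gap / pd.Timedelta(hours=1)
print(hours)

# Timedelta also parses strings
same = pd.Timedelta('1 days 06:00:00') == pd.Timedelta(hours=30)
print(same)
```

Dividing by another Timedelta is a convenient way to express the interval in any unit, since the days attribute only counts whole days.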

5. Frequency conversion and resampling

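# asfreq: display the index values by quarter
# Calling asfreq on a DatetimeIndex fails with "'DatetimeIndex' object has no attribute 'asfreq'", so convert to periods first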
date=pd.date_range('1/1/2018', periods=20, freq='D')
tsdat_series=pd.Series(range(20),index=date)
tsp_series=tsdat_series.to_period('D')
print(tsp_series.index.asfreq('Q'))
date=pd.period_range('1/1/2018', periods=20, freq='D')
tsper_series=pd.Series(range(20),index=date)
print(tsper_series.index.asfreq('Q'))
"""
PeriodIndex(['2018Q1', '2018Q1', '2018Q1', '2018Q1', '2018Q1', '2018Q1',
'2018Q1', '2018Q1', '2018Q1', '2018Q1', '2018Q1', '2018Q1',
'2018Q1', '2018Q1', '2018Q1', '2018Q1', '2018Q1', '2018Q1',
'2018Q1', '2018Q1'],
dtype='period[Q-DEC]', freq='Q-DEC')
"""
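# resample: aggregate by quarter and display the statistics
# (pandas 2.2 and later prefer 'QE' over 'Q' in resample)
print(tsdat_series.resample('Q').sum().to_period('Q'))
"""
2018Q1    190
Freq: Q-DEC, dtype: int64
"""
# groupby: group by day of the week (0 = Monday) and take the mean
# dayofweek is a property on both Timestamp and DatetimeIndex; weekday is a method on Timestamp
print(tsdat_series.groupby(lambda x: x.dayofweek).mean())
"""
0     7.0
1     8.0
2     9.0
3    10.0
4    11.0
5    12.0
6     9.5
dtype: float64
"""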
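
To make the difference between resampling and plain frequency conversion concrete, here is a small sketch on the same kind of 20-day series. The weekly frequency and the variable names are chosen for this example only.

```python
import pandas as pd

idx = pd.date_range('2018-01-01', periods=20, freq='D')
daily = pd.Series(range(20), index=idx)

# resample aggregates every value that falls into each weekly bin
weekly_sum = daily.resample('W-SUN').sum()
print(weekly_sum)

# asfreq only selects the values sitting on the new frequency's timestamps
weekly_pick = daily.asfreq('W-SUN')
print(weekly_pick)
```

Here resample returns three weekly totals that still add up to 190, while asfreq keeps just the two values recorded on Sundays and discards the rest.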

Earlier tutorials in this series also covered pandas, so if anything here is unclear, go back and review them. More Python tutorials and Python learning routes will continue to be shared!

Origin www.cnblogs.com/cherry-tang/p/11002173.html