pierre_j :
I am sorry, I am kind of stuck.
I would like to use the DatetimeIndex objects stored in a column of one dataframe to create new rows in another dataframe.
The timestamps in these DatetimeIndex objects have to become the index values of the new rows.
With the following data:
import pandas as pd
df = pd.DataFrame({'Start': [pd.Timestamp('1970-01-02 00:00:00'),pd.Timestamp('1970-03-02 00:00:00')], 'End': [pd.Timestamp('1970-01-02 00:10:00'), pd.Timestamp('1970-03-02 00:10:00')], 'Freq': [pd.Timedelta(5,'m'),pd.Timedelta(5,'m')]})
df = df.apply(lambda x: pd.date_range(start = x.Start, end = x.End, freq = x.Freq), axis=1)
df2 = pd.DataFrame({'Timestamp':[pd.Timestamp('1970-01-03 00:00:00')], 'Data':[4]}).set_index('Timestamp')
I get the inputs:
In [62]: df2.index
Out[62]: DatetimeIndex(['1970-01-03'], dtype='datetime64[ns]', name='Timestamp', freq=None)
In [63]: df.to_list()
Out[21]:
[DatetimeIndex(['1970-01-02 00:00:00', '1970-01-02 00:05:00',
                '1970-01-02 00:10:00'],
               dtype='datetime64[ns]', freq='5T'),
 DatetimeIndex(['1970-03-02 00:00:00', '1970-03-02 00:05:00',
                '1970-03-02 00:10:00'],
               dtype='datetime64[ns]', freq='5T')]
What I would like to get is a dataframe based on df2, extended with new rows whose index values are the timestamps coming from df:
df2_new
                     Data
Timestamp
1970-01-03 00:00:00     4
1970-01-02 00:00:00
1970-01-02 00:05:00
1970-01-02 00:10:00
1970-03-02 00:00:00
1970-03-02 00:05:00
1970-03-02 00:10:00
I tried the following line, but I get an error:
df2.reindex(df2.index.to_list() + df.to_list())
TypeError: unhashable type: 'DatetimeIndex'
The example I give is simplified, as df has only two rows here, but it could have more.
Do you have any idea how I could do this?
Thanks in advance for your help! Have a good evening, Bests!
Scott Boston :
IIUC, you can define your 'time range' a little differently, but the key step is to use pd.Index.union:
import pandas as pd

df = pd.DataFrame({'Start': [pd.Timestamp('1970-01-02 00:00:00')],
                   'End': [pd.Timestamp('1970-01-02 00:10:00')],
                   'Freq': [pd.Timedelta(5, 'm')]})
# Expand the row into a Series of timestamps; [0] selects the range built from row 0.
timerange = df.apply(lambda x: pd.Series(pd.date_range(start=x.Start,
                                                       end=x.End,
                                                       freq=x.Freq)),
                     axis=1).stack()[0]
df2 = pd.DataFrame({'Timestamp': [pd.Timestamp('1970-01-03 00:00:00')],
                    'Data': [4]}).set_index('Timestamp')
# Union the existing index with the new timestamps, then reindex; the new rows get NaN.
df2 = df2.reindex(df2.index.union(timerange))
df2
Output:
                     Data
1970-01-02 00:00:00   NaN
1970-01-02 00:05:00   NaN
1970-01-02 00:10:00   NaN
1970-01-03 00:00:00   4.0
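The snippet above uses a single-row df, and `.stack()[0]` keeps only the range built from the first row. Since the question mentions that df may have several rows, here is a minimal sketch of how the same union idea might be extended, assuming the two-row df from the question. The names `new_index` and `df2_new` are only illustrative, and `pd.Timedelta(minutes=5)` is used in place of `pd.Timedelta(5,'m')` purely as a spelling choice.

```python
import pandas as pd

# Two-row input, as in the question.
df = pd.DataFrame({
    'Start': [pd.Timestamp('1970-01-02 00:00:00'), pd.Timestamp('1970-03-02 00:00:00')],
    'End': [pd.Timestamp('1970-01-02 00:10:00'), pd.Timestamp('1970-03-02 00:10:00')],
    'Freq': [pd.Timedelta(minutes=5), pd.Timedelta(minutes=5)],
})
df2 = pd.DataFrame({'Timestamp': [pd.Timestamp('1970-01-03 00:00:00')],
                    'Data': [4]}).set_index('Timestamp')

# Start from df2's own index and fold in one date_range per row of df.
new_index = df2.index
for row in df.itertuples(index=False):
    new_index = new_index.union(
        pd.date_range(start=row.Start, end=row.End, freq=row.Freq)
    )

# Reindexing keeps the existing row and fills the new timestamps with NaN.
df2_new = df2.reindex(new_index)
print(df2_new)
```

Note that `union` returns a sorted index without duplicates, so the rows come out in chronological order rather than in the order shown in the question. If the original order matters, `Index.append` could probably be used instead of `union`, at the cost of not removing duplicate timestamps.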