Alex T :
I have two dataframes. The first is DF1:
ID DatePaid Remaining
A1 2018-01-01 8500
A2 2018-02-15 2000
A2 2018-02-28 1900
A3 2018-04-12 3000
A3 2018-05-12 2700
A3 2018-05-17 110
A3 2018-06-17 0
A4 2018-06-18 10
A5 2018-07-13 500
I also have another dataframe, DF2, which has only the unique IDs from the first dataframe, plus columns whose dates represent month ends:
ID 2018-01-31 2018-02-28 2018-03-31 2018-04-30 2018-05-31 2018-06-30 2018-07-31
A1
A2
A3
A4
A5
Based on the first dataframe, I need to fill DF2 with the Remaining value from DF1 that falls within the corresponding month. For example, I take the last value for A3 in 2018-05 and put it in the 2018-05-31 column of DF2. If there are no later values for that ID, all the remaining columns in DF2 should be filled with the value from the right-most filled column (roll over to the right).
So the end result should look exactly like this:
ID 2018-01-31 2018-02-28 2018-03-31 2018-04-30 2018-05-31 2018-06-30 2018-07-31
A1 8500 8500 8500 8500 8500 8500 8500
A2 NA 1900 1900 1900 1900 1900 1900
A3 NA NA NA 3000 110 0 0
A4 NA NA NA NA NA 10 10
A5 NA NA NA NA NA NA 500
Quang Hoang :
This gives you the data in df2 form:
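import pandas as pd

# Map each payment date to its calendar month (a monthly Period)
month_ends = pd.to_datetime(df1.DatePaid).dt.to_period('M')
# Alternatively, use month-end timestamps as the column labels:
# month_ends = pd.to_datetime(df1.DatePaid).add(pd.offsets.MonthEnd(0))
(df1.groupby(['ID', month_ends])
    ['Remaining'].last()          # last Remaining per ID and month (rows assumed sorted by date)
    .unstack(-1)                  # months become columns
    .ffill(axis=1)                # roll over to the right; newer pandas needs the axis keyword
    .reset_index()
    .rename_axis(columns=None)
)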
Output:
ID 2018-01 2018-02 2018-04 2018-05 2018-06 2018-07
0 A1 8500.0 8500.0 8500.0 8500.0 8500.0 8500.0
1 A2 NaN 1900.0 1900.0 1900.0 1900.0 1900.0
2 A3 NaN NaN 3000.0 110.0 0.0 0.0
3 A4 NaN NaN NaN NaN 10.0 10.0
4 A5 NaN NaN NaN NaN NaN 500.0
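Note that the output above has no 2018-03 column, because no payment falls in March, while the desired result in the question does include 2018-03-31. A minimal sketch of one way to add it, assuming the rows of df1 are already sorted by date within each ID and that the wanted columns run from the first to the last payment month (the names all_months, wide and result are only illustrative):

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    'ID': ['A1', 'A2', 'A2', 'A3', 'A3', 'A3', 'A3', 'A4', 'A5'],
    'DatePaid': ['2018-01-01', '2018-02-15', '2018-02-28', '2018-04-12',
                 '2018-05-12', '2018-05-17', '2018-06-17', '2018-06-18',
                 '2018-07-13'],
    'Remaining': [8500, 2000, 1900, 3000, 2700, 110, 0, 10, 500],
})

months = pd.to_datetime(df1['DatePaid']).dt.to_period('M')

# Every month from the first to the last payment, including empty ones
all_months = pd.period_range(months.min(), months.max(), freq='M')

wide = (df1.groupby(['ID', months])['Remaining'].last()
           .unstack(-1)
           .reindex(columns=all_months)   # inserts the missing 2018-03 column as NaN
           .ffill(axis=1))                # then roll values over to the right

# Label the columns with month-end dates, as in DF2
wide.columns = [p.end_time.strftime('%Y-%m-%d') for p in wide.columns]
result = wide.reset_index()
print(result)
```

The reindex has to come before the ffill, so that the new March column picks up the value carried over from February instead of staying empty.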