CHRD :
I have df
below which I have sorted according to the ID
variable and the time variable T
as the secondary sort.
df = pd.DataFrame({
'ID': ['a', 'b', 'c', 'b', 'b'],
'T': [
datetime.datetime(2019, 1, 1),
datetime.datetime(2017, 1, 1),
datetime.datetime(2018, 1, 1),
datetime.datetime(2020, 1, 1),
datetime.datetime(2021, 1, 1)],
'V': [3, 5, 8, 6, 1]
}).sort_values(['ID', 'T'], ascending=False)
df
ID T V
2 c 2018-01-01 8
4 b 2021-01-01 1
3 b 2020-01-01 6
1 b 2017-01-01 5
0 a 2019-01-01 3
I want to add a new column V_L
where, for each ID
, the last value (based on the time column T
) is shown. If there is no last value this should be indicated by a null
value in V_L
. An example output would look like this:
df
ID T V V_L
0 a 2018-01-01 8 NaN
1 b 2021-01-01 1 6.0
2 b 2020-01-01 6 5.0
3 b 2017-01-01 5 NaN
4 c 2019-01-01 3 NaN
YOBEN_S :
IIUC
df['V_L'] = df.groupby('ID').V.shift(-1)
df
Out[350]:
ID T V V_L
2 c 2018-01-01 8 NaN
4 b 2021-01-01 1 6.0
3 b 2020-01-01 6 5.0
1 b 2017-01-01 5 NaN
0 a 2019-01-01 3 NaN
Guess you like
Origin http://43.154.161.224:23101/article/api/json?id=220525&siteId=1