Create new column in Pandas dataframe based on latest value per ID

CHRD :

I have df below which I have sorted according to the IDvariable and the time variable T as the secondary sort.

df = pd.DataFrame({
    'ID': ['a', 'b', 'c', 'b', 'b'],
    'T': [
        datetime.datetime(2019, 1, 1),
        datetime.datetime(2017, 1, 1),
        datetime.datetime(2018, 1, 1),
        datetime.datetime(2020, 1, 1),
        datetime.datetime(2021, 1, 1)],
    'V': [3, 5, 8, 6, 1]
}).sort_values(['ID', 'T'], ascending=False)

df

    ID  T           V
2   c   2018-01-01  8
4   b   2021-01-01  1
3   b   2020-01-01  6
1   b   2017-01-01  5
0   a   2019-01-01  3

I want to add a new column V_L where, for each ID, the last value (based on the time column T) is shown. If there is no last value this should be indicated by a null value in V_L. An example output would look like this:

df
    ID  T           V   V_L
0   a   2018-01-01  8   NaN
1   b   2021-01-01  1   6.0
2   b   2020-01-01  6   5.0
3   b   2017-01-01  5   NaN
4   c   2019-01-01  3   NaN
YOBEN_S :

IIUC

df['V_L'] = df.groupby('ID').V.shift(-1)
df
Out[350]: 
  ID           T  V  V_L
2  c  2018-01-01  8  NaN
4  b  2021-01-01  1  6.0
3  b  2020-01-01  6  5.0
1  b  2017-01-01  5  NaN
0  a  2019-01-01  3  NaN

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=220525&siteId=1