Python: get min/max between 2 columns across a sequence of numbers

user13209035 :

My dataframe looks like this:

id  start  end
1   101    102
1   102    104
1   104    110
1   125    128
2   100    102
2   102    104
2   110    115  

I want the output to be:

id  start  end
1   101    110
1   125    128
2   100    104
2   110    115

yatu :

Here's one approach:

import numpy as np

a = df[['start', 'end']].values
# check which end is different from the start of the row below
m = (a[:-1] != a[1:,::-1]).all(1)
# array([False, False,  True,  True, False,  True])
# Take the cumsum and use it to group the df rows
g = np.cumsum(np.r_[False, m])
# array([0, 0, 0, 1, 2, 2, 3], dtype=int32)
# group the df and take the first and last sample accordingly
out = df.groupby(g).agg({'id':'first', 'start':'first', 'end':'last'})

print(out)

   id  start  end
0   1    101  110
1   1    125  128
2   2    100  104
3   2    110  115
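
The same idea can also be sketched in pandas alone. This is only a variant under a few assumptions: the rows are already sorted by start within each id, a run should never continue across two different ids, and the sample frame is rebuilt here so the snippet is self-contained. The names prev_end, new_run, run_id and merged are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question, rebuilt so the snippet runs on its own
df = pd.DataFrame({
    "id":    [1, 1, 1, 1, 2, 2, 2],
    "start": [101, 102, 104, 125, 100, 102, 110],
    "end":   [102, 104, 110, 128, 102, 104, 115],
})

# End of the previous row within the same id (NaN on the first row of each id)
prev_end = df.groupby("id")["end"].shift()
# A new run begins wherever the start does not continue the previous end
new_run = df["start"].ne(prev_end)
# The running count of run starts gives one label per run
run_id = new_run.cumsum().rename("run")

# First start and last end of every run
merged = (
    df.groupby(run_id)
      .agg(id=("id", "first"), start=("start", "first"), end=("end", "last"))
      .reset_index(drop=True)
)
print(merged)
```

Because the shift is done per id, the first row of each id always opens a new run, even if its start happens to equal the last end of the previous id.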
