user13209035 :
My dataframe looks like this:
id start end
1 101 102
1 102 104
1 104 110
1 125 128
2 100 102
2 102 104
2 110 115
I want the output to be:
id start end
1 101 110
1 125 128
2 100 104
2 110 115
yatu :
Here's one approach:
import numpy as np
a = df[['start', 'end']].values
# check which end differs from the start of the row below
m = (a[:-1] != a[1:,::-1]).all(1)
# array([False, False, True, True, False, True])
# Take the cumsum and use it to group the df rows
g = np.cumsum(np.r_[False, m])
# array([0, 0, 0, 1, 2, 2, 3], dtype=int32)
# group the df and take the first and last sample accordingly
out = df.groupby(g).agg({'id':'first', 'start':'first', 'end':'last'})
print(out)
   id  start  end
0   1    101  110
1   1    125  128
2   2    100  104
3   2    110  115
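One caveat with the approach above: the comparison runs over the whole frame and ignores id, so the last row of one id would be merged with the first row of the next id if their end and start happened to match. Below is a minimal sketch of an id-aware variant in plain pandas. It assumes the rows are already sorted by id and start, as in the question; the names prev_end, new_run, run_id and merged are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id":    [1, 1, 1, 1, 2, 2, 2],
    "start": [101, 102, 104, 125, 100, 102, 110],
    "end":   [102, 104, 110, 128, 102, 104, 115],
})

# Assumption: rows are sorted by id and then by start.
# End of the previous row within the same id (NaN on the first row of each id).
prev_end = df.groupby("id")["end"].shift()

# A new run begins wherever start does not continue the previous end.
# NaN never compares equal, so the first row of each id always opens a run.
new_run = df["start"].ne(prev_end)

# The running count of run starts gives one label per run of chained rows.
run_id = new_run.cumsum().rename("run")

# Collapse each run to its first start and last end.
merged = (
    df.groupby(run_id)
      .agg(id=("id", "first"), start=("start", "first"), end=("end", "last"))
      .reset_index(drop=True)
)
print(merged)
```

On the sample data this should give the same four rows as the NumPy version; the two only differ when an id boundary happens to look like a continuation.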