drchops :
I have the following dataframe:
ID Days TreatmentGiven TreatmentNumber
--- ---- -------------- ---------------
1 0 False NaN
1 30 False NaN
1 40 True 1.0
1 56 True 2.0
2 0 False NaN
2 14 True 4.0
2 28 True 5.0
3 0 False NaN
3 28 True 1.0
3 56 True 2.0
4 0 False NaN
4 100 True 6.0
4 128 True 7.0
I'd like to delete all rows for an ID where an ID does not have a TreatmentNumber == 1.0, so that the result is the following:
ID Days TreatmentGiven TreatmentNumber
--- ---- -------------- ---------------
1 0 False NaN
1 30 False NaN
1 40 True 1.0
1 56 True 2.0
3 0 False NaN
3 28 True 1.0
3 56 True 2.0
What is the best way to do this?
Thank you.
Quang Hoang :
You can do with groupby().transform()
:
df[df.TreatmentNumber.eq(1).groupby(df['ID']).transform('any')]
Output:
ID Days TreatmentGiven TreatmentNumber
0 1 0 False NaN
1 1 30 False NaN
2 1 40 True 1.0
3 1 56 True 2.0
7 3 0 False NaN
8 3 28 True 1.0
9 3 56 True 2.0
Guess you like
Origin http://43.154.161.224:23101/article/api/json?id=360199&siteId=1