Python_learner :
I have the following DataFrame consisting out of columns id, brand and count
Id brand count
1 Audi 3
2 BWM 5
2 FORD 3
3 AUDI 7
4 BMW 2
5 Audi 4
5 FORD 3
I would like to groupby id and only remain each id with the brand that has the highest count. So in the end I would like to have the following:
id brand
1 AUDI
2 BMW
3 AUDI
4 BMW
5 AUDI
I have something like this but that obviously is not working. So what would be the correct function or syntax to accomplish that? Thanks!
data.groupby('id')['brand'].where(max('count'))
Chris A :
IIUC use groupby.idxmax
and loc
:
df.loc[df.groupby('Id')['count'].idxmax()]
[out]
Id brand count
0 1 Audi 3
1 2 BWM 5
3 3 AUDI 7
4 4 BMW 2
5 5 Audi 4