bk201 :
Tough one.
Start with this DataFrame:
import pandas as pd

df = pd.DataFrame({
    'number': [4.4, 11, 2.4, 5, 12, 22],
    'id': [1, 1, 2, 2, 3, 3]
})
| number | id |
|--------|----|
| 4.4 | 1 |
| 11 | 1 |
| 2.4 | 2 |
| 5 | 2 |
| 12 | 3 |
| 22 | 3 |
I want to group by the id column and add a third column called unique_above_10. Its value should be 1 on a row whose number is > 10, but only if that is the one and only value > 10 in its group; otherwise it should be 0.
So the new DataFrame should look like this:
| number | id | unique_above_10 |
|--------|----|-----------------|
| 4.4 | 1 | 0 |
| 11 | 1 | 1 |
| 2.4 | 2 | 0 |
| 5 | 2 | 0 |
| 12 | 3 | 0 |
| 22 | 3 | 0 |
jezrael :
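Build a boolean mask m of the values greater than 10, count the matches per group by taking the sum with GroupBy.transform, compare that count with 1 using eq(1), and chain the result with the mask m using & (bitwise AND), so that only the matching row itself is flagged: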
m = df['number'].gt(10)
df['unique_above_10'] = (m.groupby(df['id']).transform('sum').eq(1) & m).astype(int)
print(df)
   number  id  unique_above_10
0     4.4   1                0
1    11.0   1                1
2     2.4   2                0
3     5.0   2                0
4    12.0   3                0
5    22.0   3                0
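For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The helper name flag_unique_above and its threshold parameter are my own assumptions for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'number': [4.4, 11, 2.4, 5, 12, 22],
    'id': [1, 1, 2, 2, 3, 3],
})

def flag_unique_above(frame, threshold=10):
    """Hypothetical helper: 1 where a row is the only value above threshold in its id group."""
    # Boolean mask of rows whose number exceeds the threshold
    m = frame['number'].gt(threshold)
    # Number of matching rows in each id group, broadcast back to every row
    per_group = m.groupby(frame['id']).transform('sum')
    # Keep only rows that match AND belong to a group with exactly one match
    return (per_group.eq(1) & m).astype(int)

df['unique_above_10'] = flag_unique_above(df)
print(df['unique_above_10'].tolist())  # [0, 1, 0, 0, 0, 0]
```

The result matches the expected table in the question: only the row with 11 in group 1 is flagged.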
Details:
print(m)
0    False
1     True
2    False
3    False
4     True
5     True
Name: number, dtype: bool
print(m.groupby(df['id']).transform('sum'))
0    1.0
1    1.0
2    0.0
3    0.0
4    2.0
5    2.0
Name: number, dtype: float64
print(m.groupby(df['id']).transform('sum').eq(1))
0     True
1     True
2    False
3    False
4    False
5    False
Name: number, dtype: bool
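The last output above is True for both rows of group 1, which is why the final & m is needed. A small sketch comparing the result with and without the mask (the variable names one_match_in_group, without_mask and with_mask are mine, chosen for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'number': [4.4, 11, 2.4, 5, 12, 22],
    'id': [1, 1, 2, 2, 3, 3],
})

m = df['number'].gt(10)
# True for every row of a group that contains exactly one value > 10
one_match_in_group = m.groupby(df['id']).transform('sum').eq(1)

# Without the mask, the 4.4 row in group 1 is flagged as well
without_mask = one_match_in_group.astype(int)
# With the mask, only the row that is itself > 10 is flagged
with_mask = (one_match_in_group & m).astype(int)

print(without_mask.tolist())  # [1, 1, 0, 0, 0, 0]
print(with_mask.tolist())     # [0, 1, 0, 0, 0, 0]
```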