Assign different aggregation functions to different features in pandas groupby

Arch Desai :

I have data like the following (the real data has 100 columns instead of 4):

import pandas as pd

raw_data = {
    'age': [52, 52, 24, 24, 24],
    'a': [4, 24, 31, 2, 3],
    'b': [3, 2, 3, 4, 3],
    'c': [2, 5, 8, 2, 1],
}
df = pd.DataFrame(raw_data, columns=['age', 'a', 'b', 'c'])

which results in

    age a   b   c
0   52  4   3   2
1   52  24  2   5
2   24  31  3   8
3   24  2   4   2
4   24  3   3   1

I want to group the data by age and compute the mean of some features and the sum of the remaining features. I have tried this:

feats = ['a', 'b']
df.groupby('age').agg({feats:['mean'], 'c':['sum']})

Since the real data has 100 features, assigning functions to each feature individually is not practical (I have multiple functions to assign: RMS, kurtosis, energy index, etc.). I could do it, but it would be very time-consuming and not smart. Is there any way I can achieve this?

Scott Boston :

Use a dictionary comprehension to build the aggregation mapping, one entry per column:

agg_d = {i:'mean' for i in feats}
agg_d['c'] = 'sum'

df.groupby('age').agg(agg_d)

Output:

      a         b   c
age                  
24   12  3.333333  11
52   14  2.500000   7
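
With many columns, the same dictionary can be built without naming each column. The following is only a minimal sketch, assuming that every column other than the grouping key and the listed mean features should be summed; the names mean_feats and sum_feats are illustrative and not from the question.

```python
import pandas as pd

raw_data = {
    'age': [52, 52, 24, 24, 24],
    'a': [4, 24, 31, 2, 3],
    'b': [3, 2, 3, 4, 3],
    'c': [2, 5, 8, 2, 1],
}
df = pd.DataFrame(raw_data)

# Assumption: these are the columns to average.
mean_feats = ['a', 'b']
# Assumption: every remaining column except the grouping key is summed.
sum_feats = [col for col in df.columns
             if col not in mean_feats and col != 'age']

# Build one mapping of column name -> aggregation function name.
agg_d = {col: 'mean' for col in mean_feats}
agg_d.update({col: 'sum' for col in sum_feats})

result = df.groupby('age').agg(agg_d)
print(result)
```

Deriving sum_feats from df.columns means a new column is picked up automatically, at the cost of summing columns you may not have intended to include.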

Update: you can also apply multiple aggregation functions to a column by passing a list:

agg_d = {i:['sum','max','first', lambda x: sum(x**2)] for i in feats}
agg_d['c'] = 'sum'
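
Custom statistics such as the RMS mentioned in the question can go in the same list as named functions, which gives readable column labels instead of a lambda placeholder. This is a hedged sketch: the rms helper and the column-flattening step are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

raw_data = {
    'age': [52, 52, 24, 24, 24],
    'a': [4, 24, 31, 2, 3],
    'b': [3, 2, 3, 4, 3],
    'c': [2, 5, 8, 2, 1],
}
df = pd.DataFrame(raw_data)


def rms(x):
    """Root mean square of the values in one group (hypothetical helper)."""
    return np.sqrt(np.mean(x ** 2))


feats = ['a', 'b']
# Each feature gets two built-in aggregations plus the custom one.
agg_d = {col: ['sum', 'max', rms] for col in feats}
agg_d['c'] = ['sum']

result = df.groupby('age').agg(agg_d)

# The columns are a MultiIndex of (feature, function name);
# join each pair into a flat label such as 'a_rms'.
result.columns = ['_'.join(pair) for pair in result.columns]
print(result)
```

Other per-group statistics can be added the same way, as long as each function takes a Series and returns a single value.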
