Arch Desai :
My data looks like the following (my real data has 100 columns instead of 4):
import pandas as pd

raw_data = {
    'age': [52, 52, 24, 24, 24],
    'a': [4, 24, 31, 2, 3],
    'b': [3, 2, 3, 4, 3],
    'c': [2, 5, 8, 2, 1]}
df = pd.DataFrame(raw_data, columns=['age', 'a', 'b', 'c'])
which results in
   age   a  b  c
0   52   4  3  2
1   52  24  2  5
2   24  31  3  8
3   24   2  4  2
4   24   3  3  1
I want to group the data by age and compute the mean of some features and the sum of the remaining features. I have tried this:
feats = ['a', 'b']
df.groupby('age').agg({feats:['mean'], 'c':['sum']})
My real data has 100 features, and I have several functions to apply (RMS, kurtosis, energy index, etc.). Assigning a function to each feature individually would work, but it is very time-consuming. Is there a more concise way to achieve this?
Scott Boston :
A list cannot be used as a dictionary key, so build the dictionary with one entry per column using a dictionary comprehension.
agg_d = {i:'mean' for i in feats}
agg_d['c'] = 'sum'
df.groupby('age').agg(agg_d)
Output:
      a         b   c
age
24   12  3.333333  11
52   14  2.500000   7
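With 100 columns, the "remaining" features do not have to be listed by hand either. The following is a minimal sketch, assuming that every column other than `age` and the ones in the mean list should be summed; the names `mean_cols` and `sum_cols` are illustrative and not part of the question.

```python
import pandas as pd

raw_data = {
    'age': [52, 52, 24, 24, 24],
    'a': [4, 24, 31, 2, 3],
    'b': [3, 2, 3, 4, 3],
    'c': [2, 5, 8, 2, 1],
}
df = pd.DataFrame(raw_data)

# Columns that should be averaged (hypothetical selection).
mean_cols = ['a', 'b']
# Assumption: every other column except the grouping key is summed.
sum_cols = df.columns.difference(mean_cols + ['age'])

# Merge the two per-column mappings into one aggregation dictionary.
agg_d = {**{col: 'mean' for col in mean_cols},
         **{col: 'sum' for col in sum_cols}}

result = df.groupby('age').agg(agg_d)
print(result)
```

Only `mean_cols` has to be maintained manually; the sum list follows from whatever columns the frame contains.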
Update: you can also apply multiple aggregation functions to each column by passing a list:
agg_d = {i: ['sum', 'max', 'first', lambda x: sum(x**2)] for i in feats}
agg_d['c'] = 'sum'
df.groupby('age').agg(agg_d)
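For custom statistics such as the RMS mentioned in the question, a named function may be easier to work with than a lambda, because pandas uses the function's name as the column label. This is a rough sketch, not a definitive recipe: the `rms` helper and the flattened column names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'age': [52, 52, 24, 24, 24],
    'a': [4, 24, 31, 2, 3],
    'b': [3, 2, 3, 4, 3],
    'c': [2, 5, 8, 2, 1],
})


def rms(x):
    """Root mean square of one group's values (hypothetical helper)."""
    return np.sqrt(np.mean(x ** 2))


feats = ['a', 'b']
# Each feature gets the same list of built-in and custom aggregations.
agg_d = {col: ['mean', 'max', rms] for col in feats}
agg_d['c'] = ['sum']

result = df.groupby('age').agg(agg_d)
# Flatten the two-level column labels into names such as 'a_rms'.
result.columns = ['_'.join(pair) for pair in result.columns]
print(result)
```

Flattening the column index is optional; it simply makes columns such as `a_rms` addressable by a single string.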