Matthew Doering :
Given:
import pandas as pd
d = {'month': pd.Series(['jan', 'jan', 'feb', 'feb']),
     'week': pd.Series(['wk1', 'wk2', 'wk1', 'wk2']),
     'high_temp': pd.Series([10, 20, 30, 20]),
     'low_temp': pd.Series([4, 5, 23, 40])}
df = pd.DataFrame(d)
df
   high_temp  low_temp month week
0         10         4   jan  wk1
1         20         5   jan  wk2
2         30        23   feb  wk1
3         20        40   feb  wk2
What I would like to get is a new dataframe with this data:
  month  high_temp high_temp_week  low_temp low_temp_week
0   Jan         20            wk2         4           wk1
1   Feb         30            wk1        23           wk1
I can easily get the max of the temps grouped by month but I can't figure out how to bring along the week column from the row with the max value.
Ben.T :
You can do it with sort_values, drop_duplicates (keeping the last or the first row, depending on the case), and then merge. Merge only on month, and specify suffixes to rename the week column, which is present in both dataframes.
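# Highest high_temp per month: sort ascending, keep the last row of each month.
high = df[['month', 'high_temp', 'week']].sort_values('high_temp').drop_duplicates('month', keep='last')
# Lowest low_temp per month: sort ascending, keep the first row of each month.
low = df[['month', 'low_temp', 'week']].sort_values('low_temp').drop_duplicates('month', keep='first')
# Merge on month only; the suffixes rename the week column shared by both frames.
new_df = high.merge(low, on='month', suffixes=('_high_temp', '_low_temp'))
print(new_df)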
  month  high_temp week_high_temp  low_temp week_low_temp
0   jan         20            wk2         4           wk1
1   feb         30            wk1        23           wk1
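As a possible alternative (not part of the answer above), the same rows can be picked with groupby plus idxmax and idxmin, which return the index label of the extreme row in each group. This is only a sketch: the names grouped, hi_rows, lo_rows and alt_df are made up for illustration, and it assumes the default unique integer index of the question's dataframe. On ties, idxmax and idxmin return the first occurrence, so the week chosen may differ from the sort-based approach when a month has duplicate extremes.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'month': ['jan', 'jan', 'feb', 'feb'],
    'week': ['wk1', 'wk2', 'wk1', 'wk2'],
    'high_temp': [10, 20, 30, 20],
    'low_temp': [4, 5, 23, 40],
})

# sort=False keeps the months in their order of first appearance.
grouped = df.groupby('month', sort=False)

# idxmax / idxmin give the index label of the extreme row in each month;
# .loc then pulls those full rows, including the week column.
hi_rows = df.loc[grouped['high_temp'].idxmax(), ['month', 'high_temp', 'week']]
lo_rows = df.loc[grouped['low_temp'].idxmin(), ['month', 'low_temp', 'week']]

# Merge on month only; suffixes disambiguate the two week columns.
alt_df = hi_rows.merge(lo_rows, on='month', suffixes=('_high_temp', '_low_temp'))
print(alt_df)
```

For the sample data this selects the same rows as the sort_values and drop_duplicates version.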