Finding the closest values in a multi-indexed dataframe in pandas

Hamza Waheed :

I'm trying to select data based on the closest values in the index of a pandas dataframe. I read the file from Excel and set a MultiIndex on the dataframe like this:

df = df.set_index(['Year', 'delta', 'ix'])

The result looks something like this:

Year    delta       ix          Temp
2010    6           4           34
                    5.1         38
        7           4.5         36
                    3.7         37
2011    6           4           37
                    5.1         35
        7           4.5         38
                    3.7         41
2012    6           4           43
                    5.1         39
        7           4.5         38
                    3.7         37.5

The values I want to look up are not present in this dataframe, so I want to fall back to the closest ones. For instance, I want the Temp value for a delta of 6.7 and an ix of 4.9 in the year 2011. Since these values are not in the dataframe, I should get the Temp value at the closest index values, which in this case are a delta of 7 and an ix of 5.1. So the row I take the data from is:

Year    delta       ix          Temp
2011    7           5.1         39

Thanks in advance.

Daniel Geffen :

I would reset the index so that you can work with plain columns, which is easier.

Then you can sum the absolute distances of each column from its target value and use idxmin to get the label of the closest row:

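wanted_year, wanted_delta, wanted_ix = 2011, 6.7, 4.9  # targets from the question
df = df.reset_index()
# Sum the absolute distance of each column from its target
distance = ((df["Year"] - wanted_year).abs()
            + (df["delta"] - wanted_delta).abs()
            + (df["ix"] - wanted_ix).abs())
closest_row_id = distance.idxmin()  # label of the row with the smallest total distance
closest_temperature_row = df.loc[closest_row_id]
# If you only want the temperature you can do:
# closest_temp = df.loc[closest_row_id, "Temp"]
df = df.set_index(['Year', 'delta', 'ix'])  # restore the original MultiIndex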
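
Note that the sum above treats a one-year gap the same as a one-unit gap in delta or ix. If the year has to match exactly, one possible variant is to filter on the year first and only measure the distance on delta and ix. The following is a minimal sketch under that assumption; the helper name closest_row_in_year is hypothetical, and the sample data is copied from the table in the question.

```python
import pandas as pd

# Sample data copied from the question's table
rows = [
    (2010, 6, 4.0, 34.0), (2010, 6, 5.1, 38.0), (2010, 7, 4.5, 36.0), (2010, 7, 3.7, 37.0),
    (2011, 6, 4.0, 37.0), (2011, 6, 5.1, 35.0), (2011, 7, 4.5, 38.0), (2011, 7, 3.7, 41.0),
    (2012, 6, 4.0, 43.0), (2012, 6, 5.1, 39.0), (2012, 7, 4.5, 38.0), (2012, 7, 3.7, 37.5),
]
df = pd.DataFrame(rows, columns=["Year", "delta", "ix", "Temp"])
df = df.set_index(["Year", "delta", "ix"])


def closest_row_in_year(frame, year, delta, ix):
    """Hypothetical helper: nearest (delta, ix) row within one exact year."""
    flat = frame.reset_index()
    # Assumption: the year must match exactly, so only those rows compete
    same_year = flat[flat["Year"] == year]
    if same_year.empty:
        raise KeyError(f"no rows for year {year}")
    # Sum of absolute distances on the two remaining index levels
    distance = (same_year["delta"] - delta).abs() + (same_year["ix"] - ix).abs()
    return same_year.loc[distance.idxmin()]


row = closest_row_in_year(df, 2011, 6.7, 4.9)
print(row["delta"], row["ix"], row["Temp"])
```

With the sample table from the question, this picks the 2011 row with delta 7 and ix 4.5 (Temp 38), because the table has no row that combines delta 7 with ix 5.1. If delta and ix are on very different scales, you may want to weight or normalize each distance before summing.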
