Merge Pandas Dataframe based on boolean function

Lazloo Xp :

I am looking for an efficient way to merge two pandas data frames based on a function that takes columns from both data frames as input and returns True or False. For example, assume I have the following two "tables":

import pandas as pd

df_1 = pd.DataFrame(data=[1, 2, 3])
df_2 = pd.DataFrame(data=[4, 5, 6])


def validation(a, b):
    return ((a + b) % 2) == 0

I would like to join df_1 and df_2 on every pair of rows for which the sum of the two first-column values is an even number. The resulting table would be:

df_3 =
    1 5
    2 4
    2 6
    3 5

Please think of this as a general problem, not as a task to return just df_3. The solution should accept any function that validates a combination of columns and returns True or False.

THX Lazloo

Ayoub ZAROU :

This is a basic solution, but it is not very efficient on large dataframes, because it builds every combination of rows (len(df_1) * len(df_2) of them) before filtering:

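# Give every row the same index value (0), so that joining on the index
# pairs each row of df_1 with each row of df_2 (a cross join).
# Note: this overwrites the original index of both frames.
df_1.index *= 0
df_2.index *= 0
df = df_1.join(df_2, lsuffix='_2')
# Keep only the pairs whose row sum is even.
df = df[df.sum(axis=1) % 2 == 0]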
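The same cross-join idea can be written so that it takes any validation function, which is what the question asks for. Here is a minimal sketch, not a definitive implementation. It assumes pandas 1.2 or newer (for how="cross") and a validation function that works element-wise on whole columns. The helper name merge_on_predicate and the column names "a" and "b" are made up for illustration.

```python
import pandas as pd


def validation(a, b):
    # Predicate from the question: True where the sum is even.
    return ((a + b) % 2) == 0


def merge_on_predicate(left, right, left_col, right_col, predicate):
    # Hypothetical helper: build every pair of rows, then keep the pairs
    # for which the predicate is True.
    pairs = left.merge(right, how="cross")
    mask = predicate(pairs[left_col], pairs[right_col])
    return pairs[mask].reset_index(drop=True)


# Same data as in the question, with named columns (an assumption).
df_1 = pd.DataFrame({"a": [1, 2, 3]})
df_2 = pd.DataFrame({"b": [4, 5, 6]})

df_3 = merge_on_predicate(df_1, df_2, "a", "b", validation)
print(df_3)
```

This still creates len(df_1) * len(df_2) rows before filtering, so it has the same memory cost as the index trick. If the function only works on scalars, the mask could be built row by row with pairs.apply(..., axis=1) instead, at a further cost in speed.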

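Edit: here is a better solution for this particular function. It avoids the full cross product by joining on the parity of each value: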

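# Index each frame by the parity of its first column, so the index join
# only pairs rows whose values have the same parity (i.e. an even sum).
df_1.index = df_1.iloc[:,0] % 2
df_2.index = df_2.iloc[:,0] % 2
df = df_1.join(df_2, lsuffix='_2')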
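The parity join works only because this validation function can be rewritten as "key(a) == key(b)", here a % 2 == b % 2. Any predicate that reduces to equality of a derived key can be handled the same way; a predicate that cannot be reduced like this still needs the cross join. Below is a sketch of the same idea using temporary key columns instead of overwriting the index. The column names "a", "b" and "key" are assumptions for illustration.

```python
import pandas as pd

# Same data as in the question, with named columns (an assumption).
df_1 = pd.DataFrame({"a": [1, 2, 3]})
df_2 = pd.DataFrame({"b": [4, 5, 6]})

# (a + b) is even exactly when a and b have the same parity,
# so the parity can serve as an ordinary equi-join key.
left = df_1.assign(key=df_1["a"] % 2)
right = df_2.assign(key=df_2["b"] % 2)

# Inner join on the derived key, then drop the helper column.
df_3 = left.merge(right, on="key").drop(columns="key")
print(df_3)
```

Only matching pairs are ever materialized, and df_1 and df_2 keep their original index.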
