python correlation test between single columns in two dataframes

Matteo :

I have two DataFrames structured like this:

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(np.random.randn(8, 6), columns=['T', 'U', 'V', 'X', 'Y', 'Z'])

I would like to compute the Pearson correlation between every column of df1 and every column of df2, then combine all the results into one correlation matrix.

A similar question has been asked before, but in my case df1 has several columns:

Correlation between two dataframes

Any help on how to do this would be great.

BlackBear :

Compute it directly:

# center and standardize
df1vals = (df1.values - df1.values.mean(axis=0)) / df1.values.std(axis=0)
df2vals = (df2.values - df2.values.mean(axis=0)) / df2.values.std(axis=0)

# compute correlation
pearsons = df1vals.T.dot(df2vals) / len(df1)

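This has shape (df1.shape[1], df2.shape[1]), i.e. (4, 6) for the example above: one row per column of df1 and one column per column of df2.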
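
The direct computation returns a bare NumPy array, so the column labels are lost. A minimal sketch of wrapping it in a labelled DataFrame follows. The helper name cross_corr is hypothetical, and the sketch assumes both frames have the same number of rows, aligned by position, with no NaN values and no constant columns:

```python
import numpy as np
import pandas as pd

# Example data with a fixed seed so the run is reproducible
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal((8, 4)), columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(rng.standard_normal((8, 6)), columns=['T', 'U', 'V', 'X', 'Y', 'Z'])

def cross_corr(left, right):
    """Pearson correlation of every column of `left` with every column of `right`.

    Hypothetical helper: assumes equal row counts, no NaN, no constant columns.
    """
    a = left.to_numpy(dtype=float)
    b = right.to_numpy(dtype=float)
    # center and standardize (population std, ddof=0, matching the division by n below)
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    b = (b - b.mean(axis=0)) / b.std(axis=0)
    # mean of the products of z-scores is the Pearson correlation
    return pd.DataFrame(a.T @ b / len(left), index=left.columns, columns=right.columns)

corr = cross_corr(df1, df2)

# Cross-check a single entry against NumPy's own Pearson implementation
expected_a_t = np.corrcoef(df1['A'], df2['T'])[0, 1]
```

Rows of corr are labelled with the columns of df1 and columns with the columns of df2, so a single coefficient can be read with corr.loc['A', 'T'].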

If you really need to use corrwith, then:

pd.concat([
    df1.corrwith(df2[c]) for c in df2
], axis=1, keys=df2.columns)
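
As a sanity check, the corrwith result can be compared with a different route that is not part of the answer above: concatenate both frames, call DataFrame.corr, and slice out the cross block. This is only a sketch and assumes the column names of df1 and df2 do not overlap:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal((8, 4)), columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(rng.standard_normal((8, 6)), columns=['T', 'U', 'V', 'X', 'Y', 'Z'])

# corrwith approach: one Series per column of df2, glued together side by side
via_corrwith = pd.concat(
    [df1.corrwith(df2[c]) for c in df2],
    axis=1, keys=df2.columns,
)

# Alternative: full (10 x 10) correlation matrix, then keep only the df1-vs-df2 block
full = pd.concat([df1, df2], axis=1).corr(method='pearson')
via_corr = full.loc[df1.columns, df2.columns]

# Both routes should agree up to floating-point error
same = np.allclose(via_corrwith.to_numpy(), via_corr.to_numpy())
```

The concatenate-and-slice route also computes the within-frame correlations and discards them, so it does more work than necessary on wide frames.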
