Python datacompy: find out where two DataFrames differ

This post shows how to find the elements that differ between two almost identical DataFrames, and how to use datacompy to display them in a readable report.

The table x:
[Image: the DataFrame x]

Let x1 and x2 be copies of x, so that x1 and x2 hold the same values:

x1=x.copy()
x2=x.copy()
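
As a quick sanity check, the sketch below confirms that two copies hold equal values while being separate objects. The small DataFrame and its labels (A, B, C and the date-like column names) are hypothetical stand-ins for x, which is only shown as an image in this post.

```python
import pandas as pd

# Hypothetical stand-in for x: rows are labels, columns are date strings.
x = pd.DataFrame(
    {"20220124": [100, 200, 300], "20220125": [110, 210, 310]},
    index=["A", "B", "C"],
)

x1 = x.copy()
x2 = x.copy()

# DataFrame.equals checks that shape, dtypes and values all match.
same_values = x1.equals(x2)
print(same_values)

# The copies are distinct objects, so changing x2 later will not touch x1.
print(x1 is x2)
```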

Assign 2000 to one cell of x2:

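x2.loc['罗梓烜', '20220125'] = 2000  # one .loc call; chained indexing may write to a temporary copy
x1[x1 == x2].head(25)  # cells where x1 and x2 are not equal show up as NaN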

At this point, one cell in the figure below is NaN, which indicates that x1 and x2 differ at that position.
[Image: output of x1[x1==x2].head(25), with one NaN cell]
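
The masking step can be reproduced on a small example. This is a minimal sketch with made-up data (the labels A, B, C and the values are assumptions, not the author's table): indexing a DataFrame with a boolean DataFrame keeps the cells where the mask is True and replaces the rest with NaN.

```python
import pandas as pd

# Hypothetical data standing in for x1 and x2.
x1 = pd.DataFrame(
    {"20220124": [100, 200, 300], "20220125": [110, 210, 310]},
    index=["A", "B", "C"],
)
x2 = x1.copy()
x2.loc["B", "20220125"] = 2000  # change exactly one cell

equal_mask = x1 == x2   # element-wise comparison, True where the cells agree
masked = x1[equal_mask]  # cells where the mask is False become NaN
print(masked)
```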

x1[x1==x2].isnull().sum()

The figure below shows that the column 20220125 contains one NaN, which is the cell we just assigned:
[Image: per-column NaN counts]
However, this count alone does not tell us which row holds the outlier (i.e., the unequal value), so we turn to datacompy.
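
To make the per-column count concrete, here is a small sketch on hypothetical data (labels and values are assumptions). The result is a Series indexed by column name, which is why it identifies the column but says nothing about the row.

```python
import pandas as pd

# Hypothetical data standing in for x1 and x2.
x1 = pd.DataFrame(
    {"20220124": [100, 200, 300], "20220125": [110, 210, 310]},
    index=["A", "B", "C"],
)
x2 = x1.copy()
x2.loc["B", "20220125"] = 2000

# Mask unequal cells as NaN, then count the NaN values in each column.
nan_per_column = x1[x1 == x2].isnull().sum()
print(nan_per_column)
```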

Install:

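!pip install datacompy  # run inside a Jupyter notebook cell
import datacompy
import pandas as pd

compy = datacompy.Compare(x1, x2, on_index=True)  # align the two frames on their index
print(compy.matches())  # False when any value differs
print(compy.report())   # text summary of the differences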

At this point, the report clearly lists the values that differ between the two DataFrames:
[Image: output of compy.report()]

Origin blog.csdn.net/wxfighting/article/details/123807396