Python: finding the elements that differ between two DataFrames

pandas reads the data from Excel and returns it as a DataFrame.

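When I used a for loop to compare two columns and find the elements that differ, the data in the two columns was identical, yet the comparison reported some elements as matching and others as different, which was not the result I expected.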
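One possible cause, assuming the loop tested membership with `in` directly on a pandas Series: `in` on a Series checks the index labels, not the values. The sketch below uses made-up data (`old_items` and `new_items` are hypothetical stand-ins for the two Excel columns) to show the difference.

```python
import pandas as pd

# Hypothetical data standing in for the two Excel columns
old_items = pd.Series(["a", "b", "c"])
new_items = pd.Series(["a", "b", "d"])

# `in` on a Series tests the index labels (0, 1, 2), not the values
print("a" in old_items)           # False: "a" is a value, not a label
print(0 in old_items)             # True: 0 is an index label

# Testing against the values, or against a list, checks the data itself
print("a" in old_items.values)    # True
print("a" in old_items.tolist())  # True

# Elements of new_items that do not occur among the values of old_items
missing = [x for x in new_items if x not in old_items.tolist()]
print(missing)                    # ['d']
```

If the original loop compared elements in some other way (for example by position), the cause may be different; this only illustrates the index-versus-values pitfall.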

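I then converted the two columns to lists before comparing them, which gave the expected result (looking back, this is a lot of code for what feels like a simple task):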

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pandas as pd

# The item column of test1.xlsx, which will be compared with the Item column of test.xlsx
df1 = pd.read_excel('test1.xlsx')
dfItem_Old = df1['item']

# The Item column and the other required columns of test.xlsx
df = pd.read_excel('test.xlsx')
dfItem_New = df['Item']
dfCommand_New = df['Command']
dfValue_New = df['Value']
dfAutotest = df['Autotest']
dfProgress = df['Progress']

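# Convert every column involved to a plain Python list first, so that
# membership tests check values and list.pop(index) removes by position
dfItem_New = dfItem_New.tolist()
dfItem_Old = dfItem_Old.tolist()
dfCommand_New = dfCommand_New.tolist()
dfValue_New = dfValue_New.tolist()
dfAutotest = dfAutotest.tolist()
dfProgress = dfProgress.tolist()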

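# Find the elements that are in dfItem_New but not in dfItem_Old
differentKey = [x for x in dfItem_New if x not in dfItem_Old]
#print(differentKey)
# Store the key data needed from test.xlsx in a dictionary
dict_new = {'item':dfItem_New, 'autotest':dfAutotest, 'command':dfCommand_New, 'value':dfValue_New, 'progress':dfProgress}
# For each element of differentKey that is present in the item list, delete it
# and delete the same row from every other column (otherwise the columns end up
# with different lengths and an error is raised)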
for i in differentKey:
    if i in dict_new['item']:
        # Find the position of the differing element
        index = dict_new['item'].index(i)
        #print(index)
        dict_new['item'].pop(index)
        dict_new['autotest'].pop(index)
        dict_new['command'].pop(index)
        dict_new['value'].pop(index)
        dict_new['progress'].pop(index)
    #else:
        #print(i)
# Reassemble the data after the deletions
Item = dict_new['item']
Autotest = dict_new['autotest']
Command = dict_new['command']
Value = dict_new['value']
Progress = dict_new['progress']
result_dict = {'Item':Item, 'Autotest':Autotest, 'Command':Command, 'Value':Value, "Progress":Progress}
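For comparison, the same filtering can probably be done without lists or manual popping by using `Series.isin` and boolean indexing. This is only a minimal sketch: `df_old` and `df_new` are small hypothetical DataFrames standing in for test1.xlsx and test.xlsx, and the cell values are invented.

```python
import pandas as pd

# Hypothetical stand-ins for test1.xlsx and test.xlsx
df_old = pd.DataFrame({"item": ["a", "b", "c"]})
df_new = pd.DataFrame({
    "Item": ["a", "x", "b", "y"],
    "Command": ["c1", "c2", "c3", "c4"],
    "Value": [1, 2, 3, 4],
    "Autotest": ["Y", "N", "Y", "N"],
    "Progress": [100, 0, 50, 0],
})

# Boolean mask: True where the new Item also appears in the old item column
in_old = df_new["Item"].isin(df_old["item"])

# Items that are present only in the new sheet
different_items = df_new.loc[~in_old, "Item"].tolist()
print(different_items)        # ['x', 'y']

# Keep only the rows whose Item exists in the old sheet;
# filtering whole rows keeps every column aligned
kept = df_new[in_old].reset_index(drop=True)
print(kept["Item"].tolist())  # ['a', 'b']
```

Because the mask filters entire rows, there is no need to track positions or pop from each column separately.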
