Use Python to take the difference of two Excel sheets and write the results to new Excel files

During data processing, we sometimes need to output the difference between two tables for the next processing step. Suppose there are two tables, table01.xlsx and table02.xlsx, and we want to write the set differences of their "filename" columns to two new tables, one for each direction.
table01: (image: table01)
table02: (image: table02)
Overall idea:
Read the elements of the two columns into lists list1 and list2, convert them into sets set1 and set2, use set1.difference(set2) to find the difference set, convert the difference set back into a list, create a new Excel file, wrap the list in a dictionary, convert the dictionary into a DataFrame, and save it to the newly created Excel file.
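Before touching any files, the set step on its own can be sketched with small in-memory lists. The sample values below are hypothetical and only stand in for the two "filename" columns. Note that converting to a set drops duplicates and does not keep the original order, so sorted() is used here to get a predictable list.

```python
# Hypothetical sample values standing in for the two "filename" columns.
list1 = ["a.txt", "b.txt", "c.txt", "c.txt"]
list2 = ["b.txt", "c.txt", "d.txt"]

# Convert both lists to sets (duplicates such as the second "c.txt" are dropped).
set1 = set(list1)
set2 = set(list2)

# difference() keeps the elements of the first set that are missing from the second.
# sorted() returns a list in a deterministic order; list(set) has no guaranteed order.
only_in_1 = sorted(set1.difference(set2))  # in list1 but not in list2
only_in_2 = sorted(set2.difference(set1))  # in list2 but not in list1

print(only_in_1)
print(only_in_2)
```

The difference is not symmetric, which is why the article computes it once in each direction.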
The code is as follows:

import pandas as pd
import os
os.chdir("E:\\origin_file")
# 1. Read the "filename" column of table01
data1 = pd.read_excel('table01.xlsx')
col1 = data1.iloc[:, 0]    # the target field is the first column of table01 (column index 0)
# or: col1 = data1['filename']
# 2. Put the elements of table01's "filename" column into a new list
list1 = col1.tolist()
# 3. Read the "filename" column of table02
data2 = pd.read_excel("table02.xlsx")
col2 = data2.iloc[:, 0]    # the target field is the first column of table02 (column index 0)
# or: col2 = data2['filename']
# 4. Put the elements of table02's "filename" column into a new list
list2 = col2.tolist()
# 5. Convert list1 and list2 into sets
set1 = set(list1)
set2 = set(list2)
# 6. Take the differences of set1 and set2
set3 = set1.difference(set2)  # in table01 but not in table02
set4 = set2.difference(set1)  # in table02 but not in table01
# 7. Convert the difference sets into lists
list3 = list(set3)  # in table01 but not in table02
list4 = list(set4)  # in table02 but not in table01
# 8. Create one Excel file for each difference
# Excel file for the elements in table01 but not in table02
writer1 = pd.ExcelWriter('table01-02.xlsx')
# Excel file for the elements in table02 but not in table01
writer2 = pd.ExcelWriter('table02-01.xlsx')
# 9. Wrap each difference list in a dictionary
dict_1 = {'filename': list3}  # in table01 but not in table02
dict_2 = {'filename': list4}  # in table02 but not in table01
# 10. Convert the two dictionaries into DataFrames
df1 = pd.DataFrame(dict_1)
df2 = pd.DataFrame(dict_2)
# 11. Write each difference DataFrame to its Excel file
# in table01 but not in table02
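df1.to_excel(writer1, sheet_name='sheet1', startcol=0, index=False)
# in table02 but not in table01
df2.to_excel(writer2, sheet_name='sheet2', startcol=0, index=False)
# 12. Save the two Excel files (ExcelWriter.save() was removed in pandas 2.0; close() writes the file)
writer1.close()
writer2.close()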
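
As an alternative to the set-based steps, the same result can be sketched directly in pandas with Series.isin. This is not the method used in the code above: unlike a set, it keeps the original row order and any duplicate values. The function names column_difference and write_difference and the in-memory sample data are hypothetical and introduced only for illustration. Writing the file assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def column_difference(left: pd.DataFrame, right: pd.DataFrame,
                      column: str = "filename") -> pd.DataFrame:
    """Return the values of `column` in `left` that do not appear in `right`."""
    # isin() marks the values of left that also occur in right; ~ inverts the mask.
    mask = ~left[column].isin(right[column])
    return left.loc[mask, [column]].reset_index(drop=True)


def write_difference(diff: pd.DataFrame, path: str,
                     sheet_name: str = "sheet1") -> None:
    """Write one difference table to an Excel file."""
    # The context manager closes the writer and saves the file on exit.
    with pd.ExcelWriter(path) as writer:
        diff.to_excel(writer, sheet_name=sheet_name, index=False)


# Hypothetical in-memory data standing in for table01.xlsx and table02.xlsx.
table01 = pd.DataFrame({"filename": ["a.txt", "b.txt", "c.txt"]})
table02 = pd.DataFrame({"filename": ["b.txt", "c.txt", "d.txt"]})

diff_01_02 = column_difference(table01, table02)  # in table01 but not in table02
diff_02_01 = column_difference(table02, table01)  # in table02 but not in table01
```

With real files, the two DataFrames would come from pd.read_excel, and each result could then be saved with a call such as write_difference(diff_01_02, "table01-02.xlsx").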

The results are as follows:
table01-02: (image: table01-02)
table02-01: (image: table02-01)

Origin blog.csdn.net/weixin_47970003/article/details/121787598