Write a list analysis program using Python

Preface

When I served as my class's study committee member, I often had to send out online forms and then check who had filled them in and who had not. Because the names in an online form arrive in no particular order, checking them against the class list one by one was tedious, so I wrote a list analysis program in Python, which I was learning at the time. The result looks like this:
[Screenshot: program result]

1. Program logic

(1) Input file

The program reads external files and analyzes their contents, so the first decision is the file format. I use Excel workbooks, because the form is a Tencent online spreadsheet and its cells can be copied and pasted straight into Excel in table form. Two files are needed:
an Excel file for the master list and an Excel file for the list to be checked.
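
Names pasted from an online table can carry stray spaces or blank rows, which would make two otherwise identical names compare as different. The sketch below shows one possible way to tidy pasted text before comparing; `clean_names` is a hypothetical helper and is not part of the original program, which reads the Excel files directly.

```python
def clean_names(raw_text):
    """Split pasted text into names: one per line, trimmed, blank lines dropped."""
    names = []
    for line in raw_text.splitlines():
        name = line.strip()  # remove leading/trailing spaces and tabs
        if name:             # skip empty rows
            names.append(name)
    return names


pasted = "  Zhao \nQian\n\nSun\t\n"
print(clean_names(pasted))  # ['Zhao', 'Qian', 'Sun']
```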

(2) Output file

Results are delivered in two ways. First, the program shows them directly in a message box. Second, it writes the comparison results to an external file. Since I compare the lists only to chase up the people who have not filled in the form, the output needs no particular format, so I store it in a plain .txt file. (The message box alone would serve my purpose, but saving to a .txt file protects me from accidentally dismissing the result.)
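
As a minimal sketch of the file output described above, the helper below writes one name per line in UTF-8. The name `save_names` and the temporary file path are assumptions made for illustration; the original program writes to a fixed path on the author's disk.

```python
import tempfile
from pathlib import Path


def save_names(names, path):
    """Write one name per line as UTF-8 and return how many names were written."""
    text = "".join(f"{name}\n" for name in names)
    Path(path).write_text(text, encoding="utf-8")
    return len(names)


# Usage: write to a throwaway directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "result.txt"  # hypothetical file name
    save_names(["Qian", "Sun"], out)
    print(out.read_text(encoding="utf-8").splitlines())  # ['Qian', 'Sun']
```

Writing a newline after each name keeps the file readable even when many people are missing.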

(3) Comparison process

The task is to match M names (M ≤ N) against N names and record the remaining N − M names, so I simply traverse both lists with two nested for loops.
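
The comparison can also be written without the inner loop. The sketch below is an alternative to the two for loops, not the method used in this program: it puts the submitted names in a set so each lookup is fast. The function name `missing_names` and the sample names are assumptions for illustration.

```python
def missing_names(master, submitted):
    """Return the names in master that do not appear in submitted, in master order."""
    submitted_set = set(submitted)  # set membership replaces the inner loop
    return [name for name in master if name not in submitted_set]


master = ["Zhao", "Qian", "Sun", "Li"]
submitted = ["Li", "Zhao"]
print(missing_names(master, submitted))  # ['Qian', 'Sun']
```

For a class-sized list the nested loops are fast enough; the set version mainly makes the intent easier to read.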

2. Program code

(1) Referenced libraries:

import pandas as pd
import tkinter as tk
import tkinter.messagebox as messagebox

(2) Analysis class:

The analyse class below performs the analysis.

class analyse:
    def __init__(self):
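        self.file1=[]#open file 1
        self.file2=[]#open file 2
        self.thing1=[]
        self.thing2=[]
        self.anaresult=[]#without duplicates: the analysis result; with duplicates: the duplicates found in the master list
        self.anaresult1=[]#with duplicates: the duplicates found in the list to be analyzed
        self.int1=1#master list flag: 1 = no duplicates, -1 = duplicates found
        self.int2=1#flag for the list to be analyzed: 1 = no duplicates, -1 = duplicates found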
    def main(self):
        self.root=tk.Tk()#GUI window
        self.root.geometry("480x240")
        self.root.title("学生名单分析V3.1.7")#"Student List Analysis V3.1.7"
        #'姓名' ("Name") is the header of the name column in both Excel files
        self.thing1=pd.read_excel("G:/学习委员相关/名单分析程序/The all students.xlsx",index_col='姓名')
        self.thing2=pd.read_excel("G:/学习委员相关/名单分析程序/The judge students.xlsx",index_col='姓名')
        label1=tk.Label(self.root,text="提示1:文件每一行只放一个名字").place(x=20,y=1)#Tip 1: one name per row
        label2=tk.Label(self.root,text="提示2:在必要的文件里面粘贴名单即可开始分析。").place(x=20,y=41)#Tip 2: paste the lists into the required files
        label3=tk.Label(self.root,text="提示3:文件保存好了运行程序会直接运行分析。").place(x=20,y=81)#Tip 3: save the files, then run the program
        label4=tk.Label(self.root,text="提示4:分析结束后会自动将不在总名单中的名字保存在指定txt文件中").place(x=20,y=121)#Tip 4: the resulting names are saved to the txt file
        #messagebox.showinfo("注意","内测版本,如需要正常使用查重和遗漏功能,请使用Excel提前进行排序!")
        label5=tk.Label(self.root,text="总名单人数:").place(x=20,y=161)#"Master list count:"
        label6=tk.Label(self.root,text="进行核查的名单人数:").place(x=20,y=201)#"Checked list count:"
        analyse.judrepeat(self,self.anaresult,self.thing1,self.thing2)#check for duplicate data
        self.root.mainloop()
    def judrepeat(self,anaresult,thing1,thing2):
        list1=self.thing1.index.sort_values()#sorting puts identical names next to each other
        list2=self.thing2.index.sort_values()
        self.int1=1#1 = no duplicates so far, -1 = duplicates found
        self.int2=1
        for i in range(0,len(list1)-1):#check the master list for duplicates
            if list1[i]==list1[i+1] and list1[i] not in self.anaresult:
                self.anaresult.append(list1[i])
                self.int1=-1#the flag is never reset, so one duplicate is enough
        #print(self.anaresult)
        for j in range(0,len(list2)-1):#check the list to be analyzed for duplicates
            if list2[j]==list2[j+1] and list2[j] not in self.anaresult1:
                self.anaresult1.append(list2[j])
                self.int2=-1
        if self.int1==1 and self.int2==1:
            label7=tk.Label(self.root,text=len(self.thing1.index)).place(x=100,y=161)
            label8=tk.Label(self.root,text=len(self.thing2.index)).place(x=160,y=201)
            analyse.getdatas(self,self.anaresult,self.thing1,self.thing2)
        else:
            analyse.infoerror(self,self.anaresult,self.anaresult1)
        return
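    def infoerror(self,anaresult,anaresult1):#report duplicate-data errors
        if len(self.anaresult)!= 0:
            #"Error! The master list contains duplicates; they were saved to the result file. Check and retry."
            messagebox.showinfo("错误!","总名单中存在数据重复,重复的数据已存入分析结果,请检查后重试!")
        if len(self.anaresult1) != 0:
            #"Error! The list to be analyzed contains duplicates; they were saved to the result file. Check and retry."
            messagebox.showinfo("错误!","待分析名单中存在数据重复,重复的数据已存入分析结果,请检查后重试!")
        with open("E:/学习委员相关/名单分析程序/分析名单结果.txt","w",encoding='utf-8') as f:
            f.write("总名单中重复的数据为:\n")#"Duplicates in the master list:"
            for item1 in self.anaresult:
                f.write(item1+"\n")#one name per line
            f.write("待分析名单中重复的数据为:\n")#"Duplicates in the list to be analyzed:"
            for item2 in self.anaresult1:
                f.write(item2+"\n")
        return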
    def getdatas(self,anaresult,thing1,thing2):#analyze the data and show the result
        anaresult=analyse.nameana(self,self.anaresult,self.thing1,self.thing2)
        analyse.show_mas(self,self.anaresult)
        return
    def nameana(self,anaresult,thing1,thing2):#compare the two lists
        for i in range(0,len(self.thing1.index)):
            t=-1#-1 = not found yet, 1 = found in the list to be analyzed
            for j in range(0,len(self.thing2.index)):
                if self.thing2.index[j] == self.thing1.index[i]:
                    t=1
                    break
            #record the master-list name if nobody with that name submitted
            if t==-1 and self.thing1.index[i] not in self.anaresult:
                self.anaresult.append(self.thing1.index[i])
        return self.anaresult
    def show_mas(self,anaresult):#show and save the analysis result
        #print(self.anaresult)
        if len(self.anaresult)!=0:
            messagebox.showinfo("结果",self.anaresult)#"Result"
            with open("G:/学习委员相关/名单分析程序/分析名单结果.txt","w",encoding='utf-8') as f:
                for item in self.anaresult:
                    f.write(item+"\n")#one name per line
        else:
            #"Everyone has submitted. If anyone is still missing, check that the data is correct!"
            messagebox.showinfo("结果","所有人都已经提交,如果还有遗漏,请检查数据是否正确!")

(3) Call:

def yunxin():
    ana=analyse()
    ana.main()
    return
yunxin()

3. Handling duplicate data in the two list files

In theory the master list should never contain duplicates, because it is prepared and saved before the program is used, but I added a duplicate check for it anyway. The list to be analyzed goes through the same check. If either list contains duplicates (for example, a classmate submitted the online form several times), the program reports the duplicated names instead of the people who did not fill in the form, because a simple traversal gives unreliable results when duplicates are present:

    def judrepeat(self,anaresult,thing1,thing2):
        list1=self.thing1.index.sort_values()#sorting puts identical names next to each other
        list2=self.thing2.index.sort_values()
        self.int1=1#1 = no duplicates so far, -1 = duplicates found
        self.int2=1
        for i in range(0,len(list1)-1):#check the master list for duplicates
            if list1[i]==list1[i+1] and list1[i] not in self.anaresult:
                self.anaresult.append(list1[i])
                self.int1=-1#the flag is never reset, so one duplicate is enough
        #print(self.anaresult)
        for j in range(0,len(list2)-1):#check the list to be analyzed for duplicates
            if list2[j]==list2[j+1] and list2[j] not in self.anaresult1:
                self.anaresult1.append(list2[j])
                self.int2=-1
        if self.int1==1 and self.int2==1:
            label7=tk.Label(self.root,text=len(self.thing1.index)).place(x=100,y=161)
            label8=tk.Label(self.root,text=len(self.thing2.index)).place(x=160,y=201)
            analyse.getdatas(self,self.anaresult,self.thing1,self.thing2)
        else:
            analyse.infoerror(self,self.anaresult,self.anaresult1)
        return
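
The duplicate check can also be done by counting instead of sorting and comparing neighbours. The sketch below uses `collections.Counter` from the standard library; it is an alternative to the method above, and the function name `find_duplicates` and the sample names are assumptions for illustration.

```python
from collections import Counter


def find_duplicates(names):
    """Return each name that occurs more than once, in first-seen order."""
    counts = Counter(names)  # maps each name to its number of occurrences
    return [name for name, n in counts.items() if n > 1]


print(find_duplicates(["Li", "Zhao", "Li", "Sun", "Zhao", "Li"]))  # ['Li', 'Zhao']
```

Each duplicated name is reported once, no matter how many times it was submitted.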

4. Verifying the results

(1) Master list:

[Screenshot: master list]

(2) List to be analyzed:

[Screenshot: list to be analyzed]

(3) Analysis results:

[Screenshot: analysis result]
(The above is only a test case in which every name has a different surname. In fact, the program gives correct results as long as no two names are identical.)

Origin blog.csdn.net/weixin_47278656/article/details/128640093