Reading a CSV file in Python and removing duplicate data

Install the xlrd and pandas modules (pandas alone is enough to read CSV files; xlrd is only used for Excel files)

pip3 install xlrd
pip install pandas

Import the pandas module in the Python file

import pandas as pd

Read the file, then deduplicate the data by the name of the column whose values should be unique

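import pandas as pd  # import the pandas package

data = pd.read_csv("E:/test.csv")  # read the CSV file

# Collect the values of the column to deduplicate ("门店编号" is the store number column)
dateMap = data["门店编号"].tolist()

print("Count before deduplication: " + str(len(data)))

# Drop repeated values while keeping the order of first occurrence
formatList = list(dict.fromkeys(dateMap))

print("Count after deduplication: " + str(len(formatList)))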
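The script above only counts the distinct values in one column. If the goal is to keep whole rows, pandas' built-in `DataFrame.drop_duplicates` can do the same deduplication on the DataFrame itself. The following is a minimal sketch rather than the original author's method: the in-memory sample data, the `name` column, and the helper `dedupe_by_column` are assumptions made up for illustration, standing in for the real `E:/test.csv`.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for E:/test.csv;
# the "name" column is invented for this illustration.
sample_csv = io.StringIO(
    "门店编号,name\n"
    "A001,first\n"
    "A002,second\n"
    "A001,third\n"
)


def dedupe_by_column(frame, column):
    """Keep only the first row for each distinct value in `column`."""
    deduped = frame.drop_duplicates(subset=[column], keep="first")
    # Renumber the rows so the index is contiguous again
    return deduped.reset_index(drop=True)


data = pd.read_csv(sample_csv)
deduped = dedupe_by_column(data, "门店编号")

print("Count before deduplication:", len(data))
print("Count after deduplication:", len(deduped))
```

If the deduplicated rows need to be saved, `deduped.to_csv(path, index=False)` writes them back out, with `path` being whatever output file you choose.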

The console prints the number of rows before deduplication and the number of distinct values after it.


Origin blog.csdn.net/qq_23140197/article/details/103511572