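The data is laid out as txt files inside subfolders of one root folder. The script below merges all of the txt files into a single Excel file, keeping each file's original category (the name of its subfolder). The result is shown in the figure below: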
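import os
import pandas as pd

# Root folder: each subfolder is one category and holds that category's txt files
filedir = 'C:/Users/Administrator/Desktop/数据集'
content = []  # lines of each txt file
result = []   # category (subfolder name) of each txt file
for category in os.listdir(filedir):
    subdir = os.path.join(filedir, category)
    if not os.path.isdir(subdir):
        continue  # skip anything at the top level that is not a folder
    for filename in os.listdir(subdir):
        filepath = os.path.join(subdir, filename)
        with open(filepath, 'r', encoding='utf-8') as f:
            # readlines() returns a list of lines, so each cell holds one file's lines
            content.append(f.readlines())
        result.append(category)
# One row per file: its content and the folder it came from
df = pd.DataFrame()
df['content'] = content
df['result'] = result
df.to_excel('data.xlsx')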
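The same walk over the folders can also be wrapped in a reusable function. The following is only a sketch using pathlib, not the original method: the function name `collect_txt` and its `root` parameter are hypothetical, and it assumes every txt file is UTF-8 encoded. Unlike the script above, it stores each file as a single string (via `read_text`) rather than a list of lines, and it only picks up files ending in `.txt`.

```python
from pathlib import Path

import pandas as pd


def collect_txt(root):
    """Return a DataFrame with one row per .txt file found in root/<category>/.

    Hypothetical helper for illustration; assumes UTF-8 encoded files.
    """
    rows = []
    for category_dir in sorted(Path(root).iterdir()):
        if not category_dir.is_dir():
            continue  # skip stray files sitting directly in the root folder
        for txt_path in sorted(category_dir.glob("*.txt")):
            # read_text returns the whole file as one string
            text = txt_path.read_text(encoding="utf-8")
            rows.append({"content": text, "result": category_dir.name})
    # Fix the column order so an empty root still yields both columns
    return pd.DataFrame(rows, columns=["content", "result"])
```

To write the result out, call `collect_txt(filedir).to_excel('data.xlsx', index=False)`; `index=False` leaves out the row-number column, and writing `.xlsx` files requires an Excel engine such as openpyxl to be installed.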
Dataset download: https://download.csdn.net/download/weixin_42342968/12162455