Merge files of the same type in batches

Brief description of requirements: A large number of .xlsx files have been downloaded to the computer and need to be merged into a single .xlsx file. Merging them manually is too time-consuming and error-prone.

Current problems:
1. The .xlsx files are stored in one folder, but the folder also contains other file types, such as .txt and .pdf. Only the .xlsx files should be merged.
2. The headers of the individual .xlsx files may be inconsistent.
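
The first problem can be handled before any Excel file is opened. The sketch below is one possible way to do it with the standard library only; the helper name `list_xlsx_files` is hypothetical and not part of the original code, and skipping names that start with `~$` (the lock files Excel creates while a workbook is open) is an extra assumption.

```python
import os


def list_xlsx_files(folder):
    """Return the full paths of the .xlsx files directly inside folder, sorted by name."""
    paths = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Keep regular files with an .xlsx extension (case-insensitive) and
        # skip temporary lock files such as "~$report.xlsx".
        if (os.path.isfile(full_path)
                and name.lower().endswith('.xlsx')
                and not name.startswith('~$')):
            paths.append(full_path)
    return paths
```

Calling `list_xlsx_files(dir_str)` would return only the spreadsheet paths, so .txt and .pdf files in the same folder never reach `pd.read_excel`.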

```python
import xlrd
import pandas as pd
import os

# Folder that holds the files to read, given as an absolute path
dir_str=r'D:\Mijia business\heater\e-commerce platform demand survey\JD product evaluation'
```



```python
# List every entry in the folder; os.listdir returns all file names, not only .xlsx
file_name_list=os.listdir(dir_str)
# Build the full path of each file with a list comprehension
file_dir_list=[os.path.join(dir_str,x) for x in file_name_list]
print(file_dir_list)  # all paths are listed
# DataFrame that will hold the data from every file
df=pd.DataFrame()
# Loop over the file names and read the data in each xlsx
for i in file_name_list:
    if(i[-9:]=='好中差评.xlsx'):   ## keep only files ending in 好中差评.xlsx, using string slicing
        EXCEL1=pd.read_excel(file_dir_list[i])
        # concat merges the data from multiple files
        df=pd.concat(df,EXCEL1)
```

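Running this code raises an error. The loop variable `i` is a file name (a string), but `file_dir_list` is a list and can only be indexed with integers or slices: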

```text
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-82-1d65c5381c60> in <module>
      2 for i in file_name_list:
      3     if(i[-9:]=='好中差评.xlsx'):   ## keep only files ending in 好中差评.xlsx, using string slicing
----> 4         EXCEL1=pd.read_excel(file_dir_list[i])
      5         # concat merges the data from multiple files
      6         df=pd.concat(df,EXCEL1)

TypeError: list indices must be integers or slices, not str
```
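
One way to avoid this error is to build each path directly from the file name instead of indexing the list with it. The sketch below also passes a list to `pd.concat`, because a call of the form `pd.concat(df, EXCEL1)` would fail as well: the first argument has to be an iterable of DataFrames. The function name `merge_xlsx`, the `reader` parameter (which lets the function run without real workbooks), and the extra `source_file` column are assumptions made for illustration, not part of the original post.

```python
import os

import pandas as pd


def merge_xlsx(folder, suffix='.xlsx', reader=pd.read_excel):
    """Read every file in folder whose name ends with suffix and stack the rows."""
    frames = []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(suffix):
            continue  # skip .txt, .pdf and any other non-matching file
        # Join folder and name; no list indexing is involved
        path = os.path.join(folder, name)
        frame = reader(path)
        frame['source_file'] = name  # record which file each row came from
        frames.append(frame)
    if not frames:
        return pd.DataFrame()
    # concat takes a list of DataFrames; columns are aligned by name, and
    # a column missing from one file is filled with NaN for that file's rows
    return pd.concat(frames, ignore_index=True, sort=False)
```

With the folder from this post, a call such as `merge_xlsx(dir_str, suffix='好中差评.xlsx')` followed by `to_excel('merged.xlsx', index=False)` on the result would write a single workbook; the output file name is only an example. Because `pd.concat` aligns columns by name, files with inconsistent headers still merge, with empty cells where a file lacks a column.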

Origin blog.csdn.net/weixin_42961082/article/details/109740041