1. The request
I've lost count of how many office automation questions fans have asked me. They all come from real situations that people run into at school and at work. The requirement this time is clear from the figure below, so I won't repeat it here. Let's go straight to the approach!
2. Problem-solving approach
To make this quick to learn, I'll break the problem into small steps here. I hope it helps.
1) Import related libraries
import pandas as pd
from openpyxl import load_workbook
from openpyxl import Workbook
import os
2) Get the path of the current working folder
path = os.getcwd()
print(path)
The results are as follows:
3) Walk the folder tree and list the files found in each folder (os.walk returns both subfolders and files)
# root is the folder being visited, dirs its subfolders, files its file names
for root, dirs, files in os.walk(path):
    print(files)
The results are as follows:
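To make the shape of what os.walk yields easier to see, here is a minimal sketch that runs on a throwaway folder. The helper name list_tree and the sample file names are made up for illustration and are not part of the original script.

```python
import os
import tempfile

def list_tree(top):
    """Return a list of (relative_folder, sorted_file_names) pairs under top."""
    found = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place makes the traversal order deterministic
        rel = os.path.relpath(root, top)
        found.append((rel, sorted(files)))
    return found

# Build a tiny throwaway tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "sub"))
    for rel_path in ("a.xlsx", "notes.txt", os.path.join("sub", "b.xlsx")):
        open(os.path.join(demo, rel_path), "w").close()
    tree = list_tree(demo)

for folder, names in tree:
    print(folder, names)
```

Each iteration covers one folder, so files in subfolders only show up together with their own root. That is why the folder and the file name have to be joined before a file can be opened.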
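4) Keep only the Excel files ending in .xlsx
tables = []
path = os.getcwd()
for root, dirs, files in os.walk(path):
    for name in files:
        # endswith is safer than split(".")[1], which fails on names
        # with no dot or with more than one dot
        if name.endswith(".xlsx"):
            # store the full path so files in subfolders can be opened later
            tables.append(os.path.join(root, name))
tables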
The results are as follows:
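The extension check is the fragile part of this step, so here is a small sketch of one way to make it robust. It assumes you also want to skip the temporary "~$" lock files that Excel creates while a workbook is open. That rule, the helper name find_xlsx, and the sample file names are my own additions, not part of the original script.

```python
import os
import tempfile

def find_xlsx(top):
    """Return full paths of .xlsx files under top, skipping Excel lock files."""
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # os.path.splitext splits on the LAST dot only, so names like
            # "report.v2.xlsx" and "README" are both handled safely.
            ext = os.path.splitext(name)[1].lower()
            if ext == ".xlsx" and not name.startswith("~$"):
                matches.append(os.path.join(root, name))
    return sorted(matches)

with tempfile.TemporaryDirectory() as demo:
    for name in ("report.v2.xlsx", "README", "~$report.v2.xlsx", "old.xls"):
        open(os.path.join(demo, name), "w").close()
    found = [os.path.basename(p) for p in find_xlsx(demo)]

print(found)
```

For comparison, "report.v2.xlsx".split(".")[1] gives "v2", and "README".split(".")[1] raises an IndexError, so a check based on split would miss the first file and crash on the second.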
5) Organize the data so it can be written to Excel later
One point deserves attention here. The organized data should be a nested list, where each inner list is one row of the Excel table.
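final_data = []
for table in tables:
    lis = []
    wb = load_workbook(table)
    sheet = wb[wb.sheetnames[0]]  # only the first worksheet is counted
    max_row = sheet.max_row       # last used row, header row included
    lis.append(os.path.basename(table))  # file name without the folder part
    lis.append(max_row)
    final_data.append(lis)
final_data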
The results are as follows:
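One caveat about max_row: as far as I know it reports the last row that has any cell object, so a sheet with a header row, or with cells that exist but hold no value, can report more rows than it has real data. If that matters for your files, a stricter count could look like the sketch below. The helper name count_filled_rows and the sample data are hypothetical.

```python
from openpyxl import Workbook

def count_filled_rows(sheet):
    """Count rows that contain at least one non-empty cell value."""
    return sum(
        1
        for row in sheet.iter_rows(values_only=True)
        if any(value is not None for value in row)
    )

# Small in-memory workbook for demonstration; nothing is written to disk.
wb = Workbook()
ws = wb.active
ws.append(["name", "score"])  # header row
ws.append(["Ann", 90])
ws.append(["Bob", 85])
ws.cell(row=10, column=1).value = None  # a cell that exists but holds no value

print(ws.max_row, count_filled_rows(ws))
```

Subtract 1 from the result if the header row should not be counted as data.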
6) Create a new Excel workbook and write the data row by row in a loop
new_wb = Workbook()
sheet = new_wb.active
sheet.title = "Final data"
sheet.append(["File name", "Row count"])  # header row
for row in final_data:
    sheet.append(row)
new_wb.save(filename="result.xlsx")
The results are as follows:
3. Complete code
For completeness, I've put the full code at the end of the article. Because of the article's length, I only paste a screenshot of it there; the detailed code can also be obtained at the end of the article.
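The steps in section 2 can also be assembled into a single function. The following is a minimal sketch under my own assumptions, not the author's final code. The function name summarize_row_counts, the output name row_counts.xlsx, the English sheet title and header, and the rule that skips the summary file on repeated runs are all illustrative choices.

```python
import os
import tempfile
from openpyxl import Workbook, load_workbook

def summarize_row_counts(folder, output_name="row_counts.xlsx"):
    """Write one [file name, max_row] row per .xlsx file found under folder."""
    output_path = os.path.join(folder, output_name)
    rows = []
    for root, dirs, files in os.walk(folder):
        for name in sorted(files):
            full_path = os.path.join(root, name)
            # Skip non-Excel files, and the summary itself on repeated runs.
            if not name.endswith(".xlsx") or full_path == output_path:
                continue
            wb = load_workbook(full_path)
            sheet = wb[wb.sheetnames[0]]  # first worksheet, as in the steps above
            rows.append([name, sheet.max_row])

    summary = Workbook()
    out = summary.active
    out.title = "Summary"
    out.append(["File name", "Rows"])  # header row
    for row in rows:
        out.append(row)
    summary.save(output_path)
    return rows

# Demonstration on a throwaway folder holding one five-row workbook.
with tempfile.TemporaryDirectory() as demo:
    sample = Workbook()
    for i in range(5):
        sample.active.append([i, i * i])
    sample.save(os.path.join(demo, "sample.xlsx"))

    result = summarize_row_counts(demo)
    saved = load_workbook(os.path.join(demo, "row_counts.xlsx"))
    saved_rows = [list(r) for r in saved.active.iter_rows(values_only=True)]

print(result)
print(saved_rows)
```

Returning the rows as well as saving them makes the function easy to check without opening Excel.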