Python Programming Quick Start, Chapter 13 Practical Projects: 13.14.5 Spreadsheet to Text File

Problem:

Write a program that performs the opposite task of the previous program. It should open a spreadsheet and write the cells in column A to one text file, the cells in column B to another text file, and so on.

The programs in this article were written with Python 3.11 on Windows 11.

The complete code using the openpyxl library is as follows:

import os

import openpyxl

# Spreadsheet file name and output folder path
fileName = 'example.xlsx'
output_folder = r'C:\path\to\your'    # path of the program directory on your computer

srcFile = os.path.join(output_folder, fileName)

# Load the spreadsheet
wb = openpyxl.load_workbook(srcFile, data_only=True)
sheet = wb.active

# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)

# Loop over every column and write its cells to the matching text file
for col in range(1, sheet.max_column + 1):
    textFile = os.path.join(output_folder, f"Column{col}.txt")    # path and name of the text file
    with open(textFile, 'w', encoding='utf-8') as f:
        for row in range(1, sheet.max_row + 1):
            cell_value = sheet.cell(row=row, column=col).value
            if cell_value is not None:    # skip empty cells, but keep values such as 0
                f.write(str(cell_value) + '\n')

The write mode 'w' can be changed to the append mode 'a' where appropriate, so that the existing content of the text files is not overwritten.
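
The difference between the two modes can be seen in a small standalone sketch. The helper name write_lines and the temporary file are made up for illustration and are not part of the program above.

```python
import os
import tempfile


def write_lines(path, lines, mode="w"):
    """Write each item on its own line; mode is 'w' (overwrite) or 'a' (append)."""
    with open(path, mode, encoding="utf-8") as f:
        for line in lines:
            f.write(f"{line}\n")


# Hypothetical demo file in a fresh temporary folder
demo_path = os.path.join(tempfile.mkdtemp(), "Column1.txt")

write_lines(demo_path, ["first run"])              # creates the file
write_lines(demo_path, ["second run"], mode="w")   # replaces the earlier content
write_lines(demo_path, ["third run"], mode="a")    # adds to the end of the file

with open(demo_path, encoding="utf-8") as f:
    print(f.read())
```

After the three calls the file contains only "second run" and "third run": the second call in 'w' mode discarded "first run", while the call in 'a' mode kept what was already there.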

Note that this code processes every row of the spreadsheet, including the header row.
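
If the header row is not wanted in the text files, one option is to start reading at row 2. The following is only a sketch: the function name column_values and its skip_header parameter are hypothetical, and sheet is assumed to be an openpyxl worksheet like the one loaded above.

```python
def column_values(sheet, col, skip_header=False):
    """Return the non-empty cells of one column as strings.

    sheet is assumed to be an openpyxl worksheet; col is the 1-based column
    number. With skip_header=True, row 1 is left out.
    """
    first_row = 2 if skip_header else 1
    values = []
    for row in range(first_row, sheet.max_row + 1):
        value = sheet.cell(row=row, column=col).value
        if value is not None:    # empty cells are skipped
            values.append(str(value))
    return values
```

Calling column_values(sheet, 1, skip_header=True) would then return the cells of column A from row 2 onward, ready to be written out line by line.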

If the spreadsheet is relatively large, read the table data into a list first and then write each column to its text file in a single call, which reduces the number of write operations and can shorten the running time.
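
A minimal sketch of that idea, assuming sheet is an openpyxl worksheet opened in the normal (not read-only) mode. The helper names collect_columns and write_columns are made up for illustration; iter_cols(values_only=True) is the openpyxl method that yields each column as a tuple of cell values.

```python
import os


def collect_columns(sheet):
    """Read every column of the worksheet into a list of lists of strings."""
    return [
        [str(value) for value in column if value is not None]
        for column in sheet.iter_cols(values_only=True)
    ]


def write_columns(columns, output_folder):
    """Write each column to Column1.txt, Column2.txt, ... with one write call per file."""
    os.makedirs(output_folder, exist_ok=True)
    for index, values in enumerate(columns, start=1):
        path = os.path.join(output_folder, f"Column{index}.txt")
        with open(path, "w", encoding="utf-8") as f:
            # Build the whole file content in memory, then write it once
            f.write("".join(f"{value}\n" for value in values))
```

With the workbook from the program above, the call would be write_columns(collect_columns(sheet), output_folder). Whether this is noticeably faster depends on the size of the spreadsheet, since Python already buffers file writes.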

There is also an easier way: use the pandas library. The complete code is as follows:

import os

import pandas as pd

# Spreadsheet file name and output folder path
fileName = 'example.xlsx'
output_folder = r'C:\path\to\your'    # path of the program directory on your computer

srcFile = os.path.join(output_folder, fileName)

# Read the spreadsheet
df = pd.read_excel(srcFile, header=None)    # without header=None, the first row is used as the header and is not written out

# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)

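# Loop over every column and write its contents to the matching text file
for i, column in enumerate(df.columns):
    values = df[column].tolist()    # collect all cell values of this column in the list values

    # Write the values of each column to a separate text file
    textFile = os.path.join(output_folder, f"{i}.txt")
    with open(textFile, 'w', encoding='utf-8') as f:
        f.write('\n'.join(str(value) for value in values))    # convert every element of values to a string with a generator expression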

Origin blog.csdn.net/shufenanbei/article/details/131072263