Foreword
This started as a request from a colleague in HR. Every year, similar statistics are entered into Excel, and then the same data has to be filled into a Word form by hand, so she wanted the job automated. Python is a language I have only picked up a little over the past two years, but her request had to be fulfilled!
1. Operating Excel with openpyxl
The Excel part is straightforward, because the data only needs to be read once.
1. Load the file
from openpyxl import load_workbook
elxFile = r'C:\Users\Administrator\Desktop\test.xlsx'
excel = load_workbook(elxFile)
2. Select Sheet
sheet = excel['Sheet1']
# Number of the last row
max_row = sheet.max_row
# Number of the last column
max_col = sheet.max_column
3. Get content
# Start at row 2; range() excludes its end, so max_row + 1 includes the last row
for row in range(2, max_row + 1):
    # cell takes two arguments: row number, column number
    print(sheet.cell(row, 1).value)
That is basically all there is to it.
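As an alternative to indexing cells one by one, openpyxl can also yield whole rows. The following is a minimal sketch, not the code used in this post: it builds a small in-memory workbook (the header and the names are made-up sample data) so that it runs without test.xlsx, then reads the first column from row 2 downwards with iter_rows.

```python
from openpyxl import Workbook

# Build a throwaway workbook in memory; the contents are made-up sample data.
wb = Workbook()
ws = wb.active
ws.append(["name"])   # row 1: header
ws.append(["Alice"])  # row 2
ws.append(["Bob"])    # row 3

# min_row=2 skips the header; max_col=1 limits each row to column A.
# values_only=True yields plain values instead of Cell objects.
names = [row[0] for row in ws.iter_rows(min_row=2, max_col=1, values_only=True)]
print(names)
```

With a real file, the same iter_rows call should work on the sheet returned by load_workbook.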
2. Operating Word with docx
The docx library (python-docx) can only handle .docx files, not .doc.
What my colleague needed was to copy the original table and paste it onto a new page.
This is awkward to do, and a Baidu search turns up hardly anything similar.
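1. Copy the table
# Assumes `from copy import deepcopy` and an already opened `document`
# Add a new page
document.add_page_break()
# The original table, used as the template
doc_table = document.tables[0]
new_tbl = deepcopy(doc_table._tbl)
# Locate the last paragraph (the page break just added)
page = document.paragraphs[-1]
page._p.addnext(new_tbl)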
2. Fill in the table
# i is the Excel row number; data starts at row 2, tables are indexed from 0
table = tables[i - 2]
table.cell(0, 2).text = str(sheet.cell(i, 1).value)
3. Final code
All the pieces above are in place; the final script just combines them.
from copy import deepcopy
from docx import Document
from openpyxl import load_workbook
docFile = r'C:\Users\Administrator\Desktop\test.docx'
document = Document(docFile)
elxFile = r'C:\Users\Administrator\Desktop\test.xlsx'
excel = load_workbook(elxFile)
# Select the sheet
sheet = excel['Sheet1']
max_row = sheet.max_row
max_col = sheet.max_column
def _combine_docx(document):
    """
    Adds a page break to the document, deep-copies the document's
    first table (the template) and places the copy after the last
    paragraph, i.e. on the new page.
    """
    # Add a new page
    document.add_page_break()
    # The original table, used as the template
    doc_table = document.tables[0]
    new_tbl = deepcopy(doc_table._tbl)
    # Locate the last paragraph (the page break just added)
    page = document.paragraphs[-1]
    page._p.addnext(new_tbl)
# Fill one table from one Excel row
def _insert_docx(i, tables):
    # Excel data starts at row 2, tables are indexed from 0
    table = tables[i - 2]
    # cell takes two arguments: row number, column number
    table.cell(0, 2).text = str(sheet.cell(i, 1).value)
# Create the tables: the template serves the first data row,
# so only max_row - 2 copies are needed
for row in range(2, max_row):
    _combine_docx(document)
# Save once the tables have been created, then reopen the file
document.save(docFile)
document = Document(docFile)
tables = document.tables
# Fill in the tables
# Start at row 2; range() excludes its end, so max_row + 1 includes the last row
for row in range(2, max_row + 1):
    _insert_docx(row, tables)
# Save the file
document.save(docFile)
As for why document = Document(docFile) has to be run again in the middle: I don't understand the reason myself, so I won't make one up.