Merge multiple plain text docx documents using python-docx

The python-docx library makes it easy to work with docx documents. Note that these are docx files, not doc files; the two formats are completely different in nature. Working with doc files requires win32com, which is relatively slow. python-docx is much faster by comparison.
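
The difference between the two formats shows up at the file level: a docx file is a ZIP package that contains XML parts such as word/document.xml, while a legacy doc file is a binary container. As a rough sketch (the helper name looks_like_docx is made up for illustration and is not part of python-docx), a file can be checked before it is handed to python-docx:

```python
import zipfile


def looks_like_docx(path):
    """Return True if `path` is a ZIP package that contains word/document.xml."""
    # Legacy .doc files are binary containers, not ZIP archives, so they fail here.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as package:
        # The main document part of a Word docx package lives at this path.
        return "word/document.xml" in package.namelist()
```

This only checks the packaging, not whether the XML inside is valid, so it is a quick filter rather than a full validation.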

The code below merges plain-text docx documents (no pictures). For now it only shows how to merge two documents: a template and one source file. With small changes it can merge any number of documents.

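from docx import Document

files = "企业计划书范文(创办你的企业).docx"

# Merge docx files
def combine_word_documents(files):
    # Open the template document that will hold the merged content
    merged_document = Document("template/通用.docx")
    # Read in the document to merge
    sub_doc = Document(files)
    # Move each body element across in turn; suitable for plain text.
    # list() takes a snapshot first, because append() moves the element out of sub_doc.
    for body in list(sub_doc._element.body):
        merged_document._element.body.append(body)
    # Save the merged document
    merged_document.save("test/test.docx")

combine_word_documents(files)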

If you have other needs for processing docx text, leave me a message and I will try to look into it.

Origin blog.csdn.net/wudechun/article/details/101796772