Convert word xml to docx


This article is reposted from: http://hucheng91.github.io/2017/04/09/web/java/freemarker_xdocxreport/

//=================================================================

We recently ran into a problem with Word files generated by FreeMarker. Because the Word document is first saved as XML and that XML is used as the template to fill, the generated .doc is still XML underneath, and the PDF that OpenOffice converts from it shows the raw XML instead of the document. I later found the following blog post, which solved our problem.

Note: we use OpenOffice and do not use xdocreport, whose performance was not very good for us, so the parts about xdocreport have been removed below.

//============================== End ==============================

Recently my company needed to generate contracts in PDF format. The contract template is a Word document of more than ten pages; depending on the customer, some parts of the template are filled with different values, and a PDF contract is then generated. At first I planned to use JasperReports and redraw the template in it, but with so many pages and such complicated styling, JasperReports was a nightmare to work with, so I gave that up. I then looked into docx4j and OpenOffice. The docx4j results were not great: bold fonts were not displayed, and its support for Office 2013 was poor. OpenOffice requires a separately installed server, which is very resource-intensive, so I gave that up as well. In the end I chose to fill in the template variables with FreeMarker to generate a docx, and then to convert the docx to PDF with the xdocreport library. This worked: it supports Office 2007, 2013 and so on, the docx styling is fully preserved, and the whole process is fast without consuming much memory or other resources.

Let's introduce FreeMarker first

  • FreeMarker is a template framework for Java, similar to Velocity. It is not limited to XML and can generate other text formats as well (see the FreeMarker site).


Next, the overall idea. Since Office 2007, Word files use the .docx extension, a format that stores its data as XML (.doc stores data in binary). This is what makes it possible to use FreeMarker. If you rename template.docx to template.zip and open it with a compression tool such as WinRAR instead of Word, you will find a directory structure that includes word/document.xml.
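Since a .docx is an ordinary ZIP archive, its parts can also be listed from Java with java.util.zip, without renaming the file. The following is a minimal sketch; the class name DocxEntryLister and the default path test.docx are placeholders chosen for illustration and are not part of the original project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class DocxEntryLister {

    /** Returns the names of all entries in the given ZIP stream (a .docx is a ZIP archive). */
    public static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass your own .docx as the first argument.
        Path docx = Paths.get(args.length > 0 ? args[0] : "test.docx");
        try (InputStream in = Files.newInputStream(docx)) {
            for (String name : listEntries(in)) {
                System.out.println(name);
            }
        }
    }
}
```

For a document saved by Word, the printed list should include word/document.xml, the part that holds the body text.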

The content we see when opening the file in Office is stored in word/document.xml. Open it and take a look (document.xml has no line breaks by default; I opened it in Notepad++ and installed the XML Tools plugin to format it. For the installation, see "Notepad++ format xml"). The data is stored in this XML, so we only need to turn the content we want to fill into variables, process the XML with FreeMarker, replace document.xml in template.zip with the processed XML, and save the resulting archive as data.docx. That data.docx then contains the data.
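The replace step described above can be sketched with java.util.zip alone. This is a minimal sketch under assumptions, not the ZipUtils class from the author's GitHub project: the class name ZipEntryReplacer and the method replaceEntry are hypothetical, and the code simply copies every entry of the archive while swapping in new content for one named entry.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryReplacer {

    /**
     * Copies every entry from the source archive to the target archive.
     * The entry named itemName gets its bytes from replacement instead of
     * from the source archive. The replacement stream is not closed here.
     */
    public static void replaceEntry(InputStream source, OutputStream target,
                                    String itemName, InputStream replacement) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(source);
             ZipOutputStream zout = new ZipOutputStream(target)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                // Use a fresh entry so size and CRC are recomputed for the output.
                zout.putNextEntry(new ZipEntry(entry.getName()));
                InputStream data = entry.getName().equals(itemName) ? replacement : zin;
                int read;
                while ((read = data.read(buffer)) != -1) {
                    zout.write(buffer, 0, read);
                }
                zout.closeEntry();
            }
        }
    }
}
```

With this sketch, the source would be a stream over template.zip, the target a stream to data.docx, itemName "word/document.xml", and replacement the XML produced by FreeMarker.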

The specific steps are as follows:

  1. Prepare the docx that serves as the template.
  2. Rename test.docx to test.zip, copy document.xml out of it, and open document.xml.
  3. The Java code is as follows:

    
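    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.zip.ZipInputStream;
    import java.util.zip.ZipOutputStream;

    import freemarker.template.Configuration;
    import freemarker.template.Template;

    // ZipUtils is a utility class from the GitHub project; import it from there.
    public class DocxGenerator {
        public static void main(String[] args) throws Exception {
            /* Initialize the configuration (match the version constant to your FreeMarker release) */
            Configuration configuration = new Configuration(Configuration.VERSION_2_3_23);
            /* Set the encoding */
            configuration.setDefaultEncoding("UTF-8");
            /* My template files are on drive D */
            String fileDirectory = "D:/cache/qqChace/T1/xlsx";
            /* Directory to load templates from */
            configuration.setDirectoryForTemplateLoading(new File(fileDirectory));
            /* Load the template */
            Template template = configuration.getTemplate("document.xml");
            /* Prepare the data; each key matches a ${key} tag in the template */
            Map<String, String> dataMap = new HashMap<>();
            dataMap.put("id", "黄浦江吴彦祖");
            dataMap.put("number", "20");
            dataMap.put("language", "java,php,python,c++.......");
            dataMap.put("example", "Hello World!");

            /* Output path of the filled-in XML */
            String outFilePath = fileDirectory + "/data.xml";
            // Write as UTF-8 and close the writer even if processing fails.
            try (Writer out = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(outFilePath), StandardCharsets.UTF_8), 10240)) {
                template.process(dataMap, out);
            }

            // ZipUtils is a utility class used to replace an entry in the archive; see the GitHub project.
            String itemname = "word/document.xml";
            try (ZipInputStream zipInputStream = ZipUtils.wrapZipInputStream(new FileInputStream(fileDirectory + "/test.zip"));
                 ZipOutputStream zipOutputStream = ZipUtils.wrapZipOutputStream(new FileOutputStream(fileDirectory + "/test.docx"));
                 InputStream filledXml = new FileInputStream(outFilePath)) {
                ZipUtils.replaceItem(zipInputStream, zipOutputStream, itemname, filledXml);
            }
            System.out.println("success");
        }
    }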
    
  4. The generated test.docx now contains the filled-in data, so the docx can be generated with FreeMarker.
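The sample values above are plain text. Since they are written into document.xml, a value containing characters such as & or < would presumably leave the XML malformed unless escaping is configured in FreeMarker itself. The following is a minimal sketch of escaping a value before putting it into the data map; the class XmlEscaper is hypothetical and not part of the original project.

```java
public class XmlEscaper {

    /** Replaces the five characters that have a special meaning in XML with their entities. */
    public static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative value only; prints the escaped form.
        System.out.println(escape("R&D <beta>"));
    }
}
```

In the code from step 3 this would be used as dataMap.put("example", XmlEscaper.escape(rawValue)), where rawValue is whatever text comes from the customer data.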


The whole thing is packaged as a project on GitHub: https://github.com/hucheng91/freemarker_xdoxreport.git
