Merging Word files with Java

Background

  In the online education industry, content-related projects often need to generate Word test papers dynamically. Each question in the question bank is saved in advance as a separate Word file. When some questions are selected to form a test paper, the selected Word files have to be merged by a Java program. Otherwise the content can only be combined by manual typing and copying, which is inefficient, costly in labor, and error-prone.

Difficulties

  Using POI to merge Word files raises the following difficulties:

  • Word structure problem - Word is a proprietary product, and its documents contain a lot of non-text content such as charts and pictures, while the commonly known methods can only parse plain text. Without understanding the internal hierarchical structure of a Word document, parsing is difficult.
  • Word version problem - Word currently has two document formats, doc and docx. Even after one format has been parsed successfully, there is no guarantee that the same parsing approach is compatible with the other.
  • Word specification problem - Some Word files were produced long ago and reworking them is too expensive, so their formatting varies widely. Even if a formatting specification is established, there is no guarantee that newly produced Word files will follow it.
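
  The version problem above can at least be detected before any parsing is attempted. The following is a minimal sketch, assuming only the well-known container signatures: the legacy doc format is stored in an OLE2 compound file, while docx is a ZIP package. The class name WordFormatSniffer and its methods are hypothetical and are not part of POI or any other library. The check identifies the container only, so a ZIP archive that is not a Word document would also be reported as DOCX.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: guesses the Word format from the first bytes of a file.
final class WordFormatSniffer {

    enum Format { DOC, DOCX, UNKNOWN }

    // Signature of an OLE2 compound file (container used by legacy .doc).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header (container used by .docx).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Classifies a file by the leading bytes passed in.
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.DOC;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.DOCX;
        }
        return Format.UNKNOWN;
    }

    // Reads at most the first 8 bytes of the file and classifies them.
    static Format detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

  A merge routine could call WordFormatSniffer.detect(path) for each selected question file and reject or separately handle any file whose format differs from the rest, instead of failing halfway through the merge.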

  Using Jacob to merge Word documents raises a different set of problems:

  • The server must run Windows - One reason many web projects are developed in Java is that the server can run a non-Windows system such as Linux or Unix, which reduces project cost.
  • Office must be installed on the server - Jacob stands for Java COM Bridge: Java calls the COM interfaces provided by Office to operate on Office files.
  • Concurrency problem - If multiple users generate Word files online at the same time, concurrent access must be handled. A small mistake can leave hung Office processes on the server that tie up its memory.
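
  One common way to contain the concurrency problem is to run all merge jobs through a single worker thread, so that only one request touches Office at a time. The sketch below illustrates that idea with the standard java.util.concurrent classes. It is not Jacob's API: the class SerializedMergeQueue is hypothetical, and the merger function is a placeholder for whatever code actually drives Office.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical wrapper: runs merge jobs one at a time on a single worker thread.
final class SerializedMergeQueue implements AutoCloseable {

    // A single-thread executor queues submitted jobs and runs them in order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Placeholder for the real merge code: takes source paths, returns the result path.
    private final Function<List<String>, String> merger;

    private final long timeoutSeconds;

    SerializedMergeQueue(Function<List<String>, String> merger, long timeoutSeconds) {
        this.merger = merger;
        this.timeoutSeconds = timeoutSeconds;
    }

    // Blocks the calling request thread until its job has run or the timeout expires.
    String merge(List<String> sourcePaths)
            throws InterruptedException, ExecutionException, TimeoutException {
        Callable<String> task = () -> merger.apply(sourcePaths);
        Future<String> job = worker.submit(task);
        try {
            return job.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker if the job is running; a hung native call
            // may ignore the interrupt and still need external cleanup.
            job.cancel(true);
            throw e;
        }
    }

    @Override
    public void close() {
        worker.shutdownNow();
    }
}
```

  Serializing jobs trades throughput for safety: requests wait in line, and the timeout only stops the caller from waiting forever. It does not by itself terminate an Office process that has stopped responding.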

Solution

  After researching the problem for some time with little progress, I found through repeated Baidu searches that PageOffice provides a good solution, and that the PageOffice sample program includes a related demonstration. That demonstration stores the Word files in the database as binary streams, whereas in your own project you only need to save the Word files as files on disk. The PageOffice solution uses the Office interface on the client to merge Word documents. This addresses the Word format, version, specification and multi-user concurrency problems at the same time, and places no requirements on the server.

  The PageOffice for Java development kit can be downloaded from http://www.zhuozhengsoft.com/dowm/ . Copy the unzipped Samples4 folder to the webapps directory of Tomcat, visit http://localhost:8080/Samples4/index.html, and open the comprehensive demo: 3. 2. Dynamically generate a test paper in a Word document

  

  

  
