[Original] Java approaches to reading table data from Word documents

  In a previous project I needed to read table data from Word documents. After searching through a lot of material in the Java community, I ended up with two solutions that each work reasonably well. I had help from many other developers along the way, so I would rather share what I found here than keep it to myself.

  The two solutions are: first, use POI's TableIterator to read the data in the table; second, use PageOffice.

  Why two solutions rather than a single best one? Because each has its own strengths and weaknesses. POI's advantage is obvious: it is free. That is exactly PageOffice's weakness, since PageOffice is a commercial Office component developed in China.

  POI has several drawbacks. Its interface is complex and cumbersome to call, and reading the content at a specific position in a Word document is especially awkward. The code that reads the table data runs on the server, so it has to be of high quality: you need to think about execution efficiency, concurrent user requests, and large documents that are slow to process and block the page. POI's object model imitates the VBA interface, yet the resulting code is more complicated than the equivalent VBA, and nothing has been done to make it more convenient to call. Just reading the code is a headache. In practice you will run into these problems and have to solve them yourself.

  This is where PageOffice has the advantage. With PageOffice you do not run into these problems, because it extracts the table data from Word on the client side. That fits the idea of distributed computing and takes load off the server. It also has one powerful extra: PageOffice can extract a picture from a Word table with very little code.

  Although PageOffice costs money, it lets you get more done with less effort, and it supports many features that POI does not. If the budget is really tight, you still have to use POI and put up with it, however painful that is.
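
  For comparison, here is a minimal sketch of the POI route using TableIterator. It assumes the legacy .doc format, read through POI's HWPF classes (the poi-scratchpad artifact must be on the classpath). The class name PoiTableReader and the file name sample.doc are made up for illustration. The sketch walks every table in document order and prints each cell, which also shows why picking out one specific table with POI means counting tables yourself.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Table;
import org.apache.poi.hwpf.usermodel.TableCell;
import org.apache.poi.hwpf.usermodel.TableIterator;
import org.apache.poi.hwpf.usermodel.TableRow;

public class PoiTableReader {
    public static void main(String[] args) throws IOException {
        // Hypothetical input file; pass a real .doc path as the first argument.
        String path = args.length > 0 ? args[0] : "sample.doc";

        try (InputStream in = new FileInputStream(path)) {
            HWPFDocument doc = new HWPFDocument(in);
            Range range = doc.getRange();

            // TableIterator visits the tables in the order they appear in the file.
            TableIterator tables = new TableIterator(range);
            int tableIndex = 0;
            while (tables.hasNext()) {
                Table table = tables.next();
                tableIndex++;
                for (int r = 0; r < table.numRows(); r++) {
                    TableRow row = table.getRow(r);
                    for (int c = 0; c < row.numCells(); c++) {
                        TableCell cell = row.getCell(c);
                        // HWPF cell text ends with a cell-mark control character (\u0007); strip it.
                        String text = cell.text().replace("\u0007", "").trim();
                        System.out.printf("table %d, row %d, col %d: %s%n",
                                tableIndex, r + 1, c + 1, text);
                    }
                }
            }
        }
    }
}
```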

  The core PageOffice code for reading the data in a Word table:

    WordDocument doc = new WordDocument(request,response);
    DataRegion dataReg = doc.openDataRegion("PO_table");
    Table table = dataReg.openTable(1);
    String cellValue = table.openCellRC(1,2).getValue(); // Value of the cell at row 1, column 2 of the first table in the data region (bookmark) "PO_table"
    doc.close();

  The code above is taken from the official samples. You can download "PageOffice for JAVA" from the download center on PageOffice's official website, run Samples4 from the PageOffice development kit, and open the sample "2, 16, get the data of the table in the Word file" to see the full code and how it behaves.

  Note that PageOffice uses the concept of a data region (DataRegion). A data region is essentially a bookmark whose name must start with "PO_". Having to put the table inside a data region may seem inconvenient, but it has a big benefit: when a Word file contains several tables, the data region tells PageOffice exactly which table to read. For example, if the bookmark PO_table contains one table, then no matter where that table sits in the whole Word file (tables in Word have no names, only an index, and are numbered sequentially from the start of the file to the end), doc.openDataRegion("PO_table").openTable(1) always returns the data of that table. With POI this is not the case: if a table or picture moves, the code has to be rewritten.
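
  The difference between looking a table up by index and looking it up by bookmark can be shown with a small toy model. This is plain Java only and uses neither the PageOffice nor the POI API. The class TableLookupDemo and the methods byIndex and byBookmark are hypothetical names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TableLookupDemo {
    /** Toy document: tables in document order, plus bookmark name -> table. */
    static final class Doc {
        final List<String[][]> tables = new ArrayList<>();
        final Map<String, String[][]> bookmarks = new LinkedHashMap<>();

        /** Inserts an unnamed table at a 0-based position, shifting later tables. */
        void insertTable(int position, String[][] table) {
            tables.add(position, table);
        }

        /** Appends a table and remembers it under a bookmark name. */
        void addBookmarkedTable(String bookmark, String[][] table) {
            tables.add(table);
            bookmarks.put(bookmark, table);
        }

        /** Index-based lookup; table, row and column are all 1-based. */
        String byIndex(int tableIndex, int row, int col) {
            return tables.get(tableIndex - 1)[row - 1][col - 1];
        }

        /** Bookmark-based lookup; row and column are 1-based. */
        String byBookmark(String bookmark, int row, int col) {
            return bookmarks.get(bookmark)[row - 1][col - 1];
        }
    }

    public static void main(String[] args) {
        Doc doc = new Doc();
        doc.addBookmarkedTable("PO_table",
                new String[][] {{"Name", "Alice"}, {"Age", "30"}});

        System.out.println(doc.byIndex(1, 1, 2));             // Alice
        System.out.println(doc.byBookmark("PO_table", 1, 2)); // Alice

        // Someone edits the template and adds a new table at the top of the file.
        doc.insertTable(0, new String[][] {{"Title", "Report"}});

        System.out.println(doc.byIndex(1, 1, 2));             // Report (wrong table now)
        System.out.println(doc.byBookmark("PO_table", 1, 2)); // Alice  (still correct)
    }
}
```

  After the extra table is inserted, the index-based call silently reads a different table, while the bookmark-based call still reaches the intended one.
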
  That is all for now. I am sharing this in the hope that it helps.

  
