[Original] Online editing of Word documents in Java with full-text search

1. Background

    Word documents are a staple of daily office work. Once a document server holds thousands of Word documents, it becomes difficult for users to find and open the ones that contain a given keyword. The solution that usually comes to mind is to use Apache POI on the server to extract the text of every document and store it in a database, then run a SQL query to check whether a document contains the keyword and decide whether it is the one to open. This approach has significant drawbacks. First, POI's support for Word documents is limited: the interfaces for Word are few and not very stable, and they are demanding about the format of the documents. Second, extracting the text of thousands of documents with POI and storing it in the database puts a heavy load on the server. The approach in this article instead uses the full-text extraction feature provided by PageOffice. When a file is edited and saved, the full text of the Word document is extracted and saved to the database, and a SQL query then checks whether the document contains the keywords. Because PageOffice extracts the plain text on the client, the load on the server is greatly reduced and server performance improves.

2. Main implementation code

  1. Call PageOffice to open the Word file test.doc online:

PageOfficeCtrl poCtrl = new PageOfficeCtrl(request);
// Set the server page
poCtrl.setServerPage(request.getContextPath() + "/poserver.zz");
// Set the save page to SaveFile.jsp, or SaveFile.do, SaveFile.action, etc.
// Either an action method or a RequestMapping method can be used.
poCtrl.setSaveFilePage("SaveFile.jsp");
// Open the Word document
poCtrl.webOpen("doc/test.doc", OpenModeType.docNormalEdit, "Zhang San");

  2. In the page (SaveFile.jsp) or method that saves the file, execute:

// Requires java.sql.Connection, java.sql.DriverManager and java.sql.PreparedStatement
FileSaver fs = new FileSaver(request, response);
// Save the edited document to disk
fs.saveToFile(request.getSession().getServletContext().getRealPath("SaveAndSearch/doc/") + "/" + fs.getFileName());
fs.setCustomSaveResult("ok");
// Get the plain text content of the document, without any formatting
String strDocumentText = fs.getDocumentText();
// --- Start updating the text content of the document in the database, taking SQLite as an example ---
int id = Integer.parseInt(request.getParameter("id"));
Class.forName("org.sqlite.JDBC");
String strUrl = "jdbc:sqlite:"
        + this.getServletContext().getRealPath("demodata/") + java.io.File.separator + "SaveAndSearch.db";
// A parameterized statement is used so that quotes in the document text cannot break (or inject into) the SQL
String strsql = "update word set Content=? where id=?";
try (Connection conn = DriverManager.getConnection(strUrl);
     PreparedStatement stmt = conn.prepareStatement(strsql)) {
    stmt.setString(1, strDocumentText);
    stmt.setInt(2, id);
    stmt.executeUpdate();
}
// --- End updating the text content of the document in the database ---
fs.close();

  3. When a full-text search is needed, simply query the Content field in the database, which holds the plain text content of each Word file.
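
The query in step 3 could look roughly like the sketch below. It assumes the same `word` table with `id` and `Content` columns used in the save code; the class name `WordSearch` and the methods `toLikePattern` and `findDocumentIds` are hypothetical names introduced here for illustration and are not part of PageOffice. The keyword is bound as a parameter, and the LIKE wildcards `%` and `_` are escaped so that they are matched literally.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class WordSearch {

    // Hypothetical helper: wraps the keyword in % wildcards and escapes
    // any \, % or _ inside it so they are treated as literal characters.
    static String toLikePattern(String keyword) {
        String escaped = keyword
                .replace("\\", "\\\\")
                .replace("%", "\\%")
                .replace("_", "\\_");
        return "%" + escaped + "%";
    }

    // Hypothetical helper: returns the ids of all rows in the "word" table
    // whose Content column contains the keyword.
    static List<Integer> findDocumentIds(Connection conn, String keyword) throws SQLException {
        // ESCAPE '\' tells the database that \ is the escape character used above
        String sql = "SELECT id FROM word WHERE Content LIKE ? ESCAPE '\\'";
        List<Integer> ids = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, toLikePattern(keyword));
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getInt("id"));
                }
            }
        }
        return ids;
    }
}
```

The returned ids can then be used to build the list of documents to open. A LIKE query scans every row, which is usually acceptable for a few thousand documents; a dedicated full-text index would be a separate design choice outside the scope of this example.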

3. Example description

  1. Download: get the PageOffice for Java development kit from http://www.zhuozhengsoft.com/dowm/

  2. Sample deployment: unzip the PageOffice development package, copy the Samples4 folder to Tomcat's webapps directory, visit http://localhost:8080/Samples4/index.html, and open the sample demo listed as 3. 14, "Full-text search for Word documents containing keywords".
