Recently I had a requirement: a small program produces an image of handwritten Chinese, and the back end then has to recognize the text in that image. At first I considered paid third-party APIs and tried the general text recognition API on the Baidu AI open platform; later I found Tesseract-OCR. These notes are put together from several articles I consulted.
Preparation:
1. Download Tesseract-OCR 3.0 or later: https://download.csdn.net/download/qq_26161693/10646074
2. Select the chi_sim.traineddata language data during installation; after installation, the program must load the Chinese language pack (chi_sim.traineddata) from the tessdata directory;
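Before calling the OCR engine, it can help to confirm that the language pack is actually where the program expects it. The following is a minimal sketch using only the Java standard library; the class name TessdataCheck and the method hasLanguagePack are hypothetical helpers, and the default path is an assumed Windows install location that may differ on your machine.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TessdataCheck {

    /**
     * Returns true if the given tessdata directory contains
     * "<language>.traineddata" as a regular file.
     */
    public static boolean hasLanguagePack(String dataPath, String language) {
        Path trained = Paths.get(dataPath, language + ".traineddata");
        return Files.isRegularFile(trained);
    }

    public static void main(String[] args) {
        // Assumed default install location; pass your own path as the first argument.
        String dataPath = args.length > 0
                ? args[0]
                : "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata";
        System.out.println("chi_sim available: " + hasLanguagePack(dataPath, "chi_sim"));
    }
}
```

If the check prints false, the Chinese data was probably not selected during installation, or the program is pointing at a different tessdata directory.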
Maven dependency:
<dependency>
<groupId>net.sourceforge.tess4j</groupId>
<artifactId>tess4j</artifactId>
<version>3.2.1</version>
</dependency>
Demo:
// imports needed at the top of the class file
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import net.sourceforge.tess4j.Tesseract;

/**
 * Recognizes the Chinese text in an image.
 *
 * @param imagePath path of the image to recognize
 * @return the recognition result, or null if recognition fails
 */
public static String discernWord(String imagePath) {
    try {
        File imageFile = new File(imagePath);
        BufferedImage textImage = ImageIO.read(imageFile);
        Tesseract instance = Tesseract.getInstance();
        instance.setDatapath("C:\\Program Files (x86)\\Tesseract-OCR\\tessdata"); // set the language data directory
        instance.setLanguage("chi_sim"); // recognize Simplified Chinese
        return instance.doOCR(textImage);
    } catch (Exception e) {
        e.printStackTrace();
        return null; // recognition failed
    }
}
Test:
public static void main(String[] args) throws Exception {
    String words = discernWord("F:/test_used_url/ocr/originalPic/hotkidclub.jpg"); // path of the image to recognize
    System.out.println(words);
}
ps:
In a Windows development environment, this works once Tesseract is installed (tested personally). I have not tried loading only the language pack without installing the exe; try it if you are in a position to.
Deploying and running under Linux, however, runs into all sorts of pitfalls.
Solution: 1) After installing Tesseract-OCR on Linux, copy the related .so files to the /usr/lib directory
2) In the project root (for Maven, under resources), add a linux-x86-64 folder
3) Configure the Linux locale environment variables
4) Under heavy traffic Tomcat also crashes easily, so the number of threads or the concurrency needs to be limited;
Details: http://www.cnblogs.com/zlAurora/p/9266039.html
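For point 4, one possible way to cap concurrency inside the application (rather than in the Tomcat configuration) is a semaphore around the OCR call. This is only a sketch under assumptions: BoundedOcr is a hypothetical wrapper class, the OCR call is passed in as a function so the example runs without Tesseract installed, and the permit count and timeout are illustrative values to be tuned.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class BoundedOcr {

    private final Semaphore permits;
    private final Function<String, String> ocr;

    /**
     * @param maxConcurrent maximum number of OCR calls allowed to run at once
     * @param ocr           the actual recognition function (image path -> text)
     */
    public BoundedOcr(int maxConcurrent, Function<String, String> ocr) {
        this.permits = new Semaphore(maxConcurrent);
        this.ocr = ocr;
    }

    /**
     * Runs OCR if a permit becomes free within the timeout;
     * returns null when the service is too busy.
     */
    public String recognize(String imagePath, long timeoutMillis) throws InterruptedException {
        if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return null; // too busy; the caller can reply "try again later"
        }
        try {
            return ocr.apply(imagePath);
        } finally {
            permits.release(); // always free the permit, even if OCR throws
        }
    }
}
```

With the demo method above, the wrapper could be created once as `new BoundedOcr(4, path -> discernWord(path))` and shared across requests, so that at most four recognitions run at the same time and the rest wait or are turned away.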