I have been working on OCR text recognition recently, and this post records the process of installing the Tesseract training tools.

Calling the Tesseract API from your own code does not require running the installer (although you can also install the exe and set environment variables); you only need to configure the library in Visual Studio, much as you would for OpenCV.

When a model trained by others, or the official one, does not work well on your own project, you need to train a model yourself. Here are the three tools that need to be installed for training.

  1. tesseract: Some bloggers advise against downloading builds labeled dev, alpha, beta, and so on, because these are test versions and may be unstable, so take care here. I installed: tesseract-ocr-setup-4.0.0dev-20161129.exe
  2. Java JDK: A Java environment is required. I installed jdk-8u311-windows-x64.exe.
  3. jTessBoxEditor: This needs no installation after downloading. If the JDK is installed correctly, it can be started directly and used for training.
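As a quick sanity check after installing the tools above, a minimal Python sketch along these lines can report whether the command-line programs are reachable. It assumes the installers expose commands named `tesseract` and `java` on the Path; jTessBoxEditor is not checked because it is only unpacked, not installed. The helper names `find_tools` and `missing_tools` are hypothetical.

```python
import shutil


def find_tools(names=("tesseract", "java"), which=shutil.which):
    """Map each command name to its resolved path, or None if it is not on the Path."""
    return {name: which(name) for name in names}


def missing_tools(found):
    """Return the names whose lookup failed, in sorted order."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'not found on Path'}")
```

If either command is reported as not found, the environment variables are the first thing to recheck.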

Note: When installing the Java JDK, the installer prompts for an installation location twice: the first time for the JDK and the second time for the JRE. It is recommended to install them into two different subfolders of the same java folder. (Do not install into the root of the java folder; installing the JDK and JRE into the same folder causes errors.) (From Baidu Library: https://jingyan.baidu.com/article/6dad5075d1dc40a123e36ea3.html)

When configuring system variables, you need to configure two:

  1. Create a new variable named JAVA_HOME whose value is the JDK installation directory (for example: D:\Java\jdk1.8.0).
  2. Append ;%JAVA_HOME%\bin to Path (note the semicolon in front; don't forget it).
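The two variables can be checked with a short Python sketch like the one below. It is only an illustration: the function name `check_java_env` is hypothetical, it accepts either the literal `%JAVA_HOME%\bin` entry or its already expanded form, and it compares paths case-insensitively using Windows rules via `ntpath`.

```python
import ntpath
import os


def check_java_env(env):
    """Return a list of problems found in an environment mapping (empty if none)."""
    java_home = env.get("JAVA_HOME", "")
    if not java_home:
        return ["JAVA_HOME is not set"]

    # The entry that Path should contain once %JAVA_HOME% is expanded.
    expected = ntpath.normcase(ntpath.normpath(ntpath.join(java_home, "bin")))

    raw_path = env.get("Path", env.get("PATH", ""))
    entries = [e for e in raw_path.split(";") if e]
    normalized = {
        ntpath.normcase(ntpath.normpath(e.replace("%JAVA_HOME%", java_home)))
        for e in entries
    }

    if expected not in normalized:
        return ["Path has no entry for %JAVA_HOME%\\bin"]
    return []


if __name__ == "__main__":
    for problem in check_java_env(os.environ) or ["Java variables look fine"]:
        print(problem)
```

A forgotten semicolon shows up here as a missing entry, because the bin folder gets glued onto the previous Path item.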

Origin blog.csdn.net/qq_43207709/article/details/121561859