With just these three steps, you too can use Java to recognize pictures

Recently, with some free time on my hands, I have been studying how to use Java to simulate browser behavior. While experimenting with the login step, I ran into the problem of recognizing the verification code (captcha), so I searched online for how to do this in Java. The articles I found did not suit my configuration, so I am writing this post to record the pitfalls I hit and how I solved them.

To do image recognition, you can use Tesseract-OCR directly, but that approach requires downloading software and installing an environment on the computer, so it is not very portable. With Tess4J, you only need to download the relevant Jar package and import it into the project; the packaged project can then run anywhere.

First, the computer and JDK version I use:

  • Computer: MacBook
  • JDK version: 1.8

Next, the steps required:

  1. Import the Tess4J Jar package
  2. Install tesseract with brew
  3. Download language packs

These three simple steps are all it takes to recognize image verification codes with Java on this machine. The sections below go through each of them in detail.

Import Tess4J

With Maven, add the following dependency:


<dependency>
 <groupId>net.sourceforge.tess4j</groupId>
 <artifactId>tess4j</artifactId>
 <version>3.2.1</version>
</dependency>

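With Gradle (recent Gradle versions no longer have the compile configuration, so use implementation there instead):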

compile 'net.sourceforge.tess4j:tess4j:3.2.1'

Install tesseract with brew

A single command installs it:

brew install tesseract

However, when using brew I ran into very slow downloads. After looking into it, I found that the fix is to switch brew to a different download mirror.

# Step 1
cd "$(brew --repo)"
git remote set-url origin https://mirrors.tuna.tsinghua.edu.cn/git/homebrew/brew.git

# Step 2
cd "$(brew --repo)/Library/Taps/homebrew/homebrew-core"
git remote set-url origin https://mirrors.tuna.tsinghua.edu.cn/git/homebrew/homebrew-core.git

# Step 3
brew update

Note that the update takes a while.

Once brew update finishes, the mirror replacement is complete: brew install becomes much faster and no longer hangs for a long time with no progress.

To restore the original remotes:

cd "$(brew --repo)"
git remote set-url origin https://github.com/Homebrew/brew.git
 
cd "$(brew --repo)/Library/Taps/homebrew/homebrew-core"
git remote set-url origin https://github.com/Homebrew/homebrew-core
 
brew update
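
Tess4J calls the native tesseract library through JNA, so the JVM has to be able to find the library that brew installed. The following is a minimal sketch of one way to check for it and pass its location to JNA through the jna.library.path system property. The two directories are the usual Homebrew library locations and the file name libtesseract.dylib is an assumption about what brew installs; the class NativeLibLocator and its method are hypothetical helpers, not part of Tess4J. Your setup may not need this step at all.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class NativeLibLocator {

    // Assumed Homebrew library directories; adjust them for your machine.
    public static final List<String> CANDIDATE_DIRS =
            Arrays.asList("/usr/local/lib", "/opt/homebrew/lib");

    // Returns the first directory that contains the given library file, if any.
    public static Optional<Path> findLibraryDir(List<String> dirs, String libFileName) {
        for (String dir : dirs) {
            if (Files.exists(Paths.get(dir, libFileName))) {
                return Optional.of(Paths.get(dir));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed library file name for a brew-installed tesseract on macOS.
        Optional<Path> dir = findLibraryDir(CANDIDATE_DIRS, "libtesseract.dylib");
        if (dir.isPresent()) {
            // Must be set before the first Tess4J call so that JNA sees it.
            System.setProperty("jna.library.path", dir.get().toString());
            System.out.println("Using native library from " + dir.get());
        } else {
            System.out.println("libtesseract.dylib not found; check the brew installation");
        }
    }
}
```

If the library is already on a path that JNA searches by default, nothing needs to be set and the check only serves as a quick diagnostic.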

Download language packs

Download the language pack from GitHub, decompress it, and place it in a directory of your choice. Then write the following code.

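import java.io.File;

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

// The class name is arbitrary; the original snippet only showed the two methods.
public class OcrDemo {

    // Runs OCR on the image at imageLocation and returns the recognized text.
    public static String getImgText(String imageLocation) {
        ITesseract instance = new Tesseract();
        // Directory where the downloaded language packs are stored.
        instance.setDatapath("path/to/the/language/packs");
        try {
            return instance.doOCR(new File(imageLocation));
        } catch (TesseractException e) {
            // Report the failure instead of silently discarding the message.
            System.err.println(e.getMessage());
            return "Error while reading image";
        }
    }

    public static void main(String[] args) {
        System.out.println(getImgText("path/to/the/image/to/recognize"));
    }
}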
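
A wrong data path is an easy mistake to make here, so it can help to check the language pack directory before handing it to setDatapath. The following is a minimal sketch that lists the language codes found in a directory, assuming the packs are files named <code>.traineddata (for example eng.traineddata). The class LanguagePackCheck and the default directory name tessdata are hypothetical. Depending on the Tesseract version, the data path may be expected to be either this directory or its parent, so point the check at wherever the .traineddata files actually are.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LanguagePackCheck {

    private static final String SUFFIX = ".traineddata";

    // Returns the sorted language codes of all <code>.traineddata files directly in dir.
    public static List<String> listLanguages(Path dir) throws IOException {
        List<String> languages = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*" + SUFFIX)) {
            for (Path file : stream) {
                String name = file.getFileName().toString();
                languages.add(name.substring(0, name.length() - SUFFIX.length()));
            }
        }
        Collections.sort(languages);
        return languages;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default; pass the directory that holds your language packs.
        Path dataDir = Paths.get(args.length > 0 ? args[0] : "tessdata");
        if (!Files.isDirectory(dataDir)) {
            System.out.println("Not a directory: " + dataDir.toAbsolutePath());
            return;
        }
        System.out.println("Language packs found: " + listLanguages(dataDir));
    }
}
```

An empty list means the directory holds no language packs, which is worth fixing before looking for problems in the OCR code itself.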

Now we can use Java for image recognition. For example, take the following picture:

Recognizing it directly gives the following output:

Later, I found that this setup is still not good enough for recognizing verification codes: most verification codes use hollow or irregular characters that it cannot recognize, so another approach is still needed.
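
One common thing to try before giving up on OCR is to clean up the image first. The following is a minimal sketch of fixed-threshold binarization using only the JDK: each pixel is converted to a gray value and then forced to pure black or pure white. This is not something the experiment above tried, the class Binarizer and the threshold of 128 are illustrative assumptions, and it may well not be enough for hollow or distorted characters.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class Binarizer {

    // Returns a black-and-white copy of src: pixels darker than threshold (0-255)
    // become black, all others become white.
    public static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Weighted gray value using the usual luminance coefficients.
                int gray = (r * 299 + g * 587 + b * 114) / 1000;
                out.setRGB(x, y, gray < threshold ? 0x000000 : 0xFFFFFF);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: java Binarizer <input image> <output png>");
            return;
        }
        BufferedImage input = ImageIO.read(new File(args[0]));
        if (input == null) {
            System.out.println("Could not read image: " + args[0]);
            return;
        }
        // 128 is an arbitrary middle threshold; tune it for your images.
        ImageIO.write(binarize(input, 128), "png", new File(args[1]));
    }
}
```

The cleaned image can be written to disk and passed to getImgText, or handed to Tess4J directly, since doOCR also accepts a BufferedImage.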

Code addresses involved in the project

Origin http://43.154.161.224:23101/article/api/json?id=324108110&siteId=291194637