Using tesseract-ocr from Java for text recognition

Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission. https://blog.csdn.net/Pruett/article/details/83508875

Add the following JARs as Maven dependencies:

    <dependency>
        <groupId>net.java.dev.jna</groupId>
        <artifactId>jna</artifactId>
        <version>4.1.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/net.sourceforge.tess4j/tess4j -->
    <dependency>
        <groupId>net.sourceforge.tess4j</groupId>
        <artifactId>tess4j</artifactId>
        <version>3.2.1</version>
        <exclusions>
            <exclusion>
                <groupId>com.sun.jna</groupId>
                <artifactId>jna</artifactId>
            </exclusion>
        </exclusions>
    </dependency>

The code:

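    import java.io.File;

    import net.sourceforge.tess4j.ITesseract;
    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    public static void ocr(String filename) {
        try {
            // The file to recognize
            File tifFile = new File(filename);
            ITesseract instance = new Tesseract();
            // Point to the directory that contains the data folder
            instance.setDatapath("/usr/local/share");
            // Use Simplified Chinese
            instance.setLanguage("chi_sim");
            // Check that the file can be found and read
            System.out.println(tifFile.canRead());
            // Run the recognition
            String result = instance.doOCR(tifFile);
            System.out.println(result);
        } catch (TesseractException e) {
            e.printStackTrace();
        }
    }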
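
If recognition fails, one thing worth checking first is whether the language file can actually be found under the data path. The sketch below is not from the original post. It uses only the Java standard library and assumes the layout `<datapath>/tessdata/<language>.traineddata`, which matches the comment above about pointing at the directory that contains the data folder. The class name `TessdataCheck` and its two helper methods are hypothetical names chosen for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class TessdataCheck {

    // Builds the path where the language file is expected to live.
    // Assumption: <datapath>/tessdata/<language>.traineddata
    static Path trainedDataFile(String datapath, String language) {
        return Paths.get(datapath, "tessdata", language + ".traineddata");
    }

    // Returns true only if that path exists and is a regular file.
    static boolean isLanguageInstalled(String datapath, String language) {
        return Files.isRegularFile(trainedDataFile(datapath, language));
    }

    public static void main(String[] args) {
        // Same values as passed to setDatapath and setLanguage above.
        String datapath = "/usr/local/share";
        String language = "chi_sim";

        Path expected = trainedDataFile(datapath, language);
        if (isLanguageInstalled(datapath, language)) {
            System.out.println("Found language data: " + expected);
        } else {
            System.out.println("Missing language data: " + expected);
        }
    }
}
```

Running this check before calling `ocr` separates a missing `chi_sim.traineddata` from problems with the image file itself.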
