Reconocimiento de texto sin conexión de Android: llamada tesseract4android

El reconocimiento de texto en línea de Android puede ajustar la interfaz de Alibaba Cloud Reconocimiento de texto de Android-llamada OCR de Alibaba Cloud__blog de Huahua-blog CSDN

Si necesita reconocimiento de texto sin conexión, puede ajustar tesseract4android. Los resultados de las pruebas personales no son particularmente ideales, pero la velocidad es realmente rápida: el VIVO S10 toma fotografías desde atrás y completa el reconocimiento en 80 ms. Mucha información existente habla de llamar a tess-two, pero esta biblioteca ya no se mantiene, la última versión es tesseract4android. Esta es una biblioteca de código abierto, ruta del código fuente: https://github.com/adaptech-cz/Tesseract4Android

La llamada a esta biblioteca es muy sencilla y también se introduce en el archivo Léame oficial.

1. Agregue build.gradle

allprojects {
    repositories {
        ...
        maven { url 'https://jitpack.io' }
    }
}

dependencies {
    // To use Standard variant:
    implementation 'cz.adaptech.tesseract4android:tesseract4android:4.5.0'

}

2. La sustitución también es muy simple: el código de muestra oficial es el siguiente. Lo principal es proporcionar una biblioteca de capacitación, luego proporcionar fotografías y finalmente obtener los resultados.

// Create TessBaseAPI instance (this internally creates the native Tesseract instance)
TessBaseAPI tess = new TessBaseAPI();

// Given path must contain subdirectory `tessdata` where are `*.traineddata` language files
// The path must be directly readable by the app
String dataPath = new File(context.getFilesDir(), "tesseract").getAbsolutePath();

// Initialize API for specified language
// (can be called multiple times during Tesseract lifetime)
if (!tess.init(dataPath, "eng")) { // could be multiple languages, like "eng+deu+fra"
    // Error initializing Tesseract (wrong/inaccessible data path or not existing language file(s))
    // Release the native Tesseract instance
    tess.recycle();
    return;
}

// Load the image (file path, Bitmap, Pix...)
// (can be called multiple times during Tesseract lifetime)
tess.setImage(image);

// Start the recognition (if not done for this image yet) and retrieve the result
// (can be called multiple times during Tesseract lifetime)
String text = tess.getUTF8Text();

// Release the native Tesseract instance when you don't want to use it anymore
// After this call, no method can be called on this TessBaseAPI instance
tess.recycle();

3. Ruta de la base de datos de entrenamiento: GitHub - tesseract-ocr/tessdata en 4.0.0

Solo necesito realizar el reconocimiento de inglés, para poder descargar eng.traineddata. Si necesito realizar el reconocimiento en varios idiomas, descargue varias bases de datos de capacitación en idiomas según mis propias necesidades. Una vez descargadas estas bases de datos, deben colocarse en un subdirectorio con el nombre especificado tessdata. Al llamar a init, se debe proporcionar su directorio principal.

4. Preste atención a los problemas de permisos al extraer la base de datos de capacitación; de lo contrario, la inicialización fallará y el error será un ERROR. Mi solución es empaquetar la base de datos de capacitación en la APLICACIÓN, publicarla en el directorio interno después de iniciar la APLICACIÓN y luego usarla.

1) Coloque la base de datos de capacitación en el directorio sin formato.

2) Clase de liberación de archivos


import static androidx.camera.core.impl.utils.ContextUtil.getApplicationContext;

import android.content.Context;
import android.net.Uri;
import android.util.Log;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

public class FileManager {
    String TAG = "FILE";
    Context context = null;

    public FileManager(Context context)
    {
        this.context = context;
    }

    private File getFilePtr(String outName, String subFolder) throws IOException {
        //找到目录
        File filesDir = context.getFilesDir();
        if (!filesDir.exists()) {
            filesDir.mkdirs();
        }
        //创建专属目录
        File outFileFolder = new File(filesDir.getAbsolutePath()+"/target/"+subFolder);
        if(!outFileFolder.exists()) {
            outFileFolder.mkdirs();
        }
        //创建输出文件
        File outFile=new File(outFileFolder,outName);
        String outFilename = outFile.getAbsolutePath();
        Log.i(TAG, "outFile is " + outFilename);
        if (!outFile.exists()) {
            boolean res = outFile.createNewFile();
            if (!res) {
                Log.e(TAG, "outFile not exist!(" + outFilename + ")");
                return null;
            }
        }
        return outFile;
    }
    private int copyData(File outFile, InputStream is){
        try {
            FileOutputStream fos = new FileOutputStream(outFile);
            //分段读取文件，并写出到输出文件，完成拷贝操作。
            byte[] buffer = new byte[1024];
            int byteCount;
            while ((byteCount = is.read(buffer)) != -1) {
                fos.write(buffer, 0, byteCount);
            }
            fos.flush();
            is.close();
            fos.close();
            return 0;
        } catch (Exception e) {
            e.printStackTrace();
        }
        return -1;
    }

    public String getFilePathAfterCopy(Uri uri, String outName, String subFolder, boolean ifReturnParent){
        try {
            File outFile=getFilePtr(outName,subFolder);
            //创建输入文件流
            InputStream is= context.getContentResolver().openInputStream(uri);
            if(0!=copyData(outFile,is)) {
                return null;
            }
            //返回路径
            if(ifReturnParent) {
                return  outFile.getParent();
            } else {
                return outFile.getPath();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
    public String getFilePathAfterCopy(int resId,String outName,String subFolder,boolean ifReturnParent) {
        try {
            //找到目录
            File outFile=getFilePtr(outName,subFolder);
            //创建输入文件流
            InputStream is = context.getResources().openRawResource(resId);
            if(0!=copyData(outFile,is)) {
                return null;
            }
            //返回路径
            if(ifReturnParent) {
                return  outFile.getParent();
            } else {
                return outFile.getPath();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
    public String byteToString(byte[] data) {
        int index = data.length;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == 0) {
                index = i;
                break;
            }
        }
        byte[] temp = new byte[index];
        Arrays.fill(temp, (byte) 0);
        System.arraycopy(data, 0, temp, 0, index);
        String str;
        try {
            str = new String(temp, "ISO-8859-1");//ISO-8859-1//GBK
        } catch (UnsupportedEncodingException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
            return "";
        }
        return str;
    }

}

3) La aplicación comienza a publicar archivos.

        //release ocr data file
        FileManager fileManager = new FileManager(this);
        String filePath = fileManager.getFilePathAfterCopy(R.raw.eng, "eng.traineddata", "tessdata", true);
        Log.e("OCR", "datapath + " +filePath);

4) Ruta del archivo llamado por la interfaz init:

filePath.substring(0, filePath.length() - 8)

5. Pruebe el efecto después de agregar la llamada de cámara.

Para llamadas de cámara, consulte el siguiente artículo.

Nuevo en la industria, comparta su experiencia. Si hay algún error, indíquelo ~

Los derechos de autor pertenecen a: Shenzhen Qizhi Technology Co., Ltd.-Huahua

Reconocimiento de texto sin conexión de Android: llamada tesseract4android

Supongo que te gusta