Rapid integration of document correction capabilities on Android is super simple

Preface

In the previous article "Ultra-simple integration of Huawei HMS ML Kit text recognition SDK, one-click automatic billing number entry", we introduced how Huawei HMS ML Kit text recognition can automatically recognize text information in photos. Some friends may ask: if the camera is not facing the text when the photo is taken and the resulting image is skewed, can the text still be recognized accurately? Of course. HMS ML Kit document correction technology automatically identifies the document position, corrects the shooting angle, and supports user-defined boundary points, so a front-facing image of the document can be obtained even when it was shot at an oblique angle.

Application scenarios

Document correction technology has many application scenarios in daily life. For example, when a paper document is photographed at an oblique angle, reading it becomes very inconvenient. Document correction can adjust the document to a frontal view, making reading much smoother.

For another example, when recording card information, document correction lets you obtain a frontal photo of the card without having to align the camera directly with it.

In addition, when you are traveling and your body is at an angle to the road, it is difficult to read a street sign accurately. Document correction can produce a frontal image of the street sign.

Convenient, isn't it? Next, we will explain in detail how to quickly integrate document correction technology on Android.

Development practice

For detailed preparation steps, please refer to the Huawei Developer Alliance documentation:
https://developer.huawei.com/consumer/cn/doc/development/HMS-Guides/ml-process-4
Here are the key development steps.

1. Development preparation

1.1 Configure the Maven repository address in the project-level build.gradle

    buildscript {
        repositories {
            ...
            maven { url 'https://developer.huawei.com/repo/' }
        }
        dependencies {
            ...
            classpath 'com.huawei.agconnect:agcp:1.3.1.300'
        }
    }
    allprojects {
        repositories {
            ...
            maven { url 'https://developer.huawei.com/repo/' }
        }
    }

1.2 Configure the SDK dependencies in the app-level build.gradle

    dependencies {
        // Import the base SDK.
        implementation 'com.huawei.hms:ml-computer-vision-documentskew:2.0.2.300'
        // Import the document detection/correction model package.
        implementation 'com.huawei.hms:ml-computer-vision-documentskew-model:2.0.2.300'
    }

1.3 Add the following configuration to the file header of the app-level build.gradle

    apply plugin: 'com.huawei.agconnect'
    apply plugin: 'com.android.application'

1.4 Add the following statement to the AndroidManifest.xml file to automatically update the machine learning model to the device

    <meta-data
        android:name="com.huawei.hms.ml.DEPENDENCY"
        android:value="dsc" />

1.5 Apply for the camera permission and the permission to read local images

    <uses-permission android:name="android.permission.CAMERA" />
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />

2. Code Development

2.1 Create a text box detection/correction analyzer

    MLDocumentSkewCorrectionAnalyzerSetting setting = new MLDocumentSkewCorrectionAnalyzerSetting.Factory().create();
    MLDocumentSkewCorrectionAnalyzer analyzer = MLDocumentSkewCorrectionAnalyzerFactory.getInstance().getDocumentSkewCorrectionAnalyzer(setting);

2.2 Create an MLFrame object from an android.graphics.Bitmap for the analyzer to detect images. Supported image formats are jpg, jpeg, and png. The recommended image size is not less than 320 x 320 pixels and not more than 1920 x 1920 pixels.

    MLFrame frame = MLFrame.fromBitmap(bitmap);
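
For reference, here is a minimal sketch of how the Bitmap might be obtained from a gallery Uri before building the MLFrame. The Uri source and the 1920-pixel down-scaling cap are illustrative assumptions, not part of the official sample:

    // Sketch: decode a user-selected image into a Bitmap and wrap it in an MLFrame.
    // The Uri source and the 1920-pixel cap are illustrative assumptions.
    private MLFrame buildFrameFromUri(Context context, Uri imageUri) throws IOException {
        Bitmap bitmap = MediaStore.Images.Media.getBitmap(context.getContentResolver(), imageUri);
        int maxSide = Math.max(bitmap.getWidth(), bitmap.getHeight());
        if (maxSide > 1920) {
            // Scale down so that neither side exceeds the recommended 1920-pixel limit.
            float ratio = 1920f / maxSide;
            bitmap = Bitmap.createScaledBitmap(bitmap,
                    Math.round(bitmap.getWidth() * ratio),
                    Math.round(bitmap.getHeight() * ratio), true);
        }
        return MLFrame.fromBitmap(bitmap);
    }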

2.3 Call the asyncDocumentSkewDetect asynchronous method or the analyseFrame synchronous method to detect the text box. When the return code is MLDocumentSkewCorrectionConstant.SUCCESS, the coordinates of the four vertices of the text box are returned. These coordinates are relative to the input image; if they are inconsistent with the device coordinates, the caller needs to convert them, otherwise the returned data is meaningless.

    // Asynchronous call to asyncDocumentSkewDetect.
    Task<MLDocumentSkewDetectResult> detectTask = analyzer.asyncDocumentSkewDetect(mlFrame);
    detectTask.addOnSuccessListener(new OnSuccessListener<MLDocumentSkewDetectResult>() {
        @Override
        public void onSuccess(MLDocumentSkewDetectResult detectResult) {
            // Detection succeeded.
        }
    }).addOnFailureListener(new OnFailureListener() {
        @Override
        public void onFailure(Exception e) {
            // Detection failed.
        }
    });

    // Synchronous call to analyseFrame.
    SparseArray<MLDocumentSkewDetectResult> detect = analyzer.analyseFrame(mlFrame);
    if (detect != null && detect.get(0).getResultCode() == MLDocumentSkewCorrectionConstant.SUCCESS) {
        // Detection succeeded.
    } else {
        // Detection failed.
    }
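
As noted in step 2.3, the returned vertices are in the coordinate space of the input image. If you want to draw them on a preview view of a different size, they need to be scaled first; the helper below is only an illustrative sketch of that conversion (the method name is hypothetical):

    // Illustrative only: scale a detected vertex from image coordinates to view coordinates.
    private Point mapToView(Point imagePoint, int imageWidth, int imageHeight, View view) {
        float scaleX = (float) view.getWidth() / imageWidth;
        float scaleY = (float) view.getHeight() / imageHeight;
        return new Point(Math.round(imagePoint.x * scaleX), Math.round(imagePoint.y * scaleY));
    }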

2.4 After detection succeeds, obtain the coordinates of the four vertices of the text box and add them to a List in clockwise order starting from the upper-left corner (upper left, upper right, lower right, lower left). Then construct an MLDocumentSkewCorrectionCoordinateInput object.

2.4.1 If you used the analyseFrame synchronous call, first obtain the detection result as shown below (if you used the asyncDocumentSkewDetect asynchronous call, skip this step and go to step 2.4.2):

    MLDocumentSkewDetectResult detectResult = detect.get(0);

2.4.2 Obtain the coordinate data of the four vertices of the text box and build the MLDocumentSkewCorrectionCoordinateInput object:

    Point leftTop = detectResult.getLeftTopPosition();
    Point rightTop = detectResult.getRightTopPosition();
    Point leftBottom = detectResult.getLeftBottomPosition();
    Point rightBottom = detectResult.getRightBottomPosition();
    List<Point> coordinates = new ArrayList<>();
    coordinates.add(leftTop);
    coordinates.add(rightTop);
    coordinates.add(rightBottom);
    coordinates.add(leftBottom);
    MLDocumentSkewCorrectionCoordinateInput coordinateData = new MLDocumentSkewCorrectionCoordinateInput(coordinates);

2.5 Call the asyncDocumentSkewCorrect asynchronous method or the syncDocumentSkewCorrect synchronous method to correct the text box.

    // Asynchronous call to asyncDocumentSkewCorrect.
    Task<MLDocumentSkewCorrectionResult> correctionTask = analyzer.asyncDocumentSkewCorrect(mlFrame, coordinateData);
    correctionTask.addOnSuccessListener(new OnSuccessListener<MLDocumentSkewCorrectionResult>() {
        @Override
        public void onSuccess(MLDocumentSkewCorrectionResult refineResult) {
            // Correction succeeded.
        }
    }).addOnFailureListener(new OnFailureListener() {
        @Override
        public void onFailure(Exception e) {
            // Correction failed.
        }
    });

    // Synchronous call to syncDocumentSkewCorrect.
    SparseArray<MLDocumentSkewCorrectionResult> correct = analyzer.syncDocumentSkewCorrect(mlFrame, coordinateData);
    if (correct != null && correct.get(0).getResultCode() == MLDocumentSkewCorrectionConstant.SUCCESS) {
        // Correction succeeded.
    } else {
        // Correction failed.
    }

2.6 After the detection is complete, stop the analyzer and release the detection resources.

    if (analyzer != null) {
        analyzer.stop();
    }
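
To tie steps 2.1 to 2.6 together, here is a rough end-to-end sketch using the synchronous calls. It assumes that MLDocumentSkewCorrectionResult#getCorrected() returns the corrected Bitmap; please verify that accessor against the API reference of your SDK version:

    // Rough end-to-end sketch of steps 2.1-2.6 using the synchronous calls.
    // Assumption: getCorrected() returns the corrected Bitmap; check your SDK's API reference.
    private Bitmap detectAndCorrect(Bitmap bitmap) throws IOException {
        MLDocumentSkewCorrectionAnalyzerSetting setting =
                new MLDocumentSkewCorrectionAnalyzerSetting.Factory().create();
        MLDocumentSkewCorrectionAnalyzer analyzer =
                MLDocumentSkewCorrectionAnalyzerFactory.getInstance().getDocumentSkewCorrectionAnalyzer(setting);
        try {
            MLFrame frame = MLFrame.fromBitmap(bitmap);
            // Step 2.3: detect the text box.
            SparseArray<MLDocumentSkewDetectResult> detect = analyzer.analyseFrame(frame);
            if (detect == null || detect.get(0).getResultCode() != MLDocumentSkewCorrectionConstant.SUCCESS) {
                return null;
            }
            MLDocumentSkewDetectResult detectResult = detect.get(0);
            // Step 2.4: add the vertices clockwise, starting from the upper-left corner.
            List<Point> coordinates = new ArrayList<>();
            coordinates.add(detectResult.getLeftTopPosition());
            coordinates.add(detectResult.getRightTopPosition());
            coordinates.add(detectResult.getRightBottomPosition());
            coordinates.add(detectResult.getLeftBottomPosition());
            MLDocumentSkewCorrectionCoordinateInput coordinateData =
                    new MLDocumentSkewCorrectionCoordinateInput(coordinates);
            // Step 2.5: correct the document.
            SparseArray<MLDocumentSkewCorrectionResult> correct =
                    analyzer.syncDocumentSkewCorrect(frame, coordinateData);
            if (correct == null || correct.get(0).getResultCode() != MLDocumentSkewCorrectionConstant.SUCCESS) {
                return null;
            }
            return correct.get(0).getCorrected();
        } finally {
            // Step 2.6: release detection resources.
            analyzer.stop();
        }
    }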

Demo effect

The following demo shows a document scanned at an oblique angle; document correction adjusts it to a frontal view. Isn't the effect great?


Document correction technology can also assist document recognition: it adjusts skewed documents to a frontal perspective, quickly converting paper files into electronic ones and greatly improving the efficiency of information entry.
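
Since correction is often followed by recognition (as in the previous article), here is a hedged sketch of feeding the corrected bitmap into the HMS ML Kit text recognition analyzer. It assumes the text recognition SDK dependency is added separately and that correctedBitmap holds the output of the correction step; it is illustrative rather than part of the official sample:

    // Sketch: pass the corrected bitmap to the text recognition SDK from the previous article.
    // Assumes the text recognition dependency is added separately.
    MLTextAnalyzer textAnalyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer();
    MLFrame textFrame = MLFrame.fromBitmap(correctedBitmap);
    Task<MLText> textTask = textAnalyzer.asyncAnalyseFrame(textFrame);
    textTask.addOnSuccessListener(new OnSuccessListener<MLText>() {
        @Override
        public void onSuccess(MLText mlText) {
            // Recognized text of the whole image.
            String recognizedText = mlText.getStringValue();
        }
    }).addOnFailureListener(new OnFailureListener() {
        @Override
        public void onFailure(Exception e) {
            // Recognition failed.
        }
    });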

GitHub source code

https://github.com/HMS-Core/hms-ml-demo/blob/master/MLKit-Sample/module-text/src/main/java/com/mlkit/sample/activity/DocumentSkewCorretionActivity.java

For more detailed development guidance, please refer to the official website of the Huawei Developer Alliance:

https://developer.huawei.com/consumer/cn/hms/huawei-mlkit

For more details, please visit:
Official website of the Huawei Developer Alliance: https://developer.huawei.com/consumer/cn/hms
Development guidance documents: https://developer.huawei.com/consumer/cn/doc/development
Developer discussions on the Reddit community: https://www.reddit.com/r/HMSCore/
Demos and sample code on GitHub: https://github.com/HMS-Core
Integration problems on Stack Overflow: https://stackoverflow.com/questions/tagged/huawei-mobile-services?tab=Newest


Source: https://developer.huawei.com/consumer/cn/forum/topicview?tid=0202344452930050418&fid=18
Author: leave leaves
