[Full-Code Guide] Using Baidu Brain handwriting recognition to help businesses cut costs and improve efficiency

I. Requirements:

In the intelligent information age, most SMEs use ERP and other office software to digitize paper-based content and manage data, moving toward a paperless office. On closer inspection, however, SMEs still have some work processes that could be improved.

For example, when a company recruits staff, it first prints an application form, then has each candidate fill it in, and finally a clerk enters the form's content into the ERP system. That last step, the clerk typing each candidate's details into the ERP system, is quite time-consuming; if many candidates apply on the same day, the clerk spends a great deal of time on data entry.

If Baidu's handwriting recognition is combined with [IOCR custom template character recognition] to recognize the content of the application form and fill the results into the corresponding software, or to generate EXCEL or other electronic documents, the clerk only needs to check that the recognized content is correct and can then save or import it with one click. Candidate information is entered quickly and the clerk's data-entry workload drops considerably.

Similarly, handwritten leave requests, office purchase orders, employees' daily wage details, and other forms can be recognized with Baidu handwriting recognition, greatly reducing the clerk's workload and improving efficiency.

For individuals, [handwritten character recognition] can digitize personal meeting minutes, speech drafts, and similar material, which can then be stored permanently with [Baidu network disk]. Some writers, for example, are used to writing with a pen and not to typing on a computer; they can use [handwritten character recognition] to turn their manuscripts into text, tidy it up a little, and upload it to the appropriate platform.

In schools, marking students' compositions is time-consuming and laborious for teachers. Every student's handwriting is different, and sometimes just deciphering the text takes a lot of effort, so teachers often spend much of their energy on reading the handwriting. If [handwritten character recognition] were used to digitize compositions into uniform standard text, displayed with [eye protection mode] or similar aids, teachers could read and correct them far more easily, spend less energy on deciphering, and focus more on the ideas and content of the writing to discover good essays. If feasible, this approach could even be tried for marking essays in the national college entrance examination.


II. Value:

1. Baidu's [handwritten character recognition], combined with [IOCR custom template character recognition], context-based AI error correction, and other functions, can greatly reduce the text-entry workload of clerks and improve clerical efficiency. It suits most small and medium-sized enterprises.

2. If [handwritten character recognition] could run offline or be deployed on a company's own servers, its prospects would be even greater.

3. With Baidu's [handwritten character recognition], individuals and professional writers can digitize speech drafts, notes, and so on, and store them permanently with [Baidu network disk] or similar services.

4. [Handwritten character recognition] could be tried for marking students' essays, letting teachers focus on the ideas and content of a composition and discover good essays. It could even be extended to marking national college entrance examination essays, which would both relieve teachers' marking pressure and surface more essays with real thought and substance.


III. Usage guide

Note: This article uses C#; the development environment is .NET Core 2.1.

1. Platform access

Getting access is fairly simple. You can follow the forum post linked below, specifically its [Create Application] step (mainly to obtain the APPID and the other credentials that are needed when calling the API). The later steps differ somewhat because I use the C# SDK on the .NET Core platform; if I have time, I will write a separate tutorial for that: https://ai.baidu.com/forum/topic/show/867951 (thanks to the author of that post)

2. Interface call description

Official character recognition documentation (C# SDK): https://ai.baidu.com/docs#/OCR-Csharp-SDK/top

(1) Interface description

Recognizes handwritten Chinese characters and digits.

 

(2) Installing the character recognition C# SDK

Method 1: Manage the dependency with NuGet (recommended)

Search for Baidu.AI in NuGet and install the latest version.

Package address: https://www.nuget.org/packages/Baidu.AI/


Method 2: Download and install manually

Character recognition C# SDK directory structure

Baidu.Aip
├── net35
│   ├── AipSdk.dll             // Baidu AI service Windows DLL
│   ├── AipSdk.xml             // documentation comments file
│   └── Newtonsoft.Json.dll    // third-party dependency
├── net40
├── net45
└── netstandard2.0
    ├── AipSdk.deps.json
    └── AipSdk.dll

1. Download the C# SDK archive from the official website: http://ai.baidu.com/sdk#ocr

2. After extracting it, add AipSdk.dll and Newtonsoft.Json.dll to your project as references.


(3) Create the client class

// Set APPID / AK / SK
var APP_ID = "Your App ID";
var API_KEY = "Your Api Key";
var SECRET_KEY = "Your Secret Key";

var client = new Baidu.Aip.Ocr.Ocr(API_KEY, SECRET_KEY);
client.Timeout = 60000; // change the timeout


(4) Calling code

public void HandwritingDemo() {
    var image = File.ReadAllBytes("Image File Path");
    // Call handwriting recognition; network and other errors may throw, so wrap the call in try/catch
    var result = client.Handwriting(image);
    Console.WriteLine(result);
    // Optional parameters, if any
    var options = new Dictionary<string, object> {
        { "recognize_granularity", "big" }
    };
    // Call handwriting recognition with the optional parameters
    result = client.Handwriting(image, options);
    Console.WriteLine(result);
}
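The comment in the calling code recommends try/catch. A minimal sketch of such a wrapper follows. `SafeRecognition` and `TryRecognize` are hypothetical names that are not part of the SDK, and the sketch assumes that `client.Handwriting` returns a Newtonsoft `JObject`, as the indexing in the later code suggests. Passing the call in as a delegate keeps the helper independent of the SDK.

```csharp
using System;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of the Baidu SDK.
public static class SafeRecognition
{
    // Runs the supplied recognizer and returns null instead of throwing when the call fails.
    public static JObject TryRecognize(Func<byte[], JObject> recognize, byte[] image)
    {
        try
        {
            return recognize(image);
        }
        catch (Exception ex)
        {
            // Network failures, timeouts, and similar errors end up here.
            Console.Error.WriteLine("Handwriting recognition failed: " + ex.Message);
            return null;
        }
    }
}
```

With the client from step (3), the call would look like `var result = SafeRecognition.TryRecognize(bytes => client.Handwriting(bytes), image);`, followed by a null check before the result is used.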


(5) Example response


{
    "log_id": 620759800,
    "words_result": [
        {
            "location": {
                "left": 56,
                "top": 0,
                "width": 21,
                "height": 210
            },
            "words": "3"
        }
    ],
    "words_result_num": 1
}
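To show how a response of this shape can be read, here is a small sketch that pulls the `words` value out of each entry in `words_result`. It assumes Newtonsoft.Json (the library the SDK ships with); `HandwritingResponse` and `ExtractWords` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of the Baidu SDK.
public static class HandwritingResponse
{
    // Returns the recognized text of each entry in "words_result", in response order.
    public static List<string> ExtractWords(string json)
    {
        JObject response = JObject.Parse(json);
        var items = response["words_result"] as JArray;
        if (items == null)
        {
            // No results array in the response.
            return new List<string>();
        }
        return items.Select(item => (string)item["words"]).ToList();
    }
}

public static class ResponseDemo
{
    public static void Main()
    {
        string json = @"{
            ""log_id"": 620759800,
            ""words_result"": [
                { ""location"": { ""left"": 56, ""top"": 0, ""width"": 21, ""height"": 210 }, ""words"": ""3"" }
            ],
            ""words_result_num"": 1
        }";
        foreach (string words in HandwritingResponse.ExtractWords(json))
        {
            Console.WriteLine(words);
        }
    }
}
```

Run against the sample response above, this prints the single recognized string, 3.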


IV. Key code examples

1. Key code for the front-end page (.cshtml)

Because HTML source cannot be displayed here, only a brief description is given:

The page body is a form. Its enctype attribute must be set to "multipart/form-data"; otherwise images cannot be uploaded.

The form contains two controls:

An input with type="file", used to upload the image;

An input with type="submit", used to submit the form and return the recognition results.


2. Key code for the back-end call (.cshtml.cs)

        [BindProperty]
        [Required]
        public IFormFile FileUpload { get; set; }
        // Supplied through the page model's constructor (not shown here).
        private readonly IHostingEnvironment HostingEnvironment;
        public List<string> msg = new List<string>();
        public string curPath { get; set; }

        public async Task<IActionResult> OnPostHandwritingAsync()
        {
            msg = new List<string>();
            // Perform an initial check to catch FileUpload class attribute violations.
            if (!ModelState.IsValid)
            {
                return Page();
            }

            string webRootPath = HostingEnvironment.WebRootPath; // wwwroot directory
            var fileDir = Path.Combine(webRootPath, "relative directory on the server where images are saved, e.g. BaiduPicture");
            if (!Directory.Exists(fileDir))
            {
                Directory.CreateDirectory(fileDir);
            }
            string extension = Path.GetExtension(FileUpload.FileName);
            string imgName = Guid.NewGuid().ToString("N") + extension;
            var filePath = Path.Combine(webRootPath, "relative directory on the server where images are saved, e.g. BaiduPicture", imgName);

            // The virtual directory mapping must first be enabled in Configure() in Startup.cs.
            curPath = Path.Combine("relative URL of the image directory on the server, e.g. /BaiduPicture/", imgName);

            using (var fileStream = new FileStream(filePath, FileMode.Create, FileAccess.Write))
            {
                await FileUpload.CopyToAsync(fileStream);
            }

            // Set AK / SK
            var client = new Baidu.Aip.Ocr.Ocr("Your Api Key", "Your Secret Key");
            var image = System.IO.File.ReadAllBytes(filePath);
            // Call handwriting recognition with the local image as the parameter;
            // network and other errors may throw, so wrap the call in try/catch
            var result = client.Handwriting(image); // handwriting recognition

            List<JToken> msgList = result["words_result"].ToList();
            msg.Add("Handwriting recognition results:\n");
            foreach (JToken ms in msgList)
            {
                msg.Add(ms["words"].ToString());
            }
            return Page();
        }
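
The code above notes that serving the saved images requires some setup in Startup.cs. The fragment below is only a sketch of what that could look like on .NET Core 2.1, assuming the images are saved in a folder under wwwroot (for example wwwroot/BaiduPicture); the author's actual Startup.cs is not shown in the article.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Registers MVC services, which include Razor Pages.
        services.AddMvc();
    }

    public void Configure(IApplicationBuilder app, IHostingEnvironment env)
    {
        // Serves files under wwwroot, so wwwroot/BaiduPicture/x.jpg
        // becomes reachable at /BaiduPicture/x.jpg.
        app.UseStaticFiles();
        app.UseMvc();
    }
}
```

If the images were stored outside wwwroot, `UseStaticFiles` would need a `StaticFileOptions` with its own `FileProvider` and `RequestPath` instead.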

V. Test results

1. Page:

2. Recognition results:

(1)

(2)

Note: This guide only covers how to call handwriting recognition, so it does not go into post-processing of the recognized text. To get more out of the results, you can write them to a string and use regular expressions to extract the relevant fields, or go on to export them to an EXCEL file.
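
The post-processing described in the note can be sketched roughly as follows. This is an illustration under assumptions, not the author's code: `RecognizedText`, its methods, and the sample strings are hypothetical, the regular expression only handles a label followed by digits, and CSV (which EXCEL can open) stands in for a real EXCEL export.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for post-processing recognized text.
public static class RecognizedText
{
    // A label (a run of characters that are neither digits nor whitespace) followed by a run of digits.
    private static readonly Regex LabelNumber = new Regex(@"([^\d\s]+)\s*(\d+)");

    // Splits text such as "Age 25 Tel 13800000000" into (label, number) pairs.
    public static List<KeyValuePair<string, string>> ExtractPairs(string text)
    {
        return LabelNumber.Matches(text)
            .Cast<Match>()
            .Select(m => new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value))
            .ToList();
    }

    // Builds one CSV line, quoting every field and doubling embedded quotes.
    public static string ToCsvLine(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(f => "\"" + f.Replace("\"", "\"\"") + "\""));
    }
}

public static class PostProcessDemo
{
    public static void Main()
    {
        // Made-up stand-ins for the "words" values returned by the service.
        var words = new List<string> { "Age 25", "Tel 13800000000" };
        string text = string.Join(" ", words);
        foreach (var pair in RecognizedText.ExtractPairs(text))
        {
            Console.WriteLine(RecognizedText.ToCsvLine(new[] { pair.Key, pair.Value }));
        }
    }
}
```

For the sample input this prints the lines "Age","25" and "Tel","13800000000". A real form would need patterns tailored to its own fields.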

VI. Suggestions for improvement

1. Existing problems:

(1) In result (1), content that should normally be recognized as two records was merged into one. This happens especially with information arranged as [text - digits - text - digits]: if the text and the digits are written close together, they are easily recognized as a single item. This needs improvement.

(2) Radicals are another example. A character such as "Mother" is made up of components that are themselves characters, and it may be recognized as separate characters ("female good"). This also needs improvement. (Thanks to friend 134******14 for the reminder)

(3) The recognition rate for somewhat sloppy handwriting is currently not very high and needs improvement.

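Because computers, mobile phones, and other electronic devices are so widespread, most people are used to typing and write by hand far less. As a result, many people's handwriting is a flamboyant scrawl that is hard to recognize (my own is very sloppy; sometimes even I cannot read it...). Moreover, job applicants at SMEs currently tend to have a relatively low overall level of education (many openings are for general laborers), and some cannot write at all, so handwriting varies enormously. In testing, Baidu's handwriting recognition proved quite capable, but it is still some distance from being usable in real work.

2. Suggestions for improvement

(1) Combine it with [IOCR custom template character recognition] to recognize template content intelligently and extract it in a structured form that is easy for developers to call (handwritten digit recognition already seems to be supported; I hope support for handwritten text will be added soon).

(2) Formatted output, or functions such as one-click export to EXCEL documents, would make integration with ERP and other software easier.

(3) AI could be used to identify wrongly written characters from context and to correct errors or sentence problems, improving the recognition results.

(4) If [handwritten character recognition] could run offline or be deployed on a company's own servers, more companies would be willing to try it, and it could also be applied to areas with stricter confidentiality requirements, such as [financial statements].

(5) Combine [handwritten character recognition] with tools such as [Baidu network disk] and [eye protection mode] to store the notes, speech drafts, compositions, and similar material of individuals and professional writers digitally, making them easier to view and read.

Author: 让天涯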

作者: 让天涯


Origin www.cnblogs.com/AIBOOM/p/12020106.html