Calling the Baidu API from Python to recognize text from a camera

About text recognition

A friend of mine wanted to reduce the workload of night-shift nurses by building a platform for monitoring recovering patients. I wondered whether a camera could capture the readings shown on the instrument and send them to the platform, so that nurses could check them at any time. To find out whether this was feasible, I began testing.

Previously, I had tried pytesseract with Tesseract OCR for text recognition (my original goal was a small demo that captures and recognizes text automatically), but the recognition quality was poor. After several days of studying and training with jTessBoxEditor, the accuracy was still not good enough; neither the model nor my training skills were up to the task.

So I gave up on training a model myself and called Baidu's API directly. Its high-precision recognition worked very well.

Five minutes to prepare:

Preparation

Calling the Baidu API requires authentication. Baidu provides 500 free calls per day, which is enough for ordinary users like us. The following describes how to obtain the credentials.

Log in to (or register for) Baidu Smart Cloud and create an application.
Enter the application name, then click Create Now.
Then click Manage Application.
That completes the setup. Copy the three strings shown there (AppID, API Key, and Secret Key); the code needs them.
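
The three strings copied in the last step do not have to be pasted into the script itself. As a minimal sketch, assuming you store them in environment variables (the names `BAIDU_APP_ID`, `BAIDU_API_KEY`, and `BAIDU_SECRET_KEY` and the helper `load_credentials` are hypothetical, not part of the Baidu SDK), they could be loaded like this:

```python
import os

# Hypothetical variable names; choose whatever suits your setup.
ENV_NAMES = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")


def load_credentials(env=None):
    """Return (app_id, api_key, secret_key) read from environment variables."""
    env = os.environ if env is None else env
    # Collect every missing or empty name so the error message is complete.
    missing = [name for name in ENV_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in ENV_NAMES)
```

The result can then be unpacked into the client constructor, for example `AipOcr(*load_credentials())`, which keeps the keys out of any code you share.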

You also need to install the baidu-aip module for Python.

The installation method is as follows:

pip install baidu-aip

Code

Finish the code in 10 minutes

The following code opens the camera and performs basic text recognition:

# -*- coding: utf-8 -*-
"""
Created on Fri Oct 18 13:41:50 2019

@author: .xia
"""

import cv2
from aip import AipOcr

APP_ID = 'your AppID'
API_KEY = 'your API Key'
SECRET_KEY = 'your Secret Key'
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

# Open the camera; if an external camera does not respond, change 0 to 1
cap = cv2.VideoCapture(0)
i = 0  # index of the next frame to save
x = 1  # index of the next saved frame to send for recognition
while True:
    # ret: True or False, whether a frame was read
    # frame: the captured frame
    ret, frame = cap.read()
    if not ret:
        break
    # Convert to grayscale (keep the result), then display it
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('capture', frame)
    # Save the image
    cv2.imwrite(r'C:\test\image\i' + str(i) + '.png', frame)
    print(i)
    i = i + 1
    # Read a saved image back and recognize it
    if i - 1 > x:
        with open(r'C:\test\image\i' + str(x) + '.png', 'rb') as z:
            img = z.read()
        # message = client.basicGeneral(img)  # standard accuracy
        message = client.basicAccurate(img)  # high-accuracy recognition
        # message = client.numbers(img)  # high-accuracy digit recognition
        # Use a separate loop variable so the frame counter i is not overwritten
        final = [item['words'] for item in message.get('words_result', [])]
        print(final)
        x = x + 1

    # cv2.waitKey(1) refreshes the window and returns the current key code
    # 0xFF is a bit mask that keeps only the low byte of that code
    # ord('q') returns the code point of 'q' (113)
    # Press 'q' to close the camera
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and destroy the window
cap.release()
cv2.destroyAllWindows()
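
The loop above reads `words_result` straight from the response. When a call fails (for example because of bad credentials or an exhausted quota), the response is expected to carry `error_code` and `error_msg` instead. The following is a minimal sketch of a helper that separates the two cases; `extract_words` is a hypothetical name, and the response shapes are assumptions based on the fields used above.

```python
def extract_words(response):
    """Return the recognized text lines from an OCR response dictionary.

    Assumes a successful response looks like
    {'words_result': [{'words': '...'}, ...]} and that a failed one
    contains 'error_code' (and usually 'error_msg') instead.
    """
    if "error_code" in response:
        raise RuntimeError(
            "OCR request failed: {} {}".format(
                response["error_code"], response.get("error_msg", "")
            )
        )
    # An empty or missing result simply yields an empty list.
    return [item["words"] for item in response.get("words_result", [])]
```

In the loop, `final = extract_words(message)` would then either give the list of lines or stop with a readable error rather than failing on a missing key.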

In testing, the recognized text lags the live image by 1 to 3 seconds.
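
Because the loop sends a frame to the API on every iteration once it starts recognizing, the 500 free calls per day would run out quickly. One possible mitigation, sketched here under the assumption that one recognition every few seconds is enough, is to throttle the calls. `Throttle` is a hypothetical helper, not part of OpenCV or the Baidu SDK.

```python
import time


class Throttle:
    """Allow an action at most once every `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock  # injectable so the class can be tested without waiting
        self._last = None

    def ready(self):
        """Return True if enough time has passed, and record the time if so."""
        now = self.clock()
        if self._last is None or now - self._last >= self.interval:
            self._last = now
            return True
        return False
```

In the capture loop, the OCR call could be wrapped in `if throttle.ready():` with `throttle = Throttle(3.0)` created before the loop, so the preview keeps updating while recognition runs only every three seconds.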

Of course, code thrown together this quickly is limited in what it can do. For accurate recognition, the higher the camera resolution, the better. Getting a practically usable result also requires further optimization and algorithms that work together with the frame capture.
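
As one small example of such post-processing, the recognized lines could be reduced to the numeric readings before they are sent to the platform. This is only a sketch: `extract_numbers` is a hypothetical helper, and the sample strings in the usage note are made up for illustration, not real instrument output.

```python
import re

# A number that does not directly follow a letter, digit, or dot,
# so the "2" in a label such as "SpO2" is not treated as a reading.
NUMBER = re.compile(r"(?<![A-Za-z\d.])-?\d+(?:\.\d+)?")


def extract_numbers(lines):
    """Return every standalone number found in the recognized lines as floats."""
    readings = []
    for line in lines:
        readings.extend(float(match) for match in NUMBER.findall(line))
    return readings
```

For instance, calling `extract_numbers(["HR 72", "SpO2 98.5%"])` keeps 72 and 98.5 and skips the digit inside the label.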

Original link: https://blog.csdn.net/xiahuayong/article/details/103092450
Related link: https://cloud.baidu.com/product/ocr/general
