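Preface

I wanted to do image text recognition (OCR) in Python. It should have been a simple task, but after browsing CSDN for a long time I could not find an approach that suited me. Here is what I worked out in practice.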

Using the Baidu API (OCR)

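Official resources:
Baidu API documentation
 Master text recognition in just 10 minutes
Instead of Postman, the API requests are made directly in Python.
1. Obtain the Access Token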

import requests

# Token endpoint: client_id is the API Key, client_secret is the Secret Key
host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=BxtOnsdozbj1wXzgkLGZoNXy&client_secret=BuZ2viTzNMua2d8senfEZvQ8c4FN2MVm'
headers = {
    'Content-Type': 'application/json;charset=UTF-8'
}
# Request the token and parse the JSON response
res = requests.get(url=host, headers=headers).json()
access_token = res['access_token']
print(access_token)

This prints:

24.1644dccdf293ec893b35add529a9ad4b.2592000.1616737322.282335-23697675
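
As a complement to step 1, here is a minimal sketch that builds the token URL from its parts instead of pasting the keys into one long string, and that fails loudly when the response carries no token. The helper names `build_token_url` and `extract_token` are hypothetical, and the assumption that a failed request returns JSON without an `access_token` field should be checked against the Baidu documentation.

```python
from urllib.parse import urlencode

TOKEN_ENDPOINT = 'https://aip.baidubce.com/oauth/2.0/token'


def build_token_url(api_key: str, secret_key: str) -> str:
    """Assemble the token URL from the API Key and Secret Key (hypothetical helper)."""
    query = urlencode({
        'grant_type': 'client_credentials',
        'client_id': api_key,
        'client_secret': secret_key,
    })
    return f'{TOKEN_ENDPOINT}?{query}'


def extract_token(payload: dict) -> str:
    """Return the access token from the parsed JSON, or raise if it is absent."""
    if 'access_token' not in payload:
        # Assumption: an error response is JSON that lacks 'access_token'
        raise RuntimeError(f'no access_token in response: {payload}')
    return payload['access_token']
```

The URL returned by `build_token_url` can be passed to `requests.get` exactly as in step 1, and the parsed `.json()` result handed to `extract_token`.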

2. Call the API (POST)

The request URL is:
https://aip.baidubce.com/rest/2.0/ocr/v1/accurate?access_token=【the access_token you obtained】
Requesting this URL with the access_token from above, and nothing else, returns:

{"log_id": 2962343516436931512, "error_code": 216101, "error_msg": "param image not exist"}

The cause is the missing image parameter, which has to be added to the request body.
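
Because failures come back as ordinary JSON with `error_code` and `error_msg` fields, it can help to check for them before using the result. The following is a minimal sketch; `check_ocr_response` is a hypothetical helper name, and it assumes that a successful response never contains an `error_code` key.

```python
import json


def check_ocr_response(text: str) -> dict:
    """Parse the response body and raise if it describes an error."""
    payload = json.loads(text)
    if 'error_code' in payload:
        # Assumption: only error responses carry 'error_code'
        raise RuntimeError(
            f"OCR error {payload['error_code']}: {payload.get('error_msg')}"
        )
    return payload
```

Feeding it the response shown above raises a `RuntimeError` that mentions code 216101, which is easier to notice than a `KeyError` further down the script.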

3. Add the parameter

import urllib.parse
import urllib.request

params = {'image': imgR}
params2 = urllib.parse.urlencode(params).encode(encoding='UTF8')
request = urllib.request.Request(url, params2)

Here imgR is the image to be recognized; it must be Base64-encoded.
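
One detail is worth seeing in isolation: Base64 output can contain `+`, `/` and `=`, which have special meaning in a form body, so `urlencode` percent-escapes them. The sketch below uses a hypothetical helper, `build_form_body`, and three arbitrary bytes chosen only because their Base64 form consists entirely of `+` and `/`.

```python
import base64
from urllib.parse import urlencode


def build_form_body(image_bytes: bytes) -> bytes:
    """Base64-encode raw image bytes and wrap them in a form-encoded body."""
    image_b64 = base64.b64encode(image_bytes)
    # urlencode percent-escapes '+', '/' and '=' in the Base64 text
    return urlencode({'image': image_b64}).encode('utf-8')


# Arbitrary bytes, not a real image; their Base64 form is '+//+'
body = build_form_body(b'\xfb\xff\xfe')
print(body)
```

Skipping the `urlencode` step and sending raw Base64 would let the server read each `+` as a space, which corrupts the image data.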

4. Encode the image and assign it to imgR

Online converters: https://tool.css-js.com/base64.html (uncheck the default "include header" option), or http://imgbase64.duoshitong.com (after converting, remove the "data:image/*;base64," prefix).
Or do it in Python:

import base64

# Read the image as raw bytes, then Base64-encode it
with open(r'1.jpg', 'rb') as f:
    op: bytes = f.read()
imgR = base64.b64encode(op)
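
If the Base64 text comes from one of the online converters instead, the "data:image/*;base64," header has to be removed before sending. A minimal sketch of doing that in code follows; `strip_data_uri_prefix` is a hypothetical helper, and it assumes the header always ends at the first comma.

```python
import base64


def strip_data_uri_prefix(encoded: str) -> str:
    """Drop a leading 'data:...;base64,' header if one is present."""
    if encoded.startswith('data:') and ',' in encoded:
        # Everything after the first comma is the Base64 payload
        return encoded.split(',', 1)[1]
    return encoded


with_header = 'data:image/png;base64,' + base64.b64encode(b'hi').decode('ascii')
print(strip_data_uri_prefix(with_header))
```

Text that has no header passes through unchanged, so the helper is safe to apply to either kind of input.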

5. Add the request header

request.add_header('Content-Type', 'application/x-www-form-urlencoded')

6. Send the request
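
Steps 3, 5 and 6 can be sketched together as two small functions. The names `build_ocr_request` and `send_request` are hypothetical, and the 10-second timeout is an arbitrary choice. Passing `data` to `urllib.request.Request` is what turns the call into a POST.

```python
import urllib.request


def build_ocr_request(url: str, body: bytes) -> urllib.request.Request:
    """Create a POST request carrying a form-encoded body."""
    req = urllib.request.Request(url, data=body)  # data present, so method is POST
    req.add_header('Content-Type', 'application/x-www-form-urlencoded')
    return req


def send_request(req: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return response.read().decode('utf-8')
```

Keeping the construction separate from the network call makes it possible to inspect the request (method, headers, body) without actually contacting the service.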

7. Print the recognized text (it is contained in the returned JSON)
Full code (taken from someone else's article, used here as a learning example)

import base64
import json
import urllib.parse
import urllib.request

import requests

# Step 1: exchange the API Key and Secret Key for an access token
host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=BxtOnsdozbj1wXzgkLGZoNXy&client_secret=BuZ2viTzNMua2d8senfEZvQ8c4FN2MVm'
headers = {
    'Content-Type': 'application/json;charset=UTF-8'
}
res = requests.get(url=host, headers=headers).json()
access_token = res['access_token']
print(access_token)

# Step 2: build the OCR URL (this example uses the "general" endpoint,
# whereas step 2 above showed the "accurate" one)
url = 'https://aip.baidubce.com/rest/2.0/ocr/v1/general?access_token=' + access_token

# Steps 3-4: read the image, Base64-encode it, and form-encode the parameter
with open(r'1.jpg', 'rb') as f:
    op: bytes = f.read()
imgR = base64.b64encode(op)
params = {'image': imgR}
params2 = urllib.parse.urlencode(params).encode(encoding='UTF8')

# Steps 5-6: add the header and send the POST request
request = urllib.request.Request(url, params2)
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
response = urllib.request.urlopen(request)
content: bytes = response.read()
result: str = content.decode()
print(result)

# Step 7: print each recognized line of text
json1 = json.loads(result)
jsonArray = json1['words_result']
for item in jsonArray:
    print(item['words'])
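
The parsing in step 7 can also be isolated into a small function, which makes it easy to try without calling the service. This is a minimal sketch: `extract_words` is a hypothetical helper, and the sample JSON is made up for illustration, mirroring only the `words_result` and `words` fields that the code above reads.

```python
import json
from typing import List


def extract_words(result_text: str) -> List[str]:
    """Return the recognized lines from a response body, or [] if there are none."""
    payload = json.loads(result_text)
    return [item['words'] for item in payload.get('words_result', [])]


# Made-up sample, not real service output
sample = '{"words_result": [{"words": "hello"}, {"words": "world"}]}'
for line in extract_words(sample):
    print(line)
```

Using `.get('words_result', [])` means an error response produces an empty list instead of a `KeyError`.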

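8. Result
The result is not great.
The original image looks like this:
[image]
After recognition, the result looks like this:

Perhaps this is because the picture is a photo I took with my phone, so I tried a PDF image instead.
Error: {"log_id":1364458023761739776,"error_msg":"image format error","error_code":216201}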
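
The "image format error" is consistent with the file not being a raster image at all: a PDF is a document container, not a picture. One way to check before uploading is to look at the first bytes of the file. The sketch below is an illustration only; `sniff_format` is a hypothetical helper, and the assumption that the endpoint accepts JPEG, PNG and BMP should be confirmed against the Baidu documentation.

```python
from typing import Optional

# Leading "magic" bytes of a few common file types
_SIGNATURES = [
    (b'\xff\xd8\xff', 'jpeg'),
    (b'\x89PNG\r\n\x1a\n', 'png'),
    (b'BM', 'bmp'),
    (b'%PDF', 'pdf'),
]

# Assumption: formats accepted by the OCR endpoint
ACCEPTED = {'jpeg', 'png', 'bmp'}


def sniff_format(data: bytes) -> Optional[str]:
    """Guess the file type from its leading bytes, or None if unknown."""
    for magic, name in _SIGNATURES:
        if data.startswith(magic):
            return name
    return None


def is_uploadable(data: bytes) -> bool:
    """True if the bytes look like one of the assumed accepted image formats."""
    return sniff_format(data) in ACCEPTED
```

If the check reports `pdf`, the page would need to be converted to an image first, for example by exporting it as a PNG or JPEG.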
To be continued...
