This article shows how to use the text recognition (OCR) feature of Baidu AI to recognize ID cards. If you are interested, read on to see the result.
1. Install baidu-aip module
Press Win+R, open cmd, and enter:
pip3 install baidu-aip
If the following interface appears, the baidu-aip module has been successfully installed:
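The installation can also be checked from Python itself. The sketch below only tests whether a module named aip (the import name used later in this article) can be located; the helper name is_installed is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # 'aip' is the module that the baidu-aip package is imported as.
    print("aip available:", is_installed("aip"))
```

If this prints False, the pip3 command above most likely installed into a different Python environment than the one running the check.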
If you want to get to the ID card recognition code quickly, you can skip Section 2 and read Section 3 first.
2. Obtain Baidu AI interface key
The Python code that recognizes the ID card uses the Baidu AI interface key in three lines, so this section first explains how to obtain it. Start by opening the Baidu AI text recognition page:
https://ai.baidu.com/tech/ocr
If you have a Baidu account, enter your account and password to log in. If you do not have one, click Register and follow the instructions to register and log in.
After logging in, find the product list under text recognition. Card text recognition is listed there; click it to learn more.
Card text recognition covers many common documents, such as ID cards, bank cards, business licenses, household registration books, and passports. This article covers ID card recognition; interested readers can explore the other document types on their own. The following product list appears in the card text recognition details:
Find ID card recognition and click for details to reach the following interface:
Click Use Now, and the following service agreement will appear:
Click I have agreed to the Baidu AI Open Platform Service Agreement, and you can enter the following interface:
Scroll down and click Card OCR. Find the ID card function there and click the activate button.
You can enter the following screening and activation payment page:
Before confirming the activation, you must first perform real-name verification, and follow the instructions to complete the real-name verification.
Then you can check the identification function to be activated, as follows:
Then click to pay. A number of calls per day are free and billing is post-paid, so there is no need to pay in advance. If the activation is successful, the following interface will appear:
After successful activation, click Create Application in the overview.
Fill in the application name (you can think of a name that fits your application scenario), select the text recognition package name, select the application attribution, fill in the application description, and click Create Now.
Finally, click on the application details to find the interface key we need (the value corresponding to the red box).
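The three values shown in the application details are the ones used as APP_ID, API_KEY, and SECRET_KEY in Section 3. One way to keep them out of the script is to read them from environment variables. This is only a sketch: the variable names BAIDU_APP_ID, BAIDU_API_KEY, and BAIDU_SECRET_KEY and the helper load_baidu_keys are assumptions made for this example, not something Baidu prescribes.

```python
import os


def load_baidu_keys(env=os.environ):
    """Read the three credentials from environment variables.

    The variable names are assumptions chosen for this example.
    """
    names = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

With the variables set, APP_ID, API_KEY, SECRET_KEY = load_baidu_keys() can stand in for the hard-coded strings, which makes it harder to publish the keys by accident.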
3. Call the Baidu interface to identify the ID card
After installing the baidu-aip module and obtaining the Baidu AI interface key, you can call the Baidu interface to recognize the ID card. ID card recognition comes with 500 free calls per day. First, let's take a look at the ID card to be recognized.
This is a fictitious sample ID card downloaded from Baidu. If it infringes any rights, please contact me and I will delete it. The Python code to recognize the ID card is as follows:
import os
from aip import AipOcr

# Folder where the ID card image is stored; replace with your own path
os.chdir(r'F:\公众号\27.证件识别')

# Baidu application ID and keys; replace with your own
APP_ID = 'XXX'
API_KEY = 'XXXXXXXX'
SECRET_KEY = 'XXXXXXXXXXXX'

# Read the image as bytes
with open('2_身份证_v3.jpg', 'rb') as picture:
    img = picture.read()

idCardSide = 'front'  # front side of the ID card
# idCardSide = 'back'  # back side of the ID card

options = {}
options['detect_direction'] = 'true'  # detect image orientation (off by default)
options['detect_risk'] = 'false'  # whether to enable ID card risk type detection

# Recognize the information in the image
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)
text = client.idcard(img, idCardSide, options)

# Parse the returned dictionary into "label:content" strings
concat_text = []
if isinstance(text, dict):
    words = text['words_result']
    for k, v in words.items():
        tt = u'{k}:{v}'.format(k=k, v=v['words'])
        print(tt)
        concat_text.append(tt)
Note: The path in os.chdir should be replaced with the folder where you store the image, and APP_ID, API_KEY, and SECRET_KEY should be replaced with the Baidu keys you obtained at the end of Section 2.
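The script reads text['words_result'] directly, which fails with a KeyError if the call did not succeed. Below is a slightly more defensive sketch of the parsing step. It assumes that a failed call returns a dictionary containing error_code and error_msg; that shape is an assumption here and should be checked against the Baidu documentation. The function name extract_fields and the sample response are made up for illustration.

```python
def extract_fields(response):
    """Turn an ID card OCR response into a {label: text} dict.

    Raises ValueError when the response carries an error code or has no
    'words_result' dictionary.
    """
    if not isinstance(response, dict):
        raise ValueError("unexpected response type: %r" % type(response))
    if "error_code" in response:  # assumed shape of a failed call
        raise ValueError("OCR error %s: %s" % (
            response["error_code"], response.get("error_msg", "")))
    words = response.get("words_result")
    if not isinstance(words, dict):
        raise ValueError("response has no 'words_result' dict")
    return {label: item.get("words", "") for label, item in words.items()}


# Made-up response with the same shape the script above reads.
sample = {
    "words_result": {
        "姓名": {"words": "张三"},
        "住址": {"words": ""},
    }
}
fields = extract_fields(sample)
for label, content in fields.items():
    print("{}: {}".format(label, content))
```

An empty string, as for the address in this sample, means the field was present in the response but no text was recognized for it.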
Running the recognition script on the sample ID card gives the following result:
Comparing with the original image, the date of birth is taken directly from the ID number, while the address was not recognized, possibly because of glare on the image, so that field is empty. One small aside: I had previously been calling the interface with the ID card image in PNG format, but kept getting the following error:
ConnectionError: ('Connection aborted.', ConnectionResetError(10054, '远程主机强迫关闭了一个现有的连接。', None, 10054, None))
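Since the fix described next depends on the image format, it can help to confirm what format a file really is, because the extension does not always match the encoding. The sketch below checks the leading bytes of the data against the standard JPEG and PNG signatures. The function name sniff_image_format is made up for this example, and whether the format was the actual cause of the error above is not something this check can prove.

```python
JPEG_MAGIC = b"\xff\xd8\xff"       # start of every JPEG file
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"   # fixed 8-byte PNG signature


def sniff_image_format(data):
    """Return 'jpeg', 'png', or 'unknown' based on the leading bytes."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return "unknown"
```

Passing the img bytes read in the script above to sniff_image_format(img) shows which format is actually being uploaded.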
After switching to a JPG image, the problem no longer occurred in my case, so I recommend using JPG when recognizing ID cards. To turn the output into a standard tabular format, the following code normalizes it:
import pandas as pd

# One row per "label:content" string collected above
date_concat_text = pd.DataFrame(concat_text)
date_concat_text.columns = ['text']
# Split on the first colon only, so colons inside the content are kept
df = date_concat_text['text'].str.split(':', n=1, expand=True)
date_concat_text['label'] = df.iloc[:, 0]
date_concat_text['content'] = df.iloc[:, 1]
date_concat_text.to_csv('id_card_to_text.csv')
The result is as follows:
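As a variation on the normalization step, the label and content can be written as separate columns without first joining them into one string and splitting it again. This sketch uses only the standard csv module and assumes the recognized fields are already available as a {label: content} dictionary; the function name fields_to_csv and the sample values are made up for illustration.

```python
import csv
import io


def fields_to_csv(fields, out):
    """Write a {label: content} dict as two-column CSV to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["label", "content"])
    for label, content in fields.items():
        writer.writerow([label, content])


# Made-up sample; an in-memory buffer stands in for a real file.
buffer = io.StringIO()
fields_to_csv({"姓名": "张三", "住址": "某市: 某路 1 号"}, buffer)
print(buffer.getvalue())
```

To write a real file, open it with open('id_card_to_text.csv', 'w', newline='', encoding='utf-8') and pass the file object instead of the buffer.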
This concludes the walkthrough of calling the Baidu interface to recognize an ID card. Interested readers can try it themselves.