Python Extended Tutorial (1): Calling Baidu AI

About AI

Ever since computers were invented, people have wanted them to have human-like perception, consciousness, concepts, thinking, and behavior so that they could take over human work. AI (Artificial Intelligence) is the branch of computer science that researches and develops theories, methods, technologies, and applications for simulating and extending human intelligence.

In terms of research fields and methods, AI is divided into pattern recognition, natural language processing, expert systems, robotics, and so on.

Pattern recognition studies human vision, hearing, and similar senses, and analyzes and recognizes meaningful things in sounds, graphics, and images. Neural networks and deep learning are its main technical methods. Today, speech recognition, face recognition, and related tasks have reached a high level.

Natural language processing studies human language: it analyzes and understands the meaning of language, stores knowledge, and answers questions. ChatGPT applies neural network technology to natural language processing, using a language model with a very large number of parameters, and has achieved amazing results.

Expert systems study human logic and reasoning, and describe the world with knowledge, facts, rules, logic, and reasoning.

A robot is an AI equipped with sensing and action devices (eyes, hands, feet, etc.). Robots of various kinds are already widely used in factories, offices, the military, and homes, and are gradually taking over more and more jobs. Some predict that people in the future will become a fusion of half biological human and half robot.

Although AI has a history of several decades, current AI is still weak artificial intelligence: it can surpass humans only in limited fields and limited environments. One day, strong artificial intelligence, or artificial general intelligence (AGI), may appear and surpass human beings in all fields. That would be a singularity in human history, after which humanity would enter a new era of human and intelligent-machine hybrids.

In China, Baidu AI is one of the leading platforms and offers a free trial. Let's start by learning to use Baidu AI.

1. First, register a Baidu Cloud developer account and claim the free resources

1. Go to https://cloud.baidu.com/ and click "Register" in the upper right corner.

    Follow the on-screen prompts to complete registration. Registration is free and requires verification with a mobile phone number.

2. After registering and logging in, click your account in the upper right corner and complete personal real-name verification. (Without real-name verification, no free AI resources are granted.)

3. Once real-name verification is complete, click the upper left corner and then "Product Service". Baidu Cloud offers many cloud services; the artificial intelligence services are listed on the right.

4. Under "Artificial Intelligence" in "Product Service", click "Speech Technology" to open its page.

On that page, click "Get it" under "Free Trial", select everything under "Waiting interface" (the interfaces not yet claimed), and then click "0 yuan to get" at the bottom.

The free "Speech Technology" resources have now been claimed.

5. Claim the other free AI resources in the same way.

(1) Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Text Recognition" -> "Get it" under "Free Trial" -> Get All.

(2) Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Face Recognition" -> "Get it" under "Free Trial" -> Get All.

(3) Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Human Analysis" -> "Get it" under "Free Trial" -> Get All.

(4) Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Image Recognition" -> "Get it" under "Free Trial" -> Get All.

(5) Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Content Review" -> "Get it" under "Free Trial" -> Get All.

(6) Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Natural Language Processing" -> "Get it" under "Free Trial" -> Get All.

(7) Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Machine Translation" -> "Get it" under "Free Trial" -> Get All.

The main free resources of Baidu AI have now been claimed.

The free quota differs for each type of AI and is listed on the "Overview" page of that AI service. For example, short speech recognition comes with 150,000 free calls, which is enough for learning, development, and small applications.

Note: Baidu AI also limits concurrency through QPS (Queries Per Second), that is, the number of requests to a given type of AI that can be processed per second.
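To stay under a QPS limit, a client can space its own requests out. The following is only a minimal sketch of that idea, not something provided by Baidu or by any library used in this tutorial: the class name QpsThrottle and its parameters are made up for illustration, and the actual limit for each API should be read from its "Overview" page.

```python
import time


class QpsThrottle:
    """Space calls at least 1/qps seconds apart (hypothetical helper)."""

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.min_interval = 1.0 / qps
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last_call = None   # time of the previous call, if any

    def wait(self):
        """Sleep if the previous call was too recent, then record this call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

A script would create one throttle per type of AI, for example throttle = QpsThrottle(2), and call throttle.wait() immediately before each request.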

6. Create an application

Click the upper left corner, then "Product Service" -> "Artificial Intelligence / Speech Technology". On the page that opens, click "Create" under "Create Application".

On the application creation page, enter an application name (anything you like), click "Speech Technology", and check "Select All". Then click "Text Recognition", "Face Recognition", and each of the remaining categories below in turn, checking "Select All" for every one.

Scroll down the form, select "Personal" for "Application Affiliation", and enter "Learning" for "Application Description".

Finally, click "Create Now" at the bottom.

The point of this process is to create an application that has permission to call all the "Speech Technology" APIs, all the "Text Recognition" APIs, and so on for every category.

7. After creating the application, get its two parameters: the API Key and the Secret Key.

After creating the application, click "Return to Application List" to see the list of your applications.

In this example we have created an application called "AITest", which has an API Key and a Secret Key.

Click "Copy" under API Key and paste the API Key, a long string, into a text file.

Then click "Copy" under Secret Key and paste the Secret Key, also a long string, into the same text file.

API Key and Secret Key are two parameters that must be used when calling the API during development.

Each application has a pair of API Key and Secret Key for identity authentication.

The above process only needs to be done once, unless you want to generate multiple pairs of API Key and Secret Key.
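Because the API Key and Secret Key identify your account, it can be safer to keep them out of source files. The sketch below reads them from environment variables instead. The variable names BAIDU_API_KEY and BAIDU_SECRET_KEY and the function load_baidu_keys are assumptions chosen for this example, not names defined by Baidu.

```python
import os


def load_baidu_keys(env=None):
    """Return (api_key, secret_key) read from environment variables.

    The variable names are hypothetical; use whatever names you prefer.
    """
    env = os.environ if env is None else env
    api_key = env.get("BAIDU_API_KEY", "").strip()
    secret_key = env.get("BAIDU_SECRET_KEY", "").strip()
    if not api_key or not secret_key:
        raise RuntimeError("Set BAIDU_API_KEY and BAIDU_SECRET_KEY first")
    return api_key, secret_key
```

After setting the two variables in your shell, api_key, secret_key = load_baidu_keys() gives the same two strings that the examples in this tutorial write directly into the code.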

2. The basic principle of calling an API

API stands for Application Programming Interface: a set of interface functions that a platform provides to developers.

Different platforms provide different APIs. Windows provides the Win32 API, iOS provides the iOS API, and Android provides the Android API. Internet platforms (Baidu Cloud, Alibaba Cloud, ChatGPT, etc.) all provide their own APIs.

In general, Internet platforms expose their APIs over the HTTP protocol; these are called Web APIs. In effect, the platform provides a URL, and the developer sends a request to that URL, submits parameters, and receives the result.
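As a small illustration of "a URL plus parameters", the sketch below builds a request URL with Python's standard library. The endpoint https://api.example.com/v1/echo and its parameters are placeholders invented for this example; they do not belong to any real platform.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_request_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return base_url + "?" + urlencode(params)


# Placeholder endpoint and parameters, for illustration only
url = build_request_url("https://api.example.com/v1/echo",
                        {"text": "hello world", "lang": "en"})
print(url)  # https://api.example.com/v1/echo?text=hello+world&lang=en

# The server side decodes the same query string back into parameters
print(parse_qs(urlsplit(url).query))  # {'text': ['hello world'], 'lang': ['en']}
```

An HTTP library such as requests performs this encoding itself when it is given a params dictionary, so the function above is only meant to show what is sent.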

A platform that provides an API also provides development documentation for it.

For developers, there are two ways to call the API provided by an Internet platform:

1. Use the SDK package provided by the platform.

2. Following the API documentation, send HTTP requests directly, passing parameters and reading results yourself, and build your own development kit on top of that.

Because Baidu's own Python SDK is not easy to use and its demonstration code is hard to read, I wrote a Python library for working with Baidu AI.

3. Install the jojo-ai library using pip

The jojo-ai library is a Python library written by the author for calling AI APIs. It is simple and easy to use.

Please install via pip from the command line:

pip install jojo-ai

The library is installed under the name jojo-ai, but it is imported as ai:

import ai

The jojo-ai library depends on requests, which is installed automatically along with it.

To play sound, installing the playsound library is also recommended:

pip install playsound
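To confirm that the installs succeeded, one option is to ask Python whether it can locate each module. This is just a convenience sketch using the standard library; the helper name missing_modules is made up for this example.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "ai" is the import name of jojo-ai; requests and playsound are its companions
missing = missing_modules(["ai", "requests", "playsound"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All modules found")
```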

4. Use the jojo-ai library to call Baidu AI

1. Calling Baidu AI with the jojo-ai library takes just two steps:

import ai


# Fill in the API Key and Secret Key shown for your application in Baidu Cloud
api_key = 'XXXXXXXXXXXXXXXXXXXXXXXXX'
secret_key = 'XXXXXXXXXXXXXXXXXXXXXXXXX'

# Step 1: create a BaiduAI object from the api_key and secret_key
b = ai.BaiduAI(api_key, secret_key)

# Step 2: call the BaiduAI object's asr() method, which calls Baidu's speech-to-text API
texts = b.asr('images/16k.wav')
print(texts)

2. The main APIs provided by Baidu AI and the corresponding methods of the BaiduAI object in the jojo-ai library

| Category | API | Function | BaiduAI method |
|---|---|---|---|
| Speech technology | Speech recognition | Speech to text | asr() |
| Speech technology | Speech synthesis | Text to speech | tts() |
| Text recognition | General text recognition | Image to text | ocr() |
| Text recognition | ID card recognition | ID card image to text | ocr_id_card() |
| Text recognition | Bank card recognition | Bank card image to text | ocr_bank_card() |
| Face recognition | Face detection | Find the faces in an image | face_detect() |
| Face recognition | Face comparison | Compare two faces | face_match() |
| Face recognition | Face fusion | Face swap | face_merge() |
| Human body analysis | People counting | Count the people in an image | body_count() |
| Human body analysis | Human body detection | Find the human bodies in an image | body_detect() |
| Human body analysis | Human body key point recognition | Find body key points in an image | body_anlysis() |
| Image recognition | General object recognition | Identify the objects in an image | classify() |
| Image recognition | Plant recognition | Identify plant species | classify_plant() |
| Image recognition | Animal recognition | Identify animal breeds | classify_animal() |
| Image recognition | Vehicle model recognition | Identify the car model | classify_car() |
| Image recognition | Wine recognition | Identify red wine | classify_wine() |
| Image recognition | Image subject detection | Identify the main subject | classify_objects() |
| Image recognition | Dish recognition | Identify dishes | classify_dish() |
| Natural language processing | Smart poetry writing | Write a poem (seven-character quatrain) | nlp_poem() |
| Natural language processing | Smart Spring Festival couplets | Write couplets | nlp_couplets() |
| Natural language processing | Holiday greetings | Generate holiday greetings | nlp_bless() |
| Natural language processing | Address analysis | Break an address into its parts | nlp_address() |
| Natural language processing | Sentiment analysis | Analyze the sentiment of a text | nlp_sentiment() |
| Natural language processing | Comment opinion extraction | Extract the main opinions in a comment | nlp_comment() |
| Natural language processing | Lexical analysis | Break a sentence into words | nlp_lexer() |
| Natural language processing | Keyword extraction | Extract the keywords in a sentence | nlp_keywords() |
| Natural language processing | News summary | Summarize a long news article | nlp_summary() |
| Natural language processing | Article tags | Extract tags from an article | nlp_tags() |
| Natural language processing | Article classification | Classify an article automatically | nlp_topic() |

Baidu offers some further APIs that are of less interest; the jojo-ai library does not wrap them yet.

Appendix: explanation of several English abbreviations

ASR (Automatic Speech Recognition): automatic speech recognition

TTS (Text-To-Speech): text to speech

OCR (Optical Character Recognition): text recognition

NLP (Natural Language Processing): natural language processing

3. The examples follow. They need a number of image resources; please download the examples and image resources here

Each method in the library documents its own parameters. For an explanation of the return values, please refer to the corresponding Baidu AI documentation.

import ai
from pprint import pprint  # pprint() prints dicts in a more readable layout


# Fill in the API Key and Secret Key shown for your application in Baidu Cloud
api_key = 'XXXXXXXXXXXXXXXXXXXXXXXXX'
secret_key = 'XXXXXXXXXXXXXXXXXXXXXXXXX'

# Create a BaiduAI object from the api_key and secret_key
b = ai.BaiduAI(api_key, secret_key)

# Call each API in turn
print('====Speech to text')
texts = b.asr('images/16k.wav')
print(texts)

print('====Text to speech')
b.tts('我是北京人')  # "I am from Beijing"

print('====Text recognition')
print(b.ocr("https://www.baidu.com/img/flexible/logo/pc/result.png"))

print('====ID card recognition')
pprint(b.ocr_id_card("images/idcard2.jpg"))

print('====Bank card recognition')
pprint(b.ocr_bank_card("images/bank_card.jpg"))

print('====Face detection')
pprint(b.face_detect("images/face1.jpg", "age,expression"))

print('====Face comparison')
pprint(b.face_match("images/face1.jpg", "images/face2.jpg"))

print('====Face fusion')
pprint(b.face_merge("images/face2.jpg", "images/template.jpg", "images/merge_face.jpg"))

print('====People counting')
pprint(b.body_count("images/bodys.jpg"))

print('====Human body detection')
pprint(b.body_detect("images/bodys2.jpg"))

print('====Human body key point recognition')
pprint(b.body_anlysis("images/body_ana.jpg"))

print('====General object and scene recognition')
pprint(b.classify("images/notebook.jpg"))

print('====Plant recognition')
pprint(b.classify_plant("images/plant3.jpg"))

print('====Animal recognition')
pprint(b.classify_animal("images/animal3.jpg"))

print('====Vehicle model recognition')
pprint(b.classify_car("images/car1.jpg"))

print('====Red wine recognition')
pprint(b.classify_wine("images/wine1.jpg"))

print('====Image subject detection')
pprint(b.classify_objects("images/objects1.jpg"))

print('====Dish recognition')
pprint(b.classify_dish("images/dish1.jpg"))

# The NLP examples below pass Chinese text, as in the original examples
print('====Smart poetry writing (seven-character quatrain)')
pprint(b.nlp_poem("长江望月"))

print('====Smart Spring Festival couplets')
pprint(b.nlp_couplets("长江"))

print('====Holiday greeting generation')
pprint(b.nlp_bless("情人节"))

print('====Address recognition')
pprint(b.nlp_address("上海市浦东新区纳贤路701号百度上海研发中心 F4A000 张三"))

print('====Sentiment analysis')
pprint(b.nlp_sentiment("实在不怎么样"))

print('====Comment opinion extraction')
pprint(b.nlp_comment("三星电脑电池不给力", "3C"))

print('====Lexical analysis')
pprint(b.nlp_lexer("百度是一家高科技公司"))

print('====Keyword extraction')
pprint(b.nlp_keywords("学习书法,就选唐颜真卿《颜勤礼碑》原碑与对临「第1节」"))

print('====News summary')
title = "麻省理工仓库货物管理"
content = '麻省理工学院的研究团队为无人机在仓库中使用RFID技术进行库存查找等工作,创造了一种聪明的新方式。它允许公司使用更小,更安全的无人机在巨型建筑物中找到之前无法找到的东西。使用RFID标签更换仓库中的条形码,将帮助提升自动化并提高库存管理的准确性。与条形码不同,RFID标签不需要对准扫描,标签上包含的信息可以更广泛和更容易地更改。它们也可以很便宜,尽管有优点,但是它具有局限性,对于跟踪商品没有设定RFID标准,“标签冲突”可能会阻止读卡器同时从多个标签上拾取信号。扫描RFID标签的方式也会在大型仓库内引起尴尬的问题。固定的RFID阅读器和阅读器天线只能扫描通过设定阈值的标签,手持式读取器需要人员出去手动扫描物品。'
pprint(b.nlp_summary(title, content, 80))

print('====Article tags')
# title and content are the same strings already defined for the news summary
pprint(b.nlp_tags(title, content))

print('====Article classification')
pprint(b.nlp_topic(title, content))
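When many calls run in one script, a single failure (a missing image, an exhausted free quota) stops everything after it. One possible way around that is a small wrapper that reports the error and carries on. The helper run_demo below is a sketch written for this purpose and is not part of the jojo-ai library.

```python
from pprint import pprint


def run_demo(label, func, *args, **kwargs):
    """Call func, print its result, and report any exception instead of stopping."""
    print("====" + label)
    try:
        result = func(*args, **kwargs)
    except Exception as exc:  # broad on purpose: this is only a demo runner
        print("failed:", type(exc).__name__, exc)
        return None
    pprint(result)
    return result
```

With the BaiduAI object b from the examples, a call such as run_demo('Dish recognition', b.classify_dish, 'images/dish1.jpg') would replace the matching print and pprint pair.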


5. Example of calling the Baidu API directly over HTTP, without a library

According to the Baidu Cloud development documentation, calling a Baidu API takes two steps.

The first step is to obtain an access token using the API Key and Secret Key.

The second step is to use the access token to send an API request as the API documentation requires, and read the result.

The example is as follows:

import requests


# Fill in the API Key and Secret Key shown for your application in Baidu Cloud
api_key = 'XXXXXXXXXXXXXXXXXXXXXX'
secret_key = 'XXXXXXXXXXXXXXXXXXXXXX'


# Step 1: use the API Key and Secret Key to obtain an access token.

# URL of the API that issues access tokens
url = 'https://aip.baidubce.com/oauth/2.0/token'

# api_key and secret_key are sent as request parameters
params = {
    'grant_type': 'client_credentials',
    'client_id': api_key,
    'client_secret': secret_key
}

# Send the request and read the access token from the JSON response.
response = requests.get(url, params=params, timeout=10)
if response.ok:  # True for HTTP status codes below 400
    data = response.json()
    access_token = data['access_token']
else:
    raise ConnectionError('token request failed: HTTP %d' % response.status_code)


# Step 2: use the access token to call the API you want.

# Example: the smart poetry writing API, documented at https://ai.baidu.com/ai-doc/NLP/ak53wc3o3

# URL of the smart poetry writing API
api_url = "https://aip.baidubce.com/rpc/2.0/creation/v1/poem"

# Per the documentation, the access token is appended to the request URL
request_url = api_url + "?access_token=" + access_token

# Per the documentation, the request parameter text is the theme of the poem
params = {'text': '长江'}

# The request header declares that JSON data is being sent
headers = {'content-type': 'application/json'}

# Send the POST request
response = requests.post(request_url, json=params, headers=headers, timeout=10)
if response.ok:
    print(response.json())  # the response is JSON that contains a poem

The jojo-ai library accesses each API over HTTP in the same way. Its BaiduAI class wraps each Baidu AI API and hides many of the details, which makes the APIs convenient to use.
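One detail a wrapper can hide is reusing the access token rather than requesting a new one before every call. The sketch below shows one way to cache it. It assumes the token response is a dict with an access_token field and an expires_in field in seconds, as OAuth 2.0 token responses commonly are. The class TokenCache is invented for this example and does not describe how jojo-ai is actually implemented.

```python
import time


class TokenCache:
    """Fetch an access token once and reuse it until shortly before it expires."""

    def __init__(self, fetch, clock=time.time, margin=60):
        self._fetch = fetch        # function returning the token response as a dict
        self._clock = clock        # injectable so the class can be tested
        self._margin = margin      # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, requesting a new one only when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            data = self._fetch()
            if "access_token" not in data:
                raise RuntimeError("token request failed: %r" % (data,))
            self._token = data["access_token"]
            # expires_in is assumed to be a lifetime in seconds
            self._expires_at = now + data.get("expires_in", 0) - self._margin
        return self._token
```

Here fetch would be a function that performs the first step of the example and returns response.json(); every API call then uses cache.get() in place of a freshly requested token.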


Origin blog.csdn.net/c80486/article/details/130460278