1. Preparation
1. Have a Baidu Smart Cloud account
2. Create a voice application. Once it is created successfully, the new application is displayed.
3. Find your application in the application list.
Important:
The application's AppID, API Key, and Secret Key are essential.
Speech recognition and every other Baidu API call require these three parameters.
2. Call the Baidu Speech SDK (Python 3)
Steps:
1. Install the package. I use pip:
pip install baidu-aip
2. Create an AipSpeech client
from aip import AipSpeech

# Your APP ID, API Key (AK), and Secret Key (SK)
APP_ID = 'your App ID'
API_KEY = 'your API Key'
SECRET_KEY = 'your Secret Key'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
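Because these three values are secrets, you may prefer not to hard-code them in the script. The following is only a minimal sketch of one way to do that: the helper name load_baidu_credentials and the environment variable names BAIDU_APP_ID, BAIDU_API_KEY, and BAIDU_SECRET_KEY are assumptions made for this example and are not part of the Baidu SDK.

```python
import os


def load_baidu_credentials(env=os.environ):
    """Return (app_id, api_key, secret_key) read from environment variables.

    The variable names are hypothetical, chosen only for this sketch.
    """
    names = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")
    # Collect every missing or empty variable so the error lists all of them.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

With the variables set, the client could then be created as client = AipSpeech(*load_baidu_credentials()).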
3. Configure AipSpeech (generally not needed)
4. Request description
As an example, here is how to recognize a local voice file. Suppose the file is called audio.pcm:
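# Read the file as raw bytes
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

# Recognize the local file: format 'pcm', sampling rate 16000 Hz
client.asr(get_file_content('audio.pcm'), 'pcm', 16000, {
    'dev_pid': 1537,  # recognition model ID; 1537 selects Mandarin
})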
Notes:
1. The suffix of the local voice file must match the format argument passed to asr() (here, audio.pcm and 'pcm').
2. The audio sampling rate must be 16000 Hz or 8000 Hz. Audio at any other sampling rate will not be recognized.
3. Supported voice file formats are pcm, wav, and amr.
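The notes above can be checked locally before sending a request. This is only a sketch using the Python standard library: the helper names audio_format_from_path and wav_sample_rate are made up for this example and are not part of the Baidu SDK. Only WAV files carry a header that states the sampling rate, so for raw pcm files you have to know the rate yourself.

```python
import os
import wave

# Values taken from the notes above.
SUPPORTED_FORMATS = {"pcm", "wav", "amr"}
SUPPORTED_RATES = {8000, 16000}


def audio_format_from_path(path):
    """Derive the format argument from the file suffix, e.g. 'audio.pcm' -> 'pcm'."""
    suffix = os.path.splitext(path)[1].lstrip(".").lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError("Unsupported audio format: %r" % suffix)
    return suffix


def wav_sample_rate(path):
    """Read the sampling rate from a WAV header and check that it is supported."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
    if rate not in SUPPORTED_RATES:
        raise ValueError("Unsupported sampling rate: %d Hz" % rate)
    return rate
```

For a WAV file, the two return values can be passed as the second and third arguments of client.asr().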
// Successful response
{
    "err_no": 0,
    "err_msg": "success.",
    "corpus_no": "15984125203285346378",
    "sn": "481D633F-73BA-726F-49EF-8659ACCC2F3D",
    "result": ["北京天气"]
}
// Failed response
{
    "err_no": 2000,
    "err_msg": "data empty.",
    "sn": null
}
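Given the two response shapes above, a caller will usually want either the recognized text or a clear error. The following is a minimal sketch that assumes the response is a Python dict with the fields shown above. The helper name extract_transcript is hypothetical, and the sample dict is a shortened copy of the successful response.

```python
def extract_transcript(response):
    """Return the first recognized string, or raise if the API reported an error."""
    err_no = response.get("err_no")
    if err_no != 0:
        raise RuntimeError(
            "Recognition failed (err_no=%s): %s" % (err_no, response.get("err_msg"))
        )
    results = response.get("result") or []
    if not results:
        raise RuntimeError("Recognition succeeded but returned no result")
    return results[0]


# Shortened copy of the successful response shown above.
ok = {"err_no": 0, "err_msg": "success.", "result": ["北京天气"]}
print(extract_transcript(ok))
```

In practice the dict would be the value returned by client.asr(...).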
For more details, see the official Baidu Speech Recognition technical documentation: https://cloud.baidu.com/doc/SPEECH/s/1k4o0bmc7