Call Tencent Cloud's speech recognition (sentence recognition) interface - Python version

I lost more than two hours debugging this, so I am recording the working code here.

This continues from the previous article: the audio data returned there can be sent straight back to Tencent's interface for recognition.
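
If the audio comes from a file on disk rather than from the previous article's output, it first has to be turned into a base64 string, since the `Data` parameter expects one. A minimal sketch, assuming a local WAV file; the helper name `load_audio_as_base64` and the constant `MAX_AUDIO_BYTES` are made up for illustration, and the 3 MB figure is the limit quoted in the listing's comments:

```python
import base64

# Size limit quoted in the listing's comments (3 MB); an assumption of this sketch.
MAX_AUDIO_BYTES = 3 * 1024 * 1024


def load_audio_as_base64(path):
    """Read an audio file and return its content as a base64 string."""
    with open(path, "rb") as f:
        raw = f.read()
    # Fail early instead of sending a request that is too large.
    if len(raw) > MAX_AUDIO_BYTES:
        raise ValueError(
            "audio is %d bytes; the limit is %d" % (len(raw), MAX_AUDIO_BYTES)
        )
    # b64encode returns bytes; decode so the value is a plain str.
    return base64.b64encode(raw).decode("ascii")
```

The result could then be passed on as `call_asr_offical(load_audio_as_base64("hello.wav"))`, where `hello.wav` is a hypothetical file name.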


# #################################
# Copyright(C) 2012-2017
# Environment:        python 3.9.7
# Package:                       -
# D&P Author By:            常成功
# Create Date:          2021-12-23
# Modify Date:          2021-12-23
# #################################

# Description:
# Calls Tencent Cloud's speech recognition (sentence recognition) API, i.e. ASR

import base64
import hashlib
import hmac
import requests
import time
import random

secret_id = "your SecretId"
secret_key = "your SecretKey"


def get_string_to_sign(method, endpoint, params):
    s = method + endpoint + "/?"
    query_str = "&".join("%s=%s" % (k, params[k]) for k in sorted(params))
    return s + query_str


def sign_str(key, s, method):
    hmac_str = hmac.new(key.encode("utf8"), s.encode("utf8"), method).digest()
    # Decode so the signature is a str, like every other parameter value
    return base64.b64encode(hmac_str).decode("utf8")


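# Official authentication scheme and the parameters for calling the API
def call_asr_offical(the_sound_base64):
    # Note: do not call this function with a proxy/VPN client (e.g. v2Ray) running on the computer, or the request fails!
    endpoint = "asr.tencentcloudapi.com"
    # Input parameters: https://cloud.tencent.com/document/product/1093/35646
    data = {
        'Action': 'SentenceRecognition',        # API action
        'ProjectId': '0',
        'SubServiceType': '2',          # Sub-service type. 2: sentence recognition.
        'EngSerViceType': '16k_zh',         # 16k_zh: 16 kHz general Mandarin Chinese. Audio must not exceed 60 s, and the file must not exceed 3 MB.
        'SourceType': '1',              # Source of the audio. 0: audio URL; 1: audio data (POST body)
        'VoiceFormat': 'wav',           # Format of the audio to recognize: mp3 or wav.
        'UsrAudioKey': 'session_chang',    # Client-side unique identifier for this task, generated by the user
        # Audio data; required when SourceType is 1 (local audio upload). The value is a base64-encoded string
        'Data': the_sound_base64,
        # Other required common parameters follow
        # 'Region': 'ap-beijing',       # This API does not need this parameter!
        'Nonce': random.randint(1, 100000),
        'SecretId': secret_id,
        'Timestamp': int(time.time()),
        'Version': '2019-06-14',
    }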
    # ------------ Note: the POST method is used here ------------
    s = get_string_to_sign("POST", endpoint, data)
    data["Signature"] = sign_str(secret_key, s, hashlib.sha1)
    print("Signature: ", data["Signature"])
    # This makes a real call; a successful request may be billed
    # The request method is HTTP POST with Content-Type "application/json; charset=utf-8"
    # (requests still form-encodes the `data` dict regardless of this header)
    headers_dic = {
        "Content-Type": "application/json; charset=utf-8",
    }
    resp = requests.post("https://" + endpoint, data=data, headers=headers_dic)
    # Print the response object
    print("asr resp:", resp)
    # Print the response body. Note: if the Tencent interface reports an error, set a breakpoint
    # and inspect resp.reason for the cause; resp.json() cannot be executed in that case
    rsp_dic = resp.json()
    print(rsp_dic)

    # Return the recognized text from the response
    says_text = rsp_dic["Response"]["Result"]
    return says_text
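
The signing steps performed by `get_string_to_sign` and `sign_str` can be checked offline, without calling the API or being billed. The sketch below repeats the same steps with made-up values; `dummy_key` and the three parameters are placeholders chosen for illustration, not real credentials or a complete request.

```python
import base64
import hashlib
import hmac

# Placeholder values for illustration only; not real credentials.
dummy_key = "dummy_key"
params = {"Timestamp": 2, "Action": "SentenceRecognition", "Nonce": 1}

# Same steps as get_string_to_sign:
# method + host + "/?" + "k=v" pairs sorted by key and joined with "&".
query_str = "&".join("%s=%s" % (k, params[k]) for k in sorted(params))
string_to_sign = "POST" + "asr.tencentcloudapi.com" + "/?" + query_str
print(string_to_sign)
# POSTasr.tencentcloudapi.com/?Action=SentenceRecognition&Nonce=1&Timestamp=2

# Same steps as sign_str: HMAC-SHA1 over the string, then base64.
digest = hmac.new(
    dummy_key.encode("utf8"), string_to_sign.encode("utf8"), hashlib.sha1
).digest()
signature = base64.b64encode(digest).decode("utf8")
print(len(digest), len(signature))  # 20 28
```

A SHA-1 digest is always 20 bytes, so the base64 signature is always 28 characters. If a signature built this way has a different length, the problem lies in the hashing step and not in the parameters.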

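The listing reads `rsp_dic["Response"]["Result"]` directly, which raises a `KeyError` when the service answers with an error body instead of a result. Tencent Cloud API 3.0 error replies carry an `Error` object with `Code` and `Message` inside `Response`, so a small check can surface the real cause. This is a sketch only: the helper name `extract_result` and the sample dictionaries are made up for illustration.

```python
def extract_result(rsp_dic):
    """Return the recognized text, or raise with the error the API reported."""
    response = rsp_dic.get("Response", {})
    error = response.get("Error")
    if error is not None:
        # Include the RequestId so the failed call can be traced later.
        raise RuntimeError(
            "%s: %s (RequestId: %s)"
            % (error.get("Code"), error.get("Message"), response.get("RequestId"))
        )
    return response["Result"]


# Made-up sample bodies, shaped like a success reply and an error reply.
ok_body = {"Response": {"Result": "hello", "RequestId": "req-1"}}
err_body = {
    "Response": {
        "Error": {"Code": "AuthFailure.SignatureFailure", "Message": "bad signature"},
        "RequestId": "req-2",
    }
}

print(extract_result(ok_body))
try:
    extract_result(err_body)
except RuntimeError as exc:
    print("ASR call failed:", exc)
```
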