Calling the Tencent speech synthesis interface from Python
- 1. Install Tencent Cloud Development Kit
- 2. Activate Tencent voice service
- 3. Write code
- 4. References
1. Install Tencent Cloud Development Kit
pip install tencentcloud -i https://mirrors.cloud.tencent.com/pypi/simple/
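Note that the package index must be specified here (https://mirrors.cloud.tencent.com/pypi/simple/); otherwise the installation is likely to fail. If the package name is not found, the official SDK is published as tencentcloud-sdk-python, which provides the same tencentcloud import path.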
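Before moving on, it can help to confirm that the SDK is importable. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and `tencentcloud` is the top-level package that the imports in section 3.1 rely on.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("tencentcloud"):
    print("tencentcloud SDK found")
else:
    print("tencentcloud SDK missing; rerun the pip command above")
```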
2. Activate Tencent voice service
2.1 Log in to the Tencent Cloud Platform
Address: https://cloud.tencent.com/
In the main menu, select [Product] | [Artificial Intelligence and Machine Learning] | [Speech Synthesis] to claim a free resource pack.
The first time you claim it, you get 8 million characters of speech synthesis for free.
2.2 Generate SecretKey
Go to [Cloud Resource Management] | [Access Management] and click [New Key] on the API key management page.
This produces three values: APPID, SecretId and SecretKey.
Write them down; they are needed in section 3.2.
3. Write code
3.1 Import development package
# -*- coding:utf-8 -*-
import json, uuid
import base64
# Speech synthesis client
from tencentcloud.tts.v20190823.tts_client import TtsClient
# Speech synthesis request model
from tencentcloud.tts.v20190823.models import TextToVoiceRequest
# Tencent Cloud SDK exception
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
# Configuration file parser
from configparser import ConfigParser
# Authentication credentials and connection profiles
from tencentcloud.common.credential import Credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
3.2 Configuration file tcloud_auth.ini
Store the key information obtained in 2.2 in this file.
# User authentication parameters
# Test account
[authorization]
AppId=your AppId
SecretId=your SecretId
SecretKey=your SecretKey
[expired]
ExpiredTime=3600
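As a quick check of the file layout, the sketch below parses the same structure from an in-memory string with `ConfigParser`. The values `1234567890`, `example-id` and `example-key` are made-up placeholders, not real credentials. Note that `AppId` is read with `getint`, so it has to be a plain integer.

```python
from configparser import ConfigParser

# Same layout as tcloud_auth.ini; all values are made-up placeholders.
SAMPLE_INI = """
# User authentication parameters
[authorization]
AppId=1234567890
SecretId=example-id
SecretKey=example-key
[expired]
ExpiredTime=3600
"""

conf = ConfigParser()
conf.read_string(SAMPLE_INI)

# getint raises ValueError if the stored value is not an integer.
app_id = conf.getint("authorization", "AppId")
secret_id = conf.get("authorization", "SecretId")
expired_time = conf.getint("expired", "ExpiredTime")

print(app_id, secret_id, expired_time)
```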
3.3 Call the interface to generate voice
Create a class voice_generation. Its method text_to_voice synthesizes text into speech and writes the audio data to a file.
The code is as follows:
auth_file_path = "./voice/conf/tcloud_auth.ini"

class voice_generation():
    def __init__(self) -> None:
        conf = ConfigParser()
        conf.read(auth_file_path)
        self.appid = conf.getint("authorization", "AppId")
        self.secretId = conf.get("authorization", "SecretId")
        self.secretKey = conf.get("authorization", "SecretKey")

    def text_to_voice(self, text):
        try:
            # Instantiate a credential object from the Tencent Cloud account's SecretId and SecretKey; keep the key pair confidential.
            # Leaked code can expose SecretId and SecretKey and endanger every resource under the account. This sample is for reference only; for safer ways to handle keys, see: https://cloud.tencent.com/document/product/1278/85305
            # Keys can be obtained from the console: https://console.cloud.tencent.com/cam/capi
            cred = Credential(self.secretId, self.secretKey)
            # Instantiate an HTTP profile (optional; skip it if there are no special requirements)
            httpProfile = HttpProfile()
            httpProfile.endpoint = "tts.tencentcloudapi.com"
            # Instantiate a client profile (optional; skip it if there are no special requirements)
            clientProfile = ClientProfile()
            clientProfile.httpProfile = httpProfile
            # Instantiate the client for the requested product; clientProfile is optional
            client = TtsClient(cred, "ap-shenzhen-fsi", clientProfile)
            # Instantiate a request object; every interface has a corresponding request object
            req = TextToVoiceRequest()
            sessionid = uuid.uuid4().hex
            params = {
                "Text": text,
                "SessionId": sessionid,
                "Volume": 0,
                "Speed": 0,
                "ProjectId": 0,
                "ModelType": 1,
                "VoiceType": 1009,
                "PrimaryLanguage": 1,
                "SampleRate": 16000,
                "Codec": "mp3",
                "SegmentRate": 0,
                "EmotionCategory": "neutral",
                "EmotionIntensity": 100
            }
            req.from_json_string(json.dumps(params))
            # resp is a TextToVoiceResponse instance that corresponds to the request object
            resp = client.TextToVoice(req)
            # Print the request ID of the response
            print(resp.RequestId)
            # Audio is returned as a string, so encode it to bytes before decoding
            audio = resp.Audio.encode()
            file_path = f"static/voice/{sessionid}.mp3"
            # The with block closes the file automatically
            with open(file_path, "wb") as f:
                f.write(base64.decodebytes(audio))
            return f"{sessionid}.mp3"
        except TencentCloudSDKException as err:
            print(err)
3.4 Initialization function
3.4.1 Read parameters
Load the configuration file created in 3.2 and read AppId, SecretId and SecretKey from it into instance variables.
def __init__(self) -> None:
    conf = ConfigParser()
    conf.read(auth_file_path)
    self.appid = conf.getint("authorization", "AppId")
    self.secretId = conf.get("authorization", "SecretId")
    self.secretKey = conf.get("authorization", "SecretKey")
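The comments in the full listing recommend handling keys more safely than keeping them next to the code. One possible variation, sketched here under assumptions, prefers environment variables and falls back to the INI file. The function name `load_credentials` is hypothetical, and the variable names `TENCENTCLOUD_SECRET_ID` and `TENCENTCLOUD_SECRET_KEY` are an assumed naming choice, not something this article defines.

```python
import os
from configparser import ConfigParser
from typing import Mapping, Tuple


def load_credentials(ini_path: str, env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (secret_id, secret_key), preferring environment variables."""
    # Assumed variable names; adjust them to your deployment.
    secret_id = env.get("TENCENTCLOUD_SECRET_ID")
    secret_key = env.get("TENCENTCLOUD_SECRET_KEY")
    if secret_id and secret_key:
        return secret_id, secret_key

    conf = ConfigParser()
    # read() returns the list of files it parsed; an empty list means the file was missing.
    if not conf.read(ini_path, encoding="utf-8"):
        raise FileNotFoundError(f"credential file not found: {ini_path}")
    return conf.get("authorization", "SecretId"), conf.get("authorization", "SecretKey")
```

The returned pair can be passed straight to `Credential(secret_id, secret_key)`. Raising on a missing file is a deliberate choice here: `ConfigParser.read` silently skips files it cannot find, which would otherwise surface later as a less obvious missing-section error.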
3.5 Description of important parameters
3.5.1 Create authentication information
cred = Credential(self.secretId, self.secretKey)
3.5.2 Interface address
tts.tencentcloudapi.com is the endpoint of the Tencent speech synthesis interface.
httpProfile = HttpProfile()
httpProfile.endpoint = "tts.tencentcloudapi.com"
3.5.3 Speech Synthesis Parameters
req = TextToVoiceRequest()
sessionid = uuid.uuid4().hex
params = {
    "Text": text,
    "SessionId": sessionid,
    "Volume": 0,
    "Speed": 0,
    "ProjectId": 0,
    "ModelType": 1,
    "VoiceType": 1009,
    "PrimaryLanguage": 1,
    "SampleRate": 16000,
    "Codec": "mp3",
    "SegmentRate": 0,
    "EmotionCategory": "neutral",
    "EmotionIntensity": 100
}
req.from_json_string(json.dumps(params))
Required parameters:
Parameter | Description |
---|---|
Text | The text to be converted to speech |
SessionId | A string that identifies the request; it is returned unchanged in the response |
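To make the split between required and optional parameters concrete, here is a small sketch of a helper that builds the request payload. `build_tts_params` is a hypothetical name. It fills in only the two required fields, generates a `SessionId` when none is given, and accepts any of the optional fields from the listing above as keyword overrides.

```python
import json
import uuid
from typing import Any, Dict, Optional


def build_tts_params(text: str, session_id: Optional[str] = None, **overrides: Any) -> Dict[str, Any]:
    """Build the parameter dict for a TextToVoice request."""
    if not text or not text.strip():
        raise ValueError("Text must not be empty")
    params: Dict[str, Any] = {
        "Text": text,
        # A random hex string is used when the caller does not supply an ID.
        "SessionId": session_id or uuid.uuid4().hex,
    }
    # Optional fields such as VoiceType, Codec or SampleRate.
    params.update(overrides)
    return params


payload = json.dumps(build_tts_params("Hello", VoiceType=1009, Codec="mp3"))
print(payload)
```

The resulting JSON string can be handed to `req.from_json_string(payload)` in the same way as in the listing above.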
3.6 Output voice file
The speech synthesis interface returns the synthesized audio as a base64-encoded string, so the data must be base64-decoded before it is written to a file.
3.6.1 Generate voice and save it as a voice file
Code:
# resp is a TextToVoiceResponse instance that corresponds to the request object
resp = client.TextToVoice(req)
# Print the request ID of the response
print(resp.RequestId)
# Audio is returned as a string, so encode it to bytes before decoding
audio = resp.Audio.encode()
file_path = f"static/voice/{sessionid}.mp3"
# The with block closes the file automatically
with open(file_path, "wb") as f:
    f.write(base64.decodebytes(audio))
return f"{sessionid}.mp3"
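The decoding step can be tried without calling the service. The sketch below substitutes a fake base64 string for `resp.Audio` (an assumption made purely for illustration) and writes the decoded bytes into a temporary directory. `save_audio` is a hypothetical helper name.

```python
import base64
import tempfile
from pathlib import Path


def save_audio(audio_b64: str, directory: str, session_id: str, codec: str = "mp3") -> Path:
    """Decode base64 audio text and write it to <directory>/<session_id>.<codec>."""
    target_dir = Path(directory)
    # Create the output directory if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    file_path = target_dir / f"{session_id}.{codec}"
    file_path.write_bytes(base64.b64decode(audio_b64))
    return file_path


# Stand-in for resp.Audio: real responses carry encoded audio, not this text.
fake_audio = base64.b64encode(b"not real audio").decode("ascii")
out_dir = tempfile.mkdtemp()
saved = save_audio(fake_audio, out_dir, "demo-session")
print(saved.name, saved.stat().st_size)
```

Creating the directory first avoids the `FileNotFoundError` that `open(file_path, "wb")` raises when a folder such as `static/voice/` does not exist yet.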
3.6.2 Return data structure description
Parameter name | Type | Description |
---|---|---|
Audio | String | Base64-encoded audio data |
SessionId | String | Each request corresponds to one SessionId |
Subtitles | Array of Subtitle | Timestamp information; an empty array is returned if timestamps are not enabled. |
RequestId | String | Unique request ID, returned with every request. Provide the RequestId when troubleshooting a problem. |
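For code that passes the result around, the four fields can be collected into a small typed container. This is only a sketch: `VoiceResult` and `parse_result` are hypothetical names, and the sample JSON contains made-up values shaped like the table above rather than a real service response.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class VoiceResult:
    """The four response fields from the table above."""
    audio: str
    session_id: str
    request_id: str
    subtitles: List[Dict[str, Any]] = field(default_factory=list)


def parse_result(raw_json: str) -> VoiceResult:
    """Map a JSON object with the documented field names onto VoiceResult."""
    data = json.loads(raw_json)
    return VoiceResult(
        audio=data["Audio"],
        session_id=data["SessionId"],
        request_id=data["RequestId"],
        # Subtitles stays empty unless timestamps were enabled.
        subtitles=data.get("Subtitles") or [],
    )


# Made-up sample values for illustration only.
sample = '{"Audio": "AAAA", "SessionId": "abc", "Subtitles": [], "RequestId": "req-1"}'
result = parse_result(sample)
print(result.session_id, result.request_id, len(result.subtitles))
```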
4. References
Tencent speech synthesis API documentation: https://cloud.tencent.com/document/product/1073/37995