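An Intelligent Voice Assistant Based on the Baidu API and Tuling Robot

I. Preparation

1. Install Python 3.6

2. Required Python packages and libraries:

Third-party: baidu-aip (imported as aip), requests, playsound, pyaudio. Standard library: json, os, wave, time.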

from aip import AipSpeech
import requests
import json
from playsound import playsound
import os
import wave
from pyaudio import PyAudio, paInt16
import time
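
Before running the steps below, it can help to confirm that the third-party modules are importable. The following is a minimal sketch using only the standard library; the helper name missing_modules and the REQUIRED_MODULES list are illustrative and not part of the original code.

```python
import importlib.util

# Third-party modules imported in this article (import names, not pip names).
REQUIRED_MODULES = ["aip", "requests", "playsound", "pyaudio"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Note that the import name and the pip name can differ: aip is installed with pip install baidu-aip.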

II. Approach

1. Record audio

# Recording parameters
framerate = 8000
NUM_SAMPLES = 2000
channels = 1
sampwidth = 2
TIME = 2


def save_wave_file(filename, data):
    '''Save the data to a WAV file.'''
    wf = wave.open(filename, 'wb')
    wf.setnchannels(channels)
    wf.setsampwidth(sampwidth)
    wf.setframerate(framerate)
    wf.writeframes(b"".join(data))
    wf.close()


def my_record():
    pa = PyAudio()
    stream = pa.open(format=paInt16, channels=1,
                     rate=framerate, input=True,
                     frames_per_buffer=NUM_SAMPLES)
    my_buf = []
    count = 0
    while count < TIME * 6:  # controls the recording length: 12 reads of 0.25 s each, about 3 s
        string_audio_data = stream.read(NUM_SAMPLES)
        my_buf.append(string_audio_data)
        count += 1
        print('.')
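    os.makedirs('me', exist_ok=True)  # make sure the output directory exists
    save_wave_file('me/my.wav', my_buf)
    stream.stop_stream()
    stream.close()
    pa.terminate()  # release the audio device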


my_record()
print('Over!')
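
The recording length follows from the parameters above: each stream.read(NUM_SAMPLES) call returns 2000 samples at 8000 samples per second, i.e. 0.25 s, so the TIME * 6 = 12 reads last about 3 s rather than 2 s. A small sketch of that arithmetic, with hypothetical helper names:

```python
import math


def recording_seconds(num_reads, samples_per_read=2000, framerate=8000):
    """Length in seconds of a recording made of num_reads buffer reads."""
    return num_reads * samples_per_read / framerate


def reads_for_seconds(seconds, samples_per_read=2000, framerate=8000):
    """Number of buffer reads needed to record at least the given number of seconds."""
    return math.ceil(seconds * framerate / samples_per_read)


print(recording_seconds(12))  # length of the loop used in my_record()
print(reads_for_seconds(2))   # reads needed for a 2-second recording
```

If TIME is meant to be the length in seconds, the loop bound could be computed with reads_for_seconds(TIME) instead of TIME * 6.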

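2. Convert the recorded audio to PCM format

# Convert the format; os.system blocks until ffmpeg has finished
os.system(r"ffmpeg -y -i <path to WAV file> -f s16le -ar 16000 -acodec pcm_s16le <path to PCM file>")

This step requires ffmpeg, which must be installed and on PATH, to convert the WAV file to PCM.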
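
As an alternative to passing a shell string, the same ffmpeg options can be passed as an argument list through subprocess, which avoids quoting problems with paths that contain spaces. This is only a sketch: the function names are hypothetical, and the path me/my.pcm in the usage line is an assumed output location, not one given in the original.

```python
import shutil
import subprocess


def build_ffmpeg_command(wav_path, pcm_path, sample_rate=16000):
    """Build the ffmpeg argument list for converting a WAV file to raw 16-bit PCM."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", wav_path,           # input WAV file
        "-f", "s16le",            # raw signed 16-bit little-endian output
        "-ar", str(sample_rate),  # resample to the target rate
        "-acodec", "pcm_s16le",
        pcm_path,
    ]


def convert_wav_to_pcm(wav_path, pcm_path, sample_rate=16000):
    """Run ffmpeg and wait for it to finish; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(wav_path, pcm_path, sample_rate), check=True)
```

Usage would look like convert_wav_to_pcm('me/my.wav', 'me/my.pcm'). With check=True a failed conversion raises an exception instead of silently leaving no PCM file for the next step.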

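3. Upload the PCM file for speech recognition

# client is the AipSpeech instance used here and in step 5;
# fill in the credentials of your own Baidu AI application
client = AipSpeech('<your APP_ID>', '<your API_KEY>', '<your SECRET_KEY>')

# Read the file; filePath1 is the path to the PCM file
with open(filePath1, 'rb') as fp:
    res_read = fp.read()

# Get the recognition result; res_recognize is a dict holding the response
res_recognize = client.asr(res_read, 'pcm', 16000, {'dev_pid': 1536, })
# Take the first recognized sentence as a string
res_recognize2 = res_recognize['result'][0]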
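
Indexing res_recognize['result'] directly raises a KeyError when recognition fails, for example on silence. The sketch below reads the response more defensively. The result key comes from the code above; the err_no field and the error code in the example are assumptions about the response format, and the helper name is hypothetical.

```python
def extract_recognized_text(response):
    """Return the first recognized sentence, or None if recognition failed."""
    if not isinstance(response, dict):
        return None
    # Assumption: a non-zero err_no marks a failed request.
    if response.get("err_no", 0) != 0:
        return None
    results = response.get("result") or []
    return results[0] if results else None


# Illustrative responses, not real API output.
print(extract_recognized_text({"err_no": 0, "result": ["你好，"]}))
print(extract_recognized_text({"err_no": 3301, "err_msg": "speech quality error."}))
```

When the helper returns None, the program can ask the user to speak again instead of sending an empty string to the chat robot.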

4. Send the recognized text to Tuling Robot

def get_response(inputtext='你好'):  # get the reply returned by the chat robot

    url = "http://openapi.tuling123.com/openapi/api/v2"

    data = {
        "perception": {
            "inputText": {
                "text": ""
            },
        },
        "userInfo": {
            "apiKey": "<your API key>",
            "userId": "Lacia"
        }
    }
    data["perception"]["inputText"]["text"] = inputtext
    response = requests.post(url=url, data=json.dumps(data))
    return response.json()


my = res_recognize2  # the recognized speech, i.e. the text of what the user said
text = get_response(my)['results'][0]["values"]["text"]  # the text returned by Tuling Robot
print("Lacia:" + text)
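
The expression ['results'][0]["values"]["text"] assumes that the first result is always a text reply. A more tolerant reader could scan the results list for the first entry that carries text. This is a sketch under that assumption; the helper name, the default argument and the sample payload are illustrative.

```python
def extract_reply_text(payload, default=""):
    """Return the first text reply found in a Tuling-style response, else default."""
    for item in payload.get("results", []):
        text = item.get("values", {}).get("text")
        if text:
            return text
    return default


# Illustrative payload shaped like the response that get_response() indexes.
sample = {
    "results": [
        {"resultType": "url", "values": {"url": "http://example.com"}},
        {"resultType": "text", "values": {"text": "hi"}},
    ]
}
print(extract_reply_text(sample))
```

With this helper, the line above could become text = extract_reply_text(get_response(my), default='...'), where the default is whatever the program should say when no text comes back.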

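5. Synthesize speech from the text returned by Tuling Robot

# Speech synthesis
result = client.synthesis(text, 'zh', 1, {'vol': 7, 'spd': 4, 'pit': 5, 'per': 4, })
# On success the result is binary audio data; on failure it is a dict.
# path_voice is the path of the audio file to write.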
if not isinstance(result, dict):
    with open(path_voice, 'wb') as f:
        f.write(result)
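
The isinstance check above silently skips the error case, so a failed synthesis leaves no file for the next step to play. The sketch below turns both outcomes into a readable status. The err_no and err_msg keys are assumptions about the error dict, and the function name is hypothetical.

```python
def describe_synthesis_result(result):
    """Return (ok, detail) for the value returned by the synthesis call."""
    if isinstance(result, dict):
        # Assumption: the error dict carries err_no and err_msg.
        detail = "error {}: {}".format(result.get("err_no"), result.get("err_msg"))
        return False, detail
    return True, "{} bytes of audio".format(len(result))


# Illustrative values, not real API output.
print(describe_synthesis_result(b"\x00\x01\x02"))
print(describe_synthesis_result({"err_no": 500, "err_msg": "notsupport."}))
```

Printing or logging the detail when ok is False makes it clear why no audio was produced.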

6. Play the audio

# Play the audio
playsound(path_voice)
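
The six steps can be chained into a single round of conversation. The sketch below takes each step as a function argument so the control flow can be read, and tried with stand-ins, without a microphone or API keys. All names are hypothetical; in a real program each argument would wrap the corresponding code from the steps above.

```python
def run_once(record, convert, recognize, chat, synthesize, play):
    """Run one round: record, convert, recognize, chat, synthesize, play."""
    wav_path = record()              # step 1: returns the path of the WAV file
    pcm_path = convert(wav_path)     # step 2: returns the path of the PCM file
    question = recognize(pcm_path)   # step 3: returns the recognized text or None
    if not question:
        return None                  # nothing recognized, skip the rest
    answer = chat(question)          # step 4: returns the robot's reply text
    voice_path = synthesize(answer)  # step 5: returns the audio path or None
    if voice_path:
        play(voice_path)             # step 6
    return answer


# Stand-in functions that only demonstrate the flow.
played = []
answer = run_once(
    record=lambda: "a.wav",
    convert=lambda path: path.replace(".wav", ".pcm"),
    recognize=lambda path: "hello",
    chat=lambda question: question.upper(),
    synthesize=lambda text: "out.mp3",
    play=played.append,
)
print(answer, played)
```

Calling run_once in a loop gives a continuous conversation; returning None when nothing is recognized lets the loop simply record again.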

Notes

The code is not given in full; if you run into errors, debug them yourself.

Author: 欧冬雨
