Python in practice: making your own audio novels (calling the Baidu speech synthesis API)

The goal this time is to use Baidu Cloud's artificial intelligence API for real-time text-to-speech conversion and turn the text of a novel into speech.

Calling the Baidu Cloud API

Baidu's API is very friendly to ordinary users. Many of its functions are free, and this interface can be called five thousand times a day at no charge, which makes it well suited to experimenting.

Register a Baidu Cloud account

First open the page for the Baidu Cloud speech synthesis module. Click the "Use Now" option and the login page appears. If you don't have an account, register a Baidu Cloud account first; registration is simple, so I will not go into details. After logging in, you are taken to the console automatically. At this point no application has been created yet.

Click the application list and create an application. A configuration form pops up, and its fields can be filled in however you like. After filling it in, click Create Now and return to the application list, where a new application now appears under My Applications.

Using the interface

Install the module

A module must be installed before this interface can be used from Python:

pip install baidu-aip

When the installation succeeds, pip reports that the package was installed.
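
A quick way to double-check the installation from inside Python is to ask the import system whether the package can be found. The sketch below uses only the standard library. The helper name `is_installed` is made up for illustration, and it assumes that the `baidu-aip` package is imported under the name `aip`, as the import statement in the code further down does.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The baidu-aip package is imported as "aip"
    print("aip installed:", is_installed("aip"))
```

If this prints False, the package was probably installed into a different Python environment than the one running the script.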

Generate speech

The technical documentation for Baidu speech synthesis lists the parameters that the interface accepts.

Passing these parameters into the template given in the technical documentation generates a piece of speech:

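from aip import AipSpeech

# Replace these with the credentials of the application you created
app_id = 'your App ID'
api_key = 'your API key'
secret_key = 'your secret key'

client = AipSpeech(app_id, api_key, secret_key)

# 'zh' is the language; the options set volume, speed, pitch and voice
result = client.synthesis('人生得意须尽欢,莫使金樽空对月', 'zh', 1,
                          {"vol": 9,
                           "spd": 4,
                           "pit": 9,
                           "per": 3,
                           })

# On success the result is the audio bytes; on failure it is a dict describing the error
if not isinstance(result, dict):
    with open("audio.mp3", "wb") as f:
        f.write(result)
else:
    print(result)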

After running this code, a file named audio.mp3 is generated in the current folder. Open it and you can hear the verse being recited.

Novel text to speech

From the technical documentation we know that the biggest limitation of this module is that the text converted in one request cannot exceed 1024 bytes (about 512 Chinese characters), so the first step is to cut the novel into several text documents of a few hundred characters each.
I first found a novel, copied it into a document, and named it read.txt. Next, I used code to cut up the content of the novel, with 500 characters per segment.
First extract the content of the novel and insert the symbol "---" after every 1000 bytes (500 characters) as a cutting mark:
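
To see why character counts and byte counts differ, it helps to measure the encoded length directly. The following is a minimal sketch and not part of the original script. `split_by_bytes` is a hypothetical helper, and counting the bytes in GBK (two bytes per Chinese character, which matches the estimate of 1024 bytes for about 512 characters) is an assumption about how the limit is measured.

```python
def split_by_bytes(text, max_bytes=1000, encoding="gbk"):
    """Split text into pieces whose encoded size never exceeds max_bytes."""
    pieces = []
    current = ""
    current_size = 0
    for ch in text:
        # Size of this single character once encoded
        size = len(ch.encode(encoding, errors="replace"))
        if current and current_size + size > max_bytes:
            pieces.append(current)
            current = ""
            current_size = 0
        current += ch
        current_size += size
    if current:
        pieces.append(current)
    return pieces


if __name__ == "__main__":
    sample = "人生得意须尽欢" * 200
    parts = split_by_bytes(sample)
    print(len(sample), "characters,", len(sample.encode("gbk")), "bytes in GBK")
    print([len(p) for p in parts])
```

Splitting by encoded size rather than by character count keeps every piece under the limit even when the text mixes Chinese characters with ASCII letters and digits.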

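import re

# Read the whole novel as one string (assumes read.txt is saved as UTF-8)
with open('read.txt', 'r', encoding='utf-8') as a:
    text = a.read()

# Split into pieces of at most 500 characters (about 1000 bytes at two bytes per Chinese character)
text_cut = re.findall('.{1,500}', text, re.S)
# Insert "---" between the pieces
text_final = '---'.join(text_cut)
# Count how many "---" marks the text contains
times = text_final.count('---')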

After that, split the text at the "---" marks and save the pieces in a new folder named test. List indexes normally start at 0, but to match our reading habits the files are numbered from 1.

for n in range(0, times + 1):
    # Take piece n and save it as test/<n + 1>.txt
    name = text_final.split('---')[n]
    with open('test/' + str(n + 1) + '.txt', 'w', encoding='utf-8') as b:
        b.write(name)

Then we use with open again to read each text file back, in preparation for the speech synthesis in the next step.

    with open('test/' + str(n + 1) + '.txt', 'r', encoding='utf-8') as c:
        print('Saving the content of segment ' + str(n + 1) + ' ......')
        lines = c.read()
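
The shift from 0-based list positions to 1-based file names can also be expressed with `enumerate(..., start=1)` instead of writing `n + 1` everywhere. This is only an illustrative sketch; `numbered_paths` is a hypothetical helper, not something from the original script.

```python
import os


def numbered_paths(pieces, folder="test", extension=".txt"):
    """Pair each piece with a file path numbered from 1 instead of 0."""
    return [
        (os.path.join(folder, str(number) + extension), piece)
        for number, piece in enumerate(pieces, start=1)
    ]


if __name__ == "__main__":
    for path, piece in numbered_paths(["first part", "second part"]):
        print(path, "->", piece)
```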

In the last step, the extracted text is passed to the API and the voice file is written out. The complete code is as follows (replace the three parameters with the values you obtained earlier):

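import os
import re
from aip import AipSpeech

# Replace these with the credentials of the application you created
app_id = 'id'
api_key = 'APIkey'
secret_key = 'secretkey'

client = AipSpeech(app_id, api_key, secret_key)

# Read the whole novel as one string (assumes read.txt is saved as UTF-8)
with open('read.txt', 'r', encoding='utf-8') as a:
    text = a.read()

# Split into pieces of at most 500 characters (about 1000 bytes at two bytes per Chinese character)
text_cut = re.findall('.{1,500}', text, re.S)
# Insert "---" between the pieces
text_final = '---'.join(text_cut)
# Count how many "---" marks the text contains
times = text_final.count('---')

# Make sure the output folder exists
os.makedirs('test', exist_ok=True)

for n in range(0, times + 1):
    name = text_final.split('---')[n]
    with open('test/' + str(n + 1) + '.txt', 'w', encoding='utf-8') as b:
        b.write(name)
    with open('test/' + str(n + 1) + '.txt', 'r', encoding='utf-8') as c:
        print('Saving the content of segment ' + str(n + 1) + ' ......')
        lines = c.read()
        result = client.synthesis(lines, 'zh', 1,
                                  {"vol": 9,
                                   "spd": 4,
                                   "pit": 9,
                                   "per": 3,
                                   })

        # On success the result is audio bytes; on failure it is a dict describing the error
        if not isinstance(result, dict):
            with open('test/' + str(n + 1) + '.mp3', "wb") as d:
                print('Generating the audio for segment ' + str(n + 1) + ' ......')
                d.write(result)
        else:
            print(result)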
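
The "---" marker and the round trip through the .txt files are not strictly needed, since each piece can be handed to the synthesis call directly. Below is one possible restructuring, offered as a sketch under assumptions rather than as the method used in this post. `synthesize_pieces` and its `synthesize` parameter are hypothetical names; the function accepts any callable that returns audio bytes on success and a dict on failure, which, as far as I know, is how the SDK's `synthesis` method behaves.

```python
import os
import re


def synthesize_pieces(text, synthesize, out_dir="test", size=500):
    """Split text into pieces and save one mp3 per piece.

    synthesize is any callable taking a string and returning audio bytes
    on success or a dict on failure (hypothetical interface for this sketch).
    Returns the list of piece numbers that failed.
    """
    os.makedirs(out_dir, exist_ok=True)
    pieces = re.findall(".{1,%d}" % size, text, re.S)
    failed = []
    for number, piece in enumerate(pieces, start=1):
        result = synthesize(piece)
        if isinstance(result, dict):
            # Keep going, but remember which piece could not be converted
            failed.append(number)
            continue
        with open(os.path.join(out_dir, str(number) + ".mp3"), "wb") as f:
            f.write(result)
    return failed
```

With the real client, the callable could be something like `lambda s: client.synthesis(s, 'zh', 1, {"vol": 9, "spd": 4, "pit": 9, "per": 3})`. Passing the synthesis step in as a parameter also makes it possible to try the splitting and file naming with a stand-in function before spending any of the daily free calls.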

Result:

Open the test folder and click an mp3 file to start listening to the novel.

The voice synthesized by Baidu's artificial intelligence sounds very much like someone reading aloud next to you, and the experience is far better than with the pyttsx3 module.
Baidu's artificial intelligence API also has many other useful functions, such as face recognition, speech-to-text, and face comparison. Interested readers can explore them on their own.

Origin www.cnblogs.com/cherish-hao/p/12721679.html