Batch download Baidu Translate audio with ten lines of Python

Foreword

Many people who use Baidu Translate have probably wondered how to download the pronunciation audio for a word or phrase. This post shows how to download a single audio clip, and then how to download clips in batches.

Approach

In the browser we click the voice button on the web page and hear the generated audio, so the audio must be delivered over HTTP, and we can find it by watching the network traffic. Right-click a blank area of the page, select "Inspect" (usually at the bottom of the menu) to open the developer tools, and switch to the Network tab. Click the voice playback button, then sort the requests by type and look for media; the audio resources we fetched appear there.
Select one of the gettts requests, right-click it, and copy the link address:
https://fanyi.baidu.com/gettts?lan=en&text=come%20on&spd=3&source=web
Analyzing the link shows that:
lan selects British or American English,
text is the text you entered,
%20 is the URL encoding of a space in the text,
spd is the speed of the generated speech, and
source is the source of the request.
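
As a minimal sketch of how these parameters fit together, the link can be assembled from its parts with the standard library. The helper name build_tts_url and its default values are assumptions made for illustration; only the endpoint and the four parameter names come from the copied link.

```python
from urllib.parse import urlencode, quote

# Endpoint taken from the copied link above
BASE_URL = "https://fanyi.baidu.com/gettts"

def build_tts_url(text, lan="en", spd=3):
    """Return the gettts URL for one phrase (hypothetical helper)."""
    params = {"lan": lan, "text": text, "spd": spd, "source": "web"}
    # quote_via=quote encodes spaces as %20 instead of '+'
    return BASE_URL + "?" + urlencode(params, quote_via=quote)

print(build_tts_url("come on"))
# prints https://fanyi.baidu.com/gettts?lan=en&text=come%20on&spd=3&source=web
```

Letting urlencode do the escaping also covers characters other than spaces, which a plain string replacement would miss.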

If we paste the link into the address bar, the browser downloads the mp3 of the audio automatically. In other words, by substituting the text in the link and handling a few details, we can download audio in batches.

Code

The following is a simple implementation. It reads text line by line from a txt file, sets the speed and voice type, and saves each result in the current directory as the text followed by .mp3.

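import re
import urllib.request

# Read one word or phrase per line, skipping blank lines
with open('test.txt') as file:
    phrases = [line.strip() for line in file if line.strip()]

spd = '2'   # speech speed; the default is 3, and smaller numbers are slower
lan = 'uk'  # 'uk' for British English, 'en' for American English

for phrase in phrases:
    # Replace each run of whitespace with %20 so the text is valid in a URL
    text = re.sub(r"\s+", "%20", phrase)
    print(text)
    url = "https://fanyi.baidu.com/gettts?lan=" + lan + "&text=" + text + "&spd=" + spd + "&source=web"
    # Save the audio as "<phrase>.mp3" in the current directory
    urllib.request.urlretrieve(url, phrase + ".mp3")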

For example, prepare a test file test.txt with one word or phrase per line.
Copy and save the code above as xxx.py, then run python xxx.py on the command line. One .mp3 file per line of test.txt is generated in the current directory.
A large number of downloads in a short period may get your IP blocked, after which the server stops responding. You can add a sleep statement to slow execution down and set a User-Agent header to simulate browser behavior. These are basic crawler techniques and are not covered in detail here.
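
The two precautions above might look roughly like the sketch below. The User-Agent string, the one-second delay, and the helper names build_request and download_all are all assumptions chosen for illustration; whether the server actually requires or honors them has not been verified here.

```python
import time
import urllib.request
from urllib.parse import urlencode, quote

# Assumed value: any common browser User-Agent string could be used here
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

def build_request(text, lan="uk", spd=2):
    """Build a gettts request that carries a browser-like User-Agent."""
    params = {"lan": lan, "text": text, "spd": spd, "source": "web"}
    url = "https://fanyi.baidu.com/gettts?" + urlencode(params, quote_via=quote)
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})

def download_all(phrases, delay=1.0):
    """Download each phrase to '<phrase>.mp3', pausing between requests."""
    for phrase in phrases:
        with urllib.request.urlopen(build_request(phrase)) as resp:
            data = resp.read()
        with open(phrase + ".mp3", "wb") as out:
            out.write(data)
        # Pause so the requests are spread out over time
        time.sleep(delay)
```

Calling download_all(["come on", "hello"]) would then fetch the two clips about one second apart; a longer delay is the safer choice for long word lists.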

Origin blog.csdn.net/weixin_43945848/article/details/129852265