Teach you how to batch-download your favorite music with Python

Foreword

The text and images in this article come from the internet and are for learning and exchange only, not for any commercial use. Copyright belongs to the original author; if you have any concerns, please contact us so they can be addressed.


Music is the spice of life, but a lot of music today can only be streamed, not downloaded. As technically minded people, how could we settle for that?

Knowledge points:

  1. requests

  2. Regular Expressions

 

Development Environment:

  1. Version: Anaconda 5.2.0 (Python 3.6.5)

  2. Editor: PyCharm

 

Third-party libraries:

  1. requests

  2. parsel

Web page analysis

Target site: http://music.taihe.com/search?key=%E9%99%88%E7%B2%92

Finding the real address of the music

Take any song by Chen Li (陈粒) as an example


Open the developer tools, select Network -> Media, and refresh the page to capture the real address of the music.

However, this address does not appear anywhere in the page source, so Baidu Music must be hiding it. This usually happens in one of two ways: either the link is assembled (or the request encrypted) by JavaScript, or the data is hidden somewhere else. Since we do not know which case applies here, we can only work through the network requests one by one. After some analysis, we can see that the real address of the music is returned by this API:

http://musicapi.taihe.com/v1/restserver/ting?method=baidu.ting.song.playAAC&format=jsonp&callback=jQuery17206453751179783578_1544942124991&songid=243093242&from=web&_=1544942128336
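To make the structure of that API address easier to see, here is a small sketch that splits it into its query parameters using only the standard library. It makes no network request; the URL is the one captured above, and what each parameter means on the server side is not something this snippet can confirm.

```python
from urllib.parse import urlsplit, parse_qs

# The API address captured in the developer tools (copied from the article).
api_url = (
    "http://musicapi.taihe.com/v1/restserver/ting"
    "?method=baidu.ting.song.playAAC&format=jsonp"
    "&callback=jQuery17206453751179783578_1544942124991"
    "&songid=243093242&from=web&_=1544942128336"
)

# parse_qs maps each parameter name to a list of its values.
params = parse_qs(urlsplit(api_url).query)

for name, values in params.items():
    print(f"{name} = {values[0]}")
```

Listing the parameters this way makes it easy to try removing them one at a time and see which ones the request actually depends on.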

Requesting this API returns JSON data (which corresponds to Python's dictionary type). Using ordinary dictionary access, we can extract all the data we need.
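As a rough illustration of the dictionary access involved, the sketch below parses a made-up, heavily trimmed response. Only the key names (`songinfo`, `title`, `bitrate`, `file_link`) are taken from the article's complete code; the values and the overall shape of the payload are assumptions for demonstration.

```python
import json

# Hypothetical, trimmed-down response body. The real API returns many more
# fields; only the key names used here come from the article's code.
sample_response = (
    '{"songinfo": {"title": "Example Song"},'
    ' "bitrate": {"file_link": "http://example.com/audio.mp3"}}'
)

# json.loads turns the JSON text into nested Python dictionaries.
data = json.loads(sample_response)

music_name = data["songinfo"]["title"]
music_url = data["bitrate"]["file_link"]
print(music_name, music_url)
```

With a live response, `requests` offers `response.json()`, which performs the same parsing step on the response body.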

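Building the URL to get all the data

Now that we have the real address of the music, the next step is to analyze its URL in the hope of finding the trick to downloading all of the music. A close look at the URL shows that the from and _ parameters after the ? can be left out without affecting the request.

The songid parameter is the song's unique ID, and the from parameter simply indicates which platform the request came from.

So when we download the music later, we only need to collect the songid values in bulk to download all of the songs.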
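The URL construction described above might be sketched as follows. The helper name `build_api_url` is hypothetical, and the choice to keep only `method`, `format`, and `songid` follows the article's observation that `from` and `_` are optional; whether the server accepts this exact parameter set has not been verified here.

```python
from urllib.parse import urlencode

# Endpoint taken from the API address in the article.
API_BASE = "http://musicapi.taihe.com/v1/restserver/ting"


def build_api_url(songid):
    """Build the API URL for one song (hypothetical helper)."""
    params = {
        "method": "baidu.ting.song.playAAC",
        "format": "jsonp",
        "songid": songid,  # the song's unique ID
    }
    # urlencode joins the parameters as key=value pairs separated by '&'.
    return f"{API_BASE}?{urlencode(params)}"


for sid in ["243093242"]:
    print(build_api_url(sid))
```

Keeping the parameters in a dictionary, rather than in one long string, makes it straightforward to add `from` back if the server turns out to need it.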

 

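Getting songid values in bulk

Using the developer tools to view the page source, we can see where the songid values are located. If you analyze the URL of an artist page, you will find that it can be constructed in the same way.

That completes the analysis of the web page.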
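The extraction step can be tried offline against a small HTML fragment. The fragment below is invented for illustration (the real page markup may differ); the regular expression is the one used in the article's complete code. Removing duplicates is an extra step assumed here, since a page can link to the same song more than once.

```python
import re

# Made-up HTML fragment standing in for an artist page.
sample_html = """
<a href="/song/243093242" title="Song A">Song A</a>
<a href="/song/100575177" title="Song B">Song B</a>
<a href="/artist/2517">Artist</a>
<a href="/song/243093242">Song A (repeat)</a>
"""

# Capture the digits that follow /song/ in every matching link.
found = re.findall(r'href="/song/(\d+)"', sample_html)

# dict.fromkeys drops duplicates while keeping first-seen order.
unique_ids = list(dict.fromkeys(found))
print(unique_ids)
```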

Result


Complete code

import re
import requests


def get_songid():
    """Collect the songid of every song linked from the artist page."""
    url = 'http://music.taihe.com/artist/2517'
    response = requests.get(url=url)
    html = response.text
    sids = re.findall(r'href="/song/(\d+)"', html)
    return sids


def get_music_url(songid):
    """Return (title, download link) for a song, or None if the lookup fails."""
    api_url = f'http://musicapi.taihe.com/v1/restserver/ting?method=baidu.ting.song.playAAC&format=jsonp&songid={songid}&from=web'
    response = requests.get(api_url)
    try:
        data = response.json()
        music_name = data['songinfo']['title']
        music_url = data['bitrate']['file_link']
        return music_name, music_url
    except Exception as e:
        # Missing keys or a non-JSON response: report it and skip this song
        print(e)
        return None


def download_music(music_name, music_url):
    """Download the music."""
    response = requests.get(music_url)
    content = response.content
    save_file(music_name + '.mp3', content)


def save_file(filename, content):
    """Save the music to disk."""
    with open(file=filename, mode="wb") as f:
        f.write(content)


if __name__ == "__main__":
    for song_id in get_songid():
        result = get_music_url(song_id)
        if result is None:
            continue  # skip songs whose link could not be retrieved
        music_name, music_url = result
        download_music(music_name, music_url)
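One practical detail the script leaves open is that a song title is used directly as a file name. The sketch below shows one possible way to make a title safe before saving. The helper `safe_filename` is hypothetical and not part of the original code, and the set of replaced characters is an assumption based on what Windows file systems reject.

```python
import re


def safe_filename(title, extension=".mp3"):
    """Replace characters that are commonly invalid in file names (hypothetical helper)."""
    # Swap each of \ / : * ? " < > | for an underscore.
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    # Fall back to a placeholder if nothing usable is left.
    return (cleaned or "untitled") + extension


print(safe_filename("AC/DC: Live?"))
```

The result of a call such as `safe_filename(music_name)` could be passed to `save_file` in place of `music_name + '.mp3'`.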

 



Origin www.cnblogs.com/qun821460695/p/11830206.html