Crawling Kugou (Cool Dog) Music with Python and beautifulsoup4

Disclaimer: This article is for technical exchange only; please do not use it for any other purpose.
I often listen to music online, but many sites charge for downloads. Since I happen to know a little about writing crawlers, I wrote one in my spare time. As of the end of April it still worked without problems. Songs are downloaded into the current directory, and all you need is the bs4 library.
Installation: pip install beautifulsoup4
The script also imports requests and uses the lxml parser, so install those as well if they are missing: pip install requests lxml
The complete code follows; you can run the script directly by double-clicking it:

from bs4 import BeautifulSoup
import requests
import re
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36'
}
url='https://songsearch.kugou.com/song_search_v2?&page=1&pagesize=30&userid=-1&clientver=&platform=WebFilter&tag=em&filter=2&iscorrection=1&privilege_filter=0&_=1555124510574'
# To crawl a different page, just change this JSON data URL
r=requests.get(url,headers=headers)
soup=BeautifulSoup(r.text,'lxml')
title_list=soup.select('.pc_temp_songlist ul li')
hash=re.findall(r',"FileHash":"(.*?)"',r.text)
hash1=re.findall(r',"FileName":"(.*?)"',r.text)
# Match the hidden data directly with regular expressions
print(hash)
print(hash1)
q=0
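for url in hash:
    url_a=f'https://wwwapi.kugou.com/yy/index.php?r=play/getdata&callback=jQuery1910212680783679835_1555073815772&hash={url}&album_id=18784389'
    # This URL does not need to be changed
    c=requests.get(url_a,headers=headers)
    # Strip the JSONP wrapper: the callback name and "(" at the start (40 characters) and the closing characters at the end
    a=c.text[40:-3]
    b=re.findall('"play_url":"(.*)","authors":',a)[0]
    # Remove the backslashes that escape the slashes in the URL
    b1=re.sub(r"\\",'',b)
    f = requests.get(b1)
    with open(hash1[q]+'.mp3','wb')as d:
        d.write(f.content)
    print(hash1[q])
    q+=1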
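
The script above removes the JSONP wrapper with a fixed slice (`c.text[40:-3]`), which only works while the callback name keeps the same length. A minimal sketch of a less brittle alternative is shown below. It assumes the response has the form `callback({...});` and that the payload is valid JSON. The helper names `unwrap_jsonp` and `find_key` and the sample string are hypothetical and are not part of the original script.

```python
import json
import re


def unwrap_jsonp(text):
    """Return the parsed JSON payload of a JSONP response like callback({...});"""
    match = re.search(r"^[^(]*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    if match is None:
        raise ValueError("response does not look like JSONP")
    return json.loads(match.group(1))


def find_key(node, key):
    """Depth-first search for the first non-None value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Hypothetical sample for illustration only, not a real server response.
sample = 'jQuery123_456({"data":{"play_url":"https:\\/\\/example.com\\/a.mp3","authors":[]}});'
payload = unwrap_jsonp(sample)
play_url = find_key(payload, "play_url")
print(play_url)
```

Because `json.loads` decodes the escaped slashes (`\/`) itself, the separate `re.sub` step for removing backslashes would not be needed with this approach.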

The only difficulty in crawling Kugou is getting the hash value, which took me more than an hour to track down. One advantage over NetEase Cloud Music is that you do not have to generate the hash yourself: Kugou includes it in its own responses, whereas on NetEase Cloud it has to be produced by a function.
That concludes this short introduction to using Python to download MP3 files from the Kugou Music top500. I hope it helps. If you have any questions, please leave me a message and I will reply promptly. Thank you as well for your support of this site!
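
As a rough illustration of reading the hash out of the search response, the sketch below collects each `FileName` together with its `FileHash` from parsed JSON, so that names and hashes cannot drift out of step the way two separate regex result lists indexed by a counter can. It assumes the search response is plain JSON (a JSONP-wrapped response would need unwrapping first). The function name `collect_songs` and the sample data, including its nesting, are invented for illustration; only the two key names come from the script above.

```python
import json


def collect_songs(node, found=None):
    """Walk parsed JSON and collect (FileName, FileHash) pairs from any dict that has both keys."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if "FileHash" in node and "FileName" in node:
            found.append((node["FileName"], node["FileHash"]))
        for value in node.values():
            collect_songs(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_songs(item, found)
    return found


# Hypothetical sample; the structure and values are made up for this example.
sample = json.loads("""
{"data": {"lists": [
    {"FileName": "Artist A - Song One", "FileHash": "AAA111"},
    {"FileName": "Artist B - Song Two", "FileHash": "BBB222"}
]}}
""")

for name, file_hash in collect_songs(sample):
    print(name, file_hash)
```

In the original script, the resulting pairs could replace the `hash` and `hash1` lists and the counter `q`.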

Origin www.cnblogs.com/py-wensong/p/11983405.html