An article teaches you to download Kugou Music using a Python web crawler

[1. Project background]

Nowadays, music apps charge for all sorts of things. You have to download the app just to listen, and once you have it you discover that the songs cost money too. For someone like me who has always enjoyed getting things for free, that is very sad. So I thought hard about it and finally figured out the trick. Let's take a look.

[2. Project preparation]

1. Editor: Sublime Text 3

2. Software: 360 browser

[3. Project goals]

Download the music we like.

[4. Project implementation]

1. Open the official website of Kugou Music

Open the official website of Kugou Music in 360 Browser.

The page has a clean, refreshing design, which is what I like most about it.

2. Inspect elements and analyze requests

Open the Network tab in the browser's developer tools and look at the request the page sends.

The Network tab shows the parameters of the request, so we can use the Requests library to send the same request ourselves.

3. Simulate the request

We learned from the webpage that its address is:

https://www.kugou.com/yy/html/search.html#searchType=song&searchKeyWord=%E4%B8%8D%E8%B0%93%E4%BE%A0
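
The keyword in this address is percent-encoded UTF-8. As a small sketch using only the standard library, the part after `#` can be split into its fields and decoded. The variable names here are illustrative and not part of the original script.

```python
from urllib.parse import urlsplit, parse_qs, quote

page_url = ("https://www.kugou.com/yy/html/search.html"
            "#searchType=song&searchKeyWord=%E4%B8%8D%E8%B0%93%E4%BE%A0")

# Everything after '#' is the URL fragment; here it is laid out like a query string.
fragment = urlsplit(page_url).fragment
fields = parse_qs(fragment)            # percent-decoding (UTF-8) happens here
keyword = fields["searchKeyWord"][0]
print(fields["searchType"][0], keyword)

# Going the other way: percent-encode a keyword for use in such an address.
encoded = quote(keyword)
print(encoded)
```

Decoding the value this way makes it easy to check which song title a copied address actually refers to.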

You can see that the only part that really matters to us is the value after the searchKeyWord parameter. The searchType parameter before it can keep its default value, so we can do this:

import requests
headers={
'accept': '*/*',
'accept-encoding':'gzip, deflate, br',
'accept-language': 'zh-CN,zh;q=0.9',
'cookie': 'kg_mid=ebb2de813317a791bcf7b7d3131880c4; UM_distinctid=1722ba8b22632d-07ac0227c507a7-4e4c0f20-1fa400-1722ba8b2284a1; kg_dfid=0Q0BEI47P4zf0mHYzV0SYbou; kg_dfid_collect=d41d8cd98f00b204e9800998ecf8427e; Hm_lvt_aedee6983d4cfc62f509129360d6bb3d=1590041687,1590280210,1590367138,1590367386; Hm_lpvt_aedee6983d4cfc62f509129360d6bb3d=1590367431',
'referer': 'https://www.kugou.com/yy/html/search.html',
'sec-fetch-mode': 'no-cors',
'sec-fetch-site': 'same-site',
'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}
aa=input('Enter a song title: ')
data={
'callback': 'jQuery112408716317197794392_1590368232677',
'keyword':aa,
'page': '1',
'pagesize':'30',
'userid':'-1',
'clientver': '',
'platform': 'WebFilter',
'tag': 'em',
'filter': '2',
'iscorrection': '1',
'privilege_filter': '0',
'_': '1590368232679',
}
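# Send the browser-like headers together with the query parameters, then print the final address
rep=requests.get('https://www.kugou.com/yy/html/search.html',headers=headers,params=data,timeout=4)
print(rep.url)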

That completes the simulated request. Let's verify it by running the script.

As you can see, it prints the same address we saw above.
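
To check an address without sending anything over the network, Requests can prepare the request and report the URL it would use. The sketch below is a minimal illustration with a hypothetical helper name and a shortened parameter set. Note that Requests appends `params` after `?` as a query string, which is not the same thing as the `#` fragment in the browser's address bar, so the two addresses are worth comparing carefully.

```python
import requests

def build_search_url(keyword):
    """Return the URL that Requests would send, without touching the network."""
    # Shortened, illustrative parameter set; the full script sends more fields.
    params = {"keyword": keyword, "page": "1", "pagesize": "30"}
    req = requests.Request(
        "GET", "https://www.kugou.com/yy/html/search.html", params=params
    )
    return req.prepare().url

url = build_search_url("不谓侠")
print(url)
```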

4. Get a list of music files

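# Build the address from the parameters, then request that address again and print the body
rep=requests.get('https://www.kugou.com/yy/html/search.html',headers=headers,params=data,timeout=5)
print(rep.url)
res=requests.get(rep.url,headers=headers,timeout=4)
print(res.text)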

Even with the request address filled in correctly, the returned content did not match my expectations, although the requested address itself checked out.

I expected a JSON result, but what actually came back was something quite different. The gap is large: the response cannot be parsed as JSON because it is not JSON at all, which makes this look harder than QQ Music. Since today's goal is to download an audio file, we set this problem aside for now.
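
One possible reason a response fails to parse as JSON is that it is JSONP, that is, JSON wrapped in a callback call. The `callback` parameter in the request suggests this, but it is only an assumption and has not been confirmed for this site. Under that assumption, a minimal sketch for unwrapping such a response looks like the following. The sample payload is made up.

```python
import json
import re

def parse_jsonp(text):
    """Strip a callback(...) wrapper and parse the JSON inside.

    Raises ValueError if the text is not shaped like JSONP.
    """
    match = re.fullmatch(r"\s*[\w$.]+\s*\((.*)\)\s*;?\s*", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("response is not JSONP")
    return json.loads(match.group(1))

# Made-up sample, only to show the shape of a JSONP response.
sample = 'jQuery112408716317197794392_1590368232677({"status": 1, "data": {"total": 0}})'
result = parse_jsonp(sample)
print(result["status"])
```

If the body is an HTML page instead, `parse_jsonp` raises `ValueError`, which is a quick way to tell the two cases apart.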

5. Download audio files

From the list that appears after searching, select the original version of the song and open it to listen.

I chose the first song. With the playback page open, the real work begins: open the Network tab.

Type mp3 into the filter box, locate the matching request, and open its response. You will see a JSON result.

Pasting the JSON result into the console shows that one section of it contains the mp3 address. A few interfering characters have been added to that address, so we strip them out and extract the link.

With that link, we can download songs from Kugou Music.
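
The extraction step can be sketched as follows. This assumes the interfering characters are the backslashes that JSON uses to escape slashes (`\/`), which `json.loads` removes automatically. The response body and its field names (`play_url`, `song_name`) are made up for illustration, and the real ones should be read from the Network tab.

```python
import json

# Made-up response body; the real field names must be read from the Network tab.
raw = ('{"status": 1, "data": {"song_name": "demo", '
       '"play_url": "https:\\/\\/example.com\\/audio\\/demo.mp3"}}')

def find_mp3_urls(node):
    """Recursively collect every string value that looks like an mp3 link."""
    if isinstance(node, dict):
        return [u for v in node.values() for u in find_mp3_urls(v)]
    if isinstance(node, list):
        return [u for v in node for u in find_mp3_urls(v)]
    if isinstance(node, str) and node.startswith("http") and ".mp3" in node:
        return [node]
    return []

payload = json.loads(raw)      # json.loads turns "\/" back into "/"
mp3_urls = find_mp3_urls(payload)
print(mp3_urls)
```

Searching the whole structure avoids hard-coding a field path that may differ from what the site actually returns.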

[5. Project summary]

1. Kugou Music is actually different from QQ Music: its download link is easier to capture. You can grab it directly from the playback page.

Simulate a request to this interface and you are done.
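
Once an mp3 link is in hand, saving it to disk might look like the sketch below. The function names are hypothetical, and the code assumes the link can be fetched with a plain GET request, which may not hold if the site requires cookies or signed parameters.

```python
import requests

def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    written = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            if chunk:                  # skip empty keep-alive chunks
                fh.write(chunk)
                written += len(chunk)
    return written

def download_mp3(url, path, headers=None):
    """Stream the audio file at url to disk instead of loading it all into memory."""
    with requests.get(url, headers=headers, stream=True, timeout=10) as resp:
        resp.raise_for_status()        # fail loudly on 4xx/5xx responses
        return save_chunks(resp.iter_content(chunk_size=8192), path)
```

A call such as `download_mp3("https://example.com/audio/demo.mp3", "demo.mp3")` would then write the file next to the script. The address in that call is a placeholder.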

2. Regarding the acquisition of QQ Music, please refer to the series of articles previously published:

3. If you need the source code for this article, you can get it by replying "kugou music" in the background.
If you want to learn more about Python web crawlers and data mining, you can go to the professional website: http://pdcfighting.com/

Origin blog.51cto.com/13389043/2540882