Crawling NetEase Cloud Music lyrics by song id with Python

When you use requests to crawl NetEase Cloud Music lyrics by fetching the page URL directly and parsing it with bs4, the lyrics you want are not there. This is because the lyrics are not embedded in the page itself; they are loaded through an internal Ajax request. We can therefore call NetEase Cloud Music's internal API to get the response, parse it with json, and extract the lyrics.
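The parsing step can be sketched in isolation, without touching the network. The payload below is a hand-written stand-in: only the "lrc" -> "lyric" nesting comes from this article, while the sample text and the helper name extract_lyric are assumptions made for illustration.

```python
import json

# Hand-written stand-in for the API response body. Only the
# "lrc" -> "lyric" nesting is taken from the article; the text is made up.
sample_response = '{"lrc": {"lyric": "[00:00.718]first line\\n[00:05.316]second line"}}'


def extract_lyric(response_text):
    """Return the LRC text from a JSON response body, or None if it is absent."""
    payload = json.loads(response_text)
    # Assumption: some responses may lack the "lrc" entry, so fall back to {}.
    lrc = payload.get("lrc") or {}
    return lrc.get("lyric")


print(extract_lyric(sample_response))
```

Returning None instead of raising KeyError is a design choice for this sketch; it lets the caller decide how to handle a song with no lyrics.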

Only the key function is shown below; the surrounding details are left out, since readers of this article should be able to fill them in:

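import json

import requests


def get_lyric(song_id):
    """Return the LRC lyric text for the given NetEase Cloud Music song id."""
    # Headers that make the request look like it comes from the site itself.
    headers = {
        "user-agent": "Mozilla/5.0",
        "Referer": "http://music.163.com",
        "Host": "music.163.com",
    }
    # The id is interpolated as text, so int and str ids both work.
    url = f"http://music.163.com/api/song/lyric?id={song_id}&lv=1&tv=-1"
    # A timeout keeps the call from hanging if the server does not answer.
    r = requests.get(url, headers=headers, timeout=10)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    json_obj = json.loads(r.text)
    # The lyric text sits under "lrc" -> "lyric" in the response.
    return json_obj["lrc"]["lyric"]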

The most important line is the one that builds url: the lyrics come from the http://music.163.com/api/song/lyric endpoint, with the song id passed in the id query parameter alongside lv=1 and tv=-1.
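As a hedged alternative to assembling the query string by hand, the same URL can be built from a dict with the standard library. The helper name build_lyric_url and the constant LYRIC_ENDPOINT are hypothetical; the endpoint and the id, lv and tv parameters are the ones used in this article, whose meaning beyond that is not documented here.

```python
from urllib.parse import urlencode

# Endpoint taken from the article; the constant name is an assumption.
LYRIC_ENDPOINT = "http://music.163.com/api/song/lyric"


def build_lyric_url(song_id):
    """Build the lyric API URL for a song id (int or str)."""
    # urlencode converts each value to text and joins the pairs with "&".
    query = urlencode({"id": song_id, "lv": 1, "tv": -1})
    return f"{LYRIC_ENDPOINT}?{query}"


print(build_lyric_url(1805114540))
# http://music.163.com/api/song/lyric?id=1805114540&lv=1&tv=-1
```

requests.get also accepts a params dict, so the same parameters could be passed there instead of being encoded into the URL first.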

With the function in place, look up the id of the song to crawl (it appears in the URL of the song's page), then call the function with that id as the argument:

lyric = get_lyric(1805114540)
print(lyric)

Output:

[00:00.000] 作词 : ACAね
[00:00.359] 作曲 : ACAね
[00:00.718]正しくなれない 霧が毒をみた
[00:05.316]片っ端から確かめたくて
[00:09.130]考え続けたい
[00:11.037]偽りで出会えた 僕らは何一つも
[00:16.705]奪われてないから
[00:20.049]
[00:27.468]僕ら育ってゆくみたい 愛されるみたい
[00:32.980]暖かな波を読む
[00:35.592]今日を 今を選ぶ 澄んだ朝色
[00:40.895]尋ねる声で何度でも
[00:43.664]僕ら嘘つきだね、両想いだね
[00:48.914]枯れ果てるまで泣き笑い
[00:51.762]今日を 受け入れてゆく
[00:55.001]喜びあった日々を 忘れはしないけど
[01:01.322]知らない方が幸せだって
[01:05.319]知れば 知り得るほど
[01:11.824]正しくなれない 霧が毒をみた
[01:16.421]片っ端から確かめたくて
[01:20.000]考え続けたい
[01:21.959]偽りで出会えた 僕らは何一つも
[01:27.445]奪われてないから
[01:31.755]
[01:35.961]今 心を閉ざさぬように
[01:39.435]腰眈々と 訓練を続けよ
[01:42.544]枯れ木に 笑顔だけ
[01:44.790]君の肉体 本心全て
[01:48.996]無駄になんかさせないよ
[01:52.914]ねぇ、知り得る方が幸せだって
[01:57.773]辿り着いてもいい?
[02:03.938]君だけが見る 夕焼け風鏡
[02:08.771]僕でもいつか 解る日まで
[02:12.375]考え続けたい
[02:14.413]偽りで出会えた 撓る枝分かれよ
[02:19.716]導かれ
[02:21.179]大したもんじゃない 無駄なもんじゃない
[02:24.444]視野は脳裏を 寛大にしていくように
[02:29.094]ずっと もうずっと 茶化されようが
[02:34.162]折れない砂の罠
[02:36.486]可能性が 見逃せるならば
[02:43.801]可能性を 諦められないから
[02:51.063]未だ
[02:54.903]正しくなれない 霧が毒をみた
[03:01.251]片っ端から確かめたくて
[03:04.803]考え続けたい
[03:06.762]偽りで出会えた 僕らは何一つも
[03:13.136]全て嫌われても
[03:16.297]奪われてないから

As you can see, both the timestamps and the lyrics are returned. You can split them apart or process them however you like, then write the result to a file, which completes the crawl.
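One way to do that splitting is sketched below, assuming every lyric line starts with a single [mm:ss.xxx] tag as in the output above. The names parse_lrc and save_lyrics are hypothetical helpers, not part of any library, and lines with a timestamp but no text are skipped.

```python
import re

# Matches a leading [mm:ss.xxx] tag followed by the lyric text.
LRC_LINE = re.compile(r"^\[(\d+):(\d+(?:\.\d+)?)\](.*)$")


def parse_lrc(lyric):
    """Split LRC text into (seconds, text) pairs, skipping empty lines."""
    entries = []
    for line in lyric.splitlines():
        match = LRC_LINE.match(line.strip())
        if not match:
            continue  # not a timestamped line
        minutes, seconds, text = match.groups()
        text = text.strip()
        if text:
            # Round to milliseconds to avoid floating-point noise.
            entries.append((round(int(minutes) * 60 + float(seconds), 3), text))
    return entries


def save_lyrics(entries, path):
    """Write only the lyric text, one line per entry, as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        for _, text in entries:
            f.write(text + "\n")


sample = "[00:00.718]正しくなれない 霧が毒をみた\n[00:20.049]\n[01:05.319]知れば 知り得るほど"
for seconds, text in parse_lrc(sample):
    print(f"{seconds:7.3f}  {text}")
```

In practice the string returned by get_lyric would be passed to parse_lrc, and the resulting list to save_lyrics together with an output path of your choosing.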


Origin blog.csdn.net/weixin_45576923/article/details/113815385