Watch it, Mina Sang!
After the B station dug the hole for a month to no avail, or intend to make any point B station, so I'm eyeing up their barrage system! Yes that is the veritable B bomb!
Yes We want to crawl station B Barrage video today!
Crawling station B Barrage said simple implementation difficult, you just have to find barrage of video interfaces api
But also to master a series of expressions and programming syntax
but
I do not panic, as did a number of 996 patients into the ICU (mistakenly)
Let's start coding now!
Language: Python3.7
Our first stop barrage of B an understanding of the system before you start writing
We first use of technical means that station B alone system separate barrage
Then open the review elements api interface to query
Here I would like to send a barrage
You can see the network to capture the way the request of a post
After opening the gradual Query
Successfully found the interface!
In fact on GitHUb bilibili the api interfaces it has been published
But the authors believe it is more reliable to look a little manual
We can see the barrage is to be crawled returned interface
We open Python, import requests library
re library!
Note crawling barrage to use regular expressions to I will explain later
We first imported into several modules
import requests as req
import re
Then start writing cycle, after all, can not climb on it only once
import requests as req
import re
while True:
video_number = input("av:")
Conditions trigger video_number value here is very important so it is necessary to add a conditional
import requests as req
import re
while True:
video_number = input("av:")
if video_number == int:
print("int type");
else:
The next character is equivalent to stitching, I am not here explained ...
import requests as req
import re
while True:
video_number = input("av:")
if video_number == int:
print("int type");
else:
api_key = "https://api.bilibili.com/x/v1/dm/list.so?oid="+video_number
repon = req.get(api_key) #获取api
repon = encoding = "utf-8" #将编码转换为utf-8
xml = res.text
Basically respond successfully
Then we started writing regular expressions
The xml data look under
From <d we start at the beginning of the intercepted> Then we want to obtain is the intermediate value
Back to keep </ d> specific code and variables xml
re.findall("<d.*?>(.*?)</d>",xml)
Next, we then assign its variables, print look
import requests as req
import re
while True:
avnumber = input("av:")
if avnumber == int:
print("OK")
else:
urlapi = "https://api.bilibili.com/x/v1/dm/list.so?oid="+avnumber
res = req.get(urlapi)
res.encoding = "utf-8"
xmlString = res.text
danmukus = re.findall("<d.*?>(.*?)</d>",xmlString)
Ah ~ more comfortable
Bye