Article Directory
1. The page to crawl
Today I'll introduce a first small crawler example: using the requests library to fetch Taobao product information. The page we want to crawl looks like this:
2. Code analysis
Let's walk through the code step by step in an interactive environment.
(1) First, import the requests library and save the URL of the page in a variable.
(2) Use the get() method of the requests library to request the URL, then call r.raise_for_status() to check that the request succeeded. If the status code is 200 (success), the call returns normally; for any error status (4xx or 5xx) it raises an HTTPError exception.
(3) Assign the encoding detected from the page content (r.apparent_encoding) to r.encoding, then print the crawled page as text.
(4) The output content is as follows
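To see steps (2) and (3) in action without depending on Taobao (which may block plain requests or require login), here is a self-contained sketch that serves a tiny UTF-8 page from a local HTTP server; the server, handler name, and page content are purely illustrative.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

class PageHandler(BaseHTTPRequestHandler):
    """Serve a small UTF-8 page; the Content-Type header deliberately
    omits the charset, so the client must work out the encoding itself."""
    def do_GET(self):
        body = "<html>商品标题: 测试商品 (product title: test item)</html>".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")  # note: no charset
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), PageHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

r = requests.get(f"http://127.0.0.1:{server.server_port}/")
r.raise_for_status()              # raises HTTPError for 4xx/5xx; returns None on success
r.encoding = r.apparent_encoding  # use the encoding guessed from the page content
print(r.status_code)              # 200
print(r.text)
server.shutdown()
```

Without the apparent_encoding step, requests falls back to ISO-8859-1 for text/* responses that declare no charset, which would garble the Chinese characters here.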
3. Complete code
import requests

url = "https://item.taobao.com/item.htm?id=625588903252&ali_refid=a3_430673_1006:1123185872:N:5Li%2BA5zGU7Aqz5docyZENQ%3D%3D:6620fa14ff820a1fe33c8d19bbbd1752&ali_trackid=1_6620fa14ff820a1fe33c8d19bbbd1752&spm=a2e15.8261149.07626516002.2"
try:
    r = requests.get(url)
    r.raise_for_status()              # raise HTTPError if the status is not a success code
    r.encoding = r.apparent_encoding  # decode with the encoding detected from the content
    print(r.text)
except:
    print("Crawl failed")  # 爬取失败
In any programming practice, what the code produces matters, but its robustness matters even more, so we wrap the request in a try/except block to catch exceptions.
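A bare except hides the cause of failure. A slightly more robust sketch (the function name, timeout value, and test URL below are illustrative) catches requests' own exception hierarchy and reports what actually went wrong:

```python
import requests

def fetch(url: str) -> str:
    """Fetch a page and return its text, or an error message on failure."""
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()              # HTTPError on 4xx/5xx
        r.encoding = r.apparent_encoding
        return r.text
    except requests.exceptions.RequestException as exc:
        # RequestException covers ConnectionError, Timeout, HTTPError, etc.
        return f"Crawl failed: {exc}"

# .invalid is a reserved TLD that never resolves, so this always fails cleanly
print(fetch("http://example.invalid/")[:80])
```

Catching requests.exceptions.RequestException rather than everything means genuine bugs (a typo, a KeyboardInterrupt) still surface instead of being silently reported as a failed crawl.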
That's the end of this article; please point out any errors!
Reference:
中国大学MOOC, Python网络爬虫与信息提取 (Python Web Crawling and Information Extraction)
https://www.icourse163.org/course/BIT-1001870001