Python web crawler beginner example 2: crawling an Amazon product page

1. The page to crawl

  The page to crawl is an Amazon product page.

[Figure: screenshot of the Amazon product page]

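2. Common pitfalls

  Amazon checks where a request comes from, and it may reject requests that identify themselves as coming from the requests library (by default, requests sends a user-agent of the form python-requests/<version>). To crawl the page above, you therefore need to change the request header information, that is, headers: use a dictionary to construct a key-value pair that sets user-agent to a browser-like value.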

kv = {'user-agent': 'Mozilla/5.0'}

  For a detailed explanation, please refer to my earlier article:

Link: https://blog.csdn.net/weixin_44578172/article/details/109302571
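To make the effect of the headers dictionary visible, here is a minimal sketch that prepares the request twice without sending it, once with the library defaults and once with kv, and prints the user-agent each would carry. The names session, default_req, and custom_req are only for illustration and are not from the original article.

```python
import requests

url = "https://www.amazon.cn/gp/product/B01M8L5Z3Y"
kv = {'user-agent': 'Mozilla/5.0'}

session = requests.Session()

# Prepare (but do not send) the request twice: once with the library
# defaults, once with the custom header dictionary.
default_req = session.prepare_request(requests.Request('GET', url))
custom_req = session.prepare_request(requests.Request('GET', url, headers=kv))

# Header lookup is case-insensitive, so 'user-agent' in kv overrides
# the default 'User-Agent' entry.
print(default_req.headers['User-Agent'])  # starts with "python-requests/"
print(custom_req.headers['User-Agent'])   # Mozilla/5.0
```

Nothing is sent over the network here; the sketch only shows which header value would go out with each request.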

3. Complete code

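import requests

url = "https://www.amazon.cn/gp/product/B01M8L5Z3Y"
try:
    # Build the request headers as a dictionary so that Mozilla/5.0
    # replaces the default user-agent sent by requests
    kv = {'user-agent': 'Mozilla/5.0'}
    r = requests.get(url, headers=kv, timeout=30)
    r.raise_for_status()  # raise an exception for 4xx/5xx responses
    r.encoding = r.apparent_encoding  # use the encoding detected from the content
    print(r.text[:1000])  # show the first 1000 characters
except requests.RequestException:
    print("Crawl failed")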

If the request succeeds, the code above prints the first 1000 characters of the page's HTML.

[Figure: screenshot of the crawling results]
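As a variation, the same steps can be wrapped in a small reusable function. This is only a sketch: the function name fetch_page_text, its parameters, and the 30-second timeout are assumptions for illustration, not part of the original code.

```python
import requests


def fetch_page_text(url, user_agent="Mozilla/5.0", timeout=30):
    """Return the decoded page text, or None if the request fails.

    Hypothetical helper for illustration; not part of the original article.
    """
    headers = {"user-agent": user_agent}
    try:
        r = requests.get(url, headers=headers, timeout=timeout)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, HTTP error statuses,
        # and malformed URLs.
        return None


# A malformed URL fails before any network traffic, which exercises
# the failure path without contacting a server.
print(fetch_page_text("not-a-valid-url"))
```

Returning None instead of printing a message lets the caller decide how to handle a failure. For the product page, the call would be fetch_page_text("https://www.amazon.cn/gp/product/B01M8L5Z3Y").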
  That is the end of this article. Please point out any errors~

Reference

China University MOOC (中国大学MOOC), Python Web Crawling and Information Extraction (Python网络爬虫与信息提取)
https://www.icourse163.org/course/BIT-1001870001

Origin blog.csdn.net/weixin_44578172/article/details/109323613