Added User-Agent, but the request still returns HTTP 418 <Response [418]>

Project scenario:

I recently started learning web data collection with Python. When I used the requests library to fetch a web page, the returned HTTP status was still 418, even though I had added a user-agent to the headers.


Problem Description

Most websites now have some kind of anti-crawler mechanism, so when collecting web page data with Python requests you need to send browser-like headers. Otherwise the request is easily recognized by the site's anti-crawler mechanism and rejected, for example with status code 418.

However, when I used Python requests to fetch the page, I still could not get the results even though I had added a user-agent to the headers. Printing res showed that the returned status was <Response [418]>. Part of the code is shown below:

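import requests

def get_data(n):
    base_url = 'https://book.douban.com/top250'
    headers = {
        # Note the spaces around the hyphen in the header name: this is the problematic line
        'User - Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
    }
    params = {
        'start': (n - 1) * 25  # offset of the first entry on page n (25 per page)
    }
    res = requests.get(base_url, headers=headers, params=params)
    print(res)

get_data(1)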

Cause Analysis:

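After checking the code repeatedly, I found that the problem was in the headers. The header name in my code is 'User - Agent', with spaces around the hyphen, instead of 'User-Agent'. This happened because, for convenience, I copied the user-agent directly from the browser's network panel, as the picture shows. Since 'User - Agent' is a different header name from 'User-Agent', the server does not read it as the user-agent, and requests still sends its own default User-Agent (python-requests/<version>), which the site's anti-crawler mechanism rejects.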
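One way to confirm this kind of problem without sending anything is to prepare the request and print the headers that would go out. The following is a minimal sketch, assuming a reasonably recent version of requests; the shortened user-agent value is only a placeholder for illustration.

```python
import requests

# Same URL as in the post; the request is only prepared, never sent.
url = 'https://book.douban.com/top250'

# Header name with the accidental spaces; the value is a shortened placeholder.
bad_headers = {'User - Agent': 'Mozilla/5.0'}

session = requests.Session()
prepared = session.prepare_request(
    requests.Request('GET', url, headers=bad_headers)
)

# Print every header that would actually be sent.
for name, value in prepared.headers.items():
    print(f'{name}: {value}')
```

In the printed list, the misspelled 'User - Agent' entry should appear next to a separate 'User-Agent' entry starting with python-requests, which is the one the server actually reads.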

 


Solution:

After deleting the spaces so that the header name reads 'User-Agent', the request finally returned <Response [200]>.
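For reference, the corrected version might look like the sketch below. It keeps the URL and user-agent string from the code above. The module-level constants, the timeout argument, and the return value are additions for illustration, not part of the original code.

```python
import requests

BASE_URL = 'https://book.douban.com/top250'

# Header name written exactly as 'User-Agent', with no spaces around the hyphen.
HEADERS = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/104.0.0.0 Safari/537.36'),
}


def get_data(n):
    """Fetch page n of the list (25 entries per page) and return the response."""
    params = {'start': (n - 1) * 25}
    # timeout is an added safeguard so a stalled connection does not hang forever.
    res = requests.get(BASE_URL, headers=HEADERS, params=params, timeout=10)
    print(res)
    return res
```

Calling get_data(1) sends the request for the first page. Whether the site answers with 200 still depends on its current anti-crawler rules.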

Summary: When writing code, we often run into problems caused by taking shortcuts. Such problems are small, but they can be hard to spot.
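A small check can catch this class of mistake before any request is sent. The sketch below uses only the standard library. The helper name find_bad_header_names is hypothetical, and the pattern follows the HTTP "token" character set for header names, which does not allow spaces.

```python
import re

# Characters allowed in an HTTP header name (the "token" rule); no spaces.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def find_bad_header_names(headers):
    """Return the header names that are not valid HTTP tokens."""
    return [name for name in headers if not _TOKEN.fullmatch(name)]


headers = {
    'User - Agent': 'Mozilla/5.0',  # accidental spaces around the hyphen
    'Accept': '*/*',
}
print(find_bad_header_names(headers))  # ['User - Agent']
```

Running such a check on a headers dict copied from elsewhere makes stray whitespace visible immediately, instead of showing up later as an unexplained 418.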

Origin blog.csdn.net/weixin_45913327/article/details/126563580