requests module - send requests with request headers

Headers are the metadata of HTTP requests and responses; they carry additional parameters and configuration alongside the body. Sending a request with custom headers gives you finer control over how the server handles it. Here are some common HTTP header fields and what they do:

Header field | Purpose | Example
Authorization | Provides authentication credentials for resources that require permission | Authorization: Bearer <token>
User-Agent | Identifies the type and version of the client sending the request | User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
Content-Type | Specifies the media type of the request or response body | Content-Type: application/json
Accept | Specifies the response content types the client can accept | Accept: application/json
Cookie | Passes session information between client and server | Cookie: session_id=ABC123
Referer | Indicates the address the request came from; servers sometimes check it as a defense against cross-site request forgery | Referer: https://example.com/page1
If-Modified-Since | Asks the server to send the resource only if it changed after the given time; otherwise the client reuses its cached copy, reducing data transfer | If-Modified-Since: Sun, 01 Jan 2023 00:00:00 GMT
X-Custom-Header | A user-defined header field that can carry custom information | X-Custom-Header: custom_value

Note: Header field names are not case-sensitive.
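
As a small illustration of that note, the sketch below uses CaseInsensitiveDict from requests.structures, the dictionary type requests itself uses for response.headers and response.request.headers. The header values are placeholders chosen for this example, not taken from a real request.

```python
from requests.structures import CaseInsensitiveDict

# Placeholder header values, for illustration only
headers = CaseInsensitiveDict({
    "User-Agent": "Mozilla/5.0",
    "Content-Type": "application/json",
})

# Lookups succeed regardless of the casing used for the field name
print(headers["user-agent"])
print(headers["CONTENT-TYPE"])
print("content-type" in headers)

# Iteration still yields the field names with their original casing
print(list(headers))
```

This is why response.request.headers['user-agent'] and response.request.headers['User-Agent'] return the same value.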

Different header fields carry different information, which makes the request and response process more flexible. When you set headers, follow the HTTP specification and make sure the data you send is safe and legitimate.

Let's first write some code that fetches Baidu's homepage:

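# Import the requests library
import requests

# URL to visit
url = 'https://www.baidu.com'

# Send a GET request and get the response
response = requests.get(url)

# Print the response body, decoded from bytes to text (UTF-8 by default)
print(response.content.decode())

# Print the headers of the request that produced this response
print(response.request.headers)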

Think about it

  1. Compare the Baidu homepage source shown in the browser with the source returned to the code. How do they differ?

    • To view a page's source in the browser:
      • Right-click, then "View page source", or
      • Right-click, then "Inspect"
  2. How does the response content for the corresponding URL, as seen in the browser, differ from the Baidu homepage source returned to the code?

    • To view the response content for the URL:
      1. Right-click, then "Inspect"
      2. Click the Network tab
      3. Tick "Preserve log"
      4. Refresh the page
      5. In the Name column, click the entry that matches the URL in the browser's address bar, then open its Response tab
  3. The Baidu homepage source returned to the code is very short. Why?

    • The request needs to carry request header information

      Recall the idea behind a crawler: imitate a browser so that the server returns the same content it would send to a browser.

    • The request header has many fields. Of these, User-Agent matters most here: it tells the server the client's operating system and browser.
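
To see why the server can tell the script apart from a browser, the sketch below prints the headers that requests sends when no headers argument is given. It uses requests.utils.default_headers(); the exact version number in the output depends on the installed requests release.

```python
import requests

# Headers that requests attaches by default when none are supplied
default_headers = requests.utils.default_headers()

for name, value in default_headers.items():
    print(f"{name}: {value}")

# The default User-Agent names the library, not a browser
print(default_headers["User-Agent"].startswith("python-requests/"))
```

A server that inspects User-Agent can therefore recognize the request as coming from a script and may answer with different content.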

How to send a request with request headers

requests.get(url, headers=headers)
  • The headers parameter takes the request headers as a dictionary
  • Each header field name is a key, and that field's value is the corresponding dictionary value
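
A minimal sketch of how such a dictionary ends up in the outgoing request, without touching the network. It prepares the request through a Session and inspects the result; the Accept value is an assumption added for illustration, and the User-Agent is a shortened placeholder rather than a full browser string.

```python
import requests

url = "https://www.baidu.com"

# Field names are the keys, field values are the values (placeholder values)
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html",
}

# prepare_request merges our headers with the session defaults
# but does not send anything over the network
session = requests.Session()
prepared = session.prepare_request(requests.Request("GET", url, headers=headers))

for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```

The headers passed in override the defaults with the same name, while defaults that were not overridden (such as Accept-Encoding) are still sent.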

Complete code implementation

Copy the User-Agent string from your browser and use it to build the headers dictionary. Then run the code below and look at the result.

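# Import the requests library
import requests

# URL to visit
url = 'https://www.baidu.com'

# Build the request headers dictionary so the request looks like it comes from a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
}

# Send the request with the User-Agent header attached
response = requests.get(url, headers=headers)

# Print the response body, decoded from bytes to text (UTF-8 by default)
print(response.content.decode())

# Print the headers of the request that was sent
print(response.request.headers)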

Origin blog.csdn.net/m0_67268191/article/details/131784036