requests module: sending requests with request headers

HTTP headers are metadata carried in requests and responses; they pass additional parameters and configuration alongside the body. Setting headers on a request lets you customize its behavior and control it more precisely. The following are some common HTTP header fields and their purposes:

Header field	Purpose	Example
Authorization	Provides authentication credentials for resources that require permission	`Authorization: Bearer <token>`
User-Agent	Identifies the type and version of the client sending the request	`User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)`
Content-Type	Specifies the media type of the request or response body	`Content-Type: application/json`
Accept	Specifies the response content types the client can accept	`Accept: application/json`
Cookie	Passes session information from the client to the server	`Cookie: session_id=ABC123`
Referer	Gives the address of the page the request came from; servers may check it as one defense against cross-site request forgery	`Referer: https://example.com/page1`
If-Modified-Since	Makes the request conditional: if the resource has not changed since the given time, the server replies 304 Not Modified and the client reuses its cached copy, reducing data transfer	`If-Modified-Since: Sun, 01 Jan 2023 00:00:00 GMT`
X-Custom-Header	A user-defined header field that can carry custom information	`X-Custom-Header: custom_value`

Note: Header field names are not case-sensitive.
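
The case-insensitivity of header names can be seen in requests itself, which stores headers in a `CaseInsensitiveDict`. The following is a minimal sketch; the header values (`example-client/1.0` and so on) are made up for illustration and do not come from any real client.

```python
from requests.structures import CaseInsensitiveDict

# Build a header mapping; the values here are hypothetical examples
headers = CaseInsensitiveDict()
headers["User-Agent"] = "example-client/1.0"
headers["accept"] = "application/json"

# Lookups succeed regardless of the case used for the field name
ua_lower = headers["user-agent"]
accept_upper = headers["ACCEPT"]
print(ua_lower)
print(accept_upper)

# The original spelling of each key is kept when iterating
print(list(headers))
```

The same lookup behavior applies to `response.headers` and `response.request.headers`, since requests uses this mapping type for both.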

Each header field carries a different piece of information, which makes the request and response process more flexible. When setting headers, follow the relevant HTTP specifications and make sure the data you send is secure and that you are permitted to send it.

Let's first write some code to fetch Baidu's homepage:

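# Import the requests library
import requests

# Set the URL to visit
url = 'https://www.baidu.com'

# Send a GET request and get the response
response = requests.get(url)

# Print the response content
print(response.content.decode())

# Print the headers of the request that produced this response
print(response.request.headers)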

Think about it

How does the source of the Baidu homepage shown in the browser differ from the source returned to the code?
- To view the source of a web page:
  - Right-click, then "View page source", or
  - Right-click, then "Inspect"
How does the response content for the URL differ from the source returned to the code?
- To view the response content for the URL:
  1. Right-click, then "Inspect"
  2. Click "Network"
  3. Tick "Preserve log"
  4. Refresh the page
  5. In the "Name" column, select the entry that matches the URL in the browser's address bar, then open its "Response" tab
The source of the Baidu homepage returned to the code is much shorter. Why?
- The request needs to carry request header information
  
  Recall the idea behind a crawler: imitate a browser so that the server returns the same content it would send to a browser
- The request header has many fields; among them the User-Agent field is essential, since it identifies the client's operating system and browser
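
When no headers are passed, requests identifies itself with its own default User-Agent rather than a browser's, which is one reason a server may answer differently. The sketch below only inspects those defaults through `requests.utils` and sends nothing over the network; the exact version string printed depends on the installed requests release.

```python
import requests

# The User-Agent string requests uses when none is supplied
default_ua = requests.utils.default_user_agent()
print(default_ua)

# The full set of headers requests adds by default
default_headers = requests.utils.default_headers()
for name, value in default_headers.items():
    print(f"{name}: {value}")
```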

How to send a request with request headers

requests.get(url, headers=headers)

The headers parameter takes the request headers as a dictionary.
Each header field name is a key, and the field's value is the corresponding dictionary value.
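
To see how such a dictionary ends up in the outgoing request without contacting the server, the request can be prepared but not sent. This is a sketch under assumptions: the shortened User-Agent string and the `Accept` value are illustrative, and `Session.prepare_request` is used here only to show how the supplied headers are merged with the library defaults.

```python
import requests

url = "https://www.baidu.com"

# Illustrative header values; copy the real User-Agent from your own browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html",
}

# Prepare the request through a session so that the default headers are merged in
session = requests.Session()
prepared = session.prepare_request(requests.Request("GET", url, headers=headers))

# Nothing has been sent yet; these are the headers that would go on the wire
for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```

The supplied User-Agent replaces the library default, while defaults that were not overridden (such as Accept-Encoding) are kept.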

Complete code implementation

Copy the User-Agent string from your browser and use it to build a headers dictionary; then run the following code and check the result:

# Import the requests library
import requests

# Set the URL to visit
url = 'https://www.baidu.com'

# Build the request headers dictionary to imitate a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
}

# Send the request with the User-Agent header so it looks like a browser request
response = requests.get(url, headers=headers)

# Print the response content
print(response.content.decode())

# Print the headers of the request that was sent
print(response.request.headers)