requests module - send requests with parameters

Send a request with parameters

When we use Baidu search, we often see a ? in the URL. Everything after the question mark is the request parameters, also called the query string.
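To make the idea of a query string concrete, here is a minimal sketch that splits a search URL with the standard library's urllib.parse. It only parses text and sends no request; the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.baidu.com/s?wd=python"

# Split the URL into its components; the query is the part after "?"
parts = urlsplit(url)
print(parts.path)   # /s
print(parts.query)  # wd=python

# parse_qs turns the query string into a dict of lists,
# because the same key may appear more than once
params = parse_qs(parts.query)
print(params)       # {'wd': ['python']}
```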

Carry parameters in the URL

Send the request directly to a URL that already contains the parameters

import requests  # Import the requests module

# Set the request headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
}

url = 'https://www.baidu.com/s?wd=python'  # The URL to request, parameters included

# Send the GET request and keep the response in the response variable
response = requests.get(url, headers=headers)

Carry a parameter dictionary through params

1. Build a dictionary of request parameters

2. Pass the dictionary as the params argument when sending the request

import requests  # Import the requests module

# Set the request headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
}

# This is the target URL
# url = 'https://www.baidu.com/s?wd=python'

# The result is the same with or without the trailing question mark
url = 'https://www.baidu.com/s?'  # The URL to request; no parameters added yet

# The request parameters are a dictionary, i.e. wd=python
kw = {'wd': 'python'}

# Send the GET request with the parameters and keep the response
response = requests.get(url, headers=headers, params=kw)

print(response.content)      # Print the response body (bytes)
print(response.status_code)  # Print the response status code
print(response.request.url)  # Print the URL of the request

Passing a params dictionary is not the only way. A second way is to splice the parameters directly into the URL.

import requests  # Import the requests module

# Set the request headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
}

# Target URL, with the parameter value spliced in
url = 'https://www.baidu.com/s?wd={}'.format("python")

# Send the GET request and keep the response
response = requests.get(url, headers=headers)

print(response.content)      # Print the response body (bytes)
print(response.status_code)  # Print the response status code
print(response.request.url)  # Print the URL of the request
  1. As before, we import the requests library and define the request headers.
  2. Then we build the complete request URL, splicing the parameter value "python" into it.
  3. Next, we call requests.get to send the GET request, passing the headers dictionary as headers.
  4. Finally, we print the response content, the status code, and the requested URL.

Both methods should give the same result here: a GET request with the specified headers and parameters is sent to the Baidu search page. One passes the parameters through the params argument; the other splices them directly into the URL. Choose whichever suits your needs and habits.
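For a plain keyword such as python the two methods produce the same query string, but they can differ once the keyword contains reserved characters. The sketch below uses a hypothetical keyword and the standard library's urlencode as a stand-in for the encoding a parameter dictionary receives; it illustrates the idea and is not the exact code path inside requests.

```python
from urllib.parse import urlencode, parse_qs

keyword = "a&b=c"  # hypothetical keyword containing reserved characters

# Method 1: splice the raw keyword into the query string
spliced_query = "wd={}".format(keyword)

# Method 2: encode a parameter dictionary
encoded_query = urlencode({"wd": keyword})

print(spliced_query)  # wd=a&b=c
print(encoded_query)  # wd=a%26b%3Dc

# Parsing shows how a server would read each version
print(parse_qs(spliced_query))  # {'wd': ['a'], 'b': ['c']}
print(parse_qs(encoded_query))  # {'wd': ['a&b=c']}
```

With splicing, the keyword is split into two parameters; with encoding, it arrives intact. When splicing by hand, the value can be encoded first so that it survives the trip.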

Running result


The running result shows three things:

1. The response content

2. The status code

3. The URL of the request that was sent with our parameters

Think about it

The first two are as expected.

The URL in the third item, however, looks like this:

https://wappass.baidu.com/static/captcha/tuxing.html?&logid=8942509947662239692&ak=c27bbc89afca0463650ac9bde68ebe06&backurl=https%3A%2F%2Fwww.baidu.com%2Fs%3Fwd%3D%25E4%25BC%25A0%25E6%2599%25BA%25E6%2592%25AD%25E5%25AE%25A2&ext=x9G9QDmMXq%2FNo87gjGO0P1Mp5uvFzuAL%2F5H%2B6udoILQaaeyhUvJM%2FH2FbXwC2V2ug1T9i%2F9aWxTqDk%2BHyX%2BPs2XG9KoIOTaqsDSpFr2Y1s11BToUucrnPfSjdXR1JUIw4uIBNlXlGYC%2BrQSfMy70TKgNjaVYFxC53U6q5godYvQ%3D&signature=f03cf9c4de5408e13ab9ca4a61e8af8c&timestamp=1689611641

Why is this so?

We briefly described parameters at the beginning; now let's cover them in full.

Parameters are a way of passing additional data to the server when sending a request. They usually come in the form of key-value pairs and are appended to the URL or request body of the request so that the server can understand and use the data.

When sending a request with the requests library, you can pass parameters in different ways:

  1. URL parameters: Parameters are added to the query string of the URL. A question mark (?) separates the URL path from the parameters, and an ampersand (&) joins multiple parameters. For example: https://www.example.com/search?q=keyword&page=1.

  2. Request body parameters: For some request methods (such as POST and PUT), parameters can be sent as data in the request body. This is often used to submit form data or to pass a large amount of data.
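The difference between the two placements can be sketched without sending anything. The example below builds request objects with the standard library's urllib.request instead of requests, purely so it runs offline; the URL is the example address from the list above and the field values are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

base_url = "https://www.example.com/search"
fields = {"q": "keyword", "page": 1}

# URL parameters: the encoded fields go after "?" in the URL
get_request = Request(base_url + "?" + urlencode(fields))

# Request body parameters: the encoded fields travel as the body,
# which makes urllib treat the request as a POST
post_request = Request(base_url, data=urlencode(fields).encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Neither request is actually sent here; the objects only show where the parameters end up.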

    Having reviewed what parameters are and how they are used, we can conclude the following.

    In the Baidu search URL, wd is a parameter that represents the search keyword (word). The format of the Baidu search URL is usually https://www.baidu.com/s?wd={search keyword}, where {search keyword} is the content to be searched.

    When you enter a keyword and press Enter to search, the browser adds the keyword to the search URL as the value of wd and then sends a request to the Baidu search server. The server returns the search result page that corresponds to the value of this parameter.

    A keyword parameter could in principle have another name, but the name used when constructing the URL must be one that the Baidu search server accepts. wd is the parameter name used by the Baidu search interface.

    For example, the following two URLs would be equivalent only if the server accepted both wd and q as the keyword parameter:

    1. https://www.baidu.com/s?wd=python
    2. https://www.baidu.com/s?q=python

    In these two URLs, wd and q both play the role of the parameter name for the search keyword. The server can parse the keyword and return the corresponding results only when the name in the URL is one it recognizes.

The first step is to look at the console output for clues

The second step is to search for python on Baidu in the browser and check the actual URL in the address bar


https://www.baidu.com/s?wd=python&rsv_spt=1&rsv_iqid=0xfafed5dd002d3968&issp=1&f=8&rsv_bp=1&rsv_idx=2&ie=utf-8&tn=baiduhome_pg&rsv_enter=1&rsv_dl=tb&rsv_sug3=7&rsv_sug1=5&rsv_sug7=100&rsv_sug2=0&rsv_btype=i&inputT=2276&rsv_sug4=4493&rsv_sug=1

The two URLs are clearly different.

Why does the returned URL contain % symbols? Has it been encoded, or encrypted?

The percent (%) symbol in the returned URL address is the result of URL encoding, not encryption.

URL encoding is a process used to escape special characters in URLs. It ensures the integrity and transportability of URLs, since certain characters have special meanings in URLs that may cause parsing errors or interfere with the URL's structure.

Why is URL encoding required?

  • Special characters (such as spaces, question marks, and equal signs) must be encoded when they appear as data rather than as delimiters, so that the URL is passed and parsed correctly.
  • URL encoding also allows non-ASCII characters (such as Chinese characters) to appear in URLs, because a URL may only contain ASCII characters and cannot include non-ASCII characters directly.

The process of URL encoding:

  1. Special characters in the URL are converted to % followed by two hexadecimal digits. For example, a space is encoded as %20.
  2. Non-ASCII characters are converted to UTF-8 encoded byte sequences, and each byte is converted to % followed by two hexadecimal digits.

Example of URL encoding:

Original URL: https://www.example.com/search?q=keyword space
Encoded URL: https://www.example.com/search?q=keyword%20space
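The two encoding steps above can be written out by hand. The helper name percent_encode below is hypothetical and exists only for illustration; it follows the described process (UTF-8 bytes, then % plus two hexadecimal digits) and is checked against the standard library's quote. The Chinese keyword is the one that appears, in encoded form, in the captcha URL shown earlier.

```python
from urllib.parse import quote

# Characters that may appear in a URL without being encoded
UNRESERVED = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              "abcdefghijklmnopqrstuvwxyz"
              "0123456789-._~")

def percent_encode(text):
    """Hypothetical helper: percent-encode every byte that is not unreserved."""
    pieces = []
    for byte in text.encode("utf-8"):  # non-ASCII characters become several bytes
        char = chr(byte)
        if char in UNRESERVED:
            pieces.append(char)
        else:
            pieces.append("%{:02X}".format(byte))  # % plus two hex digits
    return "".join(pieces)

print(percent_encode("keyword space"))  # keyword%20space
print(percent_encode("传智播客"))        # %E4%BC%A0%E6%99%BA%E6%92%AD%E5%AE%A2

# The hand-written version agrees with the standard library
assert percent_encode("传智播客") == quote("传智播客", safe="")
```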

In Python, you can use the quote() function in the urllib.parse module to convert special characters to URL-encoded form. The unquote() function is used for decoding, which restores the encoded characters to the original characters.
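A short round trip with quote() and unquote(), using the keyword from the example above. The second half shows the safe argument, since by default quote() leaves "/" unencoded.

```python
from urllib.parse import quote, unquote

# Encode, then decode back to the original text
encoded = quote("keyword space")
print(encoded)           # keyword%20space
print(unquote(encoded))  # keyword space

# "/" is treated as safe by default; pass safe="" to encode it as well
print(quote("a/b"))           # a/b
print(quote("a/b", safe=""))  # a%2Fb
```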

Where there is encoding, there is decoding

  • Decode with urllib.parse, which is part of Python's standard library
# Import unquote from urllib.parse, the module for URL encoding and decoding
from urllib.parse import unquote

# The encoded URL
encoded_url = 'https://wappass.baidu.com/static/captcha/tuxing.html?&logid=9021321451410687651&ak=c27bbc89afca0463650ac9bde68ebe06&backurl=https%3A%2F%2Fwww.baidu.com%2Fs%3Fwd%3Dpython&ext=x9G9QDmMXq%2FNo87gjGO0P7j2SskTsRDFQ5%2BYRfbG895WkhrGqVqihBVl6ZY8QtPHg1T9i%2F9aWxTqDk%2BHyX%2BPs0ChrtIMtQkyMCB6zc%2FRMjiq88vjYNjKcIXs1AYHPxXwhDn1Gvd2wllOjP2lxjF%2FsKgNjaVYFxC53U6q5godYvQ%3D&signature=00dca057de577487460ab979bb170ad0&timestamp=1689650369'

# Decode the URL with unquote
decoded_url = unquote(encoded_url)

# Print the decoded URL
print(decoded_url)
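Decoding the whole URL at once works, but a single parameter can also be pulled out and decoded on its own. The sketch below uses a trimmed copy of the URL above that keeps only the backurl parameter (the trimming is an assumption made for brevity) and shows that backurl holds the search URL we originally requested.

```python
from urllib.parse import urlsplit, parse_qs

# Trimmed copy of the captcha URL: only the backurl parameter is kept
captcha_url = ("https://wappass.baidu.com/static/captcha/tuxing.html"
               "?backurl=https%3A%2F%2Fwww.baidu.com%2Fs%3Fwd%3Dpython")

# parse_qs splits the query string and decodes each value
outer = parse_qs(urlsplit(captcha_url).query)
back_url = outer["backurl"][0]
print(back_url)  # https://www.baidu.com/s?wd=python

# The decoded backurl is itself a URL with its own query string
inner = parse_qs(urlsplit(back_url).query)
print(inner["wd"][0])  # python
```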


Origin blog.csdn.net/m0_67268191/article/details/131784098