Requests library
Introduction
- Requests is a Python library for making HTTP requests. It provides a simple, intuitive API for sending HTTP and HTTPS requests and handling the responses.
requests.get() function
Parameters
- url, the URL to request
- headers, generally used for User-Agent (UA) spoofing, so that the server does not identify the request as coming from a script. To find a User-Agent value, right-click the page in the browser and choose Inspect, open the Network tab, refresh the page (Fn+F5), click any request in the list, and look for the User-Agent field
- proxies, generally used for batch crawling, so that the server does not see frequent requests from the same machine and block that host from crawling
- cookies, the cookies (as a dictionary) to send with the request, which let the server associate the request with saved user information. To find them, right-click the page in the browser and choose Inspect, open the Network tab, refresh the page (Fn+F5), and look for the Cookie field of a request
- params, a dictionary of additional query parameters that are appended to the URL, which keeps the request flexible
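Example
import requests

# User-Agent header so the request looks like it comes from a browser
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
url = 'https://baidu.com'
# Placeholder proxy addresses; pass proxies=proxies to requests.get() to use them
proxies = {
    'http': 'http://10.0.0.1:8080', 'https': 'https://10.0.0.1:8080'}
response = requests.get(url=url, headers=header)
print(response.text)  # prints the HTML document that was retrieved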
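The params and cookies parameters described above are not used in that example. The following is a minimal sketch of how they end up in the outgoing request. It builds the request with requests.Request(...).prepare() instead of sending it, so nothing goes over the network. The URL, the query values, and the cookie name are hypothetical placeholders.

```python
import requests

# Hypothetical values for illustration; nothing is sent over the network.
url = 'https://example.com/search'
headers = {'User-Agent': 'Mozilla/5.0'}
params = {'q': 'python', 'page': '2'}
cookies = {'sessionid': 'abc123'}

# Build the request without sending it, to inspect what requests.get()
# would transmit when given the same arguments.
prepared = requests.Request(
    'GET', url, headers=headers, params=params, cookies=cookies
).prepare()

print(prepared.url)                    # params are appended as a query string
print(prepared.headers['User-Agent'])  # the custom User-Agent header
print(prepared.headers['Cookie'])      # the cookie dictionary as a Cookie header
```

Passing the same headers, params, and cookies arguments to requests.get() sends the equivalent request.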
Return value of requests.get()
- status_code, the HTTP status code of the response; 200 means the request succeeded
- text, the response body decoded as text, for example the HTML document of a page
- content, the response body as raw bytes. The requested resource may not be a web page (for example, an image or another file), and in that case the data is read from content
- response.json() method, parses the response body as JSON and returns the result, usually a dictionary or a list:
data = response.json()
The retrieved data can then be parsed with XPath, bs4, and other methods.
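To make the difference between these attributes concrete, here is a small sketch that can run without internet access. It starts a throwaway HTTP server on the local machine and requests it with requests.get(). The DemoHandler class and the JSON payload are assumptions made up for this illustration, not part of Requests.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint that always answers with a small JSON body."""

    def do_GET(self):
        body = json.dumps({'name': 'requests', 'ok': True}).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json; charset=utf-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 lets the OS pick a free port; the server runs in a background thread.
server = HTTPServer(('127.0.0.1', 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

response = requests.get(f'http://127.0.0.1:{server.server_port}/', timeout=5)

# Stop the demo server once the response has been received.
server.shutdown()
server.server_close()

print(response.status_code)    # integer HTTP status code
print(type(response.text))     # str: the body decoded to text
print(type(response.content))  # bytes: the raw body
data = response.json()         # the body parsed from JSON into Python objects
print(data['name'])
```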
requests.post() function
Introduction
- Sends a POST request, together with its data, to the specified URL. The return value is a requests.Response object
Parameters
- url, the URL to send the request to
- data, the data to send in the request body, which can be a dictionary, a list of tuples, bytes, or a file object
- json, a JSON-serializable object to send in the request body
- cookies, same as for the get method above
- proxies, same as for the get method above
Example
import requests

url = 'https://www.begtut.com/try/python/demopage.php'
# Form fields to send in the request body
data = {
    'somekey': 'somevalue'}
response = requests.post(url, data=data)
print(response)  # prints the Response object, which shows the status code
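The data and json parameters both send a request body, but they encode it differently. The sketch below prepares two POST requests without sending them, so the difference can be inspected offline. It reuses the URL and payload from the example above; the variable names form_request and json_request are only illustrative.

```python
import requests

url = 'https://www.begtut.com/try/python/demopage.php'
payload = {'somekey': 'somevalue'}

# Prepare (but do not send) two POST requests to compare their encodings.
form_request = requests.Request('POST', url, data=payload).prepare()
json_request = requests.Request('POST', url, json=payload).prepare()

# data= sends the dictionary as form fields
print(form_request.headers['Content-Type'])
print(form_request.body)

# json= serializes the dictionary to JSON and sets a JSON content type
print(json_request.headers['Content-Type'])
print(json_request.body)
```

Which one to use depends on what the server expects: HTML form endpoints usually take data, while JSON APIs usually take json.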
Remark
Requests is often used together with XPath and similar parsing tools. For an example, see the article linked below.
Xpath introduction and syntax