Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. If you reproduce it, please include a link to the original source and this statement.
Using requests
Installation: pip3 install requests
1. Common response methods:
A. GET request
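import requests
response = requests.get('https://httpbin.org/get')  # example URL; substitute your own
print(response.text)  # page source as text
print(response.status_code)  # status code
print(response.headers)  # response headers
print(response.request.headers)  # headers of the request that was sent
print(response.content)  # page body as raw bytes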
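* response.encoding = 'utf-8' sets the encoding used to decode response.text
* response.encoding returns the current encoding
* response.json() is the built-in JSON decoder and returns the body parsed as JSON. The body must be valid JSON; otherwise parsing fails and an exception is raised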
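Because response.json() raises an exception when the body is not valid JSON, it can be convenient to wrap the call. The following is only a minimal sketch and not part of the original article: parse_json_body and FakeResponse are hypothetical names, and FakeResponse merely stands in for a real requests response so that the example runs without a network connection.

```python
import json


def parse_json_body(response):
    """Return the decoded JSON body of a requests-style response, or None.

    response.json() raises a subclass of ValueError when the body is not
    valid JSON, so ValueError is the exception caught here.
    """
    try:
        return response.json()
    except ValueError:
        return None


class FakeResponse:
    """Hypothetical stand-in for requests.Response (offline demo only)."""

    def __init__(self, text):
        self.text = text

    def json(self):
        # Mirrors the behaviour being illustrated: invalid JSON raises
        # json.JSONDecodeError, which is a subclass of ValueError.
        return json.loads(self.text)


good = parse_json_body(FakeResponse('{"ok": true}'))
bad = parse_json_body(FakeResponse('<html>not json</html>'))
print(good)  # {'ok': True}
print(bad)   # None
```

With a real response object, the same helper would be called as parse_json_body(response).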
B. POST request
response = requests.post(url=url, data=data)
* url: the target URL of the POST request
* data: the form data sent with the POST request
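To see what the data parameter actually puts on the wire, a request can be prepared without being sent. This is a minimal sketch, assuming requests is installed; the URL and the form fields are made-up placeholders rather than values from the article.

```python
import requests

# Placeholder URL and form fields, chosen only for illustration.
url = 'https://httpbin.org/post'
data = {'username': 'alice', 'password': 'secret'}

# Request(...).prepare() builds the method, URL, headers and body from the
# arguments, but does not send anything over the network.
prepared = requests.Request('POST', url, data=data).prepare()

print(prepared.method)                   # POST
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepared.body)                     # username=alice&password=secret
```

A dict passed as data is form-encoded, which is what a plain HTML form submission sends.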
POST request examples: form submission and file upload
import requests
# Form fields for the login request
form_data = {
    'username': 'LXJ',
    'password': '292143060li'
}
url = 'http://127.0.0.1:8001/api/login/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}
response = requests.post(url=url, data=form_data, headers=headers)
url = 'https://httpbin.org/post'
# Read the local file; binary mode is recommended for uploads
with open('pages.html', 'rb') as f:
    files = {'file': f}
    response = requests.post(url=url, files=files, headers=headers)
if response.status_code == 200:
    print('File uploaded successfully')
    print(response.text)
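The files parameter makes requests encode the body as multipart/form-data. The sketch below inspects that encoding without sending anything; it assumes requests is installed, and the file content is an in-memory placeholder rather than a real pages.html.

```python
import requests

# In-memory placeholder standing in for the contents of pages.html.
file_bytes = b'<html><body>hello</body></html>'

# Each entry in files may be a (filename, content, content_type) tuple.
files = {'file': ('pages.html', file_bytes, 'text/html')}

# Build the request without sending it, to inspect how the upload is encoded.
prepared = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()

content_type = prepared.headers['Content-Type']
print(content_type.startswith('multipart/form-data; boundary='))  # True
print(b'filename="pages.html"' in prepared.body)                   # True
```

The tuple form is handy when the data is already in memory; passing an open file object works the same way.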
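Setting a proxy (the proxies parameter)
import requests
url = 'http://college.gaokao.com/schlist/'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'}
# Keys are the scheme of the target URL, so each key may appear only once.
# The addresses are placeholders; replace them with a reachable proxy host:port.
proxies = {
    'http': 'http://192.168.2.913:8992',
    'https': 'http://192.168.2.913:9923',
}
# params: the query parameters appended after the '?' in the GET request URL
response = requests.get(url, params=None, headers=headers, proxies=proxies)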
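The proxies mapping is keyed by the scheme of the URL being requested, so it needs one entry per scheme. The following sketch uses a hypothetical helper, build_proxies, and a placeholder address; it also shows why repeating a key in a dict literal does not work.

```python
def build_proxies(host, port):
    """Return a proxies mapping covering both http:// and https:// target URLs.

    build_proxies is a hypothetical helper. The keys are the scheme of the
    URL being requested; the values are the proxy server's own URL.
    """
    proxy_url = f'http://{host}:{port}'
    return {'http': proxy_url, 'https': proxy_url}


# A dict literal that repeats a key silently keeps only the last value.
duplicated = {'http': 'http://10.0.0.1:8992', 'http': 'http://10.0.0.1:9923'}
print(duplicated)  # {'http': 'http://10.0.0.1:9923'}

proxies = build_proxies('10.0.0.1', 8992)  # placeholder address
print(proxies)  # {'http': 'http://10.0.0.1:8992', 'https': 'http://10.0.0.1:8992'}
```

The resulting mapping would then be passed as requests.get(url, proxies=proxies).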
XPath selectors
<1> What is XPath?
- XPath (XML Path Language) is a language for finding information in XML documents; it can be used to traverse the elements and attributes of an XML document.
<2> The most common XPath path expressions:
- / Selects from the root node.
- // Selects matching nodes anywhere in the document, regardless of their location.
- . Selects the current node.
- .. Selects the parent of the current node.
- @ Selects attributes.
- bookstore Selects all child nodes of the bookstore element.
- /bookstore Selects the root element bookstore. Note: if a path starts with a forward slash (/), it always represents an absolute path to an element.
- bookstore/book Selects all book elements that are children of bookstore.
- //book Selects all book elements, regardless of their position in the document.
- bookstore//book Selects all book elements that are descendants of the bookstore element, no matter where they are located beneath it.
- //@lang Selects all attributes named lang.
- /bookstore/* Selects all child elements of the bookstore element.
- //* Selects all elements in the document.
- html/node()/meta/@* Selects all attributes of the meta nodes found under any child node of html.
- //title[@*] Selects all title elements that have at least one attribute.
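A few of these expressions can be tried with Python's standard library. This is a minimal sketch on a made-up bookstore document. Note that xml.etree.ElementTree implements only a subset of XPath: paths are relative to an element (hence the leading "."), and attributes cannot be selected directly with an expression such as //@lang. Full XPath needs a third-party library such as lxml.

```python
import xml.etree.ElementTree as ET

# A small made-up bookstore document, used only for illustration.
xml_text = """
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
  </book>
  <book>
    <title lang="zh">Journey to the West</title>
  </book>
  <magazine>
    <title>Weekly News</title>
  </magazine>
</bookstore>
""".strip()

root = ET.fromstring(xml_text)  # root is the bookstore element

# "book": book elements that are direct children of the current node
print(len(root.findall('book')))  # 2

# ".//title": title elements anywhere below the current node
print(len(root.findall('.//title')))  # 3

# "*": all child elements of the current node
print([child.tag for child in root.findall('*')])  # ['book', 'book', 'magazine']

# ".//title[@lang]": title elements that have a lang attribute
with_lang = root.findall('.//title[@lang]')
print([title.get('lang') for title in with_lang])  # ['en', 'zh']
```

Attribute values are read with get() after the element has been selected, which covers the common use of //@lang in this subset.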