Requests Library
Installing Requests
pip install requests
The seven main methods of the Requests library
Method | Explanation |
---|---|
requests.request() | Constructs a request; the base method that underlies all the methods below |
requests.get() | The main method for retrieving an HTML page; corresponds to HTTP GET |
requests.head() | Retrieves the headers of an HTML page; corresponds to HTTP HEAD |
requests.post() | Submits a POST request to an HTML page; corresponds to HTTP POST |
requests.put() | Submits a PUT request to an HTML page; corresponds to HTTP PUT |
requests.patch() | Submits a partial-modification request to an HTML page; corresponds to HTTP PATCH |
requests.delete() | Submits a delete request to an HTML page; corresponds to HTTP DELETE |
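The table can be read as a mapping from convenience functions to HTTP verbs. The sketch below makes that mapping concrete. It assumes a hypothetical URL (http://example.com/resource) and only builds unsent requests, so nothing goes over the network.

```python
import requests

# Hypothetical URL used only for illustration; no request is actually sent.
URL = "http://example.com/resource"

# Each convenience function corresponds to one HTTP verb.
VERBS = {
    "get": "GET",
    "head": "HEAD",
    "post": "POST",
    "put": "PUT",
    "patch": "PATCH",
    "delete": "DELETE",
}

prepared_methods = []
for func_name, verb in VERBS.items():
    # Confirm the convenience function exists on the requests module.
    assert callable(getattr(requests, func_name))
    # Build (but do not send) a request with the same method and URL.
    prepared = requests.Request(verb, URL).prepare()
    prepared_methods.append(prepared.method)
    print(f"requests.{func_name}()  ->  {prepared.method} {prepared.url}")
```

In practice, requests.get(url) and requests.request('GET', url) are interchangeable; the shorter form is simply more convenient.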
One. The get() method
r = requests.get(url)
The code above returns a Response object.
1.1 Response Object
Response object: contains the content returned to the crawler.
Response object attributes
Attribute | Explanation |
---|---|
r.status_code | HTTP status code of the response; 200 indicates success, 404 indicates that the page was not found |
r.text | The HTTP response body as a string, i.e. the content of the page at the URL |
r.encoding | Encoding of the response content, guessed from the HTTP headers |
r.apparent_encoding | Encoding of the response content, inferred by analyzing the content itself (a fallback encoding) |
r.content | The HTTP response body in binary (bytes) form |
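The relationship between r.content, r.encoding, and r.text can be sketched without a network connection. The example below builds a Response by hand, which is an assumption made purely for illustration (real Response objects come from requests.get()). It also assumes the ISO-8859-1 encoding that Requests 2.x uses for text content types whose header carries no charset.

```python
import io

import requests

# UTF-8 bytes standing in for a downloaded page (hypothetical content).
body = "<p>中文页面</p>".encode("utf-8")

# Hand-built Response, for illustration only.
r = requests.Response()
r.status_code = 200
r.raw = io.BytesIO(body)      # r.content is read from this stream
r.encoding = "ISO-8859-1"     # assumed header-based guess (no charset given)

print(r.status_code)
print(type(r.content).__name__)  # the raw body is bytes

garbled = r.text              # decoded with the wrong encoding
r.encoding = "utf-8"          # switch to the encoding the bytes really use
fixed = r.text                # r.text is re-decoded on each access

print(garbled == fixed)
print(fixed)
```

With a real response, r.encoding = r.apparent_encoding plays the role of the manual switch to "utf-8" here, at the cost of running content detection.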
1.2 Requests exceptions
Exceptions raised by the Requests library
Exception | Explanation |
---|---|
requests.ConnectionError | Network connection error, such as a failed DNS lookup or a refused connection |
requests.HTTPError | HTTP error |
requests.URLRequired | A URL is missing |
requests.TooManyRedirects | The maximum number of redirects was exceeded |
requests.ConnectTimeout | Connecting to the remote server timed out |
requests.Timeout | The request to the URL timed out |
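The exceptions in the table form a hierarchy, which affects the order in which except clauses should be written. The sketch below checks the relationships directly; describe() is a hypothetical helper introduced only for this illustration.

```python
import requests

# Every exception in the table derives from requests.RequestException,
# so catching that base class covers all of them.
listed = [
    requests.ConnectionError,
    requests.HTTPError,
    requests.URLRequired,
    requests.TooManyRedirects,
    requests.ConnectTimeout,
    requests.Timeout,
]
all_derive = all(issubclass(exc, requests.RequestException) for exc in listed)
print(all_derive)

# ConnectTimeout is both a ConnectionError and a Timeout, so an
# "except requests.Timeout" clause also catches connection timeouts.
print(issubclass(requests.ConnectTimeout, requests.Timeout))
print(issubclass(requests.ConnectTimeout, requests.ConnectionError))


def describe(exc):
    """Hypothetical helper: label an exception, checking Timeout first."""
    if isinstance(exc, requests.Timeout):
        return "timeout"
    if isinstance(exc, requests.ConnectionError):
        return "connection"
    if isinstance(exc, requests.HTTPError):
        return "http"
    if isinstance(exc, requests.RequestException):
        return "other requests error"
    return "not a requests error"


print(describe(requests.ConnectTimeout()))
print(describe(requests.TooManyRedirects()))
```

Because ConnectTimeout inherits from both classes, whichever of Timeout and ConnectionError is tested first decides how a connection timeout is labeled.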
1.3 A general code framework for crawling pages
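import requests

def getHTMLTest(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # raise HTTPError if the status code is 4xx or 5xx
        r.encoding = r.apparent_encoding  # decode with the encoding inferred from the content
        return r.text
    except requests.RequestException:
        return "An exception occurred"

# Runs only when this file is executed directly, not when it is imported from another file
if __name__ == '__main__':
    url = "http://www.baidu.com"
    print(getHTMLTest(url))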
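The framework relies on raise_for_status() to turn bad status codes into exceptions. The sketch below shows which codes trigger it, using a hand-built Response so that no network access is needed. Both the hand-built object and the helper name status_outcome are assumptions made for illustration.

```python
import requests


def status_outcome(code):
    """Return 'ok' if raise_for_status() passes for this code, else the error text."""
    # Hand-built Response, for illustration only; real ones come from requests.get().
    r = requests.Response()
    r.status_code = code
    try:
        r.raise_for_status()
    except requests.HTTPError as err:
        return str(err)
    return "ok"


for code in (200, 301, 404, 500):
    print(code, status_outcome(code))
```

Only 4xx and 5xx codes raise HTTPError; a 3xx code passes silently, so the framework treats it as a success.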
Two. Analysis of the main Requests method
requests.request(method, url, **kwargs)
**kwargs: parameters that control access; all of them are optional
For example:
kv = {'key1': 'value1', 'key2': 'value2'}
r = requests.request('GET', 'http://python123.io/ws', params=kv)
print(r.url)
# http://python123.io/ws?key1=value1&key2=value2
# Simulate Chrome 10 when visiting the site
hd = {'User-Agent':'Chrome/10'}
r = requests.request('POST', 'http://python123.io/ws', headers=hd)
# Upload a file to the server
fs = {'file': open('data.xls', 'rb')}
r = requests.request('POST', 'http://python123.io/ws', files=fs)
# Set the timeout
r = requests.request('GET', 'http://www.baidu.com', timeout=10)
pxs = {'http':'http://user:[email protected]:1231',
'https':'https://11.10.10.1:3212'}
r = requests.request('GET', 'http://www.baidu.com', proxies=pxs)
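The effect of params, headers, and files can be inspected without sending anything, by preparing a request and looking at the result. The sketch below reuses the URL from the examples above; the in-memory file content is a hypothetical stand-in for data.xls. Note that the keyword for uploads is files.

```python
import io

import requests

URL = "http://python123.io/ws"  # same URL as the examples above; nothing is sent

# params: the dictionary is encoded into the query string.
kv = {"key1": "value1", "key2": "value2"}
p_params = requests.Request("GET", URL, params=kv).prepare()
print(p_params.url)

# headers: custom headers are attached to the request as given.
hd = {"User-Agent": "Chrome/10"}
p_headers = requests.Request("POST", URL, headers=hd).prepare()
print(p_headers.headers["User-Agent"])

# files: an in-memory stand-in for open('data.xls', 'rb') (hypothetical content).
fs = {"file": ("data.xls", io.BytesIO(b"fake spreadsheet bytes"))}
p_files = requests.Request("POST", URL, files=fs).prepare()
print(p_files.headers["Content-Type"].split(";")[0])
```

Preparing a request this way is handy for debugging, since it shows the final URL and headers before any traffic is generated.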
Parameter | Effect |
---|---|
params | Dictionary or byte sequence, appended to the URL as query parameters |
data | Dictionary, byte sequence, or file object, sent as the body of the request |
json | JSON-formatted data, sent as the body of the request |
headers | Dictionary of custom HTTP headers |
cookies | Dictionary or CookieJar, the cookies sent with the request |
auth | Tuple, enables HTTP authentication |
files | Dictionary, used to upload files |
timeout | Timeout, in seconds |
proxies | Dictionary, sets the proxy servers used for access; may include login credentials |
allow_redirects | True/False, default True; switch for following redirects |
stream | True/False, default False; when False the response body is downloaded immediately, when True the download is deferred until the content is accessed |
verify | True/False, default True; switch for SSL certificate verification |
cert | Path to a local SSL certificate |
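The difference between data, json, and auth shows up in the body and headers of the prepared request. The sketch below reuses the URL from the earlier examples and sends nothing; the credentials ("user", "pass") are placeholders chosen for illustration.

```python
import requests

URL = "http://python123.io/ws"  # reused from the examples; nothing is sent

# data with a dictionary: the body is form-encoded.
p_data = requests.Request("POST", URL, data={"key1": "value1"}).prepare()
print(p_data.headers["Content-Type"], p_data.body)

# json: the body is serialized as JSON and the content type is set accordingly.
p_json = requests.Request("POST", URL, json={"key1": "value1"}).prepare()
print(p_json.headers["Content-Type"])

# auth with a (user, password) tuple: an HTTP Basic Authorization header is added.
p_auth = requests.Request("GET", URL, auth=("user", "pass")).prepare()
print(p_auth.headers["Authorization"])
```

Passing the same dictionary through data or json therefore produces different bodies, so the choice should match what the server expects.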