## Requests library
### 7 main methods
Method | Description |
---|---|
requests.request() | Constructs and sends a request; the base method that all of the methods below are built on |
requests.get() | The main method for retrieving an HTML page, corresponding to HTTP GET |
requests.head() | Retrieves only the header information of an HTML page, corresponding to HTTP HEAD |
requests.post() | Submits a POST request to an HTML page, corresponding to HTTP POST |
requests.put() | Submits a PUT request to an HTML page, corresponding to HTTP PUT |
requests.patch() | Submits a partial-modification request to an HTML page, corresponding to HTTP PATCH |
requests.delete() | Submits a delete request to an HTML page, corresponding to HTTP DELETE |
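
The correspondence between the helper methods and HTTP verbs can be checked without touching the network. The following is a minimal sketch, assuming only that `requests` is installed: it builds (but never sends) one request per verb with `requests.Request(...).prepare()` and prints the method and URL each one would use. The URL is the sample address used elsewhere in these notes.

```python
import requests

URL = "http://www.python123.io/ws"  # sample URL; nothing is actually sent

# The six helper methods map one-to-one onto these HTTP verbs.
VERBS = ["GET", "HEAD", "POST", "PUT", "PATCH", "DELETE"]

# Build (but do not send) one prepared request per verb.
prepared = {verb: requests.Request(verb, URL).prepare() for verb in VERBS}

for verb, req in prepared.items():
    print(req.method, req.url)
```

In the same way, `requests.get(url)` behaves like `requests.request('GET', url)`, and likewise for the other helpers.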
### The get method
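`requests.get(url, params=None, **kwargs)`
- url: the URL of the page to retrieve
- params: extra parameters appended to the URL, as a dictionary or byte sequence; optional
- `**kwargs`: 12 optional parameters that control the request; together with `params` they make up the 13 access-control parameters of `requests.request()`, listed below: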
Parameter | Description | Example |
---|---|---|
params | Dictionary or byte sequence, appended to the URL as the query string | `kv = {'key1':'value1', 'key2':'value2'}` `r = requests.request('GET', 'http://www.python123.io/ws', params=kv)` `print(r.url)  # https://www.python123.io/ws?key1=value1&key2=value2` |
data | Dictionary, byte sequence, or file object, sent as the body of the request | `kv = {'key1':'value1', 'key2':'value2'}` `r = requests.request('POST', 'http://www.python123.io/ws', data=kv)` |
json | JSON-serializable data, sent as the body of the request | `kv = {'key1':'value1'}` `r = requests.request('POST', 'http://www.python123.io/ws', json=kv)` |
headers | Dictionary of custom HTTP headers | `hd = {'user-agent':'Chrome/10'}` `r = requests.request('POST', 'http://www.python123.io/ws', headers=hd)` |
cookies | Dictionary or CookieJar; the cookies sent with the request | |
auth | Tuple; enables HTTP authentication | |
files | Dictionary; files to upload | `fs = {'file':open('test.xls', 'rb')}` `r = requests.request('POST', 'http://www.python123.io/ws', files=fs)` |
timeout | Timeout in seconds | `r = requests.request('GET', 'http://www.python123.io/ws', timeout=10)` |
proxies | Dictionary; sets the proxy servers used for the request, optionally with login credentials | `pxs = {'http':'http://user:pass@10.10.10.1:1234', 'https':'https://10.10.10.1:4321'}` `r = requests.request('GET', 'http://www.python123.io/ws', proxies=pxs)` |
allow_redirects | True/False, default True; whether redirects are followed | |
stream | True/False, default False; when False the response body is downloaded immediately, when True the download is deferred until the content is accessed | |
verify | True/False, default True; whether the server's SSL certificate is verified | |
cert | Path to a local SSL client certificate | |
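
To see what `params`, `data`, `json`, and `headers` actually do to a request, it can help to prepare the request without sending it. This is a minimal offline sketch, assuming `requests` is installed; it uses `requests.Request(...).prepare()` rather than `requests.request()` so that no network access is needed, and the variable names are illustrative only.

```python
import json

import requests

URL = "http://www.python123.io/ws"  # sample URL; nothing is actually sent
kv = {"key1": "value1", "key2": "value2"}

# params: encoded into the query string of the URL
with_params = requests.Request("GET", URL, params=kv).prepare()
print(with_params.url)  # http://www.python123.io/ws?key1=value1&key2=value2

# data: a dictionary is form-encoded into the request body
with_data = requests.Request("POST", URL, data=kv).prepare()
print(with_data.body)  # key1=value1&key2=value2
print(with_data.headers["Content-Type"])

# json: the object is serialized to JSON and sent as the body
with_json = requests.Request("POST", URL, json={"key1": "value1"}).prepare()
print(json.loads(with_json.body))
print(with_json.headers["Content-Type"])

# headers: custom headers are attached as given (lookup is case-insensitive)
with_headers = requests.Request("POST", URL, headers={"user-agent": "Chrome/10"}).prepare()
print(with_headers.headers["User-Agent"])
```

The remaining parameters (`timeout`, `proxies`, `stream`, `verify`, `cert`, and so on) only take effect when the request is sent, so they cannot be inspected this way.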
### Properties of the Response object
Attribute | Description |
---|---|
r.status_code | The HTTP status code of the response; 200 means success, 404 means the page was not found |
r.text | The HTTP response content as a string, that is, the page content corresponding to the URL |
r.encoding | The response encoding guessed from the HTTP headers; for text content with no charset in the header it falls back to ISO-8859-1 |
r.apparent_encoding | The response encoding inferred by analyzing the content itself |
r.content | The HTTP response content in binary form |
r.raise_for_status() | Raises requests.HTTPError if the status code indicates an error (4xx or 5xx) |
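
The relationship between `r.content`, `r.encoding`, and `r.text` is easiest to see on a small example. The sketch below is for illustration only: it builds a `requests.Response` by hand from an in-memory byte stream (real code gets the object from `requests.get()`), and the page content is an assumed sample string.

```python
import io

import requests

# Hand-built response, standing in for what requests.get() would return.
r = requests.Response()
r.status_code = 200
r.raw = io.BytesIO("中文页面".encode("utf-8"))  # body bytes are UTF-8

# Simulate a server that sent no charset: Requests falls back to ISO-8859-1.
r.encoding = "ISO-8859-1"
garbled = r.text          # bytes decoded with the wrong encoding
print(repr(garbled))

# r.content is always the raw bytes, independent of r.encoding.
print(r.content)

# Setting the correct encoding fixes r.text, which is decoded on each access.
r.encoding = "utf-8"
print(r.text)  # 中文页面
```

This is why the common pattern assigns `r.encoding = r.apparent_encoding` before reading `r.text`: the header-based guess may be wrong, while the content-based guess is usually closer.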
### Requests library exceptions
Exception | Description |
---|---|
requests.ConnectionError | Network connection error, such as a DNS lookup failure or a refused connection |
requests.HTTPError | HTTP error, raised by `r.raise_for_status()` for 4xx and 5xx responses |
requests.URLRequired | A valid URL is required but was not supplied |
requests.TooManyRedirects | The maximum number of redirects was exceeded |
requests.ConnectTimeout | Timed out while connecting to the remote server |
requests.Timeout | The request timed out; the base class that also covers connection timeouts |
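
How these exceptions relate to each other matters when ordering `except` clauses. The following offline sketch, assuming `requests` is installed, checks the class hierarchy and triggers `HTTPError` on a hand-built 404 response (constructed manually here purely for illustration).

```python
import requests

# ConnectTimeout is both a Timeout and a ConnectionError,
# so a handler for either one will catch it.
print(issubclass(requests.ConnectTimeout, requests.Timeout))          # True
print(issubclass(requests.ConnectTimeout, requests.ConnectionError))  # True

# Every exception in the table derives from requests.RequestException.
listed = [
    requests.ConnectionError, requests.HTTPError, requests.URLRequired,
    requests.TooManyRedirects, requests.ConnectTimeout, requests.Timeout,
]
print(all(issubclass(exc, requests.RequestException) for exc in listed))  # True

# raise_for_status() turns an error status code into an HTTPError.
resp = requests.Response()
resp.status_code = 404
try:
    resp.raise_for_status()
    raised = None
except requests.HTTPError as exc:
    raised = exc
print(type(raised).__name__)  # HTTPError
```

Catching `requests.RequestException` therefore covers everything in the table, while more specific handlers should be listed before it.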
### Common code framework
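import requests

def get_html_text(url):
    """Return the page text for url, or '' if the request fails."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()              # raise HTTPError on 4xx/5xx
        r.encoding = r.apparent_encoding  # use the encoding inferred from the content
        return r.text
    except requests.RequestException:
        return ''

url = 'http://www.baidu.com'
text = get_html_text(url)  # get the page text for further processing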
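
Returning `''` hides why a request failed. One possible variation, sketched here under the assumption that the caller wants a short reason string, returns a `(text, error)` pair instead. The function name `fetch_text` and the message wording are hypothetical, not part of Requests.

```python
import requests


def fetch_text(url, timeout=30):
    """Return (text, error); text is None on failure, error is None on success."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text, None
    except requests.Timeout as exc:           # includes ConnectTimeout
        return None, f"timed out: {exc}"
    except requests.HTTPError as exc:         # 4xx / 5xx from raise_for_status()
        return None, f"bad status: {exc}"
    except requests.RequestException as exc:  # everything else Requests raises
        return None, f"request failed: {exc}"


# A URL without a scheme is rejected before any network traffic happens,
# so this call is safe to run offline.
text, error = fetch_text("www.baidu.com")
print(text, error)
```

The specific handlers come first because `requests.RequestException` would otherwise catch everything before they are reached.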