Requests library - Python web crawlers (1)

Requests Library


Installing Requests

pip install requests

The seven main methods of the Requests library

Method              Description
requests.request()  Builds and sends a request; the base method that the six methods below rely on
requests.get()      Fetches an HTML page; corresponds to HTTP GET
requests.head()     Fetches the headers of an HTML page; corresponds to HTTP HEAD
requests.post()     Submits a POST request to an HTML page; corresponds to HTTP POST
requests.put()      Submits a PUT request to an HTML page; corresponds to HTTP PUT
requests.patch()    Submits a partial-modification request to an HTML page; corresponds to HTTP PATCH
requests.delete()   Submits a delete request to an HTML page; corresponds to HTTP DELETE
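The six convenience methods are thin wrappers: requests.get(url, **kwargs), for instance, forwards to requests.request('get', url, **kwargs). A minimal sketch of that relationship follows. It only prepares requests and never sends them, so it runs without network access. The helper name build is hypothetical, and the URL is the example one used later in this article.

```python
import requests

URL = "http://python123.io/ws"  # example URL reused from this article


def build(method, url, **kwargs):
    """Prepare a request without sending it (hypothetical helper)."""
    return requests.Request(method, url, **kwargs).prepare()


# One prepared request per HTTP verb behind the convenience methods
for verb in ["GET", "HEAD", "POST", "PUT", "PATCH", "DELETE"]:
    prepared = build(verb, URL)
    print(prepared.method, prepared.url)
```

Each prepared request carries the verb and the final URL, which is the same information requests.request() works from before it opens a connection.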

1. The get() method

r = requests.get(url)

The code above returns a Response object.

1.1 Response Object


Response object: contains the content returned to the crawler.

Response object attributes

Attribute            Description
r.status_code        HTTP status code of the response; 200 indicates success, while a code such as 404 indicates failure
r.text               Response content as a string, i.e. the content of the page at the URL
r.encoding           Encoding of the response content, guessed from the HTTP headers
r.apparent_encoding  Encoding inferred by analyzing the response content itself (a fallback encoding)
r.content            Response content in binary form
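The relationship between r.content, r.encoding and r.text can be sketched offline. The Response below is built by hand purely for illustration; a real one comes from requests.get(). Setting the private _content attribute is an assumption made here to avoid network access, not something normal crawler code should do.

```python
import requests

# Hand-built Response for illustration only (assumption: the private
# _content attribute is set directly so that no network access is needed).
r = requests.Response()
r.status_code = 200
r._content = "你好，世界".encode("utf-8")

# ISO-8859-1 is what Requests falls back to for text/* responses
# whose headers declare no charset.
r.encoding = "ISO-8859-1"
garbled = r.text          # bytes decoded with the wrong encoding

r.encoding = "utf-8"
decoded = r.text          # same bytes, decoded correctly

print(r.status_code, decoded, garbled != decoded)
```

r.text is always r.content decoded with r.encoding, which is why assigning r.apparent_encoding to r.encoding before reading r.text often repairs garbled pages.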

1.2 Requests exceptions


Exceptions in the Requests library

Exception                  Description
requests.ConnectionError   Network connection error, such as a failed DNS lookup or a refused connection
requests.HTTPError         HTTP error
requests.URLRequired       The URL is missing
requests.TooManyRedirects  The maximum number of redirects was exceeded
requests.ConnectTimeout    Connecting to the remote server timed out
requests.Timeout           The request to the URL timed out
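These exceptions form a hierarchy: requests.ConnectTimeout is a subclass of both requests.ConnectionError and requests.Timeout, and all of them derive from requests.RequestException, so the most specific classes have to be caught first. The following is a minimal sketch of that ordering. The function name fetch_or_explain and its injectable get parameter are hypothetical, added only so the function can be exercised without network access.

```python
import requests


def fetch_or_explain(url, get=requests.get):
    """Return the page text, or a short message naming the failure.

    `get` is injectable (a hypothetical convenience) for offline testing.
    """
    try:
        r = get(url, timeout=10)
        r.raise_for_status()  # raises requests.HTTPError for 4xx/5xx
        return r.text
    except requests.ConnectTimeout:
        # Must come before ConnectionError and Timeout, its parent classes
        return "connect timeout"
    except requests.Timeout:
        return "timeout"
    except requests.TooManyRedirects:
        return "too many redirects"
    except requests.HTTPError as exc:
        return f"HTTP error {exc.response.status_code}"
    except requests.ConnectionError:
        return "connection error"
    except requests.RequestException:
        # Base class of every exception Requests raises
        return "request failed"
```

Calling fetch_or_explain("http://www.baidu.com") performs a real request; passing a stand-in for get lets each branch be checked in isolation.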

1.3 A general code framework for crawling pages


import requests

def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # Raises HTTPError if the status indicates an error (4xx/5xx)
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return "An exception occurred"

# Runs only when this file is executed directly, not when it is imported from another file
if __name__ == '__main__':
    url = "http://www.baidu.com"
    print(getHTMLText(url))

2. The main Requests method in detail


requests.request(method, url, **kwargs)
**kwargs: parameters that control the request; all of them are optional

For example:

kv = {'key1': 'value1', 'key2': 'value2'}
r = requests.request('GET', 'http://python123.io/ws', params=kv)
print(r.url)
# http://python123.io/ws?key1=value1&key2=value2

# Simulate Chrome 10 when accessing the site
hd = {'User-Agent':'Chrome/10'}
r = requests.request('POST', 'http://python123.io/ws', headers=hd)

# Upload a file to the server
fs = {'file': open('data.xls', 'rb')}
r = requests.request('POST', 'http://python123.io/ws', files=fs)

# Set the timeout, in seconds
r = requests.request('GET', 'http://www.baidu.com', timeout=10)

# Set the access proxies
pxs = {'http': 'http://user:[email protected]:1231',
       'https': 'https://11.10.10.1:3212'}
r = requests.request('GET', 'http://www.baidu.com', proxies=pxs)

Parameter        Description
params           Dictionary or byte sequence, appended to the URL as query parameters
data             Dictionary, byte sequence, or file object, sent as the body of the request
json             JSON-formatted data, sent as the body of the request
headers          Dictionary of custom HTTP headers
cookies          Dictionary or CookieJar, the cookies sent with the request
auth             Tuple, enables HTTP authentication
files            Dictionary, the files to upload
timeout          Timeout, in seconds
proxies          Dictionary of proxy servers to route the request through; proxy URLs may carry login credentials
allow_redirects  True/False, default True; whether redirects are followed
stream           True/False, default False; when False the response body is downloaded immediately, when True the download is deferred until the content is accessed
verify           True/False, default True; whether the SSL certificate is verified
cert             Path to a local SSL certificate
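
To see how several of these parameters shape the outgoing request, the request can be prepared without being sent. This is a sketch under stated assumptions: the credentials and the JSON payload are made-up illustration values, and only params, headers, json and auth are shown, because timeout, proxies, allow_redirects, stream, verify and cert take effect when the request is sent and leave no trace on the prepared request.

```python
import json
import requests

req = requests.Request(
    "POST",
    "http://python123.io/ws",            # example URL from this article
    params={"key1": "value1"},           # becomes the query string
    headers={"User-Agent": "Chrome/10"},
    json={"name": "demo"},               # made-up payload for illustration
    auth=("user", "pass"),               # made-up credentials, HTTP Basic auth
)
prepared = req.prepare()                 # builds the request, sends nothing

print(prepared.url)
print(prepared.headers["User-Agent"])
print(prepared.headers["Content-Type"])
print(prepared.headers["Authorization"])
print(json.loads(prepared.body))
```

Inspecting a prepared request this way is a convenient check that a parameter ended up in the URL, the headers, or the body as intended before a crawler hits a real server.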

Origin blog.csdn.net/qq_36852780/article/details/104264452