Python crawler learning 13
-
requests library
Earlier we learned the basic usage of Python's built-in urllib library. It has many inconveniences: for example, to handle web page authentication and cookies we have to write Opener and Handler objects ourselves, and sending requests such as POST and PUT is also awkward.
So today we will learn about the more powerful requests library.
-
Installation of the requests library
The simplest way is with pip (`pip install requests`). Or install from source:

```
# Visit a mirror site
# Download the archive, extract it into the Python installation directory, then open the extracted folder
# Open a command line there and run:
python setup.py install
```
-
Example introduction
The urlopen method in the urllib library actually sends a GET request; in the requests library, the counterpart is the get() method.

```python
import requests

r = requests.get('http://www.baidu.com')
print(type(r))        # a requests.models.Response object
print(r.status_code)  # the response status code; 200 here means success
print(r.cookies)      # the cookies, as a RequestsCookieJar
print(r.text)         # the response body, as a str
```

Output:
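The get() method can also carry query parameters via its params argument. As a minimal sketch, using made-up sample values (`name`, `age`) and building the request with requests.Request(...).prepare() instead of sending it, so the encoded URL is visible offline:

```python
import requests

# Build (but do not send) a GET request to see how requests
# encodes a params dict into the URL's query string.
req = requests.Request(
    'GET',
    'http://httpbin.org/get',
    params={'name': 'germey', 'age': 22},
).prepare()

print(req.url)  # http://httpbin.org/get?name=germey&age=22
```

In everyday use you would simply call `requests.get(url, params=...)` and let requests send the request for you.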
-
Some of the commonly used methods
The most commonly used methods are get() and post(), which send GET and POST requests respectively; requests also provides methods for the other HTTP verbs, such as put() and delete().
```python
import requests

req0 = requests.get('http://www.baidu.com')
req1 = requests.post('http://www.baidu.com')
req2 = requests.put('http://www.baidu.com')
req3 = requests.delete('http://www.baidu.com')
print(req0.status_code, req1.status_code, req2.status_code, req3.status_code, sep='\n')
```

Output:
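A POST request usually carries form data in its body via the data argument. As a sketch with made-up sample values, again preparing the request without sending it so the encoded body can be inspected offline:

```python
import requests

# Prepare (but do not send) a POST request to inspect how a data
# dict is encoded into the request body and headers.
req = requests.Request(
    'POST',
    'http://httpbin.org/post',
    data={'name': 'germey', 'age': '22'},
).prepare()

print(req.body)                      # name=germey&age=22
print(req.headers['Content-Type'])   # application/x-www-form-urlencoded
```

When you call `requests.post(url, data=...)` directly, the same encoding happens behind the scenes.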
-
Common properties of the Response object
```python
import requests

url = 'http://www.baidu.com'
resp = requests.get(url)
# Set the encoding used to decode resp.text
resp.encoding = 'utf-8'
# Read some common attributes
cookie = resp.cookies
headers = resp.headers
print(cookie, headers, sep='\n')
```

Output:
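One detail worth knowing: resp.headers is a CaseInsensitiveDict, so header names can be looked up regardless of case. A small offline sketch using that class directly (with a made-up header value):

```python
from requests.structures import CaseInsensitiveDict

# Response headers behave like this structure: lookups ignore case.
headers = CaseInsensitiveDict({'Content-Type': 'text/html; charset=utf-8'})
print(headers['content-type'])  # text/html; charset=utf-8
print(headers['CONTENT-TYPE'])  # text/html; charset=utf-8
```

This is why code like `resp.headers['content-type']` works no matter how the server capitalizes its header names.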
-
That's all for today. To be continued...