Python crawler learning 13

  • requests library

    Earlier we learned the basic usage of Python's built-in urllib library. It has a number of inconveniences: to handle web page authentication or cookies, we have to write our own Opener and Handler, and sending requests such as POST and PUT is also awkward.

    So today we will learn about the more powerful requests library.
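    To make the inconvenience concrete, here is a minimal sketch of the urllib boilerplate mentioned above, using only the standard library. The URL and the form field are placeholders, not taken from the original article, and nothing is actually sent over the network.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Cookie handling in urllib: a CookieJar has to be wrapped in a Handler,
# and the Handler has to be installed into an Opener.
cookie_jar = http.cookiejar.CookieJar()
cookie_handler = urllib.request.HTTPCookieProcessor(cookie_jar)
opener = urllib.request.build_opener(cookie_handler)

# A POST request in urllib: the form data must be URL-encoded and
# converted to bytes by hand. URL and field name are placeholders.
form_data = urllib.parse.urlencode({'name': 'value'}).encode('utf-8')
post_request = urllib.request.Request(
    'http://www.example.com/login', data=form_data, method='POST'
)

print(post_request.get_method())  # POST
print(post_request.data)          # b'name=value'
# Nothing is sent until opener.open(post_request) is called.
```

    With requests, the cookie jar and the form encoding are handled for us, as the examples later in this post show.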

    • Installation of the requests library

      Install with pip:

      pip install requests

      Or install from source:

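      # Visit the mirror site
      # Download the file, extract it into the Python installation directory, then open the extracted folder
      # Open a command line and run: python setup.py install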
      
    • Example introduction

      # The urlopen method in the urllib library actually requests the page with GET;
      # its counterpart in the requests library is the get() method.
      
      import requests
      
      r = requests.get('http://www.baidu.com')
      
      print(type(r))          # the return value is a requests.models.Response object
      print(r.status_code)    # status code; 200 means the request succeeded
      print(r.cookies)        # cookies, as a RequestsCookieJar object
      print(r.text)           # response body, as a str
      


    • Some of the commonly used methods

      The methods we use most often are get() and post(), which send GET and POST requests respectively:

      import requests
      
      req0 = requests.get('http://www.baidu.com')
      req1 = requests.post('http://www.baidu.com')
      req2 = requests.put('http://www.baidu.com')
      req3 = requests.delete('http://www.baidu.com')
      
      print(req0.status_code, req1.status_code, req2.status_code, req3.status_code, sep='\n')
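
      To see how a GET and a POST actually differ, the following sketch builds both requests without sending them, so it runs offline. It assumes the requests library is installed; the URL and the parameter names are placeholders chosen for illustration.

```python
import requests

# Build the requests without sending them, so no network access is needed.
# The URL and the parameter names are placeholders.
get_req = requests.Request(
    'GET', 'http://www.example.com/search', params={'wd': 'python'}
).prepare()
post_req = requests.Request(
    'POST', 'http://www.example.com/login', data={'name': 'value'}
).prepare()

# GET data travels in the query string of the URL.
print(get_req.method, get_req.url)

# POST data travels in the request body, with a form Content-Type header.
print(post_req.method, post_req.url)
print(post_req.body)
print(post_req.headers['Content-Type'])
```

      Calling requests.get(url, params=...) or requests.post(url, data=...) prepares the request in the same way and then sends it.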
      


    • Common properties of the Response object

      import requests
      
      url = 'http://www.baidu.com'
      resp = requests.get(url)
      # Set the encoding used to decode the response body
      resp.encoding = 'utf-8'
      # Get the cookies and the response headers
      cookie = resp.cookies
      headers = resp.headers
      print(cookie, headers, sep='\n')
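
      The attributes can also be explored without touching a real site. The sketch below starts a throwaway local server from the standard library and reads the response with requests. DemoHandler, the cookie name and the page body are hypothetical, made up for this illustration; a Session with trust_env disabled is used only so that proxy environment variables do not interfere with the local request.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: returns a small UTF-8 page and sets a cookie."""

    def do_GET(self):
        body = 'café'.encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')  # no charset given
        self.send_header('Set-Cookie', 'token=abc123')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 lets the operating system pick a free port.
server = HTTPServer(('127.0.0.1', 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy settings for this local demo
resp = session.get(f'http://127.0.0.1:{server.server_port}/', timeout=5)

guessed_encoding = resp.encoding  # fallback chosen from the headers
resp.encoding = 'utf-8'           # decode resp.text as UTF-8 instead

print(resp.status_code)              # 200
print(guessed_encoding)              # ISO-8859-1
print(resp.headers['Content-Type'])  # text/html
print(resp.cookies.get('token'))     # abc123
print(resp.text)                     # café
print(resp.content)                  # the same body as raw bytes

server.shutdown()
server.server_close()
```

      Because the server sends no charset, requests falls back to ISO-8859-1 for text responses, which is why setting resp.encoding before reading resp.text matters for pages that are really UTF-8.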
      


That's all for today; to be continued...

Origin blog.csdn.net/szshiquan/article/details/123412611