Getting Started with Requests

Sending a GET request

  1. The simplest way to send a GET request is to call requests.get:
import requests
response = requests.get("http://www.baidu.com/")
  2. Add headers and query parameters:
    To send custom request headers, pass a dictionary as the headers parameter. To add query parameters to the URL, pass them as the params parameter. Sample code:
import requests

kw = {'wd': '中国'}

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36"
}

# params accepts a dict or a string of query parameters; a dict is
# URL-encoded automatically, so there is no need to call urlencode()
response = requests.get("http://www.baidu.com/s", params=kw, headers=headers)

# Response body as decoded text (str)
print(response.text)

# Response body as the raw byte stream (bytes)
print(response.content)

# Full URL of the request
print(response.url)

# Character encoding determined from the response headers
print(response.encoding)

# HTTP status code
print(response.status_code)
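
To see what the params dictionary turns into, the query string can be previewed with the standard library alone. This is only a sketch: it assumes that urllib.parse.urlencode yields the same percent-encoding that requests applies to a simple dictionary such as kw, and the names query and preview_url are illustrative. In normal use requests performs this step for you.

```python
from urllib.parse import urlencode

kw = {'wd': '中国'}

# Percent-encode the dictionary (UTF-8 by default) into query-string form.
query = urlencode(kw)

# Build the URL by hand purely for inspection; requests does this itself.
preview_url = "http://www.baidu.com/s?" + query

print(query)
print(preview_url)
```

In the sample above, print(response.url) is where the encoded form would show up for a real request, provided the server does not redirect.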

When a Python string is written to disk or sent over the network it is encoded first, so the data in storage and in transit is always of type bytes.

The difference between response.text and response.content:

  • response.content: the data exactly as it was received from the network, with no decoding applied, so it is of type bytes. Strings stored on disk and transmitted over the network are always bytes.
  • response.text: a str, produced by requests decoding response.content. Decoding requires an encoding, and requests guesses which one to use. The guess is sometimes wrong, in which case the decoded text comes out garbled. When that happens, decode manually, for example with response.content.decode('utf-8').
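
The effect of a wrong guess can be reproduced without any network call. The sketch below uses a hand-made bytes value as a stand-in for response.content (the names raw, wrong, and right are illustrative) and decodes it once with a mismatched encoding and once with the correct one.

```python
# Stand-in for response.content: UTF-8 bytes as they would arrive from a server.
raw = '中国'.encode('utf-8')

# Decoding with a mismatched encoding raises no error here;
# it silently yields garbled text.
wrong = raw.decode('iso-8859-1')

# Decoding with the encoding the bytes were written in recovers the text.
right = raw.decode('utf-8')

print(repr(wrong))
print(right)
```

With a real response, assigning response.encoding = 'utf-8' before reading response.text is another way to override the guess.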

Demo

# Fetch a page and save it to a local file
import requests

# Query string parameters
params = {'wd': '中国'}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36'
}
res = requests.get('http://www.baidu.com/s', params=params, headers=headers)
with open('baidu.html', 'w', encoding='utf-8') as fp:
    # Decode the raw bytes explicitly instead of relying on the guessed encoding
    fp.write(res.content.decode('utf-8'))

print(res.url)
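
An alternative to decoding and re-encoding is to write the raw bytes in binary mode, which keeps the file identical to what the server sent whatever its encoding. The following is a minimal sketch, not part of the original demo: save_page is a hypothetical helper, and body is a hand-made stand-in for res.content so the snippet runs without a network connection.

```python
import os
import tempfile


def save_page(path, body):
    """Write response bytes to disk unchanged (hypothetical helper)."""
    with open(path, 'wb') as fp:
        fp.write(body)


# Stand-in for res.content; with a real response, pass res.content instead.
body = '<html>中国</html>'.encode('utf-8')

# Write into a temporary directory so the example leaves the working dir alone.
path = os.path.join(tempfile.mkdtemp(), 'baidu.html')
save_page(path, body)

# Read the file back in binary mode to confirm the bytes round-trip exactly.
with open(path, 'rb') as fp:
    saved = fp.read()

print(saved == body)
```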


Origin blog.csdn.net/Pang_ling/article/details/105671755