Python crawler practice - urllib.request and requests

The two previous demos used the request module inside urllib. With it, we had to do a lot by hand: decode the response body to get usable content, URL-encode and splice the request parameters, construct a Request object for GET or POST calls, and set up proxies, headers and the like ourselves. The requests library wraps the request module further: we just call the function that corresponds to the HTTP method, and the request is built for us. Still, for cases such as simulated login, being able to tailor an HTTP request by hand with urllib remains essential.
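As a rough sketch of what "building the request by hand" means with urllib, the snippet below assembles a POST request and a proxy-aware opener without sending anything. The login URL, the form fields and the proxy address are all made up for illustration and do not come from the original demos.

```python
from urllib import parse, request

# Hypothetical endpoint and form fields, used only for illustration.
LOGIN_URL = "http://example.com/login"
form = {"username": "alice", "password": "secret"}

# urllib expects a POST body as bytes: URL-encode the dict, then encode to UTF-8.
body = parse.urlencode(form).encode("utf-8")

req = request.Request(
    LOGIN_URL,
    data=body,  # supplying data switches the method from GET to POST
    headers={"User-Agent": "Mozilla/5.0"},
)

# A proxy needs its own handler and opener (the address is a placeholder).
proxy_handler = request.ProxyHandler({"http": "http://127.0.0.1:8080"})
opener = request.build_opener(proxy_handler)

print(req.get_method())  # POST
print(req.data)          # b'username=alice&password=secret'
# opener.open(req) would actually send the request through the proxy.
```

With requests, the same steps would likely collapse into a single call with data, headers and proxies arguments, which is the convenience the paragraph above describes.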

Comparing urllib.request and requests

First, urllib.request

# demo_urllib
from urllib import parse, request

headers = {
    "User-Agent": "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-cn; BLA-AL00 Build/HUAWEIBLA-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/8.9 Mobile Safari/537.36"
}
wd = {"wd": "中国"}
# urllib does not build the query string for us: URL-encode wd and append it
url = "http://www.baidu.com/s?" + parse.urlencode(wd)
req = request.Request(url, headers=headers)
response = request.urlopen(req)
print(type(response))
print(response)
# read() returns bytes; decode() (UTF-8 by default) converts them to str
res = response.read().decode()
print(type(res))
print(res)

result:

With urllib, we create a Request object and pass it to request.urlopen to complete the HTTP request. What comes back is an HTTPResponse object whose body is the HTML of the page, as bytes. Calling .read().decode() decodes those bytes into a str, and only after decoding do the Chinese characters display properly.

Next, requests
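The read-then-decode step can be tried offline. In this minimal sketch, an email.message.Message object stands in for response.headers (the headers of a real urllib response offer the same get_content_charset method), and the GBK-encoded body is invented for the example. The decode_body helper is hypothetical, not part of urllib.

```python
from email.message import Message
from typing import Optional


def decode_body(raw: bytes, charset: Optional[str] = None) -> str:
    """Decode a response body, falling back to UTF-8 when no charset is declared."""
    return raw.decode(charset or "utf-8", errors="replace")


# Stand-in for response.headers (assumption: the server declares GBK).
headers = Message()
headers["Content-Type"] = "text/html; charset=gbk"

# Stand-in for response.read(): the body always arrives as bytes.
raw = "<title>中国</title>".encode("gbk")

charset = headers.get_content_charset()
html = decode_body(raw, charset)

print(type(raw))  # <class 'bytes'>
print(charset)    # gbk
print(html)       # <title>中国</title>
```

Calling .decode() with no argument assumes UTF-8, so a page served in another encoding would need the charset passed explicitly, as above.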

# demo_requests
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-cn; BLA-AL00 Build/HUAWEIBLA-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/8.9 Mobile Safari/537.36"
}
wd = {"wd": "中国"}
url = "http://www.baidu.com/s?"
# requests URL-encodes params and appends them to the URL for us
response = requests.get(url, params=wd, headers=headers)
data = response.text       # body decoded to str
data2 = response.content   # raw body as bytes
print(response)
print(type(response))
print(data)
print(type(data))
print(data2)
print(type(data2))
# decoding the bytes by hand also yields a str
print(data2.decode())
print(type(data2.decode()))

result:

With requests, we call requests.get and pass in the URL and the query parameters. The returned object is a Response; printing it shows the response status code. The .text attribute returns the body as a Unicode str, generally decoded with the encoding declared in the response headers, while .content returns the raw bytes. Those are the two body types; there is also a .json() method, which parses a JSON body into Python objects. Use text when you want to extract text, but for images, files and other binary data you must use content. And of course, once content is decoded, the Chinese characters display properly too.
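The relationship between the three views of a body can be sketched with the standard library alone. No requests call is made here: the JSON payload is made up, and the comments only indicate which Response attribute each step roughly corresponds to.

```python
import json
import os
import tempfile

# Made-up payload standing in for response.content (raw bytes off the wire).
content = '{"country": "中国"}'.encode("utf-8")

# Roughly what response.text gives: the bytes decoded into a str.
text = content.decode("utf-8")

# Roughly what response.json() gives: the body parsed into Python objects.
data = json.loads(text)

# Binary payloads (images, archives) are written out as bytes, never via text.
path = os.path.join(tempfile.mkdtemp(), "body.bin")
with open(path, "wb") as f:
    f.write(content)

print(type(content), type(text), type(data))
print(data["country"])  # 中国
```

The file is opened in "wb" mode because bytes must be written unchanged; passing an image through a text decode and encode round trip would risk corrupting it.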

Origin www.cnblogs.com/liuchaodada/p/12050745.html