Simple network requests:
from urllib import request
url = "http://www.baidu.com"
rep = request.urlopen(url)
urlopen sends the request and returns the response, which is stored in the variable rep.
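As a minimal sketch of the same call, the response can also be opened in a with block so that it is closed automatically, and the bytes it returns can be decoded to text. To stay runnable without network access, this example points urlopen at a temporary local file through a file:// URL. The file name, its contents, and the UTF-8 encoding are assumptions made for illustration; an http:// URL would be passed in the same way.

```python
import tempfile
from pathlib import Path
from urllib import request

# Hypothetical stand-in for a web page: a small local HTML file.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "page.html"
    page.write_bytes(b"<!DOCTYPE html>\n<html><body>hello</body></html>\n")
    url = page.as_uri()  # file:///... URL; use an http:// URL for a real request

    # The with block closes the response when it exits.
    with request.urlopen(url) as rep:
        raw = rep.read()        # the body arrives as bytes

    html = raw.decode("utf-8")  # str; assumes the page is UTF-8 encoded

print(html.splitlines()[0])
```

Decoding is a separate step because urlopen does not guess the text encoding for you.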
Working with the returned data:
The object returned by urlopen() is file-like and can also be iterated over. Commonly used methods include:
read(), readline(), readlines(): read the response body
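rep = request.urlopen(url)
rep.read()  # whole body as bytes (long output omitted)
rep = request.urlopen(url)  # reopen: read() has consumed the stream
rep.readline()
b'<!DOCTYPE html>\n'
rep.readlines()  # remaining lines as a list of bytes (long output omitted)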
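The three read methods share one stream position, so each call continues where the previous one stopped. The sketch below shows this, along with iterating over the response line by line. It again uses a temporary local file behind a file:// URL (an assumption, so that it runs offline); the file contents are made up for the example.

```python
import tempfile
from pathlib import Path
from urllib import request

with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "page.html"  # hypothetical three-line page
    page.write_bytes(b"<!DOCTYPE html>\n<html>\n</html>\n")
    url = page.as_uri()

    with request.urlopen(url) as rep:
        first = rep.readline()   # one line, including its trailing newline
        rest = rep.readlines()   # the remaining lines as a list of bytes
        leftover = rep.read()    # empty: the stream is already consumed

    # Iterating over a fresh response yields the lines one at a time.
    with request.urlopen(url) as rep:
        lines = [line for line in rep]

print(first, rest, leftover, len(lines))
```

To read the body a second time, either keep the bytes from the first read() or open the URL again.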
info(): returns the response headers
print(rep.info())
Date: Sat, 27 Jul 2019 03:32:18 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: Close
Vary: Accept-Encoding
Set-Cookie: BAIDUID=71F5315626EBFC522CD27C212E0BDC71:FG=1; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com
Set-Cookie: BIDUPSID=71F5315626EBFC522CD27C212E0BDC71; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com
# (too many headers to show; the rest are omitted)
getcode(): returns the HTTP status code
print(rep.getcode())
200
geturl(): returns the URL of the resource that was retrieved (useful for checking whether a redirect was followed)
print(rep.geturl())
http://www.baidu.com
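The three accessors above can be tried without touching a real site. The sketch below starts a throwaway HTTP server on the loopback interface and queries it. DemoHandler, its response body, and the use of a proxy-free opener are all assumptions made for this example; opener.open() returns the same kind of response object as urlopen(). Since Python 3.9 the documentation marks getcode(), geturl() and info() as deprecated in favor of the status, url and headers attributes, so the sketch also checks that both spellings agree.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<!DOCTYPE html>\n<html></html>\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 lets the OS pick a free port on the loopback interface.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# An opener built with an empty ProxyHandler ignores proxy environment
# variables, so the loopback request goes straight to the demo server.
opener = request.build_opener(request.ProxyHandler({}))
try:
    with opener.open(url, timeout=5) as rep:
        status = rep.getcode()                      # HTTP status code
        final_url = rep.geturl()                    # URL actually retrieved
        content_type = rep.info()["Content-Type"]   # header lookup by name
        same_status = rep.status == status          # attribute spelling
        same_url = rep.url == final_url
        same_headers = rep.headers is rep.info()
finally:
    server.shutdown()
    server.server_close()

print(status, final_url, content_type)
```

Header lookup on the object returned by info() is case-insensitive, so "content-type" would work as well.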
urllib also provides a built-in function, urlretrieve, that makes it easy to save a page to a local file:
from urllib.request import urlretrieve
url = "http://www.baidu.com"
urlretrieve(url, r'D:/baidu.html')
The above code is equivalent to:
from urllib import request
request.urlretrieve("http://www.baidu.com", r"D:/baidu.html")
('D:/baidu.html', <http.client.HTTPMessage object at 0x03576A30>)
urlretrieve(url, filename=None, reporthook=None, data=None)
url: URL of the file to download
filename: local path (file name) to save the download to; if omitted, a temporary file is used
reporthook: callback invoked during the transfer, commonly used to drive a progress bar
data: data to POST to the server
The function returns a two-element tuple: (local file path, <http.client.HTTPMessage object>)
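The reporthook callback receives three arguments: the number of blocks transferred so far, the block size in bytes, and the total size of the file. A minimal sketch of a progress hook follows. So that it runs offline, the "download" source is a temporary local file addressed by a file:// URL; the names report, source.bin and copy.bin are hypothetical, and with a file:// source the returned headers are a plain message object rather than the HTTPMessage shown above. Note that the Python documentation lists urlretrieve as a legacy interface that may be deprecated at some point.

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

progress = []  # percentages reported by the hook, in call order


def report(block_num, block_size, total_size):
    """Record progress; total_size is -1 when the size is unknown."""
    if total_size > 0:
        done = min(block_num * block_size, total_size)
        progress.append(round(done * 100 / total_size))


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.bin"  # hypothetical file to "download"
    source.write_bytes(b"x" * 20000)
    target = Path(tmp) / "copy.bin"    # hypothetical destination path

    saved_path, headers = urlretrieve(
        source.as_uri(), str(target), reporthook=report
    )
    copied = Path(saved_path).read_bytes()

print(saved_path, progress)
```

The hook is first called before any data is read, which is why the recorded progress starts at 0. Capping the byte count at total_size keeps the last value at 100 even though the final block is usually only partly filled.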