```python
from urllib import request
from urllib import parse

# 1. urlopen: open a page and read its content
resp = request.urlopen('http://www.baidu.com')
print(resp.read())       # read everything
print(resp.read(10))     # read the first 10 bytes
print(resp.readline())   # read one line
print(resp.readlines())  # read the remaining lines as a list
print(resp.getcode())    # get the response code

# 2. urlretrieve: download a resource to a local file
request.urlretrieve('http://www.baidu.com', 'baidu.html')
request.urlretrieve('https://img-blog.csdn.net/20180103144259961?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvbGVlYWZheQ==/font/5a6L5L2T/fontSize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast', 'VS.jpg')

# 3. urlencode: encode dictionary data into url-encoded data
url = 'http://www.baidu.com/s'
params = {'wd': 'Jay'}
qs = parse.urlencode(params)
url = url + '?' + qs
resp = request.urlopen(url)
print(resp)

# 4. parse_qs: decode url-encoded parameters
result = parse.parse_qs('wd=%E5%91%A8%E6%9D%B0%E4%BC%A6')
print(result)

# 5. urlparse: split a url into its components
url = 'http://www.baidu.com/s?wd=python&username=abc#1'
result = parse.urlparse(url)
print(result)  # all components of the url
print('scheme:', result.scheme)
print('netloc:', result.netloc)
print('path:', result.path)
print('params:', result.params)
print('query:', result.query)
print('fragment:', result.fragment)

# 6. urlsplit: like urlparse, but without the params attribute
result = parse.urlsplit(url)
print(result)  # all components of the url
print('scheme:', result.scheme)
print('netloc:', result.netloc)
print('path:', result.path)
# print('params:', result.params)  # urlsplit has no params attribute
print('query:', result.query)
print('fragment:', result.fragment)
```
The `urllib` library is `Python`'s basic network-request library. It can simulate the behavior of a browser, send a request to a specified server, and save the data the server returns.

### urlopen function:

In `Python3`, all of the `urllib` library's network-request methods are collected under the `urllib.request` module. Let's first look at the basic use of the `urlopen` function:

```python
from urllib import request

resp = request.urlopen('http://www.baidu.com')
print(resp.read())
```

In fact, if you visit Baidu in a browser and right-click to view the page source, you will find it is exactly the same as the data we just printed. In other words, the three lines of code above have already crawled all of the code of Baidu's home page for us. A basic url request really is quite simple in Python.

Here is a detailed explanation of the `urlopen` function:
1. `url`: the URL to request.
2. `data`: the request `data`. If this value is set, the request becomes a `post` request.
3. Return value: the return value is an `http.client.HTTPResponse` object, which is a file-like object. It has methods such as `read(size)`, `readline`, `readlines`, and `getcode`.

### urlretrieve function:

This function conveniently saves a web page to a local file. The following code very easily downloads Baidu's home page to the local disk:

```python
from urllib import request

request.urlretrieve('http://www.baidu.com/', 'baidu.html')
```

### urlencode function:

When a browser sends a request, if the url contains Chinese or other special characters, the browser automatically encodes them for us. If we send the request from code, we must encode manually; this is where the `urlencode` function comes in. `urlencode` converts dictionary data into `URL`-encoded data.
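The `data` parameter mentioned above is worth a quick illustration. A minimal sketch (not from the original article): when a `urllib.request.Request` is given a `data` body, `urllib` treats it as a `POST` instead of a `GET`:

```python
from urllib import request, parse

# url-encode the form fields, then convert to bytes as urllib requires
data = parse.urlencode({'wd': 'python'}).encode('utf-8')

# attaching a data body switches the request method to POST
req = request.Request('http://www.baidu.com/s', data=data)
print(req.get_method())  # POST
print(req.data)          # b'wd=python'
```

The request itself would then be sent with `request.urlopen(req)`.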
Sample code is as follows:

```python
from urllib import parse

data = {'name': 'crawler base', 'greet': 'Hello World', 'age': 100}
qs = parse.urlencode(data)
print(qs)
```

### parse_qs function:

Encoded url parameters can be decoded with `parse_qs`. Sample code is as follows:

```python
from urllib import parse

qs = "name=%E7%88%AC%E8%99%AB%E5%9F%BA%E7%A1%80&greet=Hello+World&age=100"
print(parse.parse_qs(qs))
```

### urlparse and urlsplit functions:

Sometimes when you get a url you want to split it into its various components; for this you can use `urlparse` or `urlsplit`. Sample code is as follows:

```python
from urllib import parse

url = 'http://www.baidu.com/s?username=python'
result = parse.urlsplit(url)
# result = parse.urlparse(url)
print('scheme:', result.scheme)
print('netloc:', result.netloc)
print('path:', result.path)
print('query:', result.query)
```

`urlparse` and `urlsplit` are basically identical. The only difference is that `urlparse` has an extra `params` attribute, which `urlsplit` does not have. For example, given the `url`: `url = 'http://www.baidu.com/s;hello?wd=python&username=abc#1'`, `urlparse` can get `hello` as `params`, while `urlsplit` cannot. The `params` part of a `url` is rarely used anyway.
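To make the `params` difference concrete, here is a small runnable check using that same example url:

```python
from urllib import parse

url = 'http://www.baidu.com/s;hello?wd=python&username=abc#1'

# urlparse exposes the ;hello segment as a separate params attribute
print(parse.urlparse(url).params)  # 'hello'
print(parse.urlparse(url).path)    # '/s'

# urlsplit has no params attribute, so the segment stays in the path
print(parse.urlsplit(url).path)    # '/s;hello'
print(hasattr(parse.urlsplit(url), 'params'))  # False
```

So `urlsplit` is slightly simpler, and for urls without a `;params` segment the two functions behave the same.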