01 - Commonly used functions of the urllib library

from urllib import request
from urllib import parse

# 1. urlopen: open a page and read its content
# Note: the body is a stream, so once read() has consumed it,
# the later read calls return empty results; try them one at a time.
resp = request.urlopen('http://www.baidu.com')
print(resp.read())       # read everything
print(resp.read(10))     # read the first 10 bytes
print(resp.readline())   # read one line
print(resp.readlines())  # read multiple lines, returned as a list
print(resp.getcode())    # get the response status code

# 2. urlretrieve: download a file
request.urlretrieve('http://www.baidu.com', 'baidu.html')
request.urlretrieve('https://img-blog.csdn.net/20180103144259961?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvbGVlYWZheQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast', 'VS.jpg')

# 3. urlencode: encode a dict as a query string
url = "http://www.baidu.com/s"
params = {"wd": "Jay"}
qs = parse.urlencode(params)
url = url + "?" + qs
resp = request.urlopen(url)
print(resp)

# 4. parse_qs: decode a query string
result = parse.parse_qs("wd=%E5%91%A8%E6%9D%B0%E4%BC%A6")
print(result)

# 5. urlparse: split a URL into its components
url = "http://www.baidu.com/s?wd=python&username=abc#1"
result = parse.urlparse(url)
print(result)  # all components of the URL
print("scheme:", result.scheme)
print("netloc:", result.netloc)
print("path:", result.path)
print("params:", result.params)
print("query:", result.query)
print("fragment:", result.fragment)

# 6. urlsplit: split a URL into its components, without params
url = "http://www.baidu.com/s?wd=python&username=abc#1"
result = parse.urlsplit(url)
print(result)  # all components of the URL
print("scheme:",result.scheme)
print("netloc:",result.netloc)
print("path:",result.path)
# print("params:",result.params)
print("query:",result.query)
print("fragment:",result.fragment)

  

The `urllib` library is a basic network request library in `Python`. It can simulate the behavior of a browser by sending a request to a specified server and saving the data the server returns.

### urlopen function: 

In the `urllib` library for `Python3`, all methods related to network requests are collected under the `urllib.request` module. First, here is the basic usage of the `urlopen` function:

```python
from urllib import request
resp = request.urlopen('http://www.baidu.com')
print(resp.read())
```

If you open Baidu in a browser, right-click and choose View Source, you will find that the page source matches the data we just printed. In other words, the three lines of code above have already scraped the entire source of the Baidu home page. The Python code for a basic URL request really is quite simple.
The `urlopen` function is explained in detail below:

1. `url`: the URL to request.
2. `data`: the request `data`. If this value is set, the request becomes a `POST` request.
3. Return value: an `http.client.HTTPResponse` object, which is a file-like object. It provides methods such as `read(size)`, `readline`, `readlines`, and `getcode`.
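The effect of the `data` parameter can be checked without sending anything over the network. The sketch below is only an illustration: it builds `urllib.request.Request` objects (not used elsewhere in this article) and inspects which HTTP method each would use. The form field `wd=python` is a made-up example value.

```python
from urllib import parse, request

# Hypothetical form fields, used only for illustration.
form = {"wd": "python"}

# data must be bytes, so encode the query string first.
body = parse.urlencode(form).encode("utf-8")

# Without data the request is a GET; with data it becomes a POST.
get_req = request.Request("http://www.baidu.com/s")
post_req = request.Request("http://www.baidu.com/s", data=body)

print(get_req.get_method())   # GET
print(post_req.get_method())  # POST

# Actually sending it needs network access:
# resp = request.urlopen(post_req)
# or equivalently: request.urlopen("http://www.baidu.com/s", data=body)
```

Nothing is sent until `urlopen` is called, so this is a safe way to see what a request will look like beforehand.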

### urlretrieve function:

This function makes it easy to save a file from a web page locally. The following code downloads the Baidu home page to a local file:

```python
from urllib import request
request.urlretrieve('http://www.baidu.com/', 'baidu.html')
```
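`urlretrieve` also returns a value: a tuple of the local file name and the response headers. As a rough sketch that runs without network access, the example below points `urlretrieve` at a `file://` URL for a temporary file instead of a real website. The file names `source.html` and `copy.html` are invented for the illustration.

```python
import tempfile
from pathlib import Path
from urllib import request

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a remote page: a small local file (assumption for this demo).
    src = Path(tmp) / "source.html"
    src.write_text("<html>hello</html>", encoding="utf-8")

    dest = Path(tmp) / "copy.html"

    # urlretrieve returns (local_filename, headers).
    saved_path, headers = request.urlretrieve(src.as_uri(), str(dest))
    copied = dest.read_text(encoding="utf-8")

print(saved_path == str(dest))  # True
print(copied)                   # <html>hello</html>
```

With a real `http://` URL the call has the same shape; only the first argument changes.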

### urlencode function: 

When a browser sends a request and the URL contains Chinese or other special characters, the browser encodes them automatically. When you send a request from code, you must encode them manually, which is what the `urlencode` function is for. `urlencode` converts dictionary data into URL-encoded data. Sample code is as follows:

```python
from urllib import parse
# '爬虫基础' means "crawler basics"; non-ASCII text is percent-encoded as UTF-8
data = {'name': '爬虫基础', 'greet': 'hello world', 'age': 100}
qs = parse.urlencode(data)
print(qs)
```
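Two optional arguments of `urlencode` are worth knowing about, shown here as a small sketch with made-up keys and values: `quote_via` controls how spaces are encoded, and `doseq=True` expands a list value into repeated keys.

```python
from urllib import parse

# Default: spaces become '+' (quote_plus is used internally).
plus_form = parse.urlencode({"greet": "hello world"})
print(plus_form)     # greet=hello+world

# quote_via=parse.quote encodes spaces as %20 instead.
percent_form = parse.urlencode({"greet": "hello world"}, quote_via=parse.quote)
print(percent_form)  # greet=hello%20world

# doseq=True turns a list value into one key=value pair per item.
multi = parse.urlencode({"tag": ["a", "b"]}, doseq=True)
print(multi)         # tag=a&tag=b
```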

### parse_qs function: 

This function decodes URL parameters that have been encoded. Sample code is as follows:

```python
from urllib import parse
qs = "name=%E7%88%AC%E8%99%AB%E5%9F%BA%E7%A1%80&greet=hello+world&age=100"
print(parse.parse_qs(qs))
```
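One detail that is easy to miss: `parse_qs` returns every value as a list of strings, because a key may appear more than once in a query string. The sketch below shows this, along with the related `parse_qsl` function (not covered in this article), which returns a flat list of pairs that can be passed straight back to `urlencode`.

```python
from urllib import parse

qs = "name=%E7%88%AC%E8%99%AB%E5%9F%BA%E7%A1%80&greet=hello+world&age=100"

# parse_qs: dict of key -> list of values (numbers stay strings).
decoded = parse.parse_qs(qs)
print(decoded["name"])   # ['爬虫基础']
print(decoded["age"])    # ['100']

# parse_qsl: list of (key, value) tuples in their original order.
pairs = parse.parse_qsl(qs)
print(pairs[1])          # ('greet', 'hello world')

# Encoding the pairs again reproduces the original query string.
print(parse.urlencode(pairs) == qs)  # True
```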


### urlparse and urlsplit functions:

Sometimes you have a URL and want to split it into its individual components. In that case, you can use `urlparse` or `urlsplit`. Sample code is as follows:

```python
from urllib import parse

url = 'http://www.baidu.com/s?username=python'

result = parse.urlsplit(url)
# result = parse.urlparse(url)

print('scheme:', result.scheme)
print('netloc:', result.netloc)
print('path:', result.path)
print('query:', result.query)
```

`urlparse` and `urlsplit` are essentially the same. The only difference is that the result of `urlparse` has an extra `params` attribute, which the result of `urlsplit` lacks. For example, given `url = 'http://www.baidu.com/s;hello?wd=python&username=abc#1'`, `urlparse` can retrieve `hello`, while `urlsplit` cannot. In practice, `params` in a URL is rarely used.
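The difference can be seen by running both functions on that URL. This is a minimal sketch; the variable names `parsed` and `split` are chosen only for the illustration.

```python
from urllib import parse

url = 'http://www.baidu.com/s;hello?wd=python&username=abc#1'

parsed = parse.urlparse(url)
split = parse.urlsplit(url)

# urlparse separates the ';hello' part into params.
print(parsed.path)    # /s
print(parsed.params)  # hello

# urlsplit leaves it attached to the path and has no params attribute.
print(split.path)                # /s;hello
print(hasattr(split, 'params'))  # False

# The remaining components are the same for both.
print(parsed.query == split.query)        # True
print(parsed.fragment == split.fragment)  # True
```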

  

Origin www.cnblogs.com/wcyMiracle/p/12454661.html