(Crawler) Python Crawler 01

Table of Contents:

First, an introduction to the basics

Second, get the page

First, an introduction to the basics

1. This post mainly relies on urllib, that is, URL (web address) + lib (library package). For more details, see the Python documentation (open IDLE, then Help > Python Docs, where you can search for it).

2. The general format of a URL (note: the part in [] may be omitted):

protocol://domain[:port]/path/

The terms are explained as follows:

Protocol: for example http, https, ftp, file, and so on.

Domain: the domain name or IP address of the server that stores the resource (with a port number appended where required, such as :8080), for example www.baidu.com (a domain name) or localhost (the local machine).

Path: the specific location of the resource on the server, a directory or file name, such as index.html.
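The URL format described above can be checked with urllib itself. The following is a minimal sketch using urllib.parse.urlsplit from the standard library; the helper name describe_url and the two sample URLs are assumptions made for illustration only.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts named in the general format above.

    describe_url is a hypothetical helper written for this illustration.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port,  # None when the URL omits the optional port
        "path": parts.path,
    }


# Sample URLs chosen for illustration
print(describe_url("http://localhost:8080/index.html"))
print(describe_url("https://www.baidu.com/"))
```

When the optional [:port] part is left out, the port comes back as None, and the protocol's default port is used for the actual connection.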

Second, get the page

# Import the dependency
import urllib.request

# Open the cnblogs (Blog Garden) sign-in address (i.e. get the page); the returned object is stored in response
response = urllib.request.urlopen("https://account.cnblogs.com/signin")
# Read the object just returned; the page body is stored in html_d as bytes
html_d = response.read()
# Decode the bytes into a string as UTF-8 (check which encoding the page actually uses; it is usually UTF-8)
html = html_d.decode("utf-8")
# Print the result
print(html)
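The comment about checking the page's encoding can be taken one step further. The sketch below is one possible variation, not the approach from the referenced video: it reads the charset from the response's Content-Type header when one is declared, falls back to UTF-8 otherwise, and returns None if the request fails. The helper name fetch_text and its timeout and default_charset parameters are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10, default_charset="utf-8"):
    """Return the decoded body of url, or None if the request fails.

    fetch_text is a hypothetical helper written for this illustration.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            raw = response.read()
            # Use the charset declared in the Content-Type header, if any;
            # otherwise assume the default (UTF-8)
            charset = response.headers.get_content_charset() or default_charset
    except urllib.error.URLError as err:
        # URLError also covers HTTPError (e.g. 404), which is a subclass
        print("request failed:", err.reason)
        return None
    # Replace undecodable bytes instead of raising UnicodeDecodeError
    return raw.decode(charset, errors="replace")
```

Calling fetch_text("https://account.cnblogs.com/signin") would fetch the same page as the code above. Because urlopen also accepts file URLs, the helper can be tried on a local HTML file without any network access.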

 

 

Reference for this post:

Zero-Based Introduction to Learning Python: https://www.bilibili.com/video/av4050443?p=54


Origin www.cnblogs.com/hwh000/p/12445199.html