Basic crawler code with urllib

The following code runs as is; uncomment individual lines to try each call:

import urllib.request
import urllib.parse

url = 'http://www.jianshu.com/'

# The first parameter is the URL to open; the second (data) is the request body, used for POST
response = urllib.request.urlopen(url=url)
# print(type(response))  # returns an HTTPResponse object
# print(response.read())  # reads the whole page, including newlines and tabs, as binary data (bytes)
# print(response.read().decode('utf-8'))  # output after decoding; str -> bytes: encode(), bytes -> str: decode()
# print(response.readline())  # read one line
# print(response.readlines())  # read all lines and return them as a list
# print(response.getheaders())  # returns the response headers as a list of tuples
# urllib.request.urlretrieve(url=url, filename='baidu.html')  # download to a local file with the given name; works for web pages, pictures, videos, etc.
# urllib.parse.urlencode  # builds URL query strings; covered later together with POST requests
print(response.getheaders())

# Encoding: Chinese characters in a request URL are not accepted as is, so they must be percent-encoded
# encoding
# string = urllib.parse.quote('http://www.baidu.com?username=狗蛋&password=123')
# print(string)
# decoding
# string = urllib.parse.unquote('http%3A//www.baidu.com%3Fusername%3D%E7%8B%97%E8%9B%8B%26password%3D123')
# print(string)
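
The encoding and decoding calls mentioned in the comments can be tried without any network access. The sketch below is only an illustration: the variable names are made up for this example, and the username value 狗蛋 is taken from the percent-encoded string in the listing.

```python
from urllib.parse import quote, unquote, urlencode

# Illustrative URL containing Chinese characters (same value as in the listing).
raw_url = 'http://www.baidu.com?username=狗蛋&password=123'

# quote() percent-encodes everything except letters, digits, '_.-~' and '/'.
encoded = quote(raw_url)

# unquote() reverses the percent-encoding.
decoded = unquote(encoded)

# urlencode() builds a query string from a dict and encodes only the values,
# which is usually what a request actually needs.
query = urlencode({'username': '狗蛋', 'password': '123'})
full_url = 'http://www.baidu.com/?' + query

print(encoded)
print(decoded)
print(full_url)
```

Note that quote() also escapes ':' and '?' when given a whole URL, so in practice it is more common to encode just the parameter values, as urlencode() does.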

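
The comment about encode() and decode() can also be checked in isolation. This is a minimal sketch, assuming the page is UTF-8; decode_body is a hypothetical helper written for this example, not part of urllib.

```python
def decode_body(raw: bytes, charset: str = 'utf-8') -> str:
    """Hypothetical helper: turn bytes (as returned by response.read()) into text.

    Bytes that cannot be decoded are replaced with U+FFFD instead of raising.
    """
    return raw.decode(charset, errors='replace')

text = '狗蛋'
raw = text.encode('utf-8')    # str -> bytes
roundtrip = decode_body(raw)  # bytes -> str

print(raw)
print(roundtrip)
```

With a real response the call would look like decode_body(response.read()); if the site uses another charset, pass it as the second argument.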
 

