Crawling the sidebar of the Today News site

Task: crawl today's news headlines from the left sidebar of the page and save them to a CSV file.

Code:

import io
import sys
import urllib.request

import pandas as pd
from pyquery import PyQuery as pq

# Change the default encoding of standard output
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')
url = 'https://mini.eastday.com/jrdftt/'

def get_info(url):
    # Download the page and decode the raw bytes as UTF-8
    res = urllib.request.urlopen(url)
    html_bytes = res.read()
    doc = pq(html_bytes.decode('utf-8'))
    # Select every <span> inside the sidebar's .channel-item elements
    items = doc(".channel-item span")
    t = [i.text for i in items]
    # Save the headline texts as one CSV column
    se = pd.Series(t)
    se.to_csv("列表.csv")

get_info(url)  # run the crawl
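
The script above needs network access plus pyquery and pandas, so the selector logic is hard to check in isolation. As a rough sketch, the same idea (collect the text of each span inside a .channel-item element, then write it out as CSV) can be tried offline using only the standard library. Note that this swaps pyquery for html.parser and pandas for the csv module; the names SpanCollector and headlines_to_csv and the sample markup are made up for illustration, and the real page's HTML may look different. It also assumes reasonably well-formed markup.

```python
import csv
import io
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SpanCollector(HTMLParser):
    """Collect the text of each <span> nested inside a .channel-item element."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # current nesting depth of open elements
        self.item_level = None  # depth at which the enclosing .channel-item opened
        self.span_level = None  # depth at which the current <span> opened
        self.headlines = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.depth += 1
        classes = (dict(attrs).get("class") or "").split()
        if self.item_level is None and "channel-item" in classes:
            self.item_level = self.depth
        elif self.item_level is not None and tag == "span" and self.span_level is None:
            self.span_level = self.depth
            self._buf = []

    def handle_data(self, data):
        if self.span_level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.span_level == self.depth:
            text = "".join(self._buf).strip()
            if text:
                self.headlines.append(text)
            self.span_level = None
        if self.item_level == self.depth:
            self.item_level = None
        self.depth -= 1


def headlines_to_csv(headlines, out):
    """Write an index column and a headline column to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["index", "headline"])
    for i, text in enumerate(headlines):
        writer.writerow([i, text])


# Hypothetical markup standing in for the real sidebar.
sample_html = """
<ul>
  <li class="channel-item"><a href="#"><span>Headline A</span></a></li>
  <li class="channel-item"><a href="#"><img src="a.png"><span>Headline B</span></a></li>
  <li class="other"><span>Not a headline</span></li>
</ul>
"""

parser = SpanCollector()
parser.feed(sample_html)

buf = io.StringIO()
headlines_to_csv(parser.headlines, buf)
csv_text = buf.getvalue()
print(csv_text)
```

To write a real file instead of an in-memory buffer, something like open("列表.csv", "w", newline="", encoding="utf-8-sig") could be passed as out; the utf-8-sig encoding adds a byte order mark, which usually helps Excel display Chinese text correctly.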

Result: (image not preserved)

Origin www.cnblogs.com/CJR-QYF/p/11919559.html