Python web crawler project in practice: crawling Maoyan (Cat's Eye) movie comments

When learning to write Python crawlers, the difficulty lies not in the crawler itself but in the wide variety of anti-crawling measures that sites deploy. The small case below gives a taste of what Python can do.

Source code for crawling the Maoyan comments on the film "River of Sorrow":

import json

import pymongo
import requests
from fake_useragent import UserAgent

# Database connection used to store the comments
client = pymongo.MongoClient(host='fill in your database IP')
db = client.The_cat_s_eye_essay
coll = db.eye_essay

# Object that generates a random User-Agent string
ua = UserAgent()


# Extract the comment fields we want from one page of results
def parse_json(payload):
    if not payload:
        return
    # 'cmts' holds the list of comments; default to an empty list if it is missing
    for item in payload.get('cmts') or []:
        data = {
            'ID': item.get('nickName'),
            'comment': item.get('content'),
            'score': item.get('score'),
            'city': item.get('cityName'),
            'comment_time': item.get('startTime'),
            'replies': item.get('reply'),
            'gender': item.get('gender'),
        }
        # Uncomment to store each comment in MongoDB
        # coll.insert_one(data)
        print(data)


def Crawl_JSON():
    headers = {
        'User-Agent': ua.random,
        'Host': 'm.maoyan.com',
        'Referer': 'http://m.maoyan.com/movie/1217236/comments?_v_=yes',
    }

    # Maoyan comment API.
    # The comments are loaded through AJAX requests whose offset parameter
    # grows by 15 per page: 0 for the first page, 15 for the second, 30 for
    # the third, and so on, so page i uses offset i * 15.
    # Inspecting the AJAX request parameters in the browser confirms this.
    page = 100
    for i in range(page):
        try:
            offset = i * 15
            startTime = '2018-10-11'
            comment_api = 'http://m.maoyan.com/mmdb/comments/movie/1217236.json?_v_=yes&offset={0}&startTime={1}%2021%3A09%3A31'.format(offset, startTime)
            # Send the GET request
            response_comment = requests.get(url=comment_api, headers=headers)
            json_comments = json.loads(response_comment.text)
            parse_json(json_comments)
        except Exception as e:
            print('Error:', e.args)


if __name__ == '__main__':
    Crawl_JSON()
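
The offset stepping described in the comments of the listing can be pulled out into a small helper that only builds the request URLs, which makes the pagination easy to check without touching the network. This is a minimal sketch, assuming the step of 15 comments per page noted above; the names build_comment_urls, BASE_URL and PAGE_SIZE are hypothetical and not part of the original code.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the listing above; movie_id is filled in per call.
BASE_URL = 'http://m.maoyan.com/mmdb/comments/movie/{movie_id}.json'
# Assumption from the listing: the offset advances by 15 per page.
PAGE_SIZE = 15


def build_comment_urls(movie_id, start_time, pages):
    """Return one comment-API URL per page, with offsets 0, 15, 30, ..."""
    urls = []
    for page in range(pages):
        query = urlencode(
            {'_v_': 'yes', 'offset': page * PAGE_SIZE, 'startTime': start_time},
            quote_via=quote,  # encode spaces as %20, as in the original URL
        )
        urls.append(BASE_URL.format(movie_id=movie_id) + '?' + query)
    return urls


if __name__ == '__main__':
    for url in build_comment_urls(1217236, '2018-10-11 21:09:31', 3):
        print(url)
```

Building the query with urlencode instead of string formatting also takes care of escaping the space and colons in the start time, so the timestamp can be passed in its readable form.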

 


Origin www.cnblogs.com/xiaoyiq/p/11441467.html