Day 05: Basic use of the selenium library

Using selenium to crawl product information from Jingdong (JD.com):
    Import the selenium library

    Wrap the work in try/except/finally so that exceptions are caught and the browser is always closed

    Send a request to open the Jingdong home page

    Find the search input box by its id

    Use send_keys to type the search term into that input

    Use send_keys again to press the Enter key and run the query

    Use find_elements_by_class_name to collect every product item

    Loop over the items and read each product's name, URL (via get_attribute('href')), price and review count, for example:

    find_element_by_css_selector('.p-name em').text

    Finally, append the results to jd.txt

    Close the driver
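
The storage step in the outline can be kept separate from the browser code, which makes it easy to check without opening a page. Below is a minimal sketch of that idea. The helper names format_record and append_record are hypothetical (they are not part of the original script), and the field labels and demo values are assumptions chosen for illustration.

```python
import os
import tempfile


def format_record(url, name, price, commit):
    """Render one product as a labelled text block followed by a blank line."""
    return (
        f"url: {url}\n"
        f"name: {name}\n"
        f"price: {price}\n"
        f"reviews: {commit}\n"
        "\n"
    )


def append_record(path, record):
    """Append one record to the output file, creating the file if needed."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record)


if __name__ == "__main__":
    # Demo with made-up values, written to a temporary file instead of jd.txt.
    demo_path = os.path.join(tempfile.mkdtemp(), "jd_demo.txt")
    record = format_record("https://example.com/1", "Sample book", "10.00", "5")
    append_record(demo_path, record)
    with open(demo_path, encoding="utf-8") as f:
        print(f.read())
```

Opening the file in append mode ('a') means each product is added to the end, so running the crawler twice produces duplicate entries unless jd.txt is cleared first.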

 

 
from selenium import webdriver
# Import Keys to send keyboard input
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Chrome()

# Guarded block: any exception raised here is caught below
try:
    # Implicit wait: give elements up to 10 seconds to load
    driver.implicitly_wait(10)

    # Send a request to the Jingdong home page
    driver.get('https://www.jd.com/')

    # Find the search input box by its id
    # (the find_element_by_* helpers belong to the Selenium 3 API and are
    # not available in recent Selenium 4 releases)
    input_tag = driver.find_element_by_id('key')

    # send_keys types a value into the current element
    input_tag.send_keys('Chinese dictionary')

    # Press the Enter key to run the search
    input_tag.send_keys(Keys.ENTER)

    time.sleep(3)

    '''
    Crawl Jingdong product information:
        Doll
            Title
            URL
            Price
            Review count
    '''
    # find_element_*  finds a single element
    # find_elements_* finds multiple elements
    # Find all the products in the result list
    good_list = driver.find_elements_by_class_name('gl-item')
    # print(good_list)

    # Loop over each product
    for good in good_list:
        # Find the product detail page URL with a CSS selector
        # URL
        good_url = good.find_element_by_css_selector('.p-img a').get_attribute('href')
        print(good_url)

        # Name
        good_name = good.find_element_by_css_selector('.p-name em').text
        print(good_name)

        # Price
        good_price = good.find_element_by_class_name('p-price').text
        print(good_price)

        # Review count
        good_commit = good.find_element_by_class_name('p-commit').text
        print(good_commit)

        str1 = f'''
        URL: {good_url}
        Name: {good_name}
        Price: {good_price}
        Reviews: {good_commit}
        \n
        '''
        # Append the product information to a text file
        with open('jd.txt', 'a', encoding='utf-8') as f:
            f.write(str1)

    time.sleep(10)

# Catch and print any exception
except Exception as e:
    print(e)

# Finally, close the browser whether or not an error occurred
finally:
    driver.close()
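
The per-item lookups in the loop can also be written with the generic find_element(by, value) call, which takes the locator strategy as its first argument. Below is a rough sketch of that variant, not the author's code. The helper name extract_item is hypothetical, and the two strategy strings are written out by hand (they match By.CSS_SELECTOR and By.CLASS_NAME in selenium.webdriver.common.by) so the block has no imports. The selectors are the ones used in the script above and assume the JD result page still uses those class names.

```python
# Locator strategy strings; these are the values of By.CSS_SELECTOR and
# By.CLASS_NAME in selenium.webdriver.common.by, spelled out to avoid imports.
CSS_SELECTOR = "css selector"
CLASS_NAME = "class name"


def extract_item(good):
    """Read URL, name, price and review count from one product element.

    `good` is expected to behave like a selenium WebElement, i.e. to offer
    find_element(by, value) returning objects with .text and get_attribute().
    """
    return {
        "url": good.find_element(CSS_SELECTOR, ".p-img a").get_attribute("href"),
        "name": good.find_element(CSS_SELECTOR, ".p-name em").text,
        "price": good.find_element(CLASS_NAME, "p-price").text,
        "commit": good.find_element(CLASS_NAME, "p-commit").text,
    }
```

Inside the loop this would be called as info = extract_item(good), and the resulting dictionary could then be printed or written to jd.txt as before.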

 


Origin www.cnblogs.com/cooperstar/p/11101280.html