Crawling JD (Jingdong) product information with the selenium library:
Import the selenium library
Wrap the code in try/except so that any exception raised during the crawl is caught
Send a request to the JD home page
Find the input box by its id
Use send_keys to pass a value to the current element
Use send_keys to press the Enter key and run the query
Use find_elements_by_class_name to collect each product
Loop over the products and get each one's name, URL (via the get_attribute() method), price, and number of reviews:
find_element_by_css_selector('.p-name em').text
Finally, store the results in the file jd.txt
Close the driver
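The per-item step in the list above can be sketched as a standalone helper. This is only a minimal sketch, not the original author's code: the function name `extract_item` and the dictionary keys are hypothetical, it assumes a Selenium 3 style element that exposes `find_element_by_css_selector` and `find_element_by_class_name`, and it assumes the selectors used in this post still match JD's result page.

```python
def extract_item(good):
    """Collect URL, name, price, and review count from one result element.

    `good` is assumed to be a Selenium 3 style WebElement for one
    product entry; the selectors are the ones used in this post and
    may no longer match the live page.
    """
    return {
        # The product link lives on the <a> inside the image container
        "url": good.find_element_by_css_selector(".p-img a").get_attribute("href"),
        # The product name is the <em> inside the name container
        "name": good.find_element_by_css_selector(".p-name em").text,
        "price": good.find_element_by_class_name("p-price").text,
        "commit": good.find_element_by_class_name("p-commit").text,
    }
```

Because the helper only relies on the element's lookup methods, it can be exercised with a small stand-in object, without starting a browser.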
# Note: the find_element_by_* helpers used below are the Selenium 3 API;
# Selenium 4.3+ removed them in favor of find_element(By.ID, ...) and friends.
from selenium import webdriver
# Keys provides keyboard keys such as ENTER
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Chrome()

# Guarded block: any exception raised here is caught below
try:
    # Implicit wait: wait up to 10 seconds for elements to load
    driver.implicitly_wait(10)

    # Send a request to the JD home page
    driver.get('https://www.jd.com/')

    # Find the input box by its id
    input_tag = driver.find_element_by_id('key')

    # send_keys passes a value to the current element
    input_tag.send_keys('Chinese dictionary')

    # Press the Enter key on the keyboard
    input_tag.send_keys(Keys.ENTER)

    time.sleep(3)

    '''
    JD product information to crawl:
        doll
            name
            url
            price
            reviews
    '''
    # find_element: find a single element
    # find_elements: find multiple elements
    # Find the list of all products
    good_list = driver.find_elements_by_class_name('gl-item')
    # print(good_list)

    # Loop over each product
    for good in good_list:
        # Find the product detail page URL with a CSS selector
        # url
        good_url = good.find_element_by_css_selector('.p-img a').get_attribute('href')
        print(good_url)

        # name
        good_name = good.find_element_by_css_selector('.p-name em').text
        print(good_name)

        # price
        good_price = good.find_element_by_class_name('p-price').text
        print(good_price)

        # number of reviews
        good_commit = good.find_element_by_class_name('p-commit').text
        print(good_commit)

        str1 = f'''
        url: {good_url}
        name: {good_name}
        price: {good_price}
        reviews: {good_commit}
        \n
        '''
        # Append the product information to a text file
        with open('jd.txt', 'a', encoding='utf-8') as f:
            f.write(str1)

    time.sleep(10)

# Catch exceptions
except Exception as e:
    print(e)

# Finally, close the browser driver whether or not an error occurred
finally:
    driver.close()
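The script above reopens jd.txt once per product. As an alternative, the writing step could be separated from the crawling loop so the file is opened only once. The following is a hedged sketch under stated assumptions: `format_record` and `save_records` are hypothetical helper names that do not appear in the original script, and the record layout only approximates the one used there.

```python
def format_record(url, name, price, commit):
    """Build the text block for one product (layout is an approximation)."""
    return (
        f"url: {url}\n"
        f"name: {name}\n"
        f"price: {price}\n"
        f"reviews: {commit}\n"
        "\n"
    )


def save_records(records, path="jd.txt"):
    """Append every record to `path`, opening the file only once.

    `records` is an iterable of dicts with the keys url, name, price,
    and commit. Returns the number of records written.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(format_record(**record))
            count += 1
    return count
```

Collecting the products into a list first and calling `save_records` after the loop keeps file handling in one place, at the cost of holding all records in memory until the crawl finishes.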