Summary of extracting data with Selenium (with mind map)



1. Common attributes and methods of the driver object

When using Selenium, once the driver object has been instantiated, it exposes the following commonly used attributes and methods:

  1. driver.page_source: the source code of the page as rendered by the browser in the current tab
  2. driver.current_url: the URL of the current tab
  3. driver.close(): close the current tab; if it is the only tab, the entire browser closes
  4. driver.quit(): close the browser
  5. driver.forward(): go forward one page
  6. driver.back(): go back one page
  7. driver.save_screenshot(img_name): save a screenshot of the page to the file img_name
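
As a rough illustration of the attributes and methods listed above, the sketch below gathers a few of them into one helper. The function name `summarize_page` and the default file name are made up for this example; it only assumes that `driver` exposes Selenium's `current_url`, `page_source` and `save_screenshot`.

```python
def summarize_page(driver, img_name="page.png"):
    """Collect basic facts about the current tab of a Selenium driver.

    `summarize_page` is a hypothetical helper, not part of Selenium.
    """
    saved = driver.save_screenshot(img_name)  # returns True on success
    return {
        "url": driver.current_url,
        "html_length": len(driver.page_source),
        "screenshot_saved": bool(saved),
    }


def demo():
    """Example usage; assumes Selenium and a Chrome driver are installed."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("http://www.itcast.cn/")
        print(summarize_page(driver))
    finally:
        driver.quit()  # close the whole browser, even if an error occurred
```

Calling `demo()` opens a real browser window; `driver.quit()` sits in a `finally` block so the browser is closed even when the page fails to load.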

Knowledge points: understand the common attributes and methods of the driver object

2. Methods of the driver object for locating tag elements and obtaining element objects

Selenium offers many ways to locate a tag and return the corresponding element object:

find_element_by_id 						(returns a single element)
find_element(s)_by_class_name 			(gets elements by class name)
find_element(s)_by_name 				(gets elements by the value of the tag's name attribute)
find_element(s)_by_xpath 				(gets elements matching an XPath expression)
find_element(s)_by_link_text 			(gets elements by the full link text)
find_element(s)_by_partial_link_text 	(gets elements by text contained in the link)
find_element(s)_by_tag_name 			(gets elements by tag name)
find_element(s)_by_css_selector 		(gets elements by CSS selector)
  • Notes:
    • The difference between find_element and find_elements:
      • With the trailing s, a list of all matches is returned; without it, only the first matching element object is returned
      • find_element throws an exception if nothing matches, while find_elements returns an empty list.
    • The difference between by_link_text and by_partial_link_text: the former matches the full link text, the latter matches links containing the given text
    • How to use the above functions
      • driver.find_element_by_id('id_str')
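
The difference between the two families can be sketched as follows. Note that the `find_element_by_*` helpers were removed in Selenium 4.3; the generic `find_element(by, value)` and `find_elements(by, value)` calls exist in both older and newer versions, so the sketch uses those. `first_or_none` is a hypothetical helper name, and the `by` argument is one of Selenium's `By` constants (plain strings such as `"id"` or `"css selector"`).

```python
def first_or_none(driver, by, value):
    """Return the first matching element, or None when nothing matches.

    find_elements returns an empty list on no match, so no exception
    handling is needed here (find_element would raise instead).
    """
    matches = driver.find_elements(by, value)
    return matches[0] if matches else None


def demo():
    """Example usage; assumes Selenium and a Chrome driver are installed."""
    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get("http://www.itcast.cn/")
        # Plural form: an empty list when nothing matches, so this prints None
        print(first_or_none(driver, By.ID, "no-such-id"))
        # Singular form: raises when nothing matches
        try:
            driver.find_element(By.ID, "no-such-id")
        except NoSuchElementException:
            print("find_element raised NoSuchElementException")
    finally:
        driver.quit()
```

The id `no-such-id` is only a placeholder that is assumed not to exist on the page.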

Knowledge point: master the driver object's methods for locating tag elements and obtaining element objects

3. Extracting text content and attribute values from an element object

find_element only returns the element, not its data. To interact with the element or read data from it, use the following methods:

  • Click an element: element.click()

    • Performs a click on the located element object
  • Enter data into an input box: element.send_keys(data)

    • Types data into the located element object
  • Get text: element.text

    • Reads the text content through the text property of the located element object
  • Get an attribute value: element.get_attribute("attribute name")

    • Reads an attribute's value by calling get_attribute on the located element object and passing in the attribute name

  • The code implementation is as follows:

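    from selenium import webdriver

    # Start a Chrome session (a matching Chrome driver must be available)
    driver = webdriver.Chrome()
    driver.get('http://www.itcast.cn/')

    # find_elements_* returns a list; read the text of the first <h2>
    # (the find_elements_by_* helpers require a Selenium version before 4.3)
    ret = driver.find_elements_by_tag_name('h2')
    print(ret[0].text)

    # Locate links by their full link text and read the href attribute
    ret = driver.find_elements_by_link_text('黑马程序员')
    print(ret[0].get_attribute('href'))

    driver.quit()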
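
The four operations described in this section (click, send_keys, text, get_attribute) can also be wrapped in small helpers. This is only a sketch: `collect_links` and `fill_and_submit` are hypothetical names, the CSS selectors are supplied by the caller, and the generic `find_element(by, value)` / `find_elements(by, value)` calls are used so that it does not depend on the older `find_element_by_*` helpers.

```python
def collect_links(driver, link_text):
    """Return (text, href) pairs for every link whose text contains link_text."""
    # "partial link text" is the string value of Selenium's By.PARTIAL_LINK_TEXT
    links = driver.find_elements("partial link text", link_text)
    return [(link.text, link.get_attribute("href")) for link in links]


def fill_and_submit(driver, box_selector, button_selector, data):
    """Type data into an input box, then click a button."""
    # "css selector" is the string value of Selenium's By.CSS_SELECTOR
    box = driver.find_element("css selector", box_selector)
    box.send_keys(data)
    driver.find_element("css selector", button_selector).click()
```

With a live driver, `collect_links(driver, '黑马')` would return one pair per matching link on the current page, and an empty list if none match.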
    


That's all. If this helped you, feel free to like and follow; your likes mean a lot to me.


Origin blog.csdn.net/qq_45176548/article/details/111637850