Introduction to Python Web Scraping (4): Selenium

selenium

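Introduction

Selenium simulates a real user operating a browser. The current WebDriver API does this through a browser-specific driver program; the older Selenium RC injected JavaScript into the page instead. When a script runs, the browser automatically clicks, types, opens pages, and checks results as the code dictates, just as a real user would, so the application is tested from the end user's point of view.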

Using Selenium with Python

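from selenium import webdriver

# Start a Firefox session (needs the Firefox driver listed below)
driver = webdriver.Firefox()

# Open Baidu, type a query into the search box, and click the search button
driver.get("http://www.baidu.com")
driver.find_element_by_id("kw").send_keys("selenium")
driver.find_element_by_id("su").click()
# Close the browser and end the session
driver.quit()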
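
The `find_element_by_*` helpers used above were removed in Selenium 4.3, so on a current installation the same search would likely be written with the generic `find_element(by, value)` call. The following is a minimal sketch of that form, not the original author's code. `search_baidu` and `main` are names made up for this example, and the locator strategy is passed as the plain string "id", which is the value of `By.ID` in `selenium.webdriver.common.by`.

```python
def search_baidu(driver, keyword):
    """Open Baidu, type `keyword` into the search box and click search.

    `driver` is any object with the WebDriver interface. The strategy
    string "id" is the value of selenium.webdriver.common.by.By.ID.
    """
    driver.get("http://www.baidu.com")
    driver.find_element("id", "kw").send_keys(keyword)
    driver.find_element("id", "su").click()


def main():
    # Imported here so that search_baidu can also be tried with a
    # stand-in driver on a machine that has no browser installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        search_baidu(driver, "selenium")
    finally:
        # Always release the browser, even if a lookup fails.
        driver.quit()
```

Calling `main()` requires Firefox and its driver. Wrapping the work in `try`/`finally` is a design choice of this sketch: it keeps a failed lookup from leaving an orphaned browser window behind.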

Selenium can drive all the major browsers (Chrome, Firefox, IE, and so on), but each one needs its own driver package, which has to be downloaded separately:
chrome: http://chromedriver.storage.googleapis.com/index.html
firefox: https://github.com/mozilla/geckodriver/releases/
ie: http://selenium-release.storage.googleapis.com/index.html

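Basic usage of webdriver (the browser object)

  1. Opening and closing tabs

     # Open a URL in the current tab
     def get(self, url)
    
     # Close the current tab (window)
     def close(self)
    
     # Quit the browser: close every window and end the session
     def quit(self)
  2. Setting the browser window size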

     def set_window_size(self, width, height, windowHandle='current'):
         """
         Sets the width and height of the current window. (window.resizeTo)
    
         :Args:
          - width: the width in pixels to set the window to
          - height: the height in pixels to set the window to
    
         :Usage:
         driver.set_window_size(800,600)
         """
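  3. Locating elements

     # Note: the find_element_by_* helpers below were removed in Selenium 4.3;
     # on newer versions use the find_element(By.<STRATEGY>, value) form shown last.

     # Locate by id
     driver.find_element_by_id("kw")
    
     # Locate by name
     driver.find_element_by_name("wd")
    
     # Locate by tag name
     driver.find_element_by_tag_name("input")
    
     # Locate by class name
     driver.find_element_by_class_name("s_ipt")
    
     # Locate by CSS selector
     driver.find_element_by_css_selector("#kw")
    
     # Locate by XPath
     driver.find_element_by_xpath("//input[@id='kw']")
    
     # Locate by link text
     driver.find_element_by_link_text("贴 吧")
    
     # Locate by partial link text
     driver.find_element_by_partial_link_text("贴")
    
     # Locate with an explicit By strategy
     # (needs: from selenium.webdriver.common.by import By)
     driver.find_element(By.ID, 'foo')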
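  4. Locating a group of elements

     # Same methods as above with an "s" added (find_elements...). The singular
     # form raises NoSuchElementException when nothing matches; the plural form
     # returns an empty list.
     # Locate with an explicit By strategy
     driver.find_elements(By.ID, 'foo')
  5. Switching between frames and windows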

     def switch_to(self):
         """
         :Returns:
             - SwitchTo: an object containing all options to switch focus into
    
         :Usage:
             element = driver.switch_to.active_element
             alert = driver.switch_to.alert
             driver.switch_to.default_content()
             driver.switch_to.frame('frame_name')
             driver.switch_to.frame(1)
             driver.switch_to.frame(driver.find_elements_by_tag_name("iframe")[0])
             driver.switch_to.parent_frame()
             driver.switch_to.window('main')
         """
  6. Executing JavaScript

     def execute_script(self, script, *args):
         """
         Synchronously Executes JavaScript in the current window/frame.
    
         :Args:
          - script: The JavaScript to execute.
          - \*args: Any applicable arguments for your JavaScript.
    
         :Usage:
             driver.execute_script('return document.title;')
         """
    
     def execute_async_script(self, script, *args):
         """
         Asynchronously Executes JavaScript in the current window/frame.
    
         :Args:
          - script: The JavaScript to execute.
          - \*args: Any applicable arguments for your JavaScript.
    
         :Usage:
             script = "var callback = arguments[arguments.length - 1]; " \
                      "window.setTimeout(function(){ callback('timeout') }, 3000);"
             driver.execute_async_script(script)
         """

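Items 5 and 6 above are often used together: switch into a frame, run a script there, then switch back. The sketch below shows one possible way to package that, assuming only the `switch_to.frame`, `switch_to.default_content` and `execute_script` calls quoted above. The helper names `inside_frame`, `title_inside_frame` and `set_value` are hypothetical and not part of Selenium.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a `with` block."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Return focus to the top-level document, even after an error.
        # For nested frames, switch_to.parent_frame() may fit better.
        driver.switch_to.default_content()


def title_inside_frame(driver, frame_reference):
    """Return document.title as seen from inside the given frame."""
    with inside_frame(driver, frame_reference) as d:
        return d.execute_script("return document.title;")


def set_value(driver, element, text):
    """Set an input's value from JavaScript.

    Extra arguments passed to execute_script are visible inside the
    script as arguments[0], arguments[1], and so on.
    """
    driver.execute_script("arguments[0].value = arguments[1];", element, text)
```

With a real session this might be called as `title_inside_frame(driver, 'frame_name')`. The context manager is a design choice of this sketch: it makes it harder to forget the switch back to the main document.
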
Basic usage of webelement (the element object)

  1. Clicking

     driver.find_element_by_id("su").click()
     driver.find_element_by_id("su").submit()
  2. Entering text

     driver.find_element_by_id("kw").send_keys("xxx")
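  3. Getting attributes and text

     # text is a property, so it is read without parentheses
     driver.find_element_by_id("kw").text
     # get_attribute and get_property take the attribute or property name;
     # "value" is only an illustrative choice here
     driver.find_element_by_id("kw").get_attribute("value")
     driver.find_element_by_id("kw").get_property("value")
  4. Nested lookup

     # As with webdriver, the current element can act as the parent when searching for child elements
     parent = driver.find_element(By.ID, 'parent')
     parent.find_element(By.ID, 'child')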
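
Reading text and attributes combines naturally with nested lookup. The following is a rough sketch under stated assumptions: `describe_children` is a hypothetical helper, and the default attribute name "href" is chosen only for illustration. It uses `find_elements` on the parent, so a parent without matching children yields an empty list rather than an exception.

```python
def describe_children(parent, by, value, attribute="href"):
    """Collect the visible text and one attribute of each child element
    found under `parent`."""
    rows = []
    for child in parent.find_elements(by, value):
        rows.append({
            "text": child.text,  # property, not a method call
            attribute: child.get_attribute(attribute),  # name is required
        })
    return rows
```

With a real session, and `By` imported from `selenium.webdriver.common.by`, a call might look like `describe_children(driver.find_element(By.ID, 'parent'), By.TAG_NAME, 'a')`.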

Reposted from www.cnblogs.com/wuweishuo/p/10606139.html