Selenium is the most popular automated testing tool for web applications and can be used for automated testing or as a browser-based crawler. Details are available on the official Selenium website. Compared with QTP, another web automated testing tool, it has the following advantages:

Free, open source, and lightweight; each language only needs a small dependency package
Supports multiple operating systems, including Windows, Mac, and Linux
Supports multiple browsers, including Chrome, Firefox, IE, Safari, Opera, etc.
Supports multiple languages, including Java, Python, C#, and other mainstream languages
Supports distributed test case execution

Python + Selenium environment installation

First, install Python (3.7+ recommended), and then install the dependency package directly with `pip install selenium`.

In addition, you need to download the WebDriver driver for your browser. Note that the downloaded driver version must match the browser version.

Firefox browser driver: geckodriver
Chrome browser driver: chromedriver
IE browser driver: IEDriverServer
Edge browser driver: MicrosoftWebDriver
Opera browser driver: operadriver

After downloading, you can add the driver's directory to the PATH environment variable, so that you don't need to specify the driver path manually when using it.
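
As a quick sanity check before starting a browser, the standard library can confirm whether a driver executable is reachable through PATH. This is only a minimal sketch: the helper name `find_driver` is hypothetical, and the driver names are the ones listed above.

```python
import shutil
from typing import Optional


def find_driver(name: str) -> Optional[str]:
    """Return the full path of a driver executable found on PATH, or None."""
    # shutil.which searches the directories listed in the PATH environment variable
    return shutil.which(name)


if __name__ == "__main__":
    for driver_name in ("chromedriver", "geckodriver"):
        location = find_driver(driver_name)
        print(driver_name, "->", location or "not found on PATH")
```

If a driver is reported as not found, either add its folder to PATH or pass the path explicitly when starting the browser.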

Start the browser with selenium

You can use the following code to start a Chrome browser in Python, and then control the browser's behavior or read data.

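from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Start Chrome; chromedriver must already be on the PATH environment variable
# (placing the driver in the same folder as the current script also works)
driver = webdriver.Chrome()

# Specify the driver path manually (Selenium 4 style; Selenium 3 took the path as the first positional argument)
driver = webdriver.Chrome(service=Service(r'D:/uusama/tools/chromedriver.exe'))

driver = webdriver.Ie()        # Internet Explorer
driver = webdriver.Edge()      # Edge
driver = webdriver.Opera()     # Opera (removed in newer Selenium releases)
driver = webdriver.PhantomJS()   # PhantomJS (removed in Selenium 4)

driver.get('http://uusama.com')  # Open the page at the given URL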

You can also set startup parameters when launching the browser. For example, the following code adds a proxy at startup and ignores HTTPS certificate verification.

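from selenium import webdriver

# Create the Chrome startup options object
options = webdriver.ChromeOptions()

options.add_argument("--proxy-server=127.0.0.1:16666")  # Set the proxy
options.add_argument("--ignore-certificate-errors")  # Ignore HTTPS certificate verification
options.add_experimental_option("excludeSwitches", ["enable-logging"])  # Exclude the enable-logging switch to suppress console log output

# Set the default directory where the browser saves downloaded files
# (get_download_dir() is the author's own helper and is not shown here)
prefs = {"download.default_directory": get_download_dir()}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=options)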

Some very useful startup options, where `options = webdriver.ChromeOptions()`:

options.add_argument("--proxy-server=127.0.0.1:16666"): Set up a proxy, which can be combined with mitmproxy for packet capture, etc.
options.add_experimental_option('excludeSwitches', ['enable-automation']): Bypass Selenium detection
options.add_argument("--ignore-certificate-errors"): Ignore HTTPS certificate verification
options.add_experimental_option("prefs", {"profile.managed_default_content_settings.images": 2}): Do not request images, which speeds up page loading
options.add_argument('--headless'): Run a headless browser
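
When the set of switches varies between runs, it can help to build the argument list separately and apply it afterwards. The following is a small sketch under that assumption; `build_chrome_arguments` is a hypothetical helper, not part of Selenium, and it only assembles the switch strings shown above.

```python
from typing import List, Optional


def build_chrome_arguments(proxy: Optional[str] = None,
                           headless: bool = False,
                           ignore_cert_errors: bool = False) -> List[str]:
    """Collect the Chrome command-line switches for the selected features."""
    arguments = []
    if proxy:
        arguments.append(f"--proxy-server={proxy}")
    if ignore_cert_errors:
        arguments.append("--ignore-certificate-errors")
    if headless:
        arguments.append("--headless")
    return arguments


print(build_chrome_arguments(proxy="127.0.0.1:16666", headless=True))
# ['--proxy-server=127.0.0.1:16666', '--headless']
```

Each returned string can then be passed to `options.add_argument(...)` in a loop before creating the driver.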

selenium page load wait and detection

After Selenium opens a page, you cannot operate on it immediately; you have to wait until the page elements you need have finished loading. This means detecting and waiting for the page elements to load.

Wait using `time.sleep()`

The easiest way is to call `time.sleep()` after opening the page and wait for a certain period of time. This method can only wait for a fixed time, so if the page finishes loading earlier, the remaining time is spent blocking for nothing.

from time import sleep
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('http://uusama.com')
sleep(10)  # Block for a fixed 10 seconds
print('load finish')

Set the maximum wait time using `implicitly_wait`

In addition, you can set a maximum waiting time with `implicitly_wait`. Strictly speaking, this timeout applies to element lookups: `find_element` keeps polling until the element appears, and the next step executes as soon as it is found; if it is still missing when the time runs out, a `NoSuchElementException` is raised. Page loading itself is waited on by `driver.get()`, which by default blocks until all resources are loaded, that is, until the loading spinner in the browser tab stops. It is therefore possible that the page elements you need are already present while resources such as JS or images are still loading, and you still have to wait.

Also note that `implicitly_wait` only needs to be set once; it stays in effect for the entire life cycle of the `driver`.

Examples are as follows:

from selenium import webdriver

driver = webdriver.Chrome()
driver.implicitly_wait(30)   # Wait at most 30 seconds
driver.get('http://uusama.com')
print(driver.current_url)

driver.get('http://baidu.com')
print(driver.current_url)

Set wait conditions using `WebDriverWait`

`WebDriverWait` (`selenium.webdriver.support.wait.WebDriverWait`) lets you set the waiting time more precisely and flexibly. Within the configured time, it checks at regular intervals whether a condition is met. If the condition is met, the next step is performed; if it is still not met when the time runs out, a `TimeoutException` is thrown. The constructor is declared as follows:

WebDriverWait(driver, timeout, poll_frequency=0.5, ignored_exceptions=None)

The meanings of the parameters are as follows:

driver: browser driver
timeout: maximum wait time, in seconds
poll_frequency: detection interval (step size), 0.5 seconds by default
ignored_exceptions: exceptions to ignore; if one of them is raised during a call to until() or until_not(), waiting continues instead of aborting

`WebDriverWait()` is generally used with the `until()` or `until_not()` method: `until()` blocks until the condition returns a truthy value, and `until_not()` blocks until it returns a falsy value. Note that the argument to both methods must be a callable object that accepts the driver, so pass the function itself rather than the result of calling it. You can use the conditions in the `expected_conditions` module or ones you write yourself.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("http://baidu.com")

# Wait until the element with id `s_btn_wr` has been added to the DOM tree; this does not mean it is visible. Returns the WebElement once located
element = WebDriverWait(driver, 5, 0.5).until(EC.presence_of_element_located((By.ID, "s_btn_wr")))

# When both implicitly_wait and WebDriverWait are set, the longer of the two wait times applies
driver.implicitly_wait(5)

# Wait until an element has been added to the DOM and is visible; visible means it is displayed and its width and height are both greater than 0
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'su')))

# Wait until an already located element is visible; returns the element once it is
WebDriverWait(driver, 10).until(EC.visibility_of(driver.find_element(by=By.ID, value='kw')))
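
To make the polling behaviour concrete, here is a simplified model of what a wait loop of this kind does. It is a sketch for illustration only, not Selenium's actual implementation: the names `wait_until` and `WaitTimeout` are hypothetical, the condition takes no arguments, and there is no equivalent of `ignored_exceptions`.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout, poll_frequency=0.5):
    """Call condition() every poll_frequency seconds until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand back whatever the condition returned
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_frequency)


attempts = []


def ready_on_third_call():
    attempts.append(1)
    return "ready" if len(attempts) >= 3 else False


print(wait_until(ready_on_third_call, timeout=2, poll_frequency=0.01))  # ready
```

The key point is that the wait returns as soon as the condition succeeds, so a generous timeout costs nothing when the page is fast.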

Some commonly used conditions in `expected_conditions` are listed below:

title_is: Determine whether the title of the current page is exactly as expected
title_contains: Determine whether the title of the current page contains the expected string
presence_of_element_located: Determine whether an element has been added to the DOM tree; this does not mean the element is visible
visibility_of_element_located: Determine whether an element is visible (the element is not hidden, and its width and height are both non-zero)
visibility_of: Does the same as the method above, except that the method above takes a locator while this one takes an already located element
presence_of_all_elements_located: Determine whether at least one matching element exists in the DOM tree. For example, if there are n elements on the page whose class is 'column-md-3', the condition is satisfied as long as at least one exists, and the list of matching elements is returned
text_to_be_present_in_element: Determine whether the text in an element contains the expected string
text_to_be_present_in_element_value: Determine whether the value attribute in an element contains the expected string
frame_to_be_available_and_switch_to_it: Determine whether the frame can be switched in, if yes, return True and switch in, otherwise return False
invisibility_of_element_located: Determine whether an element does not exist in the dom tree or is invisible
element_to_be_clickable: Determine whether an element is visible and enabled, so it is called clickable
staleness_of: Wait for an element to be removed from the dom tree. Note that this method also returns True or False
element_to_be_selected: Determine whether an element is selected, generally used in drop-down lists
element_selection_state_to_be: Determine whether the selected state of an element meets expectations
element_located_selection_state_to_be: The function is the same as the above method, except that the above method passes in the positioned element, and this method passes in the locator

Check whether `document` has finished loading

You can also use `driver.execute_script('return document.readyState;') == 'complete'` to detect whether the `document` has finished loading.

Note that the `document` being fully loaded does not cover DOM content rendered dynamically by asynchronous AJAX requests; for that you need `WebDriverWait` to detect whether a given element has been rendered.
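
The readyState check can be wrapped as a custom wait condition, since a condition is just a callable that receives the driver. The sketch below assumes that shape; `document_ready` is a hypothetical name, and `FakeDriver` is a stand-in used only so the example runs without a browser.

```python
def document_ready(driver):
    """Custom wait condition: True once document.readyState is 'complete'."""
    return driver.execute_script("return document.readyState;") == "complete"


class FakeDriver:
    """Stand-in for a real WebDriver; replays a fixed sequence of ready states."""

    def __init__(self, states):
        self._states = iter(states)

    def execute_script(self, script):
        return next(self._states)


fake = FakeDriver(["loading", "interactive", "complete"])
results = [document_ready(fake) for _ in range(3)]
print(results)  # [False, False, True]
```

With a real driver, the same function can be passed directly, for example `WebDriverWait(driver, 10).until(document_ready)`.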

Selenium element positioning and reading

find element

Selenium provides a series of APIs for accessing elements in the page. These APIs return a `WebElement` object or a list of them, for example:

find_element_by_id(id): Find the first element matching the id
find_element_by_class_name(): Find the first element matching the class
find_elements_by_xpath(): Find all elements matching the XPath
find_elements_by_css_selector(): Find all elements matching the CSS selector

If you look at the source code of the `WebDriver` class, the core of these helpers is a call to two basic functions. The `find_element_by_*` helpers were deprecated in Selenium 4 and removed in Selenium 4.3, so on current versions call the two basic functions directly:

find_element(self, by=By.ID, value=None): Find the first element of the matching strategy
find_elements(self, by=By.ID, value=None): Find all elements matching the strategy

The `by` parameter can be `ID`, `CSS_SELECTOR`, `CLASS_NAME`, `XPATH`, etc. Here are a few simple examples:

Find the first element containing the text 登录 ("log in") via XPath: find_element(By.XPATH, "//*[contains(text(),'登录')]")
Find all elements with the class name refresh: find_elements(By.CLASS_NAME, 'refresh')
Find the second row of a table: find_element(By.CSS_SELECTOR, 'table tbody > tr:nth-child(2)')

dom element interaction

For the `WebElement` objects returned by the element searches above, commonly used APIs are:

element.text: Returns the text content of the element (including the content of all descendant nodes); note that if the element has display: none, an empty string is returned
element.screenshot_as_png: Screenshot of the element
element.send_keys("input"): Type the string input into an input box element
element.get_attribute('data-v'): Get the value of the attribute named data-v; besides custom node attributes, you can also get properties such as textContent
element.is_displayed(): Determine whether the element is visible to the user
element.clear(): Clear the element's text
element.click(): Click the element; if the element is not clickable, an ElementNotInteractableException is thrown
element.submit(): Simulate form submission

Find element failure handling

If the specified element cannot be found, a `NoSuchElementException` is thrown. Note that elements with `display: none` can still be found; any element present in the DOM can be obtained.

In practice, pay attention to elements created dynamically by JS code, which may need to be polled for or monitored.

A method that checks for the existence of a specified element is as follows:

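from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By


def check_element_exists(xpath):
    # `driver` is an existing WebDriver instance
    try:
        driver.find_element(By.XPATH, xpath)
    except NoSuchElementException:
        return False
    return True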
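
An alternative that avoids exception handling relies on `find_elements`, which returns an empty list when nothing matches. The sketch below assumes that behaviour; `element_exists` is a hypothetical helper, `FakeDriver` is a stand-in so the example runs without a browser, and the string "xpath" is the value of `By.XPATH`.

```python
def element_exists(driver, by, value):
    """Return True if at least one element matches the locator."""
    # find_elements returns an empty list (it does not raise) when nothing matches
    return len(driver.find_elements(by, value)) > 0


class FakeDriver:
    """Minimal stand-in for a WebDriver that knows a fixed set of XPaths."""

    def __init__(self, known_xpaths):
        self._known = set(known_xpaths)

    def find_elements(self, by, value):
        return ["<element>"] if by == "xpath" and value in self._known else []


fake = FakeDriver({"//div[@id='uusama']"})
print(element_exists(fake, "xpath", "//div[@id='uusama']"))   # True
print(element_exists(fake, "xpath", "//div[@id='missing']"))  # False
```

Taking the driver as a parameter also makes the helper easy to reuse across several browser sessions.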

selenium interactive control

`ActionChains` action chain

WebDriver simulates user operations through `ActionChains` objects. An `ActionChains` object represents a queue of actions: every operation is added to the queue in order but is not executed until the `perform()` method is called. Its commonly used methods are as follows:

click(on_element=None): click the left mouse button
click_and_hold(on_element=None): Click the left mouse button without releasing it
context_click(on_element=None): right click
double_click(on_element=None): Double click the left mouse button
send_keys(*keys_to_send): Send a key to the currently focused element
send_keys_to_element(element, *keys_to_send): send a key to the specified element
key_down(value, element=None): Press a key on a keyboard
key_up(value, element=None): release a key
drag_and_drop(source, target): Drag to an element and release
drag_and_drop_by_offset(source, xoffset, yoffset): Drag to a certain coordinate and release
move_by_offset(xoffset, yoffset): The mouse moves from the current position to a certain coordinate
move_to_element(to_element): Move the mouse to an element
move_to_element_with_offset(to_element, xoffset, yoffset): Move the mouse to a given offset from an element (measured from its upper-left corner)
perform(): Execute all actions in the chain
release(on_element=None): Release the left mouse button at an element position

Simulate mouse events

The following code simulates mouse movement, click, drag and other operations. Note that you need to wait for a certain period of time during the operation, otherwise the page will not have time to render.

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
# Import the ActionChains class
from selenium.webdriver.common.action_chains import ActionChains

driver = webdriver.Chrome()
driver.get("https://www.baidu.cn")
action_chains = ActionChains(driver)

target = driver.find_element(By.LINK_TEXT, "搜索")
# Move the mouse to the given element, then click it
action_chains.move_to_element(target).click(target).perform()
time.sleep(2)

# You can also call the element's own click method directly
target.click()
time.sleep(2)

# Move the mouse by an offset of (10, 50) from its current position
action_chains.move_by_offset(10, 50).perform()
time.sleep(2)

# Move the mouse to an offset of (10, 50) from the element target
action_chains.move_to_element_with_offset(target, 10, 50).perform()
time.sleep(2)

# Drag and drop: drag one element onto another
dragger = driver.find_element(By.ID, 'dragger')
action_chains.drag_and_drop(dragger, target).perform()
time.sleep(2)

# Dragging can also be implemented as click-and-hold -> move -> release
action_chains.click_and_hold(dragger).release(target).perform()
time.sleep(2)
action_chains.click_and_hold(dragger).move_to_element(target).release().perform()

Simulate keyboard input events

Keyboard events are simulated with `send_keys`. Commonly used ones are:

send_keys(Keys.BACK_SPACE): Delete key (BackSpace)
send_keys(Keys.SPACE): Space bar (Space)
send_keys(Keys.TAB): tab key (Tab)
send_keys(Keys.ESCAPE): Escape key (Esc)
send_keys(Keys.ENTER): Enter key (Enter)
send_keys(Keys.F1): Keyboard F1
send_keys(Keys.CONTROL,'a'): Select all (Ctrl+A)
send_keys(Keys.CONTROL,'c'): Copy (Ctrl+C)
send_keys(Keys.CONTROL,'x'): Cut (Ctrl+X)
send_keys(Keys.CONTROL,'v'): Paste (Ctrl+V)

Example: Locate the input box and enter content

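from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Type content into the input box
driver.find_element(By.ID, "kw").send_keys("uusamaa")

# Simulate Backspace to delete the extra trailing character a
driver.find_element(By.ID, "kw").send_keys(Keys.BACK_SPACE)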

Alert Box Handling

Used to handle the alert box popped up by a JavaScript `alert` call.

driver.switch_to.alert: switch to the alert box (the older driver.switch_to_alert() is no longer available in Selenium 4)
text: the text shown in the alert/confirm/prompt; for example, the JS call alert('failed') gives the string failed
accept(): accept the existing alert box
dismiss(): dismiss the existing alert box
send_keys(keysToSend): send text to the alert box
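
Put together, a typical sequence is to switch to the alert, read its text, and accept it. The following sketch assumes the `driver.switch_to.alert` form; `read_and_accept_alert` is a hypothetical helper, and the `Fake*` classes are stand-ins so the example runs without a browser.

```python
def read_and_accept_alert(driver):
    """Switch to the open alert, capture its text, accept it, and return the text."""
    alert = driver.switch_to.alert
    message = alert.text
    alert.accept()
    return message


class FakeAlert:
    def __init__(self, text):
        self.text = text
        self.accepted = False

    def accept(self):
        self.accepted = True


class FakeSwitchTo:
    def __init__(self, alert):
        self.alert = alert


class FakeDriver:
    def __init__(self, alert):
        self.switch_to = FakeSwitchTo(alert)


fake_alert = FakeAlert("failed")
print(read_and_accept_alert(FakeDriver(fake_alert)))  # failed
print(fake_alert.accepted)  # True
```

With a real browser, it is usually worth waiting for the alert first, for example with the `alert_is_present` condition from `expected_conditions`.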

selenium browser control

Basic common api

Some very useful browser control APIs are listed below:

driver.current_url: Get the URL of the currently active window
driver.switch_to.window("windowName"): Switch to the specified tab window
driver.switch_to.frame("frameName"): Switch to the iframe with the specified name
driver.switch_to.default_content(): Switch back to the default (top-level) content
driver.maximize_window(): Maximize the browser
driver.set_window_size(480, 800): Set the browser window to 480 wide and 800 high
driver.forward(), driver.back(): Browser forward and back
driver.refresh(): Refresh the page
driver.close(): Close the current tab
driver.quit(): Close the entire browser
driver.save_screenshot('screen.png'): Save a screenshot of the page
driver.execute_script('return document.readyState;'): Execute a JS script

selenium reads and loads cookies

Using `get_cookies` and `add_cookie`, you can cache cookies locally and load them again at startup, so that the login state is preserved. The implementation is as follows:

import json
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://www.baidu.cn")

# Read all cookies and save them to a file
cookies = driver.get_cookies()
cookie_save_path = 'cookie.json'
with open(cookie_save_path, 'w', encoding='utf-8') as file_handle:
    json.dump(cookies, file_handle, ensure_ascii=False, indent=4)

# Read the cookies back from the file and load them into the browser
with open(cookie_save_path, 'r', encoding='utf-8') as file_handle:
    cookies = json.load(file_handle)
    for cookie in cookies:
        driver.add_cookie(cookie)

selenium opens a new tab window

By default, `driver.get(url)` opens the given link in the first tab window, and clicking a link in the page that targets `_blank` also opens a new tab window.

You can also use the following method to manually open a tab window for a given page. Note that after opening or closing a window, you need to call `switch_to.window` manually to switch the currently active tab window; otherwise a `NoSuchWindowException` will be thrown.

import time
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://www.baidu.cn")

new_tab_url = 'http://uusama.com'
driver.execute_script(f'window.open("{new_tab_url}", "_blank");')
time.sleep(1)

# Note: you must call switch_to.window to switch windows manually, otherwise the tab will not be found
# Focus the newly opened tab, then close it
driver.switch_to.window(driver.window_handles[1])
time.sleep(2)
driver.close()   # Close the current window

# Switch back to the original tab manually
driver.switch_to.window(driver.window_handles[0])
time.sleep(1)

In addition to using `execute_script`, you can also create a new tab window by simulating the keyboard shortcut for opening a new tab:

driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.CONTROL + 't')
ActionChains(driver).key_down(Keys.CONTROL).send_keys('t').key_up(Keys.CONTROL).perform()

Notes on some Selenium problems

Get the text content of hidden elements

If an element is hidden, that is, it has `display: none`, `find_element` can still find it, but the `element.text` attribute cannot return its text content: its value is an empty string. In that case the following approach can be used:

element = driver.find_element(By.ID, 'uusama')
driver.execute_script("return arguments[0].textContent", element)
driver.execute_script("return arguments[0].innerHTML", element)

# Correspondingly, you can also make the hidden element visible
driver.execute_script("arguments[0].style.display = 'block';", element)

Handling browser crash `WebDriverException` exceptions

For example, if a Chrome page runs for a long time, an Out Of Memory error can occur. When that happens, WebDriver throws a `WebDriverException`, and basically every API call will throw this exception, so it needs to be caught and handled specially.

My approach is to record some basic information about the page, such as the URL and cookies, and write it to a file periodically. If the exception is detected, I restart the browser and reload the URL, cookies, and other data.
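
The restart-on-crash idea can be sketched as a small retry wrapper. This is an illustration under stated assumptions, not the author's actual code: `run_with_restart`, `make_driver`, and `task` are hypothetical names, and the exception type is passed in as a parameter so the sketch runs without Selenium installed.

```python
def run_with_restart(make_driver, task, crash_exception, max_restarts=3):
    """Run task(driver); on a crash, discard the driver, create a new one and retry."""
    driver = make_driver()
    for attempt in range(max_restarts + 1):
        try:
            return task(driver)
        except crash_exception:
            if attempt == max_restarts:
                raise  # give up after the last allowed restart
            try:
                driver.quit()  # the crashed browser may already be gone
            except Exception:
                pass
            driver = make_driver()


class Crash(Exception):
    """Stand-in for WebDriverException in this demo."""


class FakeDriver:
    created = 0

    def __init__(self):
        FakeDriver.created += 1
        self.number = FakeDriver.created

    def quit(self):
        pass


def flaky_task(driver):
    if driver.number < 3:
        raise Crash()  # the first two browsers "crash"
    return driver.number


print(run_with_restart(FakeDriver, flaky_task, Crash))  # 3
```

With a real browser, the call might look like `run_with_restart(webdriver.Chrome, task, WebDriverException)`, where `task` first restores the saved URL and cookies and then continues the work.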

Selenium crawls page request data

Online you can find approaches that obtain page requests via `driver.requests` or by parsing logs, but I don't think they work very well. In the end I used `mitmproxy` as a proxy to capture packets and configured Selenium to use that proxy at startup.

`proxy.py` wraps custom proxy request handling on top of `mitmproxy`. The code is as follows:

import os
import gzip
from concurrent.futures import ThreadPoolExecutor

from mitmproxy.options import Options
# Note: ProxyConfig and ProxyServer exist only in older mitmproxy releases (before version 7)
from mitmproxy.proxy.config import ProxyConfig
from mitmproxy.proxy.server import ProxyServer
from mitmproxy.tools.dump import DumpMaster
from mitmproxy.http import HTTPFlow
from mitmproxy.websocket import WebSocketFlow


class ProxyMaster(DumpMaster):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def run(self, func=None):
        try:
            DumpMaster.run(self, func)
        except KeyboardInterrupt:
            self.shutdown()


def process(url: str, request_body: str, response_content: str):
    # Handle a captured request; data can be stored and parsed here
    pass


class Addon(object):
    # The original listing uses Addon.EXECUTOR without defining it;
    # a thread pool is assumed here so processing runs off the proxy thread
    EXECUTOR = ThreadPoolExecutor(max_workers=4)

    def websocket_message(self, flow: WebSocketFlow):
        # Listen for websocket messages
        pass

    def response(self, flow: HTTPFlow):
        # Avoid keeping flows around forever, which makes memory usage soar
        # flow.request.headers["Connection"] = "close"
        # Listen for HTTP responses and get the request body and response content
        url = flow.request.url
        request_body = flow.request
        response_content = flow.response

        # If the response content is gzip-compressed, decompress it
        if response_content.data.content.startswith(b'\x1f\x8b\x08'):
            response_content = gzip.decompress(response_content.data.content).decode('utf-8')
        Addon.EXECUTOR.submit(process, url, request_body, response_content)


def run_proxy_server():
    options = Options(listen_host='0.0.0.0', listen_port=16666)
    config = ProxyConfig(options)
    master = ProxyMaster(options, with_termlog=False, with_dumper=False)
    master.server = ProxyServer(config)
    master.addons.add(Addon())
    master.run()


if __name__ == '__main__':
    # Record this process's ID so another program can restart the proxy
    with open('proxy.pid', mode='w') as pid_file:
        pid_file.write(str(os.getpid()))
    run_proxy_server()

While using `proxy.py`, the memory usage of `mitmproxy` soars as time goes by. Other people have run into this as well and reported it in the GitHub issues. Some say that because HTTP connections use `keep-alive=true`, requests are kept and never released, so memory usage grows with the number of requests, and that the fix is to close connections manually by adding `flow.request.headers["Connection"] = "close"`. After I added it there was some relief, but it did not solve the problem fundamentally.

In the end, I worked around the memory leak by recording the proxy process ID in `proxy.pid` and having another program restart `proxy.py` periodically.
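
The periodic restart can be sketched as a small supervisor script. This is a rough illustration rather than the author's actual program: the function names and the one-hour interval are assumptions, while the file names `proxy.pid` and `proxy.py` come from the listing above.

```python
import os
import signal
import subprocess
import sys
import time


def read_pid(pid_file):
    """Read the process ID that proxy.py wrote at startup."""
    with open(pid_file, encoding="utf-8") as file_handle:
        return int(file_handle.read().strip())


def restart_proxy(pid_file="proxy.pid", script="proxy.py"):
    """Stop the recorded proxy process (if still running) and start a fresh one."""
    try:
        os.kill(read_pid(pid_file), signal.SIGTERM)
    except (OSError, ValueError):
        pass  # no pid file, unreadable pid, or the process has already exited
    # proxy.py rewrites proxy.pid itself when it starts
    return subprocess.Popen([sys.executable, script])


def supervise(interval_seconds=3600):
    """Restart the proxy at a fixed interval (the interval is an assumed value)."""
    while True:
        restart_proxy()
        time.sleep(interval_seconds)
```

Calling `supervise()` from a separate script keeps the proxy's memory bounded by the restart interval, at the cost of briefly dropping connections during each restart.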

Final words

If you found this article useful, please like, share, and leave a comment; that is the strongest motivation for me to keep writing more high-quality articles!

If you think my understanding is wrong anywhere, comments and discussion are welcome~
