Selenium WebDriver automated testing framework

I have recently been writing a Selenium WebDriver automation framework. After several days of hard work, I have a basic framework that supports both data-driven and keyword-driven web testing (built mainly on Ant + Jenkins + TestNG + Selenium WebDriver + jxl). Developing this framework showed me just how powerful WebDriver is. Even Alibaba's F2etest browser compatibility testing platform is based on WebDriver. What follows is an introduction to Selenium WebDriver that aims to explain how it works in some depth:

What is WebDriver?
WebDriver (Selenium 2.0) is an automated testing tool for Web applications;
It provides a friendly set of APIs;
WebDriver is purely a set of class libraries and does not depend on any testing framework; the only thing it needs is the appropriate browser driver.
Official website documentation: https://docs.seleniumhq.org/docs/03_webdriver.jsp
Development languages supported by the WebDriver API

Java
Python
PHP
JavaScript
Perl
Ruby
C#
Why should you learn WebDriver?

Automated testing concepts
WebDriver: locating elements
WebDriver: operating on elements
Setting up a Python-based environment

Windows system
Python 3.5 (or above)
Install the selenium package
Install a browser
Install PyCharm
selenium: install, uninstall, and view commands

Install: pip install selenium==2.48.0
1). pip: the general-purpose Python package management tool. It provides functions for searching, downloading, installing, and uninstalling Python packages.
2). install: the installation command.
3). selenium==2.48.0: installs version 2.48.0 of selenium specifically (if no version is specified, the latest version is installed).
Uninstall: pip uninstall selenium
View: pip show selenium
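
The same check that pip show performs can also be done from inside a script. The following is only a small sketch: it assumes Python 3.8 or later (for importlib.metadata), and the helper name installed_version is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("selenium")
    if version is None:
        print("selenium is not installed")
    else:
        print("selenium " + version)
```

This is handy at the top of a test run, where a wrong selenium version is easier to report early than to debug later.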
Browsers
Firefox browser

Firefox 48 or above
Selenium 3.X + the Firefox driver (geckodriver)
Firefox 47 or below
Selenium 2.X with its built-in driver
IE browser

IE 9 or above
Selenium 3.X

Download address of each driver: http://www.seleniumhq.org/download/
Note:

The browser version and the driver version must match! (For example, if the browser is 32-bit and the driver is 64-bit, the script will fail!)
After downloading the browser driver, add its directory to the Path environment variable, or put the driver directly into the Python installation directory, since that directory has already been added to Path.
It is recommended to use the Firefox browser (versions 24, 35)
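
Before starting a browser, it can save time to confirm that the driver really is reachable through Path. A minimal sketch using only the standard library; the helper name find_driver is hypothetical, and the executable names listed are the usual ones but may differ on your machine.

```python
import shutil
from typing import Optional


def find_driver(name: str = "geckodriver") -> Optional[str]:
    """Return the full path of a driver executable found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to the drivers you downloaded.
    for driver in ("geckodriver", "chromedriver", "IEDriverServer"):
        path = find_driver(driver)
        print(driver + ": " + (path or "not found on PATH"))
```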
After Selenium and WebDriver were merged, the resulting testing tool was called Selenium 2.x. In the Selenium 1 era, Selenium relied on JavaScript to achieve test automation.

Early Selenium used JavaScript injection to interact with browsers. This required Selenium RC to start a server, which translated the API calls that operate on Web elements into snippets of JavaScript and injected them into the browser after the Selenium core had launched it. Anyone who has developed a web application knows that JavaScript can access any element of the page and manipulate it freely, which is how Selenium achieved its purpose: automating web operations. The drawbacks of JavaScript injection are that it is slow, and that its stability depends heavily on the quality of the JavaScript that the Selenium core generates from the API calls.

When Selenium 2.x introduced the concept of WebDriver, it provided a completely different way to interact with the browser: use the browser's native API, wrapped in a more object-oriented Selenium WebDriver API, to operate directly on the elements of the page and even on the browser itself (screenshots, window size, startup, shutdown, plug-in installation, certificate configuration, and the like). Because the browser's native API is used, speed improves greatly, and the stability of the calls becomes the responsibility of the browser vendor, which is clearly a more sensible arrangement. One side effect is that browser vendors differ in how they operate on and render Web elements, so Selenium WebDriver has to provide a different implementation for each browser. For example, Firefox has a dedicated FirefoxDriver, Chrome has a dedicated ChromeDriver, and so on (there are even an AndroidDriver and an iOS WebDriver).

The WebDriver Wire protocol is universal: whether it is FirefoxDriver or ChromeDriver, a Web Service based on this protocol is started on some port once the driver is up. For example, after FirefoxDriver initializes successfully, it listens at http://localhost:7055 by default, while ChromeDriver might listen at something like http://localhost:46350. From then on, whenever we call any WebDriver API, a CommandExecutor sends a command, which is really an HTTP request to the Web Service on the listening port. The body of that HTTP request is a JSON string in the format specified by the WebDriver Wire protocol, telling Selenium what we want the browser to do next.
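
To make "an HTTP request with a JSON body" concrete, here is a rough sketch that builds such a request without sending it. The function name build_command_request, the session id abc123, and the target URL are all made up for illustration; only the port and the /session/:sessionId/url path come from the text and the excerpt in this article.

```python
import json
from urllib.request import Request


def build_command_request(base_url, method, path, payload=None):
    """Build (but do not send) the HTTP request for one WebDriver command."""
    body = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json; charset=utf-8"}
    return Request(base_url + path, data=body, headers=headers, method=method)


# Hypothetical values: a driver listening on port 7055 and an invented session id.
request = build_command_request(
    "http://localhost:7055",
    "POST",
    "/session/abc123/url",
    {"url": "https://example.com"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually send it, which only makes sense when a driver is really listening on that port.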

When we create a new WebDriver, Selenium first checks that the browser's native component is available and that its version matches. It then starts a complete set of Web Services in the target browser. These Web Services speak a protocol designed and defined by Selenium itself, called the WebDriver Wire Protocol. The protocol is very powerful and can make the browser do almost anything, including opening, closing, maximizing, minimizing, locating elements, clicking elements, uploading files, and so on.

Here is a rough diagram the author drew to show how the various WebDrivers work:

From the picture above, we can see that the WebDriver subclass for each browser relies on a browser-specific native component. For example, Firefox requires an add-on named webdriver.xpi, while IE needs a dll file to convert Web Service commands into native browser calls. The figure also shows that the WebDriver Wire protocol is a set of RESTful web services.

For the details of the WebDriver Wire protocol, for example what exactly this Web Service can do, you can read the official Selenium protocol document. In the Selenium source code there is an HttpCommandExecutor class that maintains a Map<String, CommandInfo>, which converts the simple string keys representing commands into the corresponding URLs. This follows from the idea behind REST, which is to treat everything as a resource, with each resource identified by a URI. So when we send an HTTP request with a specific URL to this RESTful web service, it can work out which operation needs to be performed. An excerpt of the source code is as follows:

nameToUrl = ImmutableMap.builder()
    .put(NEW_SESSION, post("/session"))
    .put(QUIT, delete("/session/:sessionId"))
    .put(GET_CURRENT_WINDOW_HANDLE, get("/session/:sessionId/window_handle"))
    .put(GET_WINDOW_HANDLES, get("/session/:sessionId/window_handles"))
    .put(GET, post("/session/:sessionId/url"))
    // The Alert API is still experimental and should not be used.
    .put(GET_ALERT, get("/session/:sessionId/alert"))
    .put(DISMISS_ALERT, post("/session/:sessionId/dismiss_alert"))
    .put(ACCEPT_ALERT, post("/session/:sessionId/accept_alert"))
    .put(GET_ALERT_TEXT, get("/session/:sessionId/alert_text"))
    .put(SET_ALERT_VALUE, post("/session/:sessionId/alert_text"))
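
The same lookup idea can be sketched in a few lines of Python. This is not Selenium's own code: the dictionary NAME_TO_URL, the function resolve, and the string keys used as command names are assumptions chosen for illustration, while the HTTP methods and URL templates are copied from the excerpt above.

```python
from typing import Dict, Tuple

# Command name -> (HTTP method, URL template). The key strings are illustrative.
NAME_TO_URL: Dict[str, Tuple[str, str]] = {
    "newSession": ("POST", "/session"),
    "quit": ("DELETE", "/session/:sessionId"),
    "getCurrentWindowHandle": ("GET", "/session/:sessionId/window_handle"),
    "get": ("POST", "/session/:sessionId/url"),
}


def resolve(command: str, session_id: str = "") -> Tuple[str, str]:
    """Look up a command and fill the :sessionId placeholder in its URL."""
    method, template = NAME_TO_URL[command]
    return method, template.replace(":sessionId", session_id)


print(resolve("newSession"))
print(resolve("quit", "abc123"))
```

Keeping the table as plain data means a new command is one more entry rather than one more method, which is presumably part of why the original uses a map as well.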

You can see that the URLs actually sent are all relative paths, and most of them begin with /session/:sessionId. This reflects the fact that WebDriver allocates an independent sessionId every time it starts a browser, so sessions running in parallel on multiple threads do not conflict or interfere with each other. For example, one of the most commonly used WebDriver APIs, findElement, is converted into the URL /session/:sessionId/element, and the HTTP request body carries the specific parameters: the locator strategy (by ID, CSS, or XPath) and its value. After receiving and executing this operation, the service returns an HTTP response whose content is also JSON. It identifies the WebElement that was found, whose details (text, tag name, class name, and so on) can then be queried. The following code snippet parses the HTTP response we are talking about:

try {
  response = new JsonToBeanConverter().convert(Response.class, responseAsText);
} catch (ClassCastException e) {
  if (responseAsText != null && "".equals(responseAsText)) {
    // The remote server has died, but has already set some headers.
    // Normally this occurs when the final window of the firefox driver
    // is closed on OS X. Return null, as the return value _should_ be
    // being ignored. This is not an elegant solution.
    return null;
  }
  throw new WebDriverException("Cannot convert text to response: " + responseAsText, e);
} //...
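
For readers more at home in Python, the request body and the response handling can be sketched roughly as follows. The function names and the sample response are invented for this example; the sample follows the shape used by the legacy JSON wire protocol as I understand it (a status code plus a value holding an element reference), so treat it as illustrative rather than as captured output.

```python
import json
from typing import Any, Optional


def find_element_body(using: str, value: str) -> str:
    """JSON body sent with POST /session/:sessionId/element."""
    return json.dumps({"using": using, "value": value})


def parse_response(response_text: Optional[str]) -> Optional[Any]:
    """Return the "value" field of a response, or None for an empty body."""
    if not response_text:
        # Like the Java excerpt, treat an empty body as "no result".
        return None
    try:
        return json.loads(response_text)["value"]
    except (ValueError, KeyError, TypeError) as exc:
        raise RuntimeError("Cannot convert text to response: " + response_text) from exc


# Invented sample response for a successful element lookup.
sample = '{"sessionId": "abc123", "status": 0, "value": {"ELEMENT": "0"}}'
print(find_element_body("id", "kw"))
print(parse_response(sample))
```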

I hope this summary has made the operating principle of WebDriver clear. I really admire the design of this RESTful web service, and I feel that the public API built on top of it could be made even friendlier and more powerful. I will stop here for now; I plan to keep analyzing the Selenium source code and sharing what I find!

Summary of experience using Selenium WebDriver:

WebDriver's more object-oriented approach greatly lowers the barrier to entry for Selenium, and operating on Web elements is simple and easy to learn. In a real project, most of the work goes into parsing and locating the various elements on the pages of your target application. For example, to locate a Button you can use an ID, CSS, or XPath. To click that Button, you write a function that calls the Selenium API, that is, click() or submit() on WebElement. But what about the next Button? What about hundreds or thousands of Buttons?

Therefore, you need your own set of algorithms or wrappers that provide a general way of locating elements, tailored to the characteristics of the project's pages. Once your general locating logic can accurately find any element, the rest follows naturally: just hand it over to the Selenium WebElement API. I think this locating logic is where most of the work lies when using Selenium for web automation. Of course, some companies' Web projects use a UI framework developed in-house, as is the case at the company where the author works, which makes the locating rules and algorithms for Web elements easier to design. If the page code of the web project is messy, you need cleverer and more rigorous logic to find the elements you want to operate on and inspect!
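
As a very small illustration of a "general locating method", the sketch below builds one XPath rule that covers every button identified by its visible label. It assumes the project's buttons are plain button elements whose text is the label, which will not hold for every UI framework; both function names are hypothetical.

```python
def xpath_literal(text: str) -> str:
    """Quote text as an XPath 1.0 string literal."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote types are present: glue the pieces together with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


def button_locator(label: str):
    """Return a (strategy, expression) pair for a button with the given label."""
    return "xpath", "//button[normalize-space(.)=" + xpath_literal(label) + "]"


print(button_locator("Save"))
```

With Selenium's Python bindings the pair can be unpacked straight into a call such as driver.find_element(*button_locator("Save")), so a single rule serves any number of buttons.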

In my project, I designed and wrapped a set of common APIs that intelligently locate the various types of elements on a page. For example, the pages in the project contain a large number of dialogs and wizards, all implemented with div+css. I provided a dialog component with next(), save(), finish(), click(String buttonName), cancel(), and other methods, and it tracks when an operation has completed by watching how long the mask layer and the loading icon stay visible. This is just a small example. I will share more details when I have the opportunity.
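
A dialog component along those lines might look roughly like this in Python. It is a sketch, not the author's implementation: the class name Dialog, the mask-layer XPath, and the button labels are all assumptions, labels are assumed to contain no quotes, and the StubDriver exists only so the example runs without a browser. The two driver methods it relies on, find_element(by, value) and find_elements(by, value), match the Selenium Python bindings.

```python
import time


class Dialog:
    """Hypothetical wrapper around a div+css dialog or wizard."""

    MASK_XPATH = "//div[contains(@class, 'mask')]"  # assumed mask-layer markup

    def __init__(self, driver, timeout=10.0, poll=0.2):
        self.driver = driver
        self.timeout = timeout
        self.poll = poll

    def click(self, button_name):
        """Click a button by its label, then wait for the mask to disappear."""
        xpath = "//button[normalize-space(.)='%s']" % button_name
        self.driver.find_element("xpath", xpath).click()
        self.wait_until_idle()

    def next(self):
        self.click("Next")

    def save(self):
        self.click("Save")

    def finish(self):
        self.click("Finish")

    def cancel(self):
        self.click("Cancel")

    def wait_until_idle(self):
        """Poll until no mask layer is found, or raise after the timeout."""
        deadline = time.monotonic() + self.timeout
        while self.driver.find_elements("xpath", self.MASK_XPATH):
            if time.monotonic() > deadline:
                raise TimeoutError("mask layer is still visible")
            time.sleep(self.poll)


class _StubElement:
    def __init__(self, log, xpath):
        self.log = log
        self.xpath = xpath

    def click(self):
        self.log.append(self.xpath)


class StubDriver:
    """Stand-in for a real WebDriver so the sketch runs without a browser."""

    def __init__(self):
        self.clicked = []

    def find_element(self, by, value):
        return _StubElement(self.clicked, value)

    def find_elements(self, by, value):
        return []  # pretend the mask layer is never shown


driver = StubDriver()
dialog = Dialog(driver)
dialog.next()
dialog.finish()
print(driver.clicked)
```

Swapping StubDriver for a real driver instance should be the only change needed to try this against an actual page, provided the assumed markup matches.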
