Selenium/webdriver introduction and working principle

I've been digging into some low-level details lately. "Driver" literally means one who drives, and comparing WebDriver to a human driver turns out to be a very apt analogy.

We can compare WebDriver driving a browser to a taxi driver driving a taxi. There are three roles when driving a taxi:

· Passenger: He/she tells the taxi driver where to go and how to get there.

· Taxi Driver: He steers the taxi as requested by the passengers.

· Taxi: The taxi completes the real driving according to the driver's control, and sends the passengers to the destination.

There are three similar roles in WebDriver:

· Automated test code: The automated test code sends requests to the browser driver (such as the Firefox driver or the Chrome driver).

· Browser driver: It parses the requests sent by the automated test code and forwards them to the browser.

· Browser: It executes the instructions sent by the browser driver and completes the operation the engineer wants.

So in this analogy:

· The automated test code written by the engineer is equivalent to the passenger.

· Browser drivers are equivalent to taxi drivers.

· The browser is like a taxi.

Now let's look at how WebDriver works technically. The same three roles appear:

· WebDriver API (language bindings for Java, Python, C# and other languages). For Java, this is the downloaded Selenium JAR package, for example selenium-java-3.8.1.zip, which corresponds to Selenium version 3.8.1.

· Browser driver: each browser has its own driver, shipped as an executable file. For example, Chrome's chromedriver.exe, Firefox's geckodriver.exe, and IE's IEDriverServer.exe.

· Browser: the commonly used browsers we are all familiar with.

How do these three communicate while a WebDriver script is running? Why can the same browser driver handle both Java and Python scripts? Let's take a look at what happens behind the scenes when a Selenium script is executed:

· For each command in a Selenium script, an HTTP request is created and sent to the browser driver.

· The browser driver contains an HTTP server that receives these HTTP requests.

· After receiving a request, the HTTP server controls the corresponding browser according to that request.

· The browser executes the specific test step and returns the result to the HTTP server.

· The HTTP server returns the result to the Selenium script. If the response carries an error code, the corresponding error message appears on the console.
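The round trip described above can be imitated without a real browser. The following is a toy sketch under stated assumptions, not how any real driver is implemented: `FakeBrowser`, `ToyDriverHandler` and `run_demo` are hypothetical names, the fake browser only remembers the last URL it was told to open, and the session name `demo` is made up.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class FakeBrowser:
    """Hypothetical stand-in for a real browser: it only remembers the current URL."""

    def __init__(self):
        self.current_url = None

    def navigate(self, url):
        self.current_url = url


class ToyDriverHandler(BaseHTTPRequestHandler):
    """Plays the 'taxi driver': receives HTTP commands and steers the browser."""

    browser = FakeBrowser()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        command = json.loads(self.rfile.read(length) or b"{}")
        if self.path.endswith("/url"):
            self.browser.navigate(command["url"])
            self._reply(200, {"status": 0, "value": None})
        else:
            # 9 is UnknownCommand in the JSON Wire Protocol
            self._reply(404, {"status": 9, "value": "unknown command"})

    def _reply(self, http_code, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(http_code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def run_demo():
    """Start the toy driver, send one 'open this URL' command, return the reply."""
    server = HTTPServer(("127.0.0.1", 0), ToyDriverHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        request = Request(
            "http://127.0.0.1:%d/session/demo/url" % server.server_address[1],
            data=json.dumps({"url": "https://www.baidu.com"}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        opener = build_opener(ProxyHandler({}))  # talk to localhost directly, no proxy
        with opener.open(request) as response:
            return response.status, json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The test script never touches the browser object itself. It only sees the HTTP status and the JSON reply, which is the same separation a real driver enforces.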

Why use the HTTP protocol?

Because HTTP is the standard protocol for communication between a browser and a web server, and almost every programming language provides rich HTTP libraries, it is convenient to handle requests and responses between the client and the server. WebDriver has a typical client/server (C/S) structure: the WebDriver API is the client, and the small browser driver is the server.
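To illustrate that nothing beyond an ordinary HTTP library is needed on the client side, here is a minimal sketch that builds (but does not send) a navigation command using only Python's standard library. The helper name `build_navigate_request` is hypothetical, and the session id is the sample value used later in this article.

```python
import json
from urllib.request import Request


def build_navigate_request(base_url, session_id, target_url):
    """Build, without sending, the HTTP request behind driver.get(target_url)."""
    body = json.dumps({"url": target_url, "sessionId": session_id}).encode("utf-8")
    return Request(
        "%s/session/%s/url" % (base_url.rstrip("/"), session_id),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_navigate_request(
    "http://localhost:9515",
    "44fdb7b1b048a76c0f625545b0d2567b",
    "https://www.baidu.com",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Any language that can produce the same method, URL and JSON body can drive the same browser driver, which is why one chromedriver serves both Java and Python scripts.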

The protocol WebDriver is built on: the JSON Wire Protocol.

The JSON Wire Protocol sits on top of HTTP and further standardizes the data carried in the body of HTTP requests and responses.

An HTTP exchange consists of several parts: the request method, the request and response bodies, the response status code, and so on.

Common HTTP request methods:

GET: used to obtain information from the server, for example the title of a web page.

POST: sends an operation request to the server, for example findElement or click.
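The pairing of commands with HTTP methods can be sketched as a small lookup table. The routes below are the ones this article uses plus the title and current-URL routes from the protocol. The command names on the left and the `resolve` helper are hypothetical labels chosen for this example.

```python
# Command name (hypothetical label) -> (HTTP method, route template)
COMMANDS = {
    "get_title": ("GET", "/session/:sessionId/title"),
    "get_current_url": ("GET", "/session/:sessionId/url"),
    "navigate": ("POST", "/session/:sessionId/url"),
    "find_element": ("POST", "/session/:sessionId/element"),
    "click_element": ("POST", "/session/:sessionId/element/:id/click"),
    "quit": ("DELETE", "/session/:sessionId"),
}


def resolve(command, **path_params):
    """Return (method, path) with every ':name' segment filled from path_params."""
    method, template = COMMANDS[command]
    segments = [
        path_params[segment[1:]] if segment.startswith(":") else segment
        for segment in template.split("/")
    ]
    return method, "/".join(segments)


print(resolve("get_title", sessionId="44fdb7b1b048a76c0f625545b0d2567b"))
print(resolve("click_element", sessionId="b2801b5dc58b15e76d0d3295b04d295c",
              id="0.11402119390850629-1"))
```

Reading state maps to GET, changing state maps to POST, and ending the session maps to DELETE. A missing path parameter raises `KeyError` rather than producing a half-filled URL.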

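Response status codes:

To give users clearer feedback, WebDriver defines more detailed status codes than HTTP alone. In the JSON Wire Protocol these are returned in the "status" field of the JSON response body (0 means success), alongside the ordinary HTTP response code. For example:

7: NoSuchElement

11: ElementNotVisible

200: Everything OK (the HTTP response code)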
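A client can turn these codes into errors by inspecting the `status` field of the decoded response body. The following is a minimal sketch, assuming the legacy JSON Wire Protocol response shape with `status` and `value` fields. `WebDriverCommandError` and `check_response` are hypothetical names, and only the codes mentioned in this article are listed.

```python
class WebDriverCommandError(Exception):
    """Hypothetical error type raised when a command reports a non-zero status."""


# Only the codes discussed in this article; the protocol defines more.
STATUS_NAMES = {
    0: "Success",
    7: "NoSuchElement",
    11: "ElementNotVisible",
}


def check_response(payload):
    """Return payload['value'] on success, raise on any non-zero status."""
    status = payload.get("status", 0)  # assumption: a missing status means success
    if status != 0:
        name = STATUS_NAMES.get(status, "UnknownStatus(%d)" % status)
        raise WebDriverCommandError("%s: %r" % (name, payload.get("value")))
    return payload.get("value")


print(check_response({"status": 0, "value": "example title"}))

try:
    check_response({"status": 7, "value": {"message": "no such element"}})
except WebDriverCommandError as error:
    print("command failed:", error)
```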

Now for the most important part, the HTTP request and response bodies:

The body carries the actual data. In WebDriver, this data is stored and transmitted as JSON, which is what the JSON Wire Protocol defines.

Selenium's WebDriver API wraps the native APIs of the various browsers in "a protocol designed and defined by Selenium itself, named The WebDriver Wire Protocol".

Operation level:
1. Testers write UI automation test scripts (Java, Python, etc.). When a script runs, the program starts the specified browser driver.

2. The browser driver acts as a remote server that accepts script commands. This web service opens a port, for example http://localhost:9515, and listens on it.

3. The script's commands reach the web service in JSON format, and the web service passes them to the browser to execute.

Logical level:
1. When the tester runs the test script, a session is created and RESTful requests are sent to the web service over HTTP.

2. The web service translates each RESTful request into commands the browser can understand, and then receives the execution result.

3. The web service wraps the result as JSON and returns it to the client (the test script), so the client knows whether the operation succeeded and the test can verify the outcome.

We can verify this: download chromedriver, add its location to the PATH environment variable (make sure its version matches your Chrome browser), and then run chromedriver.

As you can see, a server starts and listens on port 9515:

    andersons-iMac:~ anderson$ chromedriver
    Starting ChromeDriver 2.39.562713 (dd642283e958a93ebf6891600db055f1f1b4f3b2) on port 9515
    Only local connections are allowed.
    GVA info: Successfully connected to the Intel plugin, offline Gen9
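If chromedriver is launched from a script rather than by hand, the startup line in this output can be parsed to learn which port to talk to. A minimal sketch, assuming the banner keeps the wording shown here (other ChromeDriver versions may print it differently). `parse_startup_banner` is a hypothetical helper.

```python
import re

# Matches lines such as:
# Starting ChromeDriver 2.39.562713 (dd64...) on port 9515
BANNER = re.compile(r"Starting ChromeDriver (?P<version>\S+) .* on port (?P<port>\d+)")


def parse_startup_banner(line):
    """Return (version, port) from the startup line, or None if it does not match."""
    match = BANNER.search(line)
    if match is None:
        return None
    return match.group("version"), int(match.group("port"))


banner = ("Starting ChromeDriver 2.39.562713 "
          "(dd642283e958a93ebf6891600db055f1f1b4f3b2) on port 9515")
print(parse_startup_banner(banner))
```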

The ChromeDriver startup output emphasizes that only local connections are allowed. As mentioned earlier, the passenger (the test code) talks to the driver by constructing HTTP requests. The request that starts a browser session looks like this:

Request method: POST

Request address: http://localhost:9515/session

Request body:

    capabilities = {
        "capabilities": {
            "alwaysMatch": {
                "browserName": "chrome"
            },
            "firstMatch": [
                {}
            ]
        },
        "desiredCapabilities": {
            "platform": "ANY",
            "browserName": "chrome",
            "version": "",
            "chromeOptions": {
                "args": [],
                "extensions": []
            }
        }
    }
 
We can try sending this request to ChromeDriver with Python's requests library:

    import json
    import requests

    session_url = 'http://localhost:9515/session'
    # W3C-style "capabilities" plus the legacy "desiredCapabilities"
    session_pars = {
        "capabilities": {
            "firstMatch": [{}],
            "alwaysMatch": {"browserName": "chrome",
                            "platformName": "any",
                            "goog:chromeOptions": {"extensions": [], "args": []}}},
        "desiredCapabilities": {"browserName": "chrome", "version": "", "platform": "ANY",
                                "goog:chromeOptions": {"extensions": [], "args": []}}}
    # Create a new session, which starts the browser
    r_session = requests.post(session_url, json=session_pars)
    print(json.dumps(r_session.json(), indent=2))
 
Result:

    {
      "sessionId": "44fdb7b1b048a76c0f625545b0d2567b",
      "status": 0,
      "value": {
        "acceptInsecureCerts": false,
        "acceptSslCerts": false,
        "applicationCacheEnabled": false,
        "browserConnectionEnabled": false,
        "browserName": "chrome",
        "chrome": {
          "chromedriverVersion": "2.40.565386 (45a059dc425e08165f9a10324bd1380cc13ca363)",
          "userDataDir": "/var/folders/yd/dmwmz84x5rj354qkz9rwwzbc0000gn/T/.org.chromium.Chromium.RzlABs"
        },
        "cssSelectorsEnabled": true,
        "databaseEnabled": false,
        "handlesAlerts": true,
        "hasTouchScreen": false,
        "javascriptEnabled": true,
        "locationContextEnabled": true,
        "mobileEmulationEnabled": false,
        "nativeEvents": true,
        "networkConnectionEnabled": false,
        "pageLoadStrategy": "normal",
        "platform": "Mac OS X",
        "rotatable": false,
        "setWindowRect": true,
        "takesHeapSnapshot": true,
        "takesScreenshot": true,
        "unexpectedAlertBehaviour": "",
        "version": "71.0.3578.80",
        "webStorageEnabled": true
      }
    }
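Every later request needs the session id from this reply. In the legacy response shown here it sits at the top level. Drivers that answer in the W3C WebDriver format nest it under `value` instead. The following sketch handles both shapes. `extract_session_id` is a hypothetical helper, and the sample dictionaries are trimmed versions of such replies.

```python
def extract_session_id(response_json):
    """Return the session id from a new-session reply in either response shape."""
    # Legacy JSON Wire Protocol shape: {"sessionId": "...", "status": 0, "value": {...}}
    if response_json.get("sessionId"):
        return response_json["sessionId"]
    # W3C shape: {"value": {"sessionId": "...", "capabilities": {...}}}
    value = response_json.get("value")
    if isinstance(value, dict) and value.get("sessionId"):
        return value["sessionId"]
    raise ValueError("no sessionId found in new-session response")


legacy_reply = {
    "sessionId": "44fdb7b1b048a76c0f625545b0d2567b",
    "status": 0,
    "value": {"browserName": "chrome"},
}
w3c_reply = {
    "value": {
        "sessionId": "44fdb7b1b048a76c0f625545b0d2567b",
        "capabilities": {"browserName": "chrome"},
    }
}
print(extract_session_id(legacy_reply))
print(extract_session_id(w3c_reply))
```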

How to open a web page, the equivalent of driver.get(url):

Request method: POST

Request address: http://localhost:9515/session/:sessionId/url

Note: ":sessionId" in the address above is the sessionId value returned by the request that started the browser.

For example, the request I just sent started the browser and returned "sessionId": "44fdb7b1b048a76c0f625545b0d2567b".

So the request address for opening the URL is:

Request address: http://localhost:9515/session/44fdb7b1b048a76c0f625545b0d2567b/url

Request body: {"url": "https://www.baidu.com", "sessionId": "44fdb7b1b048a76c0f625545b0d2567b"}

That is:

    import requests

    url = 'http://localhost:9515/session/44fdb7b1b048a76c0f625545b0d2567b/url'
    pars = {"url": "https://www.baidu.com", "sessionId": "44fdb7b1b048a76c0f625545b0d2567b"}
    r = requests.post(url, json=pars)
    print(r.json())

How to locate an element, the equivalent of driver.find_element_by_xx:

Request method: POST

Request address: http://localhost:9515/session/:sessionId/element

Note: ":sessionId" in the address above is the sessionId value returned by the request that started the browser.

For example, the request I just sent started the browser and returned "sessionId": "b2801b5dc58b15e76d0d3295b04d295c".

So the request address for finding the page element is:

Request address: http://localhost:9515/session/b2801b5dc58b15e76d0d3295b04d295c/element

Request body: {"using": "css selector", "value": ".postTitle a", "sessionId": "b2801b5dc58b15e76d0d3295b04d295c"}

That is:

    import requests

    url = 'http://localhost:9515/session/b2801b5dc58b15e76d0d3295b04d295c/element'
    pars = {"using": "css selector", "value": ".postTitle a", "sessionId": "b2801b5dc58b15e76d0d3295b04d295c"}
    r = requests.post(url, json=pars)
    print(r.json())
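The reply to this request carries the element's id, which the next command (such as a click) needs in its URL. In the legacy protocol the id is stored under the key `ELEMENT` inside `value`. Drivers speaking the W3C WebDriver format use the key `element-6066-11e4-a52e-4f735466cecf` instead. A minimal sketch that accepts either, with `extract_element_id` as a hypothetical helper and a trimmed sample reply:

```python
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"
LEGACY_ELEMENT_KEY = "ELEMENT"


def extract_element_id(response_json):
    """Return the element id from a find-element reply in either protocol dialect."""
    value = response_json.get("value")
    if isinstance(value, dict):
        for key in (W3C_ELEMENT_KEY, LEGACY_ELEMENT_KEY):
            if key in value:
                return value[key]
    raise ValueError("no element id found in find-element response")


legacy_reply = {
    "sessionId": "b2801b5dc58b15e76d0d3295b04d295c",
    "status": 0,
    "value": {"ELEMENT": "0.11402119390850629-1"},
}
element_id = extract_element_id(legacy_reply)
click_url = "http://localhost:9515/session/%s/element/%s/click" % (
    legacy_reply["sessionId"], element_id)
print(click_url)
```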

How to operate on an element, the equivalent of click():

Request method: POST

Request address: http://localhost:9515/session/:sessionId/element/:id/click

Note: ":sessionId" in the address above is the sessionId value returned by the request that started the browser.

":id" is the ELEMENT value returned by the request that located the element.

For example, the request I just sent started the browser and returned "sessionId": "b2801b5dc58b15e76d0d3295b04d295c".

Locating the element returned the ELEMENT value "0.11402119390850629-1".

So the request address for clicking the page element is:

Request address: http://localhost:9515/session/b2801b5dc58b15e76d0d3295b04d295c/element/0.11402119390850629-1/click

Request body: {"id": "0.11402119390850629-1", "sessionId": "b2801b5dc58b15e76d0d3295b04d295c"}

That is:

    import requests

    url = 'http://localhost:9515/session/b2801b5dc58b15e76d0d3295b04d295c/element/0.11402119390850629-1/click'
    # The id in the body must be the same element id used in the URL
    pars = {"id": "0.11402119390850629-1", "sessionId": "b2801b5dc58b15e76d0d3295b04d295c"}
    r = requests.post(url, json=pars)
    print(r.json())

As the above shows, UI automation can actually be written as API automation.

It is just very cumbersome: without the packaged webdriver commands to call, it feels like going the long way round for nothing.

Let's write a piece of code to get a feel for it:

    import time

    import requests

    capabilities = {
        "capabilities": {
            "alwaysMatch": {
                "browserName": "chrome"
            },
            "firstMatch": [
                {}
            ]
        },
        "desiredCapabilities": {
            "platform": "ANY",
            "browserName": "chrome",
            "version": "",
            "chromeOptions": {
                "args": [],
                "extensions": []
            }
        }
    }

    # Open the browser: POST http://127.0.0.1:9515/session
    res = requests.post('http://127.0.0.1:9515/session', json=capabilities).json()
    session_id = res['sessionId']

    # Open Baidu
    requests.post('http://127.0.0.1:9515/session/%s/url' % session_id,
                  json={"url": "http://www.baidu.com", "sessionId": session_id})
    time.sleep(3)

    # Close the browser and delete the session
    requests.delete('http://127.0.0.1:9515/session/%s' % session_id, json={"sessionId": session_id})
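The packaged webdriver commands are, in essence, a thin layer over calls like these. The following sketch shows what such a layer might look like. It is not Selenium's actual implementation: `MiniDriver` and `send_http` are hypothetical names, the default address assumes a chromedriver listening on port 9515, and the transport uses the standard library instead of requests so that it can be swapped out or tested without a running driver.

```python
import json
from urllib.request import Request, urlopen


def send_http(method, url, body=None):
    """Default transport: send one JSON command and decode the JSON reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = Request(url, data=data, method=method,
                      headers={"Content-Type": "application/json"})
    with urlopen(request) as response:  # raises HTTPError on 4xx/5xx replies
        return json.loads(response.read())


class MiniDriver:
    """Hypothetical wrapper that hides the session id and URL building."""

    def __init__(self, base_url="http://127.0.0.1:9515", send=send_http):
        self.base_url = base_url.rstrip("/")
        self.send = send
        self.session_id = None

    def start(self, capabilities):
        reply = self.send("POST", self.base_url + "/session", capabilities)
        # Legacy replies carry sessionId at the top level, W3C replies under "value".
        self.session_id = reply.get("sessionId") or reply["value"]["sessionId"]
        return self.session_id

    def get(self, url):
        body = {"url": url, "sessionId": self.session_id}
        return self.send("POST", self._session_url("/url"), body)

    def quit(self):
        reply = self.send("DELETE", self._session_url(""))
        self.session_id = None
        return reply

    def _session_url(self, suffix):
        if self.session_id is None:
            raise RuntimeError("start() must be called before other commands")
        return "%s/session/%s%s" % (self.base_url, self.session_id, suffix)
```

With a chromedriver running, usage would look like `driver = MiniDriver()`, `driver.start(capabilities)`, `driver.get("http://www.baidu.com")`, `driver.quit()`. Passing a different `send` function makes it possible to log every command or to exercise the wrapper without a browser.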

In fact, the point of understanding the underlying principle is practical: it makes problems easier to inspect and solve while debugging.

Of course, if your interface (API) automation also needs a small amount of UI automation, this approach is worth considering.


Origin blog.csdn.net/2301_76643199/article/details/131897671