Python Web Crawling from Beginner to Master - Using the Advanced Framework Selenium (2): Node Operations

Category: "Python Web Crawling from Beginner to Master" series index

In "Using the Advanced Framework Selenium (1): The Basics" we covered declaring the browser object, visiting pages, and other basic operations. Selenium can in fact parse HTML just as XPath, BeautifulSoup, pyquery, and similar parsing tools do. It can also drive the browser to carry out all kinds of operations. This article covers node operations in Selenium; later articles will go on to cover page parsing with Selenium.

As mentioned above, Selenium can drive the browser to perform various operations, such as filling in forms and simulating clicks. For example, to type text into an input box, we first need to know where that input box is. Selenium offers a range of methods for finding nodes; we can use them to obtain the node we want and then perform actions on it or extract information from it.

A single node

For example, suppose we want to get the search box node from the Taobao home page. First look at its source code.
Figure: Taobao home page HTML
You can see that its id is q and its name is also q. It has many other attributes as well, so there are several ways to obtain it. For example, find_element_by_name() finds it by its name value, and find_element_by_id() finds it by its id. It can also be found with XPath, CSS selectors, and so on.
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input_first = browser.find_element_by_id('q')
input_second = browser.find_element_by_css_selector('#q')
input_third = browser.find_element_by_xpath('//*[@id="q"]')
print(input_first, input_second, input_third)
browser.close()

Here we obtain the input box in three ways: by id, by CSS selector, and by XPath. All three return exactly the same result:

<selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")> 
<selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")> 
<selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")>

Here are all methods of obtaining a single node:

find_element_by_id
find_element_by_name
find_element_by_xpath
find_element_by_link_text
find_element_by_partial_link_text
find_element_by_tag_name
find_element_by_class_name
find_element_by_css_selector

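In addition, Selenium provides the general-purpose method find_element(), which takes two arguments: the lookup strategy By and the value. It is in effect the generic version of methods such as find_element_by_id(). For example, find_element_by_id(id) is equivalent to find_element(By.ID, id), and both return exactly the same result. Let's implement it in code: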

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input_first = browser.find_element(By.ID, 'q')
print(input_first)
browser.close()

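In practice, this lookup does exactly the same job as the functions listed above, but its parameters are more flexible. Note that the find_element_by_* helpers were deprecated in Selenium 4 and removed in Selenium 4.3, so on recent versions find_element() is the form to use.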
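The correspondence between the two styles can be sketched as a lookup table. This is only an illustration, not part of the Selenium API: the names LEGACY_STRATEGIES and find_by_legacy_name are hypothetical, browser stands for any WebDriver instance, and the strategy strings are written out as the values of the By constants (By.ID is the string "id", and so on) so that the sketch has no imports.

```python
# Hypothetical helper: maps each legacy finder name to the locator strategy
# string used by find_element(). The strings are the values of the By
# constants in selenium.webdriver.common.by (By.ID == "id", and so on).
LEGACY_STRATEGIES = {
    "find_element_by_id": "id",
    "find_element_by_name": "name",
    "find_element_by_xpath": "xpath",
    "find_element_by_link_text": "link text",
    "find_element_by_partial_link_text": "partial link text",
    "find_element_by_tag_name": "tag name",
    "find_element_by_class_name": "class name",
    "find_element_by_css_selector": "css selector",
}


def find_by_legacy_name(browser, legacy_name, value):
    """Look up a node with find_element() using a legacy method name."""
    strategy = LEGACY_STRATEGIES[legacy_name]
    return browser.find_element(strategy, value)
```

With a real driver, find_by_legacy_name(browser, 'find_element_by_id', 'q') should perform the same lookup as browser.find_element(By.ID, 'q').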

Multiple nodes

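If the target appears only once in the page, find_element() is all you need. If there are several matching nodes, however, find_element() returns only the first one. To find every node that satisfies the condition, use find_elements() instead. Note the extra s after element in the method name.

For example, to find all the entries in the navigation bar on the left side of the Taobao page: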

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
lis = browser.find_elements_by_css_selector('.service-bd li')
print(lis)
browser.close()

The result is now a list, and each node in the list is a WebElement. In other words, find_element() returns only the first matching node, as a WebElement, while find_elements() returns a list in which every item is a WebElement:

[<selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-1")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-2")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-3")>...<selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-16")>]
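The two return types also behave differently when nothing matches: find_element() raises NoSuchElementException, whereas find_elements() simply returns an empty list. A minimal sketch that relies on this for an optional lookup follows; the helper name find_or_none is hypothetical, and browser stands for any WebDriver instance.

```python
def find_or_none(browser, by, value):
    """Return the first matching node, or None when nothing matches."""
    # find_elements() returns an empty list instead of raising an exception
    matches = browser.find_elements(by, value)
    return matches[0] if matches else None
```

For instance, find_or_none(browser, By.CSS_SELECTOR, '.service-bd li') would give the first navigation entry, or None if the page layout has changed.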

Here are all the methods for obtaining multiple nodes:

find_elements_by_id
find_elements_by_name
find_elements_by_xpath
find_elements_by_link_text
find_elements_by_partial_link_text
find_elements_by_tag_name
find_elements_by_class_name
find_elements_by_css_selector

Of course, we can also select the nodes with find_elements() directly. It can be written like this, and the result is exactly the same:

lis = browser.find_elements(By.CSS_SELECTOR, '.service-bd li')
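Because the result is an ordinary Python list, it can be processed with the usual list tools. The sketch below collects the visible text of each matched node through the text attribute of WebElement; the helper name node_texts is hypothetical, and browser stands for any WebDriver instance.

```python
def node_texts(browser, by, value):
    """Return the stripped, non-empty text of every matching node."""
    texts = []
    for node in browser.find_elements(by, value):
        text = node.text.strip()
        if text:  # skip nodes that have no visible text
            texts.append(text)
    return texts
```

Calling node_texts(browser, By.CSS_SELECTOR, '.service-bd li') would then give the labels of the navigation entries as plain strings.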

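Node interaction

Selenium can drive the browser to perform operations, that is, it can make the browser simulate user actions. The most common ones are send_keys() for typing text, clear() for clearing text, and click() for clicking a button: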

from selenium import webdriver
import time

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input = browser.find_element_by_id('q')
input.send_keys('HUAWEI')
time.sleep(1)
input.clear()
input.send_keys('HUAWEI MateBook X Pro')
button = browser.find_element_by_class_name('btn-search')
button.click()

Here we first drive the browser to open Taobao, then get the input box with find_element_by_id() and type the text "HUAWEI" with send_keys(). After waiting one second we clear the input box with clear() and call send_keys() again to type "HUAWEI MateBook X Pro". Then we get the search button with find_element_by_class_name() and finally call click() to run the search.
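The same clear, type, and click sequence can be wrapped in a small reusable function. This is a sketch under assumptions: the function name run_search and its parameters are hypothetical, and each locator is a (strategy, value) pair such as (By.ID, 'q') that is passed straight to find_element().

```python
def run_search(browser, keyword, input_locator, button_locator):
    """Type a keyword into an input box and click the search button."""
    box = browser.find_element(*input_locator)
    box.clear()              # remove any text already in the box
    box.send_keys(keyword)   # type the new keyword
    browser.find_element(*button_locator).click()
```

For the Taobao example this would be called as run_search(browser, 'HUAWEI', (By.ID, 'q'), (By.CLASS_NAME, 'btn-search')).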

With the methods above we have carried out some common node operations. More operations are described in the interaction section of the official documentation.

Action Chain

In the examples above, the interactions were all performed on a particular node. For an input box we called its text input and clear methods; for a button we called its click method. Other operations have no specific target object, such as dragging the mouse or pressing keyboard keys. These are performed in a different way, namely with action chains. For example, dragging one node onto another can be done like this:

from selenium import webdriver
from selenium.webdriver import ActionChains

browser = webdriver.Chrome()
url = 'http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
browser.get(url)
browser.switch_to.frame('iframeResult')
source = browser.find_element_by_css_selector('#draggable')
target = browser.find_element_by_css_selector('#droppable')
actions = ActionChains(browser)
actions.drag_and_drop(source, target)
actions.perform()

First we open a page with a drag-and-drop demo, then select the node to be dragged and the target node to drag it to. Next we create an ActionChains object and assign it to the variable actions, call its drag_and_drop() method, and finally call perform() to execute the action, which completes the drag operation.
Figure: Action chain example
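Each ActionChains method queues an action and returns the chain itself, so calls can be strung together, and nothing is sent to the browser until perform() is called. As a sketch, the drag can also be written out with lower-level mouse actions, which should have roughly the same effect as drag_and_drop(). The function name drag_step_by_step is hypothetical, and actions is expected to be an ActionChains(browser) instance.

```python
def drag_step_by_step(actions, source, target):
    """Drag source onto target using the individual mouse actions."""
    (
        actions
        .click_and_hold(source)    # press the left button on the source node
        .move_to_element(target)   # move the pointer to the middle of the target
        .release()                 # let go of the button at the current position
        .perform()                 # run the queued actions in order
    )
```

Writing the steps out this way leaves room to insert extra actions in the middle of the chain, such as a pause or an intermediate mouse move, when a page reacts badly to an instant drag.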

JavaScript execution

Selenium's API does not cover every operation. Scrolling the page down with the scroll bar, for example, has no dedicated method, but it can be simulated by running JavaScript directly through the execute_script() method:

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.zhihu.com/explore')
browser.execute_script('window.scrollTo(0, document.body.scrollHeight)')
browser.execute_script('alert("To Bottom")')

Here execute_script() scrolls the page to the bottom and then pops up an alert box. With this method, essentially anything the API does not provide can be achieved by executing JavaScript.
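execute_script() can also hand a value back to Python: when the script contains a return statement, its result becomes the return value of the call. The sketch below uses this to keep scrolling until the page height stops growing, which may be useful on pages that load more content as you scroll. The function name scroll_to_page_end and the parameters pause and max_rounds are hypothetical, and the default values are arbitrary.

```python
import time


def scroll_to_page_end(browser, pause=1.0, max_rounds=10):
    """Scroll down repeatedly until the page height stops growing.

    Returns the final page height reported by the browser.
    """
    height = browser.execute_script('return document.body.scrollHeight')
    for _ in range(max_rounds):
        browser.execute_script('window.scrollTo(0, document.body.scrollHeight)')
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = browser.execute_script('return document.body.scrollHeight')
        if new_height == height:
            break  # nothing new was loaded
        height = new_height
    return height
```

The max_rounds limit keeps the loop from running forever on a page that never stops loading.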

Origin blog.csdn.net/hy592070616/article/details/93759895