Selenium common commands

Chrome remote debugging

To keep the browser from being detected as controlled by an automation program, we can start Chrome with remote debugging enabled and have Selenium attach to that already-running browser, which helps bypass this kind of monitoring.

First, use cmd to start chrome.exe with additional startup parameters:

# If chrome.exe is not on the PATH, run it from its installation directory
cd C:\Program Files (x86)\Google\Chrome\Application
chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\selenum\AutomationProfile" --profile-directory="Profile 1"
  • Make sure the port is not already in use
  • user-data-dir specifies the directory that holds the profile data; it can be omitted when using the default user
  • profile-directory specifies the browser user (profile); it can be omitted for the default user
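
The port check in the first bullet can be done from Python before launching Chrome. The following is a minimal sketch that reuses the port and paths from the command above; `is_port_free` and `build_chrome_command` are hypothetical helper names introduced here for illustration, not part of Selenium or Chrome.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # bind failed, so the port is already occupied
    return True


def build_chrome_command(chrome_path, port, user_data_dir=None, profile_directory=None):
    """Build the argument list for starting Chrome with remote debugging."""
    command = [chrome_path, f"--remote-debugging-port={port}"]
    if user_data_dir:
        command.append(f"--user-data-dir={user_data_dir}")
    if profile_directory:
        command.append(f"--profile-directory={profile_directory}")
    return command


command = build_chrome_command(
    "chrome.exe",
    9222,
    user_data_dir=r"C:\selenum\AutomationProfile",
    profile_directory="Profile 1",
)
print(command)
print("port 9222 free:", is_port_free(9222))
```

Once the port check passes, the list can be handed to subprocess.Popen(command) to start the browser.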

Common Chrome configuration settings

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')  # Works around the "DevToolsActivePort file doesn't exist" error
chrome_options.add_argument('--disable-gpu')  # Google's documentation mentions adding this flag to work around a bug
chrome_options.add_argument('blink-settings=imagesEnabled=false')  # Do not load images, to speed things up
chrome_options.add_argument('--headless')  # No visible browser window; on Linux without a display, startup fails without this
# Add a User-Agent
chrome_options.add_argument('user-agent="MQQBrowser/26 Mozilla/5.0 (Linux; U; Android 2.3.7; zh-cn; MB200 Build/GRJ22; CyanogenMod-7) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"')
chrome_options.add_argument('--proxy-server=http://')  # Use a proxy IP for the browser
# Attach to the browser through the remote debugging port, to avoid being detected as automation-controlled.
# Launch arguments only take effect when Selenium starts Chrome itself, not when attaching to a running browser.
chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
# Disable browser notification pop-ups
prefs = {
    'profile.default_content_setting_values': {
        'notifications': 2,
    }
}
chrome_options.add_experimental_option('prefs', prefs)
# Disable plugins
chrome_options.add_argument('--disable-plugins')
# Disable the pop-up blocker
chrome_options.add_argument('--disable-popup-blocking')

# Create the WebDriver object (usually named wd), point it at the Chrome driver, and pass in the options above
wd = webdriver.Chrome(
    executable_path=r'C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe',
    options=chrome_options,
)
  • executable_path is the path to the downloaded chromedriver.exe; if chromedriver.exe is placed in the Python environment, this option can be omitted

Webdriver common instructions

  • Set the browser window size
# Maximize the window
wd.maximize_window()
# Set the window to a specific size
wd.set_window_size(1920, 1080)
  • Let the browser open the specified URL
wd.get('https://www.baidu.com')
  • Set the maximum waiting time
from selenium.webdriver.support.ui import WebDriverWait
wait = WebDriverWait(wd, 15)  # Waits up to 15 seconds when used with wait.until(...)
  • Close the browser window
wd.close()  # Close the current window
wd.quit()   # Quit the browser

View and select elements in the browser

To find the characteristics of web elements, use the browser's developer tools to view and select them. (Right-click an element and choose Inspect to see the HTML element that corresponds to it on the page.)

Locating elements

wd = webdriver.Chrome()

With the statement above, we assign a WebDriver object to wd, which we can then use to control the browser: opening URLs, selecting page elements, and so on.

The code below uses the find_element_by_id method of the WebDriver object to select the element whose id is kw:

wd.find_element_by_id('kw')

The WebDriver object offers many ways to locate elements, including:

# Select elements by their id attribute
find_element_by_id

# Select elements by tag name or class attribute
find_element_by_tag_name
find_element_by_class_name

# Select elements by CSS selector
find_element_by_css_selector

# Select elements by XPath
find_element_by_xpath
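
Each of these methods is a thin wrapper around the generic find_element(by, value) call, which is the form to use on Selenium 4.3 and later, where the find_element_by_* shortcuts no longer exist. A minimal sketch of that mapping follows. The `locate` helper and the `LOCATOR_STRATEGIES` table are names made up for this example, and the strategy strings are the values behind Selenium's By.ID, By.TAG_NAME, By.CLASS_NAME, By.CSS_SELECTOR, and By.XPATH constants.

```python
# Locator strategy strings, equal to the By.* constants in selenium.webdriver.common.by
LOCATOR_STRATEGIES = {
    "id": "id",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
    "xpath": "xpath",
}


def locate(driver, kind, value):
    """Find the first matching element via the generic find_element(by, value) call."""
    if kind not in LOCATOR_STRATEGIES:
        raise ValueError(f"unknown locator kind: {kind!r}")
    return driver.find_element(LOCATOR_STRATEGIES[kind], value)
```

With a real driver, locate(wd, 'id', 'kw') would do the same job as wd.find_element_by_id('kw').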

Select elements based on their id attribute

In the HTML of the element inspected above, the tag has an id attribute whose value is kw.

In HTML, the id works like an element's serial number: if an element has an id attribute, that id should be unique within the page.

Therefore, when an element has an id, looking it up by id is the simplest and most efficient way to find it.

Example:

wd.find_element_by_id('kw')

After the browser finds the element whose id is kw, it returns the result to the automation program through the browser driver, so find_element_by_id returns a WebElement object. Through this WebElement object you can operate on the corresponding page element.

# Click the element once it is found
wd.find_element_by_id('kw').click()

# Send a string to the element once it is found
wd.find_element_by_id('kw').send_keys('')

Select the element based on the element’s class attribute and tag name

In addition to the id of the element, we can also select the element based on the class attribute of the element.

Elements come in types, and the class attribute marks an element's type. Since several elements on a page may share the same class, we can use find_elements to locate all elements of that class.

For example:

wd.find_elements_by_class_name('s_ipt')

Similarly, we can use find_elements_by_tag_name to select all elements whose tag name is input:

wd.find_elements_by_tag_name('input')

Note:

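The text attribute of a WebElement object returns the text content of that element in the web page.

# find_elements returns a list, so read text from each element in turn
elements = wd.find_elements_by_class_name('s_ipt')
for element in elements:
    print(element.text)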

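If none of the methods above can locate the element, it is likely inside an iframe. The following methods switch into and out of an iframe:

# frame_reference is a placeholder: pass the iframe's id or name, its index, or its WebElement
wd.switch_to.frame(frame_reference)   # Enter an iframe (a web page embedded in the page)
wd.switch_to.parent_frame()           # Switch back to the parent frame
wd.switch_to.default_content()        # Return to the main page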
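
Because it is easy to forget to switch back out of an iframe, the switch can be wrapped in a context manager. This is only a sketch: `inside_frame` is a hypothetical helper, and it assumes the driver exposes switch_to.frame() and switch_to.default_content() the way Selenium's WebDriver does.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into an iframe for the duration of a with-block, then return to the main page."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so later lookups start from the main page
        driver.switch_to.default_content()
```

For example, a block opened with `with inside_frame(wd, 'login_frame'):` would run its lookups inside the frame; 'login_frame' is a made-up frame name.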

The difference between find_element and find_elements

find_elements selects all elements that meet the condition; if none match, it returns an empty list.

find_element selects the first element that meets the condition; if none match, it raises NoSuchElementException.
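
The empty-list behaviour of find_elements is convenient for optional elements, where an exception would only be noise. A minimal sketch, with `first_or_none` as a hypothetical helper name and the locator passed in the generic (by, value) form:

```python
def first_or_none(driver, by, value):
    """Return the first matching element, or None when nothing matches."""
    matches = driver.find_elements(by, value)  # empty list when nothing matches
    return matches[0] if matches else None
```

Called as first_or_none(wd, 'id', 'kw'), it would return the element or None instead of raising NoSuchElementException.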

Switch web window

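wd.execute_script('window.open()')  # Open a new tab
windows = wd.window_handles         # Get the handles of all windows in the current browser
wd.switch_to.window(windows[0])     # Switch to the leftmost window
wd.switch_to.window(windows[-1])    # Switch to the most recently opened window (that is, the rightmost one)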
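
Indexing window_handles with -1 relies on the newest window being last in the list, an ordering the WebDriver specification does not promise. A slightly more defensive sketch compares the handles before and after opening the tab. `open_tab_and_switch` is a hypothetical helper, and it assumes window.open() creates exactly one new window.

```python
def open_tab_and_switch(driver):
    """Open a new tab, switch to it, and return its window handle."""
    before = set(driver.window_handles)
    driver.execute_script("window.open()")
    # Whatever handle was not present before must belong to the new tab
    new_handles = [h for h in driver.window_handles if h not in before]
    if len(new_handles) != 1:
        raise RuntimeError(f"expected one new window, found {len(new_handles)}")
    driver.switch_to.window(new_handles[0])
    return new_handles[0]
```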

Submit a form

The submit() method submits a form, which is especially useful when there is no submit button.

For example, instead of pressing Enter after typing a keyword into a search box, you can submit the content of the search box with submit().

wd.find_element_by_id('query').submit()

Get page information

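wd.current_url            # Get the current URL
wd.page_source            # Get the page source
wd.title                  # Get the title of the current page
wd.delete_all_cookies()   # Delete all cookies
wd.add_cookie({'name': 'key', 'value': 'val'})  # Add a cookie ('key' and 'val' are placeholder strings)
  • Get cookies
cookies = wd.get_cookies()  # A list of dicts, one per cookie
cookie_list = []
for item in cookies:
    cookie = item['name'] + '=' + item['value']
    cookie_list.append(cookie)
cookie = ';'.join(cookie_list)  # Join into a single "name=value;name=value" string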

Implicit wait and explicit wait

Implicit wait

With an implicit wait, a failed lookup does not immediately raise an error. Instead, Selenium keeps looking for the element periodically (every half second) until it is found or the specified maximum waiting time is exceeded; only then is an exception raised (for find_elements-style methods, an empty list is returned instead).

Selenium's WebDriver object has a method called implicitly_wait, which accepts one parameter: the maximum waiting time in seconds.

# Set the default waiting time
wd.implicitly_wait(10)
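
The polling behaviour described above can be pictured with a few lines of plain Python. This illustrates the idea only and is not Selenium's actual implementation; `poll_until` and its parameters are invented for the example.

```python
import time


def poll_until(lookup, timeout, interval=0.5):
    """Call lookup() repeatedly until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = lookup()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing found within {timeout} seconds")
        time.sleep(interval)  # wait before the next attempt
```

The lookup is retried at a fixed interval and the caller only sees a failure after the full timeout has elapsed, which is the same shape as the implicit wait.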

Explicit wait

The wait shown here is a fixed one: the code pauses for a set time and then continues, using the sleep method of the time module. Unlike Selenium's WebDriverWait (created earlier with a 15-second maximum), which returns as soon as its condition is met, sleep always blocks for the full duration.

sleep accepts one parameter: the number of seconds to pause.

import time
time.sleep(1)

Origin blog.csdn.net/weixin_45609519/article/details/106695171