Table of contents
Windows selenium configuration
Chrome Chromedriver Version Correspondence
Download and install third-party libraries (the simplest version)
Create a screenshot png view in the directory
Getting Selenium to run in headed mode in Linux
Windows selenium configuration
Download links (click to open)
Selenium
ChromeDriver
Chrome
GeckoDriver
Firefox
Correspondence between Chrome and ChromeDriver versions
We maintain multiple versions of ChromeDriver. Which version you choose depends on the version of Chrome you are using.
- Specifically, ChromeDriver uses the same version number scheme as Chrome. See https://www.chromium.org/developers/version-numbers for more details.
- Each version of ChromeDriver supports Chrome with the same major, minor, and build numbers. For example, ChromeDriver 73.0.3683.20 supports all Chrome versions starting with 73.0.3683.
- Before a new major version of Chrome goes into beta, a matching version of ChromeDriver is released.
- After the initial release of a new major version, we will release patches as needed. These patches may or may not coincide with updates to the Chrome browser.
Here are the steps to choose which version of ChromeDriver to download:
- First, find out which version of the Chrome browser you're using. Let's say your Chrome is 72.0.3626.81.
- Take the Chrome version number, drop the last part, and append the result to the URL "https://chromedriver.storage.googleapis.com/LATEST_RELEASE_". For example, using Chrome version 72.0.3626.81, you will get a URL "https://chromedriver.storage.googleapis.com/LATEST_RELEASE_72.0.3626".
- Use the URL created in the last step to retrieve a small file containing the version of ChromeDriver to use. For example, the above URL will result in a file containing "72.0.3626.69". (Of course, actual numbers may change in the future).
- Use the version number obtained from the previous step to construct the URL to download ChromeDriver. For version 72.0.3626.69, the URL will be "https://chromedriver.storage.googleapis.com/index.html?path=72.0.3626.69/".
- After the initial download, it is recommended that you go through the above process occasionally to see if there are any bugfix releases.
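The selection steps above can be sketched in Python. This is a minimal illustration rather than part of the ChromeDriver tooling: the function names are hypothetical, the code only builds the URLs described in the steps (it downloads nothing), and the endpoints are taken from the text above, so they may not apply to newer Chrome releases.

```python
BASE_URL = "https://chromedriver.storage.googleapis.com"


def build_prefix(version: str) -> str:
    """Drop the last part of a four-part version: '72.0.3626.81' -> '72.0.3626'."""
    parts = version.strip().split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"unexpected version string: {version!r}")
    return ".".join(parts[:3])


def latest_release_url(chrome_version: str) -> str:
    """URL of the small file that names the ChromeDriver version to use."""
    return f"{BASE_URL}/LATEST_RELEASE_{build_prefix(chrome_version)}"


def download_page_url(driver_version: str) -> str:
    """URL of the download page for one specific ChromeDriver version."""
    return f"{BASE_URL}/index.html?path={driver_version}/"


def is_compatible(chrome_version: str, driver_version: str) -> bool:
    """True when the major, minor, and build numbers are identical."""
    return build_prefix(chrome_version) == build_prefix(driver_version)


if __name__ == "__main__":
    print(latest_release_url("72.0.3626.81"))
    print(download_page_url("72.0.3626.69"))
```

Keeping the prefix logic in one function means the same rule drives both the URL construction and the compatibility check.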
Hands-on practice
Element operations
1. .send_keys()  # type text into the element
2. .click()  # click the element
3. .clear()  # clear the element's contents
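As a small sketch of how the three methods are usually combined, the helper below clears an input, types into it, and clicks a button. The function name is hypothetical, and the arguments are assumed to be elements already located with driver.find_element.

```python
def fill_and_submit(input_element, button_element, text):
    """Replace the input's contents with `text`, then click the button."""
    input_element.clear()          # remove any text already in the field
    input_element.send_keys(text)  # type the new text
    button_element.click()         # trigger the submit action
```

With a real driver this might be called as fill_and_submit(driver.find_element(By.ID, "search-box"), driver.find_element(By.ID, "search-button"), "selenium"), where both ids are placeholders for whatever the target page uses.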
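Browser operations
1. driver.maximize_window()  # maximize the browser window
2. driver.set_window_size(w, h)  # set the window size, in pixels [good to know]
3. driver.set_window_position(x, y)  # set the window position [good to know]
4. driver.back()  # go back
5. driver.forward()  # go forward
6. driver.refresh()  # refresh the page
7. driver.close()  # close the current main window (the main window is the one opened by default at startup)
8. driver.quit()  # close every window opened by the driver object
9. driver.title  # get the title of the current page
10. driver.current_url  # get the URL of the current page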
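The navigation calls above can be exercised together. The following is only a sketch: the function name is hypothetical, `driver` is assumed to be an already started WebDriver, and pages that redirect may report different URLs than the ones passed in.

```python
def visit_two_pages(driver, first_url, second_url):
    """Open two pages, step back and forward, and return the URLs seen."""
    driver.maximize_window()
    driver.get(first_url)
    driver.get(second_url)
    driver.back()                      # history moves back to the first page
    url_after_back = driver.current_url
    driver.forward()                   # history moves forward to the second page
    url_after_forward = driver.current_url
    return url_after_back, url_after_forward
```

Call driver.quit() once you are done so that every window started by the driver is closed.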
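Getting element information
1. text: the element's text, e.g. element.text
2. size: the element's size, e.g. element.size
3. get_attribute: the value of an element attribute, e.g. element.get_attribute("id"); the argument is the attribute name
4. is_displayed: whether the element is visible, e.g. element.is_displayed()
5. is_enabled: whether the element is enabled, e.g. element.is_enabled()
6. is_selected: whether the element is selected, e.g. element.is_selected()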
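One way to see all six members at once is to collect them into a dictionary. This is a minimal sketch with a hypothetical helper name; it assumes `element` is a WebElement returned by driver.find_element.

```python
def describe_element(element):
    """Collect the commonly inspected properties of one element."""
    return {
        "text": element.text,                  # visible text
        "size": element.size,                  # dict with 'width' and 'height'
        "id": element.get_attribute("id"),     # value of the id attribute
        "displayed": element.is_displayed(),   # visible on the page?
        "enabled": element.is_enabled(),       # can it be interacted with?
        "selected": element.is_selected(),     # for checkboxes, radios, options
    }
```

Note that text and size are properties, while the other four are method calls.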
Mouse operations
1. context_click(element)  # right-click
2. double_click(element)  # double-click
3. drag_and_drop(source, target)  # drag and drop
4. move_to_element(element)  # hover [important]
5. perform()  # execute the actions queued above [important]
Hands-on demo
# demo
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
# options.add_argument('--proxy-server=http://{0}'.format(ip))
driver = webdriver.Chrome(options=options)
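# navigator.webdriver is false for a normal user visit but true when Selenium drives the browser.
# The snippet below works around this by making the property return undefined.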
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
"source": """
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
})
"""
})
driver.get("https://www.baidu.com/")
time.sleep(5)
# Take a screenshot to confirm that Baidu was loaded
driver.save_screenshot("baidu.png")
Adding a proxy to Selenium
Whatever kind of crawler you write, you will need a proxy. Even with browser automation, a single IP address cannot realistically make thousands or tens of thousands of visits a day.
# Add a proxy without authentication, passed as a command-line argument
chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument('--proxy-server=http://ip:port')
driver = webdriver.Chrome(options=chromeOptions)
You need to obtain the IP and port yourself, typically from a proxy platform's API, and substitute them into the argument above.
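A small helper can keep the proxy string in one place. This is a sketch under assumptions: the names proxy_argument and make_driver are hypothetical, and the IP and port are expected to come from your own proxy provider.

```python
def proxy_argument(ip: str, port: int, scheme: str = "http") -> str:
    """Build the Chrome switch for a proxy that needs no authentication."""
    if not 0 < int(port) < 65536:
        raise ValueError(f"invalid port: {port!r}")
    return f"--proxy-server={scheme}://{ip}:{port}"


def make_driver(ip: str, port: int):
    """Start Chrome through the given proxy (needs selenium and chromedriver)."""
    # Imported here so proxy_argument can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(ip, port))
    return webdriver.Chrome(options=options)
```

For example, proxy_argument("127.0.0.1", 8080) yields the same kind of string that is passed to add_argument above.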
Linux selenium configuration
Check server environment
[root@aa /]# lsb_release -a
Distributor ID: CentOS
Release: 7.9.2009
[root@aa /]# python -V
Python 2.7.5
[root@aa /]# python3 -V
Python 3.6.8
Download and install third-party libraries (the simplest version)
# install selenium
pip3 install selenium
# Install Chrome and its dependencies
yum install https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm
yum install mesa-libOSMesa-devel gnu-free-sans-fonts wqy-zenhei-fonts
# Download the matching version of ChromeDriver (the URL below is for version 110.0.5481.30)
# https://chromedriver.storage.googleapis.com/index.html?path=110.0.5481.30/
# Move it into place
mv chromedriver /usr/bin/
# Give execute permission
chmod +x /usr/bin/chromedriver
Hands-on practice
Code test
# demo
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
#options = webdriver.ChromeOptions()
#options.add_argument('--headless')
options = webdriver.ChromeOptions()
# Run headless on the server, otherwise an error is raised; a later section sets up a tool to work around this
options.add_argument("headless")
options.add_argument('--no-sandbox')
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
# options.add_argument('--proxy-server=http://{0}'.format(ip))
driver = webdriver.Chrome(options=options)
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
"source": """
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
})
"""
})
driver.get("https://www.baidu.com/")
time.sleep(5)
# Take a screenshot to confirm that Baidu was loaded
driver.save_screenshot("aaaaaaaaaaaaaaaaaa.png")
Check the directory for the generated screenshot PNG
Getting Selenium to run in headed mode in Linux
Introduction to Xvfb
Xvfb implements the X11 display server protocol on a machine without a display device. It provides the interfaces that other graphical environments offer, but has no real graphical output.
So when a program performs GUI-related operations under Xvfb, those operations run in virtual memory and nothing is shown on any screen.
With Xvfb, we can trick Selenium or Puppeteer into thinking it is running on a system with a graphical interface, so that headed mode works normally.
# Install
yum install Xvfb
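Hands-on test
# Modify the demo
# This option made Chrome run headless on the server, because it raised an error otherwise
# Comment it out to run in normal headed mode
# options.add_argument("headless")
xvfb-run <command>
# For example
xvfb-run python3 selenium_test.py
Run it and check the screenshot: success.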
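If you prefer to start the test from Python rather than typing the command by hand, a thin wrapper can do it. This is only a sketch: the function names are hypothetical, and it assumes xvfb-run is on the PATH and that the script is named as in the example above.

```python
import shutil
import subprocess


def xvfb_command(script: str, python: str = "python3") -> list:
    """Return the argument list for running `script` under xvfb-run."""
    return ["xvfb-run", python, script]


def run_headed(script: str) -> int:
    """Run `script` inside a virtual display and return its exit code."""
    if shutil.which("xvfb-run") is None:
        raise RuntimeError("xvfb-run not found; install Xvfb first")
    return subprocess.run(xvfb_command(script), check=False).returncode
```

Calling run_headed("selenium_test.py") is then equivalent to the shell command shown above.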
----------
2023.2.20