Simulating browser access with Selenium

Table of contents

1. Introduction to Selenium

 1.1 What is Selenium

 1.2 Why use Selenium

 1.3 Installing Selenium

  1.3.1 Downloading the Chrome driver (chromedriver)

  1.3.2 Mapping between chromedriver and Chrome versions

  1.3.3 Checking the Chrome version

  1.3.4 Installing the selenium library

2. Using Selenium

 2.1 Steps for using Selenium

 2.2 Example

3. Selenium element operations

 3.1 Locating elements

 3.2 Accessing element information

 3.3 Interacting with the page

4. Headless Chrome (browser without a UI)

 4.1 Basic headless configuration: creating the browser object

 4.2 Wrapping the headless setup in a function


1. Introduction to Selenium

 1.1 What is Selenium

(1) Selenium is a tool for testing web applications.

(2) Selenium tests run directly in the browser, just as if a real user were operating it.

(3) It drives real browsers through browser-specific drivers (FirefoxDriver, InternetExplorerDriver, OperaDriver, ChromeDriver) to carry out the tests.

(4) Selenium also supports headless browsers, that is, browsers running without a visible interface.

 1.2 Why use Selenium

        Selenium drives a real browser, so the JavaScript in a web page is executed automatically and dynamically loaded content ends up in the page source that the program reads.

 1.3 Installing Selenium

  1.3.1 Downloading the Chrome driver (chromedriver)

        (1) Download address: http://chromedriver.storage.googleapis.com/index.html

        (2) On Windows (32-bit or 64-bit), download win32.zip.

        (3) Unzip the download to get chromedriver.exe.

        (4) Place chromedriver.exe in the directory of the Python program.

  1.3.2 Mapping between chromedriver and Chrome versions

Each chromedriver release supports only a certain range of Chrome versions. For example, chromedriver v2.37 supports Chrome v64-66, and v2.36 supports v63-65. A full mapping table (updated to chromedriver v2.46) is available at https://blog.csdn.net/huilan_same/article/details/51896672
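
The mapping table covers the older 2.x driver releases. For newer releases (ChromeDriver 70 and later), the driver's major version number is meant to match the major version of Chrome it supports, so the pairing can be checked with a simple comparison. The sketch below assumes that newer numbering scheme; `major_version` and `versions_match` are hypothetical helper names, and the version strings are illustrative only.

```python
def major_version(version: str) -> int:
    """Return the leading number of a dotted version string."""
    return int(version.strip().split('.')[0])


def versions_match(chrome_version: str, driver_version: str) -> bool:
    """True when Chrome and chromedriver share the same major version."""
    return major_version(chrome_version) == major_version(driver_version)


# Illustrative version strings, not taken from a real installation
print(versions_match('114.0.5735.199', '114.0.5735.90'))  # True
print(versions_match('115.0.5790.102', '114.0.5735.90'))  # False
```

If the two major versions differ, downloading the driver build that matches the installed browser is usually the first thing to try.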

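  1.3.3 Checking the Chrome version

        Open chrome://version in the Chrome address bar, or choose Help > About Google Chrome from the browser menu.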

  1.3.4 Installing the selenium library

        Run pip install selenium from the command line.
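
One way to confirm that the library is available to the Python interpreter being used is to ask the standard library for the installed version. This is a minimal sketch; `installed_version` is a hypothetical helper name, and it relies on `importlib.metadata`, which requires Python 3.8 or later.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


selenium_version = installed_version('selenium')
if selenium_version is None:
    print('selenium is not installed')
else:
    print('selenium', selenium_version)
```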

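2. Using Selenium

 2.1 Steps for using Selenium

(1) Import:

        from selenium import webdriver

        from selenium.webdriver.chrome.service import Service

(2) Create a Chrome browser object:

        path = path to the Chrome driver file

        browser = webdriver.Chrome(service=Service(path))

        (Selenium 3 accepted webdriver.Chrome(path) directly. Selenium 4 deprecated that form and later removed it, so the path is passed through a Service object.)

(3) Visit a URL:

        url = URL to visit

        browser.get(url)

 2.2 Example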

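# (1) Import selenium
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# (2) Create the browser object (pointing at the browser driver)
path = 'chromedriver.exe'

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service(path))

# (3) Visit a website
url = 'https://www.baidu.com/'

url_jd = 'https://www.jd.com/'

# Open the URL in the simulated browser
browser.get(url_jd)

# Get the page source
content = browser.page_source
print(content)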
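
A browser started by Selenium keeps running until `quit()` is called, including when the script fails halfway. One way to make sure `quit()` always runs is a small context manager. The sketch below is an illustration, not part of Selenium: `managed_browser` and `fetch_page_source` are hypothetical names, and the selenium imports are placed inside the function only so that the helper can be tried without a browser installed.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(make_browser):
    """Create a browser with make_browser() and always call quit() afterwards."""
    browser = make_browser()
    try:
        yield browser
    finally:
        # Runs on normal exit and also when the body raises
        browser.quit()


def fetch_page_source(url, driver_path='chromedriver.exe'):
    """Open url in Chrome and return the rendered page source."""
    # Imported here so managed_browser stays usable without selenium
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    def make_chrome():
        return webdriver.Chrome(service=Service(driver_path))

    with managed_browser(make_chrome) as browser:
        browser.get(url)
        return browser.page_source
```

Calling `fetch_page_source('https://www.jd.com/')` would then open the page, return its source, and close the browser even if `get` raises.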

 

3. Selenium element operations

 3.1 Locating elements

   Selenium simulates mouse and keyboard actions on page elements, such as clicking and typing. Before an element can be operated on, it has to be located, and WebDriver provides several methods for locating elements.

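# Locating elements (get objects by the value of a tag attribute)
# find_element  returns a single element object
# find_elements returns a list of element objects
# Assumes `browser` is the webdriver.Chrome object created earlier
from selenium.webdriver.common.by import By

# By id
button1 = browser.find_element(by=By.ID, value='su')

# By the value of the name attribute
button2 = browser.find_element(by=By.NAME, value='wd')

# By XPath
button3 = browser.find_element(by=By.XPATH, value='//input[@id="su"]')

# By tag name (returns a list of every <input> element)
button4 = browser.find_elements(by=By.TAG_NAME, value='input')

# By CSS selector (the same selector syntax that bs4 uses)
button5 = browser.find_element(by=By.CSS_SELECTOR, value='#su')

# By the text of a link on the current page
button6 = browser.find_element(by=By.LINK_TEXT, value='地图')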
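
The two methods differ in how they report a missing element: `find_element` raises `NoSuchElementException`, while `find_elements` returns an empty list. The sketch below uses that difference to look up an element that may or may not exist. `find_or_none` is a hypothetical helper name, and the helper only assumes that the object passed in has a `find_elements` method.

```python
def find_or_none(browser, by, value):
    """Return the first element matching the locator, or None when nothing matches."""
    # find_elements returns an empty list (no exception) when there is no match
    matches = browser.find_elements(by=by, value=value)
    return matches[0] if matches else None


# Usage with a real browser object (requires selenium):
#   from selenium.webdriver.common.by import By
#   box = find_or_none(browser, By.ID, 'kw')
#   if box is None:
#       print('search box not found')
```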

 3.2 Accessing element information

        (1) Get the value of a tag attribute

        (2) Get the tag name

        (3) Get the element text

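from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

path = 'chromedriver.exe'

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service(path))

url = 'https://www.baidu.com'

browser.get(url)

# Named search_button rather than input, which would shadow the built-in
search_button = browser.find_element(by=By.ID, value='su')

# 1. Get the value of a tag attribute
print(search_button.get_attribute('class'))

# 2. Get the tag name
print(search_button.tag_name)

# 3. Get the element text
a = browser.find_element(by=By.LINK_TEXT, value='新闻')
print(a.text)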
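
The three kinds of information can also be gathered into one dictionary, which is convenient when inspecting several elements. This is a minimal sketch: `describe_element` is a hypothetical helper name, and the default attribute names are an arbitrary choice. It uses only `tag_name`, `text`, and `get_attribute`, the members shown in the example.

```python
def describe_element(element, attributes=('id', 'class', 'href')):
    """Collect tag name, visible text and chosen attributes of an element into a dict."""
    info = {'tag': element.tag_name, 'text': element.text}
    for name in attributes:
        # get_attribute returns None when the element has no such attribute
        info[name] = element.get_attribute(name)
    return info


# Usage with a real element (requires selenium):
#   link = browser.find_element(by=By.LINK_TEXT, value='新闻')
#   print(describe_element(link))
```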

 3.3 Interacting with the page

        

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time

# Create the browser object
path = 'chromedriver.exe'

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service(path))

# URL to visit
url = 'https://www.baidu.com'

browser.get(url)

time.sleep(2)

# Get the search text box
search_box = browser.find_element(by=By.ID, value='kw')

# Type into the text box
search_box.send_keys('周杰伦')
time.sleep(2)

# Get the Baidu search button
button = browser.find_element(by=By.ID, value='su')

# Click the button
button.click()
time.sleep(2)

# Scroll to the bottom of the page
js_bottom = 'document.documentElement.scrollTop=100000'
browser.execute_script(js_bottom)
time.sleep(2)

# Get the "next page" button
next_button = browser.find_element(by=By.XPATH, value='//a[@class="n"]')

# Click "next page"
next_button.click()
time.sleep(2)

# Go back to the previous page
browser.back()
time.sleep(2)

# Go forward again
browser.forward()
time.sleep(3)

# Quit the browser
browser.quit()
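
The fixed `time.sleep` pauses either waste time on a fast page or are too short on a slow one. A common alternative is to poll until a condition holds. Selenium ships `WebDriverWait` (in `selenium.webdriver.support.ui`) for this purpose; the sketch below is a hypothetical standalone helper, `wait_until`, that shows the same idea without depending on selenium.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        time.sleep(interval)


# Usage with a real browser object (requires selenium):
#   buttons = wait_until(lambda: browser.find_elements(by=By.ID, value='su'))
#   buttons[0].click()
```

With Selenium's own classes, the equivalent wait would look like `WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.ID, 'su')))`, where `EC` is `selenium.webdriver.support.expected_conditions`.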

4. Headless Chrome (browser without a UI)

        Headless mode was added to Chrome in version 59. It lets a program use Chrome without opening the browser window, while pages behave the same as in regular Chrome.

 4.1 Basic headless configuration: creating the browser object

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')

# path is the location of the Chrome browser executable
path = r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
chrome_options.binary_location = path

# The keyword is options; the older chrome_options keyword is deprecated
browser = webdriver.Chrome(options=chrome_options)
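
The headless switches are ordinary Chrome command-line arguments, so they can be built as a plain list and added one by one. The sketch below also sets an explicit window size with Chrome's `--window-size` switch, on the assumption that a larger viewport than the headless default is wanted. `headless_arguments` is a hypothetical helper name and the 1920x1080 size is an arbitrary example.

```python
def headless_arguments(window_size=(1920, 1080), disable_gpu=True):
    """Build the list of Chrome command-line switches for a headless session."""
    width, height = window_size
    args = ['--headless', f'--window-size={width},{height}']
    if disable_gpu:
        args.append('--disable-gpu')
    return args


print(headless_arguments())

# Usage with selenium's Options object:
#   chrome_options = Options()
#   for arg in headless_arguments():
#       chrome_options.add_argument(arg)
```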

 4.2 Wrapping the headless setup in a function

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def share_browser():
    """Create and return a headless Chrome browser object."""
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')

    # path is the location of the Chrome browser executable
    path = r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
    chrome_options.binary_location = path

    # The keyword is options; the older chrome_options keyword is deprecated
    browser = webdriver.Chrome(options=chrome_options)
    return browser

browser = share_browser()
url = 'https://www.baidu.com'
browser.get(url)
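
Because no window appears in headless mode, it can help to confirm that the page really loaded. One simple check is to read the page title and save a screenshot, both of which a Selenium browser object supports through `title` and `save_screenshot`. In the sketch below, `snapshot` is a hypothetical helper name and `page.png` is an arbitrary file name.

```python
def snapshot(browser, image_path='page.png'):
    """Return the page title and whether a screenshot was written to image_path."""
    saved = browser.save_screenshot(image_path)
    return browser.title, saved


# Usage with the headless browser created by share_browser():
#   title, saved = snapshot(browser, 'baidu.png')
#   print(title, saved)
#   browser.quit()
```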

Origin blog.csdn.net/weixin_44302046/article/details/126774306