[selenium] Solving excessively long page load times

1. Symptom

In a selenium test, the page elements have already loaded and can be operated, but the browser's loading indicator keeps spinning and the code that follows is never executed.

2. Cause

By default, selenium waits for the page and all of its resources to finish loading before it carries on with calls such as element operations. Until the page has finished loading, the script blocks and does not continue to execute.

3. Solution

1. Set the WebDriver page load timeout (set_page_load_timeout)

The set_page_load_timeout(time) method sets the page load timeout in seconds. When loading takes longer than the set time, a TimeoutException is raised with a message such as: Timed out receiving message from renderer: time

We can combine try...except with JavaScript's window.stop() method to forcibly stop the page load after the timeout and continue with the subsequent operations.

import time
from selenium import webdriver
from selenium.common.exceptions import TimeoutException

start = time.time()

driver = webdriver.Chrome()
# Set the page load timeout (in seconds)
driver.set_page_load_timeout(5)

try:
    driver.get('https://search.damai.cn/search.html?keyword=111&spm=a2oeg.home.searchtxt.dsearchbtn')
except TimeoutException:
    # After the timeout, run JavaScript to stop the page load
    driver.execute_script('window.stop()')

end = time.time()
# Compute the elapsed page load time
print(end - start)


>>>6.229357481002808

* Setting the page load timeout only helps when the page is opened with the get() method; it does not apply when the page navigates as the result of an operation on the page.

2. Modify WebDriver's page load strategy (page_load_strategy)

With the default load strategy, WebDriver waits until the page at the requested address and all of its static resources have been downloaded.

Besides the default strategy, the eager and none strategies are available. Choosing the strategy that fits the situation shortens the wait and speeds up execution.

  • normal (default): wait for the entire page to load, including resources such as images, CSS, and JS.
  • eager: wait only until the DOM tree is ready, that is, the DOMContentLoaded event has fired because the HTML has been fully loaded and parsed. It does not wait for images, stylesheets, or subframes.
  • none: wait only until the initial HTML has been downloaded, even if parsing has not started yet.
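
In terms of the browser's document.readyState, the three strategies correspond to different readiness levels that get() waits for: normal waits for "complete", eager for "interactive", and none does not wait for a readiness state. The sketch below only records that mapping. The names READY_STATE_FOR_STRATEGY, ready_state and strategy_satisfied are illustrative helpers, not part of selenium, and driver is assumed to be any WebDriver instance.

```python
# document.readyState that get() waits for under each page load strategy.
# None means get() returns without waiting for a readiness state.
READY_STATE_FOR_STRATEGY = {
    "normal": "complete",
    "eager": "interactive",
    "none": None,
}

# readyState values in the order a document passes through them.
_STATE_ORDER = ("loading", "interactive", "complete")


def ready_state(driver):
    """Ask the browser for the current document.readyState."""
    return driver.execute_script("return document.readyState")


def strategy_satisfied(strategy, state):
    """Return True if `state` is at least what `strategy` waits for."""
    required = READY_STATE_FOR_STRATEGY[strategy]
    if required is None:
        return True
    return _STATE_ORDER.index(state) >= _STATE_ORDER.index(required)
```

For instance, if strategy_satisfied("normal", ready_state(driver)) is False right after get() returns under the eager strategy, images or other resources are still loading.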

1) The page loading strategy is normal

import time
from selenium import webdriver

start = time.time()

# The default load strategy is normal, so it does not need to be set
driver = webdriver.Chrome()
driver.get('https://search.damai.cn/search.html?keyword=111&spm=a2oeg.home.searchtxt.dsearchbtn')

end = time.time()
# Compute the elapsed page load time
print(end - start)


>>>22.998253345489502

2) The page loading strategy is eager

import time
from selenium import webdriver

start = time.time()

options = webdriver.ChromeOptions()
# Set the load strategy to eager
options.page_load_strategy = 'eager'
driver = webdriver.Chrome(options=options)
driver.get('https://search.damai.cn/search.html?keyword=111&spm=a2oeg.home.searchtxt.dsearchbtn')

end = time.time()
# Compute the elapsed page load time
print(end - start)


>>>1.859546184539795

3) The page loading strategy is none

import time
from selenium import webdriver

start = time.time()

options = webdriver.ChromeOptions()
# Set the load strategy to none
options.page_load_strategy = 'none'
driver = webdriver.Chrome(options=options)
driver.get('https://search.damai.cn/search.html?keyword=111&spm=a2oeg.home.searchtxt.dsearchbtn')

end = time.time()
# Compute the elapsed page load time
print(end - start)


>>>1.1394140720367432
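
The three measurements above repeat the same steps with a different strategy each time. One possible way to run them in a single loop is sketched below. time_page_load and chrome_factory are hypothetical helper names, and selenium is imported inside the factory so that the timing helper itself works with any object that offers get() and quit(). Unlike the scripts above, the sketch also closes the browser after each run.

```python
import time


def time_page_load(make_driver, url):
    """Start a driver with make_driver(), time driver.get(url), then quit."""
    driver = make_driver()
    try:
        start = time.perf_counter()
        driver.get(url)
        return time.perf_counter() - start
    finally:
        # Always close the browser, even if get() raised.
        driver.quit()


def chrome_factory(strategy):
    """Return a callable that starts Chrome with the given load strategy."""
    def make():
        # Imported lazily so time_page_load has no hard selenium dependency.
        from selenium import webdriver

        options = webdriver.ChromeOptions()
        options.page_load_strategy = strategy
        return webdriver.Chrome(options=options)
    return make


# Example usage (starts a real browser, so it is left commented out):
# url = 'https://search.damai.cn/search.html?keyword=111&spm=a2oeg.home.searchtxt.dsearchbtn'
# for strategy in ('normal', 'eager', 'none'):
#     print(strategy, time_page_load(chrome_factory(strategy), url))
```

The absolute numbers depend on the network and the machine, so only the relative order of the three strategies is meaningful.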

* When changing the page load strategy, combine it with explicit waits. Otherwise subsequent operations may run while the required elements are not yet interactable, resulting in an error.
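
A minimal sketch of such an explicit wait, using selenium's WebDriverWait together with the element_to_be_clickable condition. The function name open_and_wait_clickable and the locator in the usage comment are made up for illustration; the right locator depends on the page being tested.

```python
def open_and_wait_clickable(driver, url, locator, timeout=10):
    """Open url, then block until the element at locator is clickable."""
    # Imported here so the function can be copied around on its own.
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)  # returns early under the eager or none strategy
    # Polls until the element is displayed and enabled; raises
    # TimeoutException if that does not happen within `timeout` seconds.
    return WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable(locator)
    )


# Example usage (hypothetical locator):
# from selenium.webdriver.common.by import By
# button = open_and_wait_clickable(driver, url, (By.ID, 'submit'))
# button.click()
```

The wait is tied to the one element the test needs, so the script does not have to wait for the rest of the page to finish loading.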

Origin blog.csdn.net/Yocczy/article/details/132339514