(First published anywhere, 2020) How crawler engineers can correctly remove window.navigator.webdriver in Selenium

In the article "One Skill a Day: How to Correctly Remove the Value of window.navigator.webdriver in Selenium", we introduced a method that, at the time, correctly removed window.navigator.webdriver from a Chrome browser started by Selenium.

Later, Chrome was upgraded and that method stopped working, as shown below:

[Screenshot: the old method no longer working after the Chrome upgrade]

So how should we properly hide this property in the latest version of Chrome?

In that article, I criticized one approach as self-deception:

Open the web page, then hide the value of window.navigator.webdriver by executing the following JavaScript statement:

Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })

I called this method self-deception because the JavaScript runs only after the web page has loaded. By then, the site's own JavaScript has already read window.navigator.webdriver and knows you are using an automated browser, so what is the point of hiding it?

So if this JavaScript statement is to be executed at all, it must run before the browser runs any of the JavaScript that ships with the site.

That is exactly what today's approach does.
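In Selenium terms, the criticized approach looks roughly like the sketch below. The function name hide_webdriver_too_late is made up for illustration, and driver is assumed to be a Selenium WebDriver (or anything with the same get and execute_script methods). The point is only the order of the two calls.

```python
# The override from the statement above, kept as a Python string.
HIDE_WEBDRIVER_JS = """
Object.defineProperty(navigator, 'webdriver', {
  get: () => undefined
})
"""


def hide_webdriver_too_late(driver, url):
    """Illustrates the flawed ordering: load the page first, patch afterwards."""
    # 1. The page loads and the site's own JavaScript runs, so it can
    #    already read navigator.webdriver at this point.
    driver.get(url)
    # 2. Only now is the property overridden, which the site no longer cares about.
    driver.execute_script(HIDE_WEBDRIVER_JS)
```

Any detection script that reads the property during page load has finished before step 2 runs, which is why the override has to be delivered earlier.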

Some readers may think of writing a Chrome extension, so that the JavaScript in the extension runs right after the page is opened and before the site's own JavaScript runs.

That would solve the problem, but it is a bit cumbersome. Today's method is much simpler: use Google's Chrome DevTools Protocol, or CDP for short.

Open the official CDP documentation [1] and you will find the following command:

[Screenshot: the Page.addScriptToEvaluateOnNewDocument entry in the CDP documentation]

"Run the given script just before each Frame is opened and before running the Frame's script."

With this command, we can hand Chrome a piece of JavaScript, and Chrome will execute it right after each page is opened, before running the code that ships with the site.

So how do you call a CDP command from Selenium? It is actually quite simple: use driver.execute_cdp_cmd. According to Selenium's official documentation [2], you just pass in the CDP command to call and its parameters:

[Screenshot: the execute_cdp_cmd entry in the Selenium documentation]

So we can write the following code:

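from selenium.webdriver import Chrome

# Selenium 3 style: the first positional argument is the ChromeDriver path.
# Newer Selenium 4 releases take the path through a Service object instead.
driver = Chrome('./chromedriver')
# Register the script before navigating, so Chrome runs it ahead of the site's own JavaScript.
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
  "source": """
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })
  """
})
driver.get('http://exercise.kingname.info')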

The result is shown in the figure below:

[Screenshot: the result of running the code]

window.navigator.webdriver is hidden perfectly. Moreover, the key statement:

driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
  "source": """
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })
  """
})

It only needs to be executed once. As long as you do not close the window opened by the driver, no matter how many URLs you open, Chrome will run this statement before any of the site's own JavaScript and hide window.navigator.webdriver.
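To make the "execute it once" point concrete, here is a minimal sketch that wraps the key statement in a helper. The names install_webdriver_patch and remove_webdriver_patch are hypothetical, and driver is assumed to be a Selenium Chrome driver exposing execute_cdp_cmd. The sketch also assumes that the call returns the CDP response as a dict containing the "identifier" field described in the CDP documentation. Page.removeScriptToEvaluateOnNewDocument is the CDP counterpart command for undoing the registration.

```python
HIDE_WEBDRIVER_JS = """
Object.defineProperty(navigator, 'webdriver', {
  get: () => undefined
})
"""


def install_webdriver_patch(driver):
    """Register the override once; Chrome replays it for every new document."""
    result = driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument",
        {"source": HIDE_WEBDRIVER_JS},
    )
    # Assumption: the response is a dict with an "identifier" key.
    # .get() keeps the sketch from failing if it is not.
    return (result or {}).get("identifier")


def remove_webdriver_patch(driver, identifier):
    """Undo the registration made by install_webdriver_patch."""
    driver.execute_cdp_cmd(
        "Page.removeScriptToEvaluateOnNewDocument",
        {"identifier": identifier},
    )
```

Typical use would be to call install_webdriver_patch(driver) right after creating the driver and then call driver.get(...) as many times as needed in the same window.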

If running this article's code produces the following error:

[Screenshot of the error message]

then please upgrade your ChromeDriver. Older versions of Chrome + ChromeDriver can only use the previous method, not today's. Newer versions of Chrome + ChromeDriver can use today's method, but not the old one. As the saying goes:

"When God closes a door for you, he quietly opens a window for you."

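When deciding whether an upgrade is needed, it can help to see which versions are actually in play. The sketch below reads them from a Selenium capabilities dict (driver.capabilities). The helper name chrome_versions is hypothetical, and the keys "browserVersion", "version" and "chrome" / "chromedriverVersion" are assumed to be what your ChromeDriver reports; if they are missing, the function returns None for that entry.

```python
def chrome_versions(capabilities):
    """Return (browser_major, driver_major) from a capabilities dict."""

    def major(version):
        # "80.0.3987.106 (<build hash>)" -> 80; anything unparsable -> None
        head = version.split(".")[0].strip()
        return int(head) if head.isdigit() else None

    # Assumption: newer drivers report "browserVersion", older ones "version".
    browser = capabilities.get("browserVersion") or capabilities.get("version", "")
    driver = capabilities.get("chrome", {}).get("chromedriverVersion", "")
    return major(browser), major(driver)
```

For example, chrome_versions(driver.capabilities) on a running session gives the two major version numbers to compare before and after upgrading.
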
Although the code above achieves the goal, for better concealment you can also add two experimental options:

from selenium import webdriver

options = webdriver.ChromeOptions()
# Drop the "enable-automation" switch from the switches ChromeDriver passes to Chrome.
options.add_experimental_option("excludeSwitches", ["enable-automation"])
# Do not load Chrome's automation extension.
options.add_experimental_option('useAutomationExtension', False)
# Selenium 3 style; newer Selenium 4 releases take the driver path through a Service object.
driver = webdriver.Chrome(options=options, executable_path='./chromedriver')
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
  "source": """
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })
  """
})
driver.get('http://exercise.kingname.info')
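One way to check the result from Python rather than from the DevTools console is to ask the page for the property directly. This is only a sketch: webdriver_flag is a made-up helper name, and driver is assumed to be the Selenium driver created above. Selenium returns None when the JavaScript value is undefined.

```python
def webdriver_flag(driver):
    """Return what page scripts currently see for navigator.webdriver."""
    # A JavaScript `undefined` comes back to Python as None.
    return driver.execute_script("return navigator.webdriver")


def is_webdriver_hidden(driver):
    """True if the property reads as undefined in the current page."""
    return webdriver_flag(driver) is None
```

Calling is_webdriver_hidden(driver) after driver.get(...) should return True if the override was registered before navigation.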

References

[1] Official CDP documentation: https://chromedevtools.github.io/devtools-protocol/tot/Page#method-addScriptToEvaluateOnNewDocument

[2] Official Selenium documentation: https://www.selenium.dev/selenium/docs/api/py/webdriver_chrome/selenium.webdriver.chrome.webdriver.html#selenium.webdriver.chrome.webdriver.WebDriver.execute_cdp_cmd

Source articles (by Qingnan):
(latest version) How to correctly remove window.navigator.webdriver in Selenium
One skill a day: How to correctly remove the value of window.navigator.webdriver in Selenium


Origin blog.csdn.net/weixin_41173374/article/details/104686243