Working around front-end JS detection of simulated-browser crawlers

  1. We often use Selenium + Chromedriver when developing crawlers, but a single line of JavaScript on the front end can detect the automated browser and stop the crawler cold.

          First, start the simulated browser with the following code:

          from selenium.webdriver import Chrome

          # Opens a Chrome window controlled by Chromedriver
          driver = Chrome()

Next, in the console of that browser, evaluate one line of JS:

window.navigator.webdriver

The returned result is true.

Now go back to a normal browser and run the same code.

In a normal browser the result is undefined (or false, depending on the browser version), so a website can use this property to tell that our browser is a simulated one. This is just one of the detection methods.
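The same check can also be run from Python rather than typed into the console. The following is a minimal sketch: the helper name webdriver_flag is made up for this example, it relies only on Selenium's execute_script method, and the launch part assumes Selenium and a matching Chromedriver are installed.

```python
def webdriver_flag(driver):
    """Return window.navigator.webdriver as seen by scripts on the current page."""
    # execute_script runs the snippet in the page and returns its result;
    # a JavaScript undefined or null comes back to Python as None.
    return driver.execute_script("return window.navigator.webdriver")


if __name__ == "__main__":
    # Assumption: Selenium and a matching Chromedriver are installed.
    from selenium.webdriver import Chrome

    driver = Chrome()
    try:
        print(webdriver_flag(driver))
    finally:
        driver.quit()
```

Running the helper before and after applying a workaround is a quick way to see whether the workaround had any effect on the page you actually care about.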

This matters in practice. When we cracked the encryption of a certain website and used a simulated browser to help with the decryption, the encrypted data we obtained was incorrect. Part of the reason was that the site checks the browser while encrypting, and the encryption used by certain other sites does the same.

Next, let's talk about how to solve it.

Readers who are familiar with JS will probably reach for the following line of code:

Object.defineProperties(navigator, {webdriver:{get:()=>undefined}});

After running it, navigator.webdriver returns undefined, so the problem does look solved. But what happens when we navigate to another page?

The override no longer works: it only patched the navigator object of the page that was open at the time, and each new page load starts with a fresh one. A simpler approach is to change the startup code instead. Before starting Chromedriver, set Chrome's experimental option excludeSwitches to ['enable-automation']:

 

from selenium.webdriver import Chrome
from selenium.webdriver import ChromeOptions

option = ChromeOptions()
# Exclude the enable-automation switch that Chromedriver passes to Chrome by default
option.add_experimental_option('excludeSwitches', ['enable-automation'])
driver = Chrome(options=option)
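If the JavaScript override from earlier is still wanted, one way to make it survive page loads is to register it through the Chrome DevTools Protocol, so that Chrome runs it on every new document before the page's own scripts. This goes beyond what the original post covers. It is a sketch that assumes Selenium's execute_cdp_cmd method on the Chrome driver and the DevTools command Page.addScriptToEvaluateOnNewDocument; the helper name hide_webdriver_on_every_page and the example URL are placeholders. Newer Chrome versions may need more than the excludeSwitches option alone, so it is worth checking navigator.webdriver yourself after applying either approach.

```python
# The same one-line override shown earlier in the post.
HIDE_WEBDRIVER_JS = (
    "Object.defineProperties(navigator, {webdriver:{get:()=>undefined}});"
)


def hide_webdriver_on_every_page(driver):
    """Register the override so Chrome runs it on each new document."""
    # Hypothetical helper: execute_cdp_cmd forwards a raw DevTools command
    # to the browser and returns the command's result.
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument",
        {"source": HIDE_WEBDRIVER_JS},
    )


if __name__ == "__main__":
    # Assumption: Selenium and a matching Chromedriver are installed.
    from selenium.webdriver import Chrome, ChromeOptions

    option = ChromeOptions()
    option.add_experimental_option("excludeSwitches", ["enable-automation"])
    driver = Chrome(options=option)
    try:
        hide_webdriver_on_every_page(driver)
        driver.get("https://example.com")  # placeholder URL
        # Inspect the flag on the freshly loaded page.
        print(driver.execute_script("return window.navigator.webdriver"))
    finally:
        driver.quit()
```

Registering the script once per session is enough; unlike running the override in the console, it does not have to be repeated after each navigation.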

 

 

 

Origin blog.csdn.net/zyc__python/article/details/106690641