Selenium simulated login to Zhihu (2020)

introduction

The main reasons for writing this article are:

  • I wrote many Selenium crawler articles earlier, and they helped many readers solve their problems
  • Selenium crawling has a low barrier to entry and is friendly to beginners
  • I don't know whether many readers use Zhihu for practice, or whether they found my article after hitting a wall
  • Readers keep messaging me to ask why login fails when they follow my method
  • I kept saying I would try it when I had time, hence today's article

login successful

First, here is a screenshot of a successful Selenium login:

[Screenshot: successful Selenium login]

solution

Problems encountered

Let me first describe the problems I ran into, which many people are likely to hit as well:

  • window.navigator.webdriver is true. I won't go into detail here. Most of the questions I get come from my earlier article, "About modifying window.navigator.webdriver code failure". When I first tried it myself, the login also failed, so my reply was basically that the earlier post only solves the problem of changing window.navigator.webdriver to undefined. This step is still necessary. If you have not solved it, go back to that earlier article
  • The second point is probably the one nobody has solved. As shown in the screenshot below, a verification is required, but it is a fake verification: once it appears, the send-verification-code function is blocked, which is equivalent to falling into a while True infinite loop. Switching to password login gives the same result
    [Screenshot: verification prompt]

solution

You may not believe it, and I could hardly believe it was this simple either. The code itself is fine: just log in directly with a third-party account. I tested it personally, and WeChat, QQ, and Weibo all work. Once you are logged in, you can start doing whatever you came to do. The sky is high for birds to fly, the sea is wide for fish to leap. Let's go!

Here is the code:

from selenium import webdriver
chrome_options = webdriver.ChromeOptions()

# Remove the enable-automation switch so the site's bot detection is less likely to flag a Selenium login
chrome_options.add_experimental_option('excludeSwitches', ['enable-automation']) 
drive = webdriver.Chrome(options=chrome_options)


# Use CDP to run JavaScript that redefines window.navigator.webdriver on every new document
drive.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
  "source": """
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })
  """
})
url = 'https://www.zhihu.com/signin?next=%2F'
drive.implicitly_wait(10)
drive.get(url)
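
A quick way to confirm that the override took effect is to read the property back from the page. The sketch below is only an illustration: the helper name webdriver_flag_hidden is hypothetical, and it accepts any driver object that offers execute_script, such as the drive object created above.

```python
def webdriver_flag_hidden(driver):
    """Return True if navigator.webdriver reads back as undefined in the current page."""
    # Selenium converts a JavaScript undefined (or null) return value to Python None
    value = driver.execute_script("return navigator.webdriver")
    return value is None


# Example use with the browser opened above (hypothetical call site):
# print(webdriver_flag_hidden(drive))
```

If this returns False after the page has loaded, the CDP script was probably not registered before drive.get(url) was called.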

If all you want is to log in to Zhihu, this is enough and you can stop here. If you are not familiar with Selenium, skip straight to the end, where the reference section points to good material. There's no way around it: readers are guests, and I went to all this trouble to satisfy everyone!
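
Because the third-party login is completed by hand in the browser window, a script that wants to continue afterwards has to wait until it is done. Here is a minimal sketch, assuming that Zhihu leaves the /signin URL once login succeeds (that behaviour is an assumption, not something verified here). The helper name wait_for_login and its parameters are hypothetical.

```python
import time


def wait_for_login(driver, timeout=120.0, poll=1.0, signin_marker="signin"):
    """Poll driver.current_url until it no longer contains signin_marker.

    Returns True once the browser has left the sign-in page, False on timeout.
    The "signin" marker is an assumption based on the login URL used above.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if signin_marker not in driver.current_url:
            return True
        time.sleep(poll)
    return False


# Example use after drive.get(url) (hypothetical call site):
# if wait_for_login(drive):
#     print(drive.title)
```

Polling the URL keeps the sketch independent of any page element, at the cost of relying on the redirect assumption.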

For the sake of my hard work, please like, bookmark, and share! Come and support me (puppy-dog face)

The following section is an unexpected bonus from this experiment: start the Chrome browser from the DOS command line, let Selenium take over that newly started Chrome, and then continue the follow-up operations with Selenium.

In fact, when I could not get around the verification at the beginning, I kept wanting to do this but never succeeded. This time, while testing on Zhihu, it worked by accident, so the next section is called "Windfall". It really was a surprise.

If you log in to Zhihu with the method below, the principle is the same: you still log in directly with a third-party account.

Windfall

A crawling approach that is friendlier to beginners

One, create a project folder

  • test_chrome is the root folder of the project
  • user_data is used to store the browser data

Two, dos command to start chrome

chrome.exe --remote-debugging-port=9222 --user-data-dir="D:\test_chrome\user_data"
  • Check whether Chrome has been added to the PATH environment variable
  • If not, cd to Chrome's install directory first and then run the command above
  • e.g.: C:\Program Files (x86)\Google\Chrome\Application
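
The same launch can also be scripted instead of typed into a DOS window. This is a minimal sketch using Python's subprocess module. The helper names build_chrome_command and launch_chrome are hypothetical, and the paths are the example locations from this section, so adjust them for your machine.

```python
import subprocess


def build_chrome_command(chrome_path, port, user_data_dir):
    """Return the argument list for starting Chrome with remote debugging enabled."""
    return [
        chrome_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={user_data_dir}",
    ]


def launch_chrome(chrome_path, port=9222, user_data_dir=r"D:\test_chrome\user_data"):
    """Start Chrome in the background and return the Popen handle."""
    return subprocess.Popen(build_chrome_command(chrome_path, port, user_data_dir))


# Example use (path taken from the example above; adjust for your installation):
# launch_chrome(r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe")
```

Passing the arguments as a list lets subprocess handle the quoting, so a path containing spaces does not need manual quotation marks.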

Three, add parameters to the shortcut

First copy the Chrome shortcut from the desktop into the project root folder D:\test_chrome.

Then open the shortcut's properties and add the startup parameters.

  • Append the parameters from the cmd command after the existing Target value: --remote-debugging-port=9222 --user-data-dir="./user_data". Note that the parameters must be separated from the preceding chrome.exe" by a space; otherwise an error is reported, something like the target path having an incorrect format. I was stuck in this pit for a long time
  • Set the "Start in" location to your own project path, D:\test_chrome

Four, project startup

Whenever you need Selenium for crawling, just open this shortcut from the project's root folder. This is like dual-opening an app on a mobile phone: the instance is isolated from your original browser, yet it behaves no differently from the browser you normally use and keeps all its records. By contrast, when Selenium drives chromedriver directly, every start gives you a brand-new browser. In comparison, this method is more human-like and may be safer than driving the browser directly.

After starting Chrome manually, take it over with Selenium. The subsequent simulated operations are the same as before.

Code for taking over Chrome

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
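# Attach to the Chrome instance you just started; 127.0.0.1 is the local IP
# and 9222 is the remote debugging port you specified earlier
chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
# If chromedriver is on PATH (for example dropped into Anaconda's Scripts folder, as in my setup),
# there is no need to specify its path
# chrome_driver = "D:/Python/Python37/Scripts/chromedriver.exe"
# Otherwise pass the path explicitly (Selenium 3: executable_path=chrome_driver; Selenium 4: a Service object)
driver = webdriver.Chrome(options=chrome_options)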
print(driver.title)
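
If the take-over fails, a common cause is that Chrome is not actually listening on the debugging port. The sketch below checks this with a plain TCP connection before Selenium is involved. The helper name debugger_is_listening is hypothetical, and the check only shows that something accepts connections on the port, not that it is Chrome.

```python
import socket


def debugger_is_listening(host="127.0.0.1", port=9222, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


# Example use before creating the driver (hypothetical call site):
# if not debugger_is_listening():
#     print("Start Chrome with --remote-debugging-port=9222 first")
```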


Reference

I found two articles on logging in to Zhihu via POST requests, both fairly recent. Advanced players can try them. Older articles on logging in to Zhihu, from 2019 and earlier, are basically of no reference value.

[1] Using Selenium to realize Zhihu simulation login
[2] Python simulation login Zhihu (latest version)


My own Selenium-related article series
[1] About modifying window.navigator.webdriver code failure
[2] Selenium crawler related error resolution
[3] Python crawler: Selenium visualization crawler
[4] sycm data automation
[5] One article takes you to understand Python crawler (1): introduction to basic principles
[6] One article takes you to understand Python crawler (2): introduction to four common basic crawling methods

Origin blog.csdn.net/qq_35866846/article/details/108151175