Python Crawler Series 15: Using Selenium with Chrome

First, basic usage of Selenium

1. Download and install Chrome and ChromeDriver

2. Selenium operations fall into two main categories:

(1) Getting UI elements

find_element_by_id: gets an element by its id value

find_elements_by_name: gets elements by their name attribute (the methods below follow the same pattern)

find_elements_by_xpath

find_elements_by_link_text

find_elements_by_partial_link_text

find_elements_by_tag_name

find_elements_by_class_name

find_elements_by_css_selector
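The method names above come from the Selenium 3 API. Newer Selenium 4 releases removed them in favor of find_element(by, value) and find_elements(by, value), where the first argument is a strategy constant from selenium.webdriver.common.by.By. The sketch below shows how the old names map onto the new call. The helper names to_locator and find_one are made up for illustration and are not part of Selenium.

```python
# Strategy strings used by Selenium's By class (By.ID == "id", and so on).
STRATEGIES = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def to_locator(legacy_method, value):
    """Turn a legacy method name plus its argument into a (strategy, value) pair."""
    for prefix in ("find_elements_by_", "find_element_by_"):
        if legacy_method.startswith(prefix):
            return (STRATEGIES[legacy_method[len(prefix):]], value)
    raise ValueError(f"not a find_element(s)_by_* name: {legacy_method}")


def find_one(driver, legacy_method, value):
    """Hypothetical helper: look up a single element with the Selenium 4 style call."""
    strategy, value = to_locator(legacy_method, value)
    return driver.find_element(strategy, value)


# With a real driver, find_one(driver, "find_element_by_id", "kw")
# performs the same lookup as driver.find_element(By.ID, "kw").
```

Note that find_element returns a single element, while find_elements returns a list of all matches.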

(2) Simulating operations on UI elements

Clicking, right-clicking, dragging, and typing can all be done by importing the ActionChains class
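As a rough sketch of how ActionChains is typically used: actions are queued on a chain object and only run when perform() is called. The function name right_click_then_drag and its arguments are made up for illustration. The chain is passed in so the function can be tried without a browser.

```python
def right_click_then_drag(chain, source, target):
    """Queue a right-click on `source`, then a drag of `source` onto `target`, and run both."""
    chain.context_click(source)          # right-click
    chain.drag_and_drop(source, target)  # press on source, move to target, release
    chain.perform()                      # nothing happens until perform() is called
    return chain


# Real usage (requires selenium and a running driver; the two elements are
# placeholders that you would locate yourself):
#   from selenium.webdriver.common.action_chains import ActionChains
#   right_click_then_drag(ActionChains(driver), source_element, target_element)
```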

 

from selenium import webdriver
from selenium.webdriver.common.keys import Keys  # keyboard key constants
import time

# You may need to set the chromedriver path manually
chromedriverAddress = r"C:\Users\lenovo1\AppData\Local\Programs\Python\Python37\Lib\site-packages\selenium\webdriver\chrome\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chromedriverAddress)
# If this line raises an error, see this configuration guide: https://blog.csdn.net/weixin_43746433/article/details/95237254

url = "http://www.baidu.com"
driver.get(url)

text1 = driver.find_element_by_id("wrapper").text  # get the text of this element
print(text1)
print(driver.title)

# Save a screenshot of the page
driver.save_screenshot("index.png")

# Type "giant panda" into the search box (the input field's id is "kw")
driver.find_element_by_id("kw").send_keys(u"giant panda")
# Click the search button; the results are captured in the next step
driver.find_element_by_id("su").click()
time.sleep(5)
driver.save_screenshot("daxiongmao.png")

# Get the cookies of the current page
print(driver.get_cookies())

# Simulate Ctrl+A (select all)
driver.find_element_by_id("kw").send_keys(Keys.CONTROL, 'a')
# Simulate Ctrl+X (cut)
driver.find_element_by_id("kw").send_keys(Keys.CONTROL, 'x')

driver.find_element_by_id("kw").send_keys(u"aircraft carrier")
driver.save_screenshot("hangmu.png")
driver.find_element_by_id("su").send_keys(Keys.RETURN)
time.sleep(5)
driver.save_screenshot("hangmu2.png")

# Clear the input box
driver.find_element_by_id("kw").clear()

# Close the browser
driver.quit()
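The script pauses for a fixed 5 seconds after each search. A common alternative is to poll for a condition and continue as soon as it holds. The helper below is a plain-Python sketch of that idea. The name wait_until and its parameters are made up for illustration. Selenium also ships its own WebDriverWait class in selenium.webdriver.support.ui for the same purpose.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if `timeout` seconds pass without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Possible usage with the script above (the condition is only an example;
# pick one that fits the page you are loading):
#   old_title = driver.title
#   driver.find_element_by_id("su").click()
#   wait_until(lambda: driver.title != old_title)
```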

 

Second, the verification code (CAPTCHA) problem

1. The main purpose of a verification code is to determine whether the visitor is a robot or a real person. Common types include: picture recognition; GeeTest (official website: www.geetest.com); 12306; telephone verification codes; Google verification

2. Cracking verification codes:

(1) General procedure: download the page and the verification code image, then enter the code manually

(2) Simple images: use image recognition software, or use a third-party website that solves image verification codes
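The manual procedure in (1) could look roughly like the sketch below: save the verification code image to disk, let a person read it, and return what they typed. The function name solve_captcha_manually, the file name, and the element ids in the usage comment are all assumptions for illustration. The only Selenium call used is the element's screenshot(filename) method.

```python
def solve_captcha_manually(captcha_element, image_path="captcha.png", ask=input):
    """Save the captcha image, prompt a human for the text, and return it."""
    captcha_element.screenshot(image_path)  # write the element's image to disk
    answer = ask(f"Open {image_path} and type the characters you see: ")
    return answer.strip()


# Possible usage with a real driver (both ids are hypothetical):
#   captcha = driver.find_element_by_id("captcha_img")
#   code = solve_captcha_manually(captcha)
#   driver.find_element_by_id("captcha_input").send_keys(code)
```

The `ask` parameter defaults to the built-in input, so the prompt can later be swapped for an image recognition routine or a third-party service without changing the calling code.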

Third, source code

1. Reptile15_1_DHtmlChrome.py: https://github.com/ruigege66/PythonReptile/blob/master/Reptile15_1_DHtmlChrome.py

2. CSDN: https://blog.csdn.net/weixin_44630050

3. cnblogs (Blog Park): https://www.cnblogs.com/ruigege0000/

4. You are welcome to follow the WeChat official account "Fourier transform" (a personal account, for learning and exchange only); reply "gifts" in the account's chat to get big data learning materials

 


Origin www.cnblogs.com/ruigege0000/p/12514819.html