Programming Xiaobai's Self-study Notes Eleven (Python Crawler Introduction 3: Using Selenium, with Detailed Examples)

Series Article Directory

Programming Xiaobai's Self-study Notes Ten (Python Crawler Introduction 2 + Example Code Detailed Explanation)

Programming Xiaobai's self-study notes 9 (introduction to python crawler + detailed code explanation) 

Programming Xiaobai's self-study notes eight (multithreading in python) 

Programming Xiaobai's self-study notes seven (inheritance of classes in python) 


Table of contents

Series Article Directory

Article directory

Foreword

1. What is Selenium

2. Install Selenium

3. The first example (open Baidu in the browser)

4. The second example (enter content in the Baidu search box)

Summary


Foreword

As a programming novice, I have already worked through the crawler chapter of my book, but the crawlers it covers are too basic to be of much practical use, so I have been following the column by "Taking Mountains and Rivers as a Gift". Today I learned how to use Selenium, and these are my notes.


1. What is Selenium

The official answer is: Selenium is one of the most widely used open source Web UI (User Interface) automation test suites. Selenium test scripts can be written in any supported programming language and run directly in most modern web browsers.

My own understanding is that it can simulate the operations a person performs in a browser. Let's go through the specific ones step by step (I am learning and taking notes as I go).

2. Install Selenium

Run pip install selenium. It is that simple, so I won't dwell on it. Articles by more experienced people always go on to say that you must download the driver for your browser and then operate the browser through that driver. Let's first see whether it works without downloading a driver.
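To confirm that the installation worked, one option is to ask Python which version of the package it can see. This is a minimal sketch using only the standard library; the helper name installed_version is my own and is not part of Selenium.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("selenium")
    if found is None:
        print("selenium is not installed; run: pip install selenium")
    else:
        print("selenium", found, "is installed")
```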

3. The first example (open Baidu in the browser)

Let's look at the sample code first: 

from selenium import webdriver

drive = webdriver.Chrome()
drive.set_window_size(1100, 850)
drive.get('https://www.baidu.com/')

The first run failed with the message "The version of chrome cannot be detected. Trying with latest driver version". In other words, Selenium could not work out which version of Chrome was installed, so it could not pick a driver that matches it. It looked as if the driver had to be downloaded after all, yet the same code ran fine on another computer. So I searched carefully and found that on this computer the Chrome executable was named chrone.exe. On the off chance that this was the cause, I renamed it to chrome.exe, and the code ran successfully.
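Renaming the executable worked here. If Chrome lives under an unusual name or path, another option is to tell Selenium where it is through the binary_location attribute of ChromeOptions. The sketch below assumes Selenium 4; find_browser, open_with_binary and the list of candidate names are hypothetical and only for illustration.

```python
import shutil


def find_browser(candidates):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def open_with_binary(binary_path, url):
    """Start Chrome from an explicit executable path and open url."""
    # Imported here so the file can be loaded even without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.binary_location = binary_path  # where the Chrome executable lives
    driver = webdriver.Chrome(options=options)
    driver.get(url)
    return driver


if __name__ == "__main__":
    # Hypothetical candidate names; adjust them for your own system.
    print(find_browser(["chrome", "google-chrome", "chromium"]))
```

If a path is printed, it can be passed on as open_with_binary(path, 'https://www.baidu.com/').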

 

Let's analyze the code in detail:

"from selenium import webdriver" imports the webdriver module from the selenium library. The module provides one class per browser, such as webdriver.Chrome, and an instance of such a class represents a browser that our code can control. After this import we can call webdriver.Chrome() to create a Chrome browser instance and use that instance to make the browser perform various operations.

The remaining three lines are easy to understand: create an instance, set the window size, and open the web page at the given address.
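One thing the example leaves out is closing the browser when we are done. A sketch of the same three steps with cleanup is below. open_page and its driver_factory parameter are names I made up so that the function can also be tried with a stand-in object; set_window_size, get, title and quit are the usual WebDriver members.

```python
def open_page(url, driver_factory, width=1100, height=850):
    """Open url in a new browser window, return the page title, then quit."""
    driver = driver_factory()  # for example webdriver.Chrome
    try:
        driver.set_window_size(width, height)
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always close the browser, even if an error occurred
```

With Selenium installed, it could be called as open_page('https://www.baidu.com/', webdriver.Chrome) after "from selenium import webdriver".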

4. The second example (enter content in the Baidu search box)

Selenium provides two methods for locating elements on a page.

find_element gets the first element that satisfies the condition

find_elements gets all elements that satisfy the condition, as a list

Both methods can locate an element by its ID or by its name attribute, among other strategies. Let's try both.
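One practical difference between the two: when nothing matches, find_element raises NoSuchElementException, while find_elements simply returns an empty list. The helper below (first_or_none is a hypothetical name) relies on that to look up an element without a try/except. Its driver argument can be any object with a find_elements method, such as the one returned by webdriver.Chrome().

```python
def first_or_none(driver, by, value):
    """Return the first element matching the locator, or None if none match."""
    matches = driver.find_elements(by, value)  # empty list when nothing matches
    return matches[0] if matches else None
```

For the Baidu page used in this article, a call would look like first_or_none(drive, By.ID, 'kw').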

First, let's look at the web page. In the browser's inspect mode, we can see that the HTML of the input box is:

 

<input id="kw" name="wd" class="s_ipt" value="" maxlength="255" autocomplete="off"> 

Looking at this HTML, the id attribute of the input box is "kw". So let's write a piece of code that types "加油" ("keep it up") into the input box:

from selenium import webdriver
from selenium.webdriver.common.by import By


drive = webdriver.Chrome()
drive.set_window_size(1100, 850)
drive.get('https://www.baidu.com/')
drive.find_element(By.ID, 'kw').send_keys('加油')

It runs successfully, and "加油" appears in the search box.

 

Now let's try the name attribute, which is "wd":

from selenium import webdriver
from selenium.webdriver.common.by import By
drive = webdriver.Chrome()
drive.set_window_size(1100, 850)
drive.get('https://www.baidu.com/')
drive.find_element(By.NAME, 'wd').send_keys('躺平')

This also runs successfully.

 

As usual, let's analyze the code.

"from selenium.webdriver.common.by import By" imports the By class from the selenium.webdriver.common.by module. By holds the names of the available locating strategies, such as By.ID and By.NAME.

"drive.find_element(By.NAME, 'wd').send_keys('躺平')" locates the input box by its name attribute and types "躺平" ("lying flat") into it.

There are many other ways to locate an element, for example by class name:

drive.find_element(By.CLASS_NAME, 's_ipt').send_keys('躺平')
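The same input box can be reached through several strategies. The sketch below lists a few and counts how many elements each one matches. The CSS selector and XPath expressions are my own, derived from the HTML of the input box shown earlier, and LOCATORS and count_matches are hypothetical names. The by parameter is meant to receive Selenium's By class.

```python
# (attribute name on the By class, value) pairs that should all match
# <input id="kw" name="wd" class="s_ipt" ...>
LOCATORS = [
    ("ID", "kw"),
    ("NAME", "wd"),
    ("CLASS_NAME", "s_ipt"),
    ("CSS_SELECTOR", "input#kw"),
    ("XPATH", "//input[@id='kw']"),
]


def count_matches(driver, by, locators=LOCATORS):
    """Return {strategy name: number of matching elements} for each locator."""
    return {
        name: len(driver.find_elements(getattr(by, name), value))
        for name, value in locators
    }
```

After opening Baidu as in the examples above, count_matches(drive, By) should report how many elements each strategy finds; a class name in particular may match more than one element.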


Summary

Selenium is an automated testing tool that supports the major browsers, including Chrome, Safari, Firefox and Internet Explorer. It drives the browser directly, just as a real user would. Selenium can also be used in crawlers to solve some of the harder crawling problems, in particular obtaining data from dynamic web pages. Some dynamic data does not appear in the source code of the web page, and in that case you can consider using Selenium to obtain it.
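As a rough illustration of the dynamic data point, the sketch below opens a page and checks whether some text shows up in what the browser has rendered. page_contains is a hypothetical helper; get and page_source are regular WebDriver members.

```python
def page_contains(driver, url, text):
    """Open url and report whether text appears in the rendered page source."""
    driver.get(url)
    # page_source is the HTML of the page as the browser currently holds it,
    # which can include content that JavaScript added after loading.
    return text in driver.page_source
```

Content that arrives later through background requests may not be there yet when page_source is read, so real crawlers usually wait for the element they need first.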

The following are some commonly used methods of Selenium:
- `driver.get(url)`: Open a web page.
- `driver.find_element(by, value)`: Find the first element matching a locator such as `By.ID` or `By.NAME`. The older `find_element_by_*()` helpers have been removed from current Selenium 4 releases.
- `driver.find_elements(by, value)`: Find all elements matching a locator, and return a list.
- `driver.execute_script(script, *args)`: Execute JavaScript code.
- `driver.switch_to.window(window_name)`: Switch to the specified window.
- `driver.switch_to.frame(frame_reference)`: Switch to the specified frame.
- `driver.back()`: Return to the previous page.
- `driver.forward()`: Go forward to the next page.
- `driver.refresh()`: Refresh the current page.
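
Chained together, a few of these calls might look like the sketch below. browse is a hypothetical helper; it only uses get, back, forward, refresh, execute_script and current_url, and it expects an already created driver such as webdriver.Chrome().

```python
def browse(driver, first_url, second_url):
    """Visit two pages, step back and forward, refresh, then read the title."""
    driver.get(first_url)
    driver.get(second_url)
    driver.back()      # now on first_url again
    driver.forward()   # now on second_url again
    driver.refresh()   # reload second_url
    # execute_script runs JavaScript in the page and returns its result.
    title = driver.execute_script("return document.title;")
    return driver.current_url, title
```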


Origin blog.csdn.net/m0_49914128/article/details/131906001