There are two ways to crawl dynamic web content. One is to use the browser's developer tools to find the interface that serves the dynamic content, analyze its request parameters and return values, and fetch the data from that interface directly. The other is to capture the data by simulating a browser: Python's Selenium library can drive a real browser from code and read the data from the rendered page.
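As a rough sketch of the first approach, the snippet below builds a request URL from parameters copied out of the developer tools and decodes a JSON reply. The endpoint https://example.com/api/list, the page and size parameters, and the "items" field are all hypothetical placeholders, not taken from any real site; substitute whatever the Network panel actually shows.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only.
API_URL = "https://example.com/api/list"


def build_url(base, params):
    """Append the query parameters seen in the developer tools to the base URL."""
    return base + "?" + urlencode(params)


def parse_items(body):
    """Decode a JSON response body and return its 'items' list (assumed field name)."""
    return json.loads(body).get("items", [])


def fetch_items(page):
    """Request one page from the interface and return the decoded items."""
    url = build_url(API_URL, {"page": page, "size": 20})
    with urlopen(url, timeout=10) as resp:
        return parse_items(resp.read().decode("utf-8"))
```

Calling fetch_items(1) would perform the real request; the two helper functions can be tried on their own without any network access.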
I. Overview
Running Selenium from Python requires three things: the selenium Python library, a browser, and the driver (WebDriver) that corresponds to that browser.
Install the selenium library
pip install selenium
Project page: https://pypi.org/project/selenium/
Download WebDriver
WebDriver can be loosely thought of as a kind of browser plug-in, although it is actually a separate executable program that controls the browser. Each browser has its own WebDriver. For Firefox it is geckodriver (geckodriver.exe on Windows); for Chrome it is ChromeDriver (chromedriver.exe on Windows).
After downloading, unzip the WebDriver archive and copy the exe file into the Python directory (any directory listed in the PATH environment variable will do).
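One quick way to confirm that the driver really is reachable through PATH is to ask Python to look it up. This is only a minimal sketch using the standard library; find_driver is a helper name made up for this example.

```python
import shutil


def find_driver(name):
    """Return the full path of a driver executable found on PATH, or None."""
    return shutil.which(name)


# Check the two drivers mentioned in this article.
for driver in ("chromedriver", "geckodriver"):
    path = find_driver(driver)
    print(driver, "->", path if path else "not found on PATH")
```

If a driver is reported as not found, either copy it into a directory that is already on PATH or add its directory to PATH.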
Firefox WebDriver download:
https://github.com/mozilla/geckodriver/
Chrome WebDriver download (choose the ChromeDriver release that corresponds to your browser version; if the chromedriver.exe version does not match Chrome, the Python Selenium program will fail to run):
http://chromedriver.storage.googleapis.com/index.html
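The version check can be sketched as a comparison of major version numbers. The snippet assumes that matching on the major version is the relevant rule and that both programs print a line containing a dotted version number when asked for their version; the sample strings below are illustrative values, not output captured from a real installation.

```python
import re


def major_version(version_output):
    """Extract the major version number from a version output line."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in: " + repr(version_output))
    return int(match.group(1))


def versions_match(chrome_output, driver_output):
    """Return True when Chrome and ChromeDriver share the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


# Illustrative sample strings (assumed shape of the version output).
chrome = "Google Chrome 114.0.5735.198"
driver = "ChromeDriver 114.0.5735.90"
print(versions_match(chrome, driver))
```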
II. Examples
Example 1:
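from selenium import webdriver
from selenium.webdriver.common.by import By

# Start Chrome (chromedriver must be reachable through PATH)
browser = webdriver.Chrome()
browser.get('http://www.baidu.com')
assert '百度一下' in browser.title
# Locate the search box; find_element(By..., ...) replaces the
# find_element_by_* helpers, which were removed in Selenium 4.3
#elem = browser.find_element(By.NAME, "wd")
elem = browser.find_element(By.XPATH, '//*[@id="kw"]')
elem.send_keys("selenium")
# Click the search button
btn = browser.find_element(By.ID, "su")
btn.click()
#browser.quit()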
Example 2:
import unittest
from selenium import webdriver
from selenium.webdriver.common.by import By

class BaiduTest(unittest.TestCase):
    def setUp(self):
        # Open a new Firefox session on the Baidu home page before each test
        self.browser = webdriver.Firefox()
        self.browser.get("http://www.baidu.com")
        #self.addCleanup(self.browser.quit)

    def testTitle(self):
        self.assertIn("百度一下", self.browser.title)

    def testSearch(self):
        #self.browser.get("http://www.baidu.com")
        searchInput = self.browser.find_element(By.ID, "kw")
        searchInput.send_keys("selenium")
        searchBtn = self.browser.find_element(By.ID, "su")
        searchBtn.click()
        self.assertIn("selenium", self.browser.current_url)

if __name__ == '__main__':
    unittest.main(verbosity=2)
Other resources:
https://www.seleniumhq.org/download/
http://ftp.mozilla.org/pub/firefox/releases/ (Firefox releases)
https://www.cnblogs.com/givemelove/p/8482361.html (Firefox, Chrome, and WebDriver software)