Python Selenium COVID-19 data analysis

Foreword

Selenium drives a real browser to simulate human actions, which makes it an easy way to get started with web crawlers. This time we use Selenium with Firefox and the Firefox driver (geckodriver). Pay attention to the versions of Selenium, Firefox, and the Firefox driver; the simplest approach is to use the latest release of each. Install the library with pip. Firefox driver download address: https://github.com/mozilla/geckodriver/releases

Download the build that matches your own system. For the record, my machine is 64-bit, but geckodriver-v0.32.0-win32 is the one that finally worked; I don't know why. After downloading, unzip the driver and put it in the same directory as the .py file. If you hit a problem while running, search for a solution (stating the obvious). The problems I ran into were that the browser or the driver could not be found, and both were solved in the end.
This article draws on code from earlier CSDN authors.
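A minimal sketch of the setup described above, assuming Selenium 4 and a driver file named geckodriver.exe next to the script. The helper names find_driver and start_firefox are made up for illustration. Passing the driver path and (optionally) the Firefox binary path explicitly is one common way around "driver not found" and "browser not found" errors, though it may not be the fix the original code used.

```python
from pathlib import Path


def find_driver(script_dir, name="geckodriver.exe"):
    """Return the driver path if it sits in script_dir, otherwise None."""
    candidate = Path(script_dir) / name
    return candidate if candidate.is_file() else None


def start_firefox(driver_path, firefox_binary=None):
    """Start Firefox with an explicit driver path (Selenium 4 style)."""
    # Imported here so find_driver can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    if firefox_binary:
        # Only needed when Selenium cannot locate firefox.exe by itself.
        options.binary_location = firefox_binary
    service = Service(executable_path=str(driver_path))
    return webdriver.Firefox(service=service, options=options)
```

Typical use would be `driver = find_driver(".")` followed by `browser = start_firefox(driver)` when the driver was found, and `browser.quit()` when the crawl is finished.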

pip install selenium 

Purpose and Knowledge Points

This script automatically crawls the epidemic data in every link on pages 1 through 9 of the Health Commission site. The target content is the cumulative number of confirmed cases and the cumulative death toll in mainland China, Hong Kong, and Taiwan from April to November 2022. The purpose is to calculate the change in the mortality rate.
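The article does not give the formula, so the following is only one plausible reading: the rate is cumulative deaths divided by cumulative confirmed cases, and the "change" is the difference between consecutive reports. The function names and the numbers in the usage note are placeholders, not real data.

```python
def fatality_rate(deaths, confirmed):
    """Cumulative deaths divided by cumulative confirmed cases."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return deaths / confirmed


def rate_changes(records):
    """records: (label, confirmed, deaths) tuples in chronological order.

    Returns (label, change) pairs, where change is the rate at that
    label minus the rate at the previous one.
    """
    rates = [(label, fatality_rate(deaths, confirmed))
             for label, confirmed, deaths in records]
    return [(cur_label, cur_rate - prev_rate)
            for (_, prev_rate), (cur_label, cur_rate) in zip(rates, rates[1:])]
```

For example, `rate_changes([("first", 1000, 10), ("second", 2000, 30)])` yields a single pair labeled "second" with a change of about 0.005 (from 1.0% to 1.5%).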
The main knowledge points used are:

  1. Open the webpage with selenium and use the browser.find_element and browser.find_elements methods to find links and text
  2. Filter the corresponding text with regular expressions
  3. Export to CSV
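Step 1 might look roughly like the sketch below. It assumes Selenium 4 (where locators are passed through By) and an already started browser object. The function names, the list of page URLs, and the keyword used to pick bulletin links are all hypothetical, since the article does not list them.

```python
def filter_links(anchors, keyword):
    """Keep (title, href) pairs whose title contains keyword.

    Duplicated hrefs are dropped and the original order is preserved.
    """
    seen, kept = set(), []
    for title, href in anchors:
        if href and keyword in title and href not in seen:
            seen.add(href)
            kept.append((title, href))
    return kept


def collect_links(browser, page_urls, keyword):
    """Visit each listing page and gather the matching links."""
    from selenium.webdriver.common.by import By

    anchors = []
    for url in page_urls:
        browser.get(url)
        # find_elements returns a (possibly empty) list of matches.
        for a in browser.find_elements(By.TAG_NAME, "a"):
            anchors.append((a.text, a.get_attribute("href")))
    return filter_links(anchors, keyword)


def page_text(browser, url):
    """Open one bulletin and return the visible text of the page."""
    from selenium.webdriver.common.by import By

    browser.get(url)
    # find_element returns the first match or raises NoSuchElementException.
    return browser.find_element(By.TAG_NAME, "body").text
```

Keeping the filtering in a plain function means it can be checked in a notebook cell without opening a browser.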

Code

Beginners are advised to use Jupyter Notebook, which makes the code easier to debug.
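As a rough illustration of the regular expression and CSV steps, the sketch below pulls six numbers out of one bulletin and writes them as a row. The patterns assume Chinese bulletin wording of the form shown in SAMPLE, which is invented text with placeholder numbers; check the patterns against the real page text before relying on them. The field names and the date are also made up.

```python
import csv
import io
import re

# Assumed wording; adjust to the actual bulletin text.
PATTERNS = {
    "mainland_confirmed": r"累计报告确诊病例(\d+)例",
    "mainland_deaths": r"累计死亡病例(\d+)例",
    "hk_confirmed": r"香港特别行政区(\d+)例",
    "hk_deaths": r"香港特别行政区\d+例[^)）]*?死亡(\d+)例",
    "tw_confirmed": r"台湾地区(\d+)例",
    "tw_deaths": r"台湾地区\d+例[^)）]*?死亡(\d+)例",
}

# Invented sample with placeholder numbers, not real data.
SAMPLE = ("累计死亡病例20例，累计报告确诊病例1000例。"
          "香港特别行政区300例（出院100例，死亡5例），"
          "台湾地区400例（出院50例，死亡7例）。")


def extract_counts(text):
    """Return one int per field, or None when a pattern does not match."""
    row = {}
    for field, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        row[field] = int(match.group(1)) if match else None
    return row


def write_csv(rows, fileobj):
    """Write rows (dicts with 'date' plus the PATTERNS keys) as CSV."""
    writer = csv.DictWriter(fileobj, fieldnames=["date", *PATTERNS])
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_csv([{"date": "2022-04-01", **extract_counts(SAMPLE)}], buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass a handle opened with `open("covid.csv", "w", newline="", encoding="utf-8-sig")`; the file name is arbitrary, and the BOM encoding is a common choice so that Excel displays the file correctly.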

 


Origin blog.csdn.net/weixin_42984235/article/details/128144836