Python crawler | Use Selenium and BeautifulSoup to crawl 12306 train ticket information and save it to an Excel file

Foreword

12306 is the only official website of China Railway for passenger ticket sales and the main online platform for buying train tickets in China. This article shows how to use the Python modules Selenium and BeautifulSoup to crawl train ticket information from 12306 and save it to an Excel file, so that you can view and compare the prices and remaining tickets of different train numbers and seat types.

Preparation

Before we start, we need to do some preparatory work.

Install the following components:

  • Python 3.x
  • BeautifulSoup 4
  • Selenium
  • Chrome browser or other browsers that support Selenium

The above components can be installed using the following command:

pip install beautifulsoup4
pip install selenium

Note: Because Selenium controls a real browser to access the website, you also need to download a browser driver that matches your browser version. This article uses the Chrome browser; ChromeDriver can be downloaded from: http://npm.taobao.org/mirrors/chromedriver/.

After downloading, extract the driver to any location and add the folder containing the driver executable to the system's PATH environment variable.
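To confirm that the PATH change took effect, you can check for the driver from Python using only the standard library. This is a small sketch, not from the original article; it assumes the executable keeps its default name `chromedriver`:

```python
import shutil

def find_chromedriver():
    """Return the full path of the chromedriver executable if it is on PATH, else None."""
    return shutil.which("chromedriver")

path = find_chromedriver()
if path:
    print(f"chromedriver found at: {path}")
else:
    print("chromedriver not found -- check your PATH setup")
```

If the driver is not found, re-open your terminal after editing the environment variable so the new PATH is picked up.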

Crawl train ticket information

Set request parameters

First, set the request parameters: the departure city, arrival city, departure date, and so on. In code, the following parameters are used:

fro
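The article's code is cut off above, so here is a hedged sketch of what the request parameters might look like. The station telecodes (`BJP`, `SHH`), the example date, and the left-ticket query URL below are illustrative assumptions, not code from the original article:

```python
from urllib.parse import urlencode

# Hypothetical request parameters -- the station telecodes and the query
# URL format are assumptions for illustration, not taken from the article.
params = {
    "linktypeid": "dc",   # one-way query
    "fs": "北京,BJP",     # departure city and its station telecode
    "ts": "上海,SHH",     # arrival city and its station telecode
    "date": "2023-07-01", # departure date, YYYY-MM-DD
}

# Build the query URL; urlencode percent-encodes the Chinese station names.
query_url = "https://kyfw.12306.cn/otn/leftTicket/init?" + urlencode(params)
print(query_url)
```

In a Selenium-based crawler, a URL like this would then be opened with `driver.get(query_url)`, and the rendered page source (`driver.page_source`) handed to BeautifulSoup for parsing.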

Origin blog.csdn.net/weixin_43263566/article/details/131332250