Sesame HTTP: Installation of pyspider

pyspider is a powerful web crawler framework written by binux, a developer from China. It provides a full-featured WebUI, script editor, task monitor, project manager, and result processor. It supports multiple database backends and multiple message queues, and it can also crawl pages rendered by JavaScript, which makes it very convenient to use. This section describes how to install it.

1. Related Links

2. Preparations

pyspider relies on PhantomJS for JavaScript rendering, so PhantomJS must be installed first (see section 1.2.5 for the specific installation process).
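Before moving on, it can be worth confirming that PhantomJS is actually reachable. The following is a minimal sketch, assuming the executable is named phantomjs and is expected on the system PATH; the helper name find_phantomjs is made up for this illustration and is not part of pyspider.

```python
import shutil


def find_phantomjs(executable="phantomjs"):
    """Return the full path of the executable, or None if it is not on PATH.

    The default name "phantomjs" is an assumption; adjust it if your
    installation uses a different executable name.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_phantomjs()
    if path:
        print("PhantomJS found at", path)
    else:
        print("PhantomJS not found on PATH; install it before using JavaScript rendering")
```

If the script reports that PhantomJS is missing, revisit the PhantomJS installation and the PATH setting before installing pyspider.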

3. pip install

Installing with pip is recommended. The command is as follows:

pip3 install pyspider

Once the command finishes without errors, the installation is complete.
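As a quick sanity check, you can ask Python whether the package can be located without actually importing it. This is only a sketch; the helper name is_installed is hypothetical, and it assumes the check runs with the same interpreter that pip3 installed into.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("pyspider", "pycurl"):
        status = "found" if is_installed(name) else "missing"
        print(name, status)
```

A "missing" result for pycurl usually points to the installation error described in the next section.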

4. Common errors

Under Windows, the following error message may appear:

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-vXo1W3/pycurl

This is a PyCurl installation error, which means the PyCurl library has to be installed separately. Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/#pycurl , find the wheel file that matches your Python version and system architecture, and download it. For example, for 64-bit Windows with Python 3.6, download pycurl-7.43.0-cp36-cp36m-win_amd64.whl and then install it with pip. The command is as follows:

pip3 install pycurl-7.43.0-cp36-cp36m-win_amd64.whl
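To work out which wheel file matches your interpreter, the naming pattern from the example above can be composed programmatically. The sketch below is illustrative only: pycurl_wheel_name is a hypothetical helper, the default version 7.43.0 is simply the one used in this section, and the download page may offer a different PyCurl version for your Python release. It assumes a CPython interpreter on Windows, where the ABI tag carries an "m" suffix for Python versions before 3.8.

```python
import struct
import sys


def pycurl_wheel_name(pycurl_version="7.43.0", py_version=None, is_64bit=None):
    """Compose the expected Windows wheel file name for a CPython interpreter.

    py_version is a (major, minor) tuple; is_64bit selects win_amd64 or win32.
    Both default to the values of the running interpreter.
    """
    if py_version is None:
        py_version = tuple(sys.version_info[:2])
    if is_64bit is None:
        # Pointer size in bits tells us whether this interpreter is 64-bit.
        is_64bit = struct.calcsize("P") * 8 == 64
    major, minor = py_version
    python_tag = f"cp{major}{minor}"
    # CPython dropped the "m" ABI suffix starting with Python 3.8.
    abi_tag = python_tag + ("m" if (major, minor) < (3, 8) else "")
    platform_tag = "win_amd64" if is_64bit else "win32"
    return f"pycurl-{pycurl_version}-{python_tag}-{abi_tag}-{platform_tag}.whl"


if __name__ == "__main__":
    print(pycurl_wheel_name())
```

Calling pycurl_wheel_name("7.43.0", (3, 6), True) reproduces the file name used in the command above.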

If you encounter PyCurl errors under Linux, refer to this article: https://imlonghao.com/19.html .

5. Verify the installation

After the installation is complete, you can start pyspider directly from the command line:

pyspider all

The console should then show output similar to Figure 1.

Figure 1 Console

The pyspider web service is now running on local port 5000. Open http://localhost:5000/ in a browser to reach the pyspider WebUI management page, as shown in Figure 2. If the page loads, pyspider has been installed successfully.
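The same check can be done from a script instead of a browser. The following is a minimal sketch that uses only the standard library; the function name webui_is_up and the 3-second timeout are choices made for this example, and it assumes pyspider all is still running in another terminal on the default port 5000.

```python
import http.client
import urllib.error
import urllib.request


def webui_is_up(url="http://localhost:5000/", timeout=3.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-success HTTP status.
        return False


if __name__ == "__main__":
    if webui_is_up():
        print("pyspider WebUI is reachable")
    else:
        print("pyspider WebUI did not respond; is 'pyspider all' running?")
```

A False result does not necessarily mean the installation failed; it may simply mean the service has not finished starting or is bound to a different port.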

 
