A series of problems encountered while installing pyspider

Start the installation:

pip install pyspider

Problem 1

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-vXo1W3/pycurl

Solution:
1. Install the pycurl library first. Search for pycurl with Ctrl+F on the download page, then download the wheel that matches your Python version and your system.
2. Install the downloaded file from cmd. The command is:

pip install pycurl-7.43.1-cp37-cp37m-win_amd64.whl
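
The wheel file name above encodes the Python version (cp37, CPython 3.7) and the platform (win_amd64, 64-bit Windows). As a rough sketch, the values to look for can be printed from the interpreter you are going to install into. The helper name wheel_tags is my own, and the cp prefix assumes the standard CPython interpreter:

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, bitness) for the running interpreter."""
    # "cp37" means CPython 3.7; the wheel's Python tag has to match this.
    # Assumption: the interpreter is CPython, hence the "cp" prefix.
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # Pointer size in bits: 64 goes with win_amd64 wheels, 32 with win32.
    bitness = struct.calcsize("P") * 8
    return python_tag, bitness


if __name__ == "__main__":
    tag, bits = wheel_tags()
    print("Python tag:", tag)
    print("Interpreter bitness:", bits)
```

If the tag or the bitness differs from the wheel you downloaded, pip will refuse the file as not supported on this platform.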

Verify in cmd:

pyspider all

Problem 2
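The async keyword problem: async became a reserved keyword in Python 3.7, so the places where pyspider uses it as an ordinary name fail with a syntax error.

Solution:
On Windows 10 with Python 3.7, pyspider installs successfully but cannot start. The fix is made in the pyspider package under site-packages, in these three files:
(1) fetcher/tornado_fetcher.py
(2) run.py
(3) webui/app.py
Replace every async in these three files with shark.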

Two ways to do the replacement:
1. Notepad++: press Ctrl+F to find async, then switch to Replace and replace all async with shark in the file.
2. IntelliJ IDEA: open the package path and use the global find and replace.
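
The same replacement can also be scripted. The following is only a sketch of that idea, not something shipped with pyspider: the helper names rename_async and patch_package are made up for illustration, and it assumes the three files sit at fetcher/tornado_fetcher.py, run.py and webui/app.py inside the package directory. It matches whole words only, so names like CurlAsyncHTTPClient stay untouched, and it keeps a .bak copy of every file it changes.

```python
import re
import shutil
from pathlib import Path

# Whole-word match, so CurlAsyncHTTPClient or async_mode are left alone.
ASYNC_WORD = re.compile(r"\basync\b")

# Assumed locations of the three files, relative to the pyspider package.
TARGETS = ["fetcher/tornado_fetcher.py", "run.py", "webui/app.py"]


def rename_async(source, new_name="shark"):
    """Return source with every whole-word async replaced by new_name."""
    return ASYNC_WORD.sub(new_name, source)


def patch_package(package_dir):
    """Patch the target files in place and return the ones that changed."""
    patched = []
    for relative in TARGETS:
        path = Path(package_dir) / relative
        if not path.is_file():
            continue  # skip files that this installation does not have
        original = path.read_text(encoding="utf-8")
        updated = rename_async(original)
        if updated != original:
            # Keep a backup next to the file before overwriting it.
            shutil.copy2(str(path), str(path) + ".bak")
            path.write_text(updated, encoding="utf-8")
            patched.append(relative)
    return patched
```

Judging from the warning message further down in this post, the package directory on my machine would be d:\program files\lib\site-packages\pyspider, so the call would be patch_package(r"d:\program files\lib\site-packages\pyspider"). Use the path of your own installation.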

Problem 3

ValueError: Invalid configuration: - Deprecated option 'domaincontroller': use 'http_authenticator

Solution:

The message itself names the fix: the deprecated 'domaincontroller' option has to be replaced by the 'http_authenticator' setting.

Final result
[Screenshots of the final pyspider output]

Note: At first, before phantomjs was installed, only the first two lines of the output above appeared:

d:\program files\lib\site-packages\pyspider\libs\utils.py:196: FutureWarning: timeout is not supported on your platform.
  warnings.warn("timeout is not supported on your platform.", FutureWarning)
[I 190409 20:28:52 result_worker:49] result_worker starting...

Nothing else was displayed. After I installed phantomjs following an online tutorial, the full output finally appeared.

Installing phantomjs
The download from the official website did not work for me. Download phantomjs from an alternative source and put phantomjs.exe in the same directory as python.exe.

Also remember to add D:\Software\phantomjs-2.1.1-windows\bin to the PATH environment variable.
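
To check whether the environment variable change took effect, open a new cmd window (already open windows keep the old value) and see whether phantomjs can be found. A small sketch using only the standard library; find_on_path is an illustrative name of my own, not part of pyspider:

```python
import os
import shutil


def find_on_path(program="phantomjs", search_path=None):
    """Return the full path of program if it can be found, else None."""
    # With search_path=None, shutil.which uses the PATH environment variable.
    # On Windows it also tries the usual extensions such as .exe.
    return shutil.which(program, path=search_path)


if __name__ == "__main__":
    location = find_on_path()
    if location is None:
        print("phantomjs not found. Current PATH entries:")
        for entry in os.environ.get("PATH", "").split(os.pathsep):
            print("  ", entry)
    else:
        print("phantomjs found at", location)
```

If phantomjs is not found, the printed list shows whether the bin directory really made it into PATH.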

Only two lines of output appear when running pyspider

Solution:
When the fetcher is started on its own, it only ever displays fetcher starting...
What fixed it:

pip install redis
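
Since the cause here was a package that had silently never been installed, it can help to check for such gaps before starting pyspider. A minimal sketch, assuming the import names pycurl, redis and pyspider match the packages installed in this post; missing_modules is a made-up helper name:

```python
import importlib.util


def missing_modules(names):
    """Return the names from the list that cannot be found for import."""
    # find_spec locates a top-level module without actually importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: these import names correspond to the packages in this post.
    missing = missing_modules(["pycurl", "redis", "pyspider"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed packages can be found.")
```

This only tells you whether the packages are present, not whether pyspider will start correctly with them.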

How I found the solution
I originally learned about crawlers from Cui Qingcai's website, and at that time I installed a lot of crawler-related software and packages. I remember that while I was studying, some of them were never installed because of network problems (there was no Internet access when the software or packages had to be installed). After re-reading the relevant page, I saw that redis is installed before pyspider, so I tried installing it, and that solved the problem.

Origin blog.csdn.net/weixin_44517301/article/details/114916232