Start the installation:
pip install pyspider
Problem 1:
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-vXo1W3/pycurl
Solution:
1. Install pycurl from a prebuilt wheel first: on the wheel download page, press Ctrl+F and search for pycurl, find the file that matches your Python version and system architecture, and download it.
2. In cmd, go to the folder that contains the downloaded file and install it. The command is:
pip install pycurl-7.43.1-cp37-cp37m-win_amd64.whl
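To choose the right file, it helps to know which tags your interpreter expects in the wheel name (cp37 and win_amd64 in the command above). The following is a minimal sketch, assuming CPython; the function name interpreter_tags is only illustrative and is not part of pip or pycurl.

```python
import struct
import sys
import sysconfig


def interpreter_tags():
    """Return (python_tag, pointer_bits, platform_tag) for the running interpreter."""
    # "cp37"-style tag; the "cp" prefix assumes CPython.
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # 32 or 64, depending on whether Python itself is a 32-bit or 64-bit build.
    pointer_bits = struct.calcsize("P") * 8
    # sysconfig reports e.g. "win-amd64"; wheel file names use underscores instead.
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, pointer_bits, platform_tag


if __name__ == "__main__":
    python_tag, pointer_bits, platform_tag = interpreter_tags()
    print("Python tag:  ", python_tag)
    print("Build:       ", pointer_bits, "bit")
    print("Platform tag:", platform_tag)
```

Pick the wheel whose file name contains both the Python tag and the platform tag that this prints.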
Verify the installation by running this in cmd:
pyspider all
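Problem 2:
The async keyword problem.
Solution:
On Windows 10 with Python 3.7, pyspider installs successfully but will not start, because async became a reserved keyword in Python 3.7 and pyspider uses it as an identifier. The fix is to edit these files in the pyspider package under site-packages: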
(1) fetcher/tornado_fetcher.py
(2) run.py
(3) webui/app.py
In all three files, replace every occurrence of async with shark (any name that is not a Python keyword would do).
There are two ways to do the replacement:
1. Notepad++: press Ctrl+F to find async, then switch to Replace and replace every async with shark in the file.
2. IntelliJ IDEA: select the pyspider path and use the global find and replace.
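The same replacement can also be scripted. The sketch below is an assumption-laden alternative to the editor-based methods, not the method used above: it replaces async only as a whole word and case-sensitively (so names such as AsyncHTTPClient are untouched), and it keeps a .bak copy of each file. The names replace_async and patch_pyspider are hypothetical helpers.

```python
import importlib.util
import re
import shutil
from pathlib import Path

# The three files to patch, relative to the pyspider package directory.
TARGETS = ("fetcher/tornado_fetcher.py", "run.py", "webui/app.py")

# Whole-word, case-sensitive match: "async" and "self.async" are renamed,
# while names such as "AsyncHTTPClient" or "async_fetch" are left alone.
ASYNC_WORD = re.compile(r"\basync\b")


def replace_async(source, new_name="shark"):
    """Return source with every standalone 'async' replaced by new_name."""
    return ASYNC_WORD.sub(new_name, source)


def patch_pyspider(package_dir):
    """Rewrite the target files in place, keeping a .bak copy of each one."""
    patched = []
    for relative in TARGETS:
        path = Path(package_dir) / relative
        original = path.read_text(encoding="utf-8")
        updated = replace_async(original)
        if updated != original:
            shutil.copy2(path, str(path) + ".bak")
            path.write_text(updated, encoding="utf-8")
            patched.append(path)
    return patched


def main():
    # find_spec locates the package on disk without importing it.
    spec = importlib.util.find_spec("pyspider")
    if spec is None or not spec.submodule_search_locations:
        print("pyspider is not installed in this environment")
        return
    for path in patch_pyspider(list(spec.submodule_search_locations)[0]):
        print("patched", path)


if __name__ == "__main__":
    main()
```

Run it with the same Python interpreter that pyspider was installed into, then try pyspider all again.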
Problem 3:
ValueError: Invalid configuration: - Deprecated option 'domaincontroller': use 'http_authenticator
Solution:
As the message itself says, the deprecated 'domaincontroller' option has to be replaced by the 'http_authenticator' setting that it names.
Final result
Note: at first, before phantomjs was installed, only the first two lines of the result above appeared, that is:
d:\program files\lib\site-packages\pyspider\libs\utils.py:196: FutureWarning: timeout is not supported on your platform.
warnings.warn("timeout is not supported on your platform.", FutureWarning)
[I 190409 20:28:52 result_worker:49] result_worker starting...
Nothing else was displayed. In the end I installed phantomjs by following an online tutorial, and the full output finally appeared.
Installing phantomjs
I could not download it from the official website, so download phantomjs from an alternative download page and put phantomjs.exe in the same directory as python.exe.
Also remember to add D:\Software\phantomjs-2.1.1-windows\bin to the PATH environment variable.
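A quick way to confirm that the environment variable took effect is to ask Python where it finds phantomjs. This is a minimal sketch; find_phantomjs is an illustrative helper name, not part of pyspider.

```python
import shutil


def find_phantomjs(executable="phantomjs"):
    """Return the full path of the executable found on PATH, or None."""
    # shutil.which performs a PATH lookup (using PATHEXT on Windows),
    # so "phantomjs" also matches phantomjs.exe.
    return shutil.which(executable)


if __name__ == "__main__":
    location = find_phantomjs()
    if location is None:
        print("phantomjs was not found on PATH")
    else:
        print("phantomjs found at", location)
```

Open a new cmd window before running it, because windows that were already open do not pick up changes to environment variables.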
pyspider shows only two lines of output
Symptom: when the fetcher is started on its own, the output stays at fetcher starting...
Solution: install the redis package:
pip install redis
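To see whether the package is now visible to the interpreter that runs pyspider, a small check like the following can help. It is only a sketch; is_installed is a hypothetical helper, and checking pycurl alongside redis is my own addition.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can find the top-level module."""
    # find_spec only locates a top-level module; it does not import it.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("redis", "pycurl"):
        print(name, "installed" if is_installed(name) else "MISSING")
```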
How I worked out the fix
I originally learned about crawlers from Cui Qingcai's website, which begins by installing a lot of crawler-related software and packages. At the time, because of network problems (I could not get online to install some of the software or packages), I never installed those earlier ones. After rereading the relevant pages, I saw that redis is installed before pyspider, so I tried installing it, and it turned out to be