Configuring pyspider
Environment: pyspider, CentOS 7.4, Python 3.6.5
The problem
While starting pyspider, I ran into the following problem.
The output was:
[root@AY131203102210033c39Z ~]# pyspider
[W 180813 11:23:41 run:413] phantomjs not found, continue running without it.
[I 180813 11:23:44 result_worker:49] result_worker starting...
Process Process-4:
Traceback (most recent call last):
File "/root/.pyenv/versions/3.6.5/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/root/.pyenv/versions/3.6.5/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 236, in fetcher
Fetcher = load_cls(None, None, fetcher_cls)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 48, in load_cls
return utils.load_object(value)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/libs/utils.py", line 369, in load_object
module = __import__(module_name, globals(), locals(), [object_name])
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
from .tornado_fetcher import Fetcher
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/tornado_fetcher.py", line 30, in <module>
from tornado.curl_httpclient import CurlAsyncHTTPClient
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tornado/curl_httpclient.py", line 24, in <module>
import pycurl # type: ignore
ImportError: pycurl: libcurl link-time ssl backend (nss) is different from compile-time ssl backend (openssl)
[I 180813 11:23:44 processor:211] processor starting...
[I 180813 11:23:45 scheduler:647] scheduler starting...
Traceback (most recent call last):
File "/root/.pyenv/versions/3.6.5/bin/pyspider", line 11, in <module>
load_entry_point('pyspider==0.3.10', 'console_scripts', 'pyspider')()
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 754, in main
cli()
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 1043, in invoke
return Command.invoke(self, ctx)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 165, in cli
ctx.invoke(all)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 497, in all
ctx.invoke(webui, **webui_config)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 333, in webui
app = load_cls(None, None, webui_instance)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 48, in load_cls
return utils.load_object(value)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/libs/utils.py", line 369, in load_object
module = __import__(module_name, globals(), locals(), [object_name])
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/webui/__init__.py", line 8, in <module>
from . import app, index, debug, task, result, login
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/webui/app.py", line 17, in <module>
from pyspider.fetcher import tornado_fetcher
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
from .tornado_fetcher import Fetcher
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/tornado_fetcher.py", line 30, in <module>
from tornado.curl_httpclient import CurlAsyncHTTPClient
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tornado/curl_httpclient.py", line 24, in <module>
import pycurl # type: ignore
ImportError: pycurl: libcurl link-time ssl backend (nss) is different from compile-time ssl backend (openssl)
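The error message points to an SSL backend mismatch: pycurl was compiled against OpenSSL, but the libcurl it links to at run time uses NSS.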
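To make the mismatch explicit, the two backends can be pulled out of the ImportError text. This is only a minimal sketch: the helper name parse_backend_mismatch and the regular expression are my own, written against the exact message in the traceback above, and are not part of pycurl or pyspider.

```python
import re

# Pattern written against the pycurl ImportError message seen in the traceback.
# The helper below is hypothetical; it is not a pycurl or pyspider API.
_MISMATCH = re.compile(
    r"link-time ssl backend \((?P<link>[^)]+)\) is different from "
    r"compile-time ssl backend \((?P<compile>[^)]+)\)"
)


def parse_backend_mismatch(message):
    """Return (link_time, compile_time) backends, or None if the text does not match."""
    match = _MISMATCH.search(message)
    if match is None:
        return None
    return match.group("link"), match.group("compile")


# The message is copied from the log above.
msg = ("pycurl: libcurl link-time ssl backend (nss) is different from "
       "compile-time ssl backend (openssl)")
print(parse_backend_mismatch(msg))
```

The first value is the backend used by the libcurl found at run time and the second is the one pycurl was built for, so the fix is to rebuild pycurl for the first one.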
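Analysis
When I installed OpenSSL earlier, I had configured the pycurl environment variable to use OpenSSL, so pycurl was built against OpenSSL, while the libcurl on this system is linked against NSS.
The fix is to rebuild pycurl against NSS and then reinstall pyspider: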
pip uninstall pycurl
pip install --no-cache-dir --compile --ignore-installed --install-option="--with-nss" pycurl
vim ~/.bashrc
Change the corresponding line to: export PYCURL_SSL_LIBRARY=nss
source ~/.bashrc
pip uninstall pyspider
pip install pyspider
After completing all of these steps, start pyspider again.
A correct startup produces output like the following:
[root@xxx~]# pyspider
[W 180813 11:31:52 run:413] phantomjs not found, continue running without it.
[I 180813 11:31:54 result_worker:49] result_worker starting...
[I 180813 11:31:55 processor:211] processor starting...
[I 180813 11:31:55 tornado_fetcher:638] fetcher starting...
[I 180813 11:31:55 scheduler:647] scheduler starting...
[I 180813 11:31:55 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 180813 11:31:55 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 180813 11:31:55 app:76] webui running on 0.0.0.0:5000
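The rebuild can also be checked without starting pyspider, by importing pycurl and reading its version string, which lists the libraries it was linked against. The sketch below is a rough illustration: ssl_backend_from_version is a hypothetical helper of my own, the list of backend names is an assumption, and the sample string is made up for illustration rather than taken from this machine. Only pycurl.version is an actual pycurl attribute.

```python
def ssl_backend_from_version(version_string):
    """Guess the SSL backend named in a pycurl.version-style string.

    Hypothetical helper; the set of backend names is an assumption.
    """
    for token in version_string.split():
        name = token.split("/", 1)[0].lower()
        if name in ("nss", "openssl", "gnutls", "libressl", "boringssl"):
            return name
    return None


# Illustrative sample only, not output captured from the system in this post.
sample = "PycURL/7.43.0.2 libcurl/7.29.0 NSS/3.34 zlib/1.2.7"
print(ssl_backend_from_version(sample))

try:
    import pycurl
except ImportError as exc:
    # A remaining backend mismatch would surface here as the same ImportError.
    print("pycurl still fails to import:", exc)
else:
    print(pycurl.version)
    print("backend:", ssl_backend_from_version(pycurl.version))
```

If the import succeeds and the reported backend matches the one libcurl uses (NSS in this case), the fetcher and webui should load without the earlier traceback.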
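Summary
Overall, pyspider's handling of its runtime environment still leaves room for improvement: installing and starting it surfaced quite a few problems, and these are worth fixing. Even so, pyspider is a great project.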