1. What is scrapyd
Scrapyd is a service for running Scrapy crawlers.
It lets you deploy your Scrapy projects and control their spiders through an HTTP JSON API.
Official documentation: http://scrapyd.readthedocs.org/
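For example, the JSON API can be poked with any HTTP client; a minimal sketch with curl, assuming scrapyd is already running locally on the default port 6800:

```shell
# daemonstatus.json reports whether the service is up and its job counts
curl http://127.0.0.1:6800/daemonstatus.json
```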
2. Installing scrapyd and scrapyd-client
pip install scrapyd (server)
pip install scrapyd-client (client)
After installation, a scrapyd.exe appears under the Scripts folder of the Python installation directory. Run scrapyd in a command-line window; its log output indicates the service has started.
Then open http://127.0.0.1:6800 in a browser and you will see a very simple status page.
Seeing this page means the server side was installed successfully. Now test the client: after installing scrapyd-client, run scrapyd-deploy to check whether it was installed correctly.
On Windows this fails, so check whether the Python Scripts directory contains a scrapyd-deploy executable. There is a scrapyd-deploy file, but it is not an executable: opening it reveals plain Python code. So to run it, either have the Python interpreter execute the file, or compile it into an executable. There are two ways.
The first way:
In this folder, create a scrapyd-deploy.bat file and put the following in it:
@echo off
C:\Users\18065\AppData\Local\Programs\Python\Python37-32\python.exe C:\Users\18065\AppData\Local\Programs\Python\Python37-32\Scripts\scrapyd-deploy %*
On the second line, the first path is the absolute path of the Python interpreter and the second is the absolute path of the scrapyd-deploy file. Then run the scrapyd-deploy command again;
if it responds, the installation succeeded.
The second way:
Use pyinstaller, a module that can compile the Python source file into a scrapyd-deploy.exe executable.
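A minimal sketch of this route, assuming pyinstaller is not yet installed and that you run the commands from the Scripts folder containing the scrapyd-deploy script:

```shell
pip install pyinstaller
# --onefile bundles the script into a single executable under the dist folder
pyinstaller --onefile scrapyd-deploy
```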
3. Uploading the crawler project
Before uploading, you must tweak the configuration: in scrapy.cfg, uncomment the url line (remove the leading #) so that scrapyd-deploy knows which server to target.
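After the change, the [deploy] section of scrapy.cfg looks roughly like this ("myproject" stands in for your actual project name):

```ini
[deploy]
url = http://localhost:6800/
project = myproject
```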
With the configuration in place, you can start the actual upload. The upload must be run from inside the crawler project directory; there,
execute the deploy command.
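A sketch of the deploy invocation: "default" is the deploy target name from scrapy.cfg (an unnamed [deploy] section is called default), and "myproject" is a placeholder project name.

```shell
# run from the project root, next to scrapy.cfg
scrapyd-deploy default -p myproject
```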
4. Running the crawler project
After uploading, you can start the spider from a command-line window.
Start command:
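The start request goes to scrapyd's schedule.json endpoint; a sketch with placeholder project and spider names:

```shell
curl http://127.0.0.1:6800/schedule.json -d project=myproject -d spider=myspider
```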
After starting, the command-line window running the scrapyd service keeps printing the Scrapy project's runtime output. The http://127.0.0.1:6800/jobs page shows the
crawl's run information, and the http://127.0.0.1:6800/logs/ page shows the run logs.
5. Stopping the crawler
Stop command:
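Stopping goes through scrapyd's cancel.json endpoint; a sketch where JOB_ID is a placeholder for the job id shown on the /jobs page:

```shell
curl http://127.0.0.1:6800/cancel.json -d project=myproject -d job=JOB_ID
```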