Python3 Web Crawler in Action - 13: Deployment-Related Libraries ScrapydClient and ScrapydAPI

ScrapydClient installation

When deploying Scrapy code to a remote Scrapyd server, the first step is to package the code files into an Egg file, and the second is to upload that Egg file to the remote host. We could implement this process with our own program, but we don't need to do this work ourselves, because ScrapydClient has already implemented these functions for us.
Let's take a look at the installation process of ScrapydClient.

1. Links

2. Pip installation

It is recommended to install it via pip. The command is as follows:

pip3 install scrapyd-client

3. Verify the installation

After a successful installation, a command called scrapyd-deploy will be available; this is the deployment command.
We can run the following command to check whether ScrapydClient was installed successfully:

scrapyd-deploy -h

If output similar to that shown in Figure 1-87 appears, ScrapydClient has been installed successfully:

Figure 1-87 Running result

We will learn more about its usage later.
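As a preview of how scrapyd-deploy is typically used: it reads deployment targets from the scrapy.cfg file in the root of a Scrapy project. The target name, URL, and project name below are hypothetical placeholders, not values from this article:

```ini
# scrapy.cfg in the root of a Scrapy project (hypothetical example values)
[settings]
default = myproject.settings

# A deployment target named "demo" pointing at a Scrapyd server
[deploy:demo]
url = http://localhost:6800/
project = myproject
```

With such a target defined, running `scrapyd-deploy demo -p myproject` packages the project as an Egg and uploads it to the Scrapyd server at the target URL.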

ScrapydAPI installation

After installing Scrapyd, we can directly request the API it provides to get the running status of the Scrapy tasks on the current host.

For example, suppose the host's IP is 192.168.1.1; you can then run the following command (using localhost on the host itself, or substituting the host's IP from another machine) to get all the Scrapy projects on that host:

curl http://localhost:6800/listprojects.json

Running result:

{"status": "ok", "projects": ["myproject", "otherproject"]}

The result is a JSON string; by parsing this string we can get all the projects on the current host.
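The JSON response can be parsed with Python's standard json module. A minimal sketch using the sample response above:

```python
import json

# Sample response body as returned by Scrapyd's listprojects.json endpoint
response_text = '{"status": "ok", "projects": ["myproject", "otherproject"]}'

data = json.loads(response_text)      # parse the JSON string into a dict
if data["status"] == "ok":
    projects = data["projects"]       # list of project names on the host
    print(projects)                   # → ['myproject', 'otherproject']
```

In practice the response text would come from an HTTP request to the Scrapyd host rather than a hardcoded string.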

However, getting task status this way is still a little cumbersome, so ScrapydAPI wraps these APIs in another layer. Let's look at its installation.

1. Links

2. Pip installation

It is recommended to install it via pip. The command is as follows:

pip install python-scrapyd-api

3. Verify the installation

After it is installed, we can use Python to get the host's status; the operation above can be implemented with the following Python code:

from scrapyd_api import ScrapydAPI
scrapyd = ScrapydAPI('http://localhost:6800')
print(scrapyd.list_projects())

Running result:

["myproject", "otherproject"]

In this way, we can directly get the running status of each Scrapy task on the host with Python.
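For example, Scrapyd's listjobs.json endpoint (also wrapped by ScrapydAPI's list_jobs method) reports jobs grouped into pending, running, and finished. A small sketch that summarizes such a response; the job IDs and spider names below are made-up sample data in the shape Scrapyd returns:

```python
# Hypothetical sample in the shape of Scrapyd's listjobs.json response;
# in practice this dict would come from ScrapydAPI's list_jobs(project).
jobs = {
    "pending": [{"id": "78391cc0fcaf11e1b009", "spider": "spider1"}],
    "running": [{"id": "422e608f9f28cef127b3", "spider": "spider2"}],
    "finished": [{"id": "2f16646cfcaf11e1b009", "spider": "spider3"}],
}

# Count the jobs in each state for a quick view of the host's workload
summary = {state: len(entries) for state, entries in jobs.items()}
print(summary)  # → {'pending': 1, 'running': 1, 'finished': 1}
```

This kind of summary makes it easy to monitor several Scrapyd hosts at once, which is exactly the convenience the ScrapydAPI wrapper provides over raw curl requests.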


Origin blog.51cto.com/14445003/2425408