Learning Scrapy: Pipelines

A pipeline must implement the process_item() method:
process_item(self, item, spider)

This method filters, cleans, or otherwise processes the scraped data.
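As a minimal sketch of such a method, the hypothetical PricePipeline below discards items that have no price and rounds the rest. The "price" field is an assumption made for illustration. In Scrapy, returning the item passes it to the next pipeline, while raising DropItem discards it. The fallback class only exists so the sketch also runs where Scrapy is not installed.

```python
try:
    from scrapy.exceptions import DropItem
except ImportError:
    # Stand-in so this sketch also runs without Scrapy installed.
    class DropItem(Exception):
        pass


class PricePipeline:
    """Hypothetical pipeline: drop items without a price, round the others."""

    def process_item(self, item, spider):
        price = item.get("price")  # "price" is an assumed field name
        if price is None:
            # Raising DropItem tells Scrapy to discard this item.
            raise DropItem("missing price")
        item["price"] = round(float(price), 2)
        # Returning the item hands it to the next pipeline in the chain.
        return item
```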

open_spider(self, spider)

Called when the spider starts running.

close_spider(self, spider)

Called when the spider finishes.
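These two hooks are a natural place to acquire and release resources. The sketch below, assuming a hypothetical output file named items.jl, opens the file when the spider starts, writes one JSON line per item, and closes the file when the spider finishes.

```python
import json


class JsonWriterPipeline:
    """Hypothetical pipeline: write each item as one JSON line."""

    path = "items.jl"  # assumed output file name

    def open_spider(self, spider):
        # Runs once when the spider starts: acquire the resource.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Runs once when the spider finishes: release the resource.
        self.file.close()

    def process_item(self, item, spider):
        self.file.write(json.dumps(dict(item)) + "\n")
        return item
```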

from_crawler(cls, crawler)

If present, this classmethod is called to create a pipeline instance from a Crawler. It must return a new instance of the pipeline. The Crawler object provides access to all Scrapy core components, such as settings and signals; it is the way for a pipeline to access them and hook its functionality into Scrapy.
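One common use of from_crawler is to read configuration from the project settings. The sketch below assumes a hypothetical custom setting MIN_PRICE and a hypothetical "cheap" flag on each item; crawler.settings.getfloat is the Scrapy settings accessor that returns a float with a default.

```python
class ThresholdPipeline:
    """Hypothetical pipeline configured from the project settings."""

    def __init__(self, min_price):
        self.min_price = min_price

    @classmethod
    def from_crawler(cls, crawler):
        # MIN_PRICE is an assumed custom setting, not a built-in one.
        min_price = crawler.settings.getfloat("MIN_PRICE", 0.0)
        # from_crawler must return a new pipeline instance.
        return cls(min_price=min_price)

    def process_item(self, item, spider):
        # Flag items priced below the configured threshold.
        item["cheap"] = item.get("price", 0) < self.min_price
        return item
```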

To activate an Item Pipeline component you must add its class to the ITEM_PIPELINES setting, like in the following example:

ITEM_PIPELINES = {
    'myproject.pipelines.PricePipeline': 300,
    'myproject.pipelines.JsonWriterPipeline': 800,
}
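
The integer assigned to each class sets the order in which items pass through the pipelines: lower values run first, and the values are conventionally kept in the 0 to 1000 range. The snippet below is only an illustration of that ordering, not how Scrapy builds the chain internally.

```python
ITEM_PIPELINES = {
    "myproject.pipelines.PricePipeline": 300,
    "myproject.pipelines.JsonWriterPipeline": 800,
}

# Sort the class paths by their priority value, lowest first.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)

for path in run_order:
    print(ITEM_PIPELINES[path], path)
```

With these values, every item goes through PricePipeline before it reaches JsonWriterPipeline.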


Reposted from www.cnblogs.com/jack-jt-z/p/10527404.html