Description: scrapy-redis automatically saves request URLs to Redis while crawling, but the Redis data structure used to store them depends on the queue class selected in the settings file.
Note: if the command used to read the URLs does not match the type they were saved as, Redis reports an error.
For example, running a list command against URLs that were stored as a zset:
List command: lpush key value (or lrange key start end to read)
zset command: zrange key start end
(error) WRONGTYPE Operation against a key holding the wrong kind of value
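One way to avoid the WRONGTYPE error is to ask Redis for the key's type first and then pick the matching read command. The following is a minimal sketch, not part of scrapy-redis: the helper name read_urls is hypothetical, and client is assumed to be a redis-py style client exposing type, lrange, zrange and smembers.

```python
def read_urls(client, key, start=0, end=-1):
    """Read the members of `key` with a command that matches its Redis type.

    `client` is assumed to behave like a redis-py client (hypothetical helper,
    not part of scrapy-redis).
    """
    key_type = client.type(key)
    # redis-py returns bytes unless the client was created with decode_responses=True.
    if isinstance(key_type, bytes):
        key_type = key_type.decode()

    if key_type == "list":
        return client.lrange(key, start, end)
    if key_type == "zset":
        return client.zrange(key, start, end)
    if key_type == "set":
        # Sets are unordered, so start/end do not apply; sort for a stable result.
        return sorted(client.smembers(key))
    if key_type == "none":
        return []  # the key does not exist
    raise TypeError(f"unsupported Redis type {key_type!r} for key {key!r}")
```

With redis-py installed and a server running, this could be called as read_urls(redis.Redis(), "myspider:requests"), where the key name is only an example.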
First, the three queue classes:
# Queue class used to order the URLs to be crawled.
# Default: ordered by priority (Scrapy's default); implemented with a sorted set, neither FIFO nor LIFO.
# SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderPriorityQueue'
# Optional: first in, first out (FIFO)
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderQueue'
# Optional: last in, first out (LIFO)
# SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderStack'
Second, the Redis data type used to store the URLs for each queue:
Priority queue (scrapy-redis default): zset
First In First Out (FIFO): list
Last In First Out (LIFO): list
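The mapping above can be written down as a small lookup, which makes it easy to check which Redis type to expect for a given setting. This is only an illustrative sketch: the names QUEUE_REDIS_TYPE and expected_redis_type are hypothetical, and settings is assumed to be a plain dict rather than a Scrapy Settings object.

```python
# Queue class path -> Redis data type, following the list above.
QUEUE_REDIS_TYPE = {
    "scrapy_redis.queue.SpiderPriorityQueue": "zset",
    "scrapy_redis.queue.SpiderQueue": "list",
    "scrapy_redis.queue.SpiderStack": "list",
}

# The priority queue is used when SCHEDULER_QUEUE_CLASS is not set.
DEFAULT_QUEUE_CLASS = "scrapy_redis.queue.SpiderPriorityQueue"


def expected_redis_type(settings):
    """Return the Redis type the scheduler queue should have for these settings."""
    queue_class = settings.get("SCHEDULER_QUEUE_CLASS", DEFAULT_QUEUE_CLASS)
    return QUEUE_REDIS_TYPE[queue_class]
```

For example, expected_redis_type({"SCHEDULER_QUEUE_CLASS": "scrapy_redis.queue.SpiderQueue"}) gives "list", so the queue should be inspected with list commands rather than zrange.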
Third, choosing how URLs are read from Redis:
REDIS_START_URLS_AS_SET = True
Add this line to the project's settings file.
True: the start URLs key is treated as a Redis set and read with set commands
False: the start URLs key is treated as a Redis list and read with list commands
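Whatever writes the start URLs into Redis has to use the same type that this setting tells the spider to read. A minimal sketch of that choice follows; the helper name push_start_url is hypothetical, and client is assumed to be a redis-py style client exposing sadd and lpush.

```python
def push_start_url(client, key, url, as_set=False):
    """Add a start URL using the type that matches REDIS_START_URLS_AS_SET.

    Hypothetical helper: `as_set` should mirror the value of
    REDIS_START_URLS_AS_SET in the spider's settings.
    """
    if as_set:
        # The spider reads the key as a set, so write with a set command.
        return client.sadd(key, url)
    # The spider reads the key as a list, so write with a list command.
    return client.lpush(key, url)
```

With redis-py this might be called as push_start_url(redis.Redis(), "myspider:start_urls", "https://example.com", as_set=True); the key name and URL are placeholders.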