Debugging methods in Scrapy

Parse command, Scrapy shell, logging

1. Parse command

  The most basic way to inspect spider output is the parse command. It lets you check the behavior of different parts of the spider at the method level. It is flexible and simple to use, but it cannot debug the code inside a method.

  https://docs.scrapy.org/en/latest/topics/commands.html#std:command-parse
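
  As a rough illustration, the sketch below assembles a parse command line for the example spider used later in this article. The helper name build_parse_command is hypothetical. The options shown (--spider, -c for the callback, -d for the depth) are standard options of scrapy parse, and the command has to be run from inside the Scrapy project directory.

```python
import shlex

# The list-page URL pattern and spider name are taken from the example spider
# in this article; the helper itself is hypothetical.
START_URL = "http://www.521609.com/daxuexiaohua/list31{}.html"


def build_parse_command(spider, url, callback="parse", depth=1):
    """Return the argv list for `scrapy parse` against one URL."""
    return [
        "scrapy", "parse",
        f"--spider={spider}",   # which spider to use
        "-c", callback,         # spider method used to parse the response
        "-d", str(depth),       # how many levels of requests to follow
        url,
    ]


command = build_parse_command("xiaohua", START_URL.format(1))
# Print a copy-pasteable command line; run it from the Scrapy project directory.
print(shlex.join(command))
```

  The same list can be passed to subprocess.run if the check should be scripted rather than typed by hand.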

2. Scrapy shell

  The basic usage is to open a page in the shell and call view(response) to see the data exactly as Scrapy received it.

  The more advanced usage is to call scrapy.shell.inspect_response at a chosen point in the spider. This opens a shell on the response being processed there, so you can confirm that the expected response actually reaches that point.

  In effect, every response that reaches parse can be inspected with the usual shell commands, which is very useful.

import scrapy
from scrapy.shell import inspect_response

START_URL = 'http://www.521609.com/daxuexiaohua/list31{}.html'


class XiaohuaSpider(scrapy.Spider):
    name = 'xiaohua'

    def start_requests(self):
        yield scrapy.Request(url=START_URL.format(1))

    def parse(self, response):
        # Open an interactive shell on this response; the crawl resumes
        # once the shell is closed.
        inspect_response(response, self)
        items = response.css('div.list_center > ul > li')
        for item in items:
            title = item.css('a.title::text').extract_first()
            print(title)
        # Follow the pagination link if its text is '下一页' ("next page").
        next_ = response.css('div.listpage > ol > li:nth-child(14) > a::text')
        if next_.extract_first() == '下一页':
            next_url = response.css('div.listpage > ol > li:nth-child(14) > a::attr(href)').extract_first()
            abs_url = response.urljoin(next_url)
            yield scrapy.Request(url=abs_url)

3. Logging
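
  Scrapy's logging is built on Python's standard logging module, and every spider exposes self.logger, a logger named after the spider. As a minimal sketch, the print(title) call in the spider above could be replaced by log calls like the ones below. The helper name log_titles is hypothetical, and logging.getLogger("xiaohua") stands in for self.logger so that the snippet runs without Scrapy.

```python
import logging

# Stand-in for `self.logger` inside the spider; Scrapy names that logger
# after the spider, so the same name is used here.
logger = logging.getLogger("xiaohua")


def log_titles(logger, url, titles):
    """Log what a callback extracted from one page and return the count."""
    found = [t.strip() for t in titles if t and t.strip()]
    if not found:
        logger.warning("No titles found on %s", url)
    for title in found:
        logger.debug("Title on %s: %s", url, title)
    logger.info("Extracted %d titles from %s", len(found), url)
    return len(found)


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(levelname)s %(name)s: %(message)s",
    )
    # Sample values only; in a spider these come from the response.
    log_titles(
        logger,
        "http://www.521609.com/daxuexiaohua/list311.html",
        ["First title", None, "  "],
    )
```

  Inside the spider the call would be log_titles(self.logger, response.url, titles). How much is shown can then be controlled with the LOG_LEVEL setting or the --loglevel command-line option.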
