How to fix Scrapy returning an empty list

While learning the Scrapy framework today, the following spider returned an empty list when it sent its request.

import scrapy


class Test01Spider(scrapy.Spider):
    name = "test01"
    # allowed_domains should contain bare domain names, not full URLs
    allowed_domains = ["baike.baidu.com"]
    start_urls = ["https://baike.baidu.com/item/%E7%99%BE%E5%BA%A6/6699?fromModule=lemma_search-box"]

    def parse(self, response):
        # extract() returns a list of matched text nodes; it is empty when nothing matches
        get_text = response.xpath("/html/body/div[3]/div[2]/div/div[1]/div[4]/div[3]/text()").extract()
        print(get_text)

After trying many times and confirming that the XPath itself was correct, it still returned an empty list.

Later, after searching for solutions online, I found the cause: no Cookie was set in the request headers, because Scrapy sends its own default headers unless told otherwise. The fix is to modify settings.py:

1. Uncomment COOKIES_ENABLED = False:

2. Uncomment DEFAULT_REQUEST_HEADERS and add Cookie information:
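The two steps above can be sketched as the following settings.py fragment. The Cookie value shown is a placeholder, not a real one; copy the actual Cookie header from your browser's developer tools (Network tab) for the page you are crawling:

```python
# settings.py (relevant fragment)

# Step 1: disable Scrapy's cookie middleware so the Cookie header
# set in DEFAULT_REQUEST_HEADERS below is sent as-is
COOKIES_ENABLED = False

# Step 2: default headers merged into every request;
# "YOUR_COOKIE_STRING_HERE" is a placeholder for your real browser Cookie
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
    "Cookie": "YOUR_COOKIE_STRING_HERE",
}
```

With `COOKIES_ENABLED = False`, Scrapy's cookies middleware stops managing cookies itself, so the manually supplied Cookie header passes through unchanged.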

With these changes in place, the request returns the expected information.


Origin blog.csdn.net/m0_61151031/article/details/129915140