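Multi-level crawling with Scrapy

After going back and forth for a long time, I finally decided to add multi-level crawling: scrape the listing page first, then follow each entry to its detail page.
The key lines are these: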

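# Get the detail-page URL from the listing entry
security_item['url'] = i_item.xpath(".//div[@class='row2']/h3/a/@href").extract()[0]
# Hand off to detail_parse, passing the partially filled item along in meta;
# detail_parse scrapes the detail page and returns the finished item
yield scrapy.Request(security_item['url'], meta={'security_item': security_item}, callback=self.detail_parse)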

And here is the detail_parse method that the request calls back into:

def detail_parse(self, response):
    # Pick up the item that the listing-page callback put into meta
    security_item = response.meta['security_item']
    security_item['detail'] = response.xpath("//div[@class='mianLeft']/div[@class='de_p']").xpath('string(.)').extract()[0]
    return security_item

And that solves it perfectly!
Here is a screenshot of the source code:
[Image: screenshot of the source code]


Reposted from blog.csdn.net/weixin_43525427/article/details/97140370