"" " Scrapy initial Url of two way, one is constant start_urls, and the need to define a method parse () Another method is to directly define a: star_requests () " "" Import Scrapy class simpleUrl (scrapy.Spider): = name "simpleUrl" start_urls = [# another writing, no need to define a method start_requests ' http://lab.scrapyd.cn/page/1/ ', ' http://lab.scrapyd.cn/page/2/ ' ] # writing another initial link # DEF start_requests (Self): # URLs = [# link crawling through this method in the link page crawling #' http://lab.scrapyd.cn/page/1 / ', #' http://lab.scrapyd.cn/page/2/ ', #] # for url in urls: # Scrapy the yield.Request(url=url, callback=self.parse) # If the initial short url, method name must be: the parse DEF the parse (Self, Response): Page = response.url.split ( "/") [- 2] filename = 'mingyan-% s.html' Page% Open with (filename, 'WB') AS F: f.write (response.body) self.log ( 'save file:% s'% filename)
Two ways to set initial URLs in the Python crawler framework Scrapy
Origin www.cnblogs.com/stillstep/p/11099809.html