1. In middlewares.py, modify the process_request method of ****DownloaderMiddleware:
import random

from scrapy import signals


class Books2DownloaderMiddleware(object):
    # Not all methods need to be defined. If a method is not defined,
    # scrapy acts as if the downloader middleware does not modify the
    # passed objects.

    # Proxy pool (IP source: 西刺代理)
    requestIP = [
        {"ipaddr": "111.155.116.237:8123"},
        {"ipaddr": "101.236.23.202:8866"},
    ]

    @classmethod
    def from_crawler(cls, crawler):
        # This method is used by Scrapy to create your spiders.
        s = cls()
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s

    # The downloader middleware sits between the engine and the downloader
    def process_request(self, request, spider):
        # Called for each request that goes through the downloader
        # middleware.
        # Must either:
        # - return None: continue processing this request
        # - or return a Response object
        # - or return a Request object
        # - or raise IgnoreRequest: process_exception() methods of
        #   installed downloader middleware will be called

        # Pick a random proxy from the pool and attach it to the request
        currentIP = random.choice(self.requestIP)
        print("currentIP:" + currentIP["ipaddr"])
        request.meta["proxy"] = "http://" + currentIP["ipaddr"]

    def spider_opened(self, spider):
        # Part of the default Scrapy template; from_crawler connects to it
        spider.logger.info('Spider opened: %s' % spider.name)

2. In settings.py, uncomment the following setting:

DOWNLOADER_MIDDLEWARES = {
    'books2.middlewares.Books2DownloaderMiddleware': 543,
}
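
To check the proxy selection from step 1 without starting a crawl, something like the sketch below may help. It is not part of the original project: FakeRequest, RandomProxyMiddleware and assign_proxies are hypothetical names made up for illustration, and FakeRequest only imitates the meta dictionary of a real Scrapy request. The selection logic itself is the same as in process_request above.

```python
import random


class FakeRequest:
    """Hypothetical stand-in for scrapy.Request; only `meta` matters here."""

    def __init__(self, url):
        self.url = url
        self.meta = {}


class RandomProxyMiddleware:
    """Same selection logic as process_request in step 1, minus the Scrapy wiring."""

    requestIP = [
        {"ipaddr": "111.155.116.237:8123"},
        {"ipaddr": "101.236.23.202:8866"},
    ]

    def process_request(self, request, spider):
        # Pick one proxy at random and attach it to the request
        currentIP = random.choice(self.requestIP)
        request.meta["proxy"] = "http://" + currentIP["ipaddr"]
        return None  # None tells Scrapy to keep processing the request


def assign_proxies(n):
    """Run n fake requests through the middleware and collect the chosen proxies."""
    mw = RandomProxyMiddleware()
    proxies = []
    for i in range(n):
        req = FakeRequest("http://example.com/page/%d" % i)
        mw.process_request(req, spider=None)
        proxies.append(req.meta["proxy"])
    return proxies


if __name__ == "__main__":
    for proxy in assign_proxies(5):
        print(proxy)
```

Running it prints five proxy URLs drawn at random from the pool, which is roughly what the currentIP lines in the crawl log should look like once the middleware is enabled.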
3. Result:
When the proxy IP is unstable: