Setting an IP proxy in Scrapy

1. Modify the process_request method of the DownloaderMiddleware class in middlewares.py:

import random

from scrapy import signals


class Books2DownloaderMiddleware(object):
    # Not all methods need to be defined. If a method is not defined,
    # scrapy acts as if the downloader middleware does not modify the
    # passed objects.

    # Proxy pool (IP source: Xici proxy)
    requestIP = [{"ipaddr": "111.155.116.237:8123"},
                 {"ipaddr": "101.236.23.202:8866"},
                 ]

    @classmethod
    def from_crawler(cls, crawler):
        # This method is used by Scrapy to create your spiders.
        s = cls()
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s

    # Downloader middleware: sits between the engine and the downloader
    def process_request(self, request, spider):
        # Called for each request that goes through the downloader
        # middleware.
        # Must either:
        # - return None: continue processing this request
        # - or return a Response object
        # - or return a Request object
        # - or raise IgnoreRequest: process_exception() methods of
        #   installed downloader middleware will be called

        # Pick a random proxy from the pool and attach it to the request;
        # returning None lets the request continue with the proxy applied.
        currentIP = random.choice(self.requestIP)
        print("currentIP:" + currentIP["ipaddr"])
        request.meta["proxy"] = "http://" + currentIP["ipaddr"]

    def spider_opened(self, spider):
        spider.logger.info("Spider opened: %s" % spider.name)
2. In settings.py, uncomment the DOWNLOADER_MIDDLEWARES setting so the middleware is enabled:

DOWNLOADER_MIDDLEWARES = {
   'books2.middlewares.Books2DownloaderMiddleware': 543,
}
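
To verify that requests really go out through one of the proxies, a minimal throw-away spider can fetch a service that echoes the caller's IP. This is a sketch and not part of the original post; the spider name and the use of http://httpbin.org/ip are assumptions.

import scrapy


class ProxyCheckSpider(scrapy.Spider):
    # Hypothetical spider used only to confirm the proxy middleware works
    name = "proxy_check"
    start_urls = ["http://httpbin.org/ip"]

    def parse(self, response):
        # httpbin.org/ip returns JSON like {"origin": "111.155.116.237"};
        # if the middleware is active, the origin should be the proxy's IP.
        self.log("Outgoing IP response: %s" % response.text)

Run it with scrapy crawl proxy_check and compare the reported origin against the addresses in requestIP.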
3. Result: each request prints a different currentIP from the pool. When a proxy IP is unstable, requests sent through it time out or fail with connection errors.


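A common way to cope with an unstable proxy, sketched here as an assumption rather than something shown in the original post, is to add a process_exception method to the same Books2DownloaderMiddleware: when a request fails with a connection error, return a copy of it so Scrapy re-queues it and process_request picks a different random proxy on the retry.

    def process_exception(self, request, exception, spider):
        # Called when downloading through the current proxy raised an exception.
        bad_proxy = request.meta.get("proxy")
        spider.logger.warning("Proxy %s failed (%r); retrying with another proxy" % (bad_proxy, exception))
        # Returning a Request re-schedules it; process_request will then
        # choose a new random proxy for the retried request.
        return request.replace(dont_filter=True)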
Reposted from blog.csdn.net/rookie_is_me/article/details/81001083