Setting a random User-Agent
Edit middlewares.py
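from fake_useragent import UserAgent

class RandomUserAgentMiddleware(object):
    def __init__(self):
        self.ua = UserAgent()  # build the UA pool once, not on every request

    def process_request(self, request, spider):
        # Replace the User-Agent header with a randomly chosen one
        request.headers['User-Agent'] = self.ua.random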
Edit settings.py
# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
    'scrapy_test.middlewares.RandomUserAgentMiddleware': 543,
}
Setting an IP proxy
Test site: http://icanhazip.com, which returns the IP address that the current request came from.
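The notes above name a test site but give no proxy code. The following is a minimal sketch of one common approach: a downloader middleware that sets request.meta['proxy'], which Scrapy's built-in HttpProxyMiddleware honors. The class name RandomProxyMiddleware and the PROXY_POOL addresses are hypothetical placeholders, not part of the original project.

```python
import random

# Hypothetical proxy pool (assumption): replace with proxies you actually run.
PROXY_POOL = [
    "http://127.0.0.1:8888",
    "http://127.0.0.1:8889",
]


class RandomProxyMiddleware:
    """Downloader middleware that assigns a random proxy to each request."""

    def process_request(self, request, spider):
        # Scrapy routes the request through the URL stored in meta['proxy'].
        request.meta["proxy"] = random.choice(PROXY_POOL)
        # Returning None lets the request continue down the middleware chain.
        return None
```

To try it, the class would be registered in DOWNLOADER_MIDDLEWARES in the same way as RandomUserAgentMiddleware; a request to http://icanhazip.com should then return the proxy's IP address rather than your own.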
Adding a Referer header
default_headers = {
    'referer': 'https://www.baidu.com/',
}
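The notes do not say how default_headers reaches the outgoing requests. One possible way, sketched below, is a small downloader middleware that copies the dictionary onto every request. The class name StaticRefererMiddleware is a hypothetical choice for illustration. Scrapy also offers the DEFAULT_REQUEST_HEADERS setting in settings.py for the same purpose.

```python
# Same dictionary as in the notes above.
default_headers = {
    "referer": "https://www.baidu.com/",
}


class StaticRefererMiddleware:
    """Hypothetical middleware that applies default_headers to each request."""

    def process_request(self, request, spider):
        for name, value in default_headers.items():
            # setdefault keeps any header the spider has already set itself.
            request.headers.setdefault(name, value)
        # Returning None lets the request continue down the middleware chain.
        return None
```

Like the other middlewares here, it only takes effect once it is listed in DOWNLOADER_MIDDLEWARES.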