robots.txt configuration: reducing server load

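Past experience showed that many foreign crawlers were fetching pages from the site. To reduce the load, I edited robots.txt directly so that only the search engines listed below may crawl, and every other crawler is refused: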

# xiaowu
User-agent: Baiduspider
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: Sosospider
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: sogou spider
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: Googlebot
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: Bingbot
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: MSNBot
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: googlebot-mobile
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: 360Spider
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: HaosouSpider
Allow: /
Disallow: /admin/
Disallow: /*.php$

User-agent: *
Disallow: /
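
Since the nine allowed groups are identical apart from the crawler name, the file can also be generated rather than maintained by hand. The following is only a minimal sketch in Python: the helper names build_robots_txt and is_allowed are hypothetical, and the crawler list is copied from the file above. It then uses the standard library's urllib.robotparser to check the whitelist behaviour. Note that this parser applies the first matching rule in a group and does not interpret the `*` and `$` wildcards, so the sketch checks only which crawlers are admitted at all, not the /admin/ or .php rules.

```python
from urllib.robotparser import RobotFileParser

# Crawler names copied from the robots.txt above.
ALLOWED_AGENTS = [
    "Baiduspider", "Sosospider", "sogou spider", "Googlebot", "Bingbot",
    "MSNBot", "googlebot-mobile", "360Spider", "HaosouSpider",
]
# Rules shared by every allowed crawler, in the same order as the original file.
SHARED_RULES = ["Allow: /", "Disallow: /admin/", "Disallow: /*.php$"]


def build_robots_txt(agents=ALLOWED_AGENTS, rules=SHARED_RULES):
    """Return robots.txt text: one group per allowed agent, then a catch-all block."""
    groups = ["\n".join([f"User-agent: {agent}", *rules]) for agent in agents]
    groups.append("User-agent: *\nDisallow: /")
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, agent, path):
    """Ask the stdlib parser whether `agent` may fetch `path` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


robots_txt = build_robots_txt()
print(is_allowed(robots_txt, "Googlebot", "/index.html"))       # a listed crawler
print(is_allowed(robots_txt, "SomeUnknownBot", "/index.html"))  # an unlisted crawler
```

Writing the returned text to the site root as robots.txt would reproduce the configuration above. Keep in mind that robots.txt is only honoured by crawlers that choose to respect it.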

Reposted from blog.csdn.net/z13615480737/article/details/87929319