Python Web Scraping Series, Part 2: requests - Setting headers (3)

Other 2018-10-14 08:19:50

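1. Why set headers?
When you request a page with a scraper, the returned text sometimes contains a message such as "抱歉，无法访问" (Sorry, cannot be accessed). This means the site is refusing to serve crawlers, and you have to get around its anti-scraping check. Setting headers is one way to do that with requests: the request then carries the same identifying fields a real browser would send, so to the server it looks like an ordinary browser visit rather than a script.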
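To see why a server can tell a script from a browser, it helps to look at what requests sends by default. The sketch below builds a request without sending it and prints its User-Agent. The URL is only a placeholder taken from this post, and the exact version string depends on the installed requests release.

```python
import requests

# Build the request through a Session so the library's default headers
# are merged in, but do not send it: no network access is needed.
session = requests.Session()
prepared = session.prepare_request(requests.Request("GET", "http://www.dianping.com/"))

# Without custom headers, requests identifies itself as
# "python-requests/<version>", which a site can easily recognise and block.
default_ua = prepared.headers["User-Agent"]
print(default_ua)
```

Replacing this value with a browser's User-Agent string is what the rest of this post does.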
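2. Where do you find the headers?
In Chrome or Firefox, right-click on the page and choose Inspect, open the Network tab, then press F5 (Fn+F5 on some laptop keyboards) to reload the page so that its requests are listed. Select a request to view its request headers.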
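Header lines copied from the browser's developer tools come as "Name: value" text, while requests expects a dict. A small helper can do the conversion. Note that parse_raw_headers is a hypothetical name used for illustration, not part of requests, and the sample text is assembled from the values used in this post.

```python
def parse_raw_headers(raw: str) -> dict:
    """Turn 'Name: value' lines copied from the browser into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not a 'Name: value' pair
        # (for example HTTP/2 pseudo-headers such as ':authority').
        if not line or line.startswith(":") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers


raw = """
Host: www.dianping.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36
"""
headers = parse_raw_headers(raw)
print(headers["Host"])
```

The resulting dict can be passed directly as the headers argument of requests.get().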

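3. The request headers contain many fields. The ones used most often are User-Agent and Host, and each is shown as a key-value pair. If passing a dict that contains only User-Agent as the headers is enough to get past the block, no other pairs are needed; otherwise, add more of the key-value pairs listed under the request headers.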

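import requests

# Request the page without any custom headers
res = requests.get("http://www.dianping.com/")
print(res.text)
# The output contains "抱歉！页面无法访问...." (Sorry! The page cannot be accessed): the crawler is being blocked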

# Fix: build a headers dict and pass it to requests.get() through the headers argument

import requests

# Header fields copied from the browser's request headers
headers = {
    "Host": "www.dianping.com",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36",
}
res = requests.get("http://www.dianping.com/", headers=headers)
print(res.text)
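
The approach in step 3 (try User-Agent alone, then add more fields only if the site still refuses) can be sketched as a small loop. The names is_blocked, fetch_with_fallback, HEADER_SETS and BLOCK_MARKER are hypothetical, and treating an HTTP error status or the phrase from the blocked page as the signs of blocking is an assumption; a real site may signal it differently.

```python
import requests

UA = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36")

# Candidate header sets, from minimal to more complete.
HEADER_SETS = [
    {"User-Agent": UA},
    {"User-Agent": UA, "Host": "www.dianping.com"},
]

# Assumption: a blocked page contains this phrase, as in the output shown earlier.
BLOCK_MARKER = "页面无法访问"


def is_blocked(status_code: int, text: str) -> bool:
    """Heuristic check: an HTTP error status or the block phrase in the body."""
    return status_code >= 400 or BLOCK_MARKER in text


def fetch_with_fallback(url: str, header_sets=HEADER_SETS, timeout: float = 10.0):
    """Try each header set in turn; return the first unblocked response, else None."""
    for headers in header_sets:
        res = requests.get(url, headers=headers, timeout=timeout)
        if not is_blocked(res.status_code, res.text):
            return res
    return None
```

Calling fetch_with_fallback("http://www.dianping.com/") would send at most two requests. Whether either one gets through depends on the site's current rules.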


Reposted from blog.csdn.net/qq_42787271/article/details/81571229
