Web scraping basics with requests

1. First, install the requests module: pip install requests

2. Provide the url and headers parameters:

3. In the browser, press F12 to open DevTools, switch to the Network tab, refresh the page, then click a request to see its Request Headers (this is where you can copy the User-Agent value).

The complete basic code is as follows:

import requests  # import the module
url = 'https://www.baidu.com/'  # target URL

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}  # request headers, used to mimic a browser
# response = requests.get(url=url, headers=headers).text  # returns plain text; may come out garbled

response = requests.get(url=url, headers=headers).content.decode("utf-8", "ignore")  # decode the raw bytes with an explicit encoding
# the "ignore" argument skips any bytes that cannot be decoded

with open("baidu.html", "w", encoding="utf-8") as f:  # write the page to a file
    f.write(response)
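The effect of the "ignore" error handler can be seen offline without making any request. This is a minimal sketch; the sample string and the deliberate truncation are illustrative, not part of the original post:

```python
# Deliberately corrupt a UTF-8 byte string by dropping its last byte
raw = "百度一下".encode("utf-8")[:-1]

# raw.decode("utf-8") would raise UnicodeDecodeError here;
# the "ignore" handler silently drops the undecodable tail instead
text = raw.decode("utf-8", "ignore")
print(text)  # the truncated final character is simply gone
```

This is why "ignore" is convenient for pages with a few broken bytes, at the cost of silently losing those characters.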

Complete code for a URL with parameters:

Method 1:

import requests
url = 'https://www.baidu.com/s?wd=哈士奇'
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}
response = requests.get(url=url, headers=headers).content.decode("utf-8")
with open("hashiqi.html", "w", encoding="utf-8") as f:
    f.write(response)
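When a Chinese keyword like 哈士奇 is embedded directly in the URL, requests percent-encodes it (its UTF-8 bytes become %XX escapes) before sending. The standard library's quote function shows the encoded form, with no network access needed:

```python
from urllib.parse import quote

# Each of the three characters is three UTF-8 bytes,
# so the result is nine %XX escapes
print(quote("哈士奇"))
```

Seeing the encoded form is handy when comparing your request URL against what the browser shows in the Network tab.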

Method 2:

import requests
url = 'https://www.baidu.com/s?'
params = {"wd": "边"}

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}
response = requests.get(url=url, params=params, headers=headers).content.decode("utf-8")

with open("bian1.html", "w", encoding="utf-8") as f:
    f.write(response)
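With the params argument, requests builds and percent-encodes the query string for you. You can preview the final URL offline with the standard library's urlencode, which does the equivalent encoding (this is a sketch of the idea, not something requests itself exposes here):

```python
from urllib.parse import urlencode

base = "https://www.baidu.com/s?"
# urlencode turns the dict into a percent-encoded query string,
# the same wd=... form that requests appends to the URL
query = urlencode({"wd": "哈士奇"})
print(base + query)
```

This is why Method 2 is usually preferred: you pass plain Python values and never worry about encoding them by hand.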


Reposted from blog.csdn.net/qq_40576301/article/details/99769801