Small scraper utilities: parsing headers and wrapping requests

Other 2020-03-22 16:25:16

A thin wrapper around requests. Save it as a Live Template in your IDE so you don't have to retype it every time.

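import datetime

import requests


class requests_spider(object):
    def __init__(self, url, headers):
        self.url = url
        self.headers = headers

    def get_request(self, content_type=None):
        """Send a GET request and return the body in the requested form.

        content_type=0: return the body as str.
        content_type=1: return the raw bytes (images, video, audio).
        content_type omitted: return the parsed JSON (usually a dict).
        Returns None if the request or the decoding fails.
        """
        try:
            response = requests.get(self.url, headers=self.headers)
            if content_type == 0:
                # Use the encoding detected from the body to avoid garbled text.
                response.encoding = response.apparent_encoding
                return response.text
            elif content_type == 1:
                return response.content
            else:
                return response.json()
        except Exception as e:
            # Log the failure with a timestamp, then return None explicitly.
            print("INFO: %s %s" % (datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), e))
            return None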
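The three return modes can be tried without any network access. The sketch below is only an illustration: `FakeResponse` and `pick_body` are hypothetical names that are not part of requests or of the class above. `FakeResponse` imitates just the few attributes the wrapper reads, and `pick_body` repeats the same branch logic.

```python
import json


class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only for this demo."""

    def __init__(self, body, apparent_encoding="utf-8"):
        self.content = body                      # raw bytes, like Response.content
        self.apparent_encoding = apparent_encoding
        self.encoding = None

    @property
    def text(self):
        # Decode with the currently selected encoding, defaulting to UTF-8.
        return self.content.decode(self.encoding or "utf-8")

    def json(self):
        return json.loads(self.text)


def pick_body(response, content_type=None):
    """Repeat the wrapper's branch logic without touching the network."""
    if content_type == 0:
        response.encoding = response.apparent_encoding
        return response.text
    if content_type == 1:
        return response.content
    return response.json()


resp = FakeResponse(b'{"name": "spider"}')
as_text = pick_body(resp, 0)    # str
as_bytes = pick_body(resp, 1)   # bytes
as_json = pick_body(resp)       # dict parsed from the JSON body
print(type(as_text).__name__, type(as_bytes).__name__, type(as_json).__name__)
```

With a real response, the JSON branch fails when the body is not valid JSON, which is why the wrapper catches the exception and logs it.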

Converting raw headers to a dict in code

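Method 1: build the dict from a sequence of pairs. dict() accepts an iterable of two-element items (lists or tuples); the first element of each item becomes the key and the second becomes the value.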

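def handle_headers(string):
    # Split each non-blank line at the first colon only, then strip both parts,
    # so spaces and colons inside values (e.g. User-Agent, Referer) are preserved.
    pairs = (line.split(":", 1) for line in string.strip().split("\n") if line.strip())
    return dict((key.strip(), value.strip()) for key, value in pairs)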
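As a small worked example of the pair-based approach, the sketch below parses a header block copied from browser dev tools. The header values and the `example.com` URLs are made up for illustration. It shows why the split is limited to one cut: a value such as a Referer URL contains colons of its own.

```python
# Sample header block; the values are invented for this demo.
raw = """
Host: example.com
Referer: https://example.com/list?page=1
User-Agent: Mozilla/5.0 (X11; Linux x86_64)
"""

pairs = []
for line in raw.strip().split("\n"):
    # maxsplit=1: only the first colon separates the key from the value.
    key, value = line.split(":", 1)
    pairs.append((key.strip(), value.strip()))

headers = dict(pairs)  # dict() accepts any iterable of two-item sequences
print(headers["Referer"])  # https://example.com/list?page=1
```

Stripping each part separately, instead of deleting every space in the line, keeps the spaces inside the User-Agent value intact.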

Method 2: fill the dict with dict.setdefault

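def handle_head(header_str):
    headers = {}
    for line in header_str.split("\n"):
        line = line.strip()
        if not line:  # skip blank lines
            continue
        key, value = line.split(":", 1)  # split on the first colon only
        headers.setdefault(key.strip(), value.strip())  # keeps the first value if a key repeats
    return headers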
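One difference between setdefault and plain assignment matters when a header name appears twice: setdefault only writes a key that is not already present. The short sketch below, using made-up Accept values, compares the two.

```python
# setdefault: the first value wins, later calls for the same key are ignored.
headers = {}
headers.setdefault("Accept", "text/html")
headers.setdefault("Accept", "application/json")
first_wins = headers["Accept"]

# Plain assignment: the last value wins.
headers2 = {}
headers2["Accept"] = "text/html"
headers2["Accept"] = "application/json"
last_wins = headers2["Accept"]

print(first_wins)  # text/html
print(last_wins)   # application/json
```

If the later value should override the earlier one, use `headers[key] = value` instead.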

Reposted from blog.csdn.net/weixin_44224529/article/details/103791129
