Python Crawler: Scraping Proxy IPs (requests + Beautiful Soup) Tutorial

Programming Languages 2019-03-14 14:10:53

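When writing a Python crawler, you want to keep the target site from detecting the crawler's IP and banning it. One way is to use Python to collect a set of proxy IPs and then rotate through them while scraping. In practice, it is best to re-scrape proxy IPs at regular intervals and add them to a proxy IP pool, while removing IPs that no longer work. For each request, pick one IP from the pool at random (random.choice).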
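The pool described above can be sketched as a small class. This is only a minimal illustration: the `ProxyPool` name and its methods are hypothetical, and proxies are assumed to be stored as `"ip:port"` strings.

```python
import random


class ProxyPool:
    """Hypothetical in-memory proxy pool holding "ip:port" strings."""

    def __init__(self, proxies=()):
        # A set avoids duplicates when the same proxy is scraped twice.
        self._proxies = set(proxies)

    def add(self, proxy):
        """Add a freshly scraped proxy to the pool."""
        self._proxies.add(proxy)

    def discard(self, proxy):
        """Remove a proxy that no longer works (no error if absent)."""
        self._proxies.discard(proxy)

    def pick(self):
        """Return one proxy at random, as the article suggests."""
        if not self._proxies:
            raise LookupError("proxy pool is empty")
        # random.choice needs a sequence, so convert the set to a list.
        return random.choice(list(self._proxies))

    def __len__(self):
        return len(self._proxies)
```

A crawler would call `add` after each scheduled scrape of the proxy list, `discard` whenever a request through a proxy fails, and `pick` before each request.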

The script mainly uses the requests module and the Beautiful Soup (bs4) module.

The simple script for scraping proxy IPs works as follows:

The script uses the requests module to request the URL and fetch the page data.
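A minimal sketch of the fetch step might look like the following. The `fetch_html` name, the User-Agent string, and the 10-second timeout are assumptions for illustration, and the URL of the proxy-list page is left to the caller because the article does not name one.

```python
import requests

# Hypothetical header; many sites reject the default requests User-Agent.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; proxy-list-fetcher)"}


def fetch_html(url, session=None, timeout=10):
    """Download a page and return its HTML text.

    A session can be passed in so that connections are reused across calls.
    """
    if session is None:
        session = requests.Session()
    response = session.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of parsing an error page.
    response.raise_for_status()
    return response.text
```

Setting a timeout matters here because a request without one can hang indefinitely on an unresponsive server.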

It uses the re module to write regular expressions that match the IP and the port, both of which sit in HTML td elements:

re.compile(r'(\d+\.\d+\.\d+\.\d+)')

re.compile(r'(\d+)')
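As a rough illustration of how these two patterns could be applied, the sketch below takes the text of one table row's td cells and pairs the first IP-like cell with the first purely numeric cell after it. The `extract_ip_port` helper and the sample cell values are hypothetical. `fullmatch` is used because the port pattern would otherwise also match the digits inside an IP address.

```python
import re

# The two patterns from the article.
IP_PATTERN = re.compile(r'(\d+\.\d+\.\d+\.\d+)')
PORT_PATTERN = re.compile(r'(\d+)')


def extract_ip_port(cell_texts):
    """Return "ip:port" from a row's cell texts, or None if either is missing."""
    ip = port = None
    for text in cell_texts:
        text = text.strip()
        if ip is None and IP_PATTERN.fullmatch(text):
            ip = text
        elif ip is not None and port is None and PORT_PATTERN.fullmatch(text):
            # Only accept a port that appears after the IP cell.
            port = text
    if ip and port:
        return f"{ip}:{port}"
    return None


print(extract_ip_port(["1", "192.0.2.1", "8080", "HTTP"]))
```

Note that the IP pattern only checks the dotted shape, so it would also accept out-of-range values such as 999.1.1.1.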

The bs4 module parses the HTML and extracts the data from the element nodes we specify.
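The parsing step could be sketched as below. The table layout (one proxy per tr row, with the IP and the port in separate td cells) is an assumption, as are the `parse_proxies` name and the sample HTML. A real proxy-list page may be structured differently.

```python
import re

from bs4 import BeautifulSoup

IP_PATTERN = re.compile(r'(\d+\.\d+\.\d+\.\d+)')
PORT_PATTERN = re.compile(r'(\d+)')


def parse_proxies(html):
    """Return a list of "ip:port" strings found in the table rows of html."""
    soup = BeautifulSoup(html, "html.parser")
    proxies = []
    for row in soup.find_all("tr"):
        # Header rows use th cells, so they yield an empty list here.
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        ips = [c for c in cells if IP_PATTERN.fullmatch(c)]
        ports = [c for c in cells if PORT_PATTERN.fullmatch(c)]
        if ips and ports:
            proxies.append(f"{ips[0]}:{ports[0]}")
    return proxies


SAMPLE_HTML = """
<table>
  <tr><th>IP</th><th>Port</th><th>Type</th></tr>
  <tr><td>192.0.2.1</td><td>8080</td><td>HTTP</td></tr>
  <tr><td>192.0.2.2</td><td>3128</td><td>HTTPS</td></tr>
</table>
"""

print(parse_proxies(SAMPLE_HTML))
```

The built-in `html.parser` backend is used so that no extra parser package is needed.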

Finally, when an IP is needed, random.choice returns one at random.
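One way to turn that random pick into something requests can use is sketched here. The `choose_proxies` name is hypothetical, and the proxies are assumed to be plain HTTP proxies stored as `"ip:port"` strings.

```python
import random


def choose_proxies(ip_pool):
    """Pick one "ip:port" at random and build a requests-style proxies dict."""
    address = random.choice(ip_pool)
    proxy_url = f"http://{address}"
    # requests expects a mapping from URL scheme to proxy URL.
    return {"http": proxy_url, "https": proxy_url}


ip_pool = ["192.0.2.1:8080", "192.0.2.2:3128"]
print(choose_proxies(ip_pool))
```

The returned dict can be passed as the `proxies` argument of `requests.get`, together with a `timeout`, so that each request goes out through a different proxy.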

The regular expressions also depend on how the target site's HTML is written, so adjust them case by case.

Reposted from blog.csdn.net/qq_40925239/article/details/88551677
