原创:Python爬虫实战之爬取代理ip

  编程的快乐只有在运行成功的那一刻才知道QAQ

  目标网站:https://www.kuaidaili.com/free/intr/  (代码中实际爬取的页面)#若有侵权请联系我

  因为上面的代理都是http的所以没写这个判断

  代码如下:

 1 #!/usr/bin/env python
 2 # -*- coding: utf-8 -*-
 3 import urllib.request
 4 import re
 5 import time
# Page counter for the crawl loop at the bottom of the script (pages 1..3313).
n = 1
# Browser-like User-Agent so the request does not look like a bare script.
# NOTE(review): web() builds a Request with these headers — confirm the
# Request object (not the raw URL) is actually passed to urlopen.
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}
 8 def web(url):
 9     req=urllib.request.Request(url=url,headers=headers)
10     response = urllib.request.urlopen(url)
11     html = response.read().decode('UTF-8','ignore')
12     ip = r'[0-9]+(?:\.[0-9]+){3}'
13     port = r'"PORT">(\d{0,1}\d{0,1}\d{0,1}\d{0,1}\d)<'
14     out = re.findall(ip,html)
15     out1 = re.findall(port,html)
16     i = 0
17     dictionary = {}
18     while i <= 14:
19         dictionary[out[i]] = out1[i]
20         store(dictionary)
21         i += 1
22     print(out,'\n',out1)
def store(dictionary):
    """Append every ip -> port pair in *dictionary* to ``ip.txt``.

    Parameters
    ----------
    dictionary : dict
        Maps an IP address string to its port string.

    Output format per line: ``ip:<addr>\\tport:<port>\\n`` (unchanged from
    the original).
    """
    # Open the file once for the whole batch; the original re-opened
    # ip.txt inside the loop, once per key.
    with open('ip.txt', 'a') as f:
        for ip_addr, port in dictionary.items():
            f.write('ip:' + ip_addr + '\tport:' + port + '\n')
            print('store successfully')
# Walk listing pages 1 through 3313, pausing 5 seconds between requests
# so the site is not hammered. `n` is the module-level page counter
# defined above.
while n <= 3313:
    page_url = "https://www.kuaidaili.com/free/intr/" + str(n) + '/'
    web(page_url)
    time.sleep(5)
    n += 1

猜你喜欢

转载自www.cnblogs.com/vhhi/p/12380560.html