How do you build a simple web crawler with Python?

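import urllib.request
import random

# A pool of proxies to choose from.
# Each key is a URL scheme such as 'http', with no trailing colon;
# a key like 'http:' is never matched, so the proxy would be silently ignored.
proxies_pool = [
    {'http': '192.168.0.1:80'},  # placeholder; replace with your own proxy addresses
    {'http': '192.168.0.1:8080'},
]
# Pick one proxy from the pool at random
proxies = random.choice(proxies_pool)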

url = 'http://www.baidu.com/s?wd=ip'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36'
}
# Build the request with a browser-like User-Agent header
request = urllib.request.Request(url=url, headers=headers)
# Route the request through the chosen proxy
handler = urllib.request.ProxyHandler(proxies=proxies)
opener = urllib.request.build_opener(handler)
response = opener.open(request)
content = response.read().decode('utf-8')
# Save the response body to a local file
with open('daili.html', 'w', encoding='utf-8') as fp:
    fp.write(content)
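
The script above picks a single proxy and raises an exception if that proxy is unreachable. A minimal sketch of one way to fall back to the other proxies in the pool is shown here. The names make_opener, fetch_with_proxies and opener_factory, as well as the 10-second timeout, are assumptions made for illustration and are not part of the original script.

```python
import random
import urllib.error
import urllib.request

USER_AGENT = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36'
)


def make_opener(proxy):
    """Build an opener that routes requests through one proxy mapping."""
    handler = urllib.request.ProxyHandler(proxies=proxy)
    return urllib.request.build_opener(handler)


def fetch_with_proxies(url, proxies_pool, timeout=10, opener_factory=make_opener):
    """Try each proxy once, in random order.

    Returns the decoded response body, or None if every proxy fails.
    opener_factory is a hypothetical hook so the opener can be swapped out.
    """
    request = urllib.request.Request(url=url, headers={'User-Agent': USER_AGENT})
    # random.sample with the full length yields a shuffled copy of the pool
    for proxy in random.sample(proxies_pool, len(proxies_pool)):
        try:
            with opener_factory(proxy).open(request, timeout=timeout) as response:
                return response.read().decode('utf-8', errors='replace')
        except (urllib.error.URLError, OSError) as exc:
            # Network-level failure: report it and move on to the next proxy
            print(f'proxy {proxy} failed: {exc}')
    return None
```

Calling fetch_with_proxies(url, proxies_pool) with the url and proxies_pool defined earlier would return the page text, or None when no proxy in the pool responds.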

Running the steps above creates a file named daili.html in the directory you run the script from.
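
Because a relative filename is resolved against the current working directory, it can help to print the absolute path of the file that was written. The following is a small sketch of that idea; the helper name save_page is hypothetical and does not appear in the original script.

```python
from pathlib import Path


def save_page(content, filename='daili.html'):
    """Write content as UTF-8 and return the absolute path of the file."""
    # resolve() turns a relative name into an absolute path based on the
    # current working directory
    path = Path(filename).resolve()
    path.write_text(content, encoding='utf-8')
    return path
```

For example, print(save_page(content)) would show exactly where daili.html ended up.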

The file contains the HTML of the page returned for the URL you requested.
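
A quick way to confirm that the saved file holds the page you expected is to read its title. The sketch below uses the standard library's html.parser for this; the names TitleParser and extract_title are assumptions made for illustration, and the approach only covers a plain title element.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ''

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'title':
            self._in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>
        if self._in_title:
            self.title += data


def extract_title(html_text):
    """Return the page title with surrounding whitespace removed, or ''."""
    parser = TitleParser()
    parser.feed(html_text)
    return parser.title.strip()
```

Reading daili.html back with open('daili.html', encoding='utf-8') and passing its text to extract_title would give the title of the fetched page.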


Reposted from blog.csdn.net/ZHANG157111/article/details/130313656