Scraping Any Image from Baidu Tieba

Scrape the images from any page of Baidu Tieba.
I felt like scraping some images, so this is just for fun.
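import re
import urllib.request

# Send a browser-like User-Agent with the page request
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}

def getHtml(url):
    # Fetch the page and return its HTML as text
    request = urllib.request.Request(url, headers=headers)
    page = urllib.request.urlopen(request)
    html = page.read().decode('utf-8', errors='ignore')
    page.close()
    return html

def getImg(html):
    # Capture the src of every .jpg image that is followed by a width attribute
    reg = r'src="(.*?\.jpg)" width'
    imgre = re.compile(reg)
    imglist = imgre.findall(html)
    # Save each image as 0.jpg, 1.jpg, ... in the current directory
    for x, imgurl in enumerate(imglist):
        urllib.request.urlretrieve(imgurl, '%s.jpg' % x)
    return imglist

html = getHtml("https://tieba.baidu.com/index.html")
print(getImg(html))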
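
The regex in getImg only matches when src is immediately followed by a width attribute, and it passes whatever it finds straight to the downloader, so relative paths would fail. As a rough sketch of an alternative (not what the script above does), the standard-library html.parser can collect every img src and urljoin can turn it into an absolute URL. The names ImgSrcCollector and extract_jpg_urls and the sample markup are made up for illustration; real Tieba pages may look quite different.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Hypothetical helper: collects the src of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_jpg_urls(html, base_url):
    """Return absolute URLs of the .jpg images referenced in html."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return [
        urljoin(base_url, src)
        for src in collector.sources
        # Ignore any query string when checking the extension
        if src.lower().split("?")[0].endswith(".jpg")
    ]


# Made-up sample markup (an assumption, not real Tieba output)
sample = '''
<img src="https://example.com/a.jpg" width="200">
<img class="pic" src="/static/b.JPG?v=2">
<img src="logo.png">
'''
urls = extract_jpg_urls(sample, "https://tieba.baidu.com/index.html")
print(urls)
```

The resulting list could then be fed to the same download loop as in getImg. This works offline on any HTML string, which makes it easy to try out before hitting the real site.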

Reposted from blog.csdn.net/wyd117/article/details/83118557