Notes on Python Problems

1. Accessing a URL
2.x: import urllib and call urllib.urlopen(url)
3.x: import urllib.request and call urllib.request.urlopen(url)
Calling urllib.urlopen(url) under 3.x raises:
AttributeError: module 'urllib' has no attribute 'urlopen'
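If the same script has to run under both 2.x and 3.x, one common approach is to try the 3.x import first and fall back to the 2.x one. The following is only a minimal sketch of that idea; the helper name fetch is hypothetical and not part of urllib.

```python
# Pick whichever urlopen the running interpreter provides.
try:
    from urllib.request import urlopen  # Python 3.x
except ImportError:
    from urllib import urlopen  # Python 2.x


def fetch(url):
    """Hypothetical helper: return the raw response body of url."""
    response = urlopen(url)
    try:
        # Under 3.x read() returns bytes, not str.
        return response.read()
    finally:
        response.close()
```

Under 3.x the value returned by fetch(url) is a bytes object, which matters as soon as a regular expression is applied to it.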
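2. Reading HTML
Under 3.x, urlopen(url).read() returns bytes, so the HTML must be decoded to a string before a string regular expression is matched against it: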
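import re

def getImg(html):
    # Match the src attribute of png/gif/jpg images. The dot is escaped and
    # the extension group is non-capturing, so findall returns plain URLs.
    reg = r'src="(.+?\.(?:png|gif|jpg))"'
    imgre = re.compile(reg)
    # html arrives as bytes; decode it before matching a str pattern.
    html = html.decode('utf-8')
    imglist = re.findall(imgre, html)
    return imglist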
Without the decode step, re.findall raises:
TypeError: cannot use a string pattern on a bytes-like object
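The error can be reproduced without any network access, and there are two ways around it: decode the bytes first, or compile a bytes pattern instead. The snippet below is a small illustration; the sample HTML in raw and the variable names are made up for the example.

```python
import re

# The same pattern, once as a str pattern and once as a bytes pattern.
IMG_RE = re.compile(r'src="(.+?\.(?:png|gif|jpg))"')
IMG_RE_BYTES = re.compile(rb'src="(.+?\.(?:png|gif|jpg))"')

# Made-up sample standing in for what urlopen(url).read() returns under 3.x.
raw = b'<img src="a.png"><img src="b/c.jpg">'

# A str pattern applied to bytes raises TypeError.
error_message = None
try:
    IMG_RE.findall(raw)
except TypeError as exc:
    error_message = str(exc)

# Option 1: decode the bytes, then match with the str pattern.
urls_from_text = IMG_RE.findall(raw.decode('utf-8'))

# Option 2: keep the bytes and match with the bytes pattern.
urls_from_bytes = IMG_RE_BYTES.findall(raw)
```

Decoding is usually the more convenient choice, since the matched URLs come back as str; the bytes pattern returns bytes objects and assumes nothing about the page encoding.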



Reposted from blog.csdn.net/hbyzzdw/article/details/78252221