Reposted from: https://blog.csdn.net/haipengdai/article/details/48713791
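When scraping data, you often run into img tags whose src does not include the image's file extension. In that case the imghdr module can identify the image type from the first bytes of the file, which tells you which extension to use.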
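
    import imghdr  # in the standard library through Python 3.12
    import urllib.request

    url = 'http://photos.prnewswire.com/prn/20100819/LA52539LOGO'
    # Download the raw image bytes; the URL itself carries no file extension.
    response = urllib.request.urlopen(url)
    image_data = response.read()
    # When the bytes are passed as the second argument, the filename argument is ignored.
    print(imghdr.what(None, image_data))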
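
imghdr was deprecated in Python 3.11 and removed from the standard library in 3.13, so on newer interpreters the same idea has to be done by hand or with a third-party library. The following is a minimal sketch of that idea, not imghdr's actual implementation: it compares the leading bytes against a few well-known file signatures. The function name `guess_image_extension` and the choice of formats are assumptions made for illustration and do not come from the original post.

```python
def guess_image_extension(data):
    """Guess a file extension from the leading bytes of an image.

    Hypothetical helper: it covers only a handful of common formats
    and returns None for anything it does not recognise.
    """
    if data.startswith(b'\x89PNG\r\n\x1a\n'):          # PNG signature
        return 'png'
    if data.startswith(b'\xff\xd8\xff'):               # JPEG start-of-image marker
        return 'jpg'
    if data.startswith((b'GIF87a', b'GIF89a')):        # both GIF versions
        return 'gif'
    if data.startswith(b'BM'):                         # Windows bitmap
        return 'bmp'
    if data[:4] == b'RIFF' and data[8:12] == b'WEBP':  # WebP inside a RIFF container
        return 'webp'
    return None


# Only the first few bytes matter, so a short prefix is enough to demonstrate it.
print(guess_image_extension(b'\x89PNG\r\n\x1a\n' + b'\x00' * 8))  # png
```

In a scraper, the bytes returned by `response.read()` would be passed in directly, and a `None` result would be a signal to skip the file or fall back to the Content-Type response header.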