Today I found this error in the logs:
2013-06-06 12:25:13,332 [ERROR] upload.service.UploadFileService - image open error ,url = http://img.xitisi.com/Commodity/BOBOTou_2204/RiXiFaXingNvShengJiaFa_HuaBuWu2011XinKuan_QiLiuHaiBoboBoBoTouXiuLianDuanFaZongSe20120210034904.jpg ,cannot identify image fil
I took a look at the response headers for the image:
Accept-Ranges: bytes
Content-Encoding: gzip
Content-Length: 452449
Content-Type: image/jpeg
Date: Thu, 06 Jun 2013 05:03:08 GMT
Etag: "8041952b9a50cd1:1a9a"
Last-Modified: Fri, 22 Jun 2012 17:12:15 GMT
Server: Microsoft-IIS/6.0
Vary: Accept-Encoding
X-Powered-By: ASP.NET
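So the response body was gzip-compressed (Content-Encoding: gzip), which is why Image could not identify it. The body has to be decompressed before it is handed to the image library.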
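One quick way to confirm this kind of diagnosis is to look at the first two bytes of the body: a gzip stream starts with 1f 8b, while a JPEG file starts with ff d8. The sketch below is written for Python 3 (the code in this post is Python 2); the helper name `looks_gzipped` and the sample bytes are made up for illustration and are not part of the original service.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
JPEG_MAGIC = b"\xff\xd8"  # first two bytes of every JPEG file (SOI marker)


def looks_gzipped(body):
    """Return True if the raw bytes start with the gzip magic number."""
    return body[:2] == GZIP_MAGIC


# Hypothetical stand-in for image data; it only carries the JPEG magic
# number and is not a real, decodable picture.
fake_jpeg = JPEG_MAGIC + b"\x00" * 16
compressed = gzip.compress(fake_jpeg)

print(looks_gzipped(fake_jpeg))   # the plain bytes start with ff d8
print(looks_gzipped(compressed))  # the compressed bytes start with 1f 8b
```

An image library that receives the compressed bytes sees 1f 8b instead of an image signature, which would explain the "cannot identify image" error.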
Solutions:
1. Decompress the body with Python's gzip module
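    import gzip
    from StringIO import StringIO  # Python 2

    # Method of the download class
    def _read_content(self, response):
        content_type = response.headers.get('Content-Type')
        content_encoding = response.headers.get('Content-Encoding')
        if response.code == 200 and content_type and 'image' in content_type:
            data = StringIO(response.read())
            if content_encoding == 'gzip':
                # Decompress the body and wrap it again so the image
                # library receives the raw image bytes.
                data = StringIO(gzip.GzipFile(fileobj=data).read())
            return data
        else:
            # Log the final URL as reported by the response object.
            logger.error("can't open image, content type=%s, url=%s"
                         % (content_type, response.geturl()))
            return None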
2. Declare in the request headers that gzip is not accepted
    import urllib2  # Python 2

    # Request headers, set up before open() is called
    self.headers = {}
    self.headers['User-Agent'] = """Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB6"""
    self.headers['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
    # 'identity' asks the server not to apply any content coding such as gzip
    self.headers['Accept-Encoding'] = 'identity'
    self.headers['Accept-Language'] = "zh,en-us;q=0.7,en;q=0.3"
    self.headers['Accept-Charset'] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7"
    self.headers['Connection'] = "keep-alive"
    self.headers['Keep-Alive'] = "115"
    self.headers['Cache-Control'] = "no-cache"

    def open(self, url):
        try:
            request = urllib2.Request(url, headers=self.headers)
            response = self.opener.open(request, timeout=self.timeout)
            return self._read_content(response)
        except Exception as e:
            logger.error(url)
            logger.exception(e)
            return None
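For readers on Python 3, where urllib2 and StringIO no longer exist, the two solutions above could be sketched roughly as follows. This is only an illustration under stated assumptions: `decode_body` and `build_request` are hypothetical helper names, `http://example.com/a.jpg` is a placeholder URL, and the request is built but never sent.

```python
import gzip
import io
import urllib.request


def decode_body(body, content_encoding):
    """Return a file-like object with the decoded bytes of a response body.

    `body` is the raw bytes read from the response and `content_encoding`
    is the value of its Content-Encoding header (or None if absent).
    """
    if content_encoding and content_encoding.strip().lower() == "gzip":
        body = gzip.decompress(body)
    return io.BytesIO(body)


def build_request(url):
    """Build (without sending) a request that asks for an uncompressed body."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})


# Usage with made-up data: the compressed bytes round-trip to the original.
raw = b"\xff\xd8" + b"not a real image"
restored = decode_body(gzip.compress(raw), "gzip").read()

# Placeholder URL; urllib stores the header name as "Accept-encoding".
req = build_request("http://example.com/a.jpg")
encoding_header = req.get_header("Accept-encoding")
```

A server is not guaranteed to honour Accept-Encoding, so it seems reasonable to keep the decompression step as a fallback even when the request asks for identity.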