Captcha recognition with Python 2

Today I'll walk through a simple way to handle captcha recognition. Let's get started.

First, here is a captcha URL:

https://credit.wsjd.gov.cn/portal/captcha
Next we fetch the image behind this URL and use two libraries, pytesseract and PIL, to recognize the captcha. Without further ado, here is the code:
import pytesseract
import urllib2
from PIL import Image
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'Host': 'credit.wsjd.gov.cn',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36',
}
url = 'https://credit.wsjd.gov.cn/portal/captcha'
request = urllib2.Request(url,headers=headers)
res = urllib2.urlopen(request).read()
try:
    # Path of the file used to store the downloaded captcha image
    captchaFile = 'yishi/static/images/credit_captcha.png'
    with open(captchaFile, 'wb') as f:
        f.write(res)
    # Run OCR on the saved captcha image
    image = Image.open(captchaFile)
    captcha_value = pytesseract.image_to_string(image)
    print('Captcha value: ' + captcha_value)
except IOError as e:
    # Getting the captcha failed; it should be requested again
    print('Failed to get the captcha')
    print(e)
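
The string returned by pytesseract.image_to_string often carries stray whitespace or punctuation around the characters you actually want. Below is a minimal sketch of a cleanup step. The helper name clean_captcha_text is hypothetical, and it assumes the captcha consists only of ASCII letters and digits, which the post does not confirm.

```python
import re


def clean_captcha_text(raw_text):
    """Strip everything except ASCII letters and digits from OCR output.

    Hypothetical helper: assumes the captcha is purely alphanumeric.
    """
    return re.sub(r'[^0-9A-Za-z]', '', raw_text)
```

You would call it on captcha_value before submitting the result, for example clean_captcha_text(captcha_value).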

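The except branch in the code only prints a message, although its comment says the captcha should be requested again. One possible way to do that is a small retry loop. This is only a sketch: recognize_with_retry and its fetch_and_recognize argument are hypothetical names, and the callable is assumed to download a fresh captcha and return the recognized string.

```python
def recognize_with_retry(fetch_and_recognize, max_attempts=3):
    """Call fetch_and_recognize() until it returns non-empty text.

    fetch_and_recognize is a hypothetical callable that downloads a new
    captcha and returns the OCR result; it may raise IOError.
    Returns None if every attempt fails or yields an empty string.
    """
    for attempt in range(max_attempts):
        try:
            value = fetch_and_recognize()
        except IOError:
            # Download or file access failed; try again with a new request
            continue
        if value:
            return value
    return None
```

Wrapping the download and OCR steps from the code above into one function and passing it in would give the retry behavior the comment describes. A cap such as max_attempts keeps the script from hammering the server when recognition keeps failing.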
  


Reposted from www.cnblogs.com/fh-fendou/p/9015052.html