[Python 3 Crawler] Recognizing CAPTCHAs with Yundama

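I originally set out to recognize CAPTCHAs with tesserocr, but its recognition rate turned out to be low, so I learned how to use the Yundama (云打码) captcha-solving platform instead.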

The steps are as follows:

1. Register an account, then go to http://www.yundama.com/apidoc/YDM_SDK.html and download the PythonHTTP example.

2. Unzip the download. It contains several files; I am using Python 3.5, so I open YDMHTTPDemo3.x.

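3. In that file, fill in the following values. The username and password are those of your own account. The appid and appkey are shown in the developer console; the first time you use the service you have to create a new software entry there before an appid and appkey exist.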

In the developer console, the software code (软件代码) is the appid and the communication key (通讯密钥) is the appkey.

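4. Once all the values are filled in, run the code. It will most likely return 1007. Looking this up on the error-code page (http://www.yundama.com/apidoc/YDM_ErrorCode.html) shows the cause: the account has no balance.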

Top up the account in the user console and run the code again, and the recognition result is printed.
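The lookup in step 4 can be sketched as a small helper. This is only an illustration: `describe_return_code` and `KNOWN_ERRORS` are hypothetical names that are not part of the Yundama SDK, and the table holds nothing but the 1007 case described above. Accepting either sign of the code is also an assumption, made because the client class in this post treats negative return values as errors. Every other code should be looked up on the error-code page.

```python
# Hypothetical helper, not part of the Yundama SDK.
# Only the code mentioned in this post is listed; the official error-code
# page is the authoritative source for everything else.
KNOWN_ERRORS = {
    1007: 'account has no balance, top up in the user console',
}


def describe_return_code(code):
    """Return a hint for a value already known to be an error code, else None."""
    # Assumption: the sign of the code may differ between the demo output
    # and the raw API response, so the lookup uses the absolute value.
    return KNOWN_ERRORS.get(abs(code))


print(describe_return_code(1007))
```

Returning None for unknown codes keeps the helper honest: it never guesses at a meaning that was not taken from the documentation.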

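With these steps done, the Yundama platform is ready to recognize CAPTCHAs. For convenience, you can create a file YDMDemo.py that holds the account details, so that a caller only needs to pass in the CAPTCHA image: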

import json
import time

import requests


class YDMHttp:
    apiurl = 'http://api.yundama.com/api.php'
    username = ''
    password = ''
    appid = ''
    appkey = ''

    def __init__(self, username, password, appid, appkey):
        self.username = username
        self.password = password
        self.appid = str(appid)
        self.appkey = appkey

    def request(self, fields, files=None):
        # POST the form fields (and optional files) and parse the JSON reply.
        response = self.post_url(self.apiurl, fields, files)
        return json.loads(response)

    def balance(self):
        # Query the remaining balance; a negative value is an error code.
        data = {'method': 'balance', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey}
        response = self.request(data)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            else:
                return response['balance']
        else:
            return -9001

    def login(self):
        # Log in and return the user id; a negative value is an error code.
        data = {'method': 'login', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey}
        response = self.request(data)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            else:
                return response['uid']
        else:
            return -9001

    def upload(self, filename, codetype, timeout):
        # Upload the image and return its captcha id (cid); a negative value is an error code.
        data = {'method': 'upload', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'codetype': str(codetype), 'timeout': str(timeout)}
        file = {'file': filename}
        response = self.request(data, file)
        if response:
            if response['ret'] and response['ret'] < 0:
                return response['ret']
            else:
                return response['cid']
        else:
            return -9001

    def result(self, cid):
        # Fetch the recognized text for a cid; an empty string means "not ready yet".
        data = {'method': 'result', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'cid': str(cid)}
        response = self.request(data)
        return response and response['text'] or ''

    def decode(self, filename, codetype, timeout):
        # Upload the image, then poll once per second until a result arrives or the timeout is reached.
        cid = self.upload(filename, codetype, timeout)
        if cid > 0:
            for _ in range(timeout):
                result = self.result(cid)
                if result != '':
                    return cid, result
                else:
                    time.sleep(1)
            return -3003, ''
        else:
            return cid, ''

    def report(self, cid):
        # Report a wrong recognition result for the given cid.
        data = {'method': 'report', 'username': self.username, 'password': self.password, 'appid': self.appid,
                'appkey': self.appkey, 'cid': str(cid), 'flag': '0'}
        response = self.request(data)
        if response:
            return response['ret']
        else:
            return -9001

    def post_url(self, url, fields, files=None):
        # `files` maps form field names to file paths. Open the files only for
        # the duration of the request and close them afterwards, without
        # modifying the caller's dict.
        opened = {}
        try:
            for key, path in (files or {}).items():
                opened[key] = open(path, 'rb')
            res = requests.post(url, files=opened, data=fields)
            return res.text
        finally:
            for handle in opened.values():
                handle.close()


def use_ydm(filename):
    username = ''  # username
    password = ''  # password
    app_id = 1  # software ID (appid)
    app_key = ''  # software key (appkey)
    code_type = 1004  # CAPTCHA type
    timeout = 60  # timeout in seconds
    yundama = YDMHttp(username, password, app_id, app_key)  # initialize the client
    balance = yundama.balance()  # query the balance
    print('Your remaining balance is {}'.format(balance))
    cid, result = yundama.decode(filename, code_type, timeout)  # start recognition
    print('Recognition result: {}'.format(result))
    return result
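
The waiting logic inside `decode` is easier to see in isolation. The sketch below is not part of the Yundama demo: `poll_until_ready` is a hypothetical name, and the `fetch` callable stands in for `YDMHttp.result(cid)`, so it runs without an account or network access. As in `decode`, it assumes an empty string means the answer is not ready yet.

```python
import time


def poll_until_ready(fetch, timeout, interval=1.0, sleep=time.sleep):
    """Call fetch() up to `timeout` times and return the first non-empty
    result, or '' if none arrives."""
    for _ in range(timeout):
        text = fetch()
        if text != '':
            return text
        sleep(interval)  # wait before asking again
    return ''


# Stand-in for YDMHttp.result(): empty twice, then an answer.
answers = iter(['', '', 'abcd'])
print(poll_until_ready(lambda: next(answers), timeout=5, sleep=lambda s: None))
```

Passing `sleep` as a parameter is a design choice made for this sketch so the loop can be exercised without real delays; the class above calls `time.sleep(1)` directly.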


Reposted from www.cnblogs.com/TM0831/p/9747207.html