Solved: requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='123.96.1.95', port=30090): Max retries exceeded with url: http://cdict.qq.pinyin.cn/list?cate_id=461&sort1_id=436&sort2_id=461&page=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x00000204934C49A0>, 'Connection to 123.96.1.95 timed out. (connect timeout=20)'))
Error code

A friend in the fan group got an error from `requests.get` while crawling a page's source code (he was quite disheartened at the time and came to me for help; I solved it for him, and I'm writing it up in the hope that it helps others who hit this bug and can't resolve it). The error code is as follows:
```python
import time
import requests

def get_url(url):
    """Send a request and return the decoded page source."""
    try:
        r = requests.get(url, headers=headers)  # headers is defined elsewhere in the script
    except requests.exceptions.RequestException:
        time.sleep(120)
        # On failure, retry up to five times
        for i in range(5):
            r = requests.get(url, headers=headers, timeout=20)
            if r.status_code == 200:  # note: status_code is an int, not the string '200'
                break
            time.sleep(300)
    html_str = r.content.decode('utf8')
    return html_str
```
The error message is as follows:

```
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='123.96.1.95', port=30090): Max retries exceeded with url: http://cdict.qq.pinyin.cn/list?cate_id=461&sort1_id=436&sort2_id=461&page=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x00000204934C49A0>, 'Connection to 123.96.1.95 timed out. (connect timeout=20)'))
```
Error translation

In plain terms, the error message says: the HTTP connection pool for host 123.96.1.95, port 30090, exceeded the maximum number of retries for the URL http://cdict.qq.pinyin.cn/list?cate_id=461&sort1_id=436&sort2_id=461&page=4. The underlying cause was a ConnectTimeoutError: the connection to 123.96.1.95 timed out (connect timeout=20 seconds).
Reason for error

requests lets you specify the connect and read timeouts separately; if the server does not respond within the connect timeout, requests raises requests.exceptions.ConnectTimeout.

- timeout=(connect timeout, read timeout)
- Connect: the time allowed for the client to establish a connection to the server and send the HTTP request
- Read: the time the client waits for the server to send the first byte of the response
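As a minimal sketch of the two timeouts (the URL here is a placeholder, not from the original error), you can pass them as a tuple and catch each timeout type separately:

```python
import requests

url = 'http://example.com'  # placeholder URL for illustration

try:
    # timeout=(connect timeout, read timeout), both in seconds
    r = requests.get(url, timeout=(5, 30))
    print(r.status_code)
except requests.exceptions.ConnectTimeout:
    print('could not connect within 5 seconds')
except requests.exceptions.ReadTimeout:
    print('server did not send data within 30 seconds')
except requests.exceptions.RequestException as e:
    # any other network failure (DNS error, refused connection, ...)
    print('request failed:', e)
```

Both timeout exceptions are subclasses of `requests.exceptions.Timeout`, so you can also catch that one class if you don't need to distinguish them.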
Friends, please follow the steps below to solve it!
Solution

1. Set a proxy in the code:
```python
proxies = {
    'http': '127.0.0.1:1212',
    'https': '127.0.0.1:1212'
}
r = requests.get(url, headers=headers, proxies=proxies, timeout=20)
```
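If a proxy alone doesn't help, another common approach (not from the original post) is to let requests retry transient connection failures automatically via urllib3's `Retry`, instead of hand-rolling the retry loop:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# Retry up to 5 times with exponential backoff on connection errors
# and on common transient HTTP status codes.
retry = Retry(total=5, backoff_factor=2,
              status_forcelist=[500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

# session.get(url, headers=headers, timeout=20) will now retry automatically
```

With `backoff_factor=2`, the waits between retries grow exponentially, which is gentler on an overloaded server than retrying at a fixed interval.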
2. When calling the function, catch the exception at the call site as well:

```python
try:
    get_url(url)
except requests.exceptions.RequestException:
    print(url, 'failed to crawl')
```

The problem was solved after running the fixed code.
Help

This article has been included in the "Farewell to Bug" column.

This column records the various difficult bugs encountered in study and work, as well as problems raised by friends in the fan group. Article format: error code + error translation + error reason + solution, covering problems with program installation and execution. If you run into other problems after subscribing to the column and following the blogger, you can message me privately for help!