Python: Two ways to catch the urllib.request timeout exception
1. Background
When using urllib.request.urlopen, frequent timeout exceptions cause the program to stop running. Restarting the program after every stop is bad for its robustness, so we want to catch urllib's timeout exception and handle the timeout ourselves.
from urllib import request

headers = {  # User-Agent header, to masquerade as a browser when visiting the URL
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3941.4 Safari/537.36'
}

# Test whether a URL is reachable
def test_url(url):
    r = request.Request(url, headers=headers)
    r1 = request.urlopen(r, timeout=0.1)
    print(r1.status)

if __name__ == '__main__':
    url1 = 'https://www.baidu.com/'
    url2 = 'http://httpbin.org/get'
    url3 = 'https://www.jianshu.com/p/5d6f1891354f'
    test_url(url2)
2. Methods
2.1 except Exception as e
from urllib import request

headers = {  # User-Agent header, to masquerade as a browser when visiting the URL
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3941.4 Safari/537.36'
}

# Test whether a URL is reachable
def test_url(url):
    try:
        r = request.Request(url, headers=headers)
        r1 = request.urlopen(r, timeout=0.1)
        print(r1.status)
    except Exception as e:  # catches all exceptions except those related to program exit (sys.exit())
        print(e)

if __name__ == '__main__':
    url1 = 'https://www.baidu.com/'
    url2 = 'http://httpbin.org/get'
    url3 = 'https://www.jianshu.com/p/5d6f1891354f'
    test_url(url2)
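Because Method 1 swallows the exception instead of letting it kill the process, it naturally combines with a retry loop, which addresses the restart problem from the Background. The helper below is a hypothetical sketch (the function name, attempt count, and URL are illustrative, not from the original code):

```python
from urllib import request

def fetch_with_retry(url, attempts=3, timeout=0.1):
    # Hypothetical helper: retry a few times instead of letting one
    # timeout stop the whole program; return None if every attempt fails.
    for i in range(attempts):
        try:
            with request.urlopen(url, timeout=timeout) as resp:
                return resp.status
        except Exception as e:
            print(f'attempt {i + 1} failed: {e}')
    return None

# A hostname in the reserved .invalid TLD always fails to resolve,
# so this exercises the except branch without a real timeout:
print(fetch_with_retry('http://example.invalid/', attempts=1))
```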
2.2 except error.URLError as e
from urllib import request, error
import socket

headers = {  # User-Agent header, to masquerade as a browser when visiting the URL
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3941.4 Safari/537.36'
}

# Test whether a URL is reachable
def test_url(url):
    try:
        r = request.Request(url, headers=headers)
        r1 = request.urlopen(r, timeout=0.1)
        print(r1.status)
    except error.HTTPError as e:
        print(str(e.code) + ':' + e.reason)
    except error.URLError as e:
        print(e.reason)
        if isinstance(e.reason, socket.timeout):
            print('request timed out')

if __name__ == '__main__':
    url1 = 'https://www.baidu.com/'
    url2 = 'http://httpbin.org/get'
    url3 = 'https://www.jianshu.com/p/5d6f1891354f'
    test_url(url2)
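The isinstance check in Method 2 can be verified without any network access: when the connection phase times out, urlopen raises a URLError whose reason attribute is the underlying socket.timeout, and we can construct that same exception by hand. A minimal sketch:

```python
import socket
from urllib import error

# Build the exception shape urlopen raises on a connect-phase timeout,
# then apply the same check Method 2 uses to recognize it.
e = error.URLError(socket.timeout('timed out'))
print(e.reason)                              # timed out
print(isinstance(e.reason, socket.timeout))  # True
```

Note that socket.timeout is a subclass of OSError (and, in recent Python versions, an alias of TimeoutError), so the isinstance test keeps working across versions.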
3. Notes
- When testing url1, Baidu responds quickly, so the program usually runs without problems.
url1 = 'https://www.baidu.com/'
- When testing url2, both methods do catch the timeout exception.
url2 = 'http://httpbin.org/get'
Method 1 Run Results:
Method 2 Run Results:
- When testing url3, Method 1 catches the timeout exception, but Method 2 exits with an error. The likely cause is a poor network connection combined with the large amount of content the page fetches.
url3 = 'https://www.jianshu.com/p/5d6f1891354f'
Method 1 Run Results:
Method 2 Run Results:
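One plausible explanation for the url3 failure: urlopen only wraps connect-phase timeouts in URLError, while a timeout that occurs later (for example while reading the response body) can surface as a bare socket.timeout, which neither of Method 2's except clauses matches. The sketch below (the classify helper is illustrative, not from the original code) shows how each exception type would be dispatched, and why an extra except socket.timeout clause would be needed:

```python
import socket
from urllib import error

def classify(exc):
    # Illustrative dispatcher mirroring Method 2's except clauses.
    # HTTPError must be checked before URLError, since it is a subclass.
    if isinstance(exc, error.HTTPError):
        return 'http error'
    if isinstance(exc, error.URLError):
        return 'url error'
    if isinstance(exc, socket.timeout):
        return 'socket timeout'
    return 'other'

# A connect-phase timeout arrives wrapped in URLError...
print(classify(error.URLError(socket.timeout('timed out'))))  # url error
# ...but a read-phase timeout is a bare socket.timeout, which
# Method 2 as written would not catch:
print(classify(socket.timeout('timed out')))                  # socket timeout
```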
4. Summary
"except error.URLError as e" catches only the timeout exceptions that urllib wraps, while "except Exception as e" catches every kind of timeout exception.