Problem description: The project uses an IP proxy pool to scrape web pages, but the call to requests.get fails with the following error:
  File "E:\project\venv\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "E:\project\venv\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "E:\project\venv\lib\site-packages\requests\sessions.py", line 524, in request
    prep.url, proxies, stream, verify, cert
  File "E:\project\venv\lib\site-packages\requests\sessions.py", line 699, in merge_environment_settings
    no_proxy = proxies.get('no_proxy') if proxies is not None else None
AttributeError: 'str' object has no attribute 'get'
The code that triggers it:
proxies = "http://" + proxy
res = requests.get(url=url, headers=headers, proxies=proxies)
Cause analysis:
The get function in api.py is defined as follows:
def get(url, params=None, **kwargs):
    r"""Sends a GET request.

    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send
        in the body of the :class:`Request`.
    :param \*\*kwargs: Optional arguments that ``request`` takes.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response
    """
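Note that this docstring only describes params, which is not the argument at fault here. get passes its remaining keyword arguments, including proxies, through to request, whose docstring describes proxies as "(optional) Dictionary mapping protocol to the URL of the proxy". The traceback confirms this: merge_environment_settings calls proxies.get('no_proxy'), and a str has no get method. The proxies argument must therefore be a dict, not a string.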
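The failure can be reproduced without any network access. The sketch below copies the expression from the last traceback frame into a hypothetical helper, lookup_no_proxy (a name made up for illustration, not part of requests), and feeds it the same kinds of values:

```python
def lookup_no_proxy(proxies):
    # Same expression as the last frame of the traceback
    # (sessions.py, merge_environment_settings).
    return proxies.get('no_proxy') if proxies is not None else None


# A dict has a .get method, so the lookup works and returns None here
# because no 'no_proxy' key is present.
print(lookup_no_proxy({"http": "http://127.0.0.1:8080"}))

# A str has no .get method, which is exactly the reported error.
try:
    lookup_no_proxy("http://127.0.0.1:8080")
except AttributeError as exc:
    print(exc)  # 'str' object has no attribute 'get'
```

The address 127.0.0.1:8080 is a placeholder, not a value from the original project.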
Fixed code:
proxies = {
    "http": "http://" + proxy
}
res = requests.get(url=url, headers=headers, proxies=proxies)
With this change the request runs successfully.
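One caveat: the dict above only registers a proxy for http URLs, so a request to an https URL would bypass this proxy. A minimal sketch of a helper that covers both schemes follows. The name build_proxies is hypothetical, and it assumes that proxy is a "host:port" string and that the same proxy also accepts HTTPS traffic, neither of which the original code confirms:

```python
def build_proxies(proxy):
    """Hypothetical helper: turn 'host:port' into a requests-style proxies dict."""
    if not isinstance(proxy, str):
        raise TypeError("proxy must be a string such as 'host:port'")
    if not proxy:
        raise ValueError("proxy must not be empty")
    # Keep an existing scheme; otherwise assume a plain HTTP proxy.
    proxy_url = proxy if "://" in proxy else "http://" + proxy
    # Assumption: the same proxy serves both http and https requests.
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("127.0.0.1:8080")  # placeholder address
print(proxies)
# {'http': 'http://127.0.0.1:8080', 'https': 'http://127.0.0.1:8080'}
```

The returned dict can be passed directly as proxies=proxies in the requests.get call shown earlier.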