The requests module
A third-party Python module for sending network requests. It is powerful, simple to use, and efficient, and is used to send requests that simulate a browser visiting the web.
Basic workflow
- Specify the URL
- Send a request
- Get the response data
- Persist the data
Environment installation
It is recommended to install Anaconda and use Anaconda's default environment in PyCharm, which already includes the requests module.
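If you are not using Anaconda, requests can also be installed with pip into the current environment (a standard installation command, not specific to this tutorial):

```shell
# Install the requests module into the active Python environment
pip install requests
```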
A first crawler example: fetching the HTML source of the Sogou homepage
# Fetch the Sogou homepage
import requests
# Specify the URL
sougouURL = 'https://www.sogou.com/'
# Send the request
# requests.get() returns a Response object
response = requests.get(url=sougouURL)
# Get the response data
# .text returns the response body as a string;
# here it is the HTML source of the page
pageText = response.text
print(pageText)
# Persist the fetched data
# encoding='utf-8' avoids errors on platforms whose default encoding
# cannot represent the page's characters
with open('./sougou.html', 'w', encoding='utf-8') as fp:
    fp.write(pageText)
print('Finished crawling')
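The four-step workflow above can be wrapped in a reusable helper. This is a sketch of my own, not part of the original tutorial; the function name `fetch_and_save`, the `timeout` value, and the `raise_for_status()` check are choices I have added on top of the basic example:

```python
import requests

def fetch_and_save(url, path, timeout=10):
    """Fetch a page and persist its HTML source to a file.

    A hypothetical helper illustrating the four-step workflow:
    specify the URL, send the request, get the response data,
    and persist it.
    """
    # Send the request; timeout prevents hanging forever on a dead server
    response = requests.get(url=url, timeout=timeout)
    # Fail fast on 4xx/5xx status codes instead of saving an error page
    response.raise_for_status()
    # Get the response body as a string (the page's HTML source)
    page_text = response.text
    # Persist it; encoding='utf-8' avoids platform-default encoding issues
    with open(path, 'w', encoding='utf-8') as fp:
        fp.write(page_text)
    return page_text

# Example usage (requires network access):
# fetch_and_save('https://www.sogou.com/', './sougou.html')
```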