Python crawler: using the requests module

The requests module

requests is a third-party Python library for sending network requests. It is powerful, simple and convenient to use, and highly efficient; it is used to send requests that simulate a browser visiting a website.
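As a quick illustration of "simulating a browser", here is a minimal sketch (the User-Agent string below is only an example, not a required value):

import requests

# A browser-like User-Agent header makes the request look like it
# comes from an ordinary browser rather than a script
headers = {'User-Agent': 'Mozilla/5.0'}

response = requests.get('https://www.sogou.com/', headers=headers)
print(response.status_code)  # 200 indicates success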

Basic usage

  1. Specify the URL
  2. Send the request
  3. Get the response data
  4. Persist the data

The example below walks through each of these four steps.

Environment installation

It is recommended to download Anaconda directly and use Anaconda's default environment in PyCharm, which already includes the requests module.
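If you prefer not to use Anaconda, the module can also be installed into any Python environment with pip:

pip install requests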

A first crawler example: fetching the HTML source of the Sogou home page

# Fetch the Sogou home page

import requests

# 1. Specify the URL
sougouURL = 'https://www.sogou.com/'

# 2. Send the request
# requests.get() returns a Response object
response = requests.get(url=sougouURL)

# 3. Get the response data
# .text returns the response body as a string,
# here the HTML source of the page
pageText = response.text
print(pageText)

# 4. Persist the data
# specify the encoding explicitly so the file is written
# correctly regardless of the platform default
with open('./sougou.html', 'w', encoding='utf-8') as fp:
    fp.write(pageText)

print('Finished crawling')

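Beyond .text, the Response object returned by requests.get() exposes a few other commonly used attributes; a short sketch, reusing the response from the example above:

# status_code: the HTTP status code of the response
print(response.status_code)   # e.g. 200

# content: the raw response body as bytes (useful for binary data)
rawBytes = response.content

# encoding: the encoding used to decode .text
print(response.encoding)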
