Using hyper in Python to crawl web pages served over HTTP/2

        I recently ran into problems crawling data from the Hong Kong Stock Exchange website. The Hong Kong Stock Exchange serves its pages over HTTP/2, while most other sites still use HTTP/1.1, so my usual crawling approach did not work. In the end I found that the hyper library can handle it.

        First install: pip install hyper

        Then import hyper:

from hyper import HTTPConnection

        API documentation: https://hyper.readthedocs.io/en/latest/index.html

        To crawl with hyper, the host name needs port 443 (the HTTPS port) appended to it. Code:

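# Append port 443 to the host name
conn = HTTPConnection('www.hkex.com.hk:443')
conn.request('GET', '/chi/stat/smstat/dayquot/d210219c.htm', None, None)
resp = conn.get_response()
# Do not decode the returned data (the hyper source has examples).
# decode_content controls Content-Encoding decoding (e.g. gzip), not the
# character set; if the argument is omitted, the body is decoded by default.
s = resp.read(decode_content=False)
print(s)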
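
        Since hyper takes the host (with its port) and the path separately, it can be handy to derive both from a full URL. The following is only a sketch under that assumption: split_for_hyper and fetch are hypothetical helper names, not part of hyper, and hyper is imported inside fetch so the URL helper can be tried without it.

```python
from urllib.parse import urlsplit


def split_for_hyper(url):
    """Split a URL into a 'host:port' string and a request path (hypothetical helper)."""
    parts = urlsplit(url)
    # Fall back to the scheme's standard port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == 'https' else 80)
    path = parts.path or '/'
    if parts.query:
        path += '?' + parts.query
    return '{}:{}'.format(parts.hostname, port), path


def fetch(url):
    """Fetch a URL with hyper and return the body bytes (hypothetical helper)."""
    # Imported here so split_for_hyper works even when hyper is not installed.
    from hyper import HTTPConnection

    host, path = split_for_hyper(url)
    conn = HTTPConnection(host)
    conn.request('GET', path)
    resp = conn.get_response()
    # Same read call as in the post: return the body without decoding it.
    return resp.read(decode_content=False)
```

        With this, fetch('https://www.hkex.com.hk/chi/stat/smstat/dayquot/d210219c.htm') would issue the same request as the code above.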

        This is enough for basic use; see the API documentation for more advanced usage.
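
        One step beyond basic use is turning the bytes returned by read() into text. The sketch below assumes you already have the Content-Type header value as a string and that the body is no longer compressed; charset_from_content_type and decode_body are hypothetical helper names, and falling back to utf-8 is my assumption, not something hyper does.

```python
from email.message import Message


def charset_from_content_type(content_type, default='utf-8'):
    """Return the charset named in a Content-Type value, or the default."""
    # The stdlib email parser understands 'type/subtype; charset=...' values.
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_content_charset() or default


def decode_body(body, content_type):
    """Decode body bytes to str using the charset from Content-Type."""
    charset = charset_from_content_type(content_type)
    # Replace undecodable bytes instead of raising, so a crawl keeps going.
    return body.decode(charset, errors='replace')
```

        For example, decode_body(s, 'text/html; charset=big5') would decode s as Big5, while a Content-Type with no charset falls back to utf-8.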

Origin blog.csdn.net/qq_41061437/article/details/113952370