[python] Crawler notes (1): concepts

[Crawler] Crawlers are classified by usage scenario as follows:

1. General crawler
An important part of a crawling system; it fetches the data of an entire page.

2. Focused crawler
Built on top of a general crawler; it extracts specific parts of a page's content.

3. Incremental crawler
Detects data updates on a website and crawls only the newly updated data.
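
The three crawler types can be contrasted with a small offline sketch. Everything here is an assumption made for illustration: the `SITE` dictionary stands in for real HTTP requests, the URLs are placeholders, and the function names are hypothetical. Extracting the page title is just one example of "specific partial content", and comparing content hashes is one possible way to detect updates.

```python
import hashlib
from html.parser import HTMLParser

# Hypothetical in-memory "website" (URL -> HTML), used instead of real requests.
SITE = {
    "https://example.com/a": "<html><head><title>Page A</title></head><body><p>old</p></body></html>",
    "https://example.com/b": "<html><head><title>Page B</title></head><body><p>text</p></body></html>",
}


def general_crawl(url):
    """General crawler: return the data of the whole page."""
    return SITE[url]


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> tag."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def focused_crawl(url):
    """Focused crawler: fetch the whole page, then keep only one part of it."""
    parser = TitleParser()
    parser.feed(general_crawl(url))
    return parser.title


def incremental_crawl(urls, seen_hashes):
    """Incremental crawler: return only pages that changed since the last run."""
    updated = {}
    for url in urls:
        page = general_crawl(url)
        digest = hashlib.sha256(page.encode("utf-8")).hexdigest()
        if seen_hashes.get(url) != digest:
            seen_hashes[url] = digest  # remember the latest version
            updated[url] = page
    return updated
```

Calling `incremental_crawl` twice with the same `seen_hashes` dictionary returns every page on the first run and nothing on the second, until a page in `SITE` changes.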

HTTP protocol

  • Concept: a form of data interaction between a server and a client
  • Common request headers
    User-Agent: identifies the request carrier (the client sending the request, such as a browser or a script)
    Connection: whether to close the connection or keep it alive after the request completes
  • Common response headers
    Content-Type: the type of data the server sends back to the client
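
As a minimal sketch of how these headers look in a crawler, the following uses only the standard library's `urllib.request.Request` and builds a request without sending it. The User-Agent string, the URL, and the helper names `build_request` and `is_html` are assumptions chosen for illustration.

```python
from urllib.request import Request


def build_request(url):
    """Build (but do not send) a GET request carrying the two request headers."""
    headers = {
        # Hypothetical browser-like identity string; a real crawler picks its own.
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) demo-crawler/0.1",
        # Ask the server to close the connection once the response is sent.
        "Connection": "close",
    }
    return Request(url, headers=headers)


def is_html(content_type):
    """Return True if a Content-Type response header value denotes an HTML page."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


req = build_request("https://example.com/")
```

Note that `urllib` stores header names in capitalized form, so the header is read back with `req.get_header("User-agent")`. A crawler would typically pass the response's Content-Type value to something like `is_html` before trying to parse the body as a page.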

HTTPS protocol

  • Secure Hypertext Transfer Protocol
  • Encryption methods (overview)
    Symmetric key encryption
    Asymmetric key encryption
    However, the public key is still at risk of being hijacked in transit, and asymmetric encryption is inefficient
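
The difference between the two methods can be shown with a toy sketch. This is not real cryptography: the XOR cipher and the tiny RSA key pair (primes 61 and 53) are assumptions picked only to keep the arithmetic visible, and all names are hypothetical. The sketch also assumes the common pattern in which the slow asymmetric step protects only a short symmetric key, while the symmetric key encrypts the actual data.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: the same key both encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Toy RSA key pair built from tiny primes (insecure, for illustration only).
P, Q = 61, 53
N = P * Q                           # modulus, part of both keys
E = 17                              # public exponent: (E, N) is the public key
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (requires Python 3.8+)


def rsa_encrypt(m: int) -> int:
    """Anyone holding the public key can encrypt."""
    return pow(m, E, N)


def rsa_decrypt(c: int) -> int:
    """Only the holder of the private key can decrypt."""
    return pow(c, D, N)


# The client picks a symmetric key and sends it encrypted with the public key.
session_key = 90
encrypted_key = rsa_encrypt(session_key)

# The server recovers the symmetric key with its private key.
recovered_key = rsa_decrypt(encrypted_key)

# Both sides now use the fast symmetric cipher for the data itself.
ciphertext = xor_cipher(b"hello", bytes([session_key]))
plaintext = xor_cipher(ciphertext, bytes([recovered_key]))
```

If an attacker replaces the public key before it reaches the client, the client encrypts the session key for the attacker instead of the server, which is the hijacking risk mentioned above.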

Certificate key encryption (https)
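
With certificates, a trusted certificate authority signs the server's public key, so the client can check that the key was not swapped in transit. As a small sketch, Python's standard `ssl` module enables this check by default; the snippet below only inspects the default client settings and makes no network connection. The variable names are chosen for illustration.

```python
import ssl

# Default client-side TLS settings, as used for HTTPS connections.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted authority.
certificate_required = context.verify_mode == ssl.CERT_REQUIRED

# The certificate must also match the host name being requested.
hostname_checked = context.check_hostname
```

Crawlers sometimes disable these checks to get past certificate errors, but doing so brings back the hijacking risk that certificates are meant to remove.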


Origin blog.csdn.net/Sgmple/article/details/112002911