HTTP basic principles (web page request and response)

A request for a web page can be divided into the following parts:

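  1. Request URL: the address of the resource being requested
  2. Request headers: additional information sent along with the request
  3. Request body: the data carried by the request
  4. Request method: the kind of action being requested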
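
To make the four parts concrete, here is a minimal sketch of how they fit together in the text of an HTTP/1.1 request message. The function name `build_raw_request`, the URL `http://example.com/login`, and the form values are made up for illustration. Real clients (browsers, `urllib`, and so on) assemble this for you and usually add more headers.

```python
from urllib.parse import urlsplit


def build_raw_request(method, url, headers, body=b""):
    """Assemble the raw bytes of a simple HTTP/1.1 request (illustrative only)."""
    parts = urlsplit(url)
    # The request line carries the method and the path (plus query) from the URL.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [f"{method} {target} HTTP/1.1", f"Host: {parts.netloc}"]
    # Request headers: one "Name: value" line each.
    for name, value in headers.items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the request body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_raw_request(
    "POST",
    "http://example.com/login",  # hypothetical URL
    {"Content-Type": "application/x-www-form-urlencoded"},
    body=b"username=alice&password=secret",  # hypothetical form data
)
print(raw.decode("ascii"))
```

Printing `raw` shows the method and path on the first line, the headers below it, and the body after the blank line.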

The components of the request in more detail:

  1. Request URL: a URL, or uniform resource locator, identifies a specific resource on the server. In other words, it tells the browser which resource you want it to fetch.
  2. Request headers: headers pass additional information to the server. One of the more important request headers is:

    Cookie: used to maintain login state. When you open a site such as Youku and find that you are logged in without typing your account and password, that is the cookie at work.

    The request headers are an important part of the request. Most crawlers need to attach this information, although some simple crawlers can get by without it.

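  3. Request method: only the two most practical methods are introduced here.

    GET: GET requests are used to retrieve a resource, and any parameters are carried in the URL itself.

    POST: POST requests are mostly used to submit forms, which often contain encrypted information, and they can also upload files. You could call POST a relatively low-key boss.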
  4. Request body: generally speaking, the request body goes with a POST request. It carries the form data sent with the request. Only that relatively low-key boss gets this kind of treatment, haha.
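
As a rough illustration of the components above, the sketch below builds a GET and a POST request with Python's standard `urllib.request.Request` and inspects them without sending anything. The URLs, the `User-Agent` string, the cookie value, and the form fields are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values used only for illustration.
headers = {
    "User-Agent": "Mozilla/5.0 (compatible; demo-crawler)",
    "Cookie": "sessionid=abc123",  # keeps the login state across requests
}

# GET: parameters are appended to the URL; there is no request body.
get_req = Request(
    "http://example.com/search?" + urlencode({"q": "http basics"}),
    headers=headers,
)

# POST: the form data is URL-encoded and sent as the request body.
form = urlencode({"username": "alice", "password": "secret"}).encode("ascii")
post_req = Request("http://example.com/login", data=form, headers=headers)

for req in (get_req, post_req):
    print(req.get_method(), req.full_url)
    print("  headers:", dict(req.header_items()))
    print("  body:", req.data)

# Actually sending a request would be: urllib.request.urlopen(post_req)
```

Note that `Request` picks the method from the presence of `data`: with a body it defaults to POST, without one it defaults to GET.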

Server response

The server's response can be divided into three parts:

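  1. Response status code: the status code tells you how the server handled the request. Commonly seen codes include, for example, 200 (success), 404 (page not found), and 500 (internal server error).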

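  2. Response headers: here are a few common ones.

    Content-Type: describes the format of the returned content. application/json means the content is JSON; text/html means it is an HTML file.

    Content-Encoding: specifies how the response content is encoded (for example, gzip compression).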

  3. Response body: this is the big one. A crawler's job is to analyze the response body, which is the body data returned after we send a request to the URL.
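
The three parts of a response can be sketched in code as well. The example below takes a hand-built raw response (so it needs no network access), splits it into status code, headers, and body, and then decodes the body based on `Content-Encoding` and `Content-Type`. The helper names `parse_response` and `decode_body` and the sample payload are assumptions made for illustration; in practice an HTTP library does this parsing for you.

```python
import gzip
import json
from http import HTTPStatus


def parse_response(raw):
    """Split raw HTTP response bytes into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


def decode_body(headers, body):
    """Undo Content-Encoding, then interpret the body according to Content-Type."""
    if headers.get("content-encoding") == "gzip":
        body = gzip.decompress(body)
    if headers.get("content-type", "").startswith("application/json"):
        return json.loads(body)
    return body.decode("utf-8")  # assumes UTF-8 text such as text/html


# A made-up response: a gzip-compressed JSON body behind three headers.
payload = gzip.compress(json.dumps({"title": "hello"}).encode("utf-8"))
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Encoding: gzip\r\n"
    + f"Content-Length: {len(payload)}\r\n\r\n".encode("ascii")
    + payload
)

code, headers, body = parse_response(raw)
data = decode_body(headers, body)
print(code, HTTPStatus(code).phrase)
print(headers["content-type"])
print(data)
```

A crawler would typically check the status code first, then hand the decoded body to whatever parser suits its content type.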
