[Illustrated HTTP reading notes] Chapter 2: Simple HTTP protocol


This chapter explains the structure of the HTTP protocol, mainly using the HTTP/1.1 version.

2.1 HTTP protocol is used for communication between client and server

HTTP clearly separates the two roles: on any one communication line, one end is the client and the other is the server.

2.2 Communication is achieved through the exchange of requests and responses

In other words, communication is always initiated by the client; the server never sends a response unless it has received a request.
Example:
The following is the content of the request message sent from the client to an HTTP server:

GET /index.htm HTTP/1.1
Host: hackr.jp
  • GET: the type of request made to the server (called the method)
  • /index.htm: the resource being requested (also called the request URI)
  • HTTP/1.1: the HTTP version number
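
As a minimal sketch of how these three parts fit together, the Python snippet below splits the example request line on spaces. The function name parse_request_line is made up for illustration and is not part of any library.

```python
def parse_request_line(line):
    """Split an HTTP/1.1 request line into (method, request URI, version)."""
    # The three parts of a request line are separated by single spaces.
    method, uri, version = line.strip().split(" ")
    return method, uri, version


method, uri, version = parse_request_line("GET /index.htm HTTP/1.1")
print(method)   # the method
print(uri)      # the request URI
print(version)  # the protocol version
```

A real server also has to reject malformed lines; this sketch assumes the line is well formed.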

Request message: the method, request URI, protocol version, optional request header fields, and entity body.
The header fields and the entity body are covered in detail later. Next, look at the server's response:

  • HTTP/1.1: the HTTP version number
  • 200 OK: the status code and reason phrase
  • Date: the date and time the response was created (a header field)
  • After the blank line comes the entity body
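
The layout described above can be sketched in Python as follows. The response text and its header values are made up for this example, and split_response is a hypothetical helper, not a library function.

```python
# Illustrative response text; the header values are invented for this sketch.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Tue, 10 Jul 2012 06:50:15 GMT\r\n"
    "Content-Length: 13\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html></html>"
)


def split_response(raw):
    """Return (version, status code, reason phrase, headers, body)."""
    # The first blank line separates the header section from the entity body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return version, int(code), reason, headers, body


version, code, reason, headers, body = split_response(RAW_RESPONSE)
print(version, code, reason)
print(headers["Date"])
print(body)
```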

Response message: the protocol version, status code, reason phrase, optional response header fields, and entity body.

2.3 HTTP is a protocol that does not save state (stateless protocol)

HTTP itself does not save any communication state between requests and responses. In other words, at the HTTP level the protocol keeps no record of requests or responses that have already been handled.

  • This design lets the protocol process a large number of transactions quickly and keeps it scalable.
  • Some applications do need to keep the user's state, however. Cookie technology was introduced to provide that state-keeping function.

2.4 Locating resources with the request URI

When a client requests access to a resource and sends a request, the URI needs to be included as the request URI in the request message.

To make a request to the server itself rather than to a specific resource, use * in place of the request URI. The example below queries which HTTP methods the server supports.

OPTIONS * HTTP/1.1

2.5 HTTP methods that tell the server the intent

Below, we introduce the methods available in HTTP/1.1:

GET: Get resources

The GET method requests the resource identified by the request URI. The server processes the specified resource and returns the result as the response content. If the resource is plain text, it is returned as is; if it is a program such as a CGI script, its output after execution is returned.

POST: transfer entity body

Although GET can also carry an entity body, it is generally not used that way; POST is used instead.
The difference between GET and POST (interview question)
Many sources say that GET usually puts its data in the URL while POST puts its data in the body. That is not strictly accurate: in theory, you can put POST data in the URL and GET data in the body.

  • 1. GET is used to obtain resources from the server, and POST is used to submit data to the server (in practice the original design intent is often not followed strictly, and either method may be used to obtain resources or submit data)
  • 2. GET can carry less data because URL length is limited, while POST can carry more (this was true 20 years ago, when the longest URLs were 1k-2k; URLs can now be much longer, possibly even a few megabytes)
  • 3. POST is more secure than GET (this security is largely self-deception: the password simply moves into the body instead of the URL, so it does not show in the address bar, but anyone who knows how to capture packets can still see it)

PUT: transfer files

The PUT method is used to transfer files. As with file upload over FTP, the file content is placed in the body of the request message and saved at the location specified by the request URI.
However, the PUT method of HTTP/1.1 has no authentication mechanism of its own, so anyone could upload files, which is a security problem. For that reason, general websites do not use it. It may be enabled when combined with a web application's own authentication mechanism, or on websites whose architecture follows the REST style.

HEAD: Get the message headers

The HEAD method works like GET, except that the response does not include the message body. It is used to check whether a URI is valid and when the resource was last updated.
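
To see the difference in practice, here is a rough sketch that starts a throwaway server on the local machine with Python's standard http.server module and sends it one GET and one HEAD using http.client. The handler class, the PAGE content, and the fetch helper are all invented for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

PAGE = b"<html>hello</html>"  # made-up resource content


class PageHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)  # headers plus body

    def do_HEAD(self):
        self._send_headers()    # same headers, no body

    def log_message(self, format, *args):
        pass                    # keep the demo quiet


def fetch(method):
    """Send one request to a local test server; return (status, Content-Length, body)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request(method, "/index.htm")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Content-Length"), resp.read()
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


print(fetch("GET"))
print(fetch("HEAD"))
```

Both calls report the same status and Content-Length header, but only the GET response carries the body bytes.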

DELETE: delete files

DELETE is the opposite of PUT: it deletes the resource specified by the request URI. Like PUT, it has no authentication mechanism of its own, so it is generally not used.

OPTIONS: Ask which methods are supported

The OPTIONS method queries the methods supported for the resource specified by the request URI.

TRACE: Trace the path

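TRACE makes the server echo the received request back to the client, so that the client can see how the request was changed along the way. It easily leads to XST (Cross-Site Tracing) attacks and is usually not used.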

CONNECT: request a tunnel through a proxy

The CONNECT method asks a proxy server to establish a tunnel so that TCP communication can pass through it. It is mainly used with the SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols, which encrypt the communication content before it is sent through the network tunnel.

CONNECT proxy-server-name:port-number HTTP-version


2.6 Using methods to issue commands

When sending a request message to the resource specified by the request URI, a command called a method is used.


2.7 Persistent connections reduce communication overhead

In the initial versions of HTTP, the TCP connection was closed after every single HTTP exchange.
When a browser loads an HTML page containing many images, every request then triggers an unnecessary TCP connection setup and teardown, which increases communication overhead.

2.7.1 Persistent connection

Persistent connections solve the problem above. Their defining feature is that the TCP connection stays open as long as neither end explicitly asks to close it.

  • Reduces the extra overhead caused by repeatedly establishing and closing TCP connections
  • In HTTP/1.1, all connections are persistent by default
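
The effect can be observed with a small experiment. The sketch below is only an illustration under stated assumptions: it runs a local test server from Python's http.server module, records which client address each request arrived from, and compares reusing one http.client connection with opening a new one per request. KeepAliveHandler, seen_clients, and fetch_pages are names invented here.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_clients = []  # (ip, port) of the TCP connection behind each request


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

    def do_GET(self):
        seen_clients.append(self.client_address)
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_pages(paths, reuse):
    """Request every path; return the client address the server saw for each request."""
    seen_clients.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = None
    try:
        for path in paths:
            if conn is None:
                conn = http.client.HTTPConnection(
                    "127.0.0.1", server.server_port, timeout=5)
            conn.request("GET", path)
            conn.getresponse().read()  # read fully so the socket can be reused
            if not reuse:
                conn.close()           # old style: one TCP connection per request
                conn = None
        return list(seen_clients)
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()


paths = ["/index.htm", "/a.png", "/b.png"]
print("TCP connections used with reuse:", len(set(fetch_pages(paths, reuse=True))))
print("TCP connections used without reuse:", len(set(fetch_pages(paths, reuse=False))))
```

With reuse, all three requests arrive over the same TCP connection, so the connection setup cost is paid only once.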

2.7.2 Pipelining

Persistent connections make pipelining possible: multiple requests can be sent one after another without waiting for each response before sending the next.
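
A rough way to try this out is to write several requests to one socket before reading anything back. The sketch below does so against a local test server built on Python's http.server module; the handler and the pipeline helper are hypothetical names, and the last request carries Connection: close only so that the client knows when to stop reading.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = ("you asked for " + self.path).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def pipeline(paths):
    """Send all requests back to back on one socket, then read every response."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        requests = []
        for i, path in enumerate(paths):
            lines = ["GET " + path + " HTTP/1.1", "Host: 127.0.0.1"]
            if i == len(paths) - 1:
                lines.append("Connection: close")  # server closes after the last one
            requests.append("\r\n".join(lines) + "\r\n\r\n")
        address = ("127.0.0.1", server.server_port)
        with socket.create_connection(address, timeout=5) as sock:
            # All requests leave before any response has been read.
            sock.sendall("".join(requests).encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # server closed the connection
                    break
                chunks.append(data)
        return b"".join(chunks).decode("ascii")
    finally:
        server.shutdown()
        server.server_close()


print(pipeline(["/a.png", "/b.png"]))
```

The responses come back on the same connection in the order the requests were sent.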

2.8 State management using cookies

As mentioned earlier, HTTP is a stateless protocol: it keeps no information about previous requests. This is inconvenient in some cases, which is why cookies were introduced. Statelessness also has an advantage: it reduces the load on the server's CPU and memory.
Cookie technology controls the client's state by writing cookie information into request and response messages.
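
The exchange can be sketched without any network at all, using Python's standard http.cookies module. The cookie name sid and the three helper functions are assumptions made up for this illustration; real browsers and servers also handle attributes such as expiry and path.

```python
from http.cookies import SimpleCookie


def server_issue_cookie(session_id):
    """Server side: build the Set-Cookie header line for a response."""
    cookie = SimpleCookie()
    cookie["sid"] = session_id  # "sid" is a made-up cookie name
    return cookie.output()


def client_cookie_header(set_cookie_line):
    """Client side: turn a stored Set-Cookie line into a Cookie request header."""
    _, _, value = set_cookie_line.partition(": ")
    cookie = SimpleCookie()
    cookie.load(value)
    pairs = "; ".join(name + "=" + morsel.value for name, morsel in cookie.items())
    return "Cookie: " + pairs


def server_read_cookie(cookie_line):
    """Server side: recover the cookie values from a later request."""
    _, _, value = cookie_line.partition(": ")
    cookie = SimpleCookie()
    cookie.load(value)
    return {name: morsel.value for name, morsel in cookie.items()}


set_cookie = server_issue_cookie("abc123")         # first response
cookie_header = client_cookie_header(set_cookie)   # next request
print(set_cookie)
print(cookie_header)
print(server_read_cookie(cookie_header))
```

The server stays stateless at the HTTP level: it recognizes the client only because the client sends the cookie back with each request.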
For more related introduction, see: Deep Understanding of HTTP Protocol

References: <Illustrated HTTP>


Origin blog.csdn.net/weixin_45532227/article/details/112741838