An overview of the HTTP protocol

An HTTP request message is roughly divided into three parts: the first is the request line, the second is the request headers, and the third is the request body (the entity).
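
To make the structure concrete, here is a minimal sketch of such a request on the wire; the host, path, and body are made up for illustration:

```python
# A hypothetical raw HTTP/1.1 request, annotated part by part.
raw_request = (
    "POST /articles HTTP/1.1\r\n"        # 1. request line: method, path, version
    "Host: example.com\r\n"              # 2. request headers start here
    "Content-Type: application/json\r\n"
    "Content-Length: 15\r\n"
    "\r\n"                               # blank line separates headers from body
    '{"title": "hi"}'                    # 3. request body (the entity)
)
print(raw_request)
```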

POST is often used to create a resource, and PUT is often used to modify a resource.

Accept-Charset indicates which character sets the client can accept. This prevents the server from responding in a character set the client cannot handle, which would result in garbled text.
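
As a rough illustration of both points (the host, paths, and bodies below are hypothetical), Python's standard http.client could send requests like these:

```python
import http.client

# Hypothetical host and paths, used only to illustrate POST vs PUT and Accept-Charset.
headers = {
    "Accept-Charset": "utf-8",               # the client only accepts UTF-8 text
    "Content-Type": "application/json",
}

conn = http.client.HTTPSConnection("example.com")

# POST: create a new resource under a collection
conn.request("POST", "/articles", body='{"title": "new"}', headers=headers)
resp = conn.getresponse()
resp.read()                                   # drain the response before reusing the connection
print("POST ->", resp.status)

# PUT: modify (replace) a specific, already-known resource
conn.request("PUT", "/articles/42", body='{"title": "updated"}', headers=headers)
resp = conn.getresponse()
resp.read()
print("PUT ->", resp.status)

conn.close()
```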

In the HTTP header, Cache-Control is used to control caching. When a client request carries the max-age directive, the cache layer checks whether the age of the cached resource is within the specified value; if so, the client can accept the cached copy. When max-age is set to 0, the cache layer usually has to forward the request to the application cluster.
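
A minimal sketch of that decision, assuming a hypothetical cache layer that records when each resource was stored:

```python
import time

def can_serve_from_cache(cached_at: float, max_age: int) -> bool:
    """Hypothetical cache-layer check for a request carrying Cache-Control: max-age.

    max_age == 0 forces the request through to the application cluster;
    otherwise the cached copy is usable only if it is younger than max_age seconds.
    """
    if max_age == 0:
        return False                      # must go to the backend
    age = time.time() - cached_at         # how long the resource has been cached
    return age < max_age

# e.g. a copy cached 30 s ago, request says max-age=60 -> serve from cache
print(can_serve_from_cache(time.time() - 30, 60))   # True
print(can_serve_from_cache(time.time() - 30, 0))    # False
```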

If-Modified-Since is also related to caching. If the server's resource has been updated after the given time, the client should download the latest version; if it has not, the server returns a 304 Not Modified response and the client does not need to download it again, which also saves bandwidth.
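
As an illustration, a conditional request could look like the following sketch; the URL and date are made up, and urllib surfaces a 304 as an HTTPError:

```python
import urllib.request
import urllib.error

# Hypothetical URL; the date is the client's record of when it last fetched the resource.
req = urllib.request.Request(
    "https://example.com/logo.png",
    headers={"If-Modified-Since": "Wed, 01 Jan 2025 00:00:00 GMT"},
)
try:
    with urllib.request.urlopen(req) as resp:
        print(resp.status, "resource changed, download the new copy")
except urllib.error.HTTPError as e:
    if e.code == 304:
        print("304 Not Modified: keep using the cached copy")
    else:
        raise
```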

The HTTP protocol is based on TCP, so it sends requests in a connection-oriented way, delivering them to the other side as a binary stream. At the TCP layer, of course, the stream is split into segments and sent to the server.
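
To show that the request really is just bytes written into a TCP stream, here is a minimal sketch using a raw socket; example.com and port 80 are stand-ins:

```python
import socket

# Build a plain-text HTTP/1.1 request as bytes.
request = (
    "GET / HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: close\r\n"
    "\r\n"
).encode()

with socket.create_connection(("example.com", 80)) as sock:
    sock.sendall(request)                 # the request goes out as a byte stream;
    response = b""                        # TCP splits it into segments underneath
    while chunk := sock.recv(4096):
        response += chunk

print(response.split(b"\r\n")[0])         # status line, e.g. b'HTTP/1.1 200 OK'
```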

The IP layer checks whether the target address is on the same LAN as itself. If it is, it sends an ARP request for the MAC address corresponding to the target IP, puts the source MAC and target MAC into the MAC header, and sends the frame out. If they are not on the same LAN, the packet has to go to the gateway, so ARP is used to obtain the gateway's MAC address, and the source MAC and gateway MAC are put into the MAC header before the frame is sent.
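
A conceptual sketch of that decision, where arp_lookup is a hypothetical placeholder for a real ARP request:

```python
import ipaddress

def arp_lookup(ip: str) -> str:
    # Placeholder: a real implementation would broadcast an ARP request for this IP.
    return f"<MAC of {ip}>"

def next_hop_mac(src_ip: str, dst_ip: str, netmask: str, gateway_ip: str) -> str:
    """Conceptual sketch of the IP-layer decision described above."""
    network = ipaddress.ip_network(f"{src_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(dst_ip) in network:
        return arp_lookup(dst_ip)        # same LAN: ARP for the target's MAC
    return arp_lookup(gateway_ip)        # different LAN: ARP for the gateway's MAC

print(next_hop_mac("192.168.1.10", "192.168.1.20", "255.255.255.0", "192.168.1.1"))
print(next_hop_mac("192.168.1.10", "8.8.8.8", "255.255.255.0", "192.168.1.1"))
```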

When the gateway receives the packet and finds that the destination MAC matches its own, it takes out the target IP address, finds the next hop according to the routing protocol, obtains the MAC address of the next-hop router, and forwards the packet to it.

HTTP response messages also follow a fixed format; again, the description here is based on HTTP 1.1.
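
For illustration, here is a minimal sketch of a response on the wire, with made-up headers and body:

```python
# A hypothetical raw HTTP/1.1 response, annotated part by part.
raw_response = (
    "HTTP/1.1 200 OK\r\n"                 # 1. status line: version, code, phrase
    "Content-Type: text/plain\r\n"        # 2. response headers
    "Content-Length: 5\r\n"
    "Cache-Control: max-age=60\r\n"
    "\r\n"                                # blank line separates headers from body
    "hello"                               # 3. response body
)
print(raw_response)
```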

HTTP 1.1 communicates in plain text at the application layer. Every request must carry a complete HTTP header, and, pipelining aside, each request and response goes through the full round trip described above. This causes problems for both latency and concurrency.

HTTP 2.0 compresses the HTTP header to a certain extent: both ends maintain an index table for the many key-value pairs that previously had to be carried on every request, so for a repeated header only its index in the table is sent.
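
A toy illustration of the indexing idea follows; the real mechanism is HPACK, which also involves static and dynamic tables and Huffman coding, so this is only a sketch:

```python
# Toy illustration of the header-index idea, not real HPACK.
class HeaderTable:
    def __init__(self):
        self.index = {}                   # (name, value) -> index number

    def encode(self, name, value):
        key = (name, value)
        if key in self.index:
            return self.index[key]        # header seen before: send only the index
        self.index[key] = len(self.index) + 1
        return key                        # first time: send the full key-value pair

table = HeaderTable()
print(table.encode("user-agent", "demo/1.0"))   # full pair on first use
print(table.encode("user-agent", "demo/1.0"))   # just an index afterwards
```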

The HTTP 2.0 protocol divides a TCP connection into multiple streams. Each stream has its own ID, and a stream can carry data from the client to the server or from the server to the client; it is really just a virtual channel. Streams also have priorities.

HTTP 2.0 also breaks all transmitted information into smaller messages and frames and encodes them in a binary format. Common frame types include the HEADERS frame, which carries header content and opens a new stream, and the DATA frame, which carries the body entity; multiple DATA frames can belong to the same stream.

Through these two mechanisms, an HTTP 2.0 client can spread multiple requests across different streams and then split each request into frames for binary transmission. The frames can be interleaved and sent out of order, reassembled according to the stream identifier in each frame's header, and processed in order of stream priority.
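
The following toy sketch models that behavior: frames tagged with a stream ID arrive interleaved and are regrouped per stream. It is not the real wire format, just the bookkeeping idea:

```python
from collections import defaultdict
from dataclasses import dataclass

# Toy model of HTTP/2 framing: frames carry a stream ID, may arrive interleaved,
# and are regrouped per stream on the receiving side.
@dataclass
class Frame:
    stream_id: int
    frame_type: str      # "HEADERS" opens a stream, "DATA" carries the body
    payload: bytes

# Frames for two responses (streams 1 and 3) interleaved over one connection.
wire = [
    Frame(1, "HEADERS", b":status: 200"),
    Frame(3, "HEADERS", b":status: 200"),
    Frame(3, "DATA", b"body of the response on stream 3"),
    Frame(1, "DATA", b"body of the response on stream 1"),
]

streams = defaultdict(list)
for frame in wire:
    streams[frame.stream_id].append(frame)          # reassemble by stream ID

for stream_id, frames in sorted(streams.items()):
    print(stream_id, [f.frame_type for f in frames])
```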

QUIC is based on UDP, so the connection mechanism is maintained in QUIC's own logic. A connection is no longer identified by the four-tuple but by a 64-bit random number used as the connection ID. Moreover, since UDP is connectionless, when the IP address or port changes, the connection does not need to be re-established as long as the ID stays the same.

QUIC defines the concept of an offset. Since QUIC is connection-oriented, it carries a data stream just as TCP does, and every piece of data sent has an offset within that stream. The offset is used to track how far the data has been delivered: if the packet at a given offset has not arrived, it must be retransmitted; once it arrives, the data can be spliced back into a single stream according to the offsets.
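
A small sketch of that splicing, ignoring real QUIC framing, encryption, and flow control:

```python
# Toy sketch of reassembling a stream from (offset, data) chunks, as QUIC does
# conceptually.
def reassemble(chunks, total_length):
    buffer = bytearray(total_length)
    received = set()
    for offset, data in chunks:               # chunks may arrive in any order
        buffer[offset:offset + len(data)] = data
        received.update(range(offset, offset + len(data)))
    missing = sorted(set(range(total_length)) - received)
    return bytes(buffer), missing             # missing offsets must be retransmitted

data, missing = reassemble([(0, b"hel"), (5, b"world")], 10)
print(data, missing)    # gap at offsets 3-4 -> those bytes need retransmission
```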

In the TCP protocol, the start of the receiver's window is the next packet to be received and acknowledged. Even if later packets arrive and are placed in the buffer, the window cannot slide right, because TCP's ACK mechanism is cumulative over sequence numbers: once a sequence number is acknowledged, everything before it is assumed to have arrived. So as long as an earlier packet has not arrived, later packets cannot be acknowledged, and even though they have already arrived they may time out and be retransmitted, which wastes bandwidth.

QUIC's ACK is based on offsets. Each packet can be acknowledged as soon as it arrives and enters the buffer, and once acknowledged it will not be retransmitted; only the gaps in between wait for delivery or retransmission. The start of the window is the largest offset received so far, and the span from that offset to the maximum the current stream's buffer can hold is the real window size. Clearly, this is more accurate.
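
A sketch contrasting the two acknowledgement styles, under the simplified view above where each packet carries one sequence number or one offset unit:

```python
def tcp_cumulative_ack(received: set) -> int:
    """TCP-style: acknowledge only up to the first gap in sequence numbers."""
    ack = 0
    while ack in received:
        ack += 1
    return ack                      # packets beyond the gap stay unacknowledged

def quic_window_start(received: set) -> int:
    """QUIC-style (per the text): the window starts at the largest offset received."""
    return max(received)

arrived = {0, 1, 2, 4, 5}           # packet 3 was lost
print(tcp_cumulative_ack(arrived))  # 3 -> 4 and 5 arrived but may still be retransmitted
print(quic_window_start(arrived))   # 5 -> every packet that arrived was already ACKed
```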

This article is a study note for Day 14 of September. The content comes from Geek Time's "Internet Protocol" course, which is recommended.
