The differences and connections between HTTP/1.1, HTTP/2 and HTTP/3

HTTP/1.1

Whenever HTTP/1.1 comes up, think of how takeaway used to be ordered. Back then, many restaurants had no dedicated delivery staff: if you phoned in an order, the boss would send a clerk out with the food. This had one big problem, which is that the clerk always forgot to pack the chopsticks. So after delivering the meal, he had to make a second trip to bring anything else. And if you wanted to add a chicken leg to your order, you had to wait until the clerk got back to the shop before he could head out again. This "one thing at a time" pattern corresponds to the core of HTTP/1.1.
HTTP/1.1 was the first HTTP version to become a true Internet standard. Why call "one thing at a time" its core?
Because after sending an HTTP request, you must wait for the HTTP response before you can send the next request on that connection.
When we browse a web page, one mouse click looks like a single request. In reality, a page is made up of many files, the most basic being HTML, CSS, JS and image files. Under HTTP, opening a page first requires a TCP three-way handshake to establish the connection before any formal request can be made. The server then sends us the HTML file first, and nothing else. Once the browser receives the HTML, it parses it and requests the CSS, JS and other files referenced inside, one after another. The browser performs this entire process on our behalf, which is why the user's direct impression is of only one request.
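To make this "one at a time" model concrete, here is a minimal sketch using Python's standard http.client; the host and paths are stand-ins, not real resources:

```python
# A minimal sketch of HTTP/1.1's request-response cycle on a single
# persistent connection: each request must wait for the previous
# response to be fully read before the next one can be sent.
import http.client

conn = http.client.HTTPConnection("example.com", 80)  # stand-in host

for path in ["/index.html", "/style.css", "/app.js"]:
    conn.request("GET", path)      # send one request...
    resp = conn.getresponse()      # ...and block until its response arrives
    body = resp.read()             # must be fully read before the next request
    print(path, resp.status, len(body), "bytes")

conn.close()
```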

If one file in the request queue has not been fully received, the files behind it cannot be received either. This is HTTP head-of-line blocking, because HTTP/1.1 is strictly one request, one response, in order. HTTP/1.1 does default to persistent connections, meaning the TCP connection is kept open so that each request does not need another round of TCP handshaking, and requests and responses share the same connection. But a single connection is clearly too slow, while opening too many connections can look like a DDoS attack to the server, so each browser caps the number of persistent connections per host; Chrome's default is 6 simultaneous connections. Even with multiple connections the problem remains. Suppose every other connection has received its response and only one file is still outstanding, and that file happens to be a CSS file the browser needs before it can render: the page is blocked at the head of the HTTP queue all the same.
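Here is a sketch of that multi-connection workaround; the host, paths and 6-worker cap mirror the browser behavior described above and are illustrative only:

```python
# A sketch of how a browser works around per-connection serialization:
# open several parallel connections to the same host (Chrome caps this
# at about 6 per host).
import http.client
from concurrent.futures import ThreadPoolExecutor

HOST = "example.com"  # stand-in host
PATHS = ["/a.css", "/b.js", "/c.png", "/d.png", "/e.png", "/f.png"]

def fetch(path):
    conn = http.client.HTTPConnection(HOST)  # one connection per worker
    conn.request("GET", path)
    resp = conn.getresponse()
    data = resp.read()
    conn.close()
    return path, resp.status, len(data)

# at most 6 workers, mirroring the browser's per-host connection limit
with ThreadPoolExecutor(max_workers=6) as pool:
    for path, status, size in pool.map(fetch, PATHS):
        print(path, status, size, "bytes")
```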
To address this, HTTP/1.1 actually includes a technique called pipelining, which lets a single connection send multiple requests without waiting. There is a catch, though: even if several requests go out at once, the responses must be received in the order the requests were sent, first sent, first answered. That makes implementations difficult, and network conditions are hard to predict. The response that should arrive first may well be lost, making the second response the first to arrive, so in practice it is hard to find any browser that actually enabled this pipelining technique.
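A toy sketch of pipelining over a raw socket, assuming a stand-in host; the point is that both requests leave before any response is read, and the responses must come back in request order:

```python
# A toy demonstration of HTTP/1.1 pipelining: two requests are written
# back-to-back before any response is read. Per the spec, the server
# must answer them in the order they were sent -- exactly the ordering
# constraint that made pipelining fragile in practice.
import socket

HOST = "example.com"  # stand-in host
req = (
    "GET / HTTP/1.1\r\nHost: {h}\r\n\r\n"
    "GET /favicon.ico HTTP/1.1\r\nHost: {h}\r\nConnection: close\r\n\r\n"
).format(h=HOST)

with socket.create_connection((HOST, 80)) as s:
    s.sendall(req.encode("ascii"))  # both requests leave at once
    chunks = []
    while True:
        data = s.recv(4096)
        if not data:
            break
        chunks.append(data)

# Responses arrive concatenated, strictly in request order.
print(b"".join(chunks)[:200])
```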
When a problem cannot be solved at the network-protocol level, workarounds proliferate. For example, many websites merge all their small icons into a single image, so that only one file has to be requested instead of many, and then use JS or CSS to slice the icons out and place them around the site. This is called a sprite sheet (or sprite map). It does reduce the number of requests, but it is a real chore for developers. Data URLs are another alternative: an image can be base64-encoded and written into the HTML or CSS as text, so no separate image file is needed, but the resulting file usually becomes extremely long, which makes the code hard to maintain and manage. And since, as mentioned above, the browser's request limit is really a limit on connections per domain, websites started creating multiple domains so the browser could download resources in parallel. As an extreme example, a site with 5 images could set up 5 image domains so the browser downloads all of them at the same time, instead of waiting for earlier resources to finish before the next one's turn. That is domain sharding. It may not sound like much, but it bluntly increases development complexity.
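A minimal sketch of the Data URL trick, assuming a local file named icon.png:

```python
# The Data URL workaround: inline a small image into HTML/CSS as
# base64 text so it costs no extra request. "icon.png" is a stand-in.
import base64

with open("icon.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

# Usable directly as an <img src=...> value or a CSS url(...) --
# note how much longer it is than a plain file path.
data_url = f"data:image/png;base64,{encoded}"
print(data_url[:80], "... total length:", len(data_url))
```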
There were many more ways to cut down requests, such as concatenating JS and CSS files, or inlining both directly into the HTML. But while everyone racked their brains to reduce requests, more and more websites needed the padlock, that is, HTTPS, or Google Chrome would flag the site as not secure. For HTTP/1.1 this undoubtedly made things worse: communication already requires the TCP three-way handshake, and HTTPS adds a TLS 1.2 handshake on top of that before encrypted communication can begin.
The later TLS 1.3 reduced the number of round trips, but for HTTP/1.1 the round-trip burden is still heavy. More round trips mean more unknowns: if a middlebox on the network path misbehaves, packets are easily lost, and the result may still be a blocked page render.

Beyond the fixed cost of the three-way handshake, TCP also has slow start. Because TCP performs congestion control and initially knows nothing about the actual state of the network, it sends only a small number of TCP segments at first and ramps up gradually, which makes freshly visited pages load more slowly. And besides TCP's various overheads, we should not forget that HTTP itself has fixed costs. Every request and response carries assorted headers, most of them identical from one message to the next; cookies are resent every time and can be very long. On top of that, HTTP/1.1 is plain text and its headers are never compressed, leaving most messages cumbersome and bloated, as if you were required to bring a thick stack of identity documents every single day you showed up for work. Hence, later, HTTP/2.
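A back-of-the-envelope sketch of that repeated header overhead; the header values are made up for illustration:

```python
# A rough measurement of HTTP/1.1's fixed header overhead: the same
# uncompressed headers (cookie included) are resent on every request.
headers = (
    "GET /api/items HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64)\r\n"
    "Accept: text/html,application/json\r\n"
    "Cookie: session=abc123def456; theme=dark; tracking=xyz789\r\n"
    "\r\n"
)
per_request = len(headers.encode("ascii"))
print(f"{per_request} header bytes per request")
print(f"{per_request * 100} bytes repeated across 100 requests")
```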

HTTP/2

HTTP/2 is like takeaway 2.0: the shop has hired plenty of dedicated delivery riders. Order breakfast, lunch, dinner and a late-night snack all at once, and the boss is not worried at all, because there are enough riders; no more being held hostage by a single clerk. This corresponds to HTTP/2's headline feature, multiplexing, which mainly solves HTTP/1.1's head-of-line blocking. Some people say multiplexing is so powerful that multiple connections are no longer needed: a single TCP connection can carry interleaved requests and responses that do not affect one another. That is true, but it glosses over an important detail. HTTP/2 does not send a whole file back as one response. Request and response messages are split into frames, chiefly a headers frame and a data frame; in effect, the header and body of the original HTTP message are separated into two parts, so the message no longer looks like a classic HTTP message, but a bit like a frame at the data link layer. We need not study every field at this point; the one that matters most is the "stream identifier".
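For the curious, here is a sketch of the 9-byte HTTP/2 frame header layout from RFC 7540, parsed with Python's struct module; the sample frame bytes are hand-made for illustration:

```python
# The 9-byte HTTP/2 frame header (RFC 7540): 24-bit payload length,
# 8-bit type, 8-bit flags, then a 31-bit stream identifier.
import struct

def parse_frame_header(raw: bytes):
    length_hi, length_lo, ftype, flags, stream_id = struct.unpack(">BHBBI", raw[:9])
    length = (length_hi << 16) | length_lo
    stream_id &= 0x7FFFFFFF  # the top bit is reserved
    return length, ftype, flags, stream_id

# a fake HEADERS frame (type 0x1) of 16 bytes on stream 3
sample = struct.pack(">BHBBI", 0, 16, 0x1, 0x4, 3)
print(parse_frame_header(sample))  # -> (16, 1, 4, 3)
```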
Thanks to this stream identifier, frames can reach the other side out of order: whatever order they arrive in, they can be reassembled correctly per stream using the identifier, and certain frame types even allow setting priorities to mark a stream's weight. Sounds like HTTP/2 multiplexing neatly solves head-of-line blocking, right? Actually, not quite.
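A toy illustration of that reassembly by stream identifier; the frames and payloads are invented:

```python
# Frames from different streams can arrive interleaved and still be
# reassembled per stream, which is the essence of multiplexing.
from collections import defaultdict

# (stream_id, payload) pairs as they might arrive on the wire, interleaved
arrivals = [
    (1, b"<html>"), (3, b"body {"), (1, b"</html>"), (3, b" color: red }"),
]

streams = defaultdict(bytearray)
for stream_id, payload in arrivals:
    streams[stream_id] += payload  # group frames by their stream

for stream_id, data in sorted(streams.items()):
    print(f"stream {stream_id}: {bytes(data)}")
```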
HTTP/2 still has plenty worth praising. In HTTP/1.1 the message body could be compressed, but the headers never were; HTTP/2 finally compresses the headers too, introducing a compression algorithm called HPACK. HPACK requires both the browser and the server to keep a static, read-only table: for example, the classic status line "HTTP/1.1 200 OK" becomes ":status: 200" in HTTP/2. Saving a few bytes may seem like no big deal, but because these headers repeat constantly, a repeated field can be elided almost entirely across subsequent requests and responses. Headers such as cookies can also be added to a dynamic table as dynamic entries, so the savings become considerable. Furthermore, HTTP/2 frames are no longer ASCII-encoded text but binary frames, which are more efficient to parse.

When HTTP/2 launched, another feature touted by many as game-changing was server push. The idea is easy to grasp: when the browser makes a request, instead of responding resource by resource as the browser parses the HTML, the server can proactively send all the files the browser is likely to need. It looks beautiful, but the reality is harsher. The client may simply have clicked the wrong page, and one stray click now earns it a pile of unwanted cache. Server push can also enable asymmetric DDoS attacks, because there is an obvious "leverage" between a small request and a large push, so it carries security concerns of its own.

That said, HTTP/2 is in fact much safer than HTTP/1.1. First, as mentioned above, messages become binary frames, which are far less readable to humans. Second, although the HTTP/2 specification does not mandate TLS encryption, running HTTP/2 without TLS is hard to justify: by the HTTP/2 era, major browsers flagged unencrypted sites as "not secure", and in practice they only support HTTP/2 over TLS, so over time HTTP/2 with TLS became the de facto requirement. Meanwhile, HTTP/2's multiplexing also cut waiting times.

HTTP/2 looks good, so why did HTTP/2, announced in 2015, get followed by HTTP/3 as early as 2019? Keep in mind that HTTP/1.1 was announced in 1997; going from 2 to 3 that quickly suggests a problem. In fact, HTTP/2 only solves head-of-line blocking at the HTTP layer. Remember that below HTTP sits the transport layer, and HTTP runs on TCP. Once HTTP/2 frames are handed down, TCP processes them with no idea which frame content belongs with which: it just transmits its own segments, and if one is lost, it must be retransmitted before anything behind it can be delivered. To exaggerate, even if the lost TCP segment happens to carry nothing but a line of code comments, there is no choice but to keep waiting. This is TCP-layer head-of-line blocking. What to do? The cleaner fix would be to change TCP itself to understand HTTP/2's frames, but TCP is implemented in operating-system kernels; unless most of the world's operating systems get overhauled, who knows when that would land. Hence the HTTP/3 protocol.
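To give the HPACK static-table idea above some flavor, here is a toy sketch; the table indexes are excerpted from RFC 7541, but the encoder is a simplification (real HPACK also has a dynamic table, index bit patterns and Huffman coding), not the real algorithm:

```python
# Both sides share a static table, so a common header collapses to a
# one-byte index instead of being sent as text.
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":path", "/"): 4,
    (":scheme", "https"): 7,
    (":status", "200"): 8,
}

def toy_encode(headers):
    # one byte per header found in the static table (real HPACK sets
    # the high bit for indexed fields; omitted here for simplicity)
    return bytes(STATIC_TABLE[h] for h in headers if h in STATIC_TABLE)

headers = [(":method", "GET"), (":path", "/"), (":scheme", "https")]
text_size = sum(len(k) + len(v) + 4 for k, v in headers)  # rough text form
print(f"as text: ~{text_size} bytes, toy-encoded: {len(toy_encode(headers))} bytes")
```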

HTTP/3

HTTP/3 is takeaway 3.0. Ordering on an app is effortless: no phone call needed, browsing, ordering and payment all integrated in one place, and the roads now run in every direction, so the rider is never stuck wherever he goes. This corresponds to a core idea of HTTP/3: integration. Since it is impractical to quickly upgrade TCP everywhere, HTTP/3 merges the transport and TLS handshake processes into one, directly cutting the round-trip overhead; for a resumed session it can even achieve 0-RTT, with no handshake at all.
But wait: TCP and TLS are two separate protocols, and you cannot merge them just by saying so. In practice, for HTTP/3 to be deployable at all, the only option was to use UDP at the transport layer and build a new protocol on top of it: QUIC. QUIC integrates the roles of TCP and TLS, so HTTP/3 uses encrypted transport by default. Loosely, QUIC could be called "TCP/2.0", but it should not be called "UDP/2.0".
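As a hedged sketch of what opening a QUIC connection looks like from Python, assuming the third-party aioquic library (API per its documented examples; details may vary between versions, and the host is a stand-in that would need to actually serve QUIC on port 443):

```python
# Opening a QUIC connection with aioquic. ALPN "h3" asks for HTTP/3
# on top of QUIC; transport setup and encryption happen together.
import asyncio
from aioquic.asyncio import connect
from aioquic.quic.configuration import QuicConfiguration

async def main():
    config = QuicConfiguration(is_client=True, alpn_protocols=["h3"])
    # one combined handshake sets up transport, encryption and ALPN
    async with connect("example.com", 443, configuration=config) as client:
        await client.ping()  # simple liveness round trip over QUIC

asyncio.run(main())
```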
Many people assume that TCP is slow and UDP is fast, so QUIC must be fast because it runs on UDP. That reasoning does not hold up. QUIC uses UDP only because it needs to be widely deployable; mechanically, it borrows most of its machinery from TCP. What QUIC must solve is TCP's head-of-line blocking. Data coming down from the application side is encapsulated into QUIC frames, which look a lot like HTTP/2 frames and likewise carry a "stream identifier". Unlike HTTP/2, though, the application layer of HTTP/3 no longer handles this framing itself: moving the data frames into QUIC effectively puts the framing at the transport layer, which removes head-of-line blocking at the source and realizes true multiplexing.

QUIC frames are in turn encapsulated into QUIC packets, which add further information, the most important being the Connection ID. If the network changes completely, say a sudden switch from Wi-Fi to a 4G network, the IP address changes, but because the client and server have already negotiated a connection ID, that ID still identifies the same connection and another handshake is avoided. This is one of the reasons QUIC is fast. The QUIC packet also encrypts the frames inside it: after the TLS handshake, the contents of the QUIC frames travel encrypted. Finally, the QUIC packet is wrapped into a datagram by UDP, which adds the port numbers. When we choose to communicate over HTTP/3, QUIC opens a connection much as TCP would, and QUIC packets are sent and received inside that connection channel.
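A toy sketch (decidedly not the real QUIC wire format) of why the connection ID enables this connection migration:

```python
# The server keys connection state on an ID carried in each datagram,
# not on the sender's IP/port, so the "connection" survives an
# address change.
import os, socket

conn_id = os.urandom(8)  # negotiated once per connection

def toy_packet(conn_id: bytes, frame: bytes) -> bytes:
    return conn_id + frame  # real QUIC also encrypts the frames

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# even if our source IP changed (Wi-Fi -> 4G), the server would still
# match this packet to the same connection via conn_id
sock.sendto(toy_packet(conn_id, b"GET /"), ("127.0.0.1", 4433))
sock.close()
```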
