HTTP - Differences between HTTP/1.0, HTTP/1.1, and HTTP/2.0

1. Basic optimization of HTTP

There are two main factors that affect an HTTP network request: bandwidth and latency.

Bandwidth

In the dial-up era, bandwidth could seriously limit requests, but network infrastructure has improved so much that bandwidth is rarely the bottleneck anymore. That leaves latency.

Latency

  • Browser blocking (HOL blocking): the browser blocks requests for various reasons. A browser allows only a handful of parallel connections to the same domain (typically 6 in modern browsers, though this varies); once the limit is reached, subsequent requests are blocked.
  • DNS lookup: the browser needs the target server's IP to establish a connection, and the system that resolves domain names to IPs is DNS. Caching DNS results usually reduces this time.
  • Establishing a connection (initial connection): HTTP runs on top of TCP. At the earliest, the browser can piggyback the HTTP request message on the third packet of the handshake. Because these connections are not reused, every request pays for a three-way handshake and TCP slow start; the handshake hurts most in high-latency scenarios, and slow start hurts most for large file transfers.

2. Differences between HTTP/1.0 and HTTP/1.1

2.1. Response status codes

HTTP/1.0 only defines 16 status codes.

HTTP/1.1 added a large number of status codes, including 24 new error response status codes.

For example:

  • 100 (Continue): a preliminary response used to "warm up" before requesting a large resource
  • 206 (Partial Content): the status code for range requests
  • 409 (Conflict): the request conflicts with the current state of the target resource
  • 410 (Gone): the requested resource has been permanently removed from the server

2.2. Cache handling

Caching saves a great deal of network bandwidth and reduces the latency users experience by avoiding frequent round trips between clients and origin servers.

2.2.1. HTTP/1.0

The caching mechanism in HTTP/1.0 is very simple. The server marks a response with an Expires header (a time); until that time, requests for the resource are served from the cached response. The first response the server returns for a resource also carries a Last-Modified header, recording when the requested resource was last modified on the server. On later requests, the client sends an If-Modified-Since header carrying a time, in effect asking the server: "has the resource I want been modified since this time?" The If-Modified-Since value is usually the Last-Modified value from the response in which the resource was last obtained.

If the server receives the request and determines that the resource has not been modified since the If-Modified-Since time, it returns a 304 Not Modified response with no body, meaning "your cached copy is still valid; use it."

If the server determines that the resource has been modified since the If-Modified-Since time, it returns a 200 OK response carrying the new resource content, meaning "what you want has changed; here is the new version."
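
For illustration, a minimal sketch of this conditional exchange (the file name and dates are placeholders):

GET /logo.png HTTP/1.0
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT

HTTP/1.0 304 Not Modified
Expires: Sun, 30 Oct 1994 19:43:31 GMT

The empty 304 response tells the client to reuse its cached copy of /logo.png until the new Expires time.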

2.2.2. HTTP/1.1

HTTP/1.1's caching mechanism builds on HTTP/1.0, greatly increasing flexibility and extensibility. The basic principle is unchanged, but more fine-grained features were added. The most common of these is the Cache-Control header; see the MDN Web Docs entry on Cache-Control for details.
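
As a sketch of what this added flexibility looks like, here are some commonly used HTTP/1.1 cache headers (the values are illustrative):

Cache-Control: max-age=3600, public
ETag: "33a64df5"
If-None-Match: "33a64df5"

Here max-age bounds freshness in seconds, the server's ETag gives the resource a version identifier, and the client echoes it back in If-None-Match so the server can answer 304 Not Modified whenever the version is unchanged.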

2.3. Connection method

HTTP/1.0 uses short-lived connections by default: each time the client and server perform an HTTP operation, a connection is established and then torn down when the task completes. When an HTML page (or other document) accessed by the browser references other web resources (JavaScript files, images, CSS files, and so on), the browser creates a new TCP connection for each one, so a large number of handshake and teardown packets end up occupying the bandwidth.

To solve this waste of resources in HTTP/1.0, HTTP/1.1 makes persistent connections the default. A request over a persistent connection in effect tells the server: "I am opening a connection to you; once it is established, please do not close it." The TCP connection is then kept open to serve subsequent exchanges between client and server. In other words, with persistent connections, the TCP connection used to carry HTTP data is not closed after a page is loaded, and later requests to the same server reuse the established connection.

Keeping a TCP connection open indefinitely also wastes resources, so server software such as Apache supports a timeout: if no new request arrives within the timeout period, the TCP connection is closed.

Note that HTTP/1.0 already offered persistent connections as an option, enabled by adding Connection: Keep-Alive to the request header. Conversely, in HTTP/1.1 a client that does not want a persistent connection can add Connection: close to the request header, telling the server: "I do not need a persistent connection; close it once this exchange completes."

HTTP persistent and short-lived connections are, in essence, TCP persistent and short-lived connections.

To implement a persistent connection, both the client and the server need to support persistent connections.
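
For illustration, the headers involved in each version (request lines are examples):

GET /index.html HTTP/1.0
Connection: Keep-Alive

GET /index.html HTTP/1.1
Host: example1.org
Connection: close

The first request opts in to a persistent connection under HTTP/1.0; the second opts out of the HTTP/1.1 default.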

2.4. Host header handling

In HTTP/1.0, each server was assumed to be bound to a unique IP address, so the URL in the request message did not carry a hostname. But with the development of virtual hosting, multiple virtual hosts (multi-homed web servers) can share one physical server and one IP address. HTTP/1.1 requires both request and response messages to support the Host header field, and a request message without a Host header field is rejected with an error (400 Bad Request).

Suppose we have a resource at http://example1.org/home.html. In HTTP/1.0, the request would be GET /home.html HTTP/1.0, with no hostname attached. When such a message reaches a server hosting several sites, the server cannot tell which site the client actually wants.

HTTP/1.1 therefore added the Host field to the request header. With the Host field, the request becomes:

GET /home.html HTTP/1.1
Host: example1.org

In this way, the server can determine exactly which site the client is requesting.

2.5. Bandwidth optimization

2.5.1. Range request

HTTP/1.1 introduced a range request mechanism to avoid wasting bandwidth. When a client wants only part of a file, or needs to resume a download that was interrupted, it can add a Range header to the request to ask for only the specified bytes. The server may ignore the Range header, or it may return one or several range responses.

A response that contains partial data carries the 206 (Partial Content) status code. The point of this status code is to keep HTTP/1.0 proxy caches from mistaking the response for complete data and caching it as the full answer to the request.

In a range response, the Content-Range header marks the position of the returned block within the resource and the resource's total length.
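
A sketch of the exchange (file name and sizes are illustrative):

GET /big-file.zip HTTP/1.1
Host: example1.org
Range: bytes=0-1023

HTTP/1.1 206 Partial Content
Content-Range: bytes 0-1023/146515
Content-Length: 1024

The client asks for the first 1024 bytes; Content-Range reports the returned range and the file's total size, so a later request can resume with Range: bytes=1024-.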

2.5.2. Status code 100

HTTP/1.1 added the 100 (Continue) status code. Its use case is large requests that the server may be unwilling to handle: the client can first ask whether the server will respond normally, and send the large body only after the server answers 100 (Continue). The process is sketched below.

HTTP/1.0 has no 100 (Continue) status code. In HTTP/1.1, the mechanism is triggered by sending an Expect header containing the value 100-continue.
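
For illustration (the path and size are placeholders):

POST /upload HTTP/1.1
Host: example1.org
Content-Length: 104857600
Expect: 100-continue

HTTP/1.1 100 Continue

Only after receiving 100 Continue does the client transmit the 100 MB body; if the server instead answers with a final status such as 417 Expectation Failed or 413 Payload Too Large, the client is spared the upload.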

2.5.3. Compression

Data in many formats is compressed before transmission, and compression can greatly improve bandwidth utilization. HTTP/1.0, however, provided few options for it: it did not allow the parties to negotiate compression details, and it could not distinguish end-to-end compression from hop-by-hop compression.

HTTP/1.1 makes a distinction between content-codings and transfer-codings. Content encoding is always end-to-end, and transfer encoding is always hop-by-hop.

HTTP/1.0 already had the Content-Encoding header, which applies end-to-end encoding to the message.

HTTP/1.1 added the Transfer-Encoding header, which applies hop-by-hop transfer encoding to the message.

HTTP/1.1 also added the Accept-Encoding header, with which the client indicates which content encodings it can handle.
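
For illustration, a typical negotiation (header values are examples):

GET /style.css HTTP/1.1
Host: example1.org
Accept-Encoding: gzip, deflate

HTTP/1.1 200 OK
Content-Encoding: gzip
Transfer-Encoding: chunked

The gzip content coding survives end-to-end to the client, while the chunked transfer coding may be applied and removed independently on each hop.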

2.6. Summary

  1. Connection method: HTTP/1.0 uses short-lived connections by default; HTTP/1.1 supports persistent connections.

  2. Response status codes: HTTP/1.1 added a large number of status codes, including 24 new error response status codes. For example: 100 (Continue), a warm-up response before requesting a large resource; 206 (Partial Content), the status code for range requests; 409 (Conflict), the request conflicts with the current state of the resource; 410 (Gone), the resource has been permanently removed from the server.

  3. Cache handling: HTTP/1.0 mainly used If-Modified-Since and Expires as the criteria for cache decisions. HTTP/1.1 introduced more cache-control strategies, such as entity tags with If-Match and If-None-Match, plus If-Unmodified-Since and more optional cache headers (such as Cache-Control) to govern caching.

  4. Bandwidth optimization and use of network connections: HTTP/1.0 wasted bandwidth in some situations; for example, when a client needed only part of an object, the server still sent the entire object, and resuming interrupted downloads was not supported. HTTP/1.1 introduced the Range header in requests, which allows only part of a resource to be requested (the response carries status 206 Partial Content), making it convenient for developers to use bandwidth and connections fully.

  5. Host header handling: HTTP/1.1 added the Host field to the request header.

3. SPDY: optimizing HTTP/1.x

In 2009, Google proposed SPDY, which optimized the request latency of HTTP/1.x and addressed its security, as follows:

3.1. Reduce latency

To address HTTP's high latency, SPDY elegantly adopts multiplexing. Multiplexing has multiple request streams share a single TCP connection, which resolves HOL blocking, reduces latency, and improves bandwidth utilization.

3.2. Request prioritization

Multiplexing brings a new problem: with connection sharing, critical requests may be blocked. SPDY allows a priority to be set for each request, so that important requests are answered first. For example, when the browser loads a home page, the page's HTML content should be displayed first, and the various static resources and script files loaded afterwards, so that the user sees the page content as soon as possible.

3.3. Header compression

As mentioned earlier, HTTP/1.x headers are often redundant; choosing an appropriate compression algorithm reduces both the size and the number of packets.

3.4. Encrypted transmission based on HTTPS

This greatly improves the security and reliability of data transmission.

3.5. Server push

On a page served over SPDY, the server can push resources. For example, if my page requests style.css, then as the client receives the style.css data, the server pushes style.js to it as well; when the client later needs style.js, it reads it straight from cache without sending a request.

In the protocol stack, SPDY sits below HTTP and above TCP and SSL, which makes it easy to remain compatible with older versions of HTTP (the content of HTTP/1.x is encapsulated into a new frame format) while reusing existing SSL functionality.

4. HTTP/2.0: an upgraded version of SPDY

HTTP/2.0 can be regarded as an upgraded version of SPDY (it was in fact originally designed on the basis of SPDY), but the two still differ:

  1. HTTP/2.0 supports plaintext HTTP transmission, while SPDY mandates the use of HTTPS
  2. HTTP/2.0 compresses message headers with HPACK rather than the DEFLATE algorithm used by SPDY

5. New features of HTTP/2.0 compared with HTTP/1.x

  • New binary format (Binary Format): HTTP/1.x parsing is text-based, and parsing a text-based format has inherent drawbacks: text can be represented in many ways, so robustness requires handling a large number of scenarios. Binary is different; only combinations of 0 and 1 are recognized. For this reason, HTTP/2.0 adopted a binary format, which is both convenient to implement and robust.
  • Multiplexing (MultiPlexing): that is, connection sharing. Each request is assigned an id, so one connection can carry multiple requests, the frames of different requests can be interleaved arbitrarily, and the receiver uses the request id to reassemble each stream and hand it to the right request.
  • Header compression: as mentioned above, HTTP/1.x headers carry a lot of information and must be sent repeatedly with every request. HTTP/2.0 uses an encoder to reduce the size of the headers that need to be transmitted, and each side of the connection caches a table of header fields, which avoids retransmitting repeated headers and shrinks what remains to be sent.
  • Server push (server push): like SPDY, HTTP/2.0 has a server push capability. (A quick way to check which protocol version a server actually negotiates is sketched after this list.)
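
As a quick illustration, the third-party Python library httpx can show which protocol version a server negotiates; the URL below is a placeholder:

import httpx  # third-party library: pip install 'httpx[http2]'

# Request HTTP/2; httpx falls back to HTTP/1.1 if the server does not
# negotiate h2 during the TLS handshake (ALPN).
with httpx.Client(http2=True) as client:
    resp = client.get("https://example.org/")  # placeholder URL
    print(resp.http_version)                   # "HTTP/2" or "HTTP/1.1"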

6. Upgrading to HTTP/2.0

  • As mentioned earlier, HTTP/2.0 can in fact run without HTTPS, but mainstream browsers such as Chrome and Firefox still only support HTTP/2.0 deployed over TLS, so if you want to upgrade to HTTP/2.0, it is best to upgrade to HTTPS first.
  • Once your website runs on HTTPS, upgrading to HTTP/2.0 is much simpler. If you use NGINX, you only need to enable the protocol in the configuration file; see the NGINX white paper and the official NGINX guide to configuring HTTP/2.0. A minimal configuration sketch follows this list.
  • If you adopt HTTP/2.0, what happens to existing HTTP/1.x clients? There is no need to worry: HTTP/2.0 is fully compatible with the semantics of HTTP/1.x, and for browsers that do not support HTTP/2.0, NGINX automatically falls back to HTTP/1.x.
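
A minimal NGINX sketch (server name and certificate paths are placeholders; note that recent NGINX releases replace the http2 flag on listen with a separate http2 on; directive):

server {
    listen 443 ssl http2;                        # enable TLS and HTTP/2 on this port
    server_name example1.org;                    # placeholder host
    ssl_certificate     /path/to/fullchain.pem;  # placeholder path
    ssl_certificate_key /path/to/privkey.pem;    # placeholder path
}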

7. Notes

7.1. What is the difference between HTTP/2.0 multiplexing and the reuse of persistent connections in HTTP/1.x?

  • HTTP/1.0: one request-response per connection; establish a connection, use it, close it when done. Every request must set up its own connection.
  • HTTP/1.1 pipelining: several requests are queued and processed serially; a later request must wait for the earlier ones to return before it gets its chance to execute. Once one request times out, everything queued behind it is stuck; this is the often-mentioned head-of-line blocking.
  • HTTP/2: multiple requests execute in parallel on one connection at the same time; one request taking a long time does not affect the normal execution of the others.

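Schematically (an illustrative sketch standing in for the original figure; r1 to r3 are hypothetical requests on one connection):

HTTP/1.1 pipelining:  send r1, r2, r3 -> responses must return in order: resp1, resp2, resp3
                      (if resp1 stalls, resp2 and resp3 are stuck behind it)

HTTP/2 multiplexing:  send r1, r2, r3 -> response frames interleave and complete as ready
                      (a slow r1 does not hold up resp2 or resp3)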

7.2. What exactly is server push?

With server push, the server can send the resources a client will need along with index.html, sparing the client from requesting them one by one. Because no request initiation or connection establishment is involved, pushing static resources from the server can speed them up considerably. In outline:

7.2.1. Ordinary client request process

The client requests index.html, parses it, and then issues a separate request for each referenced resource (such as style.css and style.js).

7.2.2. The process with server push

Along with index.html, the server proactively pushes style.css and style.js; by the time the client has parsed the page, those resources are already in its local cache.

7.3. Why is header compression needed?

Suppose a page loads 100 resources (quite conservative for today's Web) and each request carries a 1 KB message header (also not unusual, given cookies, referrers, and the like); then at least an extra 100 KB is spent just obtaining those headers. HTTP/2.0 can maintain a dictionary on both sides and send HTTP headers as small incremental updates, greatly reducing the traffic spent on them. For details, see: Introduction to HTTP/2 header compression technology.
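
A conceptual sketch of that dictionary at work (the header names are real HTTP/2 pseudo-headers, but the values are illustrative, and real HPACK encoding is binary):

Request 1 sends its headers in full; both endpoints add them to a shared dynamic table:

:method: GET
:path: /a.css
user-agent: Mozilla/5.0 (...long string...)
cookie: session=...long string...

Request 2 is then encoded almost entirely as indexes into that table; only the header that changed is sent literally:

:path: /b.js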

7.4. How much does HTTP/2.0 multiplexing help?

The key to HTTP performance optimization is not high bandwidth but low latency. A TCP connection "tunes" itself over time: it initially limits the connection's maximum speed and ramps up as data is transmitted successfully. This tuning is called TCP slow start, and it makes HTTP connections, which are inherently bursty and short-lived, very inefficient.

HTTP/2 lets all data streams share the same connection, using TCP far more efficiently, so that high bandwidth can genuinely translate into HTTP performance gains.

8. References

  • What are the major improvements of HTTP/2.0 compared to 1.0?
  • In-depth research: What is the real performance of HTTP/2?
  • Introduction to HTTP/2 header compression technology
