[Computer Network] HTTP protocol and its version characteristics


1. HTTP protocol

        The HTTP protocol (Hypertext Transfer Protocol) is a simple request-response protocol that usually runs on top of TCP and is one of the most widely used application protocols on the Internet. It defines how a browser requests a World Wide Web document from a web server and how the server sends the document back to the browser. In terms of layering, HTTP is a transaction-oriented application layer protocol and the basis for reliably exchanging files on the World Wide Web.

2. Common HTTP protocol versions

        The four most common versions of the HTTP protocol are HTTP1.0, HTTP1.1, HTTP2, and HTTP3, and they differ considerably from one another.

2.1 HTTP1.0

Features:

  • Every time the browser sends a request, it establishes a TCP connection with the server, and that connection is closed when the request ends. ( connectionless: one request per connection )
  • The server does not keep track of clients or remember past requests. ( stateless )
  • HTTP0.9 supported only the GET method. HTTP1.0 added two request methods, HEAD and POST.
  • Added response status codes to mark possible error causes.
  • The concept of a protocol version number is introduced.
  • The concept of HTTP headers is introduced to make the handling of requests and responses more flexible.
  • Transferred data is no longer limited to text; files such as images and music can also be transferred.

        The main disadvantage of HTTP 1.0 is the same as that of HTTP 0.9: every request requires a new TCP connection, which is closed as soon as the request ends. Frequently creating and tearing down TCP connections increases server overhead. In addition, a new TCP connection sends data slowly at first, because it has to go through the slow start and congestion avoidance phases.
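
The HTTP1.0 additions listed above (version number, headers, status codes) are all visible in the text of the messages themselves. The following is a minimal Python sketch of that message format, not a real client. The helper names build_request and parse_status_line and the sample path and header values are assumptions made up for illustration.

```python
def build_request(method: str, path: str, headers: dict[str, str]) -> bytes:
    """Serialize an HTTP/1.0 request: request line, headers, then a blank line."""
    lines = [f"{method} {path} HTTP/1.0"]  # the version number travels in the request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(raw: bytes) -> tuple[str, int, str]:
    """Split the first line of a response into (version, status code, reason)."""
    first_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


# Hypothetical sample data, used only to exercise the two helpers.
request = build_request("HEAD", "/index.html", {"User-Agent": "demo"})
response = b"HTTP/1.0 404 Not Found\r\nContent-Type: text/html\r\n\r\n"
status = parse_status_line(response)
```

Here status holds the protocol version, the numeric status code, and the reason phrase, which is how a client learns the possible cause of an error.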

2.2 HTTP1.1

Features:

  • HTTP1.0 uses Connection: close by default, while HTTP1.1 uses Connection: keep-alive by default, which avoids the overhead of repeatedly establishing and releasing connections. A TCP connection is not closed by default and can be reused by multiple requests; it is closed only after an idle timeout expires or when either side sends Connection: close. ( long connection )
  • A pipelining mechanism is introduced, so a client can send multiple requests on one TCP connection without waiting for each response.
  • Multiple long-lived connections may be opened to the same domain name (alleviating head-of-line blocking on a single long-lived connection).
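  • More cache-related header fields were added (such as If-None-Match, alongside If-Modified-Since, which already existed in HTTP1.0).
  • The Range field is introduced in the request header, which supports partial requests and therefore resumable transfers.
  • The Host header is mandatory, which makes virtual hosting (several sites sharing one IP address) possible.
  • New methods such as PUT, DELETE, and OPTIONS were standardized. (PATCH was added later by a separate specification, RFC 5789.)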
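
Two of the features above, the keep-alive default and the Range header, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: is_persistent and slice_for_range are hypothetical helper names, and the range helper handles only a single "bytes=start-" or "bytes=start-end" range (no suffix ranges, no multiple ranges, no error responses).

```python
def is_persistent(version: str, headers: dict[str, str]) -> bool:
    """Return True if the connection should stay open after this exchange."""
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":  # header names are case-insensitive
            connection = value.lower()
    tokens = {t.strip() for t in connection.split(",") if t.strip()}
    if "close" in tokens:
        return False
    if version == "HTTP/1.1":
        return True  # HTTP/1.1 keeps the connection unless "close" is sent
    return "keep-alive" in tokens  # HTTP/1.0 has to opt in explicitly


def slice_for_range(body: bytes, range_value: str) -> bytes:
    """Return the bytes selected by a single byte range (the end is inclusive)."""
    unit, _, spec = range_value.partition("=")
    if unit != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    start_text, _, end_text = spec.partition("-")
    start = int(start_text)
    end = int(end_text) if end_text else len(body) - 1
    return body[start:end + 1]


# A client that already holds the first 4 bytes asks for the rest.
remaining = slice_for_range(b"0123456789", "bytes=4-")
```

A client resuming an interrupted download would send the number of bytes it already has as the start of the range and append the returned bytes to its partial file.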

        The HTTP pipelining mechanism means that, on one TCP connection, the client can issue the next request without waiting for the response to the previous one. However, the server must return the responses in the order in which it received the requests.
        Although HTTP1.1 greatly reduces the number of TCP connections that have to be created by using long connections, a slow request still blocks the ones queued behind it, because later responses cannot be returned until the earlier one is complete. This is head-of-line blocking.
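
The effect of the in-order rule can be shown with a small Python calculation. This is a simplified model, assuming that transmission time is negligible and that ready_times (a made-up input) gives the moment at which the server has finished preparing each response.

```python
from itertools import accumulate


def pipelined_delivery_times(ready_times: list[float]) -> list[float]:
    """Earliest time each response can be sent when responses must keep request order.

    A response cannot go out before every earlier response has gone out,
    so its delivery time is the running maximum of the ready times.
    """
    return list(accumulate(ready_times, max))


# The first request is slow (5 s); the next two are ready after 1 s and 2 s.
ready = [5.0, 1.0, 2.0]
delivered = pipelined_delivery_times(ready)
```

In this example the two fast responses are held back until the slow first response is done, even though they were ready much earlier.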

2.3 HTTP2

Features:

  • Binary protocol : In HTTP1.1, the header information is text, and the data part can be text or binary. In HTTP2, both the header and the data part are binary, and both are carried in units called frames.
  • Multiplexing : The pipeline of HTTP1.1 is discarded. On the same TCP connection, the client and the server can send multiple requests and multiple responses at the same time, and the responses do not need to follow the request order, which avoids head-of-line blocking at the HTTP level.
  • Header information compression : A dedicated algorithm (HPACK) compresses the headers to reduce the amount of data transmitted. The server and the client jointly maintain a header information table; every header field gets a record in the table with an index number, so that only the index number needs to be sent later.
  • Server push : Allows the server to proactively push resources to the client before the client requests them.
  • Data flow (stream) : All frames belonging to one request and its response form a data flow (stream), and each stream has a unique ID. Streams initiated by the client have odd IDs, and streams initiated by the server (for example, for server push) have even IDs. Every frame carries the ID of its stream, so that the server and client can tell which stream it belongs to.

        While HTTP2 solves many problems, a similar head-of-line blocking problem still exists at the level of TCP, on which HTTP2 still runs. When a TCP packet is lost in transit, the receiver cannot deliver the data that arrives after it to the application until the sender retransmits the lost packet. Since TCP knows nothing about higher-level protocols such as HTTP, a single lost packet blocks all HTTP requests in progress on that connection until the lost data is retransmitted.
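
The frames and stream IDs described above can be made concrete with a short Python sketch of the 9-byte HTTP2 frame header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream ID). It covers only the framing layout. A real connection also needs the connection preface, SETTINGS frames, and compressed HEADERS frames, which are left out here. The function names are illustrative assumptions.

```python
FRAME_HEADER_LEN = 9
TYPE_DATA = 0x0  # frame type code for DATA frames


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Build one frame: 3-byte length, 1-byte type, 1-byte flags, 4-byte stream ID."""
    header = len(payload).to_bytes(3, "big")
    header += bytes([frame_type, flags])
    header += (stream_id & 0x7FFFFFFF).to_bytes(4, "big")  # top bit is reserved
    return header + payload


def unpack_frames(data: bytes):
    """Yield (type, flags, stream_id, payload) for each complete frame in data."""
    pos = 0
    while pos + FRAME_HEADER_LEN <= len(data):
        length = int.from_bytes(data[pos:pos + 3], "big")
        frame_type = data[pos + 3]
        flags = data[pos + 4]
        stream_id = int.from_bytes(data[pos + 5:pos + 9], "big") & 0x7FFFFFFF
        start = pos + FRAME_HEADER_LEN
        yield frame_type, flags, stream_id, data[start:start + length]
        pos = start + length


def demultiplex(data: bytes) -> dict[int, bytes]:
    """Reassemble the DATA payload of every stream from interleaved frames."""
    streams: dict[int, bytes] = {}
    for frame_type, _flags, stream_id, payload in unpack_frames(data):
        if frame_type == TYPE_DATA:
            streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


# Frames of two client-initiated streams (odd IDs 1 and 3) interleaved on one connection.
wire = (
    pack_frame(TYPE_DATA, 0, 1, b"he")
    + pack_frame(TYPE_DATA, 0, 3, b"wor")
    + pack_frame(TYPE_DATA, 0, 1, b"llo")
    + pack_frame(TYPE_DATA, 0, 3, b"ld")
)
streams = demultiplex(wire)
```

Because every frame names its stream, the receiver can sort interleaved frames back into separate messages, which is what makes multiplexing on one connection possible.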

2.4 HTTP3

        The purpose of HTTP3 is to solve the transport problem of HTTP2 and to provide fast, reliable, and secure web connections on all kinds of devices. To do this, it uses a different transport-layer protocol called QUIC .

Features:

  • The bottom layer of HTTP3 is implemented on UDP . UDP itself needs neither TCP's three-way handshake to connect nor its four-way handshake to disconnect, and QUIC combines its own transport and encryption handshake, so connections are typically established faster than with TCP plus TLS.
  • A QUIC connection is no longer identified by a four-tuple (source IP, source port, destination IP, destination port) but by a connection ID (a 64-bit random number in early versions of QUIC). Because UDP is connectionless, when the IP address or port changes, there is no need to re-establish the connection as long as the ID stays the same.
  • QUIC's flow control also tells the peer how many bytes it is willing to accept, in the manner of a window_update. QUIC's windows are adapted to its multiplexing mechanism: there is a window for the connection as a whole and a window for each stream within the connection.
  • Integrated TLS encryption. QUIC does not run on top of TLS but contains TLS internally, and it uses its own frames in place of TLS records. Handshake messages and alert messages do not use TLS records; they are carried directly in QUIC frames, which saves one layer of overhead.
  • HTTP3 is not tied to a fixed port: a server can advertise where its HTTP/3 endpoint is (for example, through the Alt-Svc header), so the service does not have to run on UDP 80 or 443.
  • To achieve reliability, QUIC also has a packet number, which always increases. Each packet number is used only once; even when a lost packet is retransmitted, the retransmission gets a new, higher number. To keep the data in order, QUIC also defines an offset. Like TCP, QUIC is connection-oriented and carries a byte stream, and each piece of data sent has an offset within that stream. The offset shows where the data belongs, so if the data for a given offset does not arrive it is resent, and once it arrives the pieces can be spliced back into a stream according to their offsets.
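
The interplay of packet numbers and offsets in the last point can be sketched in Python. This is a simplified model, not QUIC's wire format: each arrival is a made-up tuple of (packet number, stream offset, data), and a retransmission is assumed to carry the same chunk at the same offset under a new packet number.

```python
def reassemble(arrivals: list[tuple[int, int, bytes]]) -> bytes:
    """Return the contiguous prefix of the stream that can be delivered so far.

    The packet number only identifies the packet; ordering relies on the
    offset alone, so retransmitted data with a new packet number fits in.
    """
    received: dict[int, bytes] = {}
    for _packet_number, offset, data in arrivals:
        if data:  # ignore empty chunks
            received.setdefault(offset, data)  # duplicates of an offset are dropped
    stream = bytearray()
    while len(stream) in received:  # the next needed offset equals the bytes delivered
        stream += received[len(stream)]
    return bytes(stream)


# Packet 2 (offset 4) was lost; its data arrives again later in packet 4.
arrivals = [
    (1, 0, b"GET "),
    (3, 9, b" HTTP/3"),
    (4, 4, b"/page"),
]
before_retransmit = reassemble(arrivals[:2])
after_retransmit = reassemble(arrivals)
```

Before the retransmission arrives, only the bytes ahead of the gap can be delivered; once the missing offset is filled, the chunk that arrived early is spliced in as well.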

3. Comparison of HTTP versions

Protocol version | Core problem to be solved                           | Solution
0.9              | HTML file transfer                                  | Established the communication process of client request and server response
1.0              | Transfer of different file types                    | Introduced header fields
1.1              | Creating/disconnecting TCP connections is expensive | Long connections that are reused
2                | Concurrency is limited                              | Binary framing
3                | TCP packet loss blocking                            | Using the UDP protocol

4. The difference between HTTP and HTTPS

        Since the HTTP protocol uses clear text transmission, there are the following three risks in terms of security:

  • Risk of eavesdropping : for example, the communication content can be captured on the communication link, so user account information is easily stolen.
  • Risk of tampering : for example, spam advertisements can be forcibly inserted into pages, polluting what the user sees.
  • Risk of impersonation : for example, a site pretending to be the Taobao website.
[Figure: the HTTP protocol stack (left) and the HTTPS protocol stack (right)]

        As shown in the figure above (the HTTP protocol is on the left, and the HTTPS protocol is on the right), to solve the security problems of HTTP, HTTPS adds the SSL/TLS protocol between the HTTP and TCP layers, which addresses the risks above.

        How does HTTPS solve the above three risks?

  • HTTPS uses hybrid encryption , combining symmetric and asymmetric encryption to address the risk of eavesdropping. Asymmetric encryption is used only to exchange the session key before communication is established. After that, all data is encrypted with the symmetric session key.
  • A digest algorithm is used to achieve integrity : it generates a fingerprint of the data that is used to verify that the data has not been modified, which addresses the risk of tampering.
  • The server's public key is placed in a digital certificate , which vouches for the server's identity and addresses the risk of impersonation.
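
The fingerprint idea from the second point can be illustrated with Python's standard hashlib and hmac modules. This is only a sketch of the principle, not how TLS is implemented: real TLS protects integrity with a keyed construction tied to the session key, which is why the example also shows a keyed tag. The session key and messages below are made-up values.

```python
import hashlib
import hmac


def fingerprint(message: bytes) -> str:
    """Plain SHA-256 digest: any change to the message changes the digest."""
    return hashlib.sha256(message).hexdigest()


def make_tag(session_key: bytes, message: bytes) -> bytes:
    """Keyed digest (HMAC): only a holder of the session key can produce it."""
    return hmac.new(session_key, message, hashlib.sha256).digest()


def verify(session_key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(session_key, message), received_tag)


session_key = b"hypothetical-session-key"  # in HTTPS this comes from the handshake
original = b"transfer 100 to alice"
tampered = b"transfer 900 to mallory"

sent_tag = make_tag(session_key, original)
ok = verify(session_key, original, sent_tag)
rejected = not verify(session_key, tampered, sent_tag)
```

A plain digest alone would not be enough against an attacker who can replace both the data and its digest, so the keyed variant is what lets the receiver detect tampering.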

Differences:

  • The https protocol requires a certificate issued by a CA. Free certificates are limited, so a fee may be required.
  • http is the hypertext transfer protocol and transmits information in plain text, while https is a secure transfer protocol encrypted with SSL/TLS.
  • http and https use different connection methods and different default ports: 80 for the former and 443 for the latter.
  • An http connection is simple and stateless; https is built from SSL/TLS + HTTP and supports encrypted transmission and identity authentication, which makes it more secure than http.
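
The default ports mentioned above are also exposed as constants in Python's standard library, which allows a small sketch of how a client might pick the port for a URL. The function name effective_port and the example.com URLs are assumptions for illustration.

```python
from http.client import HTTP_PORT, HTTPS_PORT
from urllib.parse import urlsplit

# Default port for each scheme, taken from the standard library constants.
DEFAULT_PORTS = {"http": HTTP_PORT, "https": HTTPS_PORT}


def effective_port(url: str) -> int:
    """Return the explicit port of the URL, or the default for its scheme."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


plain_port = effective_port("http://example.com/")
secure_port = effective_port("https://example.com/login")
custom_port = effective_port("https://example.com:8443/")
```

An explicit port in the URL always wins; the scheme only decides the port when none is given.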

Origin blog.csdn.net/m0_73845616/article/details/127351636