"HTTP Authoritative Guide" Reading Notes: Chapter 4-Connection Management

1. TCP connection

  • A TCP connection is identified by a 4-tuple: source IP, source port, destination IP, destination port
  • Browser and web server interaction process (see the socket sketch after this section's bullets):
    1. The browser extracts the hostname from the URL
    2. The browser looks up the IP address for that hostname (DNS)
    3. The browser gets the port number from the URL
    4. The browser opens a TCP connection to the server's IP address and port
    5. The browser sends an HTTP request message to the server
    6. The browser reads the server's HTTP response message
    7. The browser closes the TCP connection
  • Connection state maintained by both parties after the TCP handshake:
    1. Ports: identify the two endpoints of the connection
    2. Sequence numbers (seq): ensure data arrives without duplication, loss, or reordering
    3. Window size: used for flow control
  • HTTP (optionally over TLS) runs on top of TCP
  • TCP sends data in small chunks carried inside IP packets (IP datagrams).
  • Each TCP segment on the wire consists of:
    1. IP packet header (20 bytes)
    2. TCP segment header (20 bytes)
    3. Payload data
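A minimal sketch of steps 1-7 above using plain sockets; the hostname, port, and path are placeholders, and error handling is omitted:

```python
import socket

host, port, path = "example.com", 80, "/"       # steps 1 and 3: parsed from the URL
ip = socket.gethostbyname(host)                 # step 2: DNS lookup
sock = socket.create_connection((ip, port))     # step 4: open the TCP connection
sock.sendall(                                   # step 5: send the HTTP request message
    f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode()
)
response = b""
while True:                                     # step 6: read the HTTP response message
    chunk = sock.recv(4096)
    if not chunk:                               # server closed its side: response complete
        break
    response += chunk
sock.close()                                    # step 7: close the TCP connection
print(response.split(b"\r\n")[0].decode())      # the response status line
```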

2. TCP performance

HTTP transaction delay comes from several stages (a rough timing sketch follows the list):

  1. DNS resolution
  2. TCP connection setup
  3. The client sends the HTTP request; the server reads and processes it
  4. The server sends back the response
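A rough sketch that measures those components for a single transaction with plain sockets; "example.com" is a placeholder and the numbers vary with network conditions:

```python
import socket
import time

host, port = "example.com", 80                  # placeholder host

t0 = time.monotonic()
ip = socket.gethostbyname(host)                 # 1. DNS resolution
t1 = time.monotonic()
sock = socket.create_connection((ip, port))     # 2. TCP connection setup
t2 = time.monotonic()
sock.sendall(f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())  # 3. send the request
while sock.recv(4096):                          # 3 + 4. server processes and responds
    pass
t3 = time.monotonic()
sock.close()

print(f"DNS lookup       {t1 - t0:.3f} s")
print(f"TCP connect      {t2 - t1:.3f} s")
print(f"request/response {t3 - t2:.3f} s")
```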

1. TCP handshake:

  1. The client sends a segment with ctl=SYN and seq=client-seq to the server
  2. The server replies with ctl=SYN+ACK, ack=client-seq+1, and its own seq=server-seq
  3. The client confirms to the server with ctl=ACK and ack=server-seq+1

2. Delayed acknowledgment

  • Every TCP segment carries a sequence number and a data-integrity checksum. When the receiver gets an intact segment, it sends back a small acknowledgment packet; if the sender does not receive an acknowledgment within a specified time window, it assumes the segment was corrupted or lost and retransmits the data.
  • Because acknowledgments are small, TCP allows them to be "piggybacked" on data packets heading in the same direction, i.e. the acknowledgment is combined with an outgoing data packet, which uses the network more efficiently.
  • The delayed-acknowledgment algorithm holds the acknowledgment in a buffer for a specific window (100~200 ms), waiting for an outgoing data packet to piggyback on; if the window expires, the acknowledgment is sent in a packet of its own.

3. TCP slow start

  • When a connection is first established, its maximum transfer rate is limited; if data is delivered successfully, the rate is increased over time. This tuning is called TCP slow start and is meant to prevent sudden overload and congestion on the Internet.
  • Slow start limits the number of packets that may be in flight; the allowance grows exponentially as acknowledgments arrive, which is described as "opening the congestion window" (see the toy sketch below)
  • As a result, a new connection transfers data more slowly than an already "tuned" connection.
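A toy, purely illustrative sketch of that exponential growth; real TCP stacks track the window in bytes, and the threshold value here is just an assumption:

```python
INITIAL_CWND = 1      # congestion window, in segments
SSTHRESH = 64         # assumed slow-start threshold, in segments

cwnd, rtt = INITIAL_CWND, 0
while cwnd < SSTHRESH:
    print(f"round trip {rtt}: up to {cwnd} segment(s) in flight")
    cwnd *= 2         # each acknowledged segment opens the window further,
    rtt += 1          # so the window roughly doubles every round trip
print(f"round trip {rtt}: reached the threshold ({SSTHRESH} segments); slow start ends")
```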

3. HTTP connection handling

HTTP's Connection header carries a comma-separated list of connection tokens; for example, Connection: close indicates that the connection must be closed after the current message is sent.
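A minimal sketch of sending that header with Python's standard http.client; the hostname is a placeholder, and how the server acknowledges the close can vary:

```python
import http.client

conn = http.client.HTTPConnection("example.com", 80)        # placeholder host
# Ask the server to tear down the TCP connection after this transaction.
conn.request("GET", "/", headers={"Connection": "close"})
resp = conn.getresponse()
body = resp.read()
print(resp.status, resp.getheader("Connection"))             # often echoed back as "close"
conn.close()
```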

Serial transaction processing delay

  • Each transaction must open a new TCP connection, so connection-setup delay and slow-start delay add up (see the sketch after this list)
  • While one image is loading, nothing else happens on the rest of the page
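A sketch of the serial pattern, in which every request pays for its own connection; host and paths are placeholders:

```python
import http.client
import time

HOST = "example.com"                              # placeholder host
PATHS = ["/a.png", "/b.png", "/c.png"]            # placeholder resources

start = time.monotonic()
for path in PATHS:
    conn = http.client.HTTPConnection(HOST, 80)   # a brand-new TCP connection per transaction
    conn.request("GET", path)
    conn.getresponse().read()
    conn.close()                                  # torn down before the next request starts
print(f"serial total: {time.monotonic() - start:.2f} s")
```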

4. Parallel connection

The HTTP client opens multiple TCP connections and processes HTTP transactions in parallel, with each transaction getting its own TCP connection (see the sketch after the list below).

  • Some of the per-transaction delays can overlap
  • Offers little benefit when bandwidth is already the bottleneck
  • Each additional connection brings some overhead of its own
  • Server load increases, so clients generally limit themselves to no more than about four parallel connections per server
  • Loading several images at the same time makes the page feel faster to the user
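A sketch of the parallel pattern using a thread pool, one dedicated connection per transaction; the host, paths, and pool size of four are assumptions:

```python
import http.client
from concurrent.futures import ThreadPoolExecutor

HOST = "example.com"                              # placeholder host
PATHS = ["/a.png", "/b.png", "/c.png", "/d.png"]  # placeholder resources

def fetch(path):
    conn = http.client.HTTPConnection(HOST, 80)   # each transaction gets its own TCP connection
    conn.request("GET", path)
    status = conn.getresponse().status
    conn.close()
    return path, status

with ThreadPoolExecutor(max_workers=4) as pool:   # at most 4 connections in parallel
    for path, status in pool.map(fetch, PATHS):
        print(path, status)
```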

5. Persistent connection

  • Motivating scenario: site locality, i.e. a client that has just talked to a server is likely to send it more requests soon
  • Supported by HTTP/1.1 and by various enhanced versions of HTTP/1.0
  • Definition: a TCP connection that stays open after the transaction ends
  • Avoids connection-setup delay and the slow-start congestion-adaptation phase
  • Persistent connections must be managed carefully, otherwise large numbers of idle connections accumulate and consume client and server resources
  • Many web applications now open a small number of parallel connections, each of which is persistent.
  • Two flavors: HTTP/1.0+ keep-alive connections and HTTP/1.1 persistent connections

HTTP/1.0+ keep-alive

  • An extension to HTTP/1.0
  • Keep-alive handshake: the client asks for the connection to stay open by including a Connection: Keep-Alive header; if the server is willing to keep the connection open for the next request, it includes the same header in its response (see the sketch after this list)
  • This is a request, not a guarantee: an idle connection can be closed at any time
  • The Keep-Alive response header may carry a timeout parameter (an estimate of how long the server will keep the connection open) and a max parameter (an estimate of how many more transactions the server will allow on the connection)
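A sketch of that handshake over a raw socket, reading only the response headers to see whether the server agreed; the hostname is a placeholder, and many servers simply ignore the request and close anyway:

```python
import socket

HOST = "example.com"                              # placeholder host
sock = socket.create_connection((HOST, 80))
sock.sendall((
    "GET / HTTP/1.0\r\n"
    f"Host: {HOST}\r\n"
    "Connection: Keep-Alive\r\n"                  # ask the server to keep the connection open
    "\r\n"
).encode())

buf = b""
while b"\r\n\r\n" not in buf:                     # read just the response headers
    chunk = sock.recv(4096)
    if not chunk:
        break
    buf += chunk
print(buf.split(b"\r\n\r\n", 1)[0].decode("latin-1"))
# If the server agreed, the headers include "Connection: keep-alive" and possibly
# "Keep-Alive: timeout=..., max=..."; only then is it safe to reuse the socket.
sock.close()
```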

Limitations

  • Keep-alive is not the default in HTTP/1.0
  • Every message must include the Connection: Keep-Alive header; otherwise the server closes the connection after that request
  • The body must have a correct Content-Length or use chunked transfer encoding
  • Dumb proxies: hop-by-hop headers that a proxy must not blindly forward include Connection, Proxy-Authenticate, Proxy-Connection, and Transfer-Encoding

HTTP/1.1 persistent connections

  • HTTP/1.1 connections are persistent by default; to close the connection after the current transaction, the message must include a Connection: close header (see the sketch after this list).
  • Again a promise rather than a guarantee: either side can close the connection at any time
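A sketch of reusing one HTTP/1.1 connection for several transactions with http.client, assuming the server actually keeps the connection open; host and paths are placeholders:

```python
import http.client

conn = http.client.HTTPConnection("example.com", 80)   # placeholder host
for path in ("/", "/about"):                            # placeholder resources
    conn.request("GET", path)                           # HTTP/1.1: persistent by default
    resp = conn.getresponse()
    resp.read()                                         # drain the body before reusing the connection
    print(path, resp.status)
conn.close()                                            # or send Connection: close on the last request
```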

Limitations

  • After sending a Connection: close header, the client cannot send any more requests on that TCP connection
  • The body must have a correct Content-Length or use chunked transfer encoding
  • An HTTP/1.1 proxy must manage persistent connections with the client and with the server separately: each persistent connection applies to a single hop only
  • HTTP/1.1 devices can close the connection at any time
  • A client should maintain at most two persistent connections to any server or proxy, to avoid overloading the server

6. Pipeline connection

  • HTTP/1.1 allows requests to be pipelined, optionally, over persistent connections (see the sketch after this list)
  • Requests 2, 3, ... can be sent before the response to request 1 arrives
  • If the client cannot confirm that the connection is persistent, it should not pipeline
  • Responses must be sent back in the same order as the requests, because HTTP messages carry no sequence numbers to match them up
  • The client must be prepared for the connection to close at any time and to resend any outstanding requests
  • Non-idempotent requests (such as POST) should not be pipelined
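A rough sketch of pipelining over a raw socket: two requests are written back-to-back before any response is read, and the responses come back in request order. The hostname and paths are placeholders, and not every server supports pipelining:

```python
import socket

HOST = "example.com"                              # placeholder host
sock = socket.create_connection((HOST, 80))

sock.sendall((                                    # both requests leave before response 1 arrives
    f"GET /first HTTP/1.1\r\nHost: {HOST}\r\n\r\n"
    f"GET /second HTTP/1.1\r\nHost: {HOST}\r\nConnection: close\r\n\r\n"
).encode())

data = b""
while True:                                       # Connection: close on the last request means
    chunk = sock.recv(4096)                       # reading until EOF captures both responses
    if not chunk:
        break
    data += chunk
sock.close()

# Rough count of status lines; they appear in the same order as the requests.
status_lines = [l for l in data.split(b"\r\n") if l.startswith(b"HTTP/1.1 ")]
print(len(status_lines), "responses, in request order:", status_lines)
```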

Head-of-line blocking

  • If later responses arrive before the response at the head of the queue, they must be buffered to preserve ordering while waiting for the head-of-queue response; if that response is delayed, every response behind it is delayed as well
  • TCP has head-of-line blocking of its own; avoiding it means not using TCP, which is why Google's QUIC protocol is built on UDP
  • HTTP has head-of-line blocking at the request level; HTTP/2 avoids it by multiplexing streams

7. Close the connection

  • Both server and client can close a TCP connection at any time, usually at the end of a message, but possibly in the middle when an error occurs
  • Every HTTP response should carry an accurate Content-Length header; if it is missing or wrong, the server should close the connection to signal the true end of the data.
  • When a client receives a response that ends because the connection was closed, and the actual length of the entity does not match the Content-Length, it should question the correctness of that length
  • The client must be prepared for the connection to close at any time, re-establishing the connection and resending the request
  • Only idempotent requests (GET, PUT, HEAD, DELETE, TRACE, OPTIONS) should be pipelined; most browsers prompt the user before re-submitting a POST request
  • A TCP connection can be closed fully with close() or half-closed with shutdown() (see the sketch after this list)
  • Closing your output channel is always safe; closing your input channel is riskier. If the peer sends data to an input channel you have already closed, your operating system sends a "connection reset by peer" message back to the peer machine. Most operating systems treat this as a serious error and erase any buffered data the peer's application has not yet read, so cached, unread responses are lost.
  • Graceful close: each side first closes its output channel and then periodically checks the status of its input channel; if the peer does not close its end within a certain time, the application can force the connection closed.
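A minimal sketch of that half-close pattern from the client side: stop sending with shutdown(), keep reading until the peer closes its end, then close fully; the hostname is a placeholder:

```python
import socket

HOST = "example.com"                              # placeholder host
sock = socket.create_connection((HOST, 80))
sock.sendall(f"GET / HTTP/1.0\r\nHost: {HOST}\r\n\r\n".encode())

sock.shutdown(socket.SHUT_WR)                     # half-close: no more output, input stays open

response = b""
while True:
    chunk = sock.recv(4096)
    if not chunk:                                 # the peer has closed its output side too
        break
    response += chunk
sock.close()                                      # full close after draining the input channel
print(response.split(b"\r\n")[0].decode())
```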
