A detailed explanation of TCP timeout and retransmission (long read)

My last TCP article, "TCP three-way handshake, four waves and some details", got good feedback, which made me quite happy. This time I will talk about timeout and retransmission.


We all know that TCP has a retransmission mechanism: if the sender believes a packet has been lost, it retransmits that packet. Obviously, we need a way to "guess" whether a loss has occurred. The simplest idea is that every time the receiver receives a packet, it returns an ACK to the sender, indicating that the data has arrived. Conversely, if the sender does not receive an ACK within a certain period of time, the packet has most likely been lost, so the sender resends it until an ACK is received.

You may have noticed that I said "guess". Even after a timeout, the packet may not have been lost; it may simply have taken a long detour and arrived late. After all, TCP is a transport-layer protocol and cannot know exactly what happened at the data link layer and the physical layer. This does not break the timeout retransmission mechanism, though, because the receiver automatically discards duplicate packets.

The concept of timeout and retransmission really is that simple, but there are many details inside. The first question that comes to mind is: how long should we wait before declaring a timeout?

How is the timeout determined?

The one-size-fits-all approach is to set the timeout to a fixed value, such as 200 ms, but this is clearly problematic. Our computer talks to many servers located all over the world, at home and abroad, and their delays differ hugely. For example:

  • My personal blog is hosted in China, with a one-way delay of about 30 ms, so under normal circumstances a packet is acknowledged in about 60 ms. With the fixed value, however, it takes 200 ms to conclude that a packet is lost (a reasonable timeout here might be 90 to 120 ms), which is rather inefficient.
  • Suppose you visit a foreign website with a delay of 130 ms. Now there is trouble: an ACK normally takes about 260 ms to come back, so perfectly normal packets are judged to have timed out and large numbers of packets are retransmitted. The retransmitted packets are just as likely to be misjudged as timed out, which starts to feel like an avalanche.

Therefore, it is very unreliable to set a fixed value. We need to dynamically adjust the timeout period according to the network delay. The greater the delay, the longer the timeout period.
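To make the two examples concrete, here is a minimal Python sketch. The function name `fixed_timeout_report` is made up for illustration, and it assumes a symmetric path, so the RTT is simply twice the one-way delay.

```python
def fixed_timeout_report(one_way_delay_ms, rto_ms=200):
    """Return (rtt_ms, spurious, slack_ms) for a fixed retransmission timeout.

    spurious: True if a normal ACK would arrive only after the timer fires.
    slack_ms: time wasted after the ACK was due before a real loss is noticed.
    """
    rtt_ms = 2 * one_way_delay_ms          # assumption: symmetric path
    spurious = rtt_ms >= rto_ms
    slack_ms = max(0, rto_ms - rtt_ms)
    return rtt_ms, spurious, slack_ms


for delay in (30, 130):                    # the two delays from the examples
    print(delay, fixed_timeout_report(delay))
```

With the 30 ms path the sender waits 140 ms longer than necessary before noticing a loss; with the 130 ms path every packet times out even though nothing was lost.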

Two concepts are introduced here:

  • RTT (Round Trip Time): the round-trip delay, that is, the time from when a data packet is sent to when the corresponding ACK is received. RTT is connection-specific: each connection has its own independent RTT.
  • RTO (Retransmission Timeout): the retransmission timeout, which is the timeout period mentioned earlier.

For comparison, here is the standard definition of RTT:

Measure the elapsed time between sending a data octet with a particular sequence number and receiving an acknowledgment that covers that sequence number (segments sent do not have to match segments received). This measured elapsed time is the Round Trip Time (RTT).

Classical method

The original specification, "RFC0793", used the following formula to obtain a smoothed RTT estimate (called SRTT):

SRTT <- α·SRTT +(1 - α)·RTT

RTT here is the latest sample value. This estimation method is called an "exponentially weighted moving average". The name sounds fancy, but the formula is easy to understand: it takes a weighted average of the existing SRTT value and the most recently measured RTT value. RFC 793 gives 0.8 to 0.9 as example values for α, so the old estimate dominates.
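The weighted average is short enough to write out. This is only a sketch: `smooth_rtt` is a name I made up, the default alpha of 0.9 is an assumption taken from the 0.8 to 0.9 range that RFC 793 gives as an example, and the samples are invented.

```python
def smooth_rtt(srtt, rtt_sample, alpha=0.9):
    """Classical update: SRTT <- alpha*SRTT + (1 - alpha)*RTT."""
    return alpha * srtt + (1 - alpha) * rtt_sample


srtt = 100.0                               # assumed starting estimate, in ms
for sample in (100.0, 300.0, 100.0):       # made-up RTT samples with one spike
    srtt = smooth_rtt(srtt, sample)
    print(f"sample={sample:.0f} ms -> SRTT={srtt:.1f} ms")
```

Because alpha is close to 1, a single 300 ms spike only pulls the estimate from 100 ms to about 120 ms.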

With SRTT in hand, the next step is to set the corresponding RTO. "RFC0793" calculates it like this:

RTO = min(ubound, max(lbound, (SRTT)·β))

Here, ubound is the upper bound of the RTO, lbound is the lower bound of the RTO, and β is called the delay variance factor, with a recommended value of 1.3 to 2.0. In other words, the formula uses (SRTT)·β as the RTO while clamping the result between the lower and upper bounds.
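As a quick sketch of the clamping, in Python: the function name `classical_rto` is hypothetical, times are in seconds, and the default bounds of 1 second and 60 seconds are assumptions based on the example values given in RFC 793.

```python
def classical_rto(srtt, beta=2.0, lbound=1.0, ubound=60.0):
    """RTO = min(ubound, max(lbound, beta * SRTT)), all times in seconds."""
    return min(ubound, max(lbound, beta * srtt))


for srtt in (0.06, 2.0, 100.0):            # made-up smoothed RTT values
    print(f"SRTT={srtt} s -> RTO={classical_rto(srtt)} s")
```

A very small SRTT is lifted to the lower bound, a very large one is capped at the upper bound, and anything in between is simply multiplied by β.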

At first glance there is nothing wrong with this calculation (at least that is how I felt), but in practice it has two defects:

There were two known problems with the RTO calculations specified in RFC-793. First, the accurate measurement of RTTs is difficult when there are retransmissions. Second, the algorithm to compute the smoothed round-trip time is inadequate [TCP:7], because it incorrectly assumed that the variance in RTT values would be small and constant. These problems were solved by Karn's and Jacobson's algorithm, respectively.

This passage is taken from "RFC1122". Let me explain:

  • When data packets are retransmitted, calculating the RTT becomes very troublesome. Let t0 be the time a packet is first sent, t1 the time it is retransmitted, and t2 the time an ACK arrives.

    There are two cases, and the RTT is calculated differently in each (this is the so-called retransmission ambiguity):

    • Case 1 (the ACK answers the original transmission): RTT = t2 - t0
    • Case 2 (the ACK answers the retransmission): RTT = t2 - t1

    The client, however, does not know which case occurred. Choosing wrongly makes the measured RTT too large or too small, which in turn distorts the RTO. (The simplest, crudest solution is to ignore retransmitted packets and only take samples from packets that were never retransmitted, but this leads to other problems. See Karn's algorithm for details.)

  • The other problem is that this algorithm assumes RTT fluctuations are relatively small. This kind of weighted average is also called a low-pass filter, and it is insensitive to sudden changes in the network. If the network delay suddenly increases, the actual RTT becomes much larger than the estimate, the RTO is too short, and unnecessary retransmissions follow. (An increase in RTT already suggests that the network is overloaded, and these unnecessary retransmissions burden it even further.)
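The second defect is easy to see with a small simulation. This is a sketch under stated assumptions: `classical_trace` is a made-up name, α = 0.9 and β = 2.0 are picked from the ranges mentioned above, the upper and lower bounds are left out, no packet is actually lost, and every RTT is still sampled (which sidesteps the retransmission ambiguity on purpose).

```python
def classical_trace(samples, srtt, alpha=0.9, beta=2.0):
    """Return (rtt, rto, spurious) for each RTT sample, in order.

    spurious is True when the ACK would arrive after the timer has fired,
    i.e. the packet would be retransmitted although it was never lost.
    """
    trace = []
    for rtt in samples:
        rto = beta * srtt                  # RTO in force while this packet is in flight
        trace.append((rtt, rto, rtt > rto))
        srtt = alpha * srtt + (1 - alpha) * rtt   # classical SRTT update
    return trace


samples = [100] * 3 + [400] * 6            # made-up RTTs in ms: a sudden jump to 400
trace = classical_trace(samples, srtt=100)
spurious_count = sum(1 for _, _, spurious in trace if spurious)
for rtt, rto, spurious in trace:
    print(f"rtt={rtt} ms  rto={rto:.1f} ms  spurious={spurious}")
```

In this run the estimate needs several samples to catch up, and during that time every packet is retransmitted unnecessarily.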

Standard method

To be honest, the standard method is rather involved, so I will just paste the formulas:

SRTT <- (1 - α)·SRTT + α·RTT //As in the classical method, a weighted average of SRTT and the new sample; note that here α weights the new sample

rttvar <- (1 - h)·rttvar + h·|RTT - SRTT| //A weighted average of the gap between SRTT and the measured value (the absolute error |Err|), computed with the SRTT from before this update

RTO = SRTT + 4·rttvar //The new RTO estimate; the coefficient 4 on rttvar was settled by parameter tuning

The overall idea of this algorithm is to combine the mean (that is, the classical method) with the mean deviation to make the estimate; after a round of empirical parameter tuning, it works well. For a deeper understanding of this algorithm, see "RFC6298".
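Here is a minimal Python sketch of these three formulas. The class name `RttEstimator` is made up; the gains α = 1/8 and h = 1/4 and the first-sample initialisation (SRTT = RTT, rttvar = RTT/2) follow RFC 6298 as I understand it. The clock granularity term and the minimum RTO from that RFC are deliberately left out, so treat this as an illustration rather than a complete implementation.

```python
class RttEstimator:
    """Mean plus mean-deviation RTO estimator (simplified)."""

    def __init__(self, alpha=1 / 8, h=1 / 4, k=4):
        self.alpha, self.h, self.k = alpha, h, k
        self.srtt = None
        self.rttvar = None

    def update(self, rtt):
        """Feed one RTT sample and return the new RTO."""
        if self.srtt is None:
            # First sample: start from the measurement itself.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The deviation is measured against SRTT before SRTT is updated.
            self.rttvar = (1 - self.h) * self.rttvar + self.h * abs(rtt - self.srtt)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt
        return self.srtt + self.k * self.rttvar


est = RttEstimator()
for sample in (100, 100, 200):             # made-up RTT samples in ms
    print(f"sample={sample} ms -> RTO={est.update(sample)} ms")
```

Compared with the classical method, a jump in RTT shows up in rttvar immediately and is multiplied by 4, so the RTO reacts much faster than the mean alone would.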

Retransmissions - an important event for TCP

Timer-based retransmission

Under this mechanism, each data packet has a corresponding timer. If the RTO elapses without an ACK being received, the packet is resent. Packets that have not yet been acknowledged are kept in the retransmission buffer and are removed from it once their ACK arrives.
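The bookkeeping can be sketched in a few lines of Python. `RetransmissionBuffer` and its methods are hypothetical names, time is passed in explicitly instead of being read from a clock, and the RTO is fixed for simplicity; a real stack is considerably more involved.

```python
class RetransmissionBuffer:
    """Toy per-packet retransmission timers."""

    def __init__(self, rto):
        self.rto = rto
        self.unacked = {}                  # seq -> time of the last transmission

    def on_send(self, seq, now):
        self.unacked[seq] = now            # keep the packet until it is ACKed

    def on_ack(self, seq):
        self.unacked.pop(seq, None)        # ACK received: drop it from the buffer

    def expired(self, now):
        """Return packets whose timer ran out, restarting their timers."""
        due = sorted(seq for seq, sent in self.unacked.items()
                     if now - sent >= self.rto)
        for seq in due:
            self.unacked[seq] = now        # the packet is being resent now
        return due


buf = RetransmissionBuffer(rto=1.0)
buf.on_send(1, now=0.0)
buf.on_send(2, now=0.1)
buf.on_ack(1)                              # packet 1 is acknowledged in time
print(buf.expired(now=0.5))                # nothing has timed out yet
print(buf.expired(now=1.2))                # packet 2 has waited longer than the RTO
```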

First of all, it should be clear that for TCP a timeout retransmission is a very significant event (the RTO is often more than twice the RTT, and a timeout usually means congestion). Once it happens, TCP not only retransmits the corresponding segment but also reduces its current sending rate, because it assumes the network is congested.

A simple timeout retransmission mechanism is often inefficient, as in the following case:

Suppose packet 5 is lost while packets 6, 7, 8, and 9 have all reached the receiver. At this point the client can only wait for the server's ACK. Note that the server cannot acknowledge packets 6, 7, 8, and 9; this follows from the sliding window mechanism, where an ACK covers only data received in order. The client therefore has no idea how many packets were lost. It may pessimistically assume that every packet after 5 was lost too, and retransmit all five packets, which is rather wasteful.

Fast retransmission

The fast retransmission mechanism ("RFC5681") triggers retransmission based on feedback from the receiver rather than on expiry of the retransmission timer.

As just mentioned, timer-based retransmission often means a long wait. Fast retransmission solves this with a clever trick: when the server receives an out-of-order packet, it still returns an ACK to the client, but a duplicate one. Taking the example above, when the out-of-order packets 6, 7, 8, and 9 arrive, the server replies with ACK = 5 each time, so the client learns that packet 5 is missing. In general, if the client receives three duplicate ACKs in a row, it retransmits the corresponding packet without waiting for the timer to expire.
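The example can be replayed with a small Python sketch. Both function names are made up, packets are numbered one by one as in the example (real TCP numbers bytes), and ACK = n means "packet n is the next one I am waiting for".

```python
def receiver_acks(arrivals, next_expected):
    """Cumulative ACK number the receiver sends after each arriving packet."""
    received = set()
    acks = []
    for seq in arrivals:
        received.add(seq)
        while next_expected in received:   # advance over everything now in order
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmits(acks, dup_threshold=3):
    """Packets the sender resends after dup_threshold duplicate ACKs."""
    retransmitted = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold: # fires once, on the third duplicate
                retransmitted.append(ack)
        else:
            last_ack, dup_count = ack, 0
    return retransmitted


acks = receiver_acks([1, 2, 3, 4, 6, 7, 8, 9], next_expected=1)  # packet 5 is lost
print(acks)
print(fast_retransmits(acks))
```

The first ACK = 5 is the normal acknowledgment of packet 4; the ones triggered by 6, 7, and 8 are the three duplicates that make the sender resend packet 5.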

But fast retransmission still leaves the second problem unsolved: how many packets should be retransmitted?

Retransmission with selective acknowledgment

The improvement is SACK (Selective Acknowledgment). In short, on top of fast retransmission, the receiver also returns the sequence number ranges of recently received segments, so that the client knows exactly which packets have reached the server.

Here are a few simple examples:

  • Case 1: The first packet is lost, and the remaining 7 packets are received.

    Whenever any of the 7 packets arrives, the receiver returns an ACK carrying the SACK option, telling the sender which out-of-order packets it has received. Note: Left Edge and Right Edge are the left and right boundaries of the out-of-order data.


             Triggering    ACK      Left Edge   Right Edge
             Segment

             5000         (lost)
             5500         5000     5500       6000
             6000         5000     5500       6500
             6500         5000     5500       7000
             7000         5000     5500       7500
             7500         5000     5500       8000
             8000         5000     5500       8500
             8500         5000     5500       9000

  • Case 2: The 2nd, 4th, 6th, and 8th packets are lost.
    • When the first packet arrives, nothing is out of order, so a normal ACK is returned.

    • When the 3rd, 5th, and 7th packets arrive, they are out of order, so an ACK with SACK is returned.

    • Because the received data is split into several separate ranges in this case, there are several corresponding SACK blocks. Of course, the option field has a size limit, so the number of blocks also has an upper limit.


          Triggering  ACK    First Block   2nd Block     3rd Block
          Segment            Left   Right  Left   Right  Left   Right
                             Edge   Edge   Edge   Edge   Edge   Edge

          5000       5500
          5500       (lost)
          6000       5500    6000   6500
          6500       (lost)
          7000       5500    7000   7500   6000   6500
          7500       (lost)
          8000       5500    8000   8500   7000   7500   6000   6500
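The two tables can be reproduced with a short Python sketch of the receiver side. The function name `sack_trace` is made up, every segment is assumed to be 500 bytes long as in the tables, and the limit of three reported blocks is an assumption that mirrors the three block columns above.

```python
def sack_trace(arrivals, first_seq, seg_len=500, max_blocks=3):
    """Return (triggering segment, ACK, SACK blocks) for each arriving segment."""
    ack = first_seq                        # next byte expected in order
    blocks = []                            # out-of-order ranges, most recent first
    trace = []
    for seq in arrivals:
        left, right = seq, seq + seg_len
        untouched = []
        for b_left, b_right in blocks:
            if b_right >= left and b_left <= right:
                # Overlapping or adjacent: grow the new range to absorb it.
                left, right = min(left, b_left), max(right, b_right)
            else:
                untouched.append((b_left, b_right))
        if left <= ack:
            # The range joins the in-order data, so the ACK advances.
            ack = max(ack, right)
            blocks = untouched
        else:
            # Still out of order: report it as the first SACK block.
            blocks = [(left, right)] + untouched
        trace.append((seq, ack, blocks[:max_blocks]))
    return trace


# Case 2: segments 5500, 6500, 7500 and 8500 are lost.
for row in sack_trace([5000, 6000, 7000, 8000], first_seq=5000):
    print(row)
```

Putting the most recently changed block first is what produces the ordering in the Case 2 table, where the newest range always appears as the First Block.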
          8500       (lost)

However, the SACK specification, "RFC2018", has a catch. After using a SACK to tell the sender what it has received, the receiver may "go back on its word": it may discard those (out-of-order) packets and later report a different state to the sender. The following is an excerpt from "RFC2018":

Note that the data receiver is permitted to discard data in its queue that has not been acknowledged to the data sender, even if the data has already been reported in a SACK option. Such discarding of SACKed packets is discouraged, but may be used if the receiver runs out of buffer space.

The last sentence says this may be done when the receiver is running out of buffer space, although it is of course discouraged.

Because of this, the sender cannot simply drop data from its retransmission buffer when it receives a SACK; it must wait until the receiver sends an ordinary ACK whose number is greater than the highest sequence number of that data. The retransmission timer is affected as well: it should ignore SACK information, because if the receiver discards the data, the effect is the same as a lost packet.
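A sender-side sketch in Python may make this clearer. `SenderBuffer` and its methods are hypothetical names; the behaviour on timeout (forget all SACK marks and treat everything still buffered as a retransmission candidate) is my reading of RFC 2018, simplified.

```python
class SenderBuffer:
    """Toy retransmission buffer that tolerates SACK reneging."""

    def __init__(self):
        self.segments = {}                 # left edge -> [right edge, sacked flag]

    def on_send(self, left, right):
        self.segments[left] = [right, False]

    def on_sack(self, left, right):
        # Only mark the data; it stays buffered because the receiver may renege.
        for seg_left, seg in self.segments.items():
            if left <= seg_left and seg[0] <= right:
                seg[1] = True

    def on_cumulative_ack(self, ack):
        # Only an ordinary ACK lets the sender free data.
        for seg_left in [s for s, seg in self.segments.items() if seg[0] <= ack]:
            del self.segments[seg_left]

    def unsacked(self):
        return sorted(s for s, seg in self.segments.items() if not seg[1])

    def on_timeout(self):
        # The timer ignores SACK: clear every mark, everything may be resent.
        for seg in self.segments.values():
            seg[1] = False
        return sorted(self.segments)


buf = SenderBuffer()
for left in (5000, 5500, 6000):
    buf.on_send(left, left + 500)
buf.on_sack(5500, 6500)
print(buf.unsacked())                      # only the hole at 5000 looks missing
print(buf.on_timeout())                    # after a timeout, all three are candidates
```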

DSACK extension

DSACK, or duplicate SACK, builds on SACK by carrying extra information that tells the sender which packets the receiver has received more than once. Its purpose is to help the sender work out whether packets were reordered, an ACK was lost, a packet was duplicated, or a retransmission was spurious, so that TCP can do a better job of network flow control.
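As a rough sketch of how a sender might recognise a DSACK, here is a small Python function. The name `is_dsack` is made up, the sequence numbers in the usage lines are invented, and the two checks (first block below the cumulative ACK, or first block contained in the second block) are my reading of RFC 2883, so please verify against the RFC before relying on it.

```python
def is_dsack(ack, blocks):
    """Guess whether the first SACK block reports duplicate data.

    ack:    cumulative ACK number of the segment
    blocks: SACK blocks as (left, right) tuples, in the order received
    """
    if not blocks:
        return False
    left, right = blocks[0]
    if left < ack:
        # Data below the cumulative ACK was already acknowledged: a duplicate.
        return True
    if len(blocks) > 1:
        left2, right2 = blocks[1]
        # The first block lies inside the second: that part arrived twice.
        return left2 <= left and right <= right2
    return False


print(is_dsack(4000, [(3000, 3500)]))                  # block below the ACK
print(is_dsack(4000, [(5000, 5500), (4500, 6000)]))    # block inside the next one
print(is_dsack(4000, [(5000, 5500), (4500, 5000)]))    # ordinary SACK blocks
```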

There are many DSACK examples in "RFC2883". Interested readers can look them up; I won't go into detail here.


That is about all there is to timeout and retransmission. I hope it helps.

If this article helped you, you are welcome to follow my public account, "the ravings of tobe", which takes you deep into the world of computers~ Reply with the keyword [computer] in the public account's chat window for a surprise~
