TCP flow control and congestion avoidance

TCP flow control

     Flow control ensures that the sender does not transmit faster than the receiver can process. TCP implements flow control conveniently with the sliding-window mechanism on each connection. The TCP window is measured in bytes, not segments, and the sender's send window must not exceed the receive window value advertised by the receiver.
     The figure illustrates flow control with a variable window size. Suppose host A sends data to host B, and the window value agreed by both sides is 400 bytes. Let each segment be 100 bytes long and the initial sequence number be seq = 1. In the figure, the uppercase ACK above an arrow denotes the ACK flag bit in the header, while the lowercase ack denotes the value of the acknowledgment field.
     The receiving host B performs flow control three times: first it sets the window to rwnd = 300, then reduces it to rwnd = 100, and finally to rwnd = 0, which forbids the sender from sending any more data. This state, in which the sender is suspended, lasts until host B issues a new window value.
     Suppose that shortly after B sends the zero-window segment to A, some space frees up in B's receive buffer, so B sends a segment with rwnd = 400 to A, but that segment is lost in transit. A keeps waiting for the non-zero-window notification from B, while B keeps waiting for data from A: this is deadlock. To break this deadlock, TCP maintains a persistence timer for each connection. Whenever one side of the connection receives a zero-window notification from the other, it starts the persistence timer. When the timer expires, it sends a zero-window probe segment (carrying only one byte of data), and the other side reports its current window value in the acknowledgment of the probe.
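The window check described above can be sketched as a few lines of Python. This is a minimal illustration, not a real TCP stack; the function name and byte counts are chosen to match the 300-byte-window example.

```python
# Sketch (not a real TCP stack): the advertised receive window caps
# the amount of unacknowledged data a sender may have in flight.
def can_send(next_seq, last_acked, rwnd, nbytes):
    """Return True if sending nbytes more keeps in-flight data within rwnd."""
    in_flight = next_seq - last_acked
    return in_flight + nbytes <= rwnd

# With rwnd = 300 and 200 bytes already unacknowledged (seq 1..200 sent,
# none acked), a 100-byte segment still fits but a 200-byte one does not.
print(can_send(next_seq=201, last_acked=1, rwnd=300, nbytes=100))  # True
print(can_send(next_seq=201, last_acked=1, rwnd=300, nbytes=200))  # False
```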

Selection of TCP segment sending timing

     The timing of sending a TCP segment is chosen mainly in one of the following ways.
     1) TCP maintains a variable equal to the maximum segment size MSS. As soon as the data stored in the buffer reaches MSS bytes, it is assembled into a TCP segment and sent out.
     2) The sender's application indicates that the segment must be sent immediately, i.e. the push operation supported by TCP.
     3) When a timer set by the sender expires, the data currently in the buffer is packed into a segment and sent out.
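The three cases above amount to a simple decision function. The sketch below is illustrative only (the 1460-byte MSS value and the function name are assumptions, not part of any real TCP API):

```python
MSS = 1460  # a typical maximum segment size in bytes (illustrative value)

def should_send(buffered_bytes, push_requested, timer_expired):
    """Decide whether to emit a segment now, per the three cases above."""
    if buffered_bytes >= MSS:              # 1) a full MSS of data is buffered
        return True
    if push_requested:                     # 2) the application issued a push
        return True
    if timer_expired and buffered_bytes > 0:   # 3) the send timer fired
        return True
    return False
```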

TCP congestion control

1. The principle of congestion control
     During a given period of time, if the demand for some resource in the network exceeds what that resource can supply, network performance deteriorates; this condition is called congestion. Network congestion usually has many causes, and simply speeding up a node's processor or enlarging a node's buffer space cannot solve it. Congestion arises because the parts of the whole system are mismatched, and it is relieved only when the parts are brought into balance.
2. The difference between congestion control and flow control
     Congestion control means preventing too much data from being injected into the network, so that the routers and links in the network are not overloaded. Congestion control rests on the premise that the network can bear the existing load. Congestion is a global problem, involving all hosts, all routers, and every factor that degrades network transmission performance. Flow control, by contrast, refers to the control of point-to-point traffic and is an end-to-end problem: all flow control does is limit the rate at which the sender transmits so that the receiver has time to accept the data.
3. Congestion Control Design
     Congestion control is difficult to design because it is a dynamic problem; in many cases the congestion control mechanism itself becomes the cause of degraded network performance or even deadlock. From the perspective of control theory, congestion control divides into two methods: open-loop control and closed-loop control. Open-loop control tries to account, at network design time, for every factor related to congestion; once the system is running, no mid-course corrections are made.
     Closed-loop control is based on the concept of a feedback loop and includes the following measures:
     1) Monitor the network to detect when and where congestion occurs.
     2) Forward information about the congestion to points where action can be taken.
     3) Adjust the operation of the network to resolve the problem.
4. Congestion Control Methods
  The Internet recommended standard RFC 2581 defines four congestion control algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery. In what follows we assume that:
     1) Data flows in one direction only, and the other direction carries only acknowledgments.
     2) The receiver always has enough buffer space, so the size of the sending window is determined solely by the degree of network congestion.

 Slow start and congestion avoidance

  The sender maintains a state variable called the congestion window, cwnd. Its size depends on how congested the network is and changes dynamically. The sender makes its sending window equal to the congestion window (taking the receiver's capacity into account, the sending window may actually be smaller). The principle governing cwnd is: as long as the network is not congested, the congestion window is enlarged in order to send more packets; as soon as the network becomes congested, the congestion window is reduced to cut the number of packets injected into the network.

       The idea behind the slow start algorithm is that a TCP connection which injects a large number of packets into the network immediately after it is established can easily exhaust router buffer space and cause congestion. A newly established connection therefore must not begin with a burst; it can only increase the amount of data it sends gradually, according to network conditions. Specifically, when a new connection is created, cwnd is initialized to one maximum segment size (MSS), and the sender transmits according to the congestion window. Each time a segment is acknowledged, cwnd is increased by at most one MSS, so the congestion window cwnd grows step by step (roughly doubling every round-trip time).

       Here the slow start algorithm is illustrated with the congestion window measured in number of segments; in a real implementation the congestion window is measured in bytes. As shown below:

       To prevent network congestion caused by cwnd growing too large, a slow start threshold state variable ssthresh is maintained. It is used as follows:

  When cwnd < ssthresh, the slow start algorithm is used.

  When cwnd > ssthresh, the congestion avoidance algorithm is used instead.

  When cwnd = ssthresh, either slow start or congestion avoidance may be used.

       The idea of the congestion avoidance algorithm is to let the congestion window grow slowly: each time one round-trip time RTT passes, the sender's cwnd is increased by one MSS rather than doubled. In this way the congestion window grows linearly.

       Whether in the slow start phase or in the congestion avoidance phase, as soon as the sender judges that the network is congested (the criterion is a timeout without receiving an acknowledgment; the loss might have other causes, but since the sender cannot tell, every such loss is treated as congestion), it sets the slow start threshold to half the send window size at the moment congestion occurred, then sets the congestion window to 1 and runs the slow start algorithm again. The purpose is to quickly reduce the number of packets the host sends into the network, giving the congested router enough time to work off the packets backlogged in its queue.
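The rules described so far (exponential growth below ssthresh, linear growth above it, and the timeout reaction) can be sketched as a toy model. This is an illustration only, not a real TCP implementation: the class name, the initial ssthresh of 16, and counting cwnd in whole MSS units (with whole-RTT rounds) are all simplifying assumptions.

```python
class CongestionControl:
    """Toy model of slow start + congestion avoidance, cwnd in MSS units."""

    def __init__(self, ssthresh=16):
        self.cwnd = 1            # slow start begins at one MSS
        self.ssthresh = ssthresh

    def on_round_trip(self):
        # One RTT's worth of acknowledgments arrived without loss:
        # double below ssthresh (slow start), +1 at or above it
        # (congestion avoidance).
        if self.cwnd < self.ssthresh:
            self.cwnd = min(self.cwnd * 2, self.ssthresh)
        else:
            self.cwnd += 1

    def on_timeout(self):
        # Timeout is taken as congestion: halve ssthresh (multiplicative
        # decrease), reset cwnd to 1, and re-enter slow start.
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = 1


cc = CongestionControl(ssthresh=8)
for _ in range(5):
    cc.on_round_trip()
print(cc.cwnd)      # 1 -> 2 -> 4 -> 8, then +1 per RTT: prints 10
cc.on_timeout()
print(cc.cwnd, cc.ssthresh)  # back to 1, with ssthresh halved to 5
```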
  As shown below:

 

Multiplicative decrease and additive increase

  Multiplicative decrease: whether in the slow start phase or the congestion avoidance phase, whenever a timeout occurs, the slow start threshold is halved, i.e. set to half of the current congestion window (and the slow start algorithm is executed at the same time). When the network is congested frequently, ssthresh drops rapidly, sharply reducing the number of packets injected into the network.

  Additive increase: while the congestion avoidance algorithm runs, the congestion window is increased slowly, to prevent the network from becoming congested again too early.

Fast retransmission and fast recovery

  A TCP connection can sit idle for a long time waiting for the retransmission timer to expire, and slow start plus congestion avoidance alone handle this poorly. The congestion control methods of fast retransmit and fast recovery were therefore proposed. Fast retransmit does not abolish the retransmission timer; it simply retransmits a lost segment earlier in some cases: if the sender receives three duplicate ACKs, it concludes the segment is lost and retransmits it immediately, without waiting for the retransmission timer to expire.
  Fast retransmit requires the receiver to send a duplicate acknowledgment immediately upon receiving an out-of-order segment (so that the sender learns early that a segment did not arrive), instead of waiting to piggyback the acknowledgment on outgoing data (the delayed-ACK mechanism). The fast retransmit algorithm stipulates that as soon as the sender receives three duplicate acknowledgments in a row, it must immediately retransmit the segment the other side has not received, without waiting for the retransmission timer to expire. As shown below:

  The fast retransmission is also used in conjunction with the fast recovery algorithm, which has the following two points:

  ① When the sender receives three duplicate acknowledgments in a row, it executes the multiplicative-decrease step, halving the ssthresh threshold; but it does not then run the slow start algorithm.

  ② Since several duplicate acknowledgments could not have arrived if the network were badly congested, the sender concludes that the network is probably not congested. So instead of running slow start, it sets cwnd to the halved value of ssthresh and then executes the congestion avoidance algorithm. As shown below:

  With the fast recovery algorithm in use, the slow start algorithm is applied only when a TCP connection is established and when a retransmission timeout occurs.
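The fast retransmit / fast recovery reaction can be sketched as a small state-update function. This is a simplified illustration (function and variable names are assumptions; real implementations also inflate cwnd per extra duplicate ACK, which is omitted here):

```python
def on_duplicate_ack(cwnd, ssthresh, dup_acks):
    """React to one duplicate ACK; on the third, enter fast recovery."""
    dup_acks += 1
    if dup_acks == 3:
        # Fast retransmit: resend the missing segment immediately, without
        # waiting for the retransmission timer. Fast recovery: halve
        # ssthresh, set cwnd to the halved value, and continue in
        # congestion avoidance instead of restarting slow start.
        ssthresh = max(cwnd // 2, 2)
        cwnd = ssthresh
    return cwnd, ssthresh, dup_acks


# Three duplicate ACKs arriving with cwnd = 16: cwnd and ssthresh drop to 8.
cwnd, ssthresh, dup = 16, 16, 0
for _ in range(3):
    cwnd, ssthresh, dup = on_duplicate_ack(cwnd, ssthresh, dup)
print(cwnd, ssthresh)  # 8 8
```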

  The receive window is also called the advertised window. From the standpoint of the receiver's flow control over the sender, the sender's send window must not exceed the receive window rwnd advertised by the other side.

  That is: upper limit of the send window = min(rwnd, cwnd).
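In code this cap is a one-liner; the function name and the example values below are purely illustrative:

```python
def send_window(rwnd, cwnd):
    """Effective send window: capped by both the receiver and the network."""
    return min(rwnd, cwnd)

print(send_window(rwnd=4000, cwnd=2500))  # 2500: the network is the bottleneck
print(send_window(rwnd=1000, cwnd=2500))  # 1000: the receiver is the bottleneck
```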

Random Early Detection RED

       The congestion avoidance algorithms above are not tied to the network layer, yet the network-layer policy with the greatest impact on them is the router's packet-discard policy. In the simplest case a router handles incoming packets first-in-first-out; when its buffer can hold no more, it discards the arriving packet. This is the tail-drop policy. Tail drop loses packets, and the sender then judges the network to be congested. Worse, the network carries many TCP connections whose segments usually traverse shared routers and paths; when tail drop occurs at one router, many TCP connections may be affected at once, with the result that all of them enter the slow start state at the same time. This is known as global synchronization. Global synchronization makes network traffic plummet suddenly, and once the network recovers, traffic surges again.

       To avoid global synchronization in the network, routers adopt Random Early Detection (RED). The main points of the algorithm are as follows:

       The router's queue maintains two parameters: a minimum threshold and a maximum threshold on the queue length. Each time a packet arrives, RED computes the average queue length and then handles the arriving packet according to the situation:

  ① Average queue length below the minimum threshold: enqueue the newly arrived packet.

  ② Average queue length between the minimum and maximum thresholds: discard the packet with a certain probability.

  ③ Average queue length above the maximum threshold: discard the newly arrived packet.

      RED does not wait until congestion has occurred and then drop every packet at the tail of the queue; instead, when it detects early signs of congestion (that is, when the router's average queue length exceeds a threshold), it drops arriving packets at random with probability p. Congestion control is thus triggered only on individual TCP connections, avoiding global synchronization.
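The three-way decision can be sketched as follows. This is a simplified illustration of classic RED: the linear drop-probability ramp between the thresholds is the commonly described variant, and all names and values are illustrative.

```python
import random

def red_decision(avg_qlen, min_th, max_th, p_max):
    """RED per-packet decision: enqueue, probabilistic drop, or forced drop."""
    if avg_qlen < min_th:
        return "enqueue"           # ① below the minimum threshold
    if avg_qlen > max_th:
        return "drop"              # ③ above the maximum threshold
    # ② between the thresholds: drop probability rises linearly from 0
    # at min_th to p_max at max_th (the classic RED ramp).
    p = p_max * (avg_qlen - min_th) / (max_th - min_th)
    return "drop" if random.random() < p else "enqueue"

print(red_decision(avg_qlen=2, min_th=5, max_th=10, p_max=0.1))   # enqueue
print(red_decision(avg_qlen=12, min_th=5, max_th=10, p_max=0.1))  # drop
```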

       The key to RED lies in choosing the minimum threshold, the maximum threshold, and the drop probability, and in computing the average queue length. The minimum threshold must be large enough to keep the utilization of the router's output link high. The gap between the maximum and minimum thresholds should be large enough that the queue's normal growth during one TCP round-trip time RTT still stays within the maximum threshold. Experience suggests making the maximum threshold twice the minimum threshold.

  The average queue length is computed as an exponentially weighted moving average, the same strategy TCP uses to estimate the round-trip time RTT.
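That weighted average is one line of arithmetic. The sketch below is illustrative; the default weight of 0.002 is a commonly cited small value, not a mandated constant:

```python
def update_avg_qlen(avg, sample, weight=0.002):
    """Exponentially weighted moving average of the queue length, as RED
    uses (same form as TCP's smoothed RTT estimate): a small weight makes
    the average respond slowly, filtering out short bursts."""
    return (1 - weight) * avg + weight * sample

# With weight 0.5 the average of 10 and a new sample of 20 lands midway.
print(update_avg_qlen(10, 20, weight=0.5))  # 15.0
```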
