[TCP/IP] Ethernet flow control: pause flow control

Keywords

Ethernet data link layer PAUSE frame flow control.


I recently tracked down a product problem caused by pause flow control and studied the mechanism in detail along the way. Since there is very little information about pause flow control on the Internet, this post summarizes what I learned about it for your reference.


1. Ethernet flow control

Ethernet flow control comes in two types: flow control in half-duplex mode, which generally uses back-pressure techniques, and flow control in full-duplex mode, which is the pause flow control discussed below.

2. The principle and implementation of pause flow control

1. Pause flow control principle

The IEEE 802.3x standard defines a full-duplex flow control framework for the MAC control sublayer of the data link layer; the original IEEE standard document has the details. Pause flow control is therefore handled by the MAC control sublayer.

PAUSE flow control is a simple stop-and-wait mechanism. When the local end receives too much data and is under heavy processing pressure, it can send a pause message (with a non-zero time, usually 0xFFFF) to the peer, asking the peer to pause sending data to the local end. When the pressure on the local end is relieved, it sends another pause message (with time 0) so that the peer can resume sending data to the local end.

Conversely, when the peer device is congested, the local port usually receives multiple PAUSE frames in a row. As long as the congestion on the peer device is not cleared, the peer's port keeps sending PAUSE frames.

2. Pause message format

The pause message is a MAC-layer control frame, which makes it hard to capture when locating a problem. An ordinary network card cannot capture it, and connecting through various switches and using port mirroring did not capture it either; in the end it was captured with a dedicated network test instrument. The other approach is to craft the messages yourself and send them from the network card, in which case they can be captured easily.

The pause message captured by Wireshark:
[Figure: PAUSE frame captured in Wireshark]
The pause frame written by the Kelai packet generator:
[Figure: PAUSE frame built in the Kelai packet generator]
The definition of each field of the PAUSE frame is as follows:
Destination address : The protocol specifies that the destination address of a PAUSE frame is the reserved multicast address 01-80-C2-00-00-01.

Source Address : The 48-bit MAC address of the port sending PAUSE frames.

Protocol type : 0x8808, MAC control frame.

Opcode : Always 0x0001.

Parameter - Pause Time : A 2-byte pause time parameter. It is the length of time for which the PAUSE sender asks the other side to stop sending data frames. It is usually 0xFFFF, although some implementations use other values. The unit of measurement is the time it takes to transmit 512 bits at the current transmission rate, so the actual pause time of the receiver is the value of this field multiplied by the time needed to transmit 512 bits at the current rate.
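The field layout above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the padding to 60 bytes (the minimum Ethernet frame size without the 4-byte FCS), and the example source MAC are assumptions made for the example.

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved multicast destination
MAC_CONTROL_ETHERTYPE = 0x8808             # MAC control frame
PAUSE_OPCODE = 0x0001                      # PAUSE
MIN_FRAME_NO_FCS = 60                      # assumed padding target (FCS is added by hardware)


def build_pause_frame(src_mac: bytes, pause_quanta: int) -> bytes:
    """Return a PAUSE frame without FCS, zero-padded to the minimum size."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= pause_quanta <= 0xFFFF:
        raise ValueError("pause_quanta must fit in 16 bits")
    header = PAUSE_DST + src_mac + struct.pack(
        "!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, pause_quanta
    )
    return header.ljust(MIN_FRAME_NO_FCS, b"\x00")


def parse_pause_quanta(frame: bytes):
    """Return the pause time in quanta, or None if this is not a PAUSE frame."""
    if len(frame) < 18:
        return None
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    return quanta


def pause_seconds(pause_quanta: int, link_bps: float) -> float:
    """One quantum is the time needed to send 512 bits at the link rate."""
    return pause_quanta * 512 / link_bps
```

With these definitions, a pause time of 0xFFFF corresponds to roughly 335.5 ms on a 100 Mbit/s link and roughly 33.6 ms on a 1 Gbit/s link, which is why a congested port has to keep refreshing its PAUSE frames if it needs the peer to stay quiet for longer.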

3. Pause flow control processing logic

The processing logic of pause flow control on the switch is as follows:
[Figure: pause flow control processing logic on the switch]
Port-level flow control:
On the switch, each port has its own rx buffer:
when the data in the port's rx buffer rises above the high threshold XOFF, pause flow control is triggered and a pause message with the time set to 0xFFFF is sent to the peer.
When the messages in the port's rx buffer have been processed and the fill level drops below the low threshold XON, pause flow control is released and a pause message with the time set to 0x0000 is sent to the peer.

Flow control of the whole machine:
The rx buffers of all ports are carved out of one large shared buffer:
when the total rx buffer usage of all ports rises above the configured high threshold XOFF, whole-machine flow control is triggered and the ports send pause messages;
when the total rx buffer usage of all ports drops below the configured low threshold XON after processing, whole-machine flow control is released.
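The two-threshold logic can be sketched as a small state machine. This is a hypothetical model, not any vendor's implementation: the class and function names are made up for illustration, the thresholds are called `high_water` (trigger) and `low_water` (release) to keep the roles unambiguous, and fill levels are plain integers.

```python
PAUSE_MAX = 0xFFFF     # pause time sent when flow control is triggered
PAUSE_RESUME = 0x0000  # pause time sent when flow control is released


class WatermarkFlowControl:
    """Two-threshold PAUSE decision for one buffer (hypothetical model)."""

    def __init__(self, high_water: int, low_water: int):
        if low_water >= high_water:
            raise ValueError("low_water must be below high_water")
        self.high_water = high_water
        self.low_water = low_water
        self.paused = False  # True while the peer has been asked to stop

    def update(self, fill: int):
        """Return the pause time to send for this fill level, or None."""
        if not self.paused and fill > self.high_water:
            self.paused = True
            return PAUSE_MAX
        if self.paused and fill < self.low_water:
            self.paused = False
            return PAUSE_RESUME
        return None  # between the thresholds nothing changes (hysteresis)


def evaluate(ports, whole, fills):
    """Return the pause time to send (or None) per port and for the whole machine."""
    events = {name: ports[name].update(fills[name]) for name in ports}
    events["whole"] = whole.update(sum(fills.values()))
    return events
```

For example, with two ports whose thresholds are 80 and 40 and a whole-machine controller with thresholds 120 and 60, fill levels of 70 and 70 trigger nothing at port level but do trigger whole-machine flow control, because the total of 140 exceeds 120. The gap between the two thresholds is what keeps the port from toggling on every packet.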

Many switches do not coordinate port-level and whole-machine flow control well, or simply block an entire LAN port when that port keeps receiving pause frames. The behavior depends heavily on the total buffer size, the port buffer size, and the XON and XOFF thresholds of both the total buffer and the port buffers, so it takes a long time and repeated tuning to achieve good results.

4. Implementation of pause flow control in the chip

As shown in the figure, the left side is the local chip and the right side is the peer chip.
[Figure: pause flow control between the local chip (module A, mac0) and the peer chip (mac1)]

Both mac0 and mac1 contain a transmit side (tx) and a receive side (rx). In the left chip, a flow control signal fc_rdy runs between module A, which sits upstream of the MAC, and the transmit side of mac0. When the signal is high, module A cannot process its input data in time and flow control is needed. To keep the focus on the key points, modules such as the PCS and SerDes are omitted from the figure.

The specific process is as follows:

Steps 1~2: The peer mac1 sends data to the rx side of mac0, which passes it on to module A.

Step 3: Module A cannot keep up with the incoming data and needs the input rate reduced, so it pulls fc_rdy high.

Step 4: The tx side of mac0 sees that fc_rdy is high, generates a pause frame, and sends it to the rx side of mac1. As long as fc_rdy stays high, the tx side of mac0 sends a pause frame at regular intervals; the interval is set by a configuration register and measured by a counter. The pause time carried in the pause frame is set by another configuration register. As long as fc_rdy is high, the tx side of mac0 does not send normal data.

Step 5: When the rx side of mac1 receives the pause message, it extracts the pause time contained in the pause frame and makes the tx side of mac1 stop sending data.

Steps 6~8: After mac1 stops sending data, module A finishes processing the data it already has and pulls fc_rdy low, indicating that mac1 can resume sending data.

Step 9: There are two cases.

Case 1: fc_rdy is pulled low before the counter has reached a full interval. A pause frame is sent at this point, but with a pause time of 0. When mac1 receives it, its tx side starts sending data again immediately.

Case 2: fc_rdy is pulled low just as the counter reaches a full interval, and no pause frame is sent. Once the pause time of the previous pause frame expires, the tx side of mac1 resumes sending data.

The protocol mandates the following for pause frame handling:

Generating and sending a pause frame must not interrupt a complete data packet. That is, in step 4, after fc_rdy is pulled high, the tx side of mac0 first checks whether a normal data packet is currently being transmitted. If so, the pause frame can only be sent after that packet has finished. In other words, pause frames can only be inserted in the gaps between complete data packets.

The pause time of a new pause frame overwrites the previous pause time. When mac1 receives a new pause frame, its pause timer is reset to the time carried in the latest frame.
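The behaviour described in the steps and rules above can be modelled with two small classes. This is a simplified sketch under stated assumptions, not the chip's actual logic: time advances in abstract ticks, the class and attribute names are hypothetical, and on the falling edge of fc_rdy the sender always queues a zero-time frame (Case 1); Case 2 is covered only by the receiver's timer running out.

```python
class PauseSender:
    """mac0 tx side: emits PAUSE frames while fc_rdy is high (hypothetical model)."""

    def __init__(self, interval: int, pause_time: int):
        self.interval = interval      # ticks between refresh frames (configuration register)
        self.pause_time = pause_time  # pause time put in each frame (another register)
        self.counter = 0
        self.active = False           # True while fc_rdy has been seen high
        self.pending = None           # pause time waiting for a gap between packets

    def tick(self, fc_rdy: bool, data_frame_in_flight: bool):
        """Return the pause time of the frame sent this tick, or None."""
        if fc_rdy:
            if not self.active:                  # rising edge: pause right away
                self.active = True
                self.counter = 0
                self.pending = self.pause_time
            else:
                self.counter += 1
                if self.counter >= self.interval:  # periodic refresh
                    self.counter = 0
                    self.pending = self.pause_time
        elif self.active:                        # falling edge: release the peer
            self.active = False
            self.pending = 0
        # A PAUSE frame may only go out in the gap between complete packets.
        if self.pending is not None and not data_frame_in_flight:
            frame, self.pending = self.pending, None
            return frame
        return None


class PauseReceiver:
    """mac1 side: a pause timer that the newest PAUSE frame overwrites."""

    def __init__(self):
        self.remaining = 0  # ticks left before tx may resume

    def on_pause_frame(self, pause_time: int):
        self.remaining = pause_time  # overwrite, never accumulate; 0 resumes at once

    def tick(self):
        if self.remaining > 0:
            self.remaining -= 1

    def may_transmit(self) -> bool:
        return self.remaining == 0
```

Feeding the sender a rising fc_rdy while a data packet is still on the wire shows the first rule: the pause frame is held back for one tick and sent as soon as the packet ends. On the receiver, a second frame with a shorter time replaces a longer running timer instead of adding to it.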

3. The role and side effects of pause flow control

Although the protocol defines the pause flow control mechanism, the mechanism is not perfect. It is useful, but it also has serious side effects. How well it works depends on the vendor's implementation and tuning.

1. The role of pause flow control

Through simple stop-and-wait handling, pause flow control avoids the data loss that rx buffer overflow would cause during traffic peaks.
So its role is to prevent data loss from momentary overload. In two words: peak shaving. It clips traffic spikes so that data can be transmitted smoothly.

2. Side effects of pause flow control

Pause flow control is well-intentioned, but in practice it has serious side effects, mainly when traffic stays above capacity for a long time.
The first is surging: when the local end cannot cope with the load, it triggers pause flow control so that the peer stops sending data to it; once the pressure is relieved, it tells the peer to resume; but because the volume of data is large, the local end is overwhelmed again almost immediately and triggers pause flow control once more, and the cycle repeats. The network appears to stutter, sometimes good and sometimes bad.
The second is contagion: when a node stays in the flow control state for a long time, its neighbours cannot send data to it, so data backs up in the neighbours and they too enter long-lasting flow control. The condition spreads from one node to ten and from ten to a hundred, until it reaches the entire network and paralyzes it.
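The contagion effect can be reproduced with a toy simulation. The sketch below is purely illustrative: the chain topology, one packet per link per tick, the thresholds of 8 and 4 packets, and a last node that drains only one packet every `sink_period` ticks are all assumptions chosen to make the effect visible.

```python
def simulate_chain(num_nodes=3, ticks=200, high_water=8, low_water=4, sink_period=4):
    """Return the first tick at which each node asserts pause (None if never).

    Node 0 is fed by a traffic source; the last node is the slow one.
    """
    fill = [0] * num_nodes
    pausing = [False] * num_nodes  # node i is asking its upstream neighbour to stop
    first_pause = [None] * num_nodes
    for t in range(ticks):
        # The last node drains slowly: one packet every sink_period ticks.
        if t % sink_period == 0 and fill[-1] > 0:
            fill[-1] -= 1
        # Forward one packet per tick on each link, downstream links first,
        # unless the receiving node is currently asserting pause.
        for i in range(num_nodes - 2, -1, -1):
            if fill[i] > 0 and not pausing[i + 1]:
                fill[i] -= 1
                fill[i + 1] += 1
        # The source offers one packet per tick to node 0 unless node 0 pauses it.
        if not pausing[0]:
            fill[0] += 1
        # Update every node's pause state with two-threshold hysteresis.
        for i in range(num_nodes):
            if not pausing[i] and fill[i] > high_water:
                pausing[i] = True
                if first_pause[i] is None:
                    first_pause[i] = t
            elif pausing[i] and fill[i] < low_water:
                pausing[i] = False
    return first_pause
```

With the default parameters the slow last node pauses first, then the node in front of it, then node 0, which finally pauses the source: the congestion walks upstream hop by hop. With `sink_period=1` the last node keeps up and no node ever asserts pause.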

Therefore, pause flow control carries considerable risk if the vendor has not tuned it. Some vendors ship their switches with it disabled by default.

4. Analysis of the impact of pause flow control on performance

1. Performance Impact

How does enabling or disabling pause flow control affect performance?
Take a 100 Mbit/s link as an example, and assume that the real processing capacity of side A is 80 Mbit/s and that of side B is 90 Mbit/s. The impact on performance can then be analyzed in detail:

  1. With flow control disabled on side A, the maximum throughput is 80 Mbit/s. Below 80 Mbit/s there is no impact. Once traffic exceeds 80 Mbit/s, however, performance collapses: side A cannot keep up, so data is lost; lost data must be retransmitted, and retransmission consumes bandwidth of its own, creating a vicious circle and a sharp drop in performance.
  2. With flow control enabled on side A, the real maximum throughput may be about 78 Mbit/s, slightly lower than with it disabled, because pause flow control is triggered not when the rx buffer is full but at a threshold (such as 80%). When traffic rises above 78 Mbit/s, say to 85 Mbit/s, the real throughput is still 78 Mbit/s, but thanks to flow control no data is lost (it is only held temporarily at layer 2 on side B), so all packets are eventually delivered correctly. Seen from outside, the peak processing capacity appears to reach 85 Mbit/s or even 90 Mbit/s (at the cost of a longer time to receive all messages). This is the peak-shaving ability in action.
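The trade of time for loss in the second case can be put into numbers. The helper below is a back-of-the-envelope sketch: it assumes the figures are in Mbit/s, that the burst is lossless thanks to flow control, and that the sender has enough buffer to hold the backlog; the function name is made up for illustration.

```python
def burst_delivery(offered_mbps: float, capacity_mbps: float, burst_seconds: float):
    """Return (peak backlog in Mbit held at the sender, total delivery time in s)."""
    if offered_mbps <= capacity_mbps:
        # The receiver keeps up: nothing is held back and nothing is delayed.
        return 0.0, float(burst_seconds)
    # During the burst the sender accumulates the difference between the rates.
    backlog = (offered_mbps - capacity_mbps) * burst_seconds
    # All of the data still has to pass through the receiver at its real capacity.
    total_time = offered_mbps * burst_seconds / capacity_mbps
    return backlog, total_time
```

For a one-second burst at 85 Mbit/s into a side that really handles 78 Mbit/s, `burst_delivery(85, 78, 1.0)` gives a backlog of 7 Mbit on the sending side and a total delivery time of about 1.09 s instead of 1 s. Nothing is lost; the burst simply takes about 9% longer to arrive.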

The analysis above shows that pause flow control trades time and buffer space on the peer for buffer space on the local end, so that no data is lost.

Therefore, enabling or disabling flow control does not materially change the real maximum throughput; what it provides is smooth data transmission through peak shaving.

2. Risk assessment

With pause flow control enabled:
Benefits: It shaves momentary traffic peaks and keeps data transmission smooth.
Disadvantages: When traffic exceeds the processing capacity for a long time, it causes surging and contagion, which may paralyze the entire network.

With pause flow control disabled:
Benefits: There is no risk of paralyzing the entire network.
Disadvantages: The ability to shave traffic peaks is lost. Momentary traffic spikes can make transmission unstable, so network performance fluctuates between good and bad.

Summary:
For a given product, whether to enable pause flow control should be decided based on its own performance figures, the performance figures of the peer product, and the application scenario it is deployed in (one option is to turn off sending pause frames while still responding to received ones).
Therefore, it is best to provide a switch that lets pause flow control be controlled manually, to suit different application scenarios.

———————————————
Reference: Evan_ZGYF, "[TCP/IP detailed explanation] [pause] Ethernet (PAUSE) flow control principle"

Reprinted from: blog.csdn.net/sunjice/article/details/115532563