Solving the sticky packet problem in TCP network transmission

The TCP/IP protocol suite establishes the conceptual model for communication protocols on the Internet. Its two core protocols are TCP and IP. TCP guarantees the reliability and ordering of data segments, so with a reliable transport layer protocol in place, application layer protocols can use TCP to transmit data directly without worrying about segments being lost or duplicated.

Figure 1 - TCP Protocol and Application Layer Protocol

The IP protocol solves the routing and forwarding of packets, so the upper-layer TCP protocol no longer needs to pay attention to routing and addressing; the TCP protocol solves the reliability and ordering of transmission, so the upper layer does not need to care whether the data actually reaches the target process: once data is written into the TCP buffer, the protocol stack can almost guarantee its delivery.

When an application layer protocol transmits data over TCP, the TCP protocol may split the application layer data into multiple segments and send them in sequence, and a single segment received by the other side may contain data from multiple "application layer packets". So when the application layer reads data from the TCP buffer and finds messages stuck together, it has to split the received data itself.

Sticky packets are not caused by the TCP protocol. They occur because application layer protocol designers misunderstand the TCP protocol, ignore its definition, and lack experience in designing application layer protocols. Starting from the TCP protocol and the application layer protocol, this article analyzes how the sticky packets we so often mention come about:

  • The TCP protocol is a byte stream-oriented protocol, which may combine or split the data of the application layer protocol;
  • The application layer protocol does not define message boundaries, so the receiver of the data cannot reassemble the messages;

Many people may think sticky packets are a rather low-level problem not worth discussing, but in the author's opinion the problem is still very interesting. Not everyone has systematically studied the design of TCP-based application layer protocols, and not everyone has a deep understanding of the TCP protocol. For many people the process of learning programming is bottom-up, so the author believes this is a question worth answering. We should pass on correct knowledge, not a negative and condescending attitude.


Byte stream

TCP is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. Data handed to TCP by the application layer is not transmitted to the destination host in units of messages; in some cases, separate writes are combined into a single segment and sent to the target host together.

Nagle's algorithm improves TCP transmission performance by reducing the number of packets sent. Because network bandwidth is limited, it does not send small blocks of data to the destination host immediately; instead, it waits for more data to accumulate in the local buffer. Although this batching strategy hurts real-time behavior and increases latency, it reduces the likelihood of network congestion and lowers per-packet overhead.

In the early days of the Internet, Telnet was a widely used application. However, Telnet generates a large number of packets carrying only a 1-byte payload, and each such packet incurs about 40 bytes of header overhead, so bandwidth utilization is only ~2.44%. Nagle's algorithm was designed for exactly this scenario.

When the application layer protocol transmits data through TCP, the data to be sent is actually written into the TCP buffer first. If Nagle's algorithm is enabled, TCP may not send the written data immediately; it will hold the buffered data until it exceeds the maximum segment size (MSS) or the previous segment has been ACKed.

Figure 2 - Nagle's algorithm

Network congestion was still a frequent problem decades ago, but today's bandwidth resources are no longer as tight as they used to be. Nagle's algorithm remains enabled by default in the Linux kernel, so latency-sensitive applications usually disable it explicitly by setting the TCP_NODELAY socket option:

TCP_NODELAY = 1
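
As a concrete illustration, here is a minimal sketch of how an application can turn Nagle's algorithm off; the helper name disable_nagle is our own, but setsockopt, IPPROTO_TCP, and TCP_NODELAY are the standard POSIX/Linux interfaces:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm on a connected socket so that small
 * writes are sent immediately instead of being held in the buffer. */
static int disable_nagle(int fd)
{
	int one = 1;
	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}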

The tcp_nagle_test function shown below is used in the Linux kernel to decide whether the current TCP segment should be sent immediately. Interested readers can use this code as an entry point to learn how the Nagle algorithm is implemented today:

static inline bool tcp_nagle_test(const struct tcp_sock *tp, const struct sk_buff *skb,
				  unsigned int cur_mss, int nonagle)
{
	/* The caller asked for an immediate push: bypass Nagle entirely. */
	if (nonagle & TCP_NAGLE_PUSH)
		return true;

	/* Urgent data and the final FIN are never delayed by Nagle. */
	if (tcp_urg_mode(tp) || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN))
		return true;

	/* Otherwise ask tcp_nagle_check() whether this (possibly sub-MSS)
	 * segment may be sent now. */
	if (!tcp_nagle_check(skb->len < cur_mss, tp, nonagle))
		return true;

	return false;
}

Nagle's algorithm can indeed improve bandwidth utilization and reduce the overhead of TCP and IP headers when packets are small, but it may also cause the data written by the application layer protocol to be merged or split before being sent. When the receiver reads data from the TCP stack, it may find unrelated data inside the same segment, and without framing rules the application layer protocol has no way to split and reassemble it.

In addition to Nagle's algorithm, the TCP stack has another option for delaying data transmission, TCP_CORK. If we enable this option, then as long as the data to be sent is smaller than the MSS, TCP delays sending it for up to 200 ms or until the buffered data exceeds the MSS.
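
A minimal sketch of using TCP_CORK (a Linux-specific option) follows. A typical pattern is to cork the socket, issue several small writes, then uncork so the data leaves the host as full segments; set_cork is an illustrative helper, not a standard API:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* While corked, the kernel holds back partial segments and only
 * sends full ones; uncorking flushes whatever is still buffered. */
static int set_cork(int fd, int on)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Usage sketch:
 *   set_cork(fd, 1);
 *   write(fd, header, header_len);
 *   write(fd, body, body_len);
 *   set_cork(fd, 0);   // flush the remaining partial segment
 */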

Whether it is Nagle's algorithm or TCP_CORK, both improve bandwidth utilization by delaying data transmission, and both may split or reassemble the data written by the application layer protocol. The most important reason such mechanisms and options can exist at all is that TCP is a byte-stream-based protocol: it has no concept of packets itself and does not send data packet by packet.

Message boundary

If we have systematically studied the TCP protocol and the design of TCP-based application layer protocols, then designing an application layer protocol whose data can be arbitrarily split and reassembled by the TCP stack poses no problem. Since TCP is based on byte streams, this means the application layer protocol has to delimit message boundaries by itself.

If the application layer protocol defines message boundaries, then no matter how TCP splits and reassembles its packets, the receiver can recover the corresponding messages according to the rules of the protocol. The two most common solutions in application layer protocols are length-based framing and delimiter-based framing (using a terminator).

Figure 3 - Approach to implementing message boundaries

There are two length-based approaches. One uses a fixed length: all application layer messages have a uniform size. The other uses a variable length, which requires a length field in the application layer protocol header so that the receiver can separate the individual messages from the byte stream. The message boundary of the HTTP protocol is implemented based on length:

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 138
...
Connection: close

<html>
  <head>
    <title>An Example Page</title>
  </head>
  <body>
    <p>Hello World, this is a very simple HTML document.</p>
  </body>
</html>

In the above HTTP message, the Content-Length header indicates the size of the HTTP message payload. Once the application layer protocol has parsed enough bytes, it can separate a complete HTTP message from the byte stream. No matter how the sender splits the corresponding packets, the receiver can follow this rule to reassemble the HTTP message.
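
The same idea applies to custom binary protocols. Below is a minimal sketch of length-based framing that assumes a 4-byte big-endian length prefix in front of every message; read_full and recv_msg are hypothetical helpers written for this article, not part of any standard API:

#include <arpa/inet.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly n bytes, looping over short reads from the stream. */
static int read_full(int fd, void *buf, size_t n)
{
	char *p = buf;
	while (n > 0) {
		ssize_t r = read(fd, p, n);
		if (r <= 0)
			return -1;	/* error or peer closed the connection */
		p += r;
		n -= r;
	}
	return 0;
}

/* Receive one message framed as: 4-byte big-endian length, then payload. */
static ssize_t recv_msg(int fd, char *buf, size_t cap)
{
	uint32_t len;
	if (read_full(fd, &len, sizeof(len)) < 0)
		return -1;
	len = ntohl(len);
	if (len > cap)
		return -1;	/* message larger than the caller's buffer */
	if (read_full(fd, buf, len) < 0)
		return -1;
	return (ssize_t)len;
}

Because read_full keeps reading until the requested byte count arrives, it does not matter how TCP split or merged the sender's writes; the length prefix alone decides where each message ends.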

However, besides the length-based method, the HTTP protocol also uses a terminator-based strategy. When HTTP uses the chunked transfer mechanism, the header no longer contains Content-Length; instead, a chunk with a payload size of 0 serves as the terminator that marks the message boundary.
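
For reference, a chunked response looks roughly like this: each chunk is preceded by its size in hexadecimal, every line on the wire is terminated by CRLF, and the zero-sized chunk at the end is the terminator that marks the message boundary:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

7
Mozilla
9
Developer
0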

Of course, besides these two methods, we can also delimit messages based on protocol-specific rules. For example, when sending JSON data over TCP, the receiver can judge whether a message is complete by checking whether the received data parses as valid JSON.
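
As a final sketch of delimiter-based framing, the receiver can keep an accumulation buffer and cut one message off every time the terminator (here a newline) appears; next_message is an illustrative helper, not a standard API:

#include <string.h>
#include <sys/types.h>

/* Extract the next newline-terminated message from the accumulation
 * buffer buf (holding *n bytes) into out. Returns the message length,
 * or -1 if no complete message has arrived yet. */
static ssize_t next_message(char *buf, size_t *n, char *out, size_t cap)
{
	char *nl = memchr(buf, '\n', *n);
	size_t len;

	if (nl == NULL)
		return -1;	/* boundary not seen yet: keep buffering */
	len = (size_t)(nl - buf);
	if (len >= cap)
		return -1;	/* message too large for the caller's buffer */
	memcpy(out, buf, len);
	out[len] = '\0';
	*n -= len + 1;
	memmove(buf, nl + 1, *n);	/* drop the consumed message and delimiter */
	return (ssize_t)len;
}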

Summary

The TCP sticky packet problem is caused by incorrect design on the part of application layer protocol developers: they ignore the core mechanism of TCP data transmission. TCP is based on byte streams and itself contains no concept of messages or packets; all data transmission is streaming, so the application layer protocol has to design its own message boundaries, that is, message framing (Message Framing). Let's review the core reasons for the sticky packet problem:

  1. The TCP protocol is a transport layer protocol based on byte streams, in which there is no concept of messages and data packets;
  2. Application layer protocols do not use length-based or delimiter-based message boundaries, so multiple messages get stuck together;

Learning network protocols is very interesting, and constantly thinking about the problems behind them gives us a deeper understanding of the definitions. Finally, here are some related open questions; interested readers can think them over carefully:

  • How should an application layer protocol based on UDP be designed? Will it have a sticky packet problem?
  • Which application layer protocols use length-based framing, and which use delimiter-based framing?
