The differences between TCP and UDP explained

This section explains the differences between TCP and UDP in detail. It is part of my column "Computer Network Protocol Quick Start Tutorial".

Preface

When the early protocols were designed, the transport layer, in whichever reference model you pick, was positioned as the layer that controls data transmission. In a cleanly layered model, comparing TCP and UDP therefore comes down to comparing transport-layer capabilities: reliability, whether a connection exists, whether ordering is preserved, and so on.

As technology has evolved, control of data transmission is no longer confined to the transport layer. Google built the QUIC protocol on top of UDP, and QUIC guarantees reliable delivery. So even though UDP itself is unreliable, reliability can still be provided above the transport layer. This puzzles many people: if TCP is more reliable than UDP, how can UDP-based QUIC be reliable? The answer is that the usual TCP versus UDP comparison is about the transport layer itself, and its scope has to be limited to that layer.

This also shows that technical evolution keeps breaking established concepts. If TCP and UDP were not so widely deployed, and therefore so hard to replace outright, the change might already have been more radical and thorough. In addition, few public application protocols existed in the early days, and some applications were built directly on TCP or UDP without defining an application-layer protocol of their own. This is why the transport-layer comparison receives so much attention. This article still discusses the differences between TCP and UDP within the scope of the transport layer.

In interviews, the most common computer networking question is to briefly describe the differences between TCP and UDP. Typical answers include:

  • TCP is connection-oriented, UDP is connectionless
  • TCP is reliable, UDP is unreliable
  • TCP is stream-based, UDP is packet-based
  • TCP is stateful, UDP is stateless
  • TCP is ordered, UDP is unordered

This article starts from the header fields of TCP and UDP and examines each of these differences in more depth.

Connection-oriented vs connectionless

TCP is connection-oriented. Concretely, before any application data is exchanged, a three-way handshake is required to confirm that both parties are ready to communicate, as shown in Figure 1:
Figure 1

As Figure 1 shows, the two parties exchange their state through the SYN flag and the ACK that acknowledges the SYN segment. Subsequent data exchange can proceed normally only after the three-way handshake completes.

In addition, the three-way handshake in Figure 1 also exchanges information such as the window size and the MSS. The window size tells the peer how much data the sender of the segment can currently accept. The MSS (maximum segment size) announces the largest amount of data the host is willing to receive in a single TCP segment. With this information, each side quickly learns the other's state, which determines whether data can be sent and how much. For example, when the advertised window is 0, the sender should not transmit data and should wait for a window update before continuing.
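To make the handshake fields concrete, here is a minimal Python sketch that pulls the flags, window, and MSS option out of a raw TCP header. The function name `parse_tcp_header` and the sample SYN values (ports, sequence number, an MSS of 1460) are made up for illustration. The layout assumed is the standard 20-byte header followed by options.

```python
import struct

def parse_tcp_header(raw: bytes) -> dict:
    """Extract a few fields from a raw TCP header (illustrative helper)."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    header_len = (off_flags >> 12) * 4   # data offset is counted in 32-bit words
    info = {
        "syn": bool(off_flags & 0x02),
        "ack": bool(off_flags & 0x10),
        "window": window,
        "header_len": header_len,
        "mss": None,
    }
    options = raw[20:header_len]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                    # end of option list
            break
        if kind == 1:                    # no-operation padding, one byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                        # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                        # malformed option, stop parsing
        if kind == 2 and length == 4:    # MSS option
            info["mss"] = struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return info

# A hand-built SYN segment (hypothetical values): 24-byte header, so the
# data offset is 6, SYN flag set, window 65535, MSS option of 1460 bytes.
syn = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (6 << 12) | 0x02, 65535, 0, 0)
syn += bytes([2, 4]) + struct.pack("!H", 1460)
info = parse_tcp_header(syn)
print(info)
```

The window field is present in every TCP segment, while the MSS option is carried only in segments with the SYN flag set, which is why it is parsed from the options area here.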

Since a connection is established, it must also be torn down. Figure 2 shows the TCP disconnection process:
Figure 2

As the figure shows, when a side no longer needs to send data on the stream, it can initiate the disconnection. Once both parties agree, each releases the state it holds for the connection.

TCP's connection-oriented nature shows in the fact that a connection must be established and later released. The UDP header has no fields for establishing a connection, so UDP communication is connectionless. Establishing and releasing a connection amounts to the two parties negotiating communication parameters, and to some extent the connection also reflects TCP's reliability.
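The contrast can be seen from the socket API. The following is a minimal sketch using Python's standard `socket` module on the loopback interface. The function names `tcp_roundtrip` and `udp_roundtrip` are illustrative, and it assumes loopback networking is available.

```python
import socket

def tcp_roundtrip(payload: bytes) -> bytes:
    """Send payload over TCP: listen, connect, accept, then exchange data."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
        server.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(server.getsockname())  # three-way handshake happens here
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)   # sends FIN to start the close
                data = b""
                while chunk := conn.recv(4096):   # empty bytes means the peer closed
                    data += chunk
                return data

def udp_roundtrip(payload: bytes) -> bytes:
    """Send payload over UDP: no listen, connect, or accept is needed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)           # avoid blocking forever if it is lost
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(4096)
        return data

print(tcp_roundtrip(b"hello over tcp"))
print(udp_roundtrip(b"hello over udp"))
```

The TCP path cannot send anything until `connect` succeeds, whereas the UDP sender simply addresses a datagram to the receiver and hopes it arrives.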

Reliable vs unreliable

TCP transmission is reliable and UDP transmission is unreliable. This can be seen in how each protocol handles the following situations:

  • Packets are lost in the network
  • Packets arrive out of order
  • Packets are retransmitted and arrive more than once
  • The receiver's processing capacity is limited
  • The other party suddenly goes offline

1. TCP uses the acknowledgment number to tell the sender which data the receiver has received. When a packet is lost, the receiver does not acknowledge it, which triggers TCP's timeout retransmission so that the lost data is sent again. UDP has no mechanism to ensure that packets reach the receiver, so UDP is unreliable.

2. When multiple packets arrive, the receiver uses the TCP sequence numbers to reassemble the data in the correct order, so the data is never delivered out of order. UDP can only hand data to the application layer in the order it was received, which is not always the order in which it was sent. Ordering can therefore be treated as one aspect of reliability. UDP lacks this feature, so UDP is unreliable.

3. When the receiver gets the same packet twice, TCP uses the sequence number to discard the retransmitted copy. UDP has no such mechanism, so UDP is unreliable.

4. The receiver's processing capacity is limited. TCP tells the sender how much the receiver can currently accept through the window field and the window scale option, and the sender limits the amount it sends accordingly. UDP has no mechanism to ensure that the data sent does not exceed what the receiver can accept, so UDP is unreliable.

5. When the receiver suddenly goes offline, TCP treats it the same way as packet loss. If no acknowledgment arrives, the sender retransmits, and once the retransmission limit is reached the connection is reset and closed. UDP cannot detect the peer's state, and the sender simply keeps transmitting, so UDP is unreliable.
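The mechanisms above can be sketched as a small simulation. This is not how a real TCP stack is implemented. It is a simplified stop-and-wait model, and the names `send_reliably` and `receive`, the loss rate, and the retry limit are all assumptions chosen for illustration.

```python
import random

def send_reliably(chunks, loss_rate, rng, max_retries=10):
    """Stop-and-wait sender: retransmit each chunk until it gets through.

    A transmission counts as lost when rng.random() < loss_rate, which
    stands in for either the segment or its acknowledgment being dropped.
    """
    delivered = []
    transmissions = 0
    seq = 0
    for chunk in chunks:
        for _ in range(max_retries):
            transmissions += 1
            if rng.random() >= loss_rate:      # acknowledged, move on
                delivered.append((seq, chunk))
                break
        else:
            # Retry limit reached: give up, as TCP does when the peer is gone.
            raise ConnectionError("no acknowledgment, connection aborted")
        seq += len(chunk)                      # sequence numbers count bytes
    return delivered, transmissions

def receive(segments):
    """Reassemble a byte stream from (seq, data) pairs that may arrive
    out of order or more than once, using the sequence numbers."""
    expected = 0
    pending = {}
    stream = b""
    for seq, data in segments:
        if seq < expected or seq in pending:
            continue                           # duplicate, discard it
        pending[seq] = data
        while expected in pending:             # deliver any in-order run
            chunk = pending.pop(expected)
            stream += chunk
            expected += len(chunk)
    return stream

chunks = [b"hello", b"world", b"!"]
delivered, transmissions = send_reliably(chunks, 0.5, random.Random(1))
print(transmissions, receive(delivered))

# Out-of-order arrival plus a duplicate of the first segment.
shuffled = [(5, b"world"), (0, b"hello"), (0, b"hello"), (10, b"!")]
print(receive(shuffled))
```

With a loss rate of 1.0 the sender exhausts its retries and raises `ConnectionError`, which mirrors case 5. A UDP sender has no equivalent of either function and would simply emit each chunk once.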

Note that the reliability of TCP and UDP refers to whether the transport layer itself is reliable, that is, whether this layer can provide a reliable service to the layer above it. As mentioned at the beginning of the article, although UDP itself is unreliable, services built on UDP can implement reliable transmission in a layer above UDP. For example, QUIC is based on UDP, and the reliability of its transmission is implemented by QUIC itself.

Stream-based vs packet-based

It is generally said that UDP is packet-based while TCP is stream-based. The UDP header has a length field that states the size of the datagram, as shown in Figure 3:
Figure 3
This means that after receiving a datagram, the receiver hands it to the application layer as a whole, according to that length. Likewise, on the sending side, UDP takes whatever data the application layer hands over in one call, wraps it in a single datagram, and sends it. If the datagram exceeds the MTU of the underlying link, it is fragmented at the IP layer. For this reason, applications that use UDP often cap the size of each send, for example at 512 bytes, and make multiple UDP calls.
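As a rough illustration of that pattern, the sketch below splits application data into pieces of at most 512 bytes and sends each piece as its own datagram over the loopback interface. The helper names and the 2048-byte receive size are assumptions, and the example relies on loopback delivery, which UDP itself does not guarantee.

```python
import socket

MAX_PAYLOAD = 512  # application-chosen limit, matching the example in the text

def split_into_datagrams(data: bytes, max_payload: int = MAX_PAYLOAD) -> list:
    """Cut data into pieces that each fit in one datagram."""
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]

def send_and_receive(data: bytes) -> list:
    """Send each piece with its own sendto call and read them back."""
    pieces = split_into_datagrams(data)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            for piece in pieces:
                sender.sendto(piece, receiver.getsockname())
        # Each recvfrom call returns exactly one datagram, never a merged one.
        return [receiver.recvfrom(2048)[0] for _ in pieces]

received = send_and_receive(b"x" * 1300)
print([len(piece) for piece in received])
```

Each `recvfrom` returns one whole datagram, so the message boundaries chosen by the sender are preserved on the receiving side.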

TCP handles application data differently. The sender does not transmit each piece of application data as one packet. Instead, it splits the data into segments according to the MSS and the peer's window size and sends them one after another, so the data flows continuously to the receiver like water. On the receiving side, incoming data is stored in the receive buffer. The length field in the TCP header (the data offset) gives only the size of the header, not the size of the application data, so the receiver cannot tell whether the sender has finished sending. Data is therefore typically handed to the application layer in two situations: 1. the buffer is full and must be flushed; 2. the sender sets the PSH (push) flag to ask that the data be delivered to the application. Although the overall size of the transmitted data is unknown, TCP guarantees that the transmission is reliable, so the receiver only needs to keep reading from the buffer according to its own policy and the data stream stays continuous. Applications that use TCP therefore usually do not limit the size of each send. They make a single TCP call and leave the splitting to the transport layer.

It is precisely because TCP is a stream that "sticky packets" can occur in some scenarios:

  • To improve transmission efficiency, TCP may combine several small messages handed over by the application layer and send them together, so that only one transmission and one acknowledgment are needed.
  • Even if several small messages are sent separately, they may be merged in the buffer at the receiving end, and the receiver only needs to acknowledge once.

In both cases the receiver cannot tell where one message ends and the next begins, which is the sticky packet problem. In some games, for example, each small message represents a different command, so the messages must be separated or the later commands will not take effect. The solution is to add a length field at the application layer to mark the boundaries, such as the Content-Length header in HTTP. This means the problem can only be solved by the application layer: an application that writes raw data directly to TCP cannot solve it and has to define its own application-layer protocol. Since UDP transmits whole packets, it does not have the sticky packet problem.
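One common way to add such a length field is a fixed-size length prefix in front of every message. The sketch below assumes a 4-byte big-endian prefix. The names `frame` and `Deframer` are illustrative and do not come from any particular library.

```python
import struct

def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message

class Deframer:
    """Recover whole messages from a byte stream, however it was chunked."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data: bytes) -> list:
        """Add received bytes and return every message that is now complete."""
        self._buffer += data
        messages = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack("!I", self._buffer[:4])
            if len(self._buffer) < 4 + length:
                break                      # message not fully received yet
            messages.append(self._buffer[4:4 + length])
            self._buffer = self._buffer[4 + length:]
        return messages

# Three game commands merged into one stream, then delivered in two
# arbitrary chunks, as a TCP receiver might see them.
stream = frame(b"MOVE") + frame(b"JUMP") + frame(b"FIRE")
deframer = Deframer()
first = deframer.feed(stream[:10])
second = deframer.feed(stream[10:])
print(first, second)
```

The receiver keeps any incomplete tail in its buffer until more bytes arrive, so the commands are separated correctly regardless of how TCP merged or split them.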

Efficiency

  • The TCP header is 20 to 60 bytes long, while the UDP header is only 8 bytes, so UDP carries less overhead per packet.
  • To set up and tear down a connection, TCP needs a three-way handshake and a four-way close, which adds overhead.
  • TCP can keep only a window's worth of unacknowledged data in flight and must wait for acknowledgments before sending more, so its speed depends on how quickly the peer acknowledges. UDP has no such mechanism and can be more efficient in some situations.
  • TCP has to maintain per-connection state, such as the current connection state, reassembly information, and window size, so TCP processing takes up more resources.
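The first and third points can be put into numbers with a little arithmetic. The sketch below is a back-of-the-envelope calculation that ignores IP and link-layer headers. The function names, the 100-byte payload, and the 100 ms round-trip time are assumptions chosen only for illustration.

```python
TCP_MIN_HEADER = 20   # bytes, without options
TCP_MAX_HEADER = 60   # bytes, with the maximum amount of options
UDP_HEADER = 8        # bytes

def header_overhead(payload_bytes: int, header_bytes: int) -> float:
    """Fraction of each packet taken up by the transport header."""
    return header_bytes / (payload_bytes + header_bytes)

def window_limited_throughput(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound in bytes per second when at most one window of
    unacknowledged data may be in flight per round trip."""
    return window_bytes * 1000 / rtt_ms

payload = 100
print(f"UDP overhead: {header_overhead(payload, UDP_HEADER):.1%}")
print(f"TCP overhead: {header_overhead(payload, TCP_MIN_HEADER):.1%} "
      f"to {header_overhead(payload, TCP_MAX_HEADER):.1%}")

# A 65535-byte window over a 100 ms round trip.
limit = window_limited_throughput(65535, 100)
print(f"window-limited throughput: {limit:.0f} bytes/s")
```

For small payloads the header difference is a noticeable share of each packet, and the window bound shows why TCP throughput depends on the round-trip time to the peer.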

These are some of the main differences between TCP and UDP. To return to the point made at the beginning of the article, the evolution of technology will keep breaking established concepts. The difference between TCP and UDP may matter less in the future: the functions that UDP lacks can be implemented at a higher layer, and how far the protocol layers will merge remains unknown.

This article is an original article by the youth in the village of CSDN, and may not be reproduced without permission. The blogger links here .

Origin blog.csdn.net/javajiawei/article/details/125946316