Quickly understand the difference between TCP and UDP in a short break

This article draws on the content of Fundebug's article "Understanding the difference between TCP and UDP". Thanks to the author for the selfless sharing.

1. Introduction

Network protocols are basic knowledge that every programmer who develops network communication applications (such as IM, push, gateway, etc.) must master. The TCP/IP protocol suite has two most representative transport layer protocols: TCP and UDP.

Developers with experience in network communication development know that TCP and UDP are the two most commonly used protocols. When, and in what scenarios, should TCP or UDP be used? For many people this is an ongoing topic of discussion.

Unlike other long-winded articles, this article tries to use concise language to summarize the main differences between the TCP and UDP protocols, so that those who want to master this knowledge but do not want to spend too much time systematically learning the basics of network theory can understand it quickly.

 
Recommended reading: to deepen your understanding, you can also read another article in this series, " Introduction to Network Programming for Lazy People (4): Quickly Understand the Differences between TCP and UDP ".

2. Quickly understand the TCP/IP protocol suite

For computers and network devices to communicate with each other, both sides must follow the same method. For example, how to detect the communication target, which side initiates communication first, which language is used, and how to end communication all need to be determined in advance. Communication between different hardware and operating systems requires a set of rules. We call such a set of rules a protocol.

TCP/IP is a general term for the family of protocols related to the Internet: TCP, UDP, IP, FTP, HTTP, ICMP, SMTP, etc. all belong to the TCP/IP family.

The TCP/IP model is the foundation of the Internet and is a general term for a series of network protocols. These protocols can be divided into four layers, namely link layer, network layer, transport layer and application layer.

Specifically:

  • 1) Link layer: responsible for encapsulating and decapsulating IP packets, sending and receiving ARP/RARP packets, etc.;
  • 2) Network layer: responsible for routing and sending packets to the target network or host;
  • 3) Transport layer: responsible for segmenting and reassembling messages, and encapsulating them in TCP or UDP format;
  • 4) Application layer: responsible for providing application services to users, such as HTTP, FTP, Telnet, DNS, SMTP, etc.

The following table summarizes:

The following picture shows the relationships within the TCP/IP protocol family more vividly (the high-definition picture can be downloaded from here):

In the network architecture, communication is established between peer layers of the two communicating parties: each layer talks to the same layer on the other side, and layers cannot be mixed.

During data transmission, as the data passes down through each layer at the sender, that layer attaches its own protocol header (and, at the data link layer only, a protocol trailer). In other words, the data is encapsulated by each protocol in turn, which identifies the communication protocol used by the corresponding layer.
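The encapsulation described above can be sketched in a few lines of Python. This is only a toy model: the bracketed markers such as `[UDP]` and `[FCS]` are hypothetical placeholders for real binary headers and trailers, and the function names are made up for illustration.

```python
# Hypothetical text markers stand in for real binary headers and trailers.
TRANSPORT_HEADER = b"[UDP]"
NETWORK_HEADER = b"[IP]"
LINK_HEADER = b"[ETH]"
LINK_TRAILER = b"[FCS]"


def encapsulate(payload: bytes) -> bytes:
    """Sender side: each layer wraps what it gets from the layer above."""
    segment = TRANSPORT_HEADER + payload          # transport layer adds a header
    packet = NETWORK_HEADER + segment             # network layer adds a header
    frame = LINK_HEADER + packet + LINK_TRAILER   # link layer adds header and trailer
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: each layer strips only its own header (and trailer)."""
    packet = frame[len(LINK_HEADER):len(frame) - len(LINK_TRAILER)]
    segment = packet[len(NETWORK_HEADER):]
    return segment[len(TRANSPORT_HEADER):]
```

Calling `encapsulate(b"hello")` yields `b"[ETH][IP][UDP]hello[FCS]"`, and `decapsulate` reverses the process layer by layer.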

A full treatment of the TCP/IP protocol suite would fill several books, so I won't go into details here. If you are interested, you can read " TCP/IP Detailed Explanation Volume 1: Protocol (Online Reading) ".

In addition, when learning a subject I especially like to read about things beyond the technology itself, such as the following two articles:

  1. " Technology Past: The TCP/IP Protocol that Changed the World (Precious and Multi-Pictures, Be Careful with Mobile Phones) "
  2. " The 5G era has arrived, TCP/IP is old, can you still eat? "

Next, we will return to the topic and learn two representative transport layer protocols in TCP/IP - TCP and UDP.

3. Quickly understand the UDP protocol

3.1 Basic introduction

UDP protocol: the full name is User Datagram Protocol. Like TCP, it is used in the network to handle data packets, but it is a connectionless protocol.

In the OSI model it sits at the fourth layer, the transport layer, directly above the IP protocol ( see the figure below ).

▲ The above picture is quoted from " Computer Network Communication Protocol Relationship Diagram "

The disadvantage of UDP is that it provides no grouping, assembly, or ordering of packets. In other words, once a packet is sent, there is no way to know whether it arrived safely and completely.

The main characteristics of the UDP protocol are summarized below, and the following sections explain them one by one.

3.2 Connectionless

First of all, unlike TCP, UDP does not need a three-way handshake to establish a connection before sending data. If you want to send data, you can just start sending. UDP is only a carrier of data packets and does not split or splice them.

Specifically:

  • 1) On the sending side: the application layer passes the data to UDP at the transport layer; UDP only adds a UDP header to the data to identify the UDP protocol, and then passes it to the network layer;
  • 2) On the receiving side: the network layer passes the data to the transport layer; UDP only removes the UDP header and passes the data to the application layer without any splicing.
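A minimal sketch of this connectionless behavior, using Python's standard `socket` module over the loopback interface. The function name `udp_roundtrip` is hypothetical, and the 2-second timeout is an arbitrary assumption so that the example cannot hang if the datagram is lost.

```python
import socket


def udp_roundtrip(message: bytes) -> bytes:
    """Send one datagram to a local receiver without any connection setup."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        # No connect(), listen(), or accept(): just address the datagram and send.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

Note that the sender never learns whether the datagram arrived; only the receiving side can tell.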

3.3 Support unicast, multicast, broadcast

UDP supports not only one-to-one transmission but also one-to-many, many-to-many, and many-to-one transmission. That is, UDP provides unicast, multicast, and broadcast.

3.4 Message Oriented

The UDP protocol is message-oriented.

The sender's UDP takes the message handed down by the application, adds a header, and delivers it to the IP layer. UDP neither merges nor splits the messages delivered by the application layer, but preserves their boundaries.

Therefore, the application must choose an appropriate message size (see " How big is the maximum size of a packet in UDP? ").
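To see the preserved boundaries in practice, the following sketch sends several messages as separate datagrams on the loopback interface and reads them back one at a time. The function name is hypothetical, and the example assumes nothing is lost locally (UDP itself does not guarantee delivery or ordering, hence the timeout).

```python
import socket


def udp_messages_keep_boundaries(messages):
    """Send each message as one datagram and read them back one by one."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        for message in messages:
            sender.sendto(message, receiver.getsockname())
        # Each recvfrom() returns exactly one datagram, never a merged or split one.
        return [receiver.recvfrom(65535)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()
```

Two `sendto()` calls produce two separate reads on the receiving side, each with the original message length.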

3.5 Unreliability

The unreliability of UDP is first reflected in its connectionless nature: the two parties do not establish a connection and simply send whenever they want, which by itself cannot be reliable.

Whatever data UDP receives, it passes on as is. It keeps no backup copy of the data, and when sending it does not care whether the other party has received the data correctly.

In addition, network conditions vary, but because UDP has no congestion control, it keeps sending at the same rate ( even if network conditions are poor, the sending rate is not adjusted ).

The disadvantage is that packets may be lost when network conditions are poor, but the advantage is also obvious: in scenarios with high real-time requirements ( such as teleconferencing ), UDP is often a better fit than TCP (see " Introduction to Network Programming for Lazy People (5): Quickly understand why UDP is sometimes more advantageous than TCP ").

The following animation can be a good illustration of the unreliability of UDP:

As can be seen from the above animation, UDP will only throw the data packets it wants to send to the other party, and does not care whether the data arrives safely and completely.

3.6 Small header overhead

UDP protocol header overhead is small ( as shown in the figure below ), and it is very efficient to transmit data packets.

▲ The above picture is quoted from " TCP/IP Detailed Explanation  -  Chapter 11 UDP Protocol "

The UDP header contains the following data:

  • 1) Two 16-bit port numbers: the source port (optional field) and the destination port;
  • 2) The length of the entire datagram (header plus data);
  • 3) The checksum of the entire datagram (optional in IPv4), which is used to detect errors in the header and data.

Therefore, the header overhead of UDP is small: only 8 bytes, compared with at least 20 bytes for TCP, which makes it very efficient for transmitting data packets.
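The 8-byte figure follows directly from the four 16-bit fields. As a rough illustration, the header can be packed and parsed with Python's `struct` module. The helper names are hypothetical, and the checksum is left at 0, which in IPv4 means "not computed"; a real implementation would calculate it.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prefix the payload with a UDP header (checksum 0 = not computed)."""
    length = UDP_HEADER.size + len(payload)   # length covers header plus data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes):
    """Split a datagram back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    payload = datagram[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload
```

`UDP_HEADER.size` evaluates to 8, so a 5-byte payload produces a 13-byte datagram.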

For comparison, the following figure is the header overhead of the TCP protocol:

▲ The above picture is quoted from " TCP/IP Detailed Explanation  -  Chapter 17 TCP Protocol "

3.7 Learn more about the UDP protocol

The UDP protocol is relatively simple and easy to learn. If you feel that the theory is lacking, you can supplement it with the chapters in the classic network book " TCP/IP Detailed Explanation  -  Chapter 11 UDP: User Datagram Protocol ".

In fact, in production applications the UDP protocol also has its complex side. The following articles are worth reading:

  1. " Unknown Network Programming (5): UDP Connectivity and Load Balancing "
  2. " Unknown Network Programming (6): In-depth understanding of the UDP protocol and making good use of it "
  3. " Unknown Network Programming (7): How to Make Unreliable UDP Reliable? "

In addition, with the vigorous promotion of the QUIC protocol by Internet companies such as Google in recent years, the UDP protocol may find more application scenarios in the new era of the mobile Internet. Interested readers can learn about the QUIC protocol in " Introduction to Network Programming for Lazy People (10): Quickly understand the QUIC protocol in a short break " and " Tencent's Technology Practice Sharing ".

4. Quickly understand the TCP protocol

4.1 Basic introduction

When one computer wants to communicate with another computer, the communication between the two computers needs to be smooth and reliable, so as to ensure the correct sending and receiving of data.

For example, when you view a web page or check an email, you want to see it in full and in order, without losing any content. When you download a file, you want the complete file, not just part of it. When lost or out-of-order data is unacceptable, TCP is used.

TCP protocol: the full name is Transmission Control Protocol. It is a connection-oriented, reliable, byte-stream-based transport layer communication protocol, defined by IETF's RFC 793.

A stream is an uninterrupted sequence of data; you can think of it as water flowing through a pipe.

For the theory of the TCP protocol, you can continue to read " TCP/IP Detailed Explanation  -  Chapter 17 TCP: Transmission Control Protocol ", which will not be expanded here due to space limitations.

Next, we will introduce the most important features of TCP one by one.

4.2 TCP connection process (3-way handshake)

As shown in the figure below, this is the process of establishing a TCP connection (commonly known as "3-way handshake"):

1) The first handshake: the client sends a connection request segment (SYN) to the server. This segment contains the client's initial sequence number. After sending the request, the client enters the SYN-SENT state.

2) The second handshake: after the server receives the connection request segment, if it agrees to the connection, it sends a response that acknowledges the request and contains the server's own initial sequence number. After sending it, the server enters the SYN-RECEIVED state.

3) The third handshake: when the client receives the response, it sends an acknowledgment segment to the server. After sending this segment the client enters the ESTABLISHED state, and the server also enters the ESTABLISHED state after receiving the acknowledgment. At this point the connection is established.
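The three steps can be traced with a small simulation. This is a sketch, not a protocol implementation: the function name and the dictionary layout are made up for illustration. It relies on one standard detail the text does not spell out, namely that a SYN consumes one sequence number, so each side acknowledges the other's initial sequence number plus one (modulo 2^32).

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments and the final state of each side."""
    states = {"client": "CLOSED", "server": "LISTEN"}

    # 1) Client -> server: SYN carrying the client's initial sequence number.
    syn = {"flags": "SYN", "seq": client_isn}
    states["client"] = "SYN-SENT"

    # 2) Server -> client: SYN+ACK carrying the server's own initial sequence
    #    number and acknowledging the client's SYN.
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn,
               "ack": (syn["seq"] + 1) % SEQ_SPACE}
    states["server"] = "SYN-RECEIVED"

    # 3) Client -> server: ACK acknowledging the server's SYN.
    ack = {"flags": "ACK", "seq": syn_ack["ack"],
           "ack": (syn_ack["seq"] + 1) % SEQ_SPACE}
    states["client"] = "ESTABLISHED"
    states["server"] = "ESTABLISHED"  # once the ACK has been received

    return [syn, syn_ack, ack], states
```

For example, with initial sequence numbers 100 and 300, the server acknowledges 101 and the client acknowledges 301.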

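You may wonder: why does TCP need three handshakes to establish a connection instead of two? The reason is to prevent a stale connection request segment from reaching the server and causing an error. With only two handshakes, a delayed duplicate request that arrives after its connection is over would make the server open a connection that the client never uses, wasting server resources.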

The following animation demonstrates the 3-way handshake process, which may be better understood:

4.3 TCP disconnection process (4-way wave)

 

TCP is full-duplex. As shown in the figure above, both ends need to send FIN and ACK when disconnecting.

1) The first wave: if client A considers the data transmission complete, it sends a connection release request (FIN) to server B.

2) The second wave: after B receives the connection release request, it notifies its application layer, sends an ACK packet, and enters the CLOSE-WAIT state. This means the A-to-B direction of the connection has been released and B will receive no more data from A. But because a TCP connection is bidirectional, B can still send data to A.

3) The third wave: if B still has unsent data, it continues sending. When it is done, it sends a connection release request to A and enters the LAST-ACK state.

4) The fourth wave: after A receives the release request, it sends an acknowledgment to B and enters the TIME-WAIT state. This state lasts for 2MSL (MSL, the maximum segment lifetime, is the longest time a segment can survive in the network before it is discarded). If B does not retransmit its release request within this period, A enters the CLOSED state. When B receives the acknowledgment, it also enters the CLOSED state.
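The four steps can be traced as a sequence of state changes. The sketch below is illustrative only and its function names are hypothetical. The intermediate states FIN-WAIT-1 and FIN-WAIT-2 for the closing side are not named in the text above; they come from the standard TCP state diagram. The MSL value is passed in as a parameter because it varies between implementations.

```python
def four_way_wave():
    """Trace the states of A (active closer) and B (passive closer)."""
    a, b = "ESTABLISHED", "ESTABLISHED"
    trace = []

    # 1) A -> B: FIN. A has nothing more to send.
    a = "FIN-WAIT-1"
    trace.append(("A->B", "FIN", a, b))

    # 2) B -> A: ACK. B may still send data; A waits for B's FIN.
    b = "CLOSE-WAIT"
    a = "FIN-WAIT-2"
    trace.append(("B->A", "ACK", a, b))

    # 3) B -> A: FIN, once B has finished sending its remaining data.
    b = "LAST-ACK"
    trace.append(("B->A", "FIN", a, b))

    # 4) A -> B: ACK. A waits 2MSL in TIME-WAIT; B closes on receiving the ACK.
    a = "TIME-WAIT"
    b = "CLOSED"
    trace.append(("A->B", "ACK", a, b))

    a = "CLOSED"  # after the 2MSL timer expires without a retransmitted FIN
    return trace, (a, b)


def time_wait_duration(msl_seconds: float) -> float:
    """TIME-WAIT lasts twice the maximum segment lifetime."""
    return 2 * msl_seconds
```

Each tuple in the trace records the direction, the segment type, and the states of A and B after that step.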

Regarding the 4 waves of TCP, the following animation may be more vivid:

It is very important to correctly understand the TCP 3-way handshake and 4-way wave process. Due to space limitations, this article cannot go further. Interested readers can read these dedicated articles:

  1. " Introduction to Brain Stupid Network Programming (1): Follow the animation to learn TCP three-way handshake and four-way wave "
  2. " Theoretical Classics: Detailed explanation of the 3-way handshake and 4-way wave process of the TCP protocol "
  3. " Theory is linked to practice: Wireshark captures packets and analyzes TCP 3-way handshake and 4-way wave process "

4.4 Summary of the main points of the TCP protocol

1) Connection-oriented:

Connection-oriented means that a connection must be established at both ends before sending data.

The method of establishing a connection is a "three-way handshake", which can establish a reliable connection. Establishing a connection lays the foundation for reliable data transmission.

2) Only unicast transmission is supported:

Each TCP connection has exactly two endpoints and carries only point-to-point data; multicast and broadcast are not supported.

3) Oriented to byte stream:

Unlike UDP, TCP does not transmit messages independently; it transmits a stream of bytes and does not preserve message boundaries.
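Because the stream carries no message boundaries, an application that needs discrete messages has to add its own framing. One common approach, shown here as an assumption rather than something this article prescribes, is to prefix each message with its length. The helper names are hypothetical, and the demo runs over a loopback TCP connection.

```python
import socket
import struct


def send_framed(sock: socket.socket, message: bytes) -> None:
    """Write a 4-byte big-endian length prefix followed by the message."""
    sock.sendall(struct.pack("!I", len(message)) + message)


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Keep reading until exactly `size` bytes have arrived."""
    buffer = b""
    while len(buffer) < size:
        chunk = sock.recv(size - len(buffer))  # may return fewer bytes than asked
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buffer += chunk
    return buffer


def recv_framed(sock: socket.socket) -> bytes:
    """Read one length-prefixed message from the byte stream."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def framed_demo(messages):
    """Send messages over a loopback TCP connection and read them back."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _client_address = listener.accept()
    listener.close()
    try:
        server.settimeout(2.0)
        for message in messages:
            send_framed(client, message)
        return [recv_framed(server) for _ in messages]
    finally:
        client.close()
        server.close()
```

Without the length prefix, the receiver would only see one continuous run of bytes and could not tell where one message ends and the next begins.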

4) Reliable transmission:

Reliable transmission, that is, detecting packet loss and bit errors, depends on TCP's sequence numbers and acknowledgment numbers.

To ensure reliable delivery, TCP gives each packet a sequence number; the sequence numbers also ensure that data is delivered to the receiving entity in order.

The receiving entity then sends back a corresponding acknowledgment (ACK) for the bytes it has successfully received. If the sending entity does not receive an acknowledgment within a reasonable round-trip time (RTT), the corresponding data (assumed lost) is retransmitted.

Regarding the theory of reliable transmission, you can study in-depth " TCP/IP Detailed Explanation  -  Chapter 21. TCP Timeout and Retransmission ", which will not be further expanded here.
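The acknowledge-or-retransmit idea can be illustrated with a deliberately simplified simulation. It sends one segment at a time and waits for its ACK (stop-and-wait), whereas real TCP keeps many segments in flight with a sliding window. The function name and the `drop_attempts` parameter are hypothetical; the latter lists which transmission attempts the simulated network loses.

```python
def send_reliably(segments, drop_attempts, max_retries=5):
    """Deliver segments in order, retransmitting any attempt that gets no ACK.

    drop_attempts is a set of (sequence_number, attempt_index) pairs that the
    simulated network loses.
    """
    delivered = []
    transmissions = 0
    for seq, data in enumerate(segments):
        for attempt in range(max_retries + 1):
            transmissions += 1
            if (seq, attempt) in drop_attempts:
                # No ACK before the timer expires: assume loss and retransmit.
                continue
            delivered.append((seq, data))  # receiver got the segment and ACKed it
            break
        else:
            raise TimeoutError(f"segment {seq} was never acknowledged")
    return delivered, transmissions
```

If the first two attempts for segment 1 are lost, three segments need five transmissions in total, yet the receiver still sees all of them in order.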

5) Provide congestion control:

When the network is congested, TCP can reduce the rate and quantity of data injected into the network and relieve the congestion.

Articles about TCP congestion control are generally dry. " Easy to Understand - In-depth Understanding of TCP Protocol (Part 2): RTT, Sliding Window, Congestion Handling " is relatively easy to follow; if you are interested, you can read it in depth.
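To give a feel for how a sender can reduce its rate under congestion, here is a sketch of additive increase, multiplicative decrease (AIMD). This article does not describe a specific algorithm, so AIMD is shown as an assumption; it is a simplification of TCP's congestion avoidance, and real implementations add slow start, fast retransmit, and more. The function name and default parameters are hypothetical.

```python
def aimd(events, cwnd=1.0, increase=1.0, decrease=0.5, floor=1.0):
    """Track a congestion window (in segments) across a series of events.

    Each event is either "ack" (a round trip completed without loss) or
    "loss" (congestion detected).
    """
    history = []
    for event in events:
        if event == "loss":
            cwnd = max(floor, cwnd * decrease)  # back off sharply on congestion
        else:
            cwnd += increase                    # probe for bandwidth slowly
        history.append(cwnd)
    return history
```

With the defaults, three loss-free round trips grow the window from 1 to 4 segments, and a single loss halves it to 2.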

6) TCP provides full-duplex communication:

TCP allows applications on both sides of the communication to send data at any time, because both ends of the TCP connection have buffers to temporarily store data for bidirectional communication.

Of course, TCP can send a segment immediately, or buffer data for a while and then send more at once (the size of each segment is limited by the maximum segment size, MSS).

4.5 Learn more about the TCP protocol

The TCP protocol covers a great deal of material; a complete presentation would take far more space than this article has. However, for developers of network applications, learning on demand, to the depth required by the technologies their applications use, is sufficient.

Beginners are advised to consolidate the theory first, for example by starting with the classic book " TCP/IP Detailed Explanation  -  Chapter 17 TCP: Transmission Control Protocol ".

If you think the theory is too boring, the following lively and interesting introductory articles are recommended to read:

  1. " Introduction to Network Programming for Lazy People (1): Quickly Understand Network Communication Protocols (Part 1) "
  2. " Introduction to Network Programming for Lazy People (2): Quickly Understand Network Communication Protocols (Part 2) "
  3. " Introduction to Network Programming for Lazy People (3): A Quick Understanding of the TCP Protocol is Enough "
  4. " Introduction to Network Programming for Lazy People (6): Introduction to the Functional Principles of the Most Popular Hubs, Switches, and Routers in History "
  5. " Introduction to Brain Stupid Network Programming (1): Follow the animation to learn TCP three-way handshake and four-way wave "
  6. " Introduction to network programming has never been easier (1): If you were to design a network, what would you do? "
  7. " Introduction to network programming has never been easier (2): If you were to design the TCP protocol, what would you do? "

5. Summary

The difference between TCP and UDP can be summarized in the following table:

Simply put, the difference between TCP and UDP is:

  • 1) TCP provides a connection-oriented, reliable service to the upper layer, while UDP provides a connectionless, unreliable service;
  • 2) Although UDP is not as reliable as TCP, it is very useful in many scenarios with high real-time requirements;
  • 3) If high data accuracy is required and a relatively lower speed is acceptable, choose TCP.

Finally, I want to use a picture to vividly summarize the difference between TCP and UDP:

As shown in the picture above: TCP is like the girl on the left, drinking water in an orderly way without spilling a drop; UDP is like the girl on the right, who just pours it all out regardless of how much actually gets drunk.

Study and exchange:

- Introductory article on mobile IM development: " One entry is enough for beginners: developing mobile IM from scratch "

- Open source IM framework source code: https://github.com/JackJiang2011/MobileIMSDK 

( This article has been published simultaneously at: http://www.52im.net/thread-3793-1-1.html  )
