TCP/IP Illustrated, Volume 1: The Protocols, Study Notes, Chapter 11: UDP: User Datagram Protocol

UDP is a datagram-oriented transport layer protocol. Each output operation by a process produces exactly one UDP datagram, which is encapsulated in one IP datagram to be sent.

[Figure: a UDP datagram encapsulated in an IP datagram]
RFC 768 is the official specification of UDP.

UDP does not provide reliability.

If the length of the IP datagram exceeds the MTU of the network, the IP datagram must be fragmented. If necessary, fragmentation can take place at every network the datagram crosses between source and destination, not only at the sending host.

[Figure: UDP header]
The port numbers identify the sending and receiving processes.

If a well-known service is offered over both TCP and UDP, the same port number is usually chosen for both protocols. This is purely for convenience.

The UDP length field is the combined length of the UDP header and the UDP data. Its minimum value is 8 bytes (a UDP datagram with 0 bytes of data can be sent). This length equals the total length given in the IP header minus the length of the IP header.
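The header layout described above can be illustrated with a short Python sketch. It assumes the four 16-bit fields of RFC 768 in network byte order; the function name `parse_udp_header` and the sample port numbers are made up for illustration.

```python
import struct


def parse_udp_header(segment: bytes) -> dict:
    """Unpack the 8-byte UDP header (source port, destination port, length, checksum)."""
    if len(segment) < 8:
        raise ValueError("a UDP header is 8 bytes long")
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", segment[:8])
    if length < 8:
        raise ValueError("the UDP length field cannot be smaller than 8")
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,            # header + data
        "data_length": length - 8,   # data only
        "checksum": checksum,
    }


# Hypothetical datagram with no data: the length field holds its minimum value, 8.
empty = parse_udp_header(struct.pack("!HHHH", 1024, 53, 8, 0))

# Hypothetical datagram carrying 4 bytes of data.
with_data = parse_udp_header(struct.pack("!HHHH", 1024, 53, 12, 0) + b"abcd")
```

Here `empty["data_length"]` is 0 and `with_data["data_length"]` is 4.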

The UDP checksum covers both the UDP header and the UDP data (the checksum in the IP header covers only the IP header). A UDP datagram can be an odd number of bytes long, whereas the checksum algorithm adds up 16-bit words, so a pad byte of 0 is appended when necessary. The pad byte is used only in the checksum calculation and is not transmitted.

Both UDP datagrams and TCP segments include a 12-byte pseudo-header that exists only for the checksum calculation. The pseudo-header contains certain fields from the IP header, and its purpose is to let UDP double-check that the data has arrived at the correct destination.

[Figure: fields used in the UDP checksum calculation]
The pad byte in the figure above is used only when computing the checksum. Note that the length of the UDP datagram appears twice in the checksum calculation.

If the checksum field of a transmitted UDP datagram is 0, the sender did not compute a checksum. If the computed checksum happens to be 0, it is stored as all one bits (65535) instead.
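The checksum rules above (pseudo-header, pad byte, and the 0 to 65535 substitution) can be put together in a minimal Python sketch. It assumes IPv4, protocol number 17 for UDP, and a 12-byte pseudo-header made of source address, destination address, a zero byte, the protocol, and the UDP length. The function names and the addresses are illustrative only.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad byte: used for the calculation only, never sent
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """12 bytes: source IP, destination IP, zero, protocol (17), UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """segment = UDP header with the checksum field set to 0, followed by the data."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF  # a computed 0 is stored as 65535


def checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """With the checksum filled in, the sum over everything must be all one bits."""
    return ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment) == 0xFFFF


# Hypothetical datagram with 3 bytes of data (odd length, so the pad byte is needed).
data = b"abc"
unsummed = struct.pack("!HHHH", 1024, 53, 8 + len(data), 0) + data
value = udp_checksum("192.0.2.1", "192.0.2.2", unsummed)
datagram = unsummed[:6] + struct.pack("!H", value) + unsummed[8:]
```

The UDP length enters the sum twice, once from the pseudo-header and once from the UDP header itself, which matches the remark above.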

If the receiving end detects an error in the checksum, the UDP datagram is discarded and no error message is generated (this is also done when the IP layer detects an error in the checksum of the IP header).

Although the UDP checksum is optional, it should always be enabled. In the 1980s some computer vendors turned the UDP checksum off by default to speed up NFS (Network File System), which runs over UDP. This may be acceptable on a single LAN, where the cyclic redundancy check on the link-layer frames (such as Ethernet or token ring frames) detects most errors, but not once datagrams pass through routers. Routers have had software and hardware bugs that modified the data in the datagrams they forwarded, and with the end-to-end UDP checksum turned off these errors go undetected. In addition, some data-link protocols (such as SLIP) have no form of data-link checksum at all.

The Host Requirements RFC requires the UDP checksum to be enabled by default. It also states that if the sender computed a checksum, the receiver must verify it. Many systems do not comply, however, and verify received checksums only when outgoing checksums are enabled.

If the data you send is important, don't completely trust the UDP or TCP checksums. These are just simple checksums and cannot detect all possible errors.

Whenever the IP layer receives an IP datagram to send, it determines which interface the datagram will go out on (routing) and queries that interface for its MTU. IP compares the MTU with the datagram length and fragments the datagram if necessary. Fragmentation can take place on the original sending host or on an intermediate router.

Once an IP datagram is fragmented, it is not reassembled until it reaches its final destination (some other network protocols instead reassemble at the next hop). Reassembly is done by the IP layer at the destination, so that fragmentation and reassembly are transparent to the transport layer (TCP and UDP), apart from a possible performance cost. A datagram that has already been fragmented can be fragmented again. The IP header carries enough information for both fragmentation and reassembly.

The following IP header fields are involved in fragmentation. The identification field holds a value that is unique for each IP datagram the sender transmits, and it is copied into every fragment of that datagram. The flags field uses one bit to indicate that more fragments follow: this bit is set to 1 in every fragment of the datagram except the last. The fragment offset field gives the position of the fragment relative to the start of the original datagram. When a datagram is fragmented, the total length field of each fragment is changed to the length of that fragment. Another bit in the flags field is the don't fragment (DF) bit. If it is set to 1, IP will not fragment the datagram. When fragmentation would be needed, the datagram is discarded and an ICMP error ("fragmentation needed but don't fragment bit set") is sent to the originator.
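As a rough illustration of these fields, the Python sketch below decodes them from an IPv4 header. It assumes the standard layout, in which the identification field occupies bytes 4 and 5 and the next 16 bits hold the 3-bit flags followed by a 13-bit offset counted in units of 8 bytes. The helper `sample_header` and its addresses are hypothetical and exist only to produce test input.

```python
import struct


def decode_fragment_fields(ip_header: bytes) -> dict:
    """Extract the fragmentation-related fields from an IPv4 header."""
    total_length, ident, flags_offset = struct.unpack("!HHH", ip_header[2:8])
    return {
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_offset & 0x4000),   # DF bit
        "more_fragments": bool(flags_offset & 0x2000),  # set on all but the last fragment
        "offset_bytes": (flags_offset & 0x1FFF) * 8,    # stored in 8-byte units
    }


def sample_header(total_length: int, ident: int, flags_offset: int) -> bytes:
    """Hypothetical 20-byte IPv4 header carrying UDP; the header checksum is left at 0."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_length, ident, flags_offset,
                       64, 17, 0, bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]))


# Two fragments of one datagram: both carry the same identification value.
first = decode_fragment_fields(sample_header(1500, 26304, 0x2000))
last = decode_fragment_fields(sample_header(21, 26304, 1480 // 8))
```

In this sample, `first` has the more fragments bit set and offset 0, while `last` has the bit clear and an offset of 1480 bytes.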

After an IP datagram is fragmented, each fragment becomes a packet with its own IP header and is routed independently of the others. The fragments of a datagram may therefore arrive at the destination out of order, but the IP header contains enough information for the receiver to reassemble them correctly.

Fragmentation has one serious drawback: if even one fragment is lost, the entire datagram must be retransmitted. IP has no timeout and retransmission of its own, so whether anything is retransmitted depends on the upper layer. UDP has no timeout and retransmission mechanism, although some UDP applications implement their own. A single fragment cannot be resent, because when the fragmentation is done by an intermediate router rather than the sender, the sender has no way of knowing how the datagram was split. For this reason fragmentation is usually avoided.

UDP makes IP fragmentation easy to trigger. TCP, by contrast, tries to avoid fragmentation, and it is almost impossible for an application to force TCP to send a segment long enough to need it.

On an Ethernet the maximum amount of data in a frame is 1500 bytes, which leaves 1472 bytes for user data, assuming a 20-byte IP header and an 8-byte UDP header. We send UDP datagrams with 1471, 1472, 1473, and 1474 bytes of data and use tcpdump to watch for fragmentation (the last two should be fragmented):
[Figure: tcpdump output]
Once an IP datagram is fragmented, tcpdump prints more information. The frag 26304 on lines 3 and 4 and the frag 26313 on lines 5 and 6 are the values of the identification field in the IP header.

As shown in the figure above, fragmentation starts when the UDP data length reaches 1473 bytes. If the Ethernet used IEEE 802 encapsulation, whose header is 8 bytes longer, 1465 bytes would be the smallest UDP data length that causes fragmentation.

The next number in the fragment information, between the colon and the @ sign, is the length of the fragment excluding the IP header.

When a datagram is fragmented, the data portion (everything after the IP header) of every fragment except the last must be a multiple of 8 bytes.

The number after the @ sign is the fragment's offset from the start of the datagram. A plus sign after that number corresponds to the more fragments bit in the 3-bit flags field of the IP header, which tells the receiver that further fragments follow.

Lines 4 and 6 in the figure above omit the protocol name and the source and destination port numbers. The protocol name could have been printed, because the protocol field is in the IP header, which every fragment carries. The port numbers are in the UDP header, which is found only in the first fragment (any transport-layer header appears only in the first fragment).
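The fragment sizes and offsets discussed above can be reproduced with a small Python sketch. It assumes a 20-byte IP header without options and an 8-byte UDP header, and it prints fragments in a length@offset form that only imitates the tcpdump notation. The function names are made up for this illustration.

```python
def plan_fragments(udp_data_len: int, mtu: int = 1500,
                   ip_header: int = 20, udp_header: int = 8) -> list:
    """Return (length, offset, more_fragments) per fragment; lengths exclude the IP header."""
    payload = udp_header + udp_data_len        # what IP has to carry
    max_chunk = (mtu - ip_header) // 8 * 8     # all but the last fragment: a multiple of 8
    fragments = []
    offset = 0
    while True:
        chunk = min(max_chunk, payload - offset)
        more = offset + chunk < payload
        fragments.append((chunk, offset, more))
        offset += chunk
        if not more:
            return fragments


def tcpdump_style(fragments: list) -> list:
    """Format as length@offset, with a trailing + when more fragments follow."""
    return [f"{length}@{offset}{'+' if more else ''}" for length, offset, more in fragments]


print(tcpdump_style(plan_fragments(1472)))            # fits in one packet
print(tcpdump_style(plan_fragments(1473)))            # ['1480@0+', '1@1480']
print(len(plan_fragments(8192)))                      # 6 fragments on an Ethernet
print(len(plan_fragments(1465, mtu=1492)))            # IEEE 802 encapsulation: 2 fragments
```

Note that the 8-byte UDP header travels in the first fragment only, so 1473 bytes of user data produce a first fragment of 1480 bytes and a second fragment holding the single remaining byte.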

A packet is a unit of data transmitted between the IP layer and the link layer. A packet can be a complete IP datagram or a fragment of the IP datagram.

Another case in which an ICMP unreachable error is generated is when a router receives a datagram that needs fragmentation but has the don't fragment (DF) flag set in its IP header. A program that needs to determine the smallest MTU on the path to a destination (the path MTU discovery mechanism) can make use of this error.

[Figure: format of the ICMP unreachable error sent when fragmentation is required but the DF bit is set]
The next-hop MTU field carries the MTU of the next-hop network. A router that does not support this newer ICMP error format sets the next-hop MTU to 0.

Generating the ICMP unreachable error:
[Figure: systems used in the example]
When SLIP was set up on the host sun, the MTU of the link from sun to netb was configured there. We now want to determine the MTU of the link from netb to sun (on a point-to-point link the MTU need not be the same in both directions). To do this we run ping on the host solaris to send ICMP echo requests to the host bsdi, increasing the packet size until fragmentation is seen, while running tcpdump on the host sun:
[Figure: tcpdump output]
The DF notation on each line means that the don't fragment bit is set in the IP header, which is part of the path MTU discovery mechanism.

The first line shows that the echo request reaches the sun host through the router netb, without fragmentation, so the SLIP MTU of netb has not been reached.

The second line shows that the DF flag was copied into the echo reply. The echo reply is the same length as the echo request (just over 600 bytes), but the MTU of sun's outgoing SLIP interface is 552, so the reply needs to be fragmented. Because the DF flag is set, sun instead generates an ICMP unreachable error and sends it to bsdi, where it is discarded. This is why no echo replies are ever seen on solaris: the replies never get past sun.

On the third and sixth lines, mtu=0 means that the host sun does not return the MTU of the outgoing interface in its ICMP unreachable message.
Many systems do not support path MTU discovery, but the traceroute program can be modified to determine the path MTU. The idea is to send packets with the don't fragment flag set. The first packet sent is exactly as long as the MTU of the outgoing interface, and the packet length is reduced each time an ICMP "can't fragment" error comes back. If the router's ICMP error uses the newer format and includes the MTU of its outgoing interface, that MTU is used for the next packet. Otherwise the next smaller MTU is tried: as the RFC points out, there are only a limited number of MTU values in use, so the program simply steps down to the next smaller one.
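The stepping-down procedure can be simulated without sending any packets. The Python sketch below is only an illustration: the plateau values are the ones listed in RFC 1191 and are an assumption here, since the table used by the modified traceroute may differ, and `discover_path_mtu` stands in for the real probing by comparing against a known path MTU.

```python
# Common MTU values ("plateaus") as listed in RFC 1191, largest first.
PLATEAUS = (65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68)


def next_mtu_guess(current_mtu: int, reported_mtu: int = 0) -> int:
    """Pick the next size to try after an ICMP "can't fragment" error."""
    if 0 < reported_mtu < current_mtu:
        return reported_mtu                  # newer ICMP format: use the router's value
    for plateau in PLATEAUS:
        if plateau < current_mtu:
            return plateau                   # older format (mtu=0): next smaller plateau
    raise ValueError("already at the minimum MTU")


def discover_path_mtu(first_hop_mtu: int, true_path_mtu: int, router_reports_mtu: bool):
    """Simulate probing with DF set; returns the accepted size and the sizes tried."""
    size = first_hop_mtu
    tried = [size]
    while size > true_path_mtu:              # a router on the path rejects the packet
        reported = true_path_mtu if router_reports_mtu else 0
        size = next_mtu_guess(size, reported)
        tried.append(size)
    return size, tried


print(discover_path_mtu(1500, 296, router_reports_mtu=False))
print(discover_path_mtu(1500, 296, router_reports_mtu=True))
```

With these assumed plateaus the first call steps through 1500, 1492, 1006, 508 and 296, while the second goes straight from 1500 to 296. When the true path MTU is not in the table (552, for example), the plateau search settles on the next lower value, 508.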

[Figure: systems used in the example]

As shown in the figure above, we first try the path MTU from the host sun to the host slip. We know in advance that the MTU of the SLIP link is 296:
[Figure: output of the modified traceroute]
In the example above, the router bsdi does not return the outgoing MTU in its ICMP error, so the program steps down to the next smaller MTU each time. After changing the intermediate router bsdi to return the outgoing MTU, we run the modified traceroute again:
[Figure: output of the modified traceroute]
This time there is no need to try 8 different MTU values one by one.

Now many but not all WANs can handle packets larger than 512 bytes. Using the path MTU discovery mechanism, applications can make full use of the larger MTU to send packets.
[Figure: systems used in the example]

As shown in the figure above, UDP is used on solaris to send 650 bytes of data to slip. Because the host slip sits behind a SLIP link with an MTU of 296, any UDP datagram that carries more than 268 bytes of data (296 - 20 - 8) and has the DF bit set to 1 causes the router bsdi to generate an ICMP "can't fragment" error.

A 650-byte UDP datagram is generated on solaris every five seconds. The following is the output of tcpdump running on sun:
[Figure: tcpdump output]
In the example above, the router bsdi does not return the next-hop MTU.

The first datagram (line 1) is sent with the DF bit set to 1 and draws the expected error from the router bsdi (line 2). What is puzzling is the next datagram (line 3). We would expect DF to be cleared so that the datagram could be fragmented to fit the largest MTU guessed from the error on line 2, yet it is again sent with DF set to 1 and produces the same ICMP error (line 4).

Line 5 shows that IP has finally learned that datagrams to this destination cannot be sent with the DF bit set. IP therefore fragments the datagram at the source host into two fragments (lines 5 and 6).

On line 7, however, the datagram is again sent with the DF bit set, so bsdi discards it and returns an ICMP error. What has happened is that an IP timer has expired. Its purpose is to make IP check whether the path MTU has increased by setting DF back to 1. Lines 7 to 19 show IP setting DF to 1 every 30 seconds. Thirty seconds is far too short: RFC 1191 recommends 10 minutes. The interval can be changed through the ip_ire_pathmtu_interval parameter.

The fragment sizes in the figure above are also wrong. The actual MTU is 296 bytes, so the fragments produced by solaris are fragmented again by bsdi. The figure below shows the first datagram (lines 5 and 6 in the figure above) as received on the destination host slip. Its fragments can be recognized by the IP identification field, 47942:
[Figure: tcpdump output on the host slip]
In this example solaris should not have fragmented the datagram itself. It should have cleared the DF bit and left the fragmentation to the router with the smallest MTU.

The next example repeats the previous one, except that the router bsdi now returns the next-hop MTU in its ICMP "can't fragment" error:
[Figure: tcpdump output]
As in the previous example, the first two datagrams are sent with DF set to 1. This time, however, the destination receives three fragments instead of four.

Next we generate an 8192-byte UDP datagram, which we expect to produce 6 fragments on an Ethernet. The ARP cache is empty beforehand, so an ARP request and reply must be exchanged before the first fragment can be sent. The tcpdump output is as follows:
[Figure: tcpdump output]
The output shows that every fragment triggers an ARP request. When the first ARP reply arrives, only the last fragment is sent, and the first five appear to have been discarded. This is in fact normal ARP behavior: while waiting for an ARP reply, most implementations keep only the last packet destined for a given host. The Host Requirements RFC requires implementations to prevent this kind of ARP flooding (repeatedly sending ARP requests for the same IP address at a high rate) and recommends a maximum rate of one request per second. It also says that ARP should save at least one packet, and that this should be the latest one, which is exactly what we see.

In the example above, after the sender has issued several ARP requests, the first fragment actually sent is the one with the largest offset, which shows that the ARP input queue is last in, first out.

The example also contains one anomaly that cannot be explained: svr4 sends back 7 ARP replies rather than 6.

As soon as any fragment of a datagram arrives, the IP layer starts a timer, normally set to 30 or 60 seconds. If the timer expires before all the fragments have arrived, the fragments received so far are discarded. Otherwise, fragments that never arrive would sooner or later fill the receiver's buffers.

tcpdump was left running for 5 minutes after the last ARP reply, and svr4 never returned an ICMP "time exceeded during reassembly" error. There are two reasons. First, most Berkeley-derived implementations never generate this error: they do set the timer and discard the fragments when it expires, but they send no ICMP error. Second, an ICMP error is not required unless the first fragment, the one with offset 0 that contains the UDP header, has been received. Without the transport-layer header, the receiver of the ICMP error could not tell which process had sent the discarded datagram.

In theory the maximum length of an IP datagram is 65535 bytes, a limit imposed by the 16-bit total length field in the IP header. Subtracting the 20-byte IP header and the 8-byte UDP header leaves at most 65507 bytes of user data in a UDP datagram, but most implementations support less than this. There are two limiting factors. The first is the programming interface available to the application. The sockets API provides a function that an application can call to set the size of its receive and send buffers, and for a UDP socket this size is directly related to the largest UDP datagram the application can read or write. Most systems default to a maximum of just over 8192 bytes for a UDP datagram that can be read or written (8192 being the amount of user data that NFS reads and writes by default). The second is the kernel's TCP/IP implementation, which may have implementation features or bugs that impose a smaller limit. A host is only required to be able to receive IP datagrams of 576 bytes.
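The arithmetic behind the 65507-byte limit, and one way to inspect the socket buffer sizes mentioned above, can be sketched in Python as follows. The function names are illustrative. The buffer values returned by `SO_SNDBUF` and `SO_RCVBUF` depend on the operating system and will generally differ from the figures quoted in these notes.

```python
import socket

IP_MAX_TOTAL_LENGTH = 2**16 - 1   # largest value of the 16-bit total length field


def max_udp_payload(ip_header: int = 20, udp_header: int = 8) -> int:
    """Largest amount of user data that fits in a single IP datagram."""
    return IP_MAX_TOTAL_LENGTH - ip_header - udp_header


def udp_buffer_sizes() -> tuple:
    """Return this system's default (send, receive) buffer sizes for a UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return (sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
                sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))


if __name__ == "__main__":
    print(max_udp_payload())      # 65507
    print(udp_buffer_sizes())     # system dependent
```

An application that needs to send larger datagrams than the defaults allow would raise these sizes with `setsockopt` before writing, subject to whatever ceiling the kernel imposes.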

The fact that IP can send or receive a datagram of a given size does not mean that the receiving application is prepared to read that much data. UDP programming interfaces therefore let the application specify the maximum number of bytes to return each time. What happens when the received datagram is longer than the application is prepared to handle depends on the implementation. The Berkeley version of the sockets API truncates the datagram and discards the excess data, and from 4.3BSD Reno onward the application can be told that the datagram was truncated. The sockets API under SVR4 (including Solaris 2.x) does not truncate the datagram: the excess is returned by subsequent reads, but the application is not told that several reads are being satisfied from a single UDP datagram. The TLI API does not discard the data either. It returns a flag indicating that more data is available, and the application's subsequent reads return the rest of the datagram.

UDP traffic can also give rise to ICMP source quench errors. This error may be generated when a host or router receives datagrams faster than it can process them. It is not mandatory, however: a system that runs out of buffers and discards datagrams is not required to send a source quench.

[Figure: format of the ICMP source quench error]
Earlier RFCs required routers to generate source quench errors when they ran out of buffers. Newer RFCs say that routers should not generate them, because source quench consumes network bandwidth and is an ineffective and unfair remedy for congestion. The current view is therefore that routers should no longer originate source quench errors.

When UDP is being used, BSD implementations normally ignore received source quench messages. Part of the reason is that by the time the source quench arrives, the process that caused it may already have terminated.

When a client sends a UDP datagram, the IP header contains the source and destination IP addresses, and the UDP header contains the source and destination UDP port numbers. When a server receives a UDP datagram, the operating system tells it the sender's source IP address and port number. This allows an iterative UDP server to handle multiple clients and to send each reply back to the client that sent the request.

Some applications need to know the destination IP address of a datagram. For example, the RFC stipulates that a TFTP server must ignore received datagrams that were sent to a broadcast address. This requires the operating system to pass the destination IP address of each received UDP datagram to the application, and not all implementations provide this capability.

Most UDP servers are iterative servers, which means that a single server process handles all client requests on a single UDP port. Each port has an input queue of limited size, so the queue can overflow and datagrams can be lost. When it overflows, the application is not told, and the sender is not notified that its datagram was discarded. The UDP input queue is first in, first out.

Most UDP servers wildcard their local IP address when they create a UDP endpoint. A UDP datagram destined for the server's port is then accepted no matter which local interface it arrives on.

Different servers can be started on the same port, each with a different local IP address, but the system normally has to be told that reusing the port number is acceptable. With the sockets API, for example, the SO_REUSEADDR socket option must be specified.

Suppose server A binds its socket to the wildcard local IP address and some port, and server B binds its socket to a specific local IP address and the same port. A datagram addressed to server B's IP address and port could in principle be received by either server, but there is an implied priority: the more specific socket, server B's, is chosen first.

Most systems let a UDP endpoint restrict its foreign address, so that it receives UDP datagrams only from one specific IP address and port number. If no local address has been chosen when this is done, Berkeley-derived systems automatically select a local IP address, namely that of the interface used to reach the remote end.
[Figure: ways of specifying the local and foreign addresses of a UDP endpoint]
The order of the rows in the figure above, from top to bottom, is the order in which the UDP module tries to decide which application endpoint receives an incoming datagram.
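The "most specific endpoint wins" rule can be modeled with a toy Python sketch. This is a simplification for illustration, not the lookup algorithm of any particular kernel: an endpoint matches if every non-wildcard field agrees with the datagram, and among the matches the one with the fewest wildcards is chosen. The `Endpoint` type, the names, and the addresses are all hypothetical.

```python
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    name: str
    local_ip: Optional[str]              # None means wildcard
    local_port: int
    foreign_ip: Optional[str] = None     # None means any sender
    foreign_port: Optional[int] = None


def select_endpoint(endpoints, dst_ip, dst_port, src_ip, src_port):
    """Return the most specific endpoint that accepts the datagram, or None."""
    def matches(ep):
        return (ep.local_port == dst_port
                and ep.local_ip in (None, dst_ip)
                and ep.foreign_ip in (None, src_ip)
                and ep.foreign_port in (None, src_port))

    def specificity(ep):
        return sum(field is not None
                   for field in (ep.local_ip, ep.foreign_ip, ep.foreign_port))

    candidates = [ep for ep in endpoints if matches(ep)]
    return max(candidates, key=specificity, default=None)


endpoints = [
    Endpoint("A", None, 7777),                               # wildcard local address
    Endpoint("B", "192.0.2.1", 7777),                        # specific local address
    Endpoint("C", "192.0.2.1", 7777, "198.51.100.9", 4000),  # restricted to one client
]

to_b = select_endpoint(endpoints, "192.0.2.1", 7777, "198.51.100.5", 1234)
to_a = select_endpoint(endpoints, "192.0.2.2", 7777, "198.51.100.5", 1234)
to_c = select_endpoint(endpoints, "192.0.2.1", 7777, "198.51.100.9", 4000)
```

A datagram for 192.0.2.1 goes to B rather than A, a datagram for another local address falls through to the wildcard endpoint A, and a datagram from the one client that C is restricted to goes to C.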

Most systems allow only one endpoint at a time to be bound to a given local IP address and port number. If a second server is started with the same local address and port number, it fails to start, even when the SO_REUSEADDR socket option is used.

On systems that support multicast, multiple programs can use the same IP address and port number, although the application usually must tell the API that this is possible (such as using the SO_REUSEADDR socket option).

When a UDP datagram arrives whose destination IP address is a broadcast or multicast address, and several endpoints are bound to that destination IP address and port number, a copy of the datagram is delivered to each endpoint.

Suppose an 8192-byte UDP datagram is to be sent. With the 8-byte UDP header added, IP has 8200 bytes to send.

Suppose a UDP datagram is split into 4 fragments and the receiver gets only the first two. The application times out after 10 seconds and retransmits the datagram, which is again split into the same 4 fragments, and this time the receiver gets only the last two. Even with a reassembly timeout of 60 seconds, the four fragments in hand are not reassembled into an IP datagram. The retransmitted IP datagram carries a new identification field, and reassembly only combines fragments with the same identification.
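This scenario can be replayed with a toy reassembly table in Python. The sketch assumes fragments are grouped by source address, destination address, protocol, and identification, and it leaves out the reassembly timer and overlap handling entirely. The class name, addresses, and identification values are made up for illustration.

```python
from collections import defaultdict


class Reassembler:
    """Toy reassembly table keyed by (source, destination, protocol, identification)."""

    def __init__(self):
        self._pending = defaultdict(dict)    # key -> {offset: (length, more_fragments)}

    def add(self, src, dst, proto, ident, offset, length, more):
        """Record one fragment; return True when its datagram is complete."""
        key = (src, dst, proto, ident)
        fragments = self._pending[key]
        fragments[offset] = (length, more)
        if self._complete(fragments):
            del self._pending[key]
            return True
        return False

    @staticmethod
    def _complete(fragments):
        expected = 0
        for offset in sorted(fragments):
            if offset != expected:
                return False                 # a hole remains
            length, more = fragments[offset]
            expected += length
            if not more:
                return True                  # contiguous up to the last fragment
        return False


SRC, DST, UDP = "192.0.2.1", "192.0.2.2", 17
pieces = [(0, 1480, True), (1480, 1480, True), (2960, 1480, True), (4440, 760, False)]

table = Reassembler()
# First transmission (identification 100): only the first two fragments arrive.
first_try = [table.add(SRC, DST, UDP, 100, *p) for p in pieces[:2]]
# Retransmission (identification 101): only the last two fragments arrive.
second_try = [table.add(SRC, DST, UDP, 101, *p) for p in pieces[2:]]
```

Both lists contain only `False`: the receiver holds all four offsets, but under two different identification values, so neither datagram is complete.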

The netstat program on one host shows far more TCP and IP checksum errors than UDP checksum errors. Possible reasons are that the UDP datagrams sent to this host carry no checksum (the sender left the checksum field as 0), or that UDP is used mainly for local traffic, where errors are unlikely.

IP options are handled differently when a datagram is fragmented. The loose and strict source routing options are copied into the IP header of every fragment, whereas the timestamp and record route options appear only in the IP header of the first fragment.

There are many ways to filter an input datagram sent to a given UDP port number, which can be filtered based on the destination IP address, source IP address, and source port number.

Origin blog.csdn.net/tus00000/article/details/114699109