UNIX Network Programming Volume 1 Chapter 2

UDP User Datagram Protocol

UDP does not guarantee that a datagram will reach its final destination, that datagrams will arrive in the order they were sent, or that each datagram will arrive only once. (Compare this with TCP.)

TCP Transmission Control Protocol

TCP provides a connection between the client and the server, and it provides reliability. When TCP sends data to the other end, it requires an acknowledgment in return. If no acknowledgment arrives, TCP automatically retransmits the data and waits longer before the next attempt. After a number of failed retransmissions, TCP gives up.
Note: TCP does not guarantee that the data will be received by the other endpoint. It provides reliable delivery of data when delivery is possible, or reliable notification of failure when it is not.
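
The "retransmit, then wait longer" behavior can be sketched as a backoff schedule. This is only an illustration, not the algorithm of any particular TCP implementation: the 1-second starting timeout, the doubling, the 64-second cap, and the retry count are all assumptions, and `retransmission_schedule` is a hypothetical helper.

```python
def retransmission_schedule(initial_rto=1.0, max_rto=64.0, max_retries=8):
    """Return the wait (in seconds) before each retransmission attempt.

    Assumption: the timeout doubles after every unacknowledged attempt and
    is capped at max_rto. Real stacks differ in all three parameters.
    """
    waits = []
    rto = initial_rto
    for _ in range(max_retries):
        waits.append(rto)
        rto = min(rto * 2, max_rto)
    return waits


schedule = retransmission_schedule()
print(schedule)       # the waits grow until they reach the cap
print(sum(schedule))  # total time spent before the sender gives up
```

With these assumed values the sender spends a little over three minutes retrying before it reports the failure to the application.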

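RTT: round-trip time, the time it takes a segment to travel between the client and the server and back. TCP estimates the RTT dynamically so that it knows how long to wait for an acknowledgment.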
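
To make the role of the RTT concrete, here is a minimal sketch of a smoothed RTT estimator that turns measurements into a retransmission timeout. The constants (alpha = 1/8, beta = 1/4, K = 4) follow RFC 6298; the class name `RttEstimator` is hypothetical, and the clock granularity term and the minimum timeout from the RFC are left out for brevity.

```python
class RttEstimator:
    """Smoothed RTT and retransmission timeout (simplified from RFC 6298)."""

    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT variation
    K = 4           # how many "variations" of safety margin to add

    def __init__(self):
        self.srtt = None     # smoothed round-trip time, in seconds
        self.rttvar = None   # smoothed round-trip time variation

    def update(self, sample):
        """Feed one RTT measurement and return the new timeout."""
        if self.srtt is None:
            # First measurement: initialise both estimators from it.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # rttvar must be updated before srtt, using the old srtt.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.rto()

    def rto(self):
        return self.srtt + self.K * self.rttvar


estimator = RttEstimator()
for measured in (0.100, 0.200, 0.120):
    print(round(estimator.update(measured), 4))
```

A single slow sample raises the timeout noticeably because the variation term is multiplied by K, which keeps TCP from retransmitting too eagerly on a jittery path.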

TCP sequences the data it sends by assigning a sequence number to every byte. If segments arrive out of order, the receiving TCP reorders them by sequence number before passing the data to the receiving application. If the receiving TCP gets duplicate data from its peer, it can tell from the sequence numbers that the data is a duplicate and discards it. (UDP itself provides no acknowledgments, sequence numbers, RTT estimation, timeouts, or retransmissions.)
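
The reordering and duplicate handling can be sketched in a few lines. This is a simplified model, assuming that duplicates are exact retransmissions with the same boundaries (overlapping segments are not handled); `reassemble` is a hypothetical helper, not part of any socket API.

```python
def reassemble(segments, initial_seq):
    """Return (in-order data, next expected sequence number).

    segments is an iterable of (sequence_number, payload_bytes) in arrival
    order. Data after a gap is held back until the gap is filled.
    """
    pending = {}
    for seq, payload in segments:
        # A segment whose sequence number was already seen is a duplicate.
        pending.setdefault(seq, payload)

    data = bytearray()
    next_seq = initial_seq
    while next_seq in pending:
        payload = pending.pop(next_seq)
        data += payload
        next_seq += len(payload)   # every byte consumes one sequence number
    return bytes(data), next_seq


arrived = [(1006, b"world"), (1000, b"hello "), (1006, b"world"), (1011, b"!")]
print(reassemble(arrived, initial_seq=1000))
```

If the segment starting at 1000 had been lost, nothing would be handed to the application, even though the later bytes had already arrived.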

TCP provides flow control. At all times TCP tells its peer how many bytes of data it is willing to accept from the peer. This is called the advertised window.
The window changes dynamically: as data arrives from the sender, the window shrinks, and as the receiving application reads data from the buffer, the window grows. (UDP provides no flow control.)
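
A toy model of the receive side shows how the window moves. The class and method names are hypothetical, and the model ignores everything except byte counts.

```python
class ReceiveBuffer:
    """Toy model of a TCP receive buffer and the window it advertises."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffered = 0   # bytes received but not yet read by the application

    @property
    def advertised_window(self):
        # Free space in the buffer is what the receiver offers to the peer.
        return self.capacity - self.buffered

    def segment_arrives(self, nbytes):
        accepted = min(nbytes, self.advertised_window)
        self.buffered += accepted
        return accepted

    def application_reads(self, nbytes):
        taken = min(nbytes, self.buffered)
        self.buffered -= taken
        return taken


buf = ReceiveBuffer(capacity=8192)
buf.segment_arrives(3000)
print(buf.advertised_window)   # window shrank after data arrived
buf.application_reads(1000)
print(buf.advertised_window)   # window grew after the application read
```

When the window reaches zero the sender has to stop, which is how a slow reader throttles a fast writer.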

TCP connections are full-duplex: an application can send and receive data in both directions on a given connection at any time. TCP must therefore keep track of state such as sequence numbers and advertised window sizes for each direction of data flow. After a full-duplex connection is established, it can be turned into a simplex connection if desired. (See Chapter 6.)

SCTP Stream Control Transmission Protocol

SCTP provides associations between clients and servers. A connection involves communication between exactly two IP addresses, while an association is communication between two systems, which may involve more than two addresses because SCTP supports multihoming.

Unlike TCP, SCTP is message-oriented. It delivers individual records in sequence. As with UDP, the length of each record written by the sender is passed to the receiving application along with the data.

SCTP can provide multiple streams between the endpoints of an association, each of which delivers its messages reliably and in sequence. A lost message on one stream does not block the delivery of messages on the other streams of the same association. TCP behaves differently: a loss anywhere in its single byte stream blocks the delivery of all subsequent data on the connection until the loss is repaired.
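
The difference can be sketched by asking what each receiver may hand to the application while one message is still missing. This is a simulation only (it opens no SCTP sockets); both functions and the `(stream, payload)` message format are hypothetical.

```python
def deliver_single_stream(messages, lost):
    """TCP-like: everything after the first loss waits for the repair."""
    delivered = []
    for index, message in enumerate(messages):
        if index in lost:
            break
        delivered.append(message)
    return delivered


def deliver_multi_stream(messages, lost):
    """SCTP-like: a loss only holds back later messages on the same stream."""
    blocked_streams = set()
    delivered = []
    for index, (stream, payload) in enumerate(messages):
        if index in lost:
            blocked_streams.add(stream)
        elif stream not in blocked_streams:
            delivered.append((stream, payload))
    return delivered


# Messages in the order they were sent, as (stream id, payload).
sent = [(0, "a1"), (1, "b1"), (0, "a2"), (1, "b2"), (2, "c1")]
lost = {1}   # the second message, on stream 1, is lost in transit

print(deliver_single_stream(sent, lost))
print(deliver_multi_stream(sent, lost))
```

In the multi-stream case only stream 1 stalls; streams 0 and 2 keep delivering.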

SCTP also provides multihoming, which lets a single SCTP endpoint support multiple IP addresses. This feature increases robustness against network failures.

TCP connection establishment and termination

We all know that a three-way handshake occurs when a TCP connection is established, so what exactly is it?

  1. The server must be prepared to accept incoming connections. This is normally done by calling socket, bind, and listen, and is called a passive open.
  2. The client initiates an active open by calling connect. This causes the client TCP to send a SYN (synchronize) segment, which tells the server the initial sequence number for the data the client will send on the connection. Normally the SYN carries no data; the IP datagram that carries it contains only an IP header, a TCP header, and possibly TCP options.
  3. The server must acknowledge (ACK) the client's SYN and must also send its own SYN containing the initial sequence number for the data the server will send on the connection. The server sends its SYN and the ACK of the client's SYN in a single segment.
  4. The client must acknowledge the server's SYN.

The acknowledgment number in an ACK is the next sequence number that the end sending the ACK expects to receive. Because a SYN occupies one byte of sequence number space, the acknowledgment number in the ACK of each SYN is that SYN's initial sequence number plus 1. Similarly, the acknowledgment number in the ACK of each FIN (finish) is the FIN's sequence number plus 1.

This exchange requires at least three packets, which is why it is called TCP's three-way handshake.
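
The sequence and acknowledgment numbers of the three segments can be worked out from the two initial sequence numbers. The sketch below only does the arithmetic; `three_way_handshake` and the dictionary layout are hypothetical, and the initial sequence numbers in the example are made up.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32 bits and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries."""
    syn = {"from": "client", "flags": "SYN",
           "seq": client_isn, "ack": None}
    # The SYN consumes one sequence number, so the server expects client_isn + 1.
    syn_ack = {"from": "server", "flags": "SYN+ACK",
               "seq": server_isn, "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"from": "client", "flags": "ACK",
           "seq": (client_isn + 1) % SEQ_SPACE, "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```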


TCP options

  • MSS option
    The TCP sending a SYN uses this option to announce its maximum segment size (MSS) to the peer. The sending TCP then uses the peer's MSS value as the maximum size of the segments it sends. The TCP_MAXSEG socket option fetches and sets this TCP option.
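  • Window scale option
    The largest window that either end of a TCP connection can advertise to the other is 65535, because the corresponding field in the TCP header is 16 bits wide. High-speed or long-delay paths need a larger window, so this option specifies that the advertised window must be scaled (left-shifted) by 0 to 14 bits. The SO_RCVBUF socket option affects this TCP option.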
  • Timestamp option
    This option is needed on high-speed connections to prevent data corruption that could be caused by old, delayed, or duplicated segments.

Because a network with high bandwidth or long delay is called a "long fat pipe", the latter two options are also called "long fat pipe options".
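
A quick calculation shows why a 16-bit window is too small for a long fat pipe. The sketch assumes the window scale option as specified in RFC 7323 (a left shift of at most 14 bits); the helper names and the example link (1 Gbit/s with a 100 ms RTT) are assumptions chosen for illustration.

```python
MAX_UNSCALED_WINDOW = 65535   # largest value of the 16-bit window field
MAX_SHIFT = 14                # largest shift the window scale option allows


def bandwidth_delay_product(bits_per_second, rtt_ms):
    """Bytes that must be in flight to keep the pipe full."""
    return bits_per_second * rtt_ms // (8 * 1000)


def window_shift_needed(window_bytes):
    """Smallest shift count whose scaled window covers window_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= window_bytes:
            return shift
    raise ValueError("window is larger than the option can express")


bdp = bandwidth_delay_product(1_000_000_000, rtt_ms=100)
print(bdp)                                # bytes in flight needed
print(window_shift_needed(bdp))           # shift count required
print(MAX_UNSCALED_WINDOW << MAX_SHIFT)   # largest possible window, about 1 GB
```

Without scaling, the sender on this example link could keep only 65535 bytes in flight per round trip, a small fraction of what the path can carry.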

TCP connection termination

TCP takes three segments to establish a connection and four segments to terminate one. (This is informally called the "four-way wave".)

  1. One application calls close first, and we say that this end performs the active close. This end's TCP then sends a FIN segment, which means it has finished sending data.
  2. The peer that receives the FIN performs the passive close. Its TCP acknowledges the FIN. The FIN is also passed to the receiving application as an end-of-file (after any data already queued for the application), because receiving a FIN means the application will get no further data on the connection.
  3. Some time later, the application that received the end-of-file calls close on its socket. This causes its TCP to send a FIN as well.
  4. The TCP that receives this final FIN (the end that performed the active close) acknowledges it.

In some cases the FIN in step 1 is sent together with data. In addition, the segments in steps 2 and 3 both come from the end performing the passive close and may be combined into one segment.

Like a SYN, a FIN occupies one byte of sequence number space, so the acknowledgment number in the ACK of each FIN is the FIN's sequence number plus 1.
Between steps 2 and 3, data can still flow from the end performing the passive close to the end performing the active close. This is called a half-close.
Either the client or the server can perform the active close. Usually it is the client, but with some protocols (notably HTTP/1.0) the server performs the active close.
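
A half-close can be tried on the loopback interface. The sketch below is an illustration under some assumptions: it uses `shutdown` with `SHUT_WR` (rather than `close`) to send the FIN while keeping the socket readable, runs both ends in one process, and relies on the messages being small enough to fit in the socket buffers. The function name and payloads are made up.

```python
import socket


def half_close_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname(), timeout=5)
    server, _ = listener.accept()
    server.settimeout(5)
    listener.close()

    client.sendall(b"request")
    client.shutdown(socket.SHUT_WR)   # active close of the sending direction (FIN)

    received = b""
    while True:
        chunk = server.recv(1024)
        if not chunk:                 # empty read: the FIN arrived as end-of-file
            break
        received += chunk

    # The other direction is still open, so the passive end can keep sending.
    server.sendall(b"reply after FIN")
    server.close()                    # now the passive end sends its own FIN

    reply = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        reply += chunk
    client.close()
    return received, reply


print(half_close_demo())
```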

TCP state transition diagram

(To be added later.)

TIME_WAIT state

The endpoint stays in this state for twice the Maximum Segment Lifetime (MSL), sometimes referred to as 2MSL.
The MSL is the maximum amount of time any IP datagram can live on the Internet.
The TIME_WAIT state exists for two reasons:

  1. To implement TCP's full-duplex connection termination reliably.
  2. To allow old duplicate segments to expire in the network.

Port numbers

Any TCP/IP implementation that supports FTP assigns the well-known port 21 to the FTP server, and UDP port number 69 to TFTP.

  • Socket pair
    The socket pair for a TCP connection is the four-tuple that defines the two endpoints of the connection: the local IP address, the local TCP port number, the foreign IP address, and the foreign TCP port number. The two values that identify each endpoint (an IP address and a port number) are often called a socket.
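
The four-tuple can be inspected on a live connection with `getsockname` and `getpeername`. This is a small loopback sketch; the function name is hypothetical and the port numbers are whatever the kernel assigns.

```python
import socket


def socket_pair_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname(), timeout=5)
    server, _ = listener.accept()
    try:
        # (local IP, local port, foreign IP, foreign port) as seen by each end
        client_view = client.getsockname() + client.getpeername()
        server_view = server.getsockname() + server.getpeername()
    finally:
        client.close()
        server.close()
        listener.close()
    return client_view, server_view


client_view, server_view = socket_pair_demo()
print(client_view)
print(server_view)
```

Both ends describe the same connection, with the local and foreign halves swapped.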

TCP port numbers and concurrent servers

In a concurrent server, the main server loop forks a child process to handle each new connection. TCP cannot demultiplex incoming segments by looking at the destination port number alone. It must look at all four elements of the socket pair to determine which endpoint receives an arriving segment.
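
The lookup can be sketched as a table keyed by socket pair. This is a simplified model, assuming the listening socket is bound to one specific local address; the addresses come from the documentation ranges and the socket labels are made up.

```python
WILDCARD = ("*", "*")   # foreign endpoint of a listening socket


def demultiplex(sockets, local, foreign):
    """Pick the socket that should receive a segment.

    sockets maps (local endpoint, foreign endpoint) to a label. An exact
    match on all four values wins; otherwise the listening socket gets it.
    """
    exact = sockets.get((local, foreign))
    if exact is not None:
        return exact
    return sockets.get((local, WILDCARD))


server = ("192.0.2.1", 21)
sockets = {
    (server, WILDCARD): "listening socket (parent)",
    (server, ("198.51.100.7", 1500)): "connected socket (child 1)",
    (server, ("198.51.100.7", 1501)): "connected socket (child 2)",
}

print(demultiplex(sockets, server, ("198.51.100.7", 1500)))
print(demultiplex(sockets, server, ("198.51.100.7", 1501)))
print(demultiplex(sockets, server, ("203.0.113.9", 4000)))   # a new client
```

All three sockets share local port 21, so the destination port alone could never tell the two children and the parent apart.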

Buffer size and limits

  • The maximum size of an IPv4 datagram is 65535 bytes, including the IPv4 header. The total length field occupies 16 bits.

  • The maximum size of an IPv6 datagram is 65575 bytes, including the 40-byte IPv6 header. Note that the IPv6 payload length field does not include the IPv6 header, while the IPv4 total length field does include the IPv4 header. IPv6 has a jumbo payload option, which extends the payload length field to 32 bits, but this option is supported only on datalinks whose MTU (maximum transmission unit) exceeds 65535.

  • The MTU of Ethernet is 1500 bytes. The minimum link MTU required by IPv4 is 68 bytes. The minimum link MTU required by IPv6 is 1280 bytes.

  • The smallest MTU in the path between two hosts is called the path MTU, and the Ethernet MTU of 1500 bytes is a common path MTU today.

  • When an IP datagram is to be sent out an interface and its size exceeds the MTU of the link, both IPv4 and IPv6 perform fragmentation. The fragments are normally not reassembled until they reach the final destination. IPv4 hosts fragment the datagrams they generate, and IPv4 routers fragment the datagrams they forward. With IPv6, only hosts fragment the datagrams they generate; IPv6 routers do not fragment the datagrams they forward.
    Note: A device labeled as an IPv6 router may perform fragmentation, but only for datagrams it generates itself, never for datagrams it forwards. For example, IP datagrams produced by the router's Telnet server are generated by the router, not forwarded by it.

  • If the "don't fragment" (DF) bit in the IPv4 header is set, the datagram must not be fragmented, either by the sending host or by any router that forwards it. A router that receives an IPv4 datagram with the DF bit set whose size exceeds the outgoing link's MTU generates an ICMPv4 error message (destination unreachable, fragmentation needed but DF bit set).

  • TCP has an MSS (maximum segment size) that it announces to the peer. The purpose is to tell the peer the actual size of its reassembly buffer and to try to avoid fragmentation. The MSS is often set to the MTU minus the fixed sizes of the IP and TCP headers. On Ethernet, the MSS is 1460 with IPv4 and 1440 with IPv6 (the TCP header is 20 bytes in both cases, but the IPv4 header is 20 bytes and the IPv6 header is 40 bytes).
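
The arithmetic behind these limits is easy to check. The sketch below assumes fixed-size headers with no IP or TCP options; the function names are hypothetical, and the 2000-byte UDP datagram is just an example size.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
TCP_HEADER = 20
UDP_HEADER = 8


def tcp_mss(mtu, ipv6=False):
    """MSS that fills one link-layer frame exactly (no header options)."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


def ipv4_fragment_count(ip_payload_bytes, mtu):
    """Number of IPv4 fragments needed to carry ip_payload_bytes.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the fragment offset field counts in 8-byte units.
    """
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return -(-ip_payload_bytes // per_fragment)   # ceiling division


print(tcp_mss(1500))                                    # IPv4 over Ethernet
print(tcp_mss(1500, ipv6=True))                         # IPv6 over Ethernet
print(ipv4_fragment_count(2000 + UDP_HEADER, 1500))     # a 2000-byte UDP datagram
print(ipv4_fragment_count(1460 + TCP_HEADER, 1500))     # one full-sized TCP segment
```

A full-sized TCP segment fits in a single Ethernet frame, while the 2000-byte UDP datagram has to be split.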

TCP output

Every TCP socket has a send buffer, and we can change its size with the SO_SNDBUF socket option. If the send buffer has no room for all of the application's data, then on a blocking socket (the default) the application process is put to sleep until the data fits. A successful return from a write to a TCP socket only means that we can reuse our application buffer. It does not tell us that the peer TCP or the peer application has received the data.
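
Reading and changing the send buffer size looks like this. The requested size of 65536 bytes is an arbitrary example, and the kernel is free to round, double, or clamp it, so the sketch prints the result instead of assuming a value.

```python
import socket


def send_buffer_sizes(requested=65536):
    """Return (default size, size after requesting `requested` bytes)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        # Read the value back: it may differ from what was requested.
        adjusted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return default, adjusted


default, adjusted = send_buffer_sizes()
print("default SO_SNDBUF:", default)
print("after setsockopt: ", adjusted)
```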

UDP output

Every UDP socket has a send buffer size, but it is only an upper limit on the size of a UDP datagram that can be written to the socket. If an application writes a datagram larger than the socket send buffer size, the kernel returns an EMSGSIZE error to the process. Since UDP is unreliable, it does not need to keep a copy of the application's data, so it does not need a real send buffer. (The application's data is normally copied into a kernel buffer of some form as it passes down the protocol stack, but the datalink layer discards that copy once the data has been transmitted.)
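If a UDP application sends large datagrams, they are far more likely to be fragmented than TCP application data, because TCP breaks application data into MSS-sized segments and UDP has no counterpart to this.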
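
The EMSGSIZE error is easy to provoke. The sketch below writes a 70000-byte datagram, which is larger than any UDP datagram IPv4 can carry, so the write is rejected before anything reaches the network. Which limit triggers first (the send buffer size or the protocol maximum) depends on the platform; the destination port 9 and the function name are arbitrary choices.

```python
import errno
import socket


def send_oversized_datagram(size=70000):
    """Try to send one oversized UDP datagram and return the errno, if any."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.sendto(b"x" * size, ("127.0.0.1", 9))
        except OSError as exc:
            return exc.errno
    return None


code = send_oversized_datagram()
print(code == errno.EMSGSIZE, errno.errorcode.get(code))
```

A TCP socket would accept the same 70000 bytes and split them into segments itself.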

Origin blog.csdn.net/weixin_37778713/article/details/105386870