Transport layer—UDP protocol

Table of contents

1. Transport layer

1.1 Port number

1.2 Frequently asked questions about ports

1.3 netstat && pidof

2. UDP protocol

2.1 UDP protocol format

2.2 Features of UDP protocol

2.3 UDP buffer

2.4 UDP-based application layer protocol


1. Transport layer

During network transmission, the application layer first hands its data to the transport layer, which processes the data further and then passes it down to the next layer. This continues through the entire network protocol stack until the data is finally sent onto the network.

1.1 Port number

The port number (Port) identifies the different applications on a host that communicate over the network. When a host receives data from the network, it delivers the data up the protocol stack, and the destination port number carried in the data determines which application finally receives it: the transport layer extracts the destination port number and uses it to decide which process on the current host the data should be delivered to.

The port number is a transport layer concept, and the port fields are part of the transport layer protocol header.

A five-tuple identifies a communication

In the TCP/IP protocol, a five-tuple of "source IP address", "source port number", "destination IP address", "destination port number" and "protocol number" is used to identify a communication.

Multiple client hosts may access the same server at the same time, and each of these hosts may run one client process or several, all of them talking to that server.

The server identifies a communication by "source IP address", "source port number", "destination IP address", "destination port number", and "protocol number":

  • First, extract the destination IP address and destination port number from the data to confirm that the data is addressed to the current server process
  • Then, extract the protocol number from the data and provide the corresponding type of service
  • Finally, extract the source IP address and source port number from the data, use them as the destination IP address and destination port number of the response, and send the response to the corresponding client process

protocol number && port number

  • The protocol number is an 8-bit field in the IP header. It indicates which protocol the data carried in the datagram uses, so that the IP layer of the destination host knows which upper-layer protocol (for example TCP or UDP) should process the data
  • The port number is a 16-bit field in the UDP and TCP headers. Its function is to uniquely identify a process on a host
  • The protocol number works between the network layer and the transport layer, while the port number works between the transport layer and the application layer

Port number range division

The length of the port number is 16 bits, so the range of the port number is 0 ~ 65535:

  • 0 ~ 1023: well-known port numbers. The port numbers of widely used application layer protocols such as HTTP, FTP and SSH are fixed within this range
  • 1024 ~ 65535: port numbers available to ordinary programs. The operating system dynamically allocates a client program's port number from this range, and a programmer can also bind a port in this range explicitly
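
The dynamic allocation described above can be observed directly. The following is a minimal sketch, assuming a host with a loopback interface: binding to port 0 asks the operating system to pick a free port, and the helper name os_assigned_port is only illustrative.

```python
import socket


def os_assigned_port() -> int:
    """Bind a UDP socket to port 0 and return the port the OS picked."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Port 0 means "let the operating system choose a free port".
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print("OS assigned port:", os_assigned_port())
```

The exact value differs from run to run and from system to system, but it should fall outside the well-known range.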

well-known port number

  • SSH server: uses port 22
  • FTP server: uses port 21
  • Telnet server: uses port 23
  • HTTP server: uses port 80
  • HTTPS server: uses port 443

The /etc/services file records each network service name together with the port number and protocol it uses.

Note: Each line in the file corresponds to one service and carries four pieces of information: "service name", "port used", "protocol used", and optional "aliases". The port and protocol are written together as port/protocol.
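
As a rough illustration of that line layout, the sketch below parses a single entry. The function name parse_services_line and the sample line are assumptions made for this example; real files may contain entries this simple parser does not handle.

```python
def parse_services_line(line: str):
    """Parse one /etc/services style line into (name, port, protocol, aliases).

    Returns None for blank lines and comment-only lines.
    """
    body = line.split("#", 1)[0].strip()  # drop any trailing comment
    if not body:
        return None
    fields = body.split()
    port, protocol = fields[1].split("/")  # second field looks like "80/tcp"
    return fields[0], int(port), protocol, fields[2:]


if __name__ == "__main__":
    # Illustrative sample line, not copied from any particular system.
    print(parse_services_line("http 80/tcp www # WorldWideWeb HTTP"))
```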

1.2 Frequently asked questions about ports

Can a port number be bound by multiple processes?

No, because the function of the port number is to uniquely identify a process on a host. By default, binding a port number that is already bound fails.
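
The failure can be reproduced with a few lines. This is a minimal sketch that uses two sockets inside one process for brevity and assumes default socket options (no SO_REUSEADDR or SO_REUSEPORT); the function name second_bind_fails is hypothetical.

```python
import socket


def second_bind_fails() -> bool:
    """Return True if binding an already bound UDP port raises OSError."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))        # let the OS pick a free port
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))  # same address and port again
        except OSError:
            return True                     # typically "Address already in use"
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("second bind failed:", second_bind_fails())
```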

Can a process bind multiple port numbers?

Yes. This does not conflict with "the port number uniquely identifies a process": each of the ports still identifies exactly one process, it just happens to be the same one. A process can therefore bind multiple port numbers, and it can be reached through any of them.
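
A short sketch of one process holding two ports at once, assuming a loopback interface; the function name bind_two_ports is only illustrative.

```python
import socket


def bind_two_ports():
    """Bind two UDP sockets in the same process and return both port numbers."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))
        second.bind(("127.0.0.1", 0))
        # Both sockets are open at the same time, so the ports must differ.
        return first.getsockname()[1], second.getsockname()[1]
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("ports owned by this process:", bind_two_ports())
```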

1.3 netstat && pidof

netstat command

netstat is an important tool for viewing network status. Its common options are:

  • n: Do not resolve names; show addresses and port numbers numerically wherever possible
  • l: Only list sockets in the LISTEN state
  • p: Display the PID and name of the program that owns each socket
  • t (tcp): Only display TCP-related entries
  • u (udp): Only display UDP-related entries
  • a (all): Display all sockets; by default, sockets in the LISTEN state are not shown

When viewing TCP-related network information, the nltp option combination is generally used (netstat -nltp).

When viewing UDP-related network information, the nlup option combination is generally used (netstat -nlup).

To view connections in states other than LISTEN, drop the l option.

pidof command

Previously, a process ID was looked up by piping the output of the ps command into grep. The pidof command returns the process ID directly from the process name, which is more convenient.

Combined with the kill command, pidof can quickly terminate a process by name.

2. UDP protocol

2.1 UDP protocol format

UDP protocol location 

The interfaces used in network socket programming are system call interfaces that sit between the application layer and the transport layer. They are provided by the operating system, and upper-layer application protocols such as HTTP are built on top of them. Saying that HTTP is based on TCP really means that HTTP is implemented with TCP socket programming.

The transport layer below the socket interface is managed by the operating system, so UDP belongs to the kernel and is built into the operating system's own protocol stack. Its code is not written by upper-level users; all UDP functionality is implemented by the operating system. In this sense the network stack can be regarded as part of the operating system.

UDP protocol format

  • 16-bit source port number: indicates where the data comes from
  • 16-bit destination port number: indicates where the data is going
  • 16-bit UDP length: indicates the length of the entire datagram (UDP header + UDP data)
  • 16-bit UDP checksum: if the checksum verification fails, the message is discarded directly
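
The four fields above can be packed into their 8-byte wire form with Python's struct module. This is only a sketch of the layout: the helper name build_udp_header is an assumption, and the checksum is left at 0 rather than computed.

```python
import struct

# "!" selects network (big-endian) byte order; each "H" is one unsigned 16-bit field.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_header(src_port: int, dst_port: int, payload_len: int, checksum: int = 0) -> bytes:
    """Pack source port, destination port, total length and checksum into 8 bytes."""
    total_length = UDP_HEADER.size + payload_len  # length covers header + data
    return UDP_HEADER.pack(src_port, dst_port, total_length, checksum)


if __name__ == "__main__":
    header = build_udp_header(12345, 53, payload_len=4)
    print(len(header), header.hex())
```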

In UDP socket programming, the port number has the type uint16_t. The fundamental reason is that the port number field in the transport layer protocol is 16 bits wide.

Note: The length field in the UDP header is 16 bits, so a UDP message can be at most 64 KB (65535 bytes, including the 8-byte UDP header). 64 KB is small in today's Internet environment. If the data to be transmitted exceeds this limit, the application layer must split it into several packets itself, send them one by one, and reassemble them at the receiving end.
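
One possible shape for such application-layer splitting and reassembly is sketched below. The 8-byte chunk header (sequence number, total chunk count) and the 1024-byte chunk size are assumptions invented for this example, not part of UDP, and the sketch does not deal with lost chunks beyond detecting them.

```python
import struct

CHUNK_HEADER = struct.Struct("!II")  # hypothetical header: (sequence number, total chunks)
MAX_CHUNK = 1024                     # assumed payload bytes per datagram


def split_message(data: bytes, chunk_size: int = MAX_CHUNK):
    """Split data into datagram-sized pieces, each prefixed with a chunk header."""
    total = max(1, -(-len(data) // chunk_size))  # ceiling division, at least one chunk
    return [
        CHUNK_HEADER.pack(seq, total) + data[seq * chunk_size:(seq + 1) * chunk_size]
        for seq in range(total)
    ]


def reassemble(datagrams) -> bytes:
    """Rebuild the original data from chunks that may arrive in any order."""
    parts = {}
    total = None
    for datagram in datagrams:
        seq, total = CHUNK_HEADER.unpack_from(datagram)
        parts[seq] = datagram[CHUNK_HEADER.size:]
    if total is None or len(parts) != total:
        raise ValueError("missing chunks")
    return b"".join(parts[seq] for seq in range(total))


if __name__ == "__main__":
    message = bytes(range(256)) * 1000           # 256000 bytes, more than 64 KB
    chunks = split_message(message)
    print(len(chunks), reassemble(reversed(chunks)) == message)
```

Carrying a sequence number in every chunk lets the receiver restore the order itself, which matters because UDP does not guarantee it.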

How does UDP separate the header from the payload?

UDP uses a fixed-length header that contains only four fields of 16 bits each, 8 bytes in total. Therefore, when UDP reads a message, the first 8 bytes are the header and everything after them is the payload.
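
In code, this separation is a fixed-offset slice. A minimal sketch follows; the function name split_udp_message is hypothetical, and the checksum is not verified here.

```python
import struct

UDP_HEADER_LEN = 8  # four 16-bit fields


def split_udp_message(message: bytes):
    """Return ((src_port, dst_port, length, checksum), payload) for a raw UDP message."""
    if len(message) < UDP_HEADER_LEN:
        raise ValueError("message shorter than a UDP header")
    header = struct.unpack("!HHHH", message[:UDP_HEADER_LEN])
    length = header[2]
    # The length field counts header + data, so the payload ends at that offset.
    payload = message[UDP_HEADER_LEN:length]
    return header, payload


if __name__ == "__main__":
    raw = struct.pack("!HHHH", 40000, 8080, 13, 0) + b"hello"
    print(split_udp_message(raw))
```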

How does UDP decide to deliver the payload to upper layers?

Many application layer protocols run on top of UDP, so UDP must hand the payload to the right upper-layer protocol, that is, to the right application layer process. Every network process in the application layer is bound to a port number: a server process must bind its port explicitly, while a client process usually has a port bound dynamically by the system. UDP uses the destination port number in the header to find the corresponding application layer process.

Note: The kernel maintains the mapping between port numbers and processes in a hash table, so the transport layer can use the port number to look up the corresponding process ID and then find the application layer process.
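
The lookup can be pictured with a toy model in which a Python dict (itself a hash table) plays the role of the kernel's table. The class PortTable and its methods are invented for illustration and are not how any real kernel is written.

```python
from collections import deque


class PortTable:
    """Toy model of a port lookup table: port number -> receive queue of the owner."""

    def __init__(self):
        self._queues = {}  # dict is a hash table keyed by port number

    def bind(self, port: int) -> deque:
        """Register a port and return the queue its owner reads from."""
        if port in self._queues:
            raise OSError("port already bound")
        self._queues[port] = deque()
        return self._queues[port]

    def deliver(self, dst_port: int, payload: bytes) -> bool:
        """Hand a payload to whoever bound dst_port; drop it if nobody did."""
        queue = self._queues.get(dst_port)
        if queue is None:
            return False
        queue.append(payload)
        return True


if __name__ == "__main__":
    table = PortTable()
    inbox = table.bind(8080)
    print(table.deliver(8080, b"hello"), table.deliver(9090, b"lost"), list(inbox))
```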

How to understand the header?

The Linux operating system is written in C, and the UDP protocol is part of the kernel protocol stack, so UDP is implemented in C as well. The UDP header is simply a C structure with four 16-bit members (struct udphdr in the Linux kernel).
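
The same fixed layout can be mirrored from Python with ctypes, which makes the "header is just a structure" idea concrete. This is a sketch, not kernel code: the class name UdpHeader is an assumption, and the member names only imitate those commonly seen in C definitions.

```python
import ctypes


class UdpHeader(ctypes.BigEndianStructure):
    """Four 16-bit members in network byte order, 8 bytes in total."""

    _fields_ = [
        ("source", ctypes.c_uint16),  # source port
        ("dest", ctypes.c_uint16),    # destination port
        ("length", ctypes.c_uint16),  # header + data length
        ("check", ctypes.c_uint16),   # checksum
    ]


if __name__ == "__main__":
    raw = b"\x30\x39\x00\x35\x00\x0c\x00\x00"
    header = UdpHeader.from_buffer_copy(raw)  # overlay the structure on raw bytes
    print(ctypes.sizeof(UdpHeader), header.source, header.dest, header.length)
```

Filling in a header then amounts to assigning the members of one such variable, and reading a header amounts to viewing the first 8 bytes of a message through the structure.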

UDP data encapsulation:

  • When the application layer hands data to the transport layer, the transport layer creates a variable of the UDP header type and fills in its fields, which yields a UDP header
  • The operating system then allocates a block of kernel memory and copies the UDP header and the payload into it together, which forms a UDP message

UDP data distribution:

  • When the transport layer obtains a message from the lower layer, it will read the first 8 bytes of the message and extract the corresponding destination port number
  • Find the corresponding upper application layer process through the destination port number, and then deliver the remaining payload up to the application layer process

2.2 Features of UDP protocol

  • Connectionless: once the IP address and port number of the peer are known, data is transmitted directly without establishing a connection
  • Unreliable: there is no acknowledgment mechanism and no retransmission mechanism; if a message cannot reach the other side because of a network failure, the UDP protocol layer does not report any error to the application layer
  • Datagram-oriented: the application cannot flexibly control how many times, or in what sizes, the data is read and written
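
The "no error is reported" behaviour can be seen by sending to a port that nobody listens on. The sketch below assumes an unconnected UDP socket on a typical Linux host; the function name send_to_nobody is hypothetical, and other platforms may behave differently.

```python
import socket


def send_to_nobody() -> int:
    """Send a datagram to a port with no listener and return sendto's result."""
    # Reserve a port, then close the socket so that nothing is listening on it.
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.bind(("127.0.0.1", 0))
    dead_address = probe.getsockname()
    probe.close()

    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # No connection is set up first; sendto only needs the peer's IP and port.
        return sender.sendto(b"ping", dead_address)


if __name__ == "__main__":
    print("bytes handed to the kernel:", send_to_nobody())
```

The call reports the number of bytes accepted for sending even though no process ever receives them.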

Note: When packets are routed and forwarded in the network, not every packet selects the same routing path, so the order in which the packets are sent may be different from the order in which they are received.

Datagram Oriented

UDP sends each message exactly as the application layer hands it over, without splitting or merging it.

For example, to transmit 100 bytes of data over UDP: if the sender calls sendto once to send 100 bytes, the receiver must also call recvfrom once to receive those 100 bytes; it cannot call recvfrom 10 times in a loop and receive 10 bytes each time.
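
A loopback experiment makes the message boundaries visible. This is a minimal sketch, assuming a local loopback interface where both datagrams arrive in order; the function name datagram_sizes is only illustrative.

```python
import socket


def datagram_sizes():
    """Send a 100-byte and a 50-byte datagram; return the size of each receive."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)            # do not hang forever if something is lost
        address = receiver.getsockname()

        sender.sendto(b"a" * 100, address)  # one sendto call, one datagram
        sender.sendto(b"b" * 50, address)

        # The buffer is large enough for both, yet each call returns one datagram.
        first, _ = receiver.recvfrom(1024)
        second, _ = receiver.recvfrom(1024)
        return len(first), len(second)
    finally:
        receiver.close()
        sender.close()


if __name__ == "__main__":
    print(datagram_sizes())
```

The two messages are never merged into a single 150-byte read, which is what distinguishes datagram-oriented from byte-stream behaviour.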

2.3 UDP buffer

  • UDP has no real send buffer. Calling sendto hands the message straight to the kernel, which passes it to the network layer protocol for transmission
  • UDP has a receive buffer. However, the receive buffer cannot guarantee that UDP messages are received in the same order in which they were sent, and if the buffer is full, newly arriving UDP data is discarded
  • A UDP socket can be read from and written to at the same time, so UDP is full-duplex
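
Both points can be checked from user space. The sketch below queries the receive buffer size with the standard SO_RCVBUF socket option and then uses each of two sockets for both sending and receiving; the function names are assumptions made for this example.

```python
import socket


def receive_buffer_size() -> int:
    """Return the kernel's receive buffer size (in bytes) for a new UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def ping_pong():
    """Use each socket for both directions to show that UDP sockets are full-duplex."""
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for sock in (a, b):
            sock.bind(("127.0.0.1", 0))
            sock.settimeout(2.0)
        a.sendto(b"ping", b.getsockname())   # a writes, b reads
        request, peer = b.recvfrom(1024)
        b.sendto(b"pong", peer)              # b writes, a reads on the same sockets
        reply, _ = a.recvfrom(1024)
        return request, reply
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(receive_buffer_size(), ping_pong())
```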

Why does UDP have a receive buffer?

If UDP had no receive buffer, the upper layer would have to read every message as soon as UDP obtained it. Whenever a message was still waiting to be read, any further message that UDP obtained from the lower layer would have to be discarded.

Transmitting a message from one host to another consumes both host resources and network resources. Discarding a message that may be perfectly valid, only because the previous message has not yet been read by the upper layer, would waste those resources and cause unnecessary packet loss.

Therefore, UDP maintains a receive buffer. When a new UDP message arrives, it is placed in the receive buffer, and the upper layer reads data directly from this buffer. If the receive buffer is empty, the read blocks. The function of the UDP receive buffer is thus to temporarily hold received messages until the upper layer reads them.
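
Both behaviours, waiting on an empty buffer and reading a message that arrived earlier, can be sketched as follows. A timeout is set so that the blocking read gives up instead of waiting forever; the function names are hypothetical and a local loopback interface is assumed.

```python
import socket


def read_empty_buffer(timeout: float = 0.2) -> str:
    """Try to read from a socket whose receive buffer is empty."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))
        sock.settimeout(timeout)   # without this, recvfrom would block indefinitely
        try:
            sock.recvfrom(1024)
        except socket.timeout:
            return "timed out"
        return "received"


def read_after_arrival() -> bytes:
    """Read a message that was buffered before the reader asked for it."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        # Nobody is reading yet; the message waits in the receive buffer.
        sender.sendto(b"stored", receiver.getsockname())
        data, _ = receiver.recvfrom(1024)
        return data
    finally:
        receiver.close()
        sender.close()


if __name__ == "__main__":
    print(read_empty_buffer(), read_after_arrival())
```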

2.4 UDP-based application layer protocol

  • NFS: Network File System
  • TFTP: Trivial File Transfer Protocol
  • DHCP: Dynamic Host Configuration Protocol
  • BOOTP: Bootstrap Protocol (used to boot diskless devices)
  • DNS: Domain Name System
  • Custom application layer protocols that you define when writing UDP programs


Origin blog.csdn.net/GG_Bruse/article/details/130395120