Network foundation: 4. POSIX API and network protocol stack

1. What is the POSIX API

POSIX (Portable Operating System Interface) is a family of IEEE standards that defines a portable programming interface for UNIX-like operating systems. The POSIX API specifies standard functions, constants, and headers covering file and directory operations, process control, signal handling, thread management, memory management, sockets, and more. Code written against the POSIX API can be compiled and run on any conforming system with little or no change, which lets developers write highly portable applications.
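To make the portability claim a little more concrete, here is a minimal sketch that calls a few POSIX-specified functions (pipe, write, read, close, getpid) through Python's os module, which exposes thin wrappers around them. The helper name pipe_roundtrip is made up for this example, and it assumes a UNIX-like system and a message small enough to fit in the pipe buffer.

```python
import os


def pipe_roundtrip(message: bytes) -> bytes:
    """Send a small message through an anonymous pipe using POSIX-style calls."""
    read_fd, write_fd = os.pipe()          # pipe(): returns two file descriptors
    try:
        os.write(write_fd, message)        # write(): put the bytes into the pipe
        os.close(write_fd)                 # close(): the reader sees EOF after the data
        write_fd = -1
        return os.read(read_fd, len(message) or 1)   # read(): fetch the bytes back
    finally:
        os.close(read_fd)
        if write_fd != -1:
            os.close(write_fd)


print(os.getpid() > 0)                     # getpid(): process ID of the caller
print(pipe_roundtrip(b"hello posix"))
```

The same sequence of calls exists under the same names in C on any POSIX system, which is the point of the standard.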

2. Network protocol stack

1. TCP header

| Field | Length | Description |
|---|---|---|
| Source port | 2 bytes (16 bits) | Identifies the sending application's port number |
| Destination port | 2 bytes (16 bits) | Identifies the receiving application's port number |
| Sequence number | 4 bytes (32 bits) | Position of the segment's first data byte in the byte stream; used for ordering and reliable delivery |
| Acknowledgment number | 4 bytes (32 bits) | Sequence number of the next byte the sender of this segment expects to receive; valid when ACK is set |
| Data offset (header length) | 4 bits | Length of the TCP header, including options, in 32-bit words |
| Reserved | 6 bits | Reserved bits, must be 0 |
| Control bits | 6 bits | TCP control bits: URG, ACK, PSH, RST, SYN, and FIN |
| Window size | 2 bytes (16 bits) | Number of bytes the sender of this segment is willing to receive (flow control) |
| Checksum | 2 bytes (16 bits) | Checksum over the TCP header, the data, and a pseudo-header of IP fields |
| Urgent pointer | 2 bytes (16 bits) | Indicates where the urgent data ends; valid only when URG is set |
| Options | variable | Provides extra features such as maximum segment size, timestamps, etc. |
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
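The layout above can be decoded with a few lines of code. The following is a minimal sketch, assuming the classic layout described here (4-bit data offset, 6 reserved bits, 6 control bits); the function name parse_tcp_header and the dictionary keys are made up for illustration.

```python
import struct

# Control bits from least significant (bit 0) to most significant (bit 5).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = off_flags >> 12          # upper 4 bits, counted in 32-bit words
    flag_bits = off_flags & 0x3F           # lower 6 bits hold URG..FIN
    header_len = data_offset * 4           # header length in bytes, options included
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[20:header_len],
    }


# A hand-built SYN+ACK header without options, used only as sample input.
sample = struct.pack("!HHIIHHHH", 80, 54321, 3000, 1001, (5 << 12) | 0x12, 65535, 0, 0)
print(parse_tcp_header(sample)["flags"])
```

The sample header sets bits 0x02 (SYN) and 0x10 (ACK), so the decoded flag list contains exactly those two names.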

2. UDP packets

| Field | Length (bits) | Description |
|---|---|---|
| Source port | 16 | Port number of the sender |
| Destination port | 16 | Port number of the receiver |
| Length | 16 | Length of the UDP datagram in bytes, including the UDP header and the data |
| Checksum | 16 | Checksum of the UDP datagram, calculated by the sender and verified by the receiver |
+-------------+------------------+--------+----------+----------------+
|                     UDP header                     |      Data      |
+-------------+------------------+--------+----------+----------------+
| Source port | Destination port | Length | Checksum |      Data      |
+-------------+------------------+--------+----------+----------------+
|<--------------------------- UDP datagram -------------------------->|
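Because the UDP header is just four 16-bit fields, building and parsing it takes very little code. Below is a small sketch; the function names are made up, and the checksum is left at 0, which over IPv4 means the sender did not compute one.

```python
import struct

# Source port, destination port, length, checksum: four 16-bit fields.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header to the payload."""
    length = UDP_HEADER.size + len(payload)      # header plus data, in bytes
    checksum = 0                                 # 0 = "no checksum" over IPv4
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a UDP datagram into header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src,
        "dst_port": dst,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


print(parse_udp_datagram(build_udp_datagram(5000, 53, b"hi")))
```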

3. IP protocol

| Field | Length (bits) | Description |
|---|---|---|
| Version | 4 | Version number of the IP protocol (4 for IPv4) |
| Header length (IHL) | 4 | Length of the IP header, in 32-bit words |
| Differentiated services (Type of Service) | 8 | Classifies and marks how the datagram should be handled |
| Total length | 16 | Total length of the IP datagram in bytes, including header and data |
| Identification | 16 | Identifies a datagram so that its fragments can be reassembled |
| Flags | 3 | Control fragmentation: Don't Fragment (DF) and More Fragments (MF) |
| Fragment offset | 13 | Where this fragment starts relative to the original datagram, in units of 8 bytes |
| Time to live (TTL) | 8 | Maximum number of hops (routers) the datagram may pass through |
| Protocol | 8 | Identifies the upper-layer protocol in the datagram, such as TCP (6) or UDP (17) |
| Header checksum | 16 | Checksum used to detect errors in the IP header |
| Source address | 32 | Sender's IP address |
| Destination address | 32 | Receiver's IP address |
| Options | variable | Optional fields that provide additional information |
| Data | variable | The payload actually being transmitted |
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Data                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
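As an illustration of how these fields fit together, here is a sketch that builds a 20-byte IPv4 header without options, fills in the header checksum, and parses it back. The function names and default values (TTL 64, protocol 6 for TCP) are choices made for this example only.

```python
import socket
import struct

IPV4_FIXED = "!BBHHHBBH4s4s"   # the 20-byte header without options


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum used by the IP header checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                         # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int,
                      protocol: int = 6, ttl: int = 64, ident: int = 0) -> bytes:
    version_ihl = (4 << 4) | 5                 # version 4, header length 5 words
    header = struct.pack(IPV4_FIXED, version_ihl, 0, 20 + payload_len, ident,
                         0, ttl, protocol, 0,  # checksum field is 0 while summing
                         socket.inet_aton(src), socket.inet_aton(dst))
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


def parse_ipv4_header(packet: bytes) -> dict:
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack(IPV4_FIXED, packet[:20])
    ihl = ver_ihl & 0x0F
    return {
        "version": ver_ihl >> 4,
        "header_len": ihl * 4,
        "total_length": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,             # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        # Summing a header that includes a correct checksum yields 0.
        "checksum_ok": internet_checksum(packet[:ihl * 4]) == 0,
    }


print(parse_ipv4_header(build_ipv4_header("192.168.0.1", "10.0.0.2", 100)))
```

A router that decrements the TTL must also update the checksum, which is why the check fails if any header byte is changed without recomputing it.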

3. Frequently asked questions about the protocol stack

1. TCP three-way handshake process

The TCP three-way handshake is the exchange of three segments between the client and the server that establishes a TCP connection.

The first time: the client sends a SYN segment carrying its initial sequence number to request a connection to the server.

The second time: after the server receives the client's request, it replies with a single SYN+ACK segment. The ACK confirms receipt of the client's SYN, and the SYN carries the server's own initial sequence number.

The third time: after receiving the reply from the server, the client replies with an ACK confirming receipt of the server's SYN. Once the server receives this ACK, the connection is established in both directions.
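The sequence and acknowledgment numbers exchanged in these three steps follow a simple rule: a SYN consumes one sequence number, so each side acknowledges the other's initial sequence number plus one. The sketch below only models the numbers, it does not touch the network; the Segment class and function name are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_SPACE = 2 ** 32            # sequence numbers are 32 bits and wrap around


@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments of a handshake for the given initial sequence numbers."""
    syn = Segment("SYN", seq=client_isn)
    # The server acknowledges the client's SYN and sends its own SYN in one segment.
    syn_ack = Segment("SYN+ACK", seq=server_isn, ack=(syn.seq + 1) % SEQ_SPACE)
    # The client acknowledges the server's SYN; its own sequence number has advanced by 1.
    ack = Segment("ACK", seq=syn_ack.ack, ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

With initial sequence numbers 100 and 300, the server acknowledges 101 and the client acknowledges 301.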

2. TCP four-way wave (connection termination) process

The first time: the closing party sends a FIN segment, indicating that it has no more data to send, and enters the FIN_WAIT_1 state.

The second time: after receiving the FIN, the other party replies with an ACK to confirm receipt of the close request and enters the CLOSE_WAIT state. In this state it can still send any data it has left. The closing party enters FIN_WAIT_2 when the ACK arrives.

The third time: once the other party has finished sending, it sends its own FIN and enters the LAST_ACK state.

The fourth time: after receiving the FIN, the closing party replies with an ACK and enters the TIME_WAIT state. The other party closes the connection as soon as it receives this ACK.

Note that every FIN must be acknowledged; a FIN that is not acknowledged is retransmitted. After sending the last ACK, the closing party waits for a period of time (twice the maximum segment lifetime, 2MSL) so that it can resend the ACK if the other party retransmits its FIN, and so that delayed segments of this connection expire before the same address and port pair is used again.
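The closing sequence can also be read as a small state machine, using the standard TCP state names. The table below is a simplified sketch: it covers only the ordinary case where one side closes first and leaves out simultaneous close (the CLOSING state). The event labels are informal names chosen for this example.

```python
# (current state, event) -> next state, for a normal (non-simultaneous) close.
TRANSITIONS = {
    # Side that closes first (active close)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Other side (passive close)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]   # KeyError means an invalid step
        path.append(state)
    return path


active = walk(["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"])
passive = walk(["recv FIN, send ACK", "send FIN", "recv ACK"])
print(" -> ".join(active))
print(" -> ".join(passive))
```

Only the side that closes first passes through TIME_WAIT; the other side goes straight from LAST_ACK to CLOSED.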

3. Why does establishing a connection take a three-way handshake, while disconnecting takes a four-way wave

A three-way handshake is required to establish a connection because both parties must confirm that the other party can receive what they send, so each side's SYN has to be acknowledged. Three segments are enough because the server combines its ACK of the client's SYN and its own SYN in a single segment.

Disconnecting takes four segments because TCP is full-duplex and each direction is closed independently. When one party sends a FIN, it only says that it has no more data to send. The other party acknowledges the FIN right away, but it may still have data of its own to send, so its FIN usually goes out later in a separate segment, which must be acknowledged in turn.
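The independent closing of each direction can be observed on a loopback connection. In the sketch below (function name and messages are made up), the client closes only its sending direction with shutdown(SHUT_WR), which sends its FIN, and can still receive the server's reply afterwards.

```python
import socket
import threading


def half_close_demo() -> bytes:
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))              # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            received = b""
            while True:
                chunk = conn.recv(1024)
                if not chunk:                  # empty read: the client's FIN arrived
                    break
                received += chunk
            # The server-to-client direction is still open at this point.
            conn.sendall(b"got " + received)
        # Leaving the with-block closes conn, which sends the server's FIN.

    worker = threading.Thread(target=serve)
    worker.start()

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    with client:
        client.sendall(b"ping")
        client.shutdown(socket.SHUT_WR)        # send FIN, but keep receiving
        reply = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:                      # empty read: the server's FIN arrived
                break
            reply += chunk

    worker.join()
    server.close()
    return reply


print(half_close_demo())
```

The reply arrives after the client has already sent its FIN, which would be impossible if both directions had to close in one step.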

4. The duration of and reasons for the TIME_WAIT state

The TIME_WAIT state lasts twice the MSL (Maximum Segment Lifetime), that is, twice the longest time a segment can survive in the network. RFC 793 suggests an MSL of 2 minutes, but implementations are free to use shorter values. In Linux the TIME_WAIT duration is fixed at 60 seconds (the kernel constant TCP_TIMEWAIT_LEN), which corresponds to an MSL of 30 seconds.

The TIME_WAIT state exists so that the connection is shut down cleanly. If the final ACK is lost, the peer retransmits its FIN, and the closing party must still be around to acknowledge it. The wait also keeps delayed segments of the old connection from being mistaken for data of a new connection.

Its main purposes are:

  1. Make sure the peer receives the final ACK of its FIN, resending the ACK if the FIN is retransmitted
  2. Prevent a new connection with the same addresses and port numbers from being opened right after an active close
  3. Let delayed and out-of-order segments of the old connection expire in the network

A socket in the TIME_WAIT state occupies a small amount of kernel memory and its local port, and costs almost no CPU. However, a host that actively closes a large number of short-lived connections can accumulate many TIME_WAIT sockets, which may exhaust the available local ports toward a given destination and affect the performance of the system.
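One place where TIME_WAIT shows up in practice is restarting a server: bind can fail with "Address already in use" while earlier connections on the port are still in TIME_WAIT. A common remedy, sketched below with a made-up helper name, is to set SO_REUSEADDR before bind. This is a sketch of the usual idiom, not a claim about how any particular server is configured.

```python
import socket


def make_listener(port: int = 0) -> socket.socket:
    """Create a listening TCP socket that may rebind a port with TIME_WAIT leftovers."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(); allows binding while old connections
    # on this port are still in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))             # port 0: let the OS choose
    sock.listen()
    return sock


listener = make_listener()
print("listening on port", listener.getsockname()[1])
listener.close()
```

SO_REUSEADDR does not shorten TIME_WAIT itself; the old sockets still wait out their full period.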

5. Timeout retransmission and fast retransmission

Both timeout retransmission and fast retransmission are retransmission mechanisms in the TCP protocol.

Timeout retransmission means that after the sender sends data, it starts a timer, and if no acknowledgment arrives before the retransmission timeout (RTO) expires, it resends the data. This mechanism works even when nothing else gets through, but it is slow: if the timeout is too long, the sender sits idle and transmission slows down, and if it is too short, data is retransmitted needlessly.

Fast retransmission means that when the receiver receives a segment out of order, it immediately sends a duplicate ACK that repeats the acknowledgment number of the data it is still missing. When the sender receives 3 duplicate ACKs, it retransmits the missing segment right away without waiting for the timeout. This avoids the idle wait of a timeout and recovers from an isolated loss much faster.
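The duplicate-ACK rule is easy to simulate. The sketch below numbers whole segments 0, 1, 2, ... instead of bytes to keep it short, and both function names are made up. The receiver always acknowledges the next segment it expects; the sender retransmits when it sees the same ACK repeated three times.

```python
def receiver_acks(arrivals):
    """Return the cumulative ACK (next expected segment) sent after each arrival."""
    received = set()
    expected = 0
    acks = []
    for segment in arrivals:
        received.add(segment)
        while expected in received:        # advance past everything now in order
            expected += 1
        acks.append(expected)
    return acks


def fast_retransmits(acks, threshold=3):
    """Return the segments a sender would resend on the third duplicate ACK."""
    resend = []
    last_ack, duplicates = None, 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:    # third duplicate: resend without waiting
                resend.append(ack)
        else:
            last_ack, duplicates = ack, 0
    return resend


# Segments 0..5 are sent and segment 2 is lost on the way.
acks = receiver_acks([0, 1, 3, 4, 5])
print(acks)                       # [1, 2, 2, 2, 2]
print(fast_retransmits(acks))     # [2]
```

Once the retransmitted segment 2 arrives, the receiver's next ACK jumps to 6, because segments 3 to 5 were already buffered.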

6. What are the fields in the TCP header

  1. Source port number (16 bits): identifies the port number of the sending application.
  2. Destination port number (16 bits): identifies the port number of the receiving application.
  3. Sequence number (32 bits): the position of the segment's first data byte in the entire byte stream.
  4. Acknowledgment number (32 bits): the sequence number of the next data byte expected from the other party; valid only when the ACK flag is set.
  5. Data offset (4 bits): the length of the TCP header, in units of 4 bytes. The minimum value is 5 (20 bytes, no options), and the maximum value is 15 (60 bytes, with 40 bytes of options).
  6. Reserved (6 bits): reserved bits, not used and set to 0.
  7. Control bits (6 bits): used to control the establishment, maintenance and release of TCP connections. The six control bits are URG, ACK, PSH, RST, SYN, and FIN.
  8. Window size (16 bits): how many bytes of data the sender of this segment is currently able to receive, that is, the TCP receive window.
  9. Checksum (16 bits): used to detect transmission errors in the TCP header and data.
  10. Urgent pointer (16 bits): gives the location of urgent data; only valid when the URG flag is set.
  11. Options (variable length): extra TCP header information, such as the maximum segment size (MSS), timestamps, etc. The options field may be absent or take up to 40 bytes.

7. The significance of the backlog parameter when TCP is listening

The backlog parameter of listen limits how many pending connections the kernel will queue for a listening socket. Specifically, once the TCP server calls the listen function, the kernel maintains two queues:

  • Completed connection queue (accept queue)
  • Incomplete connection queue (SYN queue)

The completed connection queue contains client connections that have finished the three-way handshake and are waiting for the server to accept them, while the incomplete connection queue contains client connections whose SYN has been received but whose three-way handshake has not yet completed.

Which queue backlog limits depends on the implementation. In Linux since kernel 2.2, backlog is the maximum length of the completed connection queue and is capped by net.core.somaxconn, while the incomplete connection queue is limited separately by net.ipv4.tcp_max_syn_backlog. Some older BSD-derived implementations applied backlog to the sum of both queues. When the relevant queue is full, new connection requests are dropped or ignored until the server calls accept and frees a slot. Therefore, the value of backlog should be set according to the expected load of the server. If the backlog is too small, connection requests may be delayed or rejected during bursts; if it is too large, pending connections may take up too much memory.

In short, backlog bounds the number of connections waiting to be accepted, so that the server can absorb as many connection requests as possible while avoiding problems such as denial of service or exhaustion of memory resources.
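In code, backlog is simply the argument passed to listen. The sketch below shows the call and, as a Linux-specific assumption, reads the system-wide cap from /proc/sys/net/core/somaxconn; on other systems that file does not exist and the helper returns None. Both function names are made up for this example.

```python
import socket


def start_listener(backlog: int) -> socket.socket:
    """Open a listening TCP socket on a free loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(backlog)        # backlog: hint for the pending-connection queue length
    return sock


def read_somaxconn(path: str = "/proc/sys/net/core/somaxconn"):
    """Return the Linux cap on backlog, or None where the file is unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read())
    except (OSError, ValueError):
        return None


listener = start_listener(backlog=128)
print("port:", listener.getsockname()[1], "somaxconn:", read_somaxconn())
listener.close()
```

Python also exposes socket.SOMAXCONN, a compile-time constant that may differ from the value the running kernel actually enforces.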

8. In which step of the three-way handshake does Accept occur?

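accept does not take part in any step of the three-way handshake. The kernel completes the handshake on its own and puts the established connection into the completed connection queue; accept then takes a connection out of that queue and returns a new socket for it. If the queue is empty, a blocking accept waits until a handshake has completed.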
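This ordering can be checked on a loopback socket: in the sketch below (function name made up), the client connects and even sends data before the server has called accept. It relies on the kernel finishing the handshake for a listening socket by itself, which is how common systems such as Linux behave.

```python
import socket


def connect_before_accept() -> bytes:
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    # accept() has not been called yet; connect() still succeeds because
    # the kernel performs the three-way handshake for the listening socket.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(b"early data")          # buffered by the kernel until accepted

    # Only now is the already-established connection handed to the application.
    conn, _ = server.accept()
    conn.settimeout(5)
    data = conn.recv(1024)

    for sock in (conn, client, server):
        sock.close()
    return data


print(connect_before_accept())
```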

9. What are the security weaknesses of the three-way handshake process

  1. SYN flood attack: the attacker sends a large number of SYN requests without completing the handshake, filling the server's incomplete connection queue so that it cannot process normal requests.
  2. IP spoofing: the attacker forges the source IP address of a SYN request, hiding its origin and deceiving the server into replying to an address that never asked for a connection.
  3. Replay attack: the attacker captures the segments of an earlier handshake and resends them later in an attempt to bypass the security measures of the server.
  4. Man-in-the-middle attack: TCP neither authenticates nor encrypts segments, so an attacker on the path can intercept and tamper with data during communication to obtain information illegally or interfere with the communication.

10. The difference between TCP and UDP

  1. Connection method: TCP is a connection-oriented protocol. A three-way handshake is required to establish a connection before transmission, and a four-way wave is required to close it afterwards. UDP is a connectionless protocol, and each datagram is sent independently without establishing a connection.
  2. Transmission reliability: TCP guarantees reliable, in-order delivery through mechanisms such as sequence numbers, acknowledgments and retransmission; UDP does not guarantee delivery, and datagrams may be lost, duplicated or arrive out of order.
  3. Data transmission efficiency: UDP has lower overhead than TCP. Its header is 8 bytes instead of at least 20, and it has no connection establishment, acknowledgments or retransmissions. UDP also supports broadcast and multicast, which TCP does not.
  4. Flow and congestion control: TCP uses a sliding window for flow control and adjusts its sending rate to network conditions through congestion control, making data transmission more stable; UDP has neither mechanism, so a sender can easily overwhelm the receiver or congest the network.
  5. Application scenarios: TCP is suitable for scenarios that require high transmission reliability, such as file transfer and email; UDP is suitable for scenarios that need low latency and can tolerate some loss, such as audio and video transmission, games, etc.
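The connectionless nature of UDP is visible in code: there is no listen, accept or connect, just sendto and recvfrom. The sketch below (function name made up) sends one datagram over loopback, where loss is very unlikely; a timeout is set because UDP itself gives no delivery guarantee.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    receiver.settimeout(5)                 # UDP may lose data, so do not wait forever
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No handshake: the datagram is simply addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _address = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_roundtrip(b"datagram"))
```

Each recvfrom returns exactly one datagram, whereas a TCP recv returns whatever part of the byte stream is available.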


Origin blog.csdn.net/weixin_44839362/article/details/130514827