Linux C system programming (13) Network programming TCP and UDP protocol

1 Introduction to computer network basics

1.1 Introduction to Computer Network Architecture

Computer network architecture follows a highly structured design method based on layering. Layering offers good flexibility, makes each layer easy to implement and maintain, helps the layers cooperate with one another, and promotes standardization.

Two standards for computer networks:

  1. OSI / RM (Open Systems Interconnection Reference Model): the international standard in theory. The OSI architecture has 7 layers: application layer, presentation layer, session layer, transport layer, network layer, data link layer, and physical layer.
  2. TCP / IP: the international standard in practice, that is, the standard actually adopted on real networks. The TCP / IP architecture has four layers: application layer, transport layer, Internet layer (IP), and network interface layer.

1.2 Introduction to the TCP / IP model

  • Network interface layer: responsible for sending and receiving IP datagrams; the network interface layer protocols run here.
  • Internet layer (IP): routing, flow control and congestion control; the network layer protocol provides a connectionless service.
  • Transport layer: establishes end-to-end communication; the TCP and UDP protocols run here.
  • Application layer: runs the protocols used for communication between processes; data processing.

1.3 Connection-oriented services and connectionless services

Services fall into two categories: connection-oriented services and connectionless services.

  1. Connection-oriented services: a connection must be established before data is transmitted, and it is released when the transmission is complete. Such a service has three phases: connection establishment, data transmission, and connection release. Each data packet carries its source IP and destination IP. Transmission is reliable, but the protocol is more complicated.
  2. Connectionless services: no connection is established or released for communication, so the establishment and release phases do not exist and data is simply sent. Transmission is unreliable, but the protocol is relatively simple.

2 Introduction to CS (client / server)

In a TCP / IP network, processes on two hosts communicate in the client / server (CS) mode, where "client" and "server" refer to processes. The client and the server are described as follows:

@ 1 Client: the party that initiates the connection request. The client process works as follows:

//1 Run a client process and send a connection request message to a specific port on the server;
//2 Wait for and receive the server's reply;
//3 Continue to make requests;
//4 Close the communication channel when communication ends.

@ 2 Server: the party that accepts requests and provides the service. The server process works as follows:

//1 Start the server process;
//2 loop: wait for and detect a connection request from a client / the next client;
//3     if a connection request is received, handle it;
//4     goto loop

3 User Datagram Protocol (UDP)

3.1 User datagram format

A UDP user datagram has two fields: a header field and a data field. For the checksum calculation, a pseudo header is placed in front of the user datagram header.

@ 1 The format of the user datagram pseudo header is shown in the figure:

where:

  • Source IP address: 4 bytes, the IP address of the sending end.
  • Destination IP address: 4 bytes, the IP address of the receiving end.
  • 0: 1 byte of zero padding.
  • Protocol: 1 byte; the UDP protocol number is 17.
  • UDP datagram length: 2 bytes.

@ 2 The format of the user datagram header is shown in the figure:

where:

  • Source port number: 2 bytes, the port number of the sender of the datagram.
  • Destination port number: 2 bytes, the port number of the receiver of the datagram.
  • UDP datagram length: 2 bytes, the length of the datagram header plus the length of the data part.
  • Checksum: 2 bytes, used to detect errors introduced into the user datagram during transmission.

Note: the pseudo header is a virtual data structure that is never transmitted. It exists only for the checksum calculation: it is placed in front of the UDP user datagram to form a temporary datagram over which the checksum is computed.

3.2 User datagram checksum calculation

The UDP checksum is calculated twice, once at the sending end and once at the receiving end. (The calculation relies on a property of one's complement arithmetic: a value plus its bitwise inverse is all 1 bits, 0xFFFF for a 16-bit word.)

  1. At the sending end: set the checksum field in the UDP header to 0 and treat the pseudo header followed by the UDP user datagram as a sequence of 16-bit words. If the data portion has an odd number of bytes, append one all-zero byte, so that the low-order byte of the last 16-bit word is 0. Add these 16-bit words using one's complement arithmetic, invert the sum, and write the result into the checksum field of the datagram to be sent.
  2. At the receiving end: take the received UDP user datagram together with the pseudo header (and the all-zero padding byte, if needed) and add the 16-bit words using one's complement arithmetic. When there are no errors, the result is all 1 bits. Otherwise an error has occurred, and the receiving end should discard the UDP user datagram (it may also hand it to the application layer, but with an error warning).
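The two steps above can be sketched in C as follows. This is a minimal illustration, not a production implementation: the function names `udp_ones_sum` and `udp_checksum` are hypothetical, the IPv4 addresses are assumed to be passed in host byte order, and `udp` is assumed to point at the 8-byte UDP header followed by the data.

```c
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running sum as big-endian 16-bit words.
 * An odd trailing byte becomes the high byte of a word whose
 * low byte is 0 (the all-zero padding byte). */
static uint32_t add_words(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)(p[0] << 8);
    return sum;
}

/* Fold the carries back into the low 16 bits (one's complement addition). */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* One's complement sum over pseudo header + UDP header + data.
 * The receiver expects 0xFFFF when the datagram is intact. */
uint16_t udp_ones_sum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *udp, uint16_t udp_len)
{
    const uint8_t pseudo[12] = {
        (uint8_t)(src_ip >> 24), (uint8_t)(src_ip >> 16),
        (uint8_t)(src_ip >> 8),  (uint8_t)src_ip,
        (uint8_t)(dst_ip >> 24), (uint8_t)(dst_ip >> 16),
        (uint8_t)(dst_ip >> 8),  (uint8_t)dst_ip,
        0, 17,                                  /* zero byte, protocol 17 */
        (uint8_t)(udp_len >> 8), (uint8_t)udp_len
    };
    uint32_t sum = add_words(0, pseudo, sizeof(pseudo));
    sum = add_words(sum, udp, udp_len);
    return fold16(sum);
}

/* Sender side: call with the checksum field (bytes 6-7) set to 0;
 * the return value is what goes into that field. */
uint16_t udp_checksum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *udp, uint16_t udp_len)
{
    return (uint16_t)~udp_ones_sum(src_ip, dst_ip, udp, udp_len);
}
```

A sender would store the result of `udp_checksum` in bytes 6 and 7 of the header in network byte order; a receiver would then call `udp_ones_sum` on the datagram as received and compare the result with 0xFFFF.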

3.3 Characteristics and uses of UDP user datagrams

Features of UDP user datagrams:

  1. UDP is a connectionless protocol: no connection is established before data is transferred.
  2. Reliable delivery is not guaranteed; UDP only makes a best-effort delivery.
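The connectionless behaviour can be seen in a small loopback sketch using the standard socket calls `sendto` and `recvfrom`. The function name `udp_loopback`, the use of the loopback address, the kernel-chosen port and the 2-second receive timeout are assumptions made for this illustration only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Send `msg` from one UDP socket to another over loopback.
 * Returns the number of bytes received, or -1 on error. */
ssize_t udp_loopback(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct timeval tv = { 2, 0 };   /* give up after 2 s: delivery is not guaranteed */
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0)
        goto done;

    /* Bind the receiver to 127.0.0.1 and let the kernel pick a port. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
        goto done;
    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* No connect(), no handshake: the datagram is simply sent. */
    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    n = recvfrom(rx, out, outlen, 0, NULL, NULL);

done:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

The sender never learns whether the datagram arrived; any acknowledgment or retransmission would have to be added by the application.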

Although UDP is an unreliable protocol, it has great advantages for transmitting certain kinds of data. The applications of UDP are as follows:

  1. Domain name conversion
  2. Link protocol
  3. Network management
  4. Remote File Service
  5. IP phone

4 Transmission Control Protocol (TCP)

4.1 Introduction to TCP

TCP provides a connection-oriented service over a reliable full-duplex channel. Compared with UDP, TCP is more complex but also more reliable, mainly because it is connection-oriented and because it numbers and acknowledges the data it transmits.

4.2 Header of TCP segment

A TCP segment is also divided into two parts: header and data. The format of the header is shown in the figure:

where:

  • Source port number: the port number of the sender of the segment.
  • Destination port number: the port number of the receiver of the segment.
  • Sequence number: the sequence number of the first byte of the data sent in this segment.
  • Acknowledgment number: the sequence number of the first data byte expected in the next segment from the other side.
  • Data offset: 4 bits, in units of 4 bytes, so the largest offset that can be expressed is 60 bytes.
  • Reserved: the value is 0.
  • Flag bits: 6 bits (URG, ACK, PSH, RST, SYN, FIN).
  • Window: controls the amount of data the other party may send, in bytes; mainly used for flow control and congestion control.
  • Checksum: calculated in the same way as for UDP; only the protocol number in the pseudo header differs (6 for TCP).
  • Options: used for special situations. The option currently in use defines the maximum segment length for the communication and can only be used while the connection is being established.
  • Padding: all-zero filler that makes the options field an integer multiple of 32 bits.

The 6 flag bits are explained in detail here:

  1. Urgent bit URG: when URG is 1, the segment carries urgent data that should be transmitted as soon as possible rather than in the original queuing order.
  2. Acknowledgment bit ACK: when ACK is 1, the acknowledgment number in the header is valid. It is used when establishing a connection and to acknowledge received TCP segments.
  3. Push bit PSH: when PSH is 1, the sender sends the segment as soon as possible, and the receiver takes a segment with the push bit set out of its receive buffer as soon as possible and hands it to the application process.
  4. Reset bit RST: when RST is 1, a serious error has occurred on the TCP connection; the connection must be released and then re-established.
  5. Synchronization bit SYN: used to synchronize sequence numbers when establishing a connection.
  6. Termination bit FIN: used to release a connection. When FIN is 1, the sender of this segment has finished sending its data and requests that the connection be released; when FIN is 0, the bit has no effect.
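As an illustration of the header layout and the flag bits, the following sketch extracts the main fields from the first 20 bytes of a TCP header held in a byte buffer. The struct `tcp_fields`, the function `tcp_parse` and the `TCP_*` constant names are hypothetical and chosen only for this example; the byte offsets follow the standard TCP header format, with multi-byte fields in network (big-endian) byte order.

```c
#include <stdint.h>

/* Flag bits as they appear in byte 13 of the TCP header. */
enum {
    TCP_FIN = 0x01, TCP_SYN = 0x02, TCP_RST = 0x04,
    TCP_PSH = 0x08, TCP_ACK = 0x10, TCP_URG = 0x20
};

struct tcp_fields {
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;
    uint32_t ack;
    unsigned header_len;  /* in bytes: data offset * 4 */
    uint8_t  flags;       /* the 6 flag bits */
    uint16_t window;
};

/* Read big-endian 16-bit and 32-bit values from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse the fixed 20-byte part of a TCP header. */
void tcp_parse(const uint8_t h[20], struct tcp_fields *f)
{
    f->src_port   = rd16(h);
    f->dst_port   = rd16(h + 2);
    f->seq        = rd32(h + 4);
    f->ack        = rd32(h + 8);
    f->header_len = (unsigned)(h[12] >> 4) * 4;  /* upper 4 bits of byte 12 */
    f->flags      = h[13] & 0x3F;                /* lower 6 bits of byte 13 */
    f->window     = rd16(h + 14);
}
```

A segment with `flags == (TCP_SYN | TCP_ACK)` would be the second segment of connection establishment, for example.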

4.3 TCP data numbering and acknowledgment

TCP achieves reliable data transmission by numbering the data it sends and acknowledging the data it receives.

  1. The sequence number in the header of each segment TCP sends is the sequence number of the first byte of that segment's data portion. Because the sequence number field occupies 4 bytes, it can number 4 GB of data, so for ordinary transfers no two bytes share a sequence number.
  2. TCP data acknowledgment: acknowledges the data that has been received successfully, up to the sequence number of its last byte.
  3. The acknowledgment number returned by the receiving end: the sequence number of the first byte the receiving end expects in the next data it receives (this value equals the sequence number of the last correctly received byte + 1).
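The relationship in item 3 amounts to a single addition. The sketch below is a simplification under stated assumptions: the function name `tcp_next_ack` is hypothetical, it covers data bytes only (it ignores the sequence number consumed by SYN and FIN), and it relies on unsigned 32-bit arithmetic wrapping around when the 4-byte sequence space is exhausted.

```c
#include <stdint.h>

/* Acknowledgment number for a correctly received segment that starts
 * at sequence number `seq` and carries `data_len` bytes of data:
 * (sequence number of the last byte received) + 1.
 * uint32_t arithmetic wraps modulo 2^32, like the sequence space. */
uint32_t tcp_next_ack(uint32_t seq, uint32_t data_len)
{
    return seq + data_len;
}
```

For a segment that starts at sequence number 1000 and carries 500 bytes, the last byte is number 1499 and the receiver acknowledges with 1500.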

The relationship between the sending sequence and the receiving sequence is shown in the figure:

4.4 TCP flow control and congestion management

TCP usually uses a variable-size sliding window for flow control.

  1. Send window: the maximum number of bytes the sender may send at one time during data communication. The value of the window field in the header of a TCP segment is the current window size.
  2. Receive window: used by the receiver to dynamically adjust the size of the other party's send window according to actual conditions during data communication.

During TCP communication, the sending end maintains a pointer to the next data to be sent. Each time the sender sends a segment, the pointer moves forward by the length of that segment. The send window itself moves only after the sending end receives an acknowledgment from the receiving end, and the data sent at any one time is never larger than the send window. The sliding window essentially describes the size of the receiver's TCP buffer, and the sender uses it to calculate the maximum amount of data it may send. If the sender receives a segment from the receiver with a window size of 0, it stops sending data until the receiver sends a segment with a non-zero window size.

Variable window flow control:

  1. Advertised window (notification window): the window size set by the receiving end according to its current capacity to receive TCP segments.
  2. Congestion window: the window size set by the sender according to network congestion. (The larger the congestion window, the more data can be sent.)

Under normal circumstances, the send window is MIN (advertised window, congestion window). During data transmission, the flow can be controlled by changing the window. (Flow control not only gives the receiving end enough time to receive and process data, but also helps prevent network congestion.) The process of using a variable window for flow control is shown in the figure:
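The MIN rule can be written down directly. In this sketch the function names `send_window` and `usable_window` are hypothetical, and the `in_flight` parameter (bytes already sent but not yet acknowledged) is an assumption added for illustration; it is not a term used in the text above.

```c
#include <stdint.h>

/* Send window = MIN(advertised window, congestion window), in bytes. */
uint32_t send_window(uint32_t advertised, uint32_t cwnd)
{
    return advertised < cwnd ? advertised : cwnd;
}

/* Bytes the sender may still transmit right now, given how many bytes
 * are already sent but unacknowledged. Returns 0 when the window is
 * full or the receiver advertised a window of 0. */
uint32_t usable_window(uint32_t advertised, uint32_t cwnd, uint32_t in_flight)
{
    uint32_t win = send_window(advertised, cwnd);
    return in_flight < win ? win - in_flight : 0;
}
```

With an advertised window of 4096 bytes and a congestion window of 1460 bytes, the send window is 1460 bytes; with an advertised window of 0, the usable window is 0 and the sender must wait.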

4.5 TCP transport connection management

A TCP connection is established with a three-way handshake and released with a four-way wave. The two mechanisms are described as follows:

@ 1 The three-way handshake process is shown in the figure:

The process description is as follows:

  1. The client sends a SYN to the server and then waits for the server's acknowledgment.
  2. The server replies to the client with SYN and ACK, confirming that it has received the client's segment.
  3. After receiving the server's acknowledgment, the client returns an ACK to the server; a reliable connection with the server is then established.
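In a C program the three segments are exchanged by the kernel; the application only calls the standard socket functions `listen`, `connect` and `accept`. The following loopback sketch shows where the handshake happens. The function name `tcp_loopback_pair`, the loopback address and the kernel-chosen port are assumptions made for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a listener on a loopback port chosen by the kernel, connect to it
 * and accept the connection. On success stores the two connected
 * descriptors in *client and *server and returns 0; returns -1 on error. */
int tcp_loopback_pair(int *client, int *server)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    int sfd;

    if (lfd < 0 || cfd < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0)
        goto fail;

    /* connect() sends the SYN and returns after the server's SYN + ACK
     * has arrived and the final ACK has been sent. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    /* accept() hands back a connection whose handshake is complete. */
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto fail;

    close(lfd);
    *client = cfd;
    *server = sfd;
    return 0;

fail:
    if (lfd >= 0)
        close(lfd);
    if (cfd >= 0)
        close(cfd);
    return -1;
}
```

Because the listening socket queues completed connections, `connect` can return before `accept` is called, which is why this works in a single thread.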

@ 2 Four-way wave: because a TCP connection is full-duplex, each direction must be closed separately. The four-way wave process is shown in the figure:


The process description is as follows:

  1. After the client finishes sending data to the server, it sets FIN to 1 to tell the server that it is closing the data connection in this direction.
  2. On receiving the FIN, the server closes the data connection in that direction and sets ACK to 1 to tell the client that its message has been received and processed.
  3. The server then asks the client to close the data connection in the opposite direction by setting FIN to 1.
  4. The client receives the server's request and sets ACK to 1; the connection is then closed in both directions.
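Closing each direction separately can be observed from an application with the standard `shutdown` call. In the sketch below, the function name `half_close_demo` is hypothetical, and `client` and `server` are assumed to be the two ends of an already connected stream socket; on a TCP socket, `shutdown(fd, SHUT_WR)` causes the kernel to send the FIN for that direction.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Walk through a two-direction close on a connected stream socket pair.
 * Returns 0 if every step behaved as expected, -1 otherwise.
 * Both descriptors are closed on success. */
int half_close_demo(int client, int server)
{
    char buf[8];

    /* Steps 1-2: the client closes its sending direction;
     * the server sees end-of-file (read returns 0). */
    if (shutdown(client, SHUT_WR) < 0)
        return -1;
    if (read(server, buf, sizeof(buf)) != 0)
        return -1;

    /* The opposite direction is still open: the server can still send. */
    if (write(server, "bye", 3) != 3)
        return -1;
    if (read(client, buf, sizeof(buf)) != 3)
        return -1;

    /* Steps 3-4: the server closes its sending direction;
     * the client sees end-of-file in turn. */
    if (shutdown(server, SHUT_WR) < 0)
        return -1;
    if (read(client, buf, sizeof(buf)) != 0)
        return -1;

    close(client);
    close(server);
    return 0;
}
```

Many programs simply call `close`, which releases the descriptor and both directions at once; `shutdown` is useful when one side has finished sending but still needs to read the remaining data from the other side.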

Origin blog.csdn.net/vviccc/article/details/105172478