Network Principles (TCP/UDP)

Table of contents

1. Network foundation
   1. IP address
   2. Port number
   3. Protocol
   4. OSI seven-layer model
2. UDP protocol
   2.1 UDP header format
   2.2 Features of UDP
3. TCP protocol
   3.1 TCP protocol segment format
   3.2 Principles of TCP
      (1) Acknowledgment mechanism
      (2) Timeout retransmission mechanism
      (3) Three-way handshake, four-way wave
      (4) Sliding window
      (5) Flow control
      (6) Congestion control
      (7) Delayed acknowledgment
4. TCP/UDP comparison


1. Network foundation

1. IP address

Everyone has heard of IP addresses, but for newcomers to networking the concept can still be fuzzy. Let's go through some basic networking principles together.

concept:

An IP address identifies the network address of a host or other network device (such as a router). Simply put, the IP address locates the host on the network.

For example: an IP address is like the delivery address on a package. But even after the host has been located, one question remains: which process on that host should receive the data? That is what the port number is for.

 2. Port number

Concept: in network communication, the IP address identifies the network address of the host, while the port number identifies the process within the host that sends and receives data. Simply put: the port number locates a process inside the host.

For example: if the IP address is the package's delivery address, then the port number names the designated recipient, ensuring the transmitted data reaches exactly the right place.

Note on port numbers:

A process can be bound to multiple port numbers, but multiple processes cannot bind the same port number; if a process does not bind a port itself, the system assigns one at random after it starts.
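The note above can be seen directly in Python's socket API (a minimal sketch; the loopback address is illustrative): one process binds two ports without trouble, but a second bind to an in-use port fails.

```python
import socket

# One process may bind several ports: each socket gets its own port.
# Port 0 asks the OS to assign a free port automatically.
a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
a.bind(("127.0.0.1", 0))
b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
b.bind(("127.0.0.1", 0))
port_a, port_b = a.getsockname()[1], b.getsockname()[1]

# But two sockets cannot bind the same port at the same time:
c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    c.bind(("127.0.0.1", port_a))
    conflict = False
except OSError:          # EADDRINUSE: the port is already taken
    conflict = True

for s in (a, b, c):
    s.close()
```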

3. Protocol

Concept: protocol is short for network protocol — a set of conventions and rules that every device taking part in network communication (that is, network data transmission) must follow, such as how to establish a connection and how to identify the other side. Only by following these conventions can computers communicate with each other.

Why do we need protocols?

Put simply, a protocol is an agreed standard. Computers run different operating systems and versions, and hardware comes from many different manufacturers; for all these devices to communicate normally, a common standard has to be agreed on. That set of standards is called the protocol.

Common default port numbers:

  • Port 22: reserved for SSH servers to bind the SSH protocol
  • Port 21: reserved for FTP servers to bind the FTP protocol
  • Port 23: reserved for Telnet servers to bind the Telnet protocol
  • Port 80: reserved for HTTP servers to bind the HTTP protocol
  • Port 443: reserved for HTTPS servers to bind the HTTPS protocol

4. OSI seven-layer model

OSI stands for Open System Interconnection. The OSI model divides the work of network communication into 7 layers; from bottom to top they are the physical layer, data link layer, network layer, transport layer, session layer, presentation layer, and application layer.

OSI is only a conceptual, theoretical model. Its drawback is that it has too many layers, which complicates networking work, so it never saw large-scale adoption. People later simplified OSI by merging some layers, retaining only 4; from bottom to top they are the network interface layer, network layer, transport layer, and application layer. This is the famous TCP/IP model.

Take the TCP/IP model as an example:

  • Application layer: responsible for communication between applications, e.g. the Simple Mail Transfer Protocol (SMTP), File Transfer Protocol (FTP), and the network remote-login protocol (Telnet). Network programming mostly happens at the application layer.
  • Transport layer: responsible for data transmission between two hosts, e.g. the Transmission Control Protocol (TCP), which ensures data is reliably delivered from the source host to the destination host.
  • For example: the transport layer cares only about the two communicating endpoints. When I shop online and a package travels from Beijing to Shanghai, I don't care how it is carried in between; I only care about the starting point and the end point.
  • Network layer: responsible for address management and routing. For example, in the IP protocol a host is identified by its IP address, and a transmission path (route) between two hosts is planned via routing tables. Routers work at the network layer.
  • For example: sticking with the shopping analogy, the network layer cares about which route to take — there are many possible routes from Beijing to Shanghai.
  • Data link layer: responsible for transmitting and identifying data frames between devices, e.g. network card drivers, frame synchronization (deciding which signal on the wire marks the start of a new frame), collision detection (automatic retransmission when a collision is detected), and data error checking. Standards include Ethernet, Token Ring, and wireless LAN. Switches work at the data link layer.
  • For example: the data link layer cares about how to travel between adjacent nodes on the path. Going from Beijing to Shanghai passes through several cities, and the legs between those cities are the data link layer's concern.
  • Physical layer: responsible for transmitting optical/electrical signals. The twisted-pair cable commonly used in Ethernet, the coaxial cable of early Ethernet (now used mainly for cable TV), optical fiber, and the electromagnetic waves used by today's WiFi all belong to the physical layer. The physical layer's capabilities determine the maximum transmission rate, transmission distance, interference resistance, and so on. Hubs work at the physical layer.

Note:

  • Communication must happen between peer layers. The application layer of computer A cannot talk to the transport layer of computer B — they are not at the same level, and unpacking the data would go wrong.
  • Each layer must serve the same function on both sides, i.e. the two sides must use exactly the same network model; otherwise chaos ensues and neither side can understand the other.
  • Data can only be passed layer by layer; it cannot skip layers.
  • Each layer may use the services of the layer below it and provides services to the layer above it.

2. UDP protocol

UDP (User Datagram Protocol) is a simple message-oriented transport layer protocol;

2.1 UDP header format

Source port: where the data is sent from.

Destination port: where the data is going.

Length: the total length of the UDP datagram, header plus data. Because this is a 16-bit field, a UDP datagram can be at most 64 KB.

Checksum: used to verify the datagram's integrity; if the checksum check fails, the datagram is simply discarded.
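The four fields can be packed and unpacked with Python's struct module (a sketch; `build_udp_header` is a hypothetical helper, and the checksum is left at 0 rather than computed): each field is 2 bytes, so the header is 8 bytes, and the 16-bit length field caps a datagram at 65535 bytes.

```python
import struct

# UDP header: 4 unsigned 16-bit fields in network (big-endian) order:
# source port, destination port, length (header + data), checksum.
def build_udp_header(src_port, dst_port, payload, checksum=0):
    length = 8 + len(payload)              # header itself is 8 bytes
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)

hdr = build_udp_header(12345, 53, b"hello")
src, dst, length, csum = struct.unpack("!HHHH", hdr)
max_datagram = (1 << 16) - 1               # 16-bit length field -> 64 KB cap
```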


2.2 Features of UDP

(1) No connection

Transmitting over UDP is like mailing a letter: once you know the other side's port number and IP address, you can send data directly, with no need to establish a connection first.

(2) Unreliable

UDP has no reliability mechanism: after the sender transmits a datagram, if a network failure prevents the segment from reaching the other side, the UDP layer returns no error information to the application layer.

(3) Datagram-oriented

The application layer hands UDP a message of whatever length, and UDP sends it as-is, neither splitting nor merging it.
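These three features show up directly in the socket API (a minimal loopback sketch; real networks give no delivery guarantee, but loopback is reliable enough for a demo): no connection setup is needed, and each sendto() travels as one whole datagram.

```python
import socket

# Connectionless: sendto() only needs the peer's (IP, port) -- no
# connection setup. Datagram-oriented: each sendto() is one message,
# and each recvfrom() returns exactly one whole datagram.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2)              # don't hang if a datagram is lost
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"first", addr)
client.sendto(b"second", addr)

msg1, _ = server.recvfrom(4096)   # one datagram per call
msg2, _ = server.recvfrom(4096)

client.close()
server.close()
```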


3. TCP protocol

TCP (Transmission Control Protocol) is a connection-oriented, reliable, byte-stream-based transport layer protocol.

 3.1 TCP protocol segment format


 3.2 Principles of TCP

(1) Acknowledgment mechanism

    When the sender transmits data, each segment consists of a header and a payload: the header carries a sequence number, and the payload carries the data being sent. On receiving a segment, the receiver returns an acknowledgment carrying an acknowledgment number (ack), which tells the sender exactly which data has been received and asks for the next data.

    Data often arrives out of order — data sent later may arrive before data sent earlier. There is no need to worry about this: TCP reorders it for us. TCP maintains a receive buffer, a region of memory in the kernel, and when segments arrive out of order, TCP arranges the received messages in this buffer by sequence number.

(2) Timeout retransmission mechanism

During network transmission, "packet loss" is inevitable, so TCP uses a retransmission mechanism to ensure the data packets it sends can reach their destination accurately. Packet loss generally falls into two cases:

1. The sender's data is lost outright and never reaches the receiver, so no ack is ever sent;

 

2. The receiver got the data, but the ack it returned was lost;

The second case causes a problem: when the receiver's ack is lost, the sender keeps retransmitting, so the receiver keeps receiving duplicate data. That is no small issue — take mobile payment as an example: a 1-yuan payment could be deducted over and over if the lost ack kept triggering retransmission!

Of course, TCP is powerful enough to handle this for us: using the sequence numbers of the data in the receive buffer, TCP deduplicates segments, ensuring the data the application reads is never duplicated!

Packets can also be lost during retransmission, in which case they are retransmitted again — but each time a packet is lost, the timeout grows longer. If no ack arrives after several consecutive retransmissions, TCP concludes that retransmitting is useless (a sign of a serious network problem) and tries to reset the connection; if the reset also fails, TCP closes the connection and gives up on the communication.
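The growing timeout can be sketched as simple exponential backoff (illustrative numbers only; real TCP derives its timeout from measured round-trip times):

```python
# Toy exponential backoff (illustrative numbers only; real TCP derives
# its timeout from measured round-trip times): each retransmission
# waits twice as long, and after max_retries TCP gives up.
def retransmit_timeouts(base=0.2, max_retries=5):
    return [base * (2 ** i) for i in range(max_retries)]

timeouts = retransmit_timeouts()  # waits grow: 0.2, 0.4, 0.8, 1.6, 3.2 s
```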

Interview question: How does TCP achieve reliability?

Answer: confirmation response + timeout retransmission

The three-way handshake and four-way wave are not the answer here — they concern the connection, not reliability.

(3) Three-way handshake, four-way wave

TCP establishes a connection: three-way handshake

A "handshake" is one network interaction between the communicating parties; through three such interactions, the client and server establish a connection.

process:  

1. First handshake: the client sends a SYN (synchronization) segment to the server.

2. Second handshake: after receiving the SYN, the server responds with a SYN+ACK segment.

3. Third handshake: after receiving the SYN+ACK, the client responds with an ACK segment.

4. Once the server receives the ACK, the three-way handshake is complete and the connection is established.


The role of the three-way handshake:

Put simply, it verifies that both the client's and the server's sending and receiving capabilities are working normally.


Note: The three-way handshake is automatically completed by the kernel, and the application cannot intervene;
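This is visible in the socket API (a minimal loopback sketch): the application only calls connect() and accept(); the SYN/SYN+ACK/ACK exchange happens inside the kernel during those calls.

```python
import socket

# The SYN / SYN+ACK / ACK exchange happens inside connect() and
# accept(); the application never sends those segments itself.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(addr)              # kernel performs the handshake here
conn, peer = server.accept()      # connection is already established

client.sendall(b"ping")           # byte stream is now usable
data = conn.recv(4)

for s in (client, conn, server):  # close() later triggers the FIN teardown
    s.close()
```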

TCP tears down a connection: four-way wave

process:

1. First wave: the client sends a FIN segment, which carries a sequence number. The client is now in the FIN_WAIT1 state.

2. Second wave: after receiving the FIN, the server sends an ACK segment whose acknowledgment number is the client's sequence number + 1, indicating the client's message was received. The server is now in the CLOSE_WAIT state.

3. Third wave: when the server also wants to disconnect, it sends a FIN segment with its own sequence number, just like the client's first wave. The server is now in the LAST_ACK state.

4. Fourth wave: after receiving the FIN, the client responds with an ACK segment whose acknowledgment number is the server's sequence number + 1. The client is now in the TIME_WAIT state, where it waits a while to make sure the server receives its ACK and enters the CLOSED state.

5. Once the server receives the ACK, the connection is closed and the server is in the CLOSED state.


In the three-way handshake, the server's ACK and SYN are combined into one segment, whereas the FIN and ACK in the four-way wave cannot be combined. The reason is timing: in the handshake, the ACK and SYN are triggered at the same moment and both are produced by the kernel, while in the four-way wave the ACK and FIN are triggered at different times — the ACK is produced by the kernel, but the FIN is controlled by application code and is only sent when the socket's close method is called.

(4) Sliding window 

As described above, each time the sender transmits a data segment it must wait for an ACK before sending the next one. Sending a large amount of data this way means waiting for ACKs one by one, which wastes a great deal of time. To shorten the waiting, segments are sent in batches, improving sending efficiency.

 After bulk transfer:

The batched transfer process above is called a sliding window. Batch sending is not unlimited: the sender transmits up to a certain amount and then waits for ACKs. The amount of data that can be in flight while waiting is called the "window size".
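The efficiency gain can be sketched with a toy round-trip count (illustrative numbers, not a real TCP trace): with stop-and-wait every segment costs one round trip, while a window of size W keeps up to W segments in flight per round trip.

```python
import math

# Toy cost model (illustrative, not a real TCP trace): stop-and-wait
# spends one round trip per segment, while a sliding window of size W
# keeps up to W segments in flight per round trip.
def round_trips(segments, window):
    return math.ceil(segments / window)

stop_and_wait = round_trips(100, 1)   # 100 round trips
windowed = round_trips(100, 8)        # only 13 round trips
```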

(5) Flow control

The sliding window described above makes transmission more efficient: the larger the window, the more data is sent per batch and the better the overall throughput. But faster is not always better — if data is sent too fast, the receive buffer fills up quickly, and further sends are then likely to be dropped. TCP therefore uses flow control to limit the sender's speed.


So how is it controlled? ? ?

The ACK segment carries a "window size" field. When the ACK flag bit is set to 1, the segment is an acknowledgment and the window size field takes effect; the receiver fills it with the remaining space in its receive buffer.

When the sender sees that the other side's buffer is full, it pauses sending — but not forever: every so often it sends a window probe segment, and once it finds the buffer has free capacity again, it resumes batch sending.
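The receive buffer the advertised window is drawn from can be inspected from user space (a sketch; exact sizing behavior is OS-dependent):

```python
import socket

# The advertised window is bounded by the receive buffer. User code can
# inspect and resize that buffer via SO_RCVBUF (behavior is
# OS-dependent; Linux, for example, doubles the value you set).
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
default_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 16)  # ask for 64 KB
new_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
s.close()
```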

(6) Congestion control

There are usually many switches and routers between the client and the server, and a problem at any intermediate device on the transmission path will affect the transmission rate to some degree. The job of congestion control is to gauge the transmission capacity of these intermediate nodes.

Congestion control finds a suitable sending rate by experiment: start sending at a low rate; if there is no packet loss, increase the rate appropriately; if packets are lost, reduce the rate. This eventually converges on an appropriate sending rate — in effect, a process of dynamic balancing.
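That dynamic balance can be sketched as a toy slow-start/AIMD loop (illustrative only; real congestion control such as Reno or CUBIC is considerably more involved):

```python
# Toy slow-start / AIMD loop (illustrative only; real congestion
# control such as Reno or CUBIC is considerably more involved).
def next_cwnd(cwnd, ssthresh, loss):
    if loss:
        return max(cwnd // 2, 1)   # loss: cut the rate sharply
    if cwnd < ssthresh:
        return cwnd * 2            # slow start: grow exponentially
    return cwnd + 1                # congestion avoidance: grow linearly

cwnd, ssthresh = 1, 16
trace = []
for rtt in range(8):
    trace.append(cwnd)
    cwnd = next_cwnd(cwnd, ssthresh, loss=(rtt == 5))
# trace climbs quickly, backs off at the loss, then regrows
```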

 

(7) Delayed acknowledgment

As mentioned above, the receiver must return an ACK to the sender. A delayed acknowledgment means the ACK is not sent immediately but after a short wait. The advantage is a larger advertised window: during the delay, the application consumes a batch of data from the receive buffer, freeing space, which in turn lets the sender transmit faster.

Two ways to delay the response in TCP:

  • Quantity limit: Respond every N packets;
  • Time limit: Respond once when the maximum delay time is exceeded;

4. TCP/UDP comparison

Similarities: both TCP and UDP are transport-layer protocols sitting above the network layer; both require the two sides to open ports in order to communicate, and both support multiplexing and demultiplexing.

Difference: TCP provides reliable transmission, while UDP is unreliable.

Use cases:

TCP works in most scenarios. For scenarios that demand high efficiency but can tolerate lower reliability, UDP is an option. In the end, though, the choice should be made according to the specific requirements.


Origin blog.csdn.net/m0_63635730/article/details/130219103