Characteristics of TCP (including the three-way handshake and the four-way wave)

Basics of the TCP protocol

4-bit header length: This marks the boundary between the header and the payload, in units of 4 bytes. For example, if the 4-bit header length field is 1111 (15), the actual header length is 15 * 4 = 60 bytes.
Flag bits: identify what type of segment this TCP segment is
URG: whether the urgent pointer is valid
ACK: whether the acknowledgment number is valid
PSH: prompts the receiving application to read the data from the TCP buffer immediately
RST: the other party requests to re-establish the connection; a segment carrying the RST flag is called a reset segment
SYN: requests to establish a connection; a segment carrying the SYN flag is called a synchronization segment
FIN: informs the other party that the local end is about to close; a segment carrying the FIN flag is called a finish segment

1. Confirmation response (the core of reliability)

1. Principle

When the sender sends data to the receiver, the receiver replies with an acknowledgment. Once the sender receives that acknowledgment, it considers the data delivered. Because the order in which messages arrive over the network is not guaranteed, an acknowledgment cannot be matched to its data by arrival order alone, so data and acknowledgments are numbered. Sequence numbers do not necessarily start from 1. TCP sequence numbers and acknowledgment numbers count bytes. An acknowledgment number of 1001 means that every byte before 1001 has been received.
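The byte-based numbering can be illustrated with a small sketch. The function name `ack_number` is made up for this example; the wrap-around at 2^32 reflects the 32-bit width of the sequence number field.

```python
def ack_number(seq: int, payload_len: int) -> int:
    """Return the acknowledgment number for a segment whose first byte is `seq`."""
    # Sequence numbers count bytes, so the next byte expected is seq + length.
    # The field is 32 bits wide, so the value wraps around at 2**32.
    return (seq + payload_len) % 2**32


# A segment carrying bytes 1..1000 is acknowledged with 1001,
# meaning "every byte before 1001 has arrived".
print(ack_number(1, 1000))
```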

2. Other scenes

The acknowledgment mechanism is not exclusive to TCP. It is also used in many other scenarios, such as distributed systems.
Peak shaving and valley filling: a message queue is responsible for holding data, but its storage space is not unlimited. RocketMQ persists data directly to disk, so its storage space is quite large; RabbitMQ keeps data in memory by default, so its storage space is more limited. Stored data has to be evicted periodically, but a message that has not yet been consumed cannot simply be evicted. The acknowledgment mechanism is used to judge whether a message has been consumed successfully.

2. Timeout retransmission (reliability)

Packets may be lost while the acknowledgment mechanism is at work. When that happens, the timeout retransmission mechanism takes over.
There are two possibilities: either the sent message was lost and the other party never saw it, or the other party's acknowledgment was lost. The sender cannot tell the two apart, so it handles both the same way: after sending data it waits 500ms, and if no acknowledgment arrives it treats the data as lost and sends it again (in Linux and Windows, the timeout is controlled in units of 500ms). If only the acknowledgment was lost, the receiver gets the same data twice and uses the sequence number to discard the duplicate. When the network is normal, the probability of two or even three consecutive losses is very low.
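The retry loop can be sketched as follows. This is a simulation, not real network code: `transmit` is a hypothetical stand-in for "send the segment and wait one timeout period", and the retry limit of 3 is an arbitrary choice for the example.

```python
from typing import Callable


def send_with_retransmit(transmit: Callable[[], bool], max_retries: int = 3) -> int:
    """Call `transmit` until it reports an ACK; return the number of attempts used.

    `transmit` is a hypothetical callback that sends the data, waits one
    timeout period, and returns True if an ACK arrived in that time.
    """
    # One initial send plus up to `max_retries` retransmissions.
    for attempt in range(1, max_retries + 2):
        if transmit():
            return attempt
    raise TimeoutError("no ACK after retransmissions; giving up")


# Simulated link that loses the first two transmissions.
outcomes = iter([False, False, True])
print(send_with_retransmit(lambda: next(outcomes)))
```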

3. Connection management (three-way handshake, four-way wave) (reliability)

1. Three-way handshake (how to establish a connection)

A Client
B Server
(1) Main purpose:
The first point is to confirm, through the three-way handshake, that transmission between A and B works. In particular, it confirms that the sending and receiving capabilities of both A and B are normal. A -> B: B confirms that A can send; B -> A: A confirms that B can send and receive; A -> B: B confirms that A can receive.
The second point is to negotiate some parameters, such as the number from which the sequence numbers start.

(2) Extended question:
Would four handshakes or two handshakes work?
Four times: possible but unnecessary, since the transmission overhead is greater than with three.
Twice: not possible, because B would never confirm that A can receive.
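The reasoning about two versus three handshakes can be checked with a small bookkeeping sketch. It only tracks what each side has confirmed after each segment; the function and key names are invented for this illustration.

```python
def handshake_knowledge(steps: int) -> dict:
    """Return what each side has confirmed after the first `steps` segments."""
    known = {
        "B knows A can send": False,
        "A knows B can send": False,
        "A knows B can receive": False,
        "B knows A can receive": False,
    }
    if steps >= 1:  # A -> B: the SYN arrives, so B knows A can send
        known["B knows A can send"] = True
    if steps >= 2:  # B -> A: the SYN+ACK arrives, so A knows B received and can send
        known["A knows B can send"] = True
        known["A knows B can receive"] = True
    if steps >= 3:  # A -> B: the ACK arrives, so B knows A received the SYN+ACK
        known["B knows A can receive"] = True
    return known


print(handshake_knowledge(2))
print(handshake_knowledge(3))
```

After two segments one entry is still False (B has no proof that A can receive), which is why the third segment is needed.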

(3) State transition corresponding to TCP
The state-transition figure also includes the socket API (Linux, C language version) at the corresponding stage.
When the ServerSocket instance is created, the server enters the LISTEN state. During connection establishment, the server's reply has both SYN and ACK set to 1.
LISTEN: the phone is switched on and the signal is good, so a call can come in at any time.
ESTABLISHED: the call is connected, both parties can transmit information, and accept returns.
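A minimal loopback sketch, using Python's standard `socket` module rather than the C or Java APIs mentioned above, shows where these states appear in user code. The helper name `run_echo_once` and the one-shot echo behaviour are assumptions made for the example.

```python
import socket
import threading


def run_echo_once() -> bytes:
    # Server side: bind + listen puts the socket into the LISTEN state.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        # accept() returns once a connection has completed the handshake.
        conn, _ = server.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # echo one small message back

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()

    # Client side: connecting triggers the three-way handshake in the kernel.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello")
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply
```

Calling `run_echo_once()` should return the same bytes the client sent, which is only possible after both sides have reached ESTABLISHED.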

2. Four-way wave (how to disconnect)

A, the side that initiates the disconnect, can be either the client or the server.
(1) For B, the ACK and the FIN are triggered differently:
B sends the ACK as soon as it receives the FIN; this is done by the kernel.
When B sends its own FIN is controlled by user code: the FIN is triggered only when the code calls socket.close(). (In essence, the file descriptor held by the corresponding PCB is released in the kernel. If close is not called but the process ends, or the socket object is reclaimed by GC, the PCB is destroyed and the FIN is triggered as well.)

(2) The state transition corresponding to TCP
CLOSE_WAIT: after the side that receives the first FIN (the server here) acknowledges it, it enters CLOSE_WAIT and waits for user code to call close, which sends its own FIN.
TIME_WAIT: the side that initiated the close (the client here) enters TIME_WAIT after receiving the other side's FIN and returning the ACK. This state exists to cope with loss of that last ACK.
If A destroyed the connection immediately after receiving the FIN and returning the ACK, without entering TIME_WAIT, then a lost final ACK could never be resent: B would retransmit its FIN, but A's connection would already be gone.

(3) Why is there a large number of CLOSE_WAIT connections on the server?
There is a bug in the code: close is not being called in time.

(4) Why is there a large number of TIME_WAIT connections on the server?
It may be a code bug. The side that sends the first FIN is the one that enters TIME_WAIT, so check whether the server should really be the one actively disconnecting.

(5) Why did the port binding fail?
Whoever disconnects first enters TIME_WAIT. After the process exits, the TIME_WAIT state remains, so the TCP connection still exists. If the server exits first, its connections enter TIME_WAIT and keep occupying the port; if the server is then restarted immediately, the new process tries to bind the same port again, and the bind may fail. Therefore, when shutting down a TCP service, terminate the client first and then the server.
The socket API has an SO_REUSEADDR option. With this option set, a port that is still in the TIME_WAIT state can be reused when binding. This option should be set by default in Java sockets.
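As a sketch of how the address-reuse option is set in practice, here is the equivalent call in Python's standard `socket` module, where the option is named `SO_REUSEADDR`. The helper name `make_listener` is made up for this example.

```python
import socket


def make_listener(port: int = 0) -> socket.socket:
    """Create a listening TCP socket with address reuse enabled."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set before bind() for it to affect this bind.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))  # port 0 lets the OS choose a free port
    s.listen()
    return s


listener = make_listener()
print(listener.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0)
listener.close()
```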

(6) Can the four waves become three?
Yes. With delayed acknowledgment and piggybacked acknowledgment, the ACK and the FIN may be merged into one segment.

(7) Are the four waves always executed?
Not necessarily. The four-way wave is the normal disconnection process. Sometimes a TCP connection is broken abnormally, for example when the network cable is unplugged.

4. Sliding window (efficiency)

1. Transmission principle

TCP has to ensure reliability, but it also tries to make transmission as efficient as possible. If the sender waits for an ACK after every piece of data, it spends most of its time waiting, and that waiting is wasted. Instead, the sender transmits a batch of data and then waits for the batch of ACKs, so that the waiting times for several pieces of data overlap. The amount of data that can be sent in one batch without waiting is called the "window size". If the batch length were unlimited (an infinite window, sending without ever waiting for an ACK), there would be no reliability at all.
The larger the window, the higher the overall efficiency and the lower the reliability.
The smaller the window, the lower the overall efficiency and the higher the reliability.

2. Window range

The current window covers 1001-5000: the sender has sent four pieces of data (1001-2000, 2001-3000, 3001-4000, 4001-5000) and is waiting for their four ACKs. Suppose the ACK of 2001 arrives first. The sender then knows that 1001-2000 has been received (an acknowledgment number means everything before it has been received, so an ACK of 3001 would mean that both 1001-2000 and 2001-3000 have arrived). It immediately sends 5001-6000, which keeps four pieces of data in the window. It does not wait for all four ACKs before sending more.
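The window movement in this example can be reproduced with a small simulation. The class and method names are invented for illustration, and the fixed 1000-byte segment size is taken from the example rather than from any real TCP stack.

```python
class SlidingWindowSender:
    """Tracks which byte ranges may be in flight (illustrative sizes only)."""

    def __init__(self, first_seq: int = 1001, segment: int = 1000, window_segments: int = 4):
        self.segment = segment
        self.window = segment * window_segments
        self.base = first_seq      # oldest unacknowledged byte
        self.next_seq = first_seq  # next byte to send

    def fill_window(self) -> list:
        """Send every segment that fits in the window; return the (start, end) ranges sent."""
        sent = []
        while self.next_seq + self.segment <= self.base + self.window:
            sent.append((self.next_seq, self.next_seq + self.segment - 1))
            self.next_seq += self.segment
        return sent

    def on_ack(self, ack: int) -> None:
        """A cumulative ACK slides the left edge of the window forward."""
        if ack > self.base:
            self.base = ack


sender = SlidingWindowSender()
print(sender.fill_window())  # the four initial segments
sender.on_ack(2001)
print(sender.fill_window())  # one more segment becomes sendable
```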

3. Packet loss occurs in the sliding window

Case 1: an ACK is lost. As long as the ACKs are not all lost, this does no harm, because a later ACK confirms everything before it. In fact, to improve efficiency, TCP does not send an ACK for every piece of data; there may be one ACK for every few pieces.
Case 2: data is lost. Suppose 1001-2000 is lost. The sender keeps sending, and the later data (2001-7000) stays in the receiver's buffer, but the receiver keeps replying with an ACK of 1001. From these repeated ACKs the sender learns that 1001-2000 was lost and retransmits it. Once it arrives, the receiver replies with an ACK of 7001.
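The receiver's side of the lost 1001-2000 scenario can be sketched like this. It is a simplified model with invented names; it assumes segments never overlap and ignores retransmitted duplicates of data that is already buffered.

```python
class Receiver:
    """Buffers out-of-order segments and always ACKs the next byte it still needs."""

    def __init__(self, expected: int = 1001):
        self.expected = expected
        self.buffered = {}  # start byte -> length, for segments that arrived early

    def on_segment(self, seq: int, length: int) -> int:
        if seq > self.expected:
            # There is a gap before this data, so hold it in the buffer.
            self.buffered[seq] = length
        elif seq == self.expected:
            self.expected += length
            # Absorb any buffered segments that are now contiguous.
            while self.expected in self.buffered:
                self.expected += self.buffered.pop(self.expected)
        return self.expected  # cumulative acknowledgment number


receiver = Receiver()
# 1001-2000 is lost; 2001-7000 arrives and every reply is ACK 1001.
print([receiver.on_segment(seq, 1000) for seq in range(2001, 7001, 1000)])
# The retransmitted 1001-2000 fills the gap, so the ACK jumps to 7001.
print(receiver.on_segment(1001, 1000))
```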

5. Flow control (reliability)

1. Flow control:

Flow control is a further supplement to the sliding window: it limits the window size based on the processing capability of the receiver. The sender's sending rate (window size) is determined dynamically from the receiver's processing capability, measured as the free space in the receiver's buffer.
Example: the receive buffer is 4000 bytes. When data 1-1000 arrives, 1000 is used and 3000 remains, so the returned ACK tells the sender 3000, and the sender uses 3000 as the window size for its next transmission. When the window size reaches 0 the sender stops. To learn the receiver's current window size, it then sends a window probe packet every so often; this packet carries no business data and only serves to trigger an ACK that reports the window size.
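The buffer arithmetic in this example can be written as a short sketch. The class name and methods are hypothetical; real receive buffers are managed by the kernel.

```python
class ReceiveBuffer:
    """Models the free space that a receiver advertises as its window."""

    def __init__(self, capacity: int = 4000):
        self.capacity = capacity
        self.used = 0

    def on_data(self, length: int) -> int:
        """Store incoming bytes and return the window to advertise in the ACK."""
        self.used += min(length, self.capacity - self.used)
        return self.capacity - self.used

    def app_read(self, length: int) -> int:
        """The application consumes bytes, which frees buffer space."""
        self.used -= min(length, self.used)
        return self.capacity - self.used


buf = ReceiveBuffer()
print(buf.on_data(1000))   # 3000 bytes of window left
print(buf.on_data(3000))   # window is now 0, so the sender must stop
print(buf.app_read(2000))  # a later probe would learn the window is 2000
```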

2. The actual size of the window

The TCP header has a 16-bit window size field.
The options in the TCP header can carry a window scale factor M. The actual window size is the value of the window field shifted left by M bits (each 1-bit shift is equivalent to multiplying by 2).
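The shift can be written out as a one-line calculation. The function name `actual_window` is made up for this sketch; the only check is that the window field fits in 16 bits.

```python
def actual_window(window_field: int, shift: int) -> int:
    """Apply the window scale factor to the 16-bit window field."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("the window field is 16 bits wide")
    # Shifting left by `shift` bits multiplies the value by 2**shift.
    return window_field << shift


print(actual_window(65535, 0))  # no scaling
print(actual_window(65535, 7))  # the same field value scaled by 2**7
```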

6. Congestion control (reliability)

Congestion control limits the sender's window size from another angle. It takes a macro view, treats the entire link as a whole, and looks only at the results. The sender starts slowly, transmitting with a relatively small window, and checks whether packets are lost. If there is no packet loss, the network is relatively clear, so the sending rate is gradually increased; if there is packet loss, the network is congested, so the sending rate is reduced. The result is: actual sending window size = min(flow control window, congestion control window).
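A toy model makes the rule concrete. The growth rule below (double when there is no loss, fall back to the initial window on loss) is a deliberate simplification chosen for this sketch and is not the exact algorithm used by real TCP implementations; the function names are also invented.

```python
def next_cwnd(cwnd: int, loss: bool, initial: int = 1) -> int:
    """Toy rule: double the congestion window on success, reset it on loss."""
    return initial if loss else cwnd * 2


def send_window(flow_window: int, cwnd: int) -> int:
    """The sender may use only the smaller of the two windows."""
    return min(flow_window, cwnd)


cwnd = 1
for loss in [False, False, False, True, False]:
    cwnd = next_cwnd(cwnd, loss)
    print(cwnd, send_window(4, cwnd))
```

In each printed pair the second number never exceeds 4, because the flow control window caps the sender even when the congestion window has grown larger.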

7. Delayed response (efficiency)

To improve efficiency, the window size should be as large as possible while reliability is still ensured. Under flow control, the advertised window size is the free space remaining in the receive buffer. With delayed response, the receiver waits a while after receiving data before returning the ACK. While it waits, the application reads some data out of the buffer, so the window size reported in the ACK is larger.

8. Piggybacking (efficiency)

Piggybacking builds on delayed response. Many client/server exchanges take a question-and-answer form, in which the application sends a response to each request.
Because of delayed response, the ACK is not returned immediately but waits for a while, so it can be combined with that response and sent in the same segment.
With delayed response and piggybacking, the four waves can become three, but whether the ACK and the FIN are actually merged is not certain.

9. Oriented to byte stream (sticky packet problem)

With a byte stream, you need to watch out for the sticky packet problem, which concerns application-layer data: the stream itself has no boundaries between messages. This is not a TCP-specific problem; every byte-stream-oriented channel has it. Datagram-oriented protocols like UDP do not.
How to solve the sticky packet problem: design a reasonable application-layer protocol.
1. Add a terminator or separator to the application-layer data
2. Add a length field to the application-layer data
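The length-field approach can be sketched as follows. The 4-byte big-endian length prefix and the function names are choices made for this example, not part of any standard protocol.

```python
import struct


def pack_message(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


def unpack_messages(buffer: bytes):
    """Split a received byte stream into (complete messages, leftover bytes)."""
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("!I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break  # the body has not fully arrived yet
        start = offset + 4
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]


stream = pack_message(b"hello") + pack_message(b"world!")
# Simulate a read that ends in the middle of the second message's header.
messages, rest = unpack_messages(stream[:12])
print(messages, rest)
```

The leftover bytes are kept and prepended to the next read, so a message that is split across reads is still recovered intact.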

10. Abnormal situation in TCP (heartbeat mechanism)

1. Process terminated

No matter how the process terminates, its PCB is released and the corresponding file descriptors are released with it, which triggers the four-way wave. "Process terminated" does not mean that the connection is simply dropped; process termination is equivalent to calling socket.close().

2. Machine restart

When the machine is restarted, the processes are terminated first, so the four-way wave still takes place.

3. The machine is powered off / the network cable is disconnected (an emergency: the machine has no time to perform any action)

(1) The side that lost power is the receiver, and the other side is still sending data.
The sender will obviously get no ACK, so it retransmits after a timeout. After several retransmissions it tries to reset the connection by sending a reset segment; after that the sender gives up the connection and reclaims the resources associated with it.
(2) The side that lost power is the sender, and the other side is still trying to receive data.
No data arrives, so the receiver falls back on the "heartbeat packet" mechanism (also called "keep alive"). Every so often it sends a PING packet to the other party and expects a PONG packet in return. If no PONG arrives for a long time, even after several retries, the other party is considered offline.
The heartbeat packet is a very widely used mechanism, not just in TCP.
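The offline decision can be sketched as a counter of consecutive missed replies. This models an application-level heartbeat only; the class name and the limit of three missed PONGs are assumptions made for the example.

```python
class HeartbeatMonitor:
    """Declares the peer offline after several consecutive unanswered PINGs."""

    def __init__(self, max_missed: int = 3):
        self.max_missed = max_missed
        self.missed = 0

    def on_ping_result(self, got_pong: bool) -> bool:
        """Record one PING round; return True while the peer counts as online."""
        # Any PONG resets the counter; each silence increments it.
        self.missed = 0 if got_pong else self.missed + 1
        return self.missed < self.max_missed


monitor = HeartbeatMonitor()
rounds = [True, False, False, True, False, False, False]
print([monitor.on_ping_result(r) for r in rounds])
```

Only the final round reports the peer as offline, because that is the first time three PINGs in a row went unanswered.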


Origin blog.csdn.net/stitchD/article/details/123739231