TCP protocol: three-way handshake and four-way wave

 

Understanding the TCP protocol

TCP stands for "Transmission Control Protocol". It is a transport-layer protocol that controls the transmission of data in detail.
Features:

  • Byte-stream oriented
  • Reliable
  • Connection-oriented

TCP protocol segment format

  • Source port number and destination port number: As in UDP, every segment must identify which process it comes from and which process it is going to.
  • 32-bit sequence number and 32-bit acknowledgment number: These are how two communicating processes tell each other what has been sent and what has been received. Every byte of data is numbered. For example, suppose process A sends five bytes to process B starting at sequence number 1000, so the bytes are numbered 1000 through 1004. When B replies, its acknowledgment number should be 1005, the number of the next byte B expects. If B replies with 1003 instead, then bytes 1003 and 1004 have not been received, and A starts the retransmission mechanism. In short, the sequence number labels the data the sender is sending, the acknowledgment number is the next sequence number the receiver expects, and comparing the two shows whether the data arrived. This guarantees the reliability of the data and is one of the characteristics of TCP.
  • 4-bit TCP header length: The four bits hold a value that, multiplied by 4, gives the length of the TCP header in bytes. It can be seen from the figure that the minimum header length is 20 bytes, so the field is 0101 (5) when the header carries no options. The header cannot exceed 15 * 4 = 60 bytes.
  • 6-bit flags: ① URG: the urgent pointer is valid ② ACK: the acknowledgment number is valid ③ PSH: prompts the receiving application to read the data from the TCP buffer immediately ④ RST: asks the other party to reset the connection; a segment carrying RST is called a reset segment ⑤ SYN: requests to establish a connection; a segment carrying SYN is called a synchronization segment ⑥ FIN: notifies the other party that the local end is closing; a segment carrying FIN is called an end segment.
  • 16-bit window size: Advertises how much free space remains in the sender's own receive buffer, which provides flow control. If the advertised window is 0, the buffer is full and the peer must stop sending; data that arrives anyway is discarded.
  • 16-bit checksum: Filled in by the sender. It is a 16-bit one's-complement sum, not a CRC. If verification fails at the receiving end, the segment is considered corrupted. The checksum covers not only the TCP header but also the data portion.
  • 16-bit urgent pointer: Identifies which part of the data is urgent data.
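
The field layout above can be sketched in C. This is only an illustration that assumes the standard 20-byte fixed header in network byte order; `struct tcp_fields`, `tcp_decode`, `expected_ack` and `inet_checksum` are hypothetical names made up for this example, not system APIs. The checksum helper shows only the one's-complement sum over a buffer and leaves out the pseudo-header that a real TCP checksum also covers.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits stored in byte 13 of the TCP header. */
enum {
    FLAG_FIN = 0x01, FLAG_SYN = 0x02, FLAG_RST = 0x04,
    FLAG_PSH = 0x08, FLAG_ACK = 0x10, FLAG_URG = 0x20
};

/* Hypothetical container for the decoded fields (not a system type). */
struct tcp_fields {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    unsigned header_len;              /* in bytes: 4-bit field * 4 */
    uint8_t  flags;
    uint16_t window, checksum, urgent;
};

static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decodes the fixed part of a TCP header.
 * Returns 0 on success, -1 if the buffer or the length field is invalid. */
int tcp_decode(const uint8_t *buf, size_t len, struct tcp_fields *out) {
    if (len < 20) return -1;
    unsigned hlen = (unsigned)(buf[12] >> 4) * 4;   /* 20..60 bytes */
    if (hlen < 20 || hlen > len) return -1;
    out->src_port   = rd16(buf);
    out->dst_port   = rd16(buf + 2);
    out->seq        = rd32(buf + 4);
    out->ack        = rd32(buf + 8);
    out->header_len = hlen;
    out->flags      = buf[13] & 0x3F;               /* six flag bits */
    out->window     = rd16(buf + 14);
    out->checksum   = rd16(buf + 16);
    out->urgent     = rd16(buf + 18);
    return 0;
}

/* Acknowledgment number sent after receiving payload_len bytes
 * that start at sequence number seq: the next byte expected. */
uint32_t expected_ack(uint32_t seq, uint32_t payload_len) {
    return seq + payload_len;
}

/* 16-bit one's-complement sum over a buffer (pseudo-header omitted). */
uint16_t inet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (len & 1)
        sum += (uint32_t)(data[len - 1] << 8);      /* pad odd byte */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);         /* fold carries */
    return (uint16_t)~sum;
}
```

For the example in the list, `expected_ack(1000, 5)` gives 1005, the acknowledgment number B should return after receiving all five bytes.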

Connection process

We know that TCP is connection-oriented, which means the client and server can exchange data only after a connection has been established. So how do the client and the server connect? In simple terms, through a three-way handshake and a four-way wave: the client needs a three-way handshake to connect to the server, and a four-way wave to disconnect after the communication is complete.

Three-way handshake

 

Both the client and the server make some preparations before the handshake. The server first calls socket to allocate a file descriptor, fills in a sockaddr_in structure, calls bind to bind the descriptor to the server port, calls listen to turn it into a listening descriptor, and finally blocks in accept to wait for a client connection. The client is simpler: it allocates a file descriptor, fills in a sockaddr_in structure, and calls connect to request a connection, blocking until the server responds.
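
The preparation steps can be sketched as two small C helpers. This is a minimal sketch, not a complete server: `make_listener` and `connect_to` are hypothetical names, both sides are assumed to use the loopback address 127.0.0.1, and passing port 0 (an assumption made for convenience) lets the kernel pick a free port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Server side: socket -> fill sockaddr_in -> bind -> listen.
 * Returns the listening descriptor, or -1 on failure.
 * *port_out receives the bound port in host byte order. */
int make_listener(uint16_t port, uint16_t *port_out) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port        = htons(port);

    socklen_t alen = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &alen) < 0) {
        close(fd);
        return -1;
    }
    if (port_out) *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Client side: socket -> fill sockaddr_in -> connect.
 * connect() returns once the kernel has completed the handshake. */
int connect_to(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port        = htons(port);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would follow `make_listener` with `accept(listenfd, NULL, NULL)`, which blocks until a client such as `connect_to(port)` has completed the handshake.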

When the client calls connect, it sends a synchronization segment (SYN) to the server and waits for the reply. When the server receives the SYN, it replies with an ACK to confirm that the client's synchronization segment arrived. In the same segment, the server also sends its own SYN to request a response from the client. After receiving this SYN, the client sends an ACK back to the server. These three segments make up the three-way handshake.

In this way the connection is established in both directions: each side sends a request and each side acknowledges the other's request. In the figure, SYN_SENT is the state of the client after it has sent its SYN, and SYN_RCVD is the state of the server after it has received that SYN and is waiting for the client's ACK. After the three-way handshake succeeds, both the server and the client enter the ESTABLISHED state, that is, the TCP connection has been established and data can be transmitted.
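
How the sequence and acknowledgment numbers line up across the three segments can be sketched as follows. The `struct segment` type and the three helper functions are hypothetical names for illustration; the initial sequence numbers are passed in as parameters, whereas a real kernel chooses them itself. The sketch relies on the fact that a SYN consumes one sequence number.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a handshake segment. */
struct segment {
    int      syn;       /* SYN flag set? */
    int      ack_flag;  /* ACK flag set? */
    uint32_t seq;       /* sequence number */
    uint32_t ack;       /* acknowledgment number (valid if ack_flag) */
};

/* Step 1, client -> server: SYN carrying the client's initial number. */
struct segment make_syn(uint32_t client_isn) {
    struct segment s = { 1, 0, client_isn, 0 };
    return s;
}

/* Step 2, server -> client: SYN and ACK in one segment.
 * The SYN consumed one sequence number, so ack = client seq + 1. */
struct segment make_syn_ack(const struct segment *syn, uint32_t server_isn) {
    struct segment s = { 1, 1, server_isn, syn->seq + 1 };
    return s;
}

/* Step 3, client -> server: ACK of the server's SYN. */
struct segment make_final_ack(const struct segment *syn_ack) {
    struct segment s = { 0, 1, syn_ack->ack, syn_ack->seq + 1 };
    return s;
}
```

With a client initial sequence number of 1000 and a server initial sequence number of 5000, the second segment carries seq 5000 and ack 1001, and the third carries seq 1001 and ack 5001.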

During this process, if the client's SYN is lost, the server does not respond. The client waits for a timeout, and if no ACK has arrived by then it sends the SYN again. After several unsuccessful attempts the client concludes that the network is abnormal and stops trying. Similarly, after the server receives the client's SYN and replies with its ACK and SYN, it resends them if the client's ACK does not arrive, until it too decides that the network is abnormal. If any one of the three handshake segments is missing, the connection is not established and no communication is possible. So the three-way handshake is also one of the ways TCP ensures reliability.
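
To get a feel for how long a client may wait before giving up, here is a small sketch. It assumes that the timeout doubles after every unanswered SYN (exponential backoff); the initial timeout and the number of retries are parameters of the sketch, since the real values depend on the operating system and its configuration. `total_syn_wait` is a hypothetical helper name.

```c
/* Total seconds spent waiting if every SYN goes unanswered.
 * initial_timeout: wait after the first SYN, in seconds.
 * retries: number of retransmissions after the first SYN.
 * Assumption: the timeout doubles after each attempt. */
long total_syn_wait(long initial_timeout, int retries) {
    long total = 0;
    long timeout = initial_timeout;
    for (int attempt = 0; attempt <= retries; attempt++) {
        total += timeout;   /* wait for this attempt to time out */
        timeout *= 2;       /* back off before the next attempt */
    }
    return total;
}
```

For example, with an initial timeout of 1 second and 6 retries, the waits are 1 + 2 + 4 + 8 + 16 + 32 + 64 = 127 seconds in total.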

What is the purpose of the three-way handshake?
Answer: Put simply, it synchronizes the sequence numbers and acknowledgment numbers of the two parties and exchanges their TCP window sizes.
Why are three handshakes needed when two might seem to be enough?
Answer: The third handshake prevents a stale connection request segment that suddenly reaches the server from causing an error. With only two handshakes, the server would treat such a delayed SYN as a new connection and allocate resources for a client that never intended to connect.

Four-way wave

After the data transmission between the client and the server is complete, the client has no more requests, so it calls close on its file descriptor. This sends a FIN end segment to the server, and the client enters the FIN_WAIT_1 state and waits for the server's response. When the server receives the FIN, it enters the CLOSE_WAIT state and replies to the client with an ACK. When the client receives this ACK, it enters the FIN_WAIT_2 state. When the server in turn calls close, it sends a FIN end segment to the client and enters the LAST_ACK state. When the client receives the server's FIN, it replies with an ACK and enters the TIME_WAIT state; after TIME_WAIT ends, it enters CLOSED and its side of the disconnect is complete. When the server receives the client's final ACK, it enters the CLOSED state as well, and the connection is fully closed.
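
The state changes described above can be written down as a small transition function. This is a simplified sketch covering only the path in this section; it leaves out other cases such as both sides closing at the same time. The enum and function names are hypothetical and chosen for this example.

```c
enum tcp_state {
    ST_ESTABLISHED, ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_TIME_WAIT,
    ST_CLOSE_WAIT, ST_LAST_ACK, ST_CLOSED
};

enum tcp_event {
    EV_CLOSE,        /* the application calls close(): a FIN is sent */
    EV_RECV_FIN,     /* a FIN arrives: the kernel replies with an ACK */
    EV_RECV_ACK,     /* the ACK of our FIN arrives */
    EV_2MSL_TIMEOUT  /* the TIME_WAIT timer expires */
};

/* Returns the next state; unknown combinations leave it unchanged. */
enum tcp_state next_state(enum tcp_state s, enum tcp_event e) {
    switch (s) {
    case ST_ESTABLISHED:
        if (e == EV_CLOSE)        return ST_FIN_WAIT_1; /* active closer */
        if (e == EV_RECV_FIN)     return ST_CLOSE_WAIT; /* passive closer */
        break;
    case ST_FIN_WAIT_1:
        if (e == EV_RECV_ACK)     return ST_FIN_WAIT_2;
        break;
    case ST_FIN_WAIT_2:
        if (e == EV_RECV_FIN)     return ST_TIME_WAIT;
        break;
    case ST_TIME_WAIT:
        if (e == EV_2MSL_TIMEOUT) return ST_CLOSED;
        break;
    case ST_CLOSE_WAIT:
        if (e == EV_CLOSE)        return ST_LAST_ACK;
        break;
    case ST_LAST_ACK:
        if (e == EV_RECV_ACK)     return ST_CLOSED;
        break;
    default:
        break;
    }
    return s;
}
```

Feeding the client the events EV_CLOSE, EV_RECV_ACK, EV_RECV_FIN, EV_2MSL_TIMEOUT walks it from ESTABLISHED to CLOSED; the server takes the path EV_RECV_FIN, EV_CLOSE, EV_RECV_ACK.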

CLOSE_WAIT and LAST_ACK states

During the three-way handshake the server can send its SYN and ACK in the same segment, so why are the server's FIN and ACK sent separately during the four-way wave? The reason is as follows.

First of all, a FIN is sent as a result of calling close. When the client calls close, it sends a FIN end segment and enters the FIN_WAIT_1 state. The server's user-space code, however, does not see this segment directly. The kernel handles it and replies with an ACK on its own; this step is not controlled by user code. The server's FIN, in contrast, is sent only when the user code calls close. The kernel's ACK and the application's close therefore do not necessarily happen at the same moment, so the FIN and the ACK are not necessarily sent together. Note the word "necessarily": they can still end up in the same segment. During the three-way handshake, however, the server's SYN is sent directly by the kernel, so it can go out together with the ACK.

If the server code never calls close, the FIN end segment is never sent, and the server side of the connection stays in the CLOSE_WAIT state for a long time. What impact does this have? 
A connection that stays in CLOSE_WAIT means that its file descriptor has not been closed and returned to the system. A large number of CLOSE_WAIT connections is therefore a resource leak. Eventually there may be no file descriptors left to allocate, so new clients cannot connect, and the impact can be severe.
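
One way to avoid lingering CLOSE_WAIT connections is to close the descriptor as soon as read reports end of file, which is how the peer's FIN becomes visible to user code. The helper below is a minimal sketch; `serve_until_eof` is a hypothetical name, and a real server would process the data instead of only counting it.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads from a connected stream socket until the peer's FIN shows up
 * as read() == 0, then closes the descriptor so that our own FIN is
 * sent and the connection can leave CLOSE_WAIT.
 * Returns the number of bytes read, or -1 on a read error. */
long serve_until_eof(int fd) {
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;          /* a real server would handle buf here */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;            /* interrupted by a signal: retry */
        close(fd);               /* EOF or error: always release the fd */
        return n == 0 ? total : -1;
    }
}
```

The key design choice is that every exit path closes the descriptor, so neither a normal end of file nor a read error can leave it open.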

TIME_WAIT

After the client sends the final ACK, it enters the TIME_WAIT state. What is the client doing in this state? 
The answer is: waiting! The client stays in TIME_WAIT for 2MSL in case the final ACK is lost.

MSL stands for Maximum Segment Lifetime, the longest time a segment can survive in the network, measured from when it is sent until it is received or discarded. 
Under Linux, you can use cat /proc/sys/net/ipv4/tcp_fin_timeout to view a related timeout value. Strictly speaking, this file sets how long an orphaned connection stays in FIN_WAIT_2; the TIME_WAIT duration itself is fixed at 60 seconds in the Linux kernel. 

After the client sends the final ACK, why does it wait for 2MSL?

This is to ensure that the last ACK arrives. The client enters TIME_WAIT right after sending the last ACK. That ACK needs at most one MSL to reach the server. If it is lost, the server, having received no ACK, retransmits its FIN, which needs at most one more MSL to reach the client. One MSL for the ACK plus one MSL for the retransmitted FIN is exactly 2MSL. If the client sees no FIN during those 2MSL, it concludes that the server received the ACK, and the connection is closed. 

Here you can see that after the client exits, it enters the TIME_WAIT state.

In other words, during TIME_WAIT the kernel still keeps the TCP connection between the client and the server. 
In some cases the server is the one that initiates the disconnect, so the server enters FIN_WAIT_1 first and eventually ends up in the TIME_WAIT state. What problem does that cause?

Here, we terminated the server and found that it entered the TIME_WAIT state. When we then tried to restart the server, it failed to start because binding the port number failed. Why is that?

This is because in the TIME_WAIT state the TCP connection still exists, so the port number is still bound. When the server is started again, the port has not yet been released, so bind reports a failure.

Now suppose the server handles a large number of client connections, each of which lives only briefly, with many new requests arriving every second. If the server is the side that actively closes the connections, it accumulates a large number of TIME_WAIT connections and may run short of ports for handling new connections. How can this be solved?

At this point, setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) can be used to solve the problem, where opt is an int set to 1. The option allows creating multiple socket descriptors with the same port number but different IP addresses, and it lets bind succeed while earlier connections on that port are still in TIME_WAIT. Just call it between socket() and bind().
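
A minimal sketch of where the call goes is shown below. `make_reusable_listener` is a hypothetical helper name, and binding to INADDR_ANY with a backlog of 16 are choices made only for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a listening socket with SO_REUSEADDR enabled.
 * Returns the listening descriptor, or -1 on failure. */
int make_reusable_listener(uint16_t port) {
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0) return -1;

    /* Must be set after socket() and before bind(). */
    int opt = 1;
    if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                   &opt, sizeof(opt)) < 0) {
        close(listenfd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);

    if (bind(listenfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listenfd, 16) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;
}
```

With the option set, a restarted server can typically bind its port again even while connections from the previous run are still in TIME_WAIT.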
 

Origin blog.csdn.net/qq_37659294/article/details/104561843