1. Three-way handshake

1.1 Three-way handshake process

The specific process of the three-way handshake is as follows:

The server process prepares to accept incoming TCP connections, usually by calling socket, bind and listen in that order. This is called a passive open. The server process is then in the LISTEN state, waiting for connection requests from clients.
The client performs an active open by calling connect, which sends a connection request to the server. In this segment the SYN flag is set to 1 and the client chooses an initial sequence number, written seq = x. A SYN segment normally carries no data, but it still consumes one sequence number. The client then enters the SYN-SENT state.
After the server receives the connection request, it acknowledges it. In the acknowledgment segment both the SYN and ACK flags are set to 1, the acknowledgment number is ack = x + 1, and the server chooses its own initial sequence number seq = y. This segment carries no data either, but it also consumes one sequence number. The server then enters the SYN-RECEIVED state.
After receiving the server's response, the client acknowledges it in turn. In this segment ACK is set to 1, the sequence number is seq = x + 1, and the acknowledgment number is ack = y + 1. TCP allows this segment to carry data. If it carries none, it consumes no sequence number, so the next data segment still starts at seq = x + 1. The client then enters the ESTABLISHED state.
After receiving the client's acknowledgment, the server also enters the ESTABLISHED state.
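
The sequence and acknowledgment numbers in the steps above can be written out as a small calculation. This is only an illustrative sketch: the Segment class and the function name are made up for this example, and the modulo 2**32 reflects the fact that TCP sequence numbers are 32-bit values that wrap around.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    """Hypothetical, simplified view of a TCP header (illustration only)."""
    syn: int
    ack_flag: int
    seq: int
    ack: Optional[int]  # None when the ACK flag is not set


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the three handshake segments for client ISN x and server ISN y."""
    # 1st segment: client -> server, SYN = 1, seq = x
    syn = Segment(syn=1, ack_flag=0, seq=x, ack=None)
    # 2nd segment: server -> client, SYN = 1, ACK = 1, seq = y, ack = x + 1
    syn_ack = Segment(syn=1, ack_flag=1, seq=y, ack=(x + 1) % SEQ_SPACE)
    # 3rd segment: client -> server, ACK = 1, seq = x + 1, ack = y + 1
    ack = Segment(syn=0, ack_flag=1, seq=(x + 1) % SEQ_SPACE, ack=(y + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]
```

For example, with x = 100 and y = 300 the second segment carries ack = 101 and the third carries seq = 101 and ack = 301.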

This is the typical three-way handshake: a TCP connection is established by the three segments above. Besides letting both parties know that a connection is being set up, the handshake lets them exchange their initial sequence numbers and negotiate parameters (for example the maximum segment size) through the TCP options field.

The party that sends the first SYN is said to perform an active open and is usually called the client. The party that receives this SYN and answers with its own SYN is usually called the server, and its way of opening the connection is a passive open.
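
The passive and active open can be tried out with the standard Python socket module. The sketch below is only one possible arrangement: it assumes that the loopback address 127.0.0.1 is available, lets the operating system pick a free port, and uses a made-up function name. The handshake itself is performed by the kernel and is not visible in the code.

```python
import socket
import threading


def run_handshake_demo() -> bytes:
    # Passive open: socket -> bind -> listen puts the server in LISTEN.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]

    received = []

    def serve() -> None:
        conn, _ = server.accept()       # returns a connection whose handshake is done
        with conn:
            received.append(conn.recv(1024))

    worker = threading.Thread(target=serve)
    worker.start()

    # Active open: connecting sends the SYN and returns after the handshake.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello")

    worker.join()
    server.close()
    return received[0]
```

Calling run_handshake_demo() returns the bytes that the server side received over the newly established connection.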

1.2 Why is there a three-way handshake, not two or four?

Confirm that the sending and receiving capabilities of the client and server are normal

First, what the three-way handshake means: when a TCP connection is established, the client and the server exchange a total of three packets. Its main purpose is to confirm that both parties can send and receive, and to agree on the initial sequence numbers that later reliable transmission depends on. In essence, the client connects to a given port on the server, and the two sides synchronize their sequence and acknowledgment numbers and exchange TCP window size information.

The first handshake: the client sends a network packet, and the server receives it. In this way, the server can conclude that the sending capability of the client and the receiving capability of the server are normal.
The second handshake: the server sends a packet, and the client receives it. In this way, the client can conclude that the receiving and sending capabilities of the server and the receiving and sending capabilities of the client are normal. However, at this time, the server cannot confirm whether the receiving ability of the client is normal.
The third handshake: the client sends a packet, and the server receives it. In this way, the server can conclude that the client's receiving and sending capabilities are normal, and the server's own sending and receiving capabilities are also normal.

If there were only two handshakes, the client would know that both parties can send and receive, but the server would not know whether its own sending and the client's receiving work. A third handshake is therefore needed so that both sides can confirm this.
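
The reasoning above can be made concrete by tracking what each side has confirmed after each handshake. This is a toy model with made-up names, not part of any TCP implementation.

```python
from typing import Dict, Set

ALL_CAPABILITIES = {"client_send", "client_recv", "server_send", "server_recv"}


def knowledge_after(handshakes: int) -> Dict[str, Set[str]]:
    """Return what each side has confirmed after the given number of handshakes."""
    known: Dict[str, Set[str]] = {"client": set(), "server": set()}
    if handshakes >= 1:
        # The server received the client's SYN.
        known["server"] |= {"client_send", "server_recv"}
    if handshakes >= 2:
        # The client received a SYN-ACK that answers its own SYN.
        known["client"] |= ALL_CAPABILITIES
    if handshakes >= 3:
        # The server received an ACK that answers its SYN-ACK.
        known["server"] |= {"server_send", "client_recv"}
    return known
```

With two handshakes, knowledge_after(2)["server"] still lacks "server_send" and "client_recv". Only knowledge_after(3) gives both sides the full set.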

Prevent a stale ("historical") connection request from initializing a connection

Consider what could happen with only two handshakes:

Suppose the client sends a connection request that is not lost but is held up for a long time at some network node. Receiving no acknowledgment, the client retransmits the request. The retransmitted request reaches the server, an acknowledgment comes back, and the connection is established. After the data transfer completes, the connection is released. Only then does the delayed first request reach the server. The server mistakes it for a new connection request and sends an acknowledgment agreeing to establish a connection. With a two-way handshake, the connection is established as soon as the server sends that acknowledgment. The client, which made no such request, ignores the acknowledgment and sends no data, while the server keeps waiting for data and wastes its resources. In short, with two handshakes the server has no intermediate state in which it can learn from the client that the request is stale, so it may establish a historical connection and waste resources.

This is why the third handshake is needed. When a stale request arrives at the server after a delay, the client can tell from the acknowledgment number in the server's reply that it does not answer any request the client currently has outstanding, so the client does not confirm it (a real TCP stack answers with RST). Because the third segment never arrives, the connection is not established, the server does not wait for data, and the wasted resources are kept small.
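
The difference between the two variants can be reduced to a very small model. The function below is a hypothetical illustration: it only counts whether the server ends up holding an established connection after a stale SYN arrives, depending on the handshake variant and on whether the client confirms the server's reply.

```python
def server_connections(handshake_steps: int, client_confirms: bool) -> int:
    """Count connections the server holds after a stale SYN has arrived.

    handshake_steps: 2 or 3, the handshake variant being modelled.
    client_confirms: whether the client answers the server's reply with an ACK.
    """
    if handshake_steps not in (2, 3):
        raise ValueError("only the two-way and three-way variants are modelled")
    established = 0
    # In both variants the server receives the SYN and sends its reply.
    if handshake_steps == 2:
        # Two-way: the server treats the connection as open right away.
        established += 1
    elif client_confirms:
        # Three-way: the connection opens only after the client's ACK.
        established += 1
    return established
```

A stale SYN is never confirmed by the client, so server_connections(2, False) is 1 (a useless connection) while server_connections(3, False) is 0.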

Synchronize the initial sequence number (ISN) of both parties

Both parties in a TCP connection maintain a sequence number, which is a key element of reliable transmission. It lets the receiver discard duplicates, deliver data in order, and tell the sender which data has been received.

Sequence numbers therefore play a very important role in a TCP connection. When the client sends a SYN carrying its initial sequence number, the server must return an ACK to show that the client's SYN was received. Likewise, when the server sends its own initial sequence number to the client, it needs an acknowledgment from the client. Only then are the initial sequence numbers of both parties reliably synchronized.
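
How a receiver uses sequence numbers to drop duplicates, restore order and know what has been received can be sketched as follows. This is a simplified model with a made-up function name: it assumes that segments never partially overlap and it ignores sequence number wrap-around, both of which a real TCP implementation has to handle.

```python
from typing import Dict, List, Tuple


def reassemble(isn: int, segments: List[Tuple[int, bytes]]) -> Tuple[bytes, int]:
    """Rebuild the byte stream from (seq, payload) pairs.

    Returns the in-order data and the next expected sequence number,
    which is the value the receiver would place in its ack field.
    """
    expected = isn + 1                   # the SYN consumed sequence number isn
    pending: Dict[int, bytes] = {}       # out-of-order segments keyed by seq
    data = b""
    for seq, payload in segments:
        if seq < expected:               # already received: drop the duplicate
            continue
        pending.setdefault(seq, payload)
        while expected in pending:       # deliver everything that is now contiguous
            chunk = pending.pop(expected)
            data += chunk
            expected += len(chunk)
    return data, expected
```

For instance, with isn = 100 the segments (104, b"def"), (101, b"abc"), (101, b"abc") are delivered as b"abcdef", and the next expected sequence number is 107.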

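Why not a four-way handshake?

Three handshakes are the theoretical minimum for establishing a reliable connection. A four-way handshake would send the server's ACK and its SYN as two separate segments, but they can be combined into one, so more exchanges are unnecessary.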

1.3 SYN Flood (SYN flood attack)

1.3.1 What is SYN Flood

SYN Flood is an attack that exploits a weakness in the TCP three-way handshake. The attacker sends a large number of TCP SYN packets (connection requests) to the target host. For each one, the target host replies with a TCP SYN-ACK packet, indicating that a connection can be established. The attacker discards these replies and never sends the final TCP ACK packet, so a large number of unfinished connection requests accumulate in the target host's connection queue. Eventually the target host can no longer process legitimate connection requests, and the service becomes unavailable.

1.3.2 Basic Principles of SYN Flood

The attacker sends a large number of forged TCP connection requests, each with the SYN flag set, without intending to complete any connection.
For each SYN, the server replies with a SYN-ACK (synchronize-acknowledge), indicating that the connection request is accepted.
The attacker never returns the final ACK (acknowledgment), so no real connection is established.
The attacker keeps repeating these steps, so the server holds a growing number of connections that are waiting for an ACK and consume resources.
Once the server's resources are exhausted, connection requests from legitimate clients can no longer be answered, and the service becomes unavailable.
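
The effect on the server can be illustrated with a toy model of its queue of half-open connections. The class below is hypothetical and does not send any packets. It only shows why a full queue leads to legitimate requests being dropped.

```python
class HalfOpenQueue:
    """Toy model of a server's queue of half-open (SYN-RECEIVED) connections."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pending = set()   # peers that sent a SYN but not yet the final ACK

    def on_syn(self, peer: str) -> bool:
        """Return True if a SYN-ACK is sent, False if the SYN is dropped."""
        if len(self.pending) >= self.capacity:
            return False       # queue full: the request cannot be served
        self.pending.add(peer)
        return True

    def on_ack(self, peer: str) -> bool:
        """Return True if the final ACK completes a pending handshake."""
        if peer in self.pending:
            self.pending.remove(peer)
            return True
        return False
```

If a queue with capacity 3 receives three forged SYNs that are never confirmed, on_syn for a legitimate peer returns False until one of the entries is removed.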

1.3.3 How to defend against SYN Flood attacks

To prevent SYN Flood attacks, the following measures can be taken:

Connection limit: limit the number of half-open connections the server keeps at the same time. When the limit is reached, new connection requests are discarded. This keeps the server from holding a large number of half-open connections in a short period and so reduces the impact of a SYN Flood attack. The limit can be set at the operating system level or in the firewall configuration.
Use SYN Cookies: instead of storing state for each SYN, the server computes a cookie value with a cryptographic hash during the handshake and sends it to the client as the initial sequence number of the SYN-ACK. When the client returns an ACK that acknowledges the cookie value, the target host can verify from the cookie that the connection is legitimate.
Filtering and rate limiting: use a firewall or an intrusion detection/prevention system (IDS/IPS) to filter and rate-limit traffic so that large numbers of forged SYN requests do not reach the server. This weakens a SYN Flood attack and keeps the attack traffic within an acceptable range.
Load balancing: Use load balancing equipment to distribute traffic to multiple servers, thereby evenly distributing attack traffic to different servers and reducing the pressure on a single server.
Cloud protection: Deploy the server on the cloud service provider's platform, allowing the cloud service provider to handle large-scale SYN Flood attacks. Cloud service providers usually have strong infrastructure and protection mechanisms that can effectively deal with DDoS attacks.
Limit concurrent connections: The application layer can limit the number of concurrent connections from a single IP address. This prevents a single IP address from sending a large number of connection requests in a short period of time, slowing down the impact of an attack.
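
The idea behind SYN Cookies can be sketched as follows. This is a deliberately simplified illustration and not the layout used by any real operating system: the secret key, the function names and the use of HMAC-SHA256 are assumptions made for this example, and real implementations also encode values such as the maximum segment size in the cookie. The point is only that the server can check the final ACK without having stored anything when the SYN arrived.

```python
import hashlib
import hmac

SECRET = b"server-secret"   # hypothetical key; a real server would use random bytes
SEQ_SPACE = 2 ** 32


def make_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                client_isn: int, time_slot: int) -> int:
    """Derive a 32-bit value to use as the server's initial sequence number."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{client_isn}|{time_slot}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(ack: int, src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                 client_isn: int, time_slot: int) -> bool:
    """Check that the final ACK acknowledges cookie + 1, using no stored state."""
    cookie = make_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, time_slot)
    return ack == (cookie + 1) % SEQ_SPACE
```

In this sketch the server recomputes the cookie from the addresses, ports and client sequence number found in the ACK segment itself (the client's initial sequence number is the segment's seq minus 1), so forged SYNs that are never confirmed cost the server no memory.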

2. Four-way wave

2.1 Four-way wave process

The specific process of the four-way wave is as follows:

The client application asks to release the connection: it stops sending data and actively closes the TCP connection. The client host sends a connection-release segment with the FIN flag set to 1, no data, and sequence number seq = u. The client host then enters the FIN-WAIT-1 state.
After the server host receives this segment, it sends an acknowledgment with ACK = 1, its own sequence number seq = v, and ack = u + 1. The server host then enters the CLOSE-WAIT state.
After the client host receives the acknowledgment, it enters the FIN-WAIT-2 state and waits for the server to send its connection-release segment.
When the server has no more data to send, it sends its own connection-release segment with FIN = 1, ACK = 1, sequence number seq = v (assuming it sent no data after its acknowledgment) and ack = u + 1. The server host then enters the LAST-ACK state.
After the client receives the server's connection-release segment, it must acknowledge it. It sends a segment with ACK = 1, sequence number seq = u + 1 (its FIN consumed one sequence number and it has sent no data since), and ack = v + 1, and then enters the TIME-WAIT state. Note that the TCP connection has not been released yet. The client enters the CLOSED state only after a wait of 2MSL, where MSL is the Maximum Segment Lifetime.
After the server host receives the client's acknowledgment, it enters the CLOSED state, so the server finishes closing the TCP connection earlier than the client. Because releasing the connection takes four segments in total, the process is called the four-way wave.
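
The state changes in the steps above can be collected in a small transition table. The event names are made up for this sketch, and the table only covers the ordinary close described here, not simultaneous close or other paths of the full TCP state machine.

```python
from typing import List

# (state, event) -> next state, for the four segments described above.
TRANSITIONS = {
    # Active closer (the client in the description above)
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN, send ACK"): "TIME-WAIT",
    ("TIME-WAIT", "2MSL timeout"): "CLOSED",
    # Passive closer (the server in the description above)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE-WAIT",
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}


def run(events: List[str], state: str = "ESTABLISHED") -> List[str]:
    """Apply the events in order and return every state visited."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited
```

Running the client's events gives ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, TIME-WAIT, CLOSED, and running the server's events gives ESTABLISHED, CLOSE-WAIT, LAST-ACK, CLOSED.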

Either side of the TCP connection can initiate the close operation, but usually the client initiates the close connection operation. However, some servers, such as web servers, will also initiate the operation of closing the connection after responding to the request. The TCP protocol stipulates that a close operation is initiated by sending a FIN message.

2.2 TIME_WAIT

After the communication parties establish a TCP connection, the party that actively closes the connection will enter the TIME_WAIT state. The TIME_WAIT state is also called the 2MSL wait state. In this state, TCP will wait twice the maximum segment lifetime (Maximum Segment Lifetime, MSL) time.

MSL needs to be explained here

MSL is the expected maximum lifetime of a TCP segment, that is, the longest time it may exist in the network. This time is bounded because TCP segments are carried in IP datagrams, and the TTL field of an IP datagram limits how long the datagram can live. RFC 793 sets the MSL to 2 minutes, but the value is implementation-dependent and differs between operating systems.

Based on this, let's discuss the state of TIME_WAIT.

When TCP performs an active close and sends the final ACK, the connection must stay in TIME_WAIT for twice the maximum segment lifetime so that TCP can resend the final ACK if it is lost. The final ACK is resent not because TCP retransmits ACKs, but because the other party retransmits its FIN. The other party keeps retransmitting the FIN because it needs the ACK before it can close the connection. If a retransmitted FIN arrives after the 2MSL wait has ended, the end that performed the active close answers with RST, and the other party reports an error.

Another effect of the 2MSL wait is that, while TCP is in this state, both parties treat the connection (client IP address, client port number, server IP address, server port number) as non-reusable. The connection can be used again only when the 2MSL wait ends, when a new connection uses an initial sequence number higher than the highest sequence number used by the previous instance of the connection, or when the timestamp option makes it possible to tell segments of the previous instance apart. Unfortunately, some implementations impose stricter constraints: if a port number is used by any endpoint in the 2MSL wait state, that port number cannot be used again until the wait ends.
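
On systems with the stricter constraint, a restarted server may fail to bind its port while old connections are still in the 2MSL wait. A common way to deal with this, which the text above does not cover, is the SO_REUSEADDR socket option. The sketch below assumes the standard Python socket module. The function name is made up, and the exact semantics of the option differ between operating systems.

```python
import socket


def make_listener(port: int = 0) -> socket.socket:
    """Create a listening socket that may bind while old connections are in TIME_WAIT."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(); behaviour is OS-dependent.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))   # port 0 lets the OS pick a free port
    listener.listen()
    return listener
```

The option does not shorten the TIME_WAIT state itself. It only relaxes the rule that prevents the local address and port from being bound again.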

2.2.1 The significance of the existence of TIME_WAIT state

Ensure reliable connection termination: when a TCP connection is closed, the party that closes actively sends the last ACK. If that ACK is lost, the other party retransmits its FIN. The TIME_WAIT state keeps the connection state long enough for the retransmitted FIN to arrive and the ACK to be sent again, so that both directions are closed.
Handle delayed segments: delayed segments of the connection may still be in the network and arrive after the connection is closed. The TIME_WAIT state lets these segments arrive and be discarded, so that a subsequent connection does not receive old data by mistake.

2.2.2 Why the TIME_WAIT state lasts 2MSL

The duration of the TIME_WAIT state is set to twice the MSL (Maximum Segment Lifetime), mainly to ensure that all connection-related segments on the network can completely disappear in the network. During the closing of a TCP connection, there are two aspects to consider:

Reliably deliver the final acknowledgment: the final ACK may be lost or delayed in the network. It needs at most one MSL to reach the other party, and a FIN retransmitted in response needs at most one more MSL to come back. Waiting 2MSL therefore gives the active closer time to receive a retransmitted FIN and resend the ACK, which ensures that the connection is closed reliably.
Let old segments expire: if the resources were released immediately and a new connection with the same addresses and ports were opened, segments of the old connection that are still in the network could be mistaken for data of the new one. After 2MSL, the segments of the old connection in both directions have expired, which rules out this kind of data corruption.

MSL is the maximum time that a packet can live in an IP network. Because a segment may be delayed, duplicated or retransmitted in the network, using twice the MSL as the duration of the TIME_WAIT state ensures that all segments related to the connection have disappeared from the network, which keeps TCP connections reliable and stable.

It is worth noting that once the TIME_WAIT state ends, the TCP connection can be closed immediately, because the resources of the old connection are no longer needed. The wait of twice the MSL exists to make closing reliable even when the network is unstable or latency is high. In a very stable, low-latency network a shorter wait may be acceptable.

2.3 CLOSE_WAIT

CLOSE_WAIT is a state in the closing process of a TCP connection. It indicates that the remote application has finished sending data and closed its sending direction (its FIN has been received and acknowledged), while the local application has not yet closed the connection. In this state the local application can still send data to the remote application, and the TCP connection stays in CLOSE_WAIT until the local application also closes it.

How long a connection stays in CLOSE_WAIT is determined by the local application. The connection enters CLOSE_WAIT when the FIN from the remote side arrives. If the local application still needs to send some data or finish some work, the connection stays in this state until the application closes it.

Once the local application closes the connection, TCP sends its own FIN and the connection moves from CLOSE_WAIT to LAST_ACK. When the remote side acknowledges that FIN, the connection enters CLOSED. The remote side, which closed first, is in FIN_WAIT_2 during this time: it has finished sending data, but it can still receive data from the local application.

The CLOSE_WAIT state usually causes no problems, but connections that stay in CLOSE_WAIT for too long waste system resources. In practice, a TCP connection should therefore not be left in CLOSE_WAIT longer than necessary. A large number of connections in CLOSE_WAIT usually means that the application does not close its sockets after the remote side has closed. Closing the socket as soon as the end of the data is detected lets the connection leave CLOSE_WAIT, which improves system performance and resource utilization.
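
How a connection passes through CLOSE_WAIT can be observed with a small loopback experiment. In the sketch below, which uses the standard Python socket module and a made-up function name, the client closes its sending direction first. On the server side recv then returns b"", and the server's connection is in CLOSE_WAIT from that point until the application closes the socket. The example assumes that the loopback address 127.0.0.1 is available.

```python
import socket
import threading


def half_close_demo() -> bytes:
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))           # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _ = server.accept()
        with conn:
            while conn.recv(1024):           # b"" means the client's FIN has arrived
                pass
            # The connection is now in CLOSE_WAIT: sending is still allowed.
            conn.sendall(b"bye")
        # Leaving the with-block closes conn, which sends the server's FIN.

    worker = threading.Thread(target=serve)
    worker.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"request")
        client.shutdown(socket.SHUT_WR)      # close only the sending direction (FIN)
        reply = b""
        while chunk := client.recv(1024):    # read until the server's FIN
            reply += chunk

    worker.join()
    server.close()
    return reply
```

The with-block on the server side is what keeps the connection from lingering in CLOSE_WAIT: the socket is closed as soon as the server has finished its own sending.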

2.4 The difference between TIME_WAIT and CLOSE_WAIT

TIME_WAIT and CLOSE_WAIT are two states in the process of closing a TCP connection. They have some differences:

CLOSE_WAIT state:
- The CLOSE_WAIT state appears on the side that closes passively, which is the server in the process described above. It indicates that the server has received and acknowledged the client's connection close request (FIN segment), while the server's application layer has not finished its own data transmission yet. That is, the client has closed its data channel to the server, but the server can still send data to the client. Once the application has processed the remaining data, the server sends its own close request, enters the LAST_ACK state, and finally completes the four-way wave.
TIME_WAIT state:
- The TIME_WAIT state appears on the side that closes actively, which is the client in the process described above. It indicates that the client has sent its close request (FIN segment), received the server's acknowledgment (ACK segment) and FIN, and sent the final ACK. In the TIME_WAIT state the client waits for a period (usually twice the MSL, the maximum segment lifetime) so that all delayed segments in the network are discarded. This prevents a subsequent connection from mistaking old packets for its own. When the wait is over, the client enters the CLOSED state and the connection is completely closed.

To summarize the differences:

The CLOSE_WAIT state is on the server side (the passive closer): the server waits for the application layer to finish processing and close the connection, and it can still send data to the client.
The TIME_WAIT state is on the client side (the active closer): the client waits for a period so that all delayed segments in the network are discarded, and then completely closes the connection.

Computer Networking - TCP three-way handshake and four-way wave