Computer Networks: The TCP Three-Way Handshake Explained in Detail

TCP is connection-oriented: before the two sides exchange data, they must first establish a connection, and that connection is then used for all data transfer until it is closed. Establishing a connection involves allocating memory (buffers), initializing state variables, and negotiating parameters between the two sides. The process takes three successful message exchanges, which is why it is commonly called the "three-way handshake".

In layman's terms, these three communications are:

  • Initiator: "Hi, may I establish a connection with you?" (send request, wait for reply)
  • Recipient: "Okay, I'm ready, come on."
  • Initiator: "Okay, thanks, I'm sending you data now."

The actual implementation involves many more details, which are described one by one below.

1. TCP segment structure

To understand which messages are exchanged during the three-way handshake, you first need to know what fields a TCP segment consists of and which of them play a key role in the process.
We focus on the following fields:

  • Sequence number: identifies the segment's position in the byte stream; it is the byte-stream number of the first byte in the segment
  • Acknowledgment number: the sequence number of the next byte the sender of the segment expects to receive; the field is valid only when the ACK flag is 1
  • ACK: indicates that the acknowledgment number field is valid
  • SYN: synchronize sequence numbers; set to 1 in the segments that open a connection
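
To make these fields concrete, here is a minimal Python sketch that decodes the fixed 20-byte part of a TCP header and pulls out the four fields listed above. The function name `parse_tcp_header` and the sample values (ports, sequence number, window) are made up for illustration; options and checksum validation are ignored.

```python
import struct

SYN_FLAG = 0x02  # bit position of SYN in the flags field
ACK_FLAG = 0x10  # bit position of ACK in the flags field


def parse_tcp_header(header: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (options are not handled)."""
    (src_port, dst_port, seq, ack_num,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", header[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,          # sequence number
        "ack_num": ack_num,  # acknowledgment number (meaningful only if ACK is set)
        "SYN": bool(offset_flags & SYN_FLAG),
        "ACK": bool(offset_flags & ACK_FLAG),
    }


# A hand-built SYN segment header: data offset 5 (20 bytes), only SYN set.
example_syn = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | SYN_FLAG, 65535, 0, 0)
parsed = parse_tcp_header(example_syn)
print(parsed)
```

The acknowledgment number in this sample is 0 and the ACK flag is clear, which is how the first segment of a handshake looks.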

2. Three-way handshake

Next, look more closely at how a TCP connection is established.
Suppose a process running on one host (client) wants to establish a connection with a process on another host (server). The TCP in the client will establish a TCP connection with the TCP in the server as follows:

  • Step 1: The client's TCP sends a special TCP segment to the server's TCP. This segment carries no application-layer data, and the SYN bit in its header is set to 1, so it is called a SYN segment. The client also randomly chooses an initial sequence number (client_isn) and places it in the sequence number field. After sending the SYN segment, the client enters the SYN_SENT state and waits for the server's acknowledgment.
  • Step 2: When the server receives the SYN segment (recognizable by its SYN bit being 1), it allocates TCP buffers and variables for the connection and sends a connection-granted segment back to the client's TCP. This segment also carries no application-layer data. Its SYN bit is 1, its ACK bit is 1, its acknowledgment number is client_isn+1, and its sequence number field holds the server's own initial sequence number, server_isn. This segment is called the SYNACK segment. The server then enters the SYN_RECV state.
  • Step 3: When the client receives the SYNACK segment (recognizable by the SYN bit, the ACK bit, and the acknowledgment number), it also allocates buffers and variables for the connection. The client then sends the server an acknowledgment segment with the ACK bit set to 1 and the acknowledgment number set to server_isn+1. The SYN bit is 0 this time, because SYN is set to 1 only in the first two segments of the handshake. This third segment may carry application-layer data. The client now enters the ESTABLISHED state, and the server enters ESTABLISHED as soon as it receives this segment. The connection is then fully established and both sides can send data to each other.
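
The three steps can be traced with a small in-memory simulation. This is only a sketch of the bookkeeping: no real sockets are involved, and the `Segment` class and `three_way_handshake` function are hypothetical names introduced for illustration.

```python
import random
from dataclasses import dataclass

MOD = 2 ** 32  # sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    syn: bool
    ack: bool
    seq: int
    ack_num: int = 0


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three segments plus the final client and server states."""
    if client_isn is None:
        client_isn = random.randrange(MOD)
    if server_isn is None:
        server_isn = random.randrange(MOD)

    # Step 1: client sends SYN carrying client_isn.
    syn = Segment(syn=True, ack=False, seq=client_isn)
    client_state = "SYN_SENT"

    # Step 2: server answers with SYNACK, acknowledging client_isn + 1.
    synack = Segment(syn=True, ack=True, seq=server_isn, ack_num=(syn.seq + 1) % MOD)
    server_state = "SYN_RECV"

    # Step 3: client checks the SYNACK and acknowledges server_isn + 1.
    if synack.syn and synack.ack and synack.ack_num == (client_isn + 1) % MOD:
        final_ack = Segment(syn=False, ack=True,
                            seq=(client_isn + 1) % MOD,
                            ack_num=(synack.seq + 1) % MOD)
        client_state = "ESTABLISHED"
        # Server accepts the ACK only if it acknowledges server_isn + 1.
        if final_ack.ack and final_ack.ack_num == (server_isn + 1) % MOD:
            server_state = "ESTABLISHED"
        return [syn, synack, final_ack], client_state, server_state
    return [syn, synack], client_state, server_state


segments, client_state, server_state = three_way_handshake(client_isn=100, server_isn=300)
for s in segments:
    print(s)
```

With client_isn = 100 and server_isn = 300, the SYNACK carries acknowledgment number 101 and the final ACK carries acknowledgment number 301.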

3. SYN flood attack

As discussed above, when the server receives a SYN segment it allocates and initializes connection variables and buffers, and then replies with a SYNACK. Until the client's ACK segment arrives, the connection is not fully established; we call this a half-open connection. If the client never sends the ACK that completes the third step, the server terminates the half-open connection after a timeout and reclaims the allocated resources.

This design makes the server vulnerable to a denial-of-service (DoS) attack known as a SYN flood. The attacker sends a large number of TCP SYN segments without ever completing the third step of the handshake. The server keeps allocating resources for these half-open connections until its connection resources are exhausted.
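
The effect of such a flood can be illustrated with a toy model. The `NaiveServer` class below is hypothetical and reduces "connection resources" to a fixed number of half-open slots; the addresses come from documentation ranges and timeouts are left out for brevity.

```python
class NaiveServer:
    """Toy server that reserves a slot for every SYN it receives."""

    def __init__(self, backlog: int):
        self.backlog = backlog  # maximum number of half-open connections
        self.half_open = set()  # clients we sent a SYNACK to, still awaiting ACK

    def on_syn(self, client: tuple) -> bool:
        """Return True if resources could be allocated for this SYN."""
        if len(self.half_open) >= self.backlog:
            return False        # no slot left: the request is dropped
        self.half_open.add(client)
        return True

    def on_ack(self, client: tuple) -> bool:
        """Complete the handshake and free the half-open slot."""
        if client in self.half_open:
            self.half_open.remove(client)
            return True
        return False


server = NaiveServer(backlog=3)
# The attacker sends SYNs from five spoofed addresses and never sends an ACK.
flood = [server.on_syn(("203.0.113.%d" % i, 40000 + i)) for i in range(5)]
# A legitimate client now finds every slot taken.
legit = server.on_syn(("198.51.100.7", 51000))
print(flood, legit)
```

In this model the first three spoofed SYNs fill the table, and the legitimate client is turned away even though no attacker connection was ever completed.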

One defense against this attack is the SYN cookie.

The idea is that the server does not allocate resources as soon as a SYN arrives. Instead, it waits until the third step to decide whether the initiator is legitimate, and only then allocates resources and establishes the connection.

When the server receives a SYN, it allocates nothing and instead generates its initial sequence number as a hash of the source and destination IP addresses and port numbers in the SYN segment together with a secret number known only to the server. Only someone who knows the secret number can compute this sequence number, which is called the "cookie". The server then sends a SYNACK carrying this initial sequence number. Note that the server keeps no state about the SYN at this point and does not even need to remember the cookie value. If the client never returns an ACK, the server has lost nothing, so a SYN flood can no longer exhaust its connection resources.

How does a legitimate client complete the third step? Nothing changes for the client: it sends an ACK to the server as usual. The extra work is on the server, which must decide whether this ACK acknowledges a SYNACK it sent earlier. Because the SYNACK's sequence number was computed from the source and destination IP addresses and port numbers in the SYN segment plus the server's secret number, and a genuine client's addresses and ports do not change, the server can run the same hash function again, add 1 to the result, and compare it with the acknowledgment number in the ACK. If the two are equal, the ACK corresponds to an earlier SYNACK and is valid, so the server allocates resources and creates the connection.
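
The cookie scheme described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular operating system implements it: the choice of HMAC-SHA256 as the hash, the truncation to 32 bits, and the names `make_cookie` and `ack_is_valid` are all assumptions made for the example. Production implementations typically mix in further values such as a coarse timestamp.

```python
import hashlib
import hmac
import os

MOD = 2 ** 32
SECRET = os.urandom(16)  # the "secret number" known only to the server


def make_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                secret: bytes = SECRET) -> int:
    """Derive the server's initial sequence number from the connection 4-tuple."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")  # keep 32 bits, the size of a sequence number


def ack_is_valid(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                 ack_num: int, secret: bytes = SECRET) -> bool:
    """Recompute the cookie from the ACK's 4-tuple and compare cookie + 1 with ack_num."""
    expected = (make_cookie(src_ip, src_port, dst_ip, dst_port, secret) + 1) % MOD
    return ack_num == expected


# The server sends this value as the SYNACK sequence number and stores nothing.
cookie = make_cookie("198.51.100.7", 51000, "192.0.2.1", 80)
# A genuine client acknowledges cookie + 1, which the server can verify statelessly.
print(ack_is_valid("198.51.100.7", 51000, "192.0.2.1", 80, (cookie + 1) % MOD))
```

Because the check depends only on the incoming ACK and the secret, the server needs no per-connection memory between sending the SYNACK and receiving the ACK.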

4. Why "three times"

First, why three handshakes rather than four or more? This is the easy part: if three messages are enough, a fourth would only waste resources.

The real question is why two are not enough. Can the third handshake be dropped?
With the SYN cookie variant of the handshake described above, the third handshake is clearly necessary, because it is the only point at which the server can validate the client.

What if attacks are not a concern? Would two handshakes be enough?

Xie Xiren's "Computer Network" discusses this question. In short, the three-way handshake prevents a stale (no longer valid) connection request segment from suddenly reaching the server, which would leave the two sides in inconsistent states and waste resources.

A stale (no longer valid) connection request segment arises as follows. The client sends a SYN segment that is held up in the network by congestion or some other cause. The client assumes the segment is lost (it is not) and retransmits the SYN. Suppose the retransmission succeeds and the two sides establish a connection. Everything looks fine, but the original SYN is still in transit. If it reaches the server while the connection is open, the server simply ignores it and nothing goes wrong. But what if it arrives after the connection has been released?

The server believes it has received a new connection request and replies with a SYNACK. With a two-way handshake, the server would consider the connection established at this point. The client, however, never initiated this connection, so it ignores the SYNACK as if nothing had happened. (Even if the client happened to be opening a new connection at that moment, it would still ignore the SYNACK, because the acknowledgment number would not match.)

This is a serious problem. The server believes the connection is up and sends data, but the client is in the CLOSED state and discards those packets, which is wasteful. Worse, if the client now tries to open a real connection, the server ignores the request, so the two sides cannot exchange data at all. These problems are not unsolvable: once the client notices that the server keeps sending data it has to discard, it can send an RST (a segment with the RST bit set to 1) to force the server to close the connection. Even so, resources are wasted in the meantime.

With the three-way handshake these problems do not arise: the server does not treat the connection as established until it receives the client's ACK, and a client that never asked for the connection will not send one.
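
The difference between the two designs can be reduced to a tiny state model. The sketch below is an illustration under simplifying assumptions: an old duplicate SYN reaches the server while the client is CLOSED, the client never answers the resulting SYNACK, and the server's timeout is collapsed into a single step. The function name `server_after_stale_syn` is hypothetical.

```python
def server_after_stale_syn(handshake_steps: int) -> str:
    """Return the server's final state after an old duplicate SYN arrives.

    handshake_steps is 2 for the hypothetical two-way handshake
    and 3 for the real three-way handshake.
    """
    client_state = "CLOSED"    # the client is not trying to connect

    # The server cannot tell the SYN is old, so it replies with a SYNACK.
    if handshake_steps == 2:
        # Two-way design: sending the SYNACK already completes the handshake.
        return "ESTABLISHED"

    # Three-way design: the server waits in SYN_RECV for the client's ACK.
    client_sends_ack = client_state == "SYN_SENT"
    if client_sends_ack:
        return "ESTABLISHED"
    # No ACK ever arrives; the half-open connection times out and is released.
    return "CLOSED"


print("two-way:  ", server_after_stale_syn(2))
print("three-way:", server_after_stale_syn(3))
```

In the two-way case the server ends up holding a connection that the client knows nothing about, while in the three-way case it only holds a half-open entry until the timeout.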


Related reading:
Computer Network - Detailed Explanation of TCP Four-way Waving Process
Three-way Handshake and Four-way Wave of TCP Connection - Analogy of Long-distance Love Couples Beginning to Interact and Break Up (easy to understand)

Origin blog.csdn.net/psq1508690245/article/details/115198272