It's too hard, somebody save me: I still don't understand TCP's three-way handshake and four-way wave.

A few days ago I posted on WeChat Moments and saw that the girl I have had a crush on for a long time gave it a like, so I tossed and turned and couldn't sleep all night! Does she have feelings for me? Why else would she suddenly like my post? Should I seize the chance and confess?

So the next day I rehearsed my confession over and over in my head, and even practiced my breathing. That night I called her on WeChat voice. Before she could say anything, I couldn't hold back and launched into a frantic monologue... Five minutes in one breath, and it all came out so naturally!

But after I finished, I waited a long time without any response... Finally I heard her voice: "Hello? Hello? My signal is bad, I didn't hear a word you just said. I'm out shopping with my boyfriend...".

I hung up and did a thorough post-mortem of my failed confession. The reason: I never learned TCP properly!

If I had understood TCP, I would at least have asked "Are you there?" before confessing! Establish a reliable connection first, and only start confessing once the connection is known to work!

If I had understood TCP, I would have asked for continual acknowledgment while I was speaking, to make sure she heard every word. Then the confession might have succeeded!

So, all because I hadn't learned TCP properly, I walked into the library...

Let's first look at the definition of TCP:

TCP stands for Transmission Control Protocol. It is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. TCP was designed specifically to provide a reliable end-to-end byte stream over an unreliable internetwork.

We know every word in it, but strung together they are not so easy to understand! So let's pull out the keywords, the ones I highlighted above: connection-oriented, reliable, byte-stream-based, transport layer, protocol, end-to-end! Once you understand these keywords, you understand how TCP works, so let's analyze them one by one!

Transport layer

Let's start with the transport layer, because it lets us look at TCP from a relatively high level. First, the classic OSI seven-layer network reference model:

When we exchange data over a network, the data passes through several layers, and each layer has its own implementations. TCP, the subject of this article, is one implementation of the transport layer. When the transport layer comes up we naturally think of TCP, but TCP is only one transport layer protocol; UDP is another common one.

I know dry text is too abstract, so let's capture a packet to make these layers more concrete! All the packets in this article were produced by sending requests with Postman and captured with Wireshark. If you are not familiar with these two tools, read up on them first; I won't explain them here. Enter the domain name www.17coding.info in Postman and send a request, and Wireshark will capture the packets.

The figure marks the relationship between each layer and the captured packet. Wait! Didn't we just talk about a 7-layer reference model? Why does the packet only show 5 layers? Note the word "reference": the 7-layer model is a theoretical model. In real networks the application, presentation, and session layers are usually merged into a single application layer!

What is a protocol?

A protocol is an agreement that both parties abide by. For example, you can read every word of this article and understand what I mean because we both follow the grammar of the same language, which is itself a protocol. Likewise, when we write code we must follow the language's grammar so that the compiler can compile it correctly.

Computer networks have many protocols too: common application layer protocols include HTTP, FTP, and DNS, and common transport layer protocols include TCP and UDP. Each of these protocols is really a specification that both sender and receiver follow. If we follow the specification, we can implement the protocol ourselves, for example by writing our own web server to handle user requests. We can even define our own protocol for others to use!

TCP header format

We just defined what a protocol is, and TCP of course has a specification of its own, so that both sides of a conversation can recognize each other's segments and exchange data. Let's first look at the TCP segment format:

A TCP segment consists of a header and a data body. In the diagram the header has 5 fixed rows and 1 variable-length row. Each fixed row occupies 4 bytes (32 bits), so the fixed part of the header is 5*4=20 bytes!

At this point we can capture a packet to reinforce the picture. We again send a request to www.17coding.info and look at the TCP part of the packet:

Next, we will analyze the TCP header row by row:

First row:

1. Source port: the sender's port

2. Destination port: the receiver's port

Earlier we said that TCP is end-to-end, and this is where it shows! Every segment carries the ports of both the sender and the receiver. Each port occupies 2 bytes (16 bits).

Second and third rows:

1. Sequence number: TCP is byte-stream oriented. Data is buffered and sent in chunks, and the sequence number gives the position, within the whole byte stream, of the first byte carried in this segment.

2. Acknowledgment number: In each reply, the receiver tells the sender how many bytes it has received so far, that is, the number of the next byte it expects. The value is generally the received sequence number plus the length of the data in the received segment.

The sequence number and acknowledgment number are essential to TCP's reliability. We will analyze them in detail through packet captures later! Each occupies 4 bytes (32 bits).
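
The rule for the acknowledgment number can be written out as a tiny sketch. The function name `expected_ack` is made up for illustration. The wrap-around at 2^32 follows from the field being 32 bits wide, and the sketch assumes that the SYN and FIN flags each use up one sequence number, which is why the handshake later acknowledges with Seq+1.

```python
SEQ_SPACE = 2 ** 32  # sequence and acknowledgment numbers are 32-bit fields

def expected_ack(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Acknowledgment number the receiver sends back for one segment."""
    consumed = payload_len + int(syn) + int(fin)  # SYN and FIN count as one each
    return (seq + consumed) % SEQ_SPACE

print(expected_ack(1000, 500))       # 1500: sequence number + data length
print(expected_ack(7, 0, syn=True))  # 8: a bare SYN is acknowledged with Seq+1
```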

Fourth row:

1. Data offset: "Header length" would be a more fitting name. As mentioned, part of the TCP header is variable-length, so this field says where the data part of the segment begins. It occupies 4 bits and counts in units of 4 bytes.

2. Reserved: Unused, kept for future extensions. It occupies 3 bits.

3. Flags: There are 9 flags in total, 1 bit each, 9 bits altogether. You can see these 9 flags in the screenshot above!
3.1. NS: Nonce Sum, related to ECN (Explicit Congestion Notification).
3.2. CWR: Congestion Window Reduced. CWR and the ECE flag below work together with the ECN field in the IP header. When CWR is 1, it tells the other side that the sender has reduced its congestion window.
3.3. ECE: ECN-Echo. When set, it tells the other side that the network path from the other side to this side is congested.
3.4. URG: Urgent. Lets the sender jump the queue. For example, if you want to stop a file download halfway through, an urgent request tells the other side to stop sending data; urgent data does not wait in the queue.
3.5. ACK: Acknowledgment. Marks the segment as carrying an acknowledgment.
3.6. PSH: Push. The counterpart of URG on the receiving side: it asks the receiver to hand the data to the application right away instead of leaving it in the buffer.
3.7. RST: Reset. Indicates that a serious error has occurred and the TCP connection may need to be re-created. For example, if a website never finishes loading and we refresh with F5, the packets of the previous connection are rejected.
3.8. SYN: Synchronize. Used when requesting a new connection; it is carried during the handshake!
3.9. FIN: Finish. Used when communication is over and the connection is released; it is carried during the wave!

4. Window: Both sender and receiver have corresponding send and receive windows. Each side advertises its receive window to the other; the sender sizes its send window according to the receiver's receive window, and the send window is further limited by the congestion window, which is covered in the congestion control section. During transmission the window is adjusted to match the receiver's processing capacity. This value plays a major role in TCP's reliable transmission and flow control! It occupies 16 bits.

Fifth row:

1. Checksum: Used to check whether the segment is complete or has been modified. It occupies 16 bits.

2. Urgent pointer: Marks the urgent data in this segment: the data from the start of the data part up to the position the pointer gives is urgent. It only takes effect when the URG flag is set. It occupies 16 bits.

Sixth row:

1. Options: The options carry some important data. Let's pick out a few:

1.1. MSS: Maximum segment size, the maximum amount of data each segment may carry, as negotiated by the two sides (not counting the segment header).

1.2. WS: Window scale, also called the window factor, is used to extend the window size. We already have a window size field, so what is the window factor for? Early networks and hardware were modest, so only 16 bits were reserved for the window, giving a maximum of 65535. As hardware and networks improved, 65535 was no longer enough, so the WS option was added as an extension. The option carries a shift count, and the window factor is 2 raised to that count. If WS is negotiated, the actual window size equals the window field multiplied by the window factor.

1.3. SACK: Selective ACK. Selective acknowledgment builds on cumulative acknowledgment (described later). SACK is only sent when out-of-order segments have been received. If the receiver gets later segments while an earlier one is missing, it tells the sender which blocks it has received, so the sender can work out which segments are missing and need to be retransmitted!

2. Padding: This field pads the header to a multiple of 4 bytes. Java uses similar alignment tricks in many places!
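
To make the rows above concrete, here is a minimal sketch that decodes the fixed 20 bytes of a TCP header with Python's `struct` module. The function name `parse_tcp_header` and the sample segment are made up for illustration (the sample is built by hand, not taken from the capture), and the 9-flag layout follows the description above.

```python
import struct

# Flag names from the lowest bit (FIN) up to the ninth bit (NS).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR", "NS"]

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20 bytes of a TCP header (rows one to five)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_and_flags >> 12) * 4   # data offset is counted in 4-byte units
    flag_bits = offset_and_flags & 0x1FF        # the low 9 bits hold the flags
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "options_len": header_len - 20,         # whatever exceeds the fixed part
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)],
        "window": window, "checksum": checksum, "urgent_pointer": urgent,
    }

# Hand-built SYN segment: data offset 8 (32-byte header), 12 zero bytes standing in for options.
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (8 << 12) | 0x002, 64240, 0, 0) + bytes(12)
header = parse_tcp_header(sample)
print(header["header_len"], header["options_len"], header["flags"])  # 32 12 ['SYN']
```

The sketch leaves the options undecoded; it only reports how many bytes they occupy.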

Let's pick a packet and look at its header in detail:

1. The red part shows that the TCP header is 32 bytes long, of which the options take 12 bytes. The fixed part of the header is 20 bytes, as we said earlier, so 20+12=32.
2. The yellow line shows a window field value of 259 and a window factor of 256, so the actual window size is 259*256=66304 bytes!
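
The window arithmetic can be checked with a short sketch. The function name `effective_window` is made up for illustration; it assumes the window scale option carries a shift count between 0 and 14, so a window factor of 256 corresponds to a shift count of 8.

```python
def effective_window(window_field: int, shift_count: int) -> int:
    """Scale the 16-bit window field by 2**shift_count (the window factor)."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("the window field is 16 bits wide")
    if not 0 <= shift_count <= 14:
        raise ValueError("the window scale shift count ranges from 0 to 14")
    return window_field << shift_count

print(effective_window(259, 8))  # 66304, the same as 259 * 256
```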

How to understand connection-oriented

In my failed confession, I started confessing before making sure the connection worked, so I finished talking and she heard none of it because of the bad signal. If I had checked the connection first, this would not have happened! We said earlier that TCP is connection-oriented, so how does TCP do that?

What does each of the three handshakes establish?

Yes, it all starts with a handshake! We all know TCP needs a three-way handshake to establish a connection, but what does each step establish? Wouldn't two steps be enough? Let's first look at what happens when a phone call connects:

A: Hello, can you hear me?
B: I can hear you. Can you hear me?
A: I can hear you too.

Before the real conversation starts, we often confirm with these three lines to make sure the call is reliable. Are all three necessary? What does each one accomplish?

A: Hello, can you hear me? (Lets B know that A can speak)
B: I can hear you. Can you hear me? (Lets A know that B can hear and speak)
A: I can hear you too. (Lets B know that A can hear)

Only after all three lines can each side be sure that its voice reaches the other and that it can hear the other's voice. Then the conversation can proceed. Here is the classic three-way handshake diagram:

Let's analyze the three-way handshake and the state after each step:

1. Host A sends a segment with the flag SYN=1 (SYN means A asks to establish a connection with B, as mentioned in the TCP header section) and sequence number Seq=x. After sending this first handshake request, A's state is SYN_SENT. When B receives the request, its state changes from LISTEN to SYN_RCVD!

2. After receiving the connection request, host B sends a segment with the flags SYN=1 and ACK=1 (SYN means B asks to establish a connection with A; ACK answers A's connection request), sequence number Seq=y, and acknowledgment number Ack=x+1. When A receives B's acknowledgment, A's state becomes ESTABLISHED, while B is still in SYN_RCVD!

3. After host A receives the segment, it checks whether Ack is correct. If so, it sends a segment with the flag ACK=1 (answering B's connection request), sequence number Seq=x+1, and acknowledgment number Ack=y+1. Once B receives A's acknowledgment, both A and B are ESTABLISHED!

Points to note here:

1. The SYN and ACK shown in brackets in the figure are the flag bits of the TCP header described above, while Seq and Ack are the sequence number and acknowledgment number.

2. After receiving the sender's Seq, the receiver replies with an Ack whose value is Seq+1, meaning it expects the sender's next data to start at position Seq+1.

3. After receiving A's connection request, B sets both SYN and ACK in its reply, so its own connection request and its answer to A travel in the same segment. This is why three handshakes are enough to establish the connection.
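
The three steps can be traced with a small sketch. It is only a walk through the numbers described above, not a real network exchange; the function name `three_way_handshake` and the initial sequence numbers 100 and 300 are made up for illustration, and each recorded state is the state once the segment has been received.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers wrap around at 32 bits

def three_way_handshake(x: int, y: int) -> list:
    """Return (sender, flags, seq, ack, state of A, state of B) for each step."""
    steps = []
    # 1. A -> B: SYN, Seq=x. A is SYN_SENT; B moves from LISTEN to SYN_RCVD.
    steps.append(("A", "SYN", x, None, "SYN_SENT", "SYN_RCVD"))
    # 2. B -> A: SYN+ACK, Seq=y, Ack=x+1. A becomes ESTABLISHED on receipt.
    steps.append(("B", "SYN,ACK", y, (x + 1) % SEQ_SPACE, "ESTABLISHED", "SYN_RCVD"))
    # 3. A -> B: ACK, Seq=x+1, Ack=y+1. B becomes ESTABLISHED on receipt.
    steps.append(("A", "ACK", (x + 1) % SEQ_SPACE, (y + 1) % SEQ_SPACE,
                  "ESTABLISHED", "ESTABLISHED"))
    return steps

for sender, flags, seq, ack, state_a, state_b in three_way_handshake(100, 300):
    print(f"{sender} sends [{flags}] Seq={seq} Ack={ack} -> A={state_a}, B={state_b}")
```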

We again send a request to www.17coding.info; here is the three-way handshake capture:

In the Info column we can clearly see the flags described above, header fields such as Seq, Ack, and Win, and header options such as MSS! So the three-way handshake does not just establish a connection, it also negotiates some parameters!

When I select a row with the mouse and that packet acknowledges another packet (that is, it carries the ACK flag), a small tick appears in the No. column of the acknowledged packet. In the picture above, I selected the packet of the third handshake, and a small tick appears in front of the packet of the second handshake.

Why does the handshake take three steps but the wave take four?

With the three-way handshake, the two sides have established a reliable connection and can transfer data! When the transfer is complete the connection must be closed, because a connection is also a resource! Closing the connection takes a four-way wave!

Why does the handshake need only three steps while the wave needs four? Could the wave be done in three? In principle, nothing is wrong with that! Consider this dialogue:

A: I'm finished. Hang up once you're done!
B: OK, I'm done too, so I can hang up!
A: OK, goodbye.
hang up...

Three lines are enough to wave goodbye here. But in a real network, the server's response to my request may be large and take a long time to transmit! So when the client asks to disconnect, the server first replies with an acknowledgment, and only sends its own disconnect request after all its data has been sent.

A: I'm finished. Hang up once you're done!
B: OK...
B:...
B: I'm finished, I can hang up.
A: OK, goodbye.
hang up...

So in most cases four waves are needed! In my own packet captures, though, the disconnect sometimes completes in three waves, when the server sends its ACK and FIN in the same segment.

Here is the classic four-way wave diagram:

Let's analyze the four-way wave and the state after each step:

1. Host A sends a segment with the flag FIN=1 (FIN means A asks to close the connection) to close data transmission from A to B. A's state is now FIN_WAIT_1!

2. After receiving the close request, host B sends an ACK to A (answering A's close request), and A no longer sends data to B. A's state is now FIN_WAIT_2 and B's is CLOSE_WAIT!

3. Host B sends a segment with the flag FIN=1 to close data transmission from B to A. After A receives it, A's state is TIME_WAIT and B's is LAST_ACK!

4. After receiving the close request, host A sends an ACK to B, and B no longer sends data to A. B becomes CLOSED as soon as it receives the ACK, while A becomes CLOSED after waiting in TIME_WAIT.
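
The state changes during the wave can be sketched as a small lookup table. This is a simplified sketch that covers only the path described above (A closes first, B follows); the names `TRANSITIONS` and `run` and the event labels are made up for illustration.

```python
# (current state, event) -> next state, for the close sequence described above.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",   # A starts the first wave
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",    # A gets B's ACK (second wave)
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",     # A gets B's FIN and answers with the last ACK
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",     # A finally closes
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",   # B gets A's FIN and answers with an ACK
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",      # B starts the third wave
    ("LAST_ACK", "recv ACK"): "CLOSED",          # B gets the last ACK (fourth wave)
}

def run(events, state="ESTABLISHED"):
    """Follow a list of events and return every state visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path

side_a = run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
side_b = run(["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(side_a))
print(" -> ".join(side_b))
```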

In the figure, A stays in TIME_WAIT for 2MSL before becoming CLOSED. MSL (Maximum Segment Lifetime) is the longest time any segment may exist on the network; after that the segment is discarded. What is the purpose of staying in TIME_WAIT for 2MSL?

1. Host A sends an ACK to host B in the fourth wave. If A closed the connection right after sending it and B never received the ACK because of network problems, B could not close the connection! So after sending the acknowledgment A keeps waiting: if B does not receive it, B resends its FIN and A can acknowledge again.

2. Without the 2MSL wait, the client's port could be reused right away. If the same port opened a new connection to the server, leftover segments from the old connection could interfere with the new one, because both use the same four-tuple!

Let's look at the wave packets from the request to www.17coding.info:

You may not see the four wave packets right away in your capture! That is because HTTP/1.1 and later enable persistent connections by default: after a request, the connection is not closed immediately but is reused by later requests, which saves the cost of re-establishing a connection each time! If you want to capture the four wave packets right after sending a request, set the HTTP header Connection: close. Then every request shows the complete three-way handshake and four-way wave!

How does TCP guarantee reliable transmission?

We have already covered connection orientation: establishing a connection is the first step toward reliable transmission. How is data kept reliable once the connection is established?

Let's go back to the phone call. In a normal conversation both sides interact and respond to each other; it is not one person talking on and on while the other stays silent! For example:

A: Let me tell you, I met a girl online last week.
B: Oh, awesome!
A: Then we arranged to meet yesterday.
B: Nice! And then?
A: Then we @#¥%……&
B: Damn, I didn't catch what you just said. Say it again?

This kind of acknowledgment and response keeps the conversation complete and reliable. TCP uses the same acknowledgment and retransmission mechanism to achieve reliable transmission over an unreliable network: as long as I have not received an acknowledgment, I assume the data was not delivered and send it again.

Stop-and-wait protocol

With the stop-and-wait protocol, after sending each packet the sender waits for the other side's acknowledgment before sending the next one! The following cases can occur:

1. No error: A sends packet M1 to B, and B acknowledges it on receipt. When A receives B's acknowledgment, it sends packet M2.

2. Timeout retransmission: A sends packet M1 to B. If the packet is lost on the way, A receives no acknowledgment and resends it after a timeout. The timeout is slightly longer than one round-trip time (RTT).

3. Lost acknowledgment: If B's acknowledgment is lost on the way to A, A resends packet M1 to B. Since B has already processed M1, it discards the duplicate and resends the acknowledgment of M1 to A.

4. Late acknowledgment: A sends packet M1 to B, but B's acknowledgment is delayed. A resends M1; B discards the duplicate and resends the acknowledgment of M1. A therefore receives more than one acknowledgment for M1, and simply discards the late one.

As we can see, the stop-and-wait protocol always waits for the acknowledgment before sending the next packet. As long as I have not received your acknowledgment, I assume you did not receive my packet and resend it! This is reliable, but it leads to low channel utilization!
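
The first three cases can be played through in a small simulation. This is a sketch under simple assumptions: losses are scripted in advance as (packet index, attempt number) pairs instead of happening at random, there is no real timer, and the late-acknowledgment case is left out. The function name `stop_and_wait` is made up for illustration.

```python
def stop_and_wait(packets, lost_data=frozenset(), lost_acks=frozenset()):
    """Simulate stop-and-wait. Losses are (packet index, attempt number) pairs."""
    delivered = []  # data B has passed on to its application
    log = []        # what happened on each attempt
    for index, payload in enumerate(packets):
        attempt = 0
        while True:
            attempt += 1
            name = f"M{index + 1} attempt {attempt}"
            if (index, attempt) in lost_data:
                log.append(f"{name}: data lost, A times out and resends")
                continue
            if len(delivered) == index:
                delivered.append(payload)  # first copy: B accepts it
            # otherwise it is a duplicate: B discards it but still acknowledges
            if (index, attempt) in lost_acks:
                log.append(f"{name}: acknowledgment lost, A times out and resends")
                continue
            log.append(f"{name}: acknowledged, A moves on")
            break
    return delivered, log

# M1 is lost once on the way to B; the first acknowledgment of M2 is lost.
delivered, log = stop_and_wait(["a", "b", "c"], lost_data={(0, 1)}, lost_acks={(1, 1)})
print(delivered)
for line in log:
    print(line)
```

Every packet reaches B exactly once even though two transmissions went wrong, but the log shows five attempts for three packets, which is the utilization cost mentioned above.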

Pipelined transmission

Pipelined transmission sends several packets at a time without stopping to wait for an acknowledgment after each one. Because data is always flowing on the channel, channel utilization is much higher!

How is reliability ensured with pipelining? The sender maintains a send window. If the send window is 5, it sends 5 packets in a row and then waits for acknowledgments! When an acknowledgment arrives from the receiver, the window slides and the sixth packet is sent.

Acknowledging every packet individually would be inefficient, so TCP uses cumulative acknowledgment! If the sender sends packets 1, 2, 3, 4, the receiver only needs to acknowledge packet 4, which means packets 1 through 4 have all arrived and the fifth packet can be sent. What if packets 1, 2, 3, 4 are sent and packet 3 is lost? TCP then acknowledges only up to packet 2 and selectively acknowledges packet 4 (the SACK option mentioned in the TCP header section), so the sender knows packet 4 arrived and only needs to resend packet 3.
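
The two examples above can be reproduced with a short sketch. It numbers whole packets, as the text does, rather than bytes as real TCP does, and the function name `ack_for` is made up for illustration.

```python
def ack_for(received):
    """Return (highest packet received in order, later packets to report via SACK)."""
    cumulative = 0
    while cumulative + 1 in received:  # extend while there is no gap
        cumulative += 1
    sacked = sorted(n for n in received if n > cumulative)
    return cumulative, sacked

print(ack_for({1, 2, 3, 4}))  # (4, []): one acknowledgment covers all four packets
print(ack_for({1, 2, 4}))     # (2, [4]): packet 3 is missing, packet 4 is reported by SACK
```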

Continuing the earlier capture, the receiver does not acknowledge every packet but acknowledges several packets cumulatively:

Here we can see that the client acknowledges only once after the server has sent several packets.

Flow control and congestion control

So far we know that a reliable connection plus the acknowledgment mechanism ensures the reliability of TCP! But every computer has different processing power. What if I send too fast and the other side cannot keep up? How do the two sides coordinate the rate of sending and receiving?

Byte-based sliding window

When introducing the TCP header we already mentioned the sliding window and its control field Win, as well as the receive window and the send window. How are the two related?

Suppose A needs to send data to B. B first tells A how large its receive window is, and A sets its own send window accordingly: A's send window cannot be larger than B's receive window! Before any data is transferred, the initial windows look like this:

As the figure above shows, B's receive window is set to 10 bytes, so A's send window cannot exceed 10 bytes! When transmission starts, A packs the data into several packets and sends them, as shown below:

Until it receives B's acknowledgment, A's window does not slide, which means A can send at most 10 bytes. When B receives data and acknowledges it to A, A's window slides, as shown in the following figure:

Now A can go on to send bytes 11 and 12! If B's processing capacity drops, it can also tell A to shrink its send window! In this way the receiving and sending capacities of the two sides stay well coordinated, which is how TCP achieves reliable transmission and flow control!

Suppose the transmission above continues. If the packet carrying bytes 3, 4, and 5 is lost on the way but the data after it arrives, does A's send window move?

In that case A's send window does not move. When B receives the later packets, it replies with Ack set to 3 and adds a SACK option (described in the TCP header options) to tell A which data has already been received, so A knows which data needs to be retransmitted!
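
The sender's side of this walkthrough can be sketched as a small class. It is a simplified sketch: bytes are numbered from 1 as in the description above, retransmission and SACK handling are left out, and the class name `SendWindow` and its methods are made up for illustration.

```python
class SendWindow:
    """Sender-side sliding window, counted in bytes."""

    def __init__(self, size):
        self.size = size     # the receiver's advertised window
        self.base = 1        # oldest byte not yet acknowledged
        self.next_byte = 1   # next new byte to send

    def sendable(self):
        """How many new bytes may be sent right now."""
        return max(0, self.base + self.size - self.next_byte)

    def send(self, count):
        """Send up to count bytes; return the (first, last) byte sent, or None."""
        count = min(count, self.sendable())
        if count == 0:
            return None
        first = self.next_byte
        self.next_byte += count
        return first, self.next_byte - 1

    def on_ack(self, ack, window=None):
        """ack is the next byte the receiver expects; window is an optional new size."""
        if ack > self.base:
            self.base = ack  # the window slides forward
        if window is not None:
            self.size = window

a = SendWindow(10)
print(a.send(10))    # (1, 10): the whole window is used up
a.on_ack(3)          # B has received bytes 1 and 2
print(a.send(5))     # (11, 12): the window slid by two bytes
a.on_ack(3)          # bytes 3 to 5 are lost, so B keeps answering Ack=3
print(a.sendable())  # 0: the window does not move
```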

Congestion control

The sliding window coordinates the receiving and sending capacities of the two sides well. But network conditions are complicated, and there may be thousands of senders and receivers on the same network! If everyone transmits at once with no control measures, the whole network can become congested or even paralyzed.

Suppose I drive from Shenzhen to Guangzhou on the highway. If I were the only driver, the road would be clear! But the highway is not mine alone; anyone can use it! When a holiday arrives everyone pours onto it, and the highway's capacity does not grow just because it is a holiday! Traffic control and flow limiting are then needed to keep traffic moving!

1. The green line is the ideal case, assuming the highway's throughput is 100! When no more than 100 vehicles need to pass, all of them get through smoothly! When more than 100 need to pass, 100 get through each time, and the capacity delivered stays stable.

2. The red line is the case with no traffic control, again with a throughput of 100! Even when the number of vehicles does not exceed 100, slight congestion occurs! As the number of vehicles keeps growing, congestion becomes severe and traffic may even come to a standstill!

3. The blue line is the case with traffic control, again with a throughput of 100! Even when the number of vehicles does not exceed 100, slight congestion occurs! But as the number of vehicles grows, the road keeps running at high load and never comes to a standstill!

The network is like the highway, the packets are like the vehicles, and TCP is like a traffic officer keeping data transmission in order! How does TCP do it?

Slow start and congestion avoidance

The sender maintains a cwnd (congestion window; note that the amount of data the sender may have in flight is limited by both this congestion window and the receiver's window described above!), which starts at 1. If no packet is lost, the congestion window is raised to 2! If there is still no loss, it is raised to 4! It keeps doubling like this until it reaches 16. After that it grows by one each time (17, 18, 19, ...) until it reaches the limit set by the receiver's window. This is the so-called slow start and congestion avoidance, and 16 is the slow start threshold...

Doesn't this feel like being given an inch and taking a mile?

I'll just edge forward a little, I won't go in ...
...
I'm in, but I won't move ...
...
I ...

If packet loss is detected during transmission, the congestion window is reset to 1 and the new slow start threshold is set to half of the congestion window at the moment congestion occurred. For example, if loss occurs when the congestion window is 24, the new slow start threshold becomes 12! If you followed the description above, the following figure will not be hard to understand!
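
The numbers above can be reproduced with a short simulation. It is a sketch under simple assumptions: the window is counted in whole packets, it changes once per round trip, losses are scripted by round number, and the receiver's window is ignored. The function name `slow_start_trace` is made up for illustration.

```python
def slow_start_trace(rounds, ssthresh=16, loss_rounds=frozenset()):
    """Return the congestion window used in each round and the final threshold."""
    cwnd = 1
    history = []
    for current_round in range(rounds):
        history.append(cwnd)
        if current_round in loss_rounds:
            ssthresh = max(cwnd // 2, 2)     # half of the window at the time of loss
            cwnd = 1                         # start over with slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double every round
        else:
            cwnd += 1                        # congestion avoidance: add one every round
    return history, ssthresh

history, threshold = slow_start_trace(19, loss_rounds={12})
print(history)    # [1, 2, 4, 8, 16, 17, ..., 24, 1, 2, 4, 8, 12, 13]
print(threshold)  # 12
```

The loss is placed in round 12 because that is the round in which the window reaches 24, matching the example in the text.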

Fast retransmission

Earlier I covered cumulative acknowledgment and selective acknowledgment, both of which are related to fast retransmission! When the receiver notices a missing packet, it does not wait to send a cumulative acknowledgment: it immediately repeats the same acknowledgment for each later packet that arrives. When the sender receives three duplicate acknowledgments, it concludes that the packet was lost and retransmits it right away!

As the figure below shows, when a packet is lost the receiver's Ack stays at 50 while SACK selectively acknowledges bytes 60 to 89! The sender then knows that bytes 50 to 59 were lost and retransmits them!
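
The sender's duplicate counting can be sketched in a few lines. The acknowledgment stream below is invented to match the example (Ack stuck at 50 until the gap is filled, then jumping to 90), and the function name `fast_retransmit` is made up for illustration.

```python
def fast_retransmit(acks, threshold=3):
    """Return the byte positions the sender retransmits without waiting for a timeout."""
    last_ack = None
    duplicates = 0
    retransmitted = []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:    # third duplicate: treat the packet as lost
                retransmitted.append(ack)  # resend starting at the byte the receiver expects
        else:
            last_ack = ack
            duplicates = 0
    return retransmitted

print(fast_retransmit([30, 40, 50, 50, 50, 50, 90]))  # [50]
```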

Fast recovery

Dropping the congestion window to 1 on every packet loss is rather silly. It would be great to have a way to recover quickly, and TCP has one: fast recovery! When a loss is detected through duplicate acknowledgments, slow start is not run again; instead TCP goes straight to congestion avoidance. That is, the congestion window is set to the new slow start threshold and grows additively from there!
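
The difference between the two reactions to loss can be put side by side in a small sketch. It is simplified: real implementations also adjust the window temporarily while recovery is in progress, which is left out here. The function name `on_loss` and the labels "timeout" and "dup_acks" are made up for illustration.

```python
def on_loss(cwnd, detected_by):
    """Return (new congestion window, new slow start threshold) after a loss."""
    ssthresh = max(cwnd // 2, 2)   # half of the window at the time of loss
    if detected_by == "timeout":
        return 1, ssthresh         # back to slow start from 1
    if detected_by == "dup_acks":
        return ssthresh, ssthresh  # fast recovery: continue from the new threshold
    raise ValueError("detected_by must be 'timeout' or 'dup_acks'")

print(on_loss(24, "timeout"))   # (1, 12)
print(on_loss(24, "dup_acks"))  # (12, 12)
```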

Now that you have read the whole article, let's go back to the definition of TCP. Does it make sense this time?

TCP stands for Transmission Control Protocol. It is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. TCP was designed specifically to provide a reliable end-to-end byte stream over an unreliable internetwork.

Origin blog.csdn.net/weixin_45784983/article/details/108578875