This article thoroughly explains TCP/IP!


Source: Juejin, Author: Ruheng

Link: https://juejin.im/post/6844903490595061767


One, TCP/IP model

The TCP/IP model (Transmission Control Protocol/Internet Protocol) comprises a family of network protocols that form the foundation of the Internet; it is the Internet's core protocol suite.

The TCP/IP reference model divides the protocols into four layers: the link layer, the network layer, the transport layer, and the application layer. The figure below compares the TCP/IP model with the OSI model.

[Figure: the four-layer TCP/IP model compared with the seven-layer OSI model]

The TCP/IP protocol family encapsulates data layer by layer from top to bottom. The top layer is the application layer, home to familiar protocols such as HTTP and FTP. The second layer is the transport layer, where the well-known TCP and UDP protocols live. The third layer is the network layer; the IP protocol sits here and is responsible for attaching IP addresses and other information to the data so that its destination can be determined. The fourth layer is the data link layer, which adds an Ethernet header to the outgoing data and appends a CRC, preparing it for the final transmission.

 

[Figure: the role of each layer in the TCP/IP stack]

 

The figure above clearly shows the role of each layer in the TCP/IP stack. TCP/IP communication is essentially a process of pushing data onto and popping it off a stack of headers. On the way down, the sender wraps the data at each layer with a header (and, at the link layer, a trailer) carrying the information needed to deliver it to its destination. On the way up, the receiver strips the header and trailer at each layer until the originally transmitted data is recovered.
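To make this stacking and unstacking concrete, here is a minimal, purely illustrative Python sketch. The header strings are placeholders, not real protocol headers: each layer prepends its own marker on the way down, and the receiver strips them in reverse order on the way up.

```python
# Conceptual per-layer encapsulation: each layer prepends its own header to the
# payload handed down by the layer above; the receiver removes them in reverse.

def encapsulate(app_data: bytes) -> bytes:
    tcp_segment = b"TCP_HDR|" + app_data               # transport layer header
    ip_packet   = b"IP_HDR|"  + tcp_segment            # network layer header
    eth_frame   = b"ETH_HDR|" + ip_packet + b"|CRC"    # link layer header + trailer
    return eth_frame

def decapsulate(eth_frame: bytes) -> bytes:
    ip_packet   = eth_frame.removeprefix(b"ETH_HDR|").removesuffix(b"|CRC")
    tcp_segment = ip_packet.removeprefix(b"IP_HDR|")
    return tcp_segment.removeprefix(b"TCP_HDR|")

frame = encapsulate(b"GET / HTTP/1.1")
assert decapsulate(frame) == b"GET / HTTP/1.1"
```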

 

[Figure: the encapsulation process, illustrated with an HTTP request]

The figure above uses the HTTP protocol as a concrete example.

Two, the data link layer

The physical layer is responsible for exchanging streams of 0s and 1s with the physical medium, via voltage levels or pulses of light. The data link layer groups this sequence of 0s and 1s into data frames and transmits them from one node to an adjacent node; each node is uniquely identified by a MAC address (the physical address; every network interface has one).

 

[Figure: data link layer framing]

  • Framing: the network-layer datagram is encapsulated into a frame by adding a header and a trailer; the frame header contains the source MAC address and the destination MAC address.

  • Transparent transmission: achieved with zero-bit stuffing or escape characters, so any bit pattern in the payload can be carried.

  • Reliable transmission: rarely provided on links with a low error rate, but wireless links (WLAN) do ensure reliable delivery.

  • Error detection (CRC): the receiver checks each frame and discards it if an error is found (a small CRC sketch follows this list).
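As a rough illustration of CRC-based error detection, the sketch below uses Python's built-in CRC-32 (not the exact framing or polynomial real Ethernet hardware uses): the sender appends a checksum as the frame trailer and the receiver verifies it, dropping the frame on a mismatch.

```python
import binascii

def make_frame(payload: bytes) -> bytes:
    # Sender: append a 32-bit CRC (big-endian) as the frame trailer.
    crc = binascii.crc32(payload)
    return payload + crc.to_bytes(4, "big")

def check_frame(frame: bytes) -> bool:
    # Receiver: recompute the CRC and compare; on mismatch the frame is discarded.
    payload, received_crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    return binascii.crc32(payload) == received_crc

frame = make_frame(b"network layer datagram")
assert check_frame(frame)            # intact frame passes the check
corrupted = b"X" + frame[1:]
assert not check_frame(corrupted)    # a bit error is detected, frame dropped
```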

Three, the network layer

1. IP protocol

The IP protocol is the core of the TCP/IP suite. All TCP, UDP, ICMP, and IGMP data is transmitted inside IP datagrams. Note that IP is not a reliable protocol: it provides no mechanism for recovering data that fails to arrive. That responsibility is left to the upper-layer protocols, TCP or UDP.

1.1 IP address

At the data link layer we generally use MAC addresses to identify different nodes; the IP layer has an analogous identifier, the IP address.

A 32-bit IP address is divided into a network part and a host part, which reduces the number of entries a router needs in its routing table. Since all terminals sharing a network address lie within the same range, the routing table only needs to maintain one route toward that network address to reach all of the corresponding terminals (a small sketch using Python's ipaddress module follows the address classes below).

Class A addresses: 0.0.0.0 ~ 127.255.255.255
Class B addresses: 128.0.0.0 ~ 191.255.255.255
Class C addresses: 192.0.0.0 ~ 223.255.255.255
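Here is a tiny sketch of the network/host split using Python's standard ipaddress module; the address and prefix length are arbitrary examples.

```python
import ipaddress

# The 32-bit address splits into a network part and a host part; a router only
# needs one routing-table entry per network, not one per host.
iface = ipaddress.ip_interface("192.168.1.37/24")
print(iface.network)               # 192.168.1.0/24 -> the network this host belongs to
print(iface.network.netmask)       # 255.255.255.0
print(iface.ip in iface.network)   # True: the host falls within that network's range
```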

1.2 IP protocol header

[Figure: IP header format]

Only one field is introduced here: the 8-bit TTL (time to live) field. It specifies how many routers the packet may pass through before being discarded. Each time an IP packet traverses a router, its TTL is decremented by 1; when the TTL reaches 0, the packet is automatically discarded.

The maximum value of this field is 255, meaning a packet will be discarded after passing through at most 255 routers. The initial value depends on the operating system; common defaults are 64 or 128.
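A minimal sketch of how a sender can influence this field: the initial TTL of outgoing packets can be set per socket (64 here is just an example value).

```python
import socket

# Each router on the path decrements the TTL by 1; at 0 the packet is discarded
# and an ICMP "time exceeded" message is returned -- exactly what traceroute exploits.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 64)      # set the initial TTL
print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))   # read it back
sock.close()
```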

2. ARP and RARP protocols

ARP is a protocol for obtaining MAC addresses based on IP addresses.

ARP (Address Resolution Protocol) is a resolution protocol: initially a host does not know which interface of which machine an IP address corresponds to. When a host wants to send an IP packet, it first checks its own ARP cache (a locally kept table of IP-to-MAC address mappings).

If the IP-to-MAC pair being looked up is not in the cache, the host broadcasts an ARP request onto the network. This broadcast packet contains the IP address being queried; every host that receives it compares that address with its own, and the host that matches prepares an ARP reply containing its own MAC address and sends it to the host that issued the broadcast.

After receiving the ARP reply, the broadcasting host updates its ARP cache (where the IP-to-MAC table is stored) and then uses the fresh cache entry to build the data link layer frame for the pending packet.
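The control flow can be sketched in a few lines of Python. The addresses and the lan_broadcast stand-in are hypothetical; a real implementation would send an actual Ethernet broadcast frame.

```python
# Conceptual ARP resolution: check the local cache first, otherwise "broadcast"
# a who-has query and cache the reply for next time.

arp_cache = {"192.168.1.1": "aa:bb:cc:dd:ee:01"}   # IP -> MAC table kept by the host

def lan_broadcast(target_ip):
    # Stand-in for the real ARP broadcast: on a real LAN every host receives the
    # request and only the owner of target_ip answers with its MAC address.
    known_hosts = {"192.168.1.20": "aa:bb:cc:dd:ee:14"}
    return known_hosts.get(target_ip)

def resolve(target_ip):
    if target_ip in arp_cache:            # cache hit: no broadcast needed
        return arp_cache[target_ip]
    mac = lan_broadcast(target_ip)        # cache miss: ask the whole LAN
    if mac is not None:
        arp_cache[target_ip] = mac        # remember the answer
    return mac

print(resolve("192.168.1.20"))            # triggers the broadcast, then caches it
print(arp_cache)
```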

RARP works in the opposite direction (resolving an IP address from a MAC address), so it will not be repeated here.

3. ICMP protocol

IP is not a reliable protocol: it does not guarantee that data will be delivered, so the job of ensuring delivery naturally falls to other modules. One important one is ICMP (Internet Control Message Protocol). ICMP is not a high-level protocol; it is regarded as an IP-layer protocol.

When an error occurs while transmitting an IP packet, for example the destination host or route is unreachable, the ICMP protocol packages the error information and sends it back to the host, giving the host a chance to handle the error. This is why protocols built on top of the IP layer can still be made reliable.

Four, ping

Ping is arguably the best-known application of ICMP, and it is part of the TCP/IP suite. The ping command checks whether the network is reachable, which helps us analyse and locate network faults.

For example, when one of the websites we use becomes unreachable, we usually ping it. Ping echoes back some useful information, typically like the following:

[Figure: sample ping output]

The word ping comes from sonar ranging, and that is exactly what the program does: it uses ICMP packets to probe whether another host is reachable. The principle is to send an ICMP echo request (type 8) and have the probed host answer with an ICMP echo reply (type 0).
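As a rough sketch of what the ping tool builds, the code below constructs an ICMP echo request by hand: an 8-byte header (type 8, code 0) plus a small payload, with the standard Internet checksum. Actually sending it needs a raw socket and therefore root/administrator privileges, so only the construction is shown.

```python
import struct

def internet_checksum(data: bytes) -> int:
    # One's-complement sum over 16-bit words (RFC 1071 style).
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    total = (total & 0xFFFF) + (total >> 16)
    total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def icmp_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    # Type 8 = echo request, code 0; the reply comes back as type 0.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = icmp_echo_request(identifier=0x1234, sequence=1)
print(packet.hex())
# Sending would look roughly like this (requires privileges and "import socket"):
#   s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.getprotobyname("icmp"))
#   s.sendto(packet, ("8.8.8.8", 0))
```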

Five, Traceroute

Traceroute is an important tool for probing the route between a host and a destination host, and it is also the most convenient one.

The principle behind Traceroute is very interesting. Given the IP address of the destination host, it first sends a UDP packet with TTL=1. The first router on the path decrements the TTL by 1; since the TTL is now 0, the router discards the packet and returns an ICMP time-exceeded (host unreachable) datagram to the source. On receiving it, the source sends a UDP datagram with TTL=2, which prompts the second router to send back an ICMP datagram, and so on, until a probe finally reaches the destination host. In this way traceroute collects the IP addresses of all the routers along the path.
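A minimal traceroute sketch along these lines is shown below, assuming a Unix-like system. Receiving the ICMP replies requires a raw socket, so it must run with root/administrator privileges; port 33434 is the traditional traceroute destination port, chosen so that the final host answers with an ICMP port-unreachable message.

```python
import socket

def traceroute(dest_ip: str, max_hops: int = 30, port: int = 33434):
    for ttl in range(1, max_hops + 1):
        recv = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
        send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        recv.settimeout(2.0)
        send.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)   # TTL = hop number
        send.sendto(b"", (dest_ip, port))
        router_ip = None
        try:
            _, (router_ip, _) = recv.recvfrom(512)   # ICMP from whoever dropped/answered
            print(f"{ttl:2d}  {router_ip}")
        except socket.timeout:
            print(f"{ttl:2d}  *")
        finally:
            send.close()
            recv.close()
        if router_ip == dest_ip:                     # destination reached
            break

# traceroute("8.8.8.8")   # example target; uncomment to run with sufficient privileges
```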

[Figure: how traceroute discovers each hop with increasing TTL values]

Six, TCP/UDP

Both TCP and UDP are transport layer protocols, but they have different characteristics and suit different application scenarios. The table below compares them.

[Table: comparison of TCP and UDP]

Message-oriented

Message-oriented transmission means that UDP sends whatever the application layer hands it, as-is: one application message becomes one datagram. The application must therefore choose a suitable message size. If the message is too long, the IP layer has to fragment it, reducing efficiency; if it is too short, the resulting IP datagram is too small and the header overhead dominates.

Byte stream oriented

Byte-stream-oriented means that although the application hands data to TCP one block at a time (with varying sizes), TCP treats the application's data as an unstructured stream of bytes. TCP has a buffer: when a block handed down by the application is too long, TCP can split it into shorter segments before transmitting it.
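The difference is easy to see with the standard socket API. The loopback sketch below (arbitrary example data, ephemeral ports) shows UDP preserving message boundaries while TCP delivers one merged byte stream.

```python
import socket
import threading

def udp_demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))                   # let the OS pick a free port
    port = server.getsockname()[1]
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(b"hello", ("127.0.0.1", port))    # one sendto = one datagram
    client.sendto(b"world", ("127.0.0.1", port))
    print("UDP recv 1:", server.recvfrom(100)[0])   # b'hello'  (boundary preserved)
    print("UDP recv 2:", server.recvfrom(100)[0])   # b'world'
    client.close(); server.close()

def tcp_demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def client():
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.connect(("127.0.0.1", port))
        c.sendall(b"hello")                         # two writes...
        c.sendall(b"world")
        c.close()

    threading.Thread(target=client).start()
    conn, _ = server.accept()
    data = b""
    while chunk := conn.recv(100):                  # read until the peer closes
        data += chunk
    print("TCP recv   :", data)                     # ...arrive as one byte stream
    conn.close(); server.close()

udp_demo()
tcp_demo()
```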

Congestion control and flow control are key features of TCP; they are explained later.

Some applications of the TCP and UDP protocols

[Figure: typical applications of TCP and UDP]

When should I use TCP?

Use TCP when there are requirements on the quality of network communication, for example when the data must be delivered to the other side completely and accurately. This is typical of reliability-sensitive applications: file transfer protocols such as HTTP, HTTPS and FTP, and mail protocols such as POP and SMTP.

When should I use UDP?

When the demands on communication quality are lower and speed matters more, UDP can be used.

Seven, DNS

DNS (Domain Name System) is a distributed database on the Internet that maps domain names and IP addresses to each other. It lets users reach the Internet more conveniently, without having to remember the numeric IP strings that machines read directly. The process of obtaining the IP address that corresponds to a host name is called domain name resolution (or host name resolution). The DNS protocol runs on top of UDP and uses port 53.
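In application code, resolution is usually just a library call; the resolver ultimately speaks DNS over UDP port 53 on our behalf. A tiny example (requires Internet access; example.com is just a sample host):

```python
import socket

# Resolve a host name to one IPv4 address.
print(socket.gethostbyname("example.com"))

# Or list every address (IPv4 and IPv6) usable for a TCP connection on port 80.
for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 80,
                                                    proto=socket.IPPROTO_TCP):
    print(family, sockaddr)
```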

Eight, TCP connection establishment and termination

1. Three-way handshake

TCP is connection-oriented: before either side can send data to the other, a connection must be established between them. In the TCP/IP suite, TCP provides a reliable connection service, and the connection is initialized through a three-way handshake. The purpose of the three-way handshake is to synchronize the sequence numbers and acknowledgment numbers of both sides and to exchange TCP window size information. The three steps are described below, followed by a minimal socket sketch.

[Figure: TCP three-way handshake]

The first handshake: establishing the connection. The client sends a connection-request segment with the SYN flag set to 1 and Sequence Number x; the client then enters the SYN_SENT state and waits for the server's confirmation.

The second handshake: the server receives the client's SYN segment and must acknowledge it, setting the Acknowledgment Number to x+1 (Sequence Number + 1); at the same time it sends its own SYN, with the SYN flag set to 1 and Sequence Number y. The server packs all of this into a single segment (the SYN+ACK segment) and sends it to the client, then enters the SYN_RECV state.

The third handshake: the client receives the SYN+ACK segment from the server, sets the Acknowledgment Number to y+1, and sends an ACK segment to the server. Once this segment is sent, both the client and the server enter the ESTABLISHED state, completing the TCP three-way handshake.
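Applications never perform the handshake by hand: it happens inside connect() and accept(). A minimal loopback sketch:

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))         # let the OS choose a free port
server.listen(1)                      # server is now in LISTEN, ready for SYNs
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))   # kernel sends SYN, gets SYN+ACK, replies ACK
conn, addr = server.accept()          # returns once the handshake has completed
print("connection established with", addr)

client.close()
conn.close()
server.close()
```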

Why are three handshakes needed?

To prevent an invalid (stale) connection-request segment from suddenly reaching the server and causing an error.

A concrete example: an "invalid connection-request segment" arises in the following situation. The first connection request sent by the client is not lost, but lingers at some network node for a long time, so that it only reaches the server some time after the connection has been released. It is by then a long-expired segment, but when the server receives it, it mistakes it for a fresh connection request from the client.

So the server sends a confirmation segment to the client, agreeing to establish the connection. If the three-way handshake were not used, the new connection would be established as soon as the server sent its acknowledgment. But since the client never requested this connection, it ignores the server's confirmation and sends no data, while the server believes a new transport connection has been set up and keeps waiting for data from the client. A great deal of server resources would be wasted this way. The three-way handshake prevents this: in the scenario above, the client simply does not acknowledge the server's confirmation, and the server, receiving no acknowledgment, knows that the client did not actually request a connection.

2. Four-way wave (connection termination)

After the client and server have established a TCP connection through the three-way handshake, the connection must be torn down once the data transfer is complete. This is where the famous "four-way wave" comes in.

[Figure: TCP four-way connection termination]

The first wave: host 1 (which may be the client or the server) sets a Sequence Number and sends a FIN segment to host 2; host 1 then enters the FIN_WAIT_1 state. This means host 1 has no more data to send to host 2.

The second wave: host 2 receives the FIN segment from host 1 and returns an ACK segment whose Acknowledgment Number is the received Sequence Number plus 1; host 1 enters the FIN_WAIT_2 state. Host 2 is telling host 1: "I agree to your close request."

The third wave: host 2 sends a FIN segment to host 1, requesting to close the connection, and host 2 enters the LAST_ACK state.

The fourth wave: host 1 receives the FIN segment from host 2 and sends an ACK segment back to host 2, then enters the TIME_WAIT state; host 2 closes the connection as soon as it receives that ACK. Host 1 then waits for 2MSL; if no further segment arrives, it concludes that the other side has closed normally, and host 1 closes the connection as well.

Why four waves?

TCP is a connection-oriented, reliable, byte-stream-based transport layer protocol, and it is full-duplex. When host 1 sends a FIN segment, it only means that host 1 has no more data to send; host 1 is telling host 2 that all of its data has been sent. However, host 1 can still receive data from host 2 at this point. When host 2 returns an ACK segment, it means it now knows that host 1 has no more data to send, but host 2 may still have data to send to host 1. Only when host 2 also sends its FIN segment, indicating that it too has nothing more to send, can the two sides happily tear down the TCP connection.

Why wait for 2MSL?

MSL (Maximum Segment Lifetime) is the longest time any segment can survive in the network before being discarded. There are two reasons for the wait:

  • To ensure that the full-duplex TCP connection can be closed reliably

  • To ensure that any duplicate segments from this connection have disappeared from the network

The first point: if host 1 went directly to CLOSED, then, because of the unreliability of IP or other network problems, host 2 might never receive host 1's final ACK. Host 2 would then retransmit its FIN after a timeout, but since host 1 is already CLOSED, it would find no connection matching the retransmitted FIN. By staying in TIME_WAIT instead of going straight to CLOSED, host 1 can re-send the ACK when the FIN arrives again, ensuring the other side receives it and the connection closes correctly.

The second point: if host 1 went directly to CLOSED and then initiated a new connection to host 2, there is no guarantee that the new connection's port number differs from that of the connection just closed; the two may well be the same. Usually this causes no problem, but there is a special case: if the new connection uses the same port numbers as the closed one, and some data from the old connection is still lingering in the network, that delayed data may arrive at host 2 after the new connection has been established. Because the port numbers match, TCP would treat the delayed data as belonging to the new connection, confusing it with the genuine packets of the new connection. Therefore a TCP connection must wait in TIME_WAIT for twice the MSL, which ensures that all data from the old connection has disappeared from the network.
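TIME_WAIT is also where a well-known practical issue comes from: restarting a server right after it closed its connections can make bind() fail with "Address already in use", because the old (address, port) pair is still being held for 2MSL. The usual remedy is SO_REUSEADDR, sketched below (port 8080 is just an example); note that it allows the bind but does not shorten TIME_WAIT itself.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Allow binding to a port whose previous connections are still in TIME_WAIT.
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))
server.listen(5)
print("listening on", server.getsockname())
server.close()
```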

Nine, TCP flow control

If the sender sends data too fast, the receiver may not be able to keep up, and data will be lost. Flow control means making the sender slow down so that the receiver has time to receive.

The sliding window mechanism makes it easy to implement flow control over the sender on a TCP connection.

Suppose A is sending data to B. When the connection is established, B tells A: "my receive window is rwnd = 400" (rwnd stands for receiver window). The sender's send window must therefore not exceed the receive window value given by the receiver. Note that the TCP window is measured in bytes, not in segments. Assume each segment is 100 bytes long and the initial data sequence number is 1. Uppercase ACK denotes the ACK flag in the header; lowercase ack denotes the value of the acknowledgment field.

[Figure: sliding-window flow control, with rwnd shrinking from 400 to 300, 100, and finally 0]

As the figure shows, B performs flow control three times: first reducing the window to rwnd = 300, then to rwnd = 100, and finally to rwnd = 0, at which point the sender may not send any more data. This pause lasts until host B advertises a new window value. The three segments B sends to A all have ACK = 1; the acknowledgment number field is only meaningful when ACK = 1.

TCP keeps a persist timer for each connection. Whenever one side of the connection receives a zero-window notification from the other, it starts the persist timer. When the timer expires, it sends a zero-window probe segment (carrying 1 byte of data), and the peer re-advertises its current window value when acknowledging this probe.
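A toy simulation of the idea, using the numbers from the text: the sender may never have more unacknowledged bytes in flight than the window the receiver last advertised.

```python
# Receiver-advertised window (rwnd) flow control, 100-byte segments.

def can_send(next_seq: int, last_ack: int, rwnd: int, length: int) -> bool:
    in_flight = next_seq - last_ack
    return in_flight + length <= rwnd

last_ack, next_seq = 1, 1
for rwnd in (400, 300, 100, 0):                      # B shrinks the window step by step
    while can_send(next_seq, last_ack, rwnd, 100):
        print(f"rwnd={rwnd}: send bytes {next_seq}-{next_seq + 99}")
        next_seq += 100
    last_ack = next_seq                              # assume what was sent gets ACKed
print("rwnd=0: sender paused until a nonzero window (or persist probe reply) arrives")
```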

Ten, TCP congestion control

The sender maintains a state variable called the congestion window, cwnd. Its size depends on how congested the network is and changes dynamically. The sender makes its send window equal to the congestion window.

The principle by which the sender controls the congestion window is: as long as the network is not congested, increase the congestion window a little so that more packets can be sent; as soon as the network becomes congested, decrease it to reduce the number of packets injected into the network.

1. Slow start and congestion avoidance

Slow start algorithm:

When a host starts sending data, injecting a large volume of bytes into the network right away may cause congestion, because the host does not yet know the network's load. A better approach is to probe first, that is, to grow the send window gradually from small to large; in other words, to increase the congestion window value gradually.

Usually, when segments are first sent, the congestion window cwnd is set to the value of one maximum segment size (MSS). Each time an acknowledgment for new data is received, the congestion window is increased by at most one MSS. Growing the sender's congestion window cwnd gradually in this way keeps the rate at which packets are injected into the network reasonable.

[Figure: slow start, cwnd doubling each transmission round]

After each transmission round, the congestion window cwnd doubles. The time taken by one transmission round is essentially one round-trip time (RTT); however, "transmission round" emphasizes that all the segments permitted by the congestion window cwnd are sent back to back and the acknowledgment for the last byte sent has been received.

Also, the "slow" in slow start does not mean that cwnd grows slowly; it means that TCP starts with cwnd = 1, so the sender transmits only one segment at first (to probe how congested the network is) and then increases cwnd progressively.

To prevent the congestion window cwnd from growing so large that it causes network congestion, a slow start threshold, the state variable ssthresh, is also maintained. It is used as follows:

  • When cwnd < ssthresh, the slow start algorithm described above is used.

  • When cwnd > ssthresh, slow start stops and the congestion avoidance algorithm is used instead.

  • When cwnd = ssthresh, either slow start or congestion avoidance may be used.

Congestion avoidance

Congestion avoidance lets the congestion window cwnd grow slowly: each time a round-trip time (RTT) passes, the sender's cwnd is increased by 1 instead of being doubled. The congestion window thus grows linearly, which is much slower than the growth under slow start.

[Figure: congestion avoidance, linear growth of cwnd]

Whether in the slow start phase or in the congestion avoidance phase, as soon as the sender judges that the network is congested (the signal being that no acknowledgment arrives), it must set the slow start threshold ssthresh to half of the current send window value (but not less than 2), reset the congestion window cwnd to 1, and run the slow start algorithm again.

The purpose is to quickly reduce the number of packets the host sends into the network, giving the congested routers enough time to clear the backlog of packets in their queues.
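The whole policy fits in a few lines. The toy model below (units are segments, one step per RTT, with an artificial timeout at round 12) shows exponential growth under slow start, linear growth after ssthresh, and the reset after congestion.

```python
def step(cwnd: int, ssthresh: int, congested: bool):
    if congested:                              # timeout: ssthresh = cwnd/2, restart
        return 1, max(cwnd // 2, 2)
    if cwnd < ssthresh:                        # slow start: double each RTT
        return min(cwnd * 2, ssthresh), ssthresh
    return cwnd + 1, ssthresh                  # congestion avoidance: +1 each RTT

cwnd, ssthresh = 1, 16
for rtt in range(1, 21):
    congested = (rtt == 12)                    # pretend a timeout happens at round 12
    cwnd, ssthresh = step(cwnd, ssthresh, congested)
    print(f"RTT {rtt:2d}: cwnd={cwnd:3d}  ssthresh={ssthresh}")
```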

The figure below illustrates the congestion control process above with concrete values. Here the send window is taken to be the same size as the congestion window.

[Figure: slow start and congestion avoidance with concrete cwnd values]

2. Fast retransmission and fast recovery

Fast retransmission

The fast retransmit algorithm first requires the receiver to send a duplicate acknowledgment immediately whenever it receives an out-of-order segment (so that the sender learns as early as possible that some segment has not arrived), instead of waiting until it has data of its own to send and piggybacking the acknowledgment.

[Figure: fast retransmit triggered by three duplicate acknowledgments]

After receiving M1 and M2, the receiver acknowledges each of them. Now suppose the receiver does not receive M3 but then receives M4.

Clearly the receiver cannot acknowledge M4, because M4 is an out-of-order segment. Under the basic rules of reliable transmission, the receiver could either do nothing or send an acknowledgment for M2 at some suitable later time.

But the fast retransmit algorithm requires the receiver to promptly send a duplicate acknowledgment for M2, so that the sender learns as soon as possible that segment M3 has not reached the receiver. The sender then transmits M5 and M6, and on receiving each of them the receiver must again send a duplicate acknowledgment for M2. The sender has now received four acknowledgments for M2 in total, the last three of which are duplicates.

The fast retransmit algorithm further stipulates that as soon as the sender receives three duplicate acknowledgments in a row, it should immediately retransmit the segment the other side has not yet received (M3 here), without waiting for M3's retransmission timer to expire.

Because the sender retransmits unacknowledged segments as early as possible, fast retransmit can increase overall network throughput by roughly 20%.

Fast recovery

The fast recovery algorithm is used together with fast retransmit. It has two main points (a small sketch follows the list):

  • When the sender receives three duplicate acknowledgments in a row, it performs the "multiplicative decrease" step and halves the slow start threshold ssthresh.

  • Unlike after a timeout, the slow start algorithm is not executed (that is, cwnd is not reset to 1); instead cwnd is set to the halved ssthresh value, and then the congestion avoidance algorithm ("additive increase") takes over, so the congestion window grows slowly and linearly.
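A toy sender-side sketch of the two points, continuing the M1...M6 example: the repeated acknowledgments for M2 signal that M3 is missing.

```python
# Three duplicate ACKs trigger an immediate retransmission; ssthresh is halved and
# cwnd restarts from the new ssthresh (fast recovery), not from 1.

def on_ack(ack_no, state):
    if ack_no == state["last_ack"]:
        state["dup_acks"] += 1
        if state["dup_acks"] == 3:                       # fast retransmit threshold
            print(f"3 duplicate ACKs for {ack_no}: retransmit the missing segment now")
            state["ssthresh"] = max(state["cwnd"] // 2, 2)
            state["cwnd"] = state["ssthresh"]            # fast recovery: skip slow start
    else:                                                # new data acknowledged
        state["last_ack"] = ack_no
        state["dup_acks"] = 0

state = {"last_ack": 0, "dup_acks": 0, "cwnd": 16, "ssthresh": 32}
for ack in (1, 2, 2, 2, 2):          # M3 is lost, so duplicate ACKs for M2 keep arriving
    on_ack(ack, state)
print(state)                         # ssthresh halved to 8; cwnd set to 8, not 1
```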

     
