Transport layer: Transmission Control Protocol (TCP)

TCP is one of the most complex protocols in the TCP/IP protocol suite. This chapter summarizes it step by step, from the basics to the more advanced topics.

Knowledge point overview

[Figure: overview of the knowledge points covered in this chapter]

One, TCP basics

1. What are the main features of TCP?

(1) Connection-oriented transport layer protocol

The application must establish a connection before using TCP, and the established connection must be released after the data is transmitted.

(2) A TCP connection has exactly two endpoints

That is, every TCP connection is point-to-point (one-to-one).

(3) TCP provides reliable delivery of data.

Data sent over TCP arrives without errors, without loss, without duplication, and in order.

(4) TCP provides full-duplex communication

1. TCP allows the application processes on both sides to send data at any time.
2. Both ends of a TCP connection have a send buffer and a receive buffer, which temporarily hold data for the two directions of communication.
3. When sending, the application hands data to TCP's send buffer, and TCP sends it out at an appropriate time.
4. When receiving, TCP places the incoming data in the receive buffer, and the application process reads from the buffer at an appropriate time.

(5) Byte-stream oriented

1. A "stream" is the sequence of bytes flowing into or out of an application process.
2. Although the application hands data to TCP in blocks of varying sizes, TCP treats the application's data as nothing more than an unstructured stream of bytes.
3. TCP does not guarantee that the blocks the receiving application reads correspond in size to the blocks the sending application wrote, but the byte stream the receiving application reads is exactly the same as the byte stream the sender wrote.

[Figure: schematic of TCP's byte-stream service]
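
The byte-stream property can be illustrated with a small sketch. It does not open a real TCP connection; the `ByteStream` class is a hypothetical in-memory stand-in for one direction of a connection, used only to show that block boundaries are not preserved while the bytes are.

```python
class ByteStream:
    """Toy in-memory stand-in for one direction of a TCP connection (hypothetical)."""

    def __init__(self):
        self._buf = bytearray()

    def write(self, block: bytes) -> None:
        # The sender's blocks are appended; their boundaries are not recorded.
        self._buf.extend(block)

    def read(self, max_bytes: int) -> bytes:
        # The receiver takes up to max_bytes, however the data was written.
        chunk = bytes(self._buf[:max_bytes])
        del self._buf[:max_bytes]
        return chunk


sent_blocks = [b"HELLO", b"TCP", b"BYTESTREAM"]
stream = ByteStream()
for block in sent_blocks:
    stream.write(block)

received_blocks = []
while True:
    chunk = stream.read(4)          # the reader picks its own block size
    if not chunk:
        break
    received_blocks.append(chunk)

print(received_blocks)
print(b"".join(received_blocks) == b"".join(sent_blocks))
```

The sender wrote three blocks and the reader got five, yet the concatenated bytes are identical, which is all that TCP promises.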

2. TCP connection

TCP is connection-oriented, and many of its features follow from this. So what exactly is a TCP connection?
The endpoint of a TCP connection is not a host, not a host's IP address, not an application process, and not a transport-layer port. The endpoint of a TCP connection is called a socket.

(1) Socket

1. Socket: RFC 793 defines a socket as a port number concatenated with an IP address.
2. Notation: write the port number after the dotted-decimal IP address, separated by a colon.
3. Example: 192.23.23.23:8080

(2) TCP connection

1. Each TCP connection is uniquely identified by its two endpoints (two sockets), one at each end of the communication.
2. The endpoint of a TCP connection is an abstract socket, namely (IP address: port number).
3. The same IP address can take part in several different TCP connections, and the same port number can also appear in several different TCP connections.
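
As a rough illustration of the points above, a connection can be modeled as a pair of sockets. The `Socket` and `Connection` types and the 10.0.0.x client addresses are made up for this sketch; only 192.23.23.23:8080 comes from the example in the text.

```python
from typing import NamedTuple


class Socket(NamedTuple):
    ip: str
    port: int

    def __str__(self) -> str:
        # Dotted-decimal IP address, a colon, then the port number.
        return f"{self.ip}:{self.port}"


class Connection(NamedTuple):
    local: Socket
    remote: Socket


server = Socket("192.23.23.23", 8080)

# Three distinct connections that all share the same server socket.
conn_a = Connection(server, Socket("10.0.0.5", 50001))
conn_b = Connection(server, Socket("10.0.0.5", 50002))   # same client IP, other port
conn_c = Connection(server, Socket("10.0.0.9", 50001))   # same client port, other IP

connections = {conn_a, conn_b, conn_c}
print(server, len(connections))
```

The set holds three entries because a connection is identified by both of its endpoints, not by either one alone.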

Two, How reliable transmission works

The segments TCP sends are handed down to the IP layer, which offers only best-effort service and does not guarantee reliable delivery. TCP therefore has to take its own measures to make transmission reliable.

Reliable transmission under ideal assumptions

1. The transmission channel introduces no errors.
2. No matter how fast the sender sends data, the receiver always has time to process what it receives.

Measures taken in practice

A real network satisfies neither of these two conditions, but reliable transmission protocols can compensate:
1. When an error occurs, have the sender retransmit the data.
2. When the receiver cannot keep up, tell the sender to slow down appropriately.

1. Stop-and-wait protocol

1. As noted earlier, one feature of TCP is full-duplex communication: each party is both a sender and a receiver.
2. To simplify the discussion, let A be the sender and B the receiver in the figures that follow.
3. The data unit transmitted at any layer can be called a packet, so we use the term "packet" here as well.

What is the stop-and-wait protocol?

1. Stop-and-wait: the sender stops after sending each packet and waits for the receiver's acknowledgment. The receiver sends an acknowledgment after it receives the packet, and the sender sends the next packet only after that acknowledgment arrives.
2. Timeout retransmission: if A receives no acknowledgment within a certain time, it assumes the packet it just sent was lost and retransmits it.

To implement timeout retransmission, A starts a timeout timer each time it sends a packet. If the acknowledgment arrives before the timer expires, the timer is cancelled.

(1) Stop-and-wait protocol: the error-free case
[Figure: stop-and-wait, error-free case]
(2) Stop-and-wait protocol: timeout retransmission (an error occurs)
[Figure: stop-and-wait, timeout retransmission]

1. How the error can arise:

1. B receives M1, detects an error in it, and discards it.
2. M1 is lost in transit.
Note: in both cases B does nothing. It does not tell A that it received a corrupted packet.

2. Points to note about timeout retransmission:

1. After sending a packet, A must keep a copy of it (needed if a timeout forces a retransmission) and may delete the copy only after receiving the corresponding acknowledgment.
2. Both data packets and acknowledgment packets must be numbered, so that it is clear which sent packets have been acknowledged and which have not.
3. The retransmission timeout should be somewhat longer than the average round-trip time of a packet.

(3) Stop-and-wait protocol: lost acknowledgment (an error occurs)
[Figure: stop-and-wait, lost acknowledgment]

What can happen:
B receives M1 and sends an acknowledgment to A, but the acknowledgment is lost in transit. When A's timeout timer expires, A retransmits M1.
When B receives this second copy of M1, it takes two actions:

1. It discards the duplicate packet and does not deliver it to the upper layer.
2. It sends an acknowledgment to A again. B must not assume the acknowledgment is unnecessary just because it already acknowledged M1 once: A retransmitted M1 precisely because it never received that acknowledgment.

(4) Stop-and-wait protocol: late acknowledgment (an error occurs)

[Figure: stop-and-wait, late acknowledgment]

What can happen:
B receives M1 from A and sends an acknowledgment, but for some reason the acknowledgment travels very slowly. A's timer for M1 expires before the acknowledgment arrives, so A retransmits M1. B receives M1 again, recognizes it as a duplicate, discards it, and sends another acknowledgment. A receives this second acknowledgment and goes on to send M2...Mn. Only then does the original, very late acknowledgment of M1 finally reach A.
Handling:

A simply discards the late acknowledgment when it arrives.

Takeaways

1. Normally A eventually receives an acknowledgment for every packet it sends. If A keeps retransmitting a packet and never receives an acknowledgment, the line is too poor to communicate over.
2. With the acknowledgment and retransmission mechanisms above, reliable communication can be achieved over an unreliable network.
3. Reliable transmission protocols of this kind are usually called ARQ (Automatic Repeat reQuest). "Automatic" means that retransmission happens automatically: the receiver does not need to ask the sender to retransmit a damaged packet.
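
The stop-and-wait cases discussed above (no error, lost packet, lost acknowledgment) can be sketched as a small deterministic simulation. This is only a toy model: there are no real timers, a "timeout" is simply the next loop iteration, and the function name and the `lost_data` / `lost_acks` parameters are assumptions made for the example. The late-acknowledgment case is not modeled.

```python
def stop_and_wait(packets, lost_data=(), lost_acks=(), max_tries=5):
    """Simulate A sending `packets` to B one at a time.

    lost_data / lost_acks contain (packet_number, attempt) pairs whose
    data packet or acknowledgment is dropped by the channel.
    """
    delivered = []      # what B hands to its upper layer
    expected = 1        # number of the next new packet B is waiting for
    log = []
    for number, payload in enumerate(packets, start=1):
        for attempt in range(1, max_tries + 1):
            log.append(f"A sends M{number} (attempt {attempt})")
            if (number, attempt) in lost_data:
                log.append(f"M{number} lost, A times out")
                continue                    # A retransmits the copy it kept
            if number == expected:
                delivered.append(payload)   # new packet: deliver upward
                expected += 1
            else:
                log.append(f"B discards duplicate M{number}")
            # B acknowledges every packet it receives, duplicates included.
            if (number, attempt) in lost_acks:
                log.append(f"ACK for M{number} lost, A times out")
                continue
            log.append(f"A gets ACK for M{number}")
            break                           # A may delete its copy and move on
        else:
            raise RuntimeError(f"M{number} never acknowledged, line unusable")
    return delivered, log


delivered, log = stop_and_wait(
    ["a", "b", "c"], lost_data={(1, 1)}, lost_acks={(2, 1)}
)
print(delivered)
print("\n".join(log))
```

Even though M1 is lost once and the acknowledgment of M2 is lost once, B delivers each payload exactly once and in order, and the duplicate M2 is discarded rather than delivered twice.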

Advantages and disadvantages of the stop-and-wait protocol

The advantage is simplicity; the disadvantage is low channel utilization.
1. Assume a direct channel between A and B: after each packet, the channel sits idle until the acknowledgment returns, so utilization is low.
2. The remedy is pipelined transmission.

1. Pipelined transmission means the sender can send several packets in a row without stopping to wait for an acknowledgment after each one. Data is then in flight on the channel almost continuously, which yields much higher channel utilization.
2. Pipelined transmission relies on the continuous ARQ protocol and the sliding window protocol.
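
To put a number on "low channel utilization", a common textbook model divides the time spent transmitting a packet (T_D) by the length of one full cycle (T_D + RTT + T_A, where T_A is the time to transmit the acknowledgment). The sketch below uses that model; the function names, the idealized pipelining formula, and the sample link parameters are assumptions for illustration, not figures from this article.

```python
def stop_and_wait_utilization(t_data, rtt, t_ack=0.0):
    """Fraction of time the sender is transmitting: T_D / (T_D + RTT + T_A)."""
    return t_data / (t_data + rtt + t_ack)


def pipelined_utilization(window, t_data, rtt, t_ack=0.0):
    """Idealized model: `window` packets are sent per acknowledgment cycle."""
    return min(1.0, window * t_data / (t_data + rtt + t_ack))


t_data = 1200 / 1_000_000    # a 1200-bit packet on a 1 Mbit/s link, in seconds
rtt = 0.020                  # 20 ms round-trip time

print(f"stop-and-wait: {stop_and_wait_utilization(t_data, rtt):.2%}")
print(f"window of 5:   {pipelined_utilization(5, t_data, rtt):.2%}")
```

With these sample values stop-and-wait keeps the channel busy under 6% of the time, while a window of 5 packets raises that to roughly 28% under the same idealized model.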

2. Continuous ARQ basics and a first look at the sliding window

The sliding window is the essence of TCP and is summarized in detail later; this section covers only the basics of the continuous ARQ protocol.

[Figure: the sender's sliding window]

As shown in part (a) of the figure:

1. The figure shows the sliding window maintained by the sender.
2. The 5 packets inside the send window can all be sent one after another without waiting for acknowledgments, which improves channel utilization.
3. The figure has a time axis t. "Forward" means the direction of increasing time, and "backward" the direction of decreasing time.
4. Packets are sent in increasing order of sequence number.

(1) Rules of the continuous ARQ protocol

1. Sender: each time an acknowledgment arrives, slide the send window forward by one packet.
2. Receiver: use cumulative acknowledgment.

1. Cumulative acknowledgment: the receiver does not have to acknowledge every packet individually. After receiving several packets, it acknowledges the last packet that arrived in sequence, which means that every packet up to and including that one was received correctly.
2. Advantage: it is easy to implement, and a lost acknowledgment does not necessarily force a retransmission.
3. Disadvantage: it cannot tell the sender about every packet the receiver has correctly received.
4. Example of the disadvantage: suppose the sender sends 5 packets and the third is lost. The receiver can only acknowledge the first two. The sender has no way of knowing what happened to the last three packets and must retransmit all three. This is called Go-back-N: the sender goes back and retransmits the N packets it has already sent. When the line quality is poor, the continuous ARQ protocol therefore performs badly.
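
The five-packet example can be written out as a short sketch of cumulative acknowledgment and the resulting Go-back-N retransmission. The function names are made up for this illustration, and packets are numbered from 1 as in the text.

```python
def cumulative_ack(received):
    """Highest n such that packets 1..n have all arrived (0 if packet 1 is missing)."""
    arrived = set(received)
    ack = 0
    while ack + 1 in arrived:
        ack += 1
    return ack


def go_back_n_retransmit(sent, received):
    """Packets the sender must resend: everything after the cumulative ACK."""
    ack = cumulative_ack(received)
    return [n for n in sent if n > ack]


sent = [1, 2, 3, 4, 5]
received = [1, 2, 4, 5]      # packet 3 was lost in transit

print(cumulative_ack(received))               # 2
print(go_back_n_retransmit(sent, received))   # [3, 4, 5]
```

Packets 4 and 5 did arrive, but the cumulative acknowledgment cannot say so, which is why they are retransmitted along with packet 3.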

Three, The header format of the TCP segment

1. Although TCP is byte-stream oriented, the data unit it transmits is the segment.
2. TCP segment = TCP header + TCP data
3. The TCP header is very important. Its first 20 bytes are fixed; after them come 4n bytes (n is an integer) of options that are added as needed. The minimum length of a TCP header is therefore 20 bytes.

[Figure: TCP segment diagram]

(1) Source port, destination port

Each occupies two bytes and holds the source port number and the destination port number, respectively. As with UDP, TCP's multiplexing and demultiplexing are implemented through ports.

(2) Sequence number

1. It occupies 4 bytes.
2. The value range is [0, 2^32-1]. After the sequence number reaches 2^32-1, the next one wraps around to 0; that is, sequence numbers use modulo 2^32 arithmetic.
3. TCP is byte-stream oriented: every byte transmitted over a TCP connection is numbered in sequence.
4. The starting sequence number of the byte stream must be set when the connection is established.
5. The value in the sequence number field is the sequence number of the first data byte carried in this segment (see the example below).
6. The sequence number is also called the segment sequence number.

[Figure: sequence number diagram]

For example, suppose a segment's sequence number field is 301 and it carries 100 bytes of data. Then the first data byte in this segment has sequence number 301 and the last has sequence number 400. The data of the next segment (if any) starts at 401, so the sequence number field of the next segment is 401. This is why the field is also called the "segment sequence number".
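
The arithmetic in this example, including the wrap-around at 2^32, fits in a few lines. The helper names below are invented for the sketch.

```python
SEQ_MODULUS = 2 ** 32   # sequence numbers live in [0, 2^32 - 1]


def last_byte_seq(seq, data_len):
    """Sequence number of the last data byte in a segment (data_len >= 1)."""
    return (seq + data_len - 1) % SEQ_MODULUS


def next_seq(seq, data_len):
    """Sequence number field of the segment that follows this one."""
    return (seq + data_len) % SEQ_MODULUS


print(last_byte_seq(301, 100))        # 400
print(next_seq(301, 100))             # 401
print(next_seq(2 ** 32 - 50, 100))    # 50, the numbering wrapped past 2^32 - 1
```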

(3) Acknowledgment number

1. It occupies 4 bytes.
2. It is the sequence number of the first data byte expected in the next segment from the other side.

(4) Data offset

1. It occupies 4 bits.
2. It indicates how far the start of the data in the TCP segment is from the start of the segment; in other words, the data offset gives the header length. The unit is 4 bytes (a 32-bit word).
3. Because the header contains an options field of variable length, the data offset field is necessary.
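
Because the data offset is a 4-bit count of 32-bit words, it can take values up to 15, and a valid header needs at least 5 words. A minimal sketch of the conversion follows; the function names are assumptions for the example.

```python
FIXED_HEADER_BYTES = 20


def header_length(data_offset):
    """Convert the 4-bit data offset (counted in 32-bit words) to bytes."""
    if not 5 <= data_offset <= 15:
        raise ValueError("data offset must be between 5 and 15")
    return data_offset * 4


def options_length(data_offset):
    """Bytes of options that follow the fixed 20-byte part of the header."""
    return header_length(data_offset) - FIXED_HEADER_BYTES


print(header_length(5), options_length(5))     # 20 0
print(header_length(15), options_length(15))   # 60 40
```

The largest value, 15, gives a 60-byte header, which leaves at most 40 bytes for options.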

(5) Urgent URG (URGent)

When URG = 1, it indicates that the urgent pointer field is valid. It tells the system that there is urgent data in this segment and it should be transmitted as soon as possible (equivalent to high-priority data), rather than in the original queuing order.

(6) Acknowledgment ACK (ACKnowledgment)

The acknowledgment number field is valid only when ACK = 1; when ACK = 0, it is invalid. TCP requires ACK to be set to 1 in every segment sent after the connection is established.

(7) Push PSH (PuSH)

When two application processes communicate interactively, the process at one end sometimes wants a response from the other end immediately after a command is typed. In this case TCP can use the push operation: the sender sets PSH = 1 and sends the segment right away, and the receiving TCP delivers it to the application as soon as possible instead of waiting for its buffer to fill.

(8) Reset RST (ReSeT)

When RST = 1, it indicates that there is a serious error in the TCP connection (for example, due to a host crash or other reasons), the connection must be released, and then the transport connection is re-established. RST set to 1 is also used to reject an illegal segment or refuse to open a connection. RST can also be called a rebuild bit or a reset bit.

(9) Synchronize SYN (SYNchronization)

Used to synchronize the sequence number when the connection is established. When SYN = 1 and ACK = 0, it indicates that this is a connection request segment. If the other party agrees to establish a connection, it should make SYN = 1 and ACK =1 in the response segment. Therefore, setting SYN to 1 means that this is a connection request or connection acceptance message.

(10) Finish FIN (FINish)

Used to release a connection. When FIN = 1, it indicates that the data of the sender of this message segment has been sent, and the transport connection is required to be released.
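
The six control bits described in (5) through (10) sit next to each other in the header, and the SYN/ACK combinations can be decoded mechanically. In the sketch below the bit values follow the standard header layout, while the function names and the category labels are chosen for illustration only.

```python
# Bit masks of the six control flags within the flags byte of the header.
FLAG_BITS = {
    "URG": 0x20,
    "ACK": 0x10,
    "PSH": 0x08,
    "RST": 0x04,
    "SYN": 0x02,
    "FIN": 0x01,
}


def decode_flags(flags):
    """Names of the flags that are set, in header order (URG first)."""
    return [name for name, bit in FLAG_BITS.items() if flags & bit]


def classify_segment(flags):
    """Rough classification based on the rules in the text."""
    names = set(decode_flags(flags))
    if "SYN" in names:
        return "connection accepted" if "ACK" in names else "connection request"
    if "RST" in names:
        return "reset"
    if "FIN" in names:
        return "release request"
    return "ordinary segment"


print(decode_flags(0x12), classify_segment(0x12))
print(decode_flags(0x02), classify_segment(0x02))
```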

(11) Window

1. It occupies 2 bytes; the value range is [0, 2^16-1].
2. The window is the receive window of the party sending this segment (not its own send window).
3. The window value tells the other side how much data, counting from the acknowledgment number in this segment's header, the sender of this segment currently allows it to send.

For example, suppose the acknowledgment number is 701 and the window field is 1000. This means that, starting from 701, the party sending this segment still has receive buffer space for 1000 bytes of data (byte sequence numbers 701 to 1700).

4. The window field states how much data the other side is currently allowed to send. The window value changes dynamically.
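
The window example reduces to one line of arithmetic on the acknowledgment number and the window field. The helper below is a hypothetical name used only to make that calculation explicit.

```python
SEQ_MODULUS = 2 ** 32


def allowed_range(ack_number, window):
    """First and last byte sequence numbers the peer may send, or None if window is 0."""
    if window == 0:
        return None                     # the receiver has no buffer space right now
    last = (ack_number + window - 1) % SEQ_MODULUS
    return ack_number, last


print(allowed_range(701, 1000))   # (701, 1700)
print(allowed_range(701, 0))      # None
```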

(12) Checksum

1. It occupies 2 bytes. The checksum covers both the header and the data. As with UDP, a 12-byte pseudo-header is prepended to the segment when the checksum is computed.
2. The pseudo-header has the same format as UDP's, except that the 17 in the fourth field becomes 6 (TCP's protocol number) and the UDP length in the fifth field becomes the TCP length.
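
A minimal sketch of this calculation for IPv4 follows. It assumes the standard 16-bit one's-complement Internet checksum and the 12-byte pseudo-header layout (source address, destination address, a zero byte, the protocol number, the TCP length); the function names, addresses, and sample segment are made up for the example.

```python
import socket
import struct


def ones_complement_sum(data):
    """16-bit one's-complement sum of `data`, padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total


def pseudo_header(src_ip, dst_ip, tcp_length):
    # source IP, destination IP, zero byte, protocol number 6, TCP length
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, 6, tcp_length)


def tcp_checksum(src_ip, dst_ip, segment):
    """Checksum of a segment whose checksum field (bytes 16-17) is zero."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ~ones_complement_sum(data) & 0xFFFF


# Sample segment: 20-byte header with a zeroed checksum field, plus 5 data bytes.
header = struct.pack("!HHIIBBHHH", 50001, 8080, 301, 0, 5 << 4, 0x18, 1000, 0, 0)
segment = header + b"hello"
checksum = tcp_checksum("10.0.0.5", "192.23.23.23", segment)

# Receiver-side check: with the checksum filled in, the sum is all ones.
filled = segment[:16] + struct.pack("!H", checksum) + segment[18:]
total = ones_complement_sum(pseudo_header("10.0.0.5", "192.23.23.23", len(filled)) + filled)
print(hex(checksum), hex(total))
```

The receiver repeats the same sum over the pseudo-header and the received segment; a result of 0xFFFF means no error was detected.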

(13) Urgent pointer

1. It occupies 2 bytes. The urgent pointer is meaningful only when URG = 1. It gives the number of bytes of urgent data in this segment (ordinary data follows the urgent data).

(14) Option

1. The length is variable, up to 40 bytes. When no options are used, the TCP header is 20 bytes long.
2. TCP originally defined only one option, the maximum segment size (MSS) [RFC 879]. Note what the term means: MSS is the maximum length of the data field in a TCP segment. The data field plus the TCP header makes up the whole segment, so MSS is not the maximum length of the whole TCP segment but "the length of the TCP segment minus the length of the TCP header".
3. As the Internet developed, more options were added, such as the window scale option and the timestamp option [RFC 1323], and later the selective acknowledgment (SACK) option [RFC 2018]. All of these options are carried in the "Options" field shown in Figure 5-14.
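
Putting the fields of this chapter together, the fixed 20-byte header can be unpacked with Python's `struct` module, and the options area scanned for the MSS option (kind 2, length 4). This is a sketch under the classic header layout; the `TCPHeader` type, the function names, and the sample SYN segment are invented for illustration.

```python
import struct
from typing import NamedTuple


class TCPHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int    # in bytes, derived from the data offset
    flags: int         # the six control bits
    window: int
    checksum: int
    urgent_ptr: int
    options: bytes


def parse_tcp_header(segment):
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    (src, dst, seq, ack, offset_byte, flags,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4       # data offset is the high 4 bits
    if header_len < 20 or header_len > len(segment):
        raise ValueError("invalid data offset")
    return TCPHeader(src, dst, seq, ack, header_len, flags & 0x3F,
                     window, checksum, urgent, segment[20:header_len])


def find_mss(options):
    """Return the MSS value if an MSS option is present, else None."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                 # end of option list
            break
        if kind == 1:                 # no-operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):
            break
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                     # malformed option, stop parsing
        if kind == 2 and length == 4:
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


# Sample SYN segment: data offset 6 (24-byte header) with one MSS option.
syn = struct.pack("!HHIIBBHHH", 50001, 8080, 1000, 0, 6 << 4, 0x02, 65535, 0, 0)
syn += struct.pack("!BBH", 2, 4, 1460)

parsed = parse_tcp_header(syn)
print(parsed.header_len, parsed.flags, find_mss(parsed.options))
```

The parser does not validate the checksum or interpret other options; it only shows where each field of the header sits.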

Four, Implementation of reliable transmission in TCP

Five, TCP flow control

Six, TCP congestion control

Seven, TCP connection management

To be continued!

Origin blog.csdn.net/qq_38350635/article/details/104001439