(Transport layer) TCP protocol

Reprinted from: http://www.cnblogs.com/kzang/articles/2582957.html

Contents

Header format
Data unit
Features
Notes
Automatic Repeat Request (ARQ)
Implementation
Send buffer
Receive buffer
Sliding window
Lost acknowledgment and late acknowledgment
Timeout retransmission time selection
TCP's Finite State Machine






Header format


Description of each field:

  • Source port and destination port : each occupies 2 bytes. The port is the service interface between the transport layer and the application layer. The multiplexing and demultiplexing functions of the transport layer can only be realized through the port.
  • Sequence number : occupies 4 bytes. Every byte in the data stream carried over a TCP connection is assigned a sequence number. The value of the sequence number field is the sequence number of the first byte of the data sent in this segment
  • Acknowledgment number : occupies 4 bytes. It is the sequence number of the first data byte expected in the next segment from the other party
  • Data offset/header length : occupies 4 bits. It indicates how far the start of the data in the TCP segment is from the start of the TCP segment. The unit of "data offset" is the 32-bit word (that is, it is counted in units of 4 bytes)
  • Reserved : occupy 6 bits, reserved for future use, but should be set to 0 at present
  • Urgent URG : When URG = 1, the urgent pointer field is valid. It tells the system that this segment contains urgent data, which should be transmitted as soon as possible (equivalent to high-priority data)
  • Acknowledgment ACK : The acknowledgment number field is valid only when ACK = 1. When ACK = 0, the acknowledgment number is invalid
  • Push PSH (PuSH) : When the receiving TCP receives a segment with PSH = 1, it delivers the data to the receiving application process as soon as possible, instead of waiting until the whole buffer is full before delivering it upwards
  • Reset RST (ReSeT) : When RST = 1, a serious error has occurred in the TCP connection (for example, because of a host crash); the connection must be released and the transport connection then re-established
  • Synchronize SYN : SYN = 1 indicates that this is a connection request or connection acceptance segment
  • Finish FIN : used to release a connection. FIN = 1 indicates that the sender of this segment has finished sending its data and requests that the transport connection be released
  • Checksum : occupies 2 bytes. The scope of the checksum field check includes the header and the data. When calculating the checksum, a 12-byte pseudo-header should be added in front of the TCP segment
  • Urgent pointer : occupies 16 bits. It indicates how many bytes of urgent data this segment contains (the urgent data is placed at the very front of the data in this segment)
  • Options : variable length. TCP initially specified only one option, the maximum segment size MSS. MSS tells the other TCP: "The maximum length of the data field of a segment that my buffer can receive is MSS bytes." [MSS (Maximum Segment Size) is the maximum length of the data field in a TCP segment. The data field plus the TCP header makes up the entire TCP segment]
  • Padding : makes the entire header length an integer multiple of 4 bytes
  • Other options :
    • Window scale : occupies 3 bytes, one of which holds the shift value S. The number of bits of the window value is thereby increased from the 16 bits in the TCP header to (16 + S), which is equivalent to shifting the window value in the header left by S bits to obtain the actual window size
    • Timestamp : occupies 10 bytes, of which the most important fields are the timestamp value field (4 bytes) and the timestamp echo reply field (4 bytes)
    • Selective acknowledgment (SACK) : Suppose the receiver has received two blocks of bytes that are not contiguous with the preceding byte stream. If the sequence numbers of these bytes all lie within the receive window, the receiver accepts the data first, but it must report this accurately to the sender so that the sender does not send the already received data again
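
To make the field layout above concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of a header with the standard `struct` module. The function name `parse_tcp_header` and the dictionary keys are illustrative choices, not part of any library. The sketch assumes the classic layout with 6 reserved bits followed by the six flag bits, and it also reads the 16-bit window field that sits between the flags and the checksum in the standard header.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Parse the fixed part of a TCP header (hypothetical helper, classic layout)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # ! = network byte order; H = 2 bytes, I = 4 bytes (20 bytes in total)
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4   # data offset is counted in 32-bit words
    flags = offset_flags & 0x3F             # low 6 bits: URG ACK PSH RST SYN FIN
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "window": window,
        "checksum": checksum, "urgent_pointer": urgent,
        "URG": bool(flags & 0x20), "ACK": bool(flags & 0x10),
        "PSH": bool(flags & 0x08), "RST": bool(flags & 0x04),
        "SYN": bool(flags & 0x02), "FIN": bool(flags & 0x01),
        "options": segment[20:header_len],   # options plus padding, if any
        "payload": segment[header_len:],
    }

# A hand-built SYN segment: data offset 6 words (24 bytes), one MSS option (1460).
example = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (6 << 12) | 0x02, 65535, 0, 0)
example += bytes([2, 4, 0x05, 0xB4])
```

Calling `parse_tcp_header(example)` reports a 24-byte header with only SYN set and the four option bytes in `options`.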

 


Data unit

The protocol data unit transmitted by TCP is the TCP segment.

 


Features

TCP is a connection-oriented transport layer protocol.
Each TCP connection can only have two endpoints, and each TCP connection can only be point-to-point (one-to-one).
TCP provides reliable delivery services.
TCP provides full-duplex communication.
TCP is byte-stream oriented.

 


Notes

TCP does not care how long a data block the application process hands to the TCP buffer at a time.
TCP decides how many bytes a segment should contain according to the window value given by the other party and the current degree of network congestion (in UDP, by contrast, the length of a message is given by the application process).
TCP can split a data block that is too long into shorter pieces before transmitting it. TCP can also wait until enough bytes have accumulated before forming a segment and sending it.
Each TCP connection has two endpoints.
The endpoint of a TCP connection is not the host, not the IP address of the host, not the application process, and not the transport-layer protocol port. The endpoint of a TCP connection is called a socket, which is an IP address combined with a port number.
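
As a rough illustration of the byte-stream view described above, the following Python sketch cuts whatever the application handed over into segments. The function name `split_into_segments` is hypothetical, and the single `window` argument is an assumption standing in for whatever the peer's window and the congestion state currently allow.

```python
def split_into_segments(data: bytes, mss: int, window: int) -> list[bytes]:
    """Cut at most `window` bytes of `data` into pieces of at most `mss` bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    sendable = data[:max(window, 0)]          # bytes beyond the window must wait
    return [sendable[i:i + mss] for i in range(0, len(sendable), mss)]
```

For example, 2500 buffered bytes with an MSS of 1000 and a window of 2200 yield segments of 1000, 1000 and 200 bytes; the remaining 300 bytes stay in the buffer.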

 


Automatic Repeat Request ARQ

Definition:

Reliable transmission protocols of this kind are often called ARQ (Automatic Repeat reQuest) protocols

Cumulative confirmation:

  • Definition: The receiver generally uses cumulative acknowledgment. That is, it does not need to acknowledge the received packets one by one; instead it sends an acknowledgment for the last packet that arrived in sequence, which means that all packets up to and including this one have been received correctly
  • Advantages: easy to implement; no retransmission is needed even if an acknowledgment is lost, since a later cumulative acknowledgment also covers the earlier packets
  • Disadvantage: It cannot tell the sender about all the packets that the receiver has received correctly

Go-back-N:

Suppose the sender has sent the first 5 packets and the third one is lost. The receiver can then only acknowledge the first two packets. The sender cannot know what happened to the following three packets and has to retransmit all three of them
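
The five-packet example can be replayed with a small Python sketch. The helper names `cumulative_ack` and `go_back_n_retransmit` are made up for illustration, and packets are numbered from 1 as in the text.

```python
def cumulative_ack(received: set[int]) -> int:
    """Highest n such that packets 1..n have all arrived (0 if packet 1 is missing)."""
    n = 0
    while n + 1 in received:
        n += 1
    return n

def go_back_n_retransmit(sent: list[int], lost: set[int]) -> tuple[int, list[int]]:
    """Return the cumulative acknowledgment and the packets Go-back-N sends again."""
    received = set(sent) - lost
    acked = cumulative_ack(received)
    # Everything after the last in-order packet is resent, even if it arrived.
    return acked, [p for p in sent if p > acked]
```

With `sent = [1, 2, 3, 4, 5]` and `lost = {3}`, the receiver acknowledges packet 2 and the sender resends packets 3, 4 and 5, although 4 and 5 were delivered. That wasted work is the price of the simple cumulative scheme.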

 


Implementation

Notes:

  • Each end of a TCP connection must maintain two windows: a send window and a receive window
  • TCP's reliable transmission mechanism is controlled by byte sequence numbers. All TCP acknowledgments are based on sequence numbers, not on segments
  • The four windows at the two ends of a TCP connection change dynamically all the time
  • The round-trip time RTT of a TCP connection is not fixed either. A specific algorithm is needed to estimate a reasonable retransmission time


 


Send buffer

The send buffer is used to temporarily store:

  • Data that the sending application has passed to the sender's TCP and that is waiting to be sent
  • Data that TCP has sent but that has not yet been acknowledged


 

 


Receive buffer

The receive buffer is used to temporarily store:

  • Data that arrived in order but has not yet been read by the receiving application
  • Data that arrived out of order


 


Sliding window


Features:

  • The sliding window is measured in bytes
  • A's send window is not always as large as B's receive window (because of a certain time lag)

Requirements:

  • The TCP standard does not specify how to handle data that arrives out of sequence. Usually it is held temporarily in the receive window and delivered to the upper-layer application process in order once the missing bytes of the byte stream have arrived
  • TCP requires the receiver to support cumulative acknowledgment, which reduces transmission overhead
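
The usual receiver behaviour described above (hold out-of-order bytes, deliver in order, acknowledge cumulatively) might be sketched in Python as follows. `ReceiveBuffer` is a hypothetical class. It ignores the receive window limit and assumes that buffered out-of-order segments do not partially overlap each other, which a real implementation would have to handle.

```python
class ReceiveBuffer:
    """Toy byte-oriented receive buffer with cumulative acknowledgment."""

    def __init__(self, initial_seq: int):
        self.next_expected = initial_seq   # sequence number of the next in-order byte
        self.out_of_order = {}             # starting sequence number -> held bytes
        self.delivered = bytearray()       # bytes handed to the application, in order

    def receive(self, seq: int, data: bytes) -> int:
        """Accept one segment and return the acknowledgment number to send back."""
        if seq < self.next_expected:       # drop the part that was already delivered
            data = data[self.next_expected - seq:]
            seq = self.next_expected
        if data:
            self.out_of_order.setdefault(seq, data)
        # Deliver every chunk that now continues the in-order byte stream.
        while self.next_expected in self.out_of_order:
            chunk = self.out_of_order.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        return self.next_expected          # cumulative: all earlier bytes have arrived
```

If bytes 100-101 arrive, then 104-105, then 102-103, the acknowledgment numbers returned are 102, 102 and 106: the second one repeats because of the gap, and the third jumps past the data that was being held.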


 


Lost acknowledgment and late acknowledgment

 


Timeout retransmission time selection

Implementation:

Every time TCP sends a segment, it sets a timer for the segment. As long as the retransmission time set by the timer expires but no acknowledgement has been received, the segment will be retransmitted

Weighted Average Round Trip Time:

Method:

TCP maintains a weighted average round-trip time RTTS of the RTT (also called the smoothed round-trip time). When the first RTT sample is measured, RTTS is set to that sample value. Each time a new RTT sample is measured after that, RTTS is recalculated as follows:

Formula:

New RTTS = (1 − α) × (old RTTS) + α × (new RTT sample)

Explanation:

In the formula, 0 ≤ α < 1. If α is very close to zero, the RTTS value is updated slowly. If α is chosen close to 1, the RTTS value is updated quickly.
The α value recommended by RFC 2988 is 1/8, that is, 0.125

Timeout retransmission time RTO:

The RTO should be slightly larger than the weighted average round trip time RTTS derived above.
RFC 2988 recommends calculating the RTO using the following formula:

RTO = RTTS + 4 × RTTD

RTTD is the weighted average of the deviation of the RTT.
RFC 2988 recommends calculating RTTD as follows. For the first measurement, RTTD is set to half of the measured RTT sample value. In subsequent measurements, the weighted average RTTD is calculated with the following formula:

New RTTD = (1 − β) × (old RTTD) + β × |RTTS − new RTT sample|

β is a coefficient less than 1, and its recommended value is 1/4, that is, 0.25.
When calculating the average round-trip time RTTS, the round-trip time sample of any segment that has been retransmitted is not used.
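
The three formulas above fit into a few lines of Python. This is a sketch rather than a faithful RFC implementation: the class name `RttEstimator` is hypothetical, RTTD is updated before RTTS so that the deviation uses the old RTTS (the order the RFC prescribes), and the RFC's lower bound on the RTO is left out.

```python
class RttEstimator:
    """Smoothed RTT (RTTS), RTT deviation (RTTD) and the resulting RTO."""

    ALPHA = 1 / 8   # recommended weight of a new sample in RTTS
    BETA = 1 / 4    # recommended weight of a new deviation in RTTD

    def __init__(self):
        self.rtts = None
        self.rttd = None

    def update(self, sample: float, retransmitted: bool = False) -> None:
        if retransmitted:
            return                      # samples of retransmitted segments are not used
        if self.rtts is None:           # first measurement
            self.rtts = sample
            self.rttd = sample / 2
        else:
            # Deviation first, so |RTTS - sample| uses the old RTTS.
            self.rttd = (1 - self.BETA) * self.rttd + self.BETA * abs(self.rtts - sample)
            self.rtts = (1 - self.ALPHA) * self.rtts + self.ALPHA * sample

    @property
    def rto(self) -> float:
        if self.rtts is None:
            raise ValueError("no RTT sample measured yet")
        return self.rtts + 4 * self.rttd
```

With a first sample of 100 ms, RTTS = 100, RTTD = 50 and RTO = 300. A second sample of 200 ms gives RTTD = 0.75 × 50 + 0.25 × 100 = 62.5, RTTS = 0.875 × 100 + 0.125 × 200 = 112.5, and RTO = 112.5 + 4 × 62.5 = 362.5.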

Modified Karn Algorithm:

Each time a segment is retransmitted, the RTO is increased a little:

new RTO = γ×(old RTO)

The typical value of the coefficient γ is 2.
Once segments are no longer being retransmitted, the average round-trip time RTTS and the timeout retransmission time RTO are again updated from the measured round-trip time of the segments.
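
The back-off rule can be written as a one-line loop. In this Python sketch the function name `backed_off_rto` and the upper bound `max_rto` are assumptions added for illustration; the text above only specifies the multiplication by γ.

```python
def backed_off_rto(rto: float, retransmissions: int,
                   gamma: float = 2.0, max_rto: float = 60.0) -> float:
    """Apply new RTO = gamma * old RTO once per retransmission (capped at max_rto)."""
    for _ in range(retransmissions):
        rto = min(rto * gamma, max_rto)   # the cap is a hypothetical safety limit
    return rto
```

Starting from an RTO of 1 second, three successive retransmissions of the same segment give timeouts of 2, 4 and 8 seconds.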

Persistence timer

  • TCP has a persistence timer for each connection
  • As soon as one side of the TCP connection receives a zero-window notification from the other side, it starts the persistence timer
  • When the time set by the persistence timer expires, it sends a zero-window probe segment (carrying only 1 byte of data), and the other party reports its current window value when acknowledging the probe segment
  • If the window is still zero, the party that receives this acknowledgment resets the persistence timer
  • If the window is not zero, the deadlock is broken
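
The probing loop in the list above can be mimicked without real timers. In this Python sketch, `probe_until_window_opens` is a hypothetical helper and `reported_windows` is an assumed input: the window value carried by the acknowledgment of each successive probe.

```python
def probe_until_window_opens(reported_windows):
    """Return (number of probes sent, window value that ended the probing)."""
    probes = 0
    for window in reported_windows:
        probes += 1                 # timer expired: send a 1-byte zero-window probe
        if window > 0:
            return probes, window   # non-zero window: the deadlock is broken
        # window still zero: reset the timer and wait for it to expire again
    return probes, 0                # the window never opened in this trace
```

For the trace `[0, 0, 4096]` the sender needs three probes before it learns that 4096 bytes may be sent.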

 


When segments are sent

TCP can use different mechanisms to control when a segment is sent:

  • TCP maintains a variable equal to the maximum segment size MSS. As soon as the data stored in the buffer reaches MSS bytes, it is assembled into a TCP segment and sent
  • The sender's application process explicitly requests that a segment be sent, that is, it uses the push operation supported by TCP
  • A timer at the sender expires; the data currently in the buffer is then loaded into a segment (whose length cannot exceed MSS) and sent
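
The three triggers described above (a full MSS in the buffer, a push request, an expired timer) can be combined into a single decision function. This is a simplified Python sketch with a hypothetical name, `bytes_to_send`; it ignores window limits entirely.

```python
def bytes_to_send(buffered: int, mss: int,
                  push_requested: bool = False, timer_expired: bool = False) -> int:
    """Return how many buffered bytes go into the next segment (0 means keep waiting)."""
    if buffered >= mss:
        return mss                         # a full segment is ready
    if buffered > 0 and (push_requested or timer_expired):
        return buffered                    # send what is there; it is below MSS here
    return 0
```

With an MSS of 1460, a buffer holding 300 bytes waits (result 0) until either a push is requested or the timer expires, at which point all 300 bytes are sent.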

 


Transport connection

Three phases:

  • Connection establishment:

    • Steps:
      • A's TCP sends a connection request segment to B with the synchronization bit SYN = 1 in its header and a chosen initial sequence number seq = x
      • After B's TCP receives the connection request segment, it sends back an acknowledgment if it agrees (in the acknowledgment segment B sets SYN = 1 and ACK = 1, the acknowledgment number is ack = x + 1, and the sequence number it chooses for itself is seq = y)
      • After A receives this segment, it sends an acknowledgment to B with ACK = 1 and acknowledgment number ack = y + 1 (A's TCP notifies its upper-layer application process that the connection has been established; when B's TCP receives the acknowledgment from host A, it also notifies its upper-layer application process that the TCP connection has been established)
  • Data transmission
  • Connection release:

    • Steps:
      • After the data transmission is over, either party can release the connection. Suppose A's application process first tells its TCP to send a connection release segment, stops sending data, and actively closes the TCP connection (A sets FIN = 1 in the header of the connection release segment, with sequence number seq = u, and waits for B's acknowledgment)
      • B sends an acknowledgment with acknowledgment number ack = u + 1 and its own sequence number seq = v (the TCP server process notifies the higher-layer application process. The connection in the direction from A to B is now released and the TCP connection is in the half-closed state. If B still sends data, A must still receive it)
      • If B has no more data to send to A, its application process tells TCP to release the connection (B's connection release segment has FIN = 1 and sequence number seq = w)
      • After A receives the connection release segment, it must send an acknowledgment. In the acknowledgment segment, ACK = 1, the acknowledgment number is ack = w + 1, and its own sequence number is seq = u + 1
    • Note:

The TCP connection is released only after A has waited for a time of 2MSL. The 2MSL wait serves two purposes. First, it ensures that the last ACK segment sent by A can reach B. Second, it prevents an "invalid connection request segment" from appearing in a later connection: once A has sent the last ACK segment and a further 2MSL has elapsed, all segments generated during the lifetime of this connection have disappeared from the network, so no old connection request segment from it can show up in the next new connection.

    • Processing when missing confirmation is found:

Three problems to solve during connection establishment:

  • Each party must be able to make sure that the other exists
  • Both parties must be able to negotiate some parameters (such as maximum segment size, maximum window size, quality of service, etc.)
  • Transport entity resources (such as buffer size and entries in the connection table) must be able to be allocated
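
The sequence and acknowledgment numbers used in the establishment and release steps earlier in this section can be tabulated with a short Python sketch. The function names and dictionary keys are illustrative. The value seq = x + 1 in the third handshake segment is not stated in the steps and is filled in from standard TCP behaviour (the SYN consumes one sequence number), and all arithmetic wraps at 2^32 because the fields are 4 bytes wide.

```python
MOD = 2 ** 32   # sequence and acknowledgment numbers are 32-bit fields

def three_way_handshake(x: int, y: int) -> list[dict]:
    """Segments exchanged when A (initial seq x) connects to B (initial seq y)."""
    return [
        {"from": "A", "SYN": 1, "ACK": 0, "seq": x, "ack": None},
        {"from": "B", "SYN": 1, "ACK": 1, "seq": y, "ack": (x + 1) % MOD},
        {"from": "A", "SYN": 0, "ACK": 1, "seq": (x + 1) % MOD, "ack": (y + 1) % MOD},
    ]

def connection_release(u: int, v: int, w: int) -> list[dict]:
    """Segments exchanged when A closes first; u, v, w are the seq values in the steps."""
    return [
        {"from": "A", "FIN": 1, "seq": u},
        {"from": "B", "FIN": 0, "ACK": 1, "seq": v, "ack": (u + 1) % MOD},
        {"from": "B", "FIN": 1, "ACK": 1, "seq": w, "ack": (u + 1) % MOD},
        {"from": "A", "FIN": 0, "ACK": 1, "seq": (u + 1) % MOD, "ack": (w + 1) % MOD},
    ]
```

For instance, `three_way_handshake(100, 300)` produces acknowledgment numbers 101 and 301, and `connection_release(10, 20, 30)` ends with A sending seq = 11, ack = 31 before it starts the 2MSL wait.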

 


Send a TCP request to the client

 


Concepts related to congestion handling

Congestion window:

meaning:

The size of the congestion window depends on the degree of network congestion and changes dynamically. The sender makes its own sending window equal to the congestion window. If the receiver's receiving capability is considered, the sending window may also be smaller than the congestion window.

The principle of the sender's control of the congestion window:

As long as the network is not congested, the congestion window is increased to send more packets. But as long as the network is congested, the congestion window is reduced to reduce the number of packets injected into the network.

Multiplicative decrease:

Whether in the slow start phase or the congestion avoidance phase, as soon as a timeout occurs (that is, network congestion appears), the slow start threshold ssthresh is set to the current congestion window value multiplied by 0.5

Additive increase:

Once the congestion avoidance algorithm is in use, after acknowledgments for all the segments have been received (that is, after one round-trip time), the congestion window cwnd is increased by one MSS, so that the congestion window grows slowly and the network does not become congested prematurely

Fast retransmit:

The fast retransmit algorithm first requires the receiver to send a duplicate acknowledgment immediately each time it receives an out-of-sequence segment. This lets the sender learn early that a segment has not reached the receiver. As soon as the sender receives three duplicate acknowledgments in a row, it should immediately retransmit the segment that the other party has not yet received

Fast recovery:

When the sender receives three duplicate acknowledgments in a row, it performs the "multiplicative decrease" algorithm and halves the slow start threshold ssthresh. However, it does not then execute the slow start algorithm; instead it sets cwnd to the halved ssthresh and continues with congestion avoidance
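
Detecting the third duplicate acknowledgment and the subsequent window adjustment can be sketched in Python as below. Both function names are hypothetical. Setting cwnd to the halved ssthresh follows the common textbook description of fast recovery and is an assumption here, as is the lower bound of 2 on ssthresh; window sizes are counted in MSS units.

```python
def fast_retransmit_point(acks: list[int], threshold: int = 3):
    """Return the acknowledgment number that triggers fast retransmit, or None."""
    last, duplicates = None, 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == threshold:
                return ack          # the segment starting at this number is resent
        else:
            last, duplicates = ack, 0
    return None

def fast_recovery(cwnd: int) -> tuple[int, int]:
    """Return (new ssthresh, new cwnd) after three duplicate acknowledgments."""
    ssthresh = max(cwnd // 2, 2)    # multiplicative decrease
    return ssthresh, ssthresh       # skip slow start: cwnd restarts at ssthresh
```

The stream `[101, 201, 201, 201, 201]` contains one original acknowledgment for 201 followed by three duplicates, so the segment starting at byte 201 is retransmitted at once, and a cwnd of 16 becomes ssthresh = cwnd = 8.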

Upper limit of the send window:

The upper limit of the sender's send window should be the smaller of the receiver window rwnd and the congestion window cwnd, that is, it is determined by the following formula:
Upper limit of the send window = Min [rwnd, cwnd]

    • When rwnd < cwnd, the receiver's receiving capability limits the maximum value of the send window
    • When cwnd < rwnd, network congestion limits the maximum value of the send window

Congestion avoidance specific implementation

Slow start algorithm:

  • When a host starts sending, the congestion window can first be set to cwnd = 1, that is, to the size of one maximum segment MSS
  • Each time an acknowledgment of a new segment is received, the congestion window is increased by 1, that is, by one MSS
  • With the slow start algorithm, the congestion window cwnd doubles after each transmission round (one round-trip time RTT)

Congestion Avoidance Algorithm:

The congestion window cwnd increases slowly, that is, the sender's congestion window cwnd is increased by 1 every time a round-trip time RTT passes, so that the congestion window cwnd grows slowly according to a linear law.

Usage of slow start threshold ssthresh:

  • When cwnd < ssthresh, use slow start algorithm
  • When cwnd > ssthresh, stop using slow start algorithm and use congestion avoidance algorithm instead
  • When cwnd = ssthresh, either the slow start algorithm or the congestion avoidance algorithm can be used

When the network is congested (based on not receiving acknowledgments on time):

  • It is necessary to set the slow start threshold ssthresh to half the sender window value when congestion occurs (but not less than 2)
  • Then reset the congestion window cwnd to 1. Execute the slow start algorithm
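
The interplay of slow start, congestion avoidance and the timeout reaction can be traced round by round with a small Python simulation. The function name `simulate_cwnd` and its parameters are illustrative, windows are counted in MSS units, and capping slow-start growth at ssthresh is a simplifying assumption.

```python
def simulate_cwnd(rounds, ssthresh, timeout_rounds=(), rwnd=10**9):
    """Return the send-window limit (in MSS units) at the start of each round."""
    cwnd = 1                                  # slow start begins with one MSS
    history = []
    for rnd in range(1, rounds + 1):
        window = min(rwnd, cwnd)              # upper limit = Min[rwnd, cwnd]
        history.append(window)
        if rnd in timeout_rounds:             # congestion detected by a timeout
            ssthresh = max(window // 2, 2)    # half the send window, at least 2
            cwnd = 1                          # fall back to slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: double each round
        else:
            cwnd += 1                         # congestion avoidance: +1 each round
    return history
```

With `simulate_cwnd(10, ssthresh=8, timeout_rounds={6})` the window grows 1, 2, 4, 8 in slow start, then 9, 10 linearly. The timeout in round 6 sets ssthresh to 5 and cwnd back to 1, after which the window runs 1, 2, 4, 5.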

 


TCP's Finite State Machine

Explanation:

  • Each box in the TCP finite state machine diagram is a state that TCP can be in
  • The uppercase English string in each box is the TCP connection state name used by the TCP standard. Arrows between states indicate possible state transitions
  • The text next to an arrow indicates what causes the transition, or what action takes place after the transition
  • There are three kinds of arrows in the figure
    • Thick solid arrows indicate the normal transitions of the client process
    • Thick dashed arrows indicate the normal transitions of the server process
    • Thin arrows indicate abnormal transitions
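
Since the diagram itself is not reproduced here, the normal client and server paths can be written down as a transition table. This Python sketch uses the standard state names with underscores; the event labels and the helper `run` are made up for illustration, and the abnormal transitions are omitted.

```python
TRANSITIONS = {
    # normal client path (active open, active close)
    ("CLOSED", "active open, send SYN"): "SYN_SENT",
    ("SYN_SENT", "recv SYN+ACK, send ACK"): "ESTABLISHED",
    ("ESTABLISHED", "close, send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # normal server path (passive open, passive close)
    ("CLOSED", "passive open"): "LISTEN",
    ("LISTEN", "recv SYN, send SYN+ACK"): "SYN_RCVD",
    ("SYN_RCVD", "recv ACK"): "ESTABLISHED",
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "close, send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def run(state: str, events: list[str]) -> str:
    """Follow the events from `state` and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state} on {event!r}")
        state = TRANSITIONS[key]
    return state
```

Feeding the six client events in order takes a connection from CLOSED through ESTABLISHED and TIME_WAIT back to CLOSED, which mirrors the handshake and release steps described earlier.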

 
 