Transport layer TCP and UDP protocols

Table of contents

transport layer

transport layer function

Services provided by the transport layer

The two protocols of the transport layer

TCP protocol and UDP protocol

port

Port classification

The relationship between IP address and port

UDP protocol 

Foreword:

UDP packet format

pseudo-header for checksum

Pseudo header content

TCP protocol

TCP packet format

Understanding of TCP protocol data segment

TCP pseudo-header

Pseudo header content

flags

Understand the sequence number and confirmation number of TCP

Several points of TCP

TCP reliable transmission

Stop waiting - ARQ (automatic repeat-request) automatic repeat request

Continuous ARQ protocol + sliding window protocol

Specific case

SACK selective confirmation

Transport Layer Data Segmentation

flow control

principle:

Specific case

special case of flow control

congestion control

foreword

congestion control method

basic knowledge

Understanding of congestion window, receiving window, and sending window

Congestion Control - Slow Start

congestion avoidance

Congestion Control - Fast Retransmission

fast recovery

TCP connection management

TCP establishes a connection three-way handshake

The reason why TCP establishes a connection three-way handshake

Connection process (take http get request as an example - the data part is 4 bytes)

TCP releases the connection and waves four times

Why is it necessary to wave four times to release the connection?

The reason why the client needs to wait for some time after sending the ACK message

confirmation of time wait

transport layer

transport layer function

  1. Identify application-layer protocols by port number and provide flow control
  2. Segment the raw data handed down by the application

Services provided by the transport layer

  1. Transport connection service (mainly for the requirements of the session layer, to establish a corresponding connection for each transport connection)
  2. Data transmission services (flow control, error control, sequence control)

The two protocols of the transport layer

  • TCP (Transmission Control Protocol)
  • UDP (User Datagram Protocol)

TCP protocol and UDP protocol

port

Meaning: a port uniquely identifies an application on a computer

Port classification

  • Source port: generally located on the client
  • Destination port: generally located on the server

The relationship between IP address and port

Notice:

  • Ports can be divided into virtual ports and physical ports. Virtual ports are the logical ports inside a computer, switch, or router; they are not physically visible.
  • In the UDP/TCP header, each port field occupies 2 bytes, so the port range is 0-65535
  • A firewall can open or close specific ports to improve security
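
The port range follows directly from the 2-byte field width. The sketch below is only an illustration; the helper names `port_range` and `pack_ports` are made up for this example.

```python
import struct

PORT_FIELD_BYTES = 2  # width of each port field in the UDP/TCP header


def port_range(field_bytes: int = PORT_FIELD_BYTES) -> tuple:
    """Return the (lowest, highest) value an unsigned field of this width can hold."""
    return 0, 2 ** (8 * field_bytes) - 1


def pack_ports(src_port: int, dst_port: int) -> bytes:
    """Encode source and destination port the way they open a UDP/TCP header."""
    # "!HH" = network byte order, two unsigned 16-bit integers
    return struct.pack("!HH", src_port, dst_port)
```

`struct.pack` raises `struct.error` for a port above 65535, which is the same limit the header imposes.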

UDP protocol 

Foreword:

  • UDP is connectionless, which avoids the overhead of establishing and releasing connections
  • UDP delivers on a best-effort basis and does not guarantee reliable transmission
  • UDP does not need to maintain complex connection parameters, so its header is only 8 bytes
  • UDP sends data directly, without reordering or acknowledgment

UDP packet format

Notice:

  • 16-bit UDP length: the total length of the UDP header plus the UDP data
  • 16-bit UDP checksum: a value calculated over the pseudo-header, header, and data part, used to detect errors in the UDP datagram during transmission; a datagram with an error is discarded
  • The pseudo-header is used only when calculating the checksum and is not passed down to the network layer

pseudo-header for checksum

Pseudo header content

  • source IP address
  • target IP address
  • Reserved bit (fixed to 0)
  • Protocol type (numeric representation of the protocol used by the encapsulation)
  • UDP length: the total length of the UDP header and the data part; it does not include the pseudo-header itself

Notice:

  • The pseudo-header is fixed at 12 bytes
  • The pseudo-header is used only when calculating the checksum and is not passed down to the network layer
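
As a rough sketch of how the pseudo-header enters the checksum, the code below builds the 12-byte IPv4 pseudo-header from the fields listed above and computes the standard 16-bit one's-complement Internet checksum. The function names are hypothetical, and protocol number 17 is the value IP uses for UDP.

```python
import socket
import struct

UDP_PROTOCOL = 17  # protocol number for UDP in the IP header


def build_pseudo_header(src_ip: str, dst_ip: str, protocol: int, length: int) -> bytes:
    """Source IP, destination IP, zero byte, protocol, UDP length: 12 bytes."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, protocol, length))


def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit words, folding carries back into the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    """Checksum over pseudo-header + UDP header (checksum field zero) + data."""
    length = 8 + len(payload)  # UDP header is 8 bytes
    pseudo = build_pseudo_header(src_ip, dst_ip, UDP_PROTOCOL, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = ~ones_complement_sum16(pseudo + header + payload) & 0xFFFF
    return checksum or 0xFFFF  # a computed 0 is transmitted as all ones


def udp_checksum_ok(src_ip, dst_ip, src_port, dst_port, payload, checksum) -> bool:
    """Receiver side: summing everything, checksum included, must give 0xFFFF."""
    length = 8 + len(payload)
    pseudo = build_pseudo_header(src_ip, dst_ip, UDP_PROTOCOL, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return ones_complement_sum16(pseudo + header + payload) == 0xFFFF
```

Only the 8-byte header and the data would go on the wire; the pseudo-header exists just as checksum input on both ends.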

TCP protocol

Preface: TCP requires that a connection be established before data transmission and released after the transmission is complete.

TCP packet format

Understanding of TCP protocol data segment

  • Sequence number: 32 bits. Every byte transmitted on a connection is numbered. After the connection is established, the sequence number is the number of the first data byte carried in this segment.
  • Acknowledgment number: 32 bits. Meaningful only when the ACK flag is 1. It is the sequence number of the first data byte that is expected from the other party next.
  • Data offset: 4 bits. Data offset × 4 is the header length in bytes, so the maximum TCP header length is 15 × 4 = 60 bytes. Because the minimum header length is 20 bytes, the minimum data offset is 5.
  • Reserved: 6 bits, all 0 (this is the original RFC 793 layout; later standards reassigned some of these bits as additional flags)
  • Flags: 6 bits, used to indicate the state and purpose of the segment
  • Window: 16 bits. Used mainly for TCP flow control; it tells the other party how much data it may send next. When window scaling is in use, the real receive window is this value multiplied by the window scaling factor.
  • Checksum: 16 bits. Calculated over pseudo-header + header + data; used to detect errors in the TCP segment during transmission. A segment with an error is discarded.
  • Urgent pointer: 16 bits. Meaningful only when the URG flag is 1 (for example, an urgent pointer of 8 means that the first 8 bytes of the TCP data part are urgent data)
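
To make the field layout concrete, here is a minimal sketch that unpacks the fixed 20-byte header with `struct` and applies the "data offset × 4" rule. It assumes the 6-bit reserved / 6-bit flags layout described above; `parse_tcp_header` is a name invented for this example.

```python
import struct

TCP_FIXED_HEADER_BYTES = 20


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options, and payload."""
    if len(segment) < TCP_FIXED_HEADER_BYTES:
        raise ValueError("a TCP segment needs at least a 20-byte header")
    # ports (2+2), seq (4), ack (4), offset/reserved/flags (2),
    # window (2), checksum (2), urgent pointer (2) = 20 bytes
    (src_port, dst_port, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:TCP_FIXED_HEADER_BYTES])
    data_offset = offset_and_flags >> 12   # top 4 bits
    header_length = data_offset * 4        # 20..60 bytes
    if header_length < TCP_FIXED_HEADER_BYTES or header_length > len(segment):
        raise ValueError("invalid data offset")
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "data_offset": data_offset,
        "header_length": header_length,
        "flags": offset_and_flags & 0x3F,  # low 6 bits
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[TCP_FIXED_HEADER_BYTES:header_length],
        "payload": segment[header_length:],
    }
```

Everything between byte 20 and `header_length` is the options part, which is why it can be at most 40 bytes.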

TCP pseudo-header

Preface: The pseudo-header occupies 12 bytes, is used only when calculating and verifying the checksum, and is not passed down to the network layer

Pseudo header content

  • source IP address
  • target IP address
  • Reserved bit (fixed to 0)
  • Protocol type (numeric representation of the protocol used by the encapsulation)
  • TCP length: the total length of the TCP header and the data part; it does not include the pseudo-header itself

Notice:

  • The TCP header has only a 4-bit field (data offset) that records the header length; no field records the data length of the TCP segment. The pseudo-header, however, contains the total length of the TCP header and data (TCP length)
  • The data length of TCP/UDP can be derived from the IP header: transport-layer data length = network-layer total length - network-layer header length - transport-layer header length
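
The subtraction in the last bullet can be written out as a small helper. This is only a sketch with a hypothetical function name; all lengths are in bytes.

```python
def transport_payload_length(ip_total_length: int,
                             ip_header_length: int,
                             transport_header_length: int) -> int:
    """Data length = IP total length - IP header length - TCP/UDP header length."""
    payload = ip_total_length - ip_header_length - transport_header_length
    if payload < 0:
        raise ValueError("header lengths exceed the IP total length")
    return payload
```

For example, a 1500-byte IP packet with a 20-byte IP header and a 20-byte TCP header carries 1460 bytes of TCP data.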

flags

  • URG: the urgent pointer field is valid
  • ACK: the acknowledgment number field is valid
  • PSH: push; asks the receiver to deliver the data to the application promptly instead of buffering it
  • RST: reset the connection (RST=1 indicates a problem with the connection; it must be released and re-established)
  • SYN: synchronize sequence numbers; SYN=1, ACK=0 means this is a request to establish a connection
  • FIN: the sender has finished sending and requests to release the connection

Note: Each flag is a single bit; it is set with 1 and cleared with 0
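
Since each flag is one bit, the 6-bit flags field can be decoded with bit masks. The sketch below uses the standard bit positions (FIN is the lowest bit, URG the highest of the six); the helper names are invented for illustration.

```python
FLAG_BITS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def encode_flags(*names: str) -> int:
    """Combine flag names into the 6-bit flags value."""
    value = 0
    for name in names:
        value |= FLAG_BITS[name]
    return value


def decode_flags(value: int) -> list:
    """List the names of all flags whose bit is 1."""
    return [name for name, bit in FLAG_BITS.items() if value & bit]


def is_connection_request(value: int) -> bool:
    """SYN=1 and ACK=0 marks a request to establish a connection."""
    return bool(value & FLAG_BITS["SYN"]) and not value & FLAG_BITS["ACK"]
```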

Understand the sequence number and confirmation number of TCP

Several points of TCP

  • reliable transmission
  • flow control
  • congestion control
  • connection management

TCP reliable transmission

Reliable transmission: The client sends a request to the server, and the server receives the request and returns data to the client. If the returned data is large, the server cannot transmit it all at once, so it splits the data into several segments and sends them one by one. If a segment is lost along the way (the client does not acknowledge receipt), the server resends only that segment.

Stop waiting - ARQ (automatic repeat-request) automatic repeat request

Example: A sends 3 packets to B (M1, M2, M3)

  • Normal case: A sends M1 to B, and B sends an acknowledgment to A after receiving it. A sends M2 after receiving the acknowledgment, and so on.
  • Timeout: A sends M1 to B, but M1 is corrupted in transit. B discards the damaged segment and sends no acknowledgment. After a timeout, A resends M1, and B acknowledges it once an intact M1 arrives.
  • Acknowledgment lost: A sends M1 to B, and B acknowledges M1, but the acknowledgment is lost in transit. After a timeout, A retransmits M1. B discards the duplicate M1 and acknowledges again, and this time A receives the acknowledgment.
  • Acknowledgment late: A sends M1 to B, and B acknowledges it, but the acknowledgment is delayed past A's timeout. A retransmits M1; B discards the duplicate M1 and acknowledges again. Once A receives this acknowledgment, it can continue sending other packets. When B's first acknowledgment finally arrives, A takes no action.
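
The first three cases can be replayed with a small deterministic simulation. This is a sketch under simplifying assumptions: the channel is scripted through the hypothetical `lost_data` and `lost_acks` sets of `(packet_index, attempt)` pairs, a timeout is modeled as simply moving on to the next attempt, and the late-acknowledgment case is not modeled.

```python
def stop_and_wait(packets, lost_data=frozenset(), lost_acks=frozenset(), max_tries=5):
    """Send packets one at a time, waiting for each ACK before the next.

    Returns (event log, list of packets B delivered to its application).
    """
    log = []
    delivered = []
    for index, packet in enumerate(packets):
        for attempt in range(1, max_tries + 1):
            log.append(f"A sends {packet} (try {attempt})")
            if (index, attempt) in lost_data:
                log.append(f"{packet} lost or corrupted; A times out")
                continue
            if len(delivered) > index:
                # B already accepted this packet on an earlier attempt
                log.append(f"B discards duplicate {packet}, re-sends ACK")
            else:
                delivered.append(packet)
                log.append(f"B accepts {packet}, sends ACK")
            if (index, attempt) in lost_acks:
                log.append(f"ACK for {packet} lost; A times out")
                continue
            log.append(f"A receives ACK for {packet}")
            break
        else:
            raise TimeoutError(f"gave up on {packet} after {max_tries} tries")
    return log, delivered
```

Calling `stop_and_wait(["M1"], lost_acks={(0, 1)})` reproduces the lost-acknowledgment case: B ends up with a single copy of M1 even though A sent it twice.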

Continuous ARQ protocol + sliding window protocol

Notice:

  • The size of the sliding window is determined by the receiver (B)
  • Some systems send an RST segment to tear down the TCP connection if a segment still fails after 5 retransmissions
  • When the receiver gets data from the sender, it first places it in a buffer of limited size. If the buffer is running short, the receiver tells the sender its current receive window size (the receive window size is not fixed)

Specific case

Data sent from computer A to computer B

Sending process:

SACK selective confirmation

Preface: During TCP communication, a packet in the middle of the sending sequence may be lost (for example, packet 3 of 1, 2, 3, 4, 5). Without SACK, the receiver can only acknowledge up to the last in-order packet (2), so the sender retransmits everything after it (3, 4, 5). Packets that already arrived correctly (4, 5) may be sent again, which reduces TCP performance. SACK was developed to improve this: it tells the sender which data is missing and which data has already been received out of order, so that TCP resends only the lost packet (3) and not the subsequent packets (4, 5).

Example: 

Understanding: Suppose the sender sends bytes 201 to 1000 and the receiver receives only the data in the brown ranges, not the data in the white ranges. The receiver then tells the sender which ranges it has received: (left edge 301, right edge 401), (left edge 501, right edge 601), (left edge 701, right edge 801), (left edge 901, right edge 1001).

Notice:

  • SACK information will be placed in the option part of the TCP header
  • kind: 1 byte; a value of 5 means this is a SACK option (TCP has many kinds of options)
  • length: 1 byte; the total number of bytes the SACK option occupies
  • left edge: 4 bytes; the left boundary of a received block
  • right edge: 4 bytes; the right boundary of a received block
  • The options part is at most 40 bytes (the data offset limits the TCP header to 60 bytes, 20 of which are fixed). kind and length take 2 of those 40 bytes, leaving 38 bytes for left and right edges. One pair of edges is 8 bytes, so at most 4 pairs can be stored.
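
The option layout and the four-block limit can be checked with a short sketch. The helper names are hypothetical, and each block is treated as half-open: the right edge is the sequence number just past the last received byte, which matches the 301/401 style of the example above.

```python
import struct

MAX_OPTION_BYTES = 40
SACK_KIND = 5
SACK_OVERHEAD_BYTES = 2  # kind + length
SACK_BLOCK_BYTES = 8     # 4-byte left edge + 4-byte right edge
MAX_SACK_BLOCKS = (MAX_OPTION_BYTES - SACK_OVERHEAD_BYTES) // SACK_BLOCK_BYTES


def build_sack_option(blocks) -> bytes:
    """Encode up to MAX_SACK_BLOCKS (left, right) pairs as a SACK option."""
    blocks = list(blocks)[:MAX_SACK_BLOCKS]
    length = SACK_OVERHEAD_BYTES + SACK_BLOCK_BYTES * len(blocks)
    option = struct.pack("!BB", SACK_KIND, length)
    for left, right in blocks:
        option += struct.pack("!II", left, right)
    return option


def missing_ranges(next_expected: int, blocks, send_end: int) -> list:
    """Sender side: which [start, end) ranges were not reported as received.

    next_expected is the cumulative acknowledgment number and send_end is
    the sequence number just past the last byte sent.
    """
    gaps = []
    cursor = next_expected
    for left, right in sorted(blocks):
        if left > cursor:
            gaps.append((cursor, left))
        cursor = max(cursor, right)
    if cursor < send_end:
        gaps.append((cursor, send_end))
    return gaps
```

With the blocks from the example and bytes 201 to 1000 sent, `missing_ranges(201, blocks, 1001)` yields the four white ranges, which are the only bytes the sender needs to retransmit.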

Transport Layer Data Segmentation

Why choose to segment data at the transport layer instead of waiting until the network layer and then passing it to the data link layer?

Because reliable transmission is handled at the transport layer. If the transport layer did not segment the data, the loss of any part would force the entire block of transport-layer data to be retransmitted. With segmentation at the transport layer, only the lost segments need to be retransmitted.

flow control

Preface: When the receiver gets data from the sender, it places it in a buffer first. If the receiver's buffer is full and the sender keeps sending at full speed, the receiver can only discard the incoming packets. Heavy packet loss wastes network resources, so flow control is required.

Meaning: Keep the sender from sending too fast, so that the receiver has time to receive and process the data

principle:

  • The receiver controls the sender's rate through the window field in its acknowledgment segments
  • The sender's window cannot exceed the window size advertised by the receiver
  • When the sender is told that the receive window is 0, it stops sending data

Specific case

special case of flow control

At first, the receiver sends the sender a segment with a window of 0. Later, the receiver frees some buffer space and sends a segment with a non-zero window, but that segment is lost. The sender's window stays at 0, the receiver waits for data that never comes, and the two parties are deadlocked.

Solution: When the sender receives a zero-window notification, it stops sending data and starts a timer. When the timer expires, it sends a probe segment to ask the receiver for its latest window size. If the receive window is still 0, the sender restarts the timer.
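
A minimal sketch of both ideas follows: the sender never has more unacknowledged data outstanding than the advertised window, and after a zero window it keeps probing until a non-zero window is reported. The function names are hypothetical, and the list passed to `probe_until_open` stands in for the receiver's replies to successive probes.

```python
def sendable_bytes(pending: int, in_flight: int, rwnd: int) -> int:
    """How many more bytes may be sent right now under the receive window."""
    return max(0, min(pending, rwnd - in_flight))


def probe_until_open(window_replies):
    """Simulate the zero-window timer: one probe per timer expiry.

    window_replies holds the window size the receiver reports for each
    probe. Returns (number of probes sent, window size finally learned).
    """
    probes = 0
    for rwnd in window_replies:
        probes += 1          # timer expired, probe sent
        if rwnd > 0:
            return probes, rwnd  # window opened, sending can resume
        # window still 0: restart the timer and try again
    return probes, 0
```

Because the probe is triggered by the sender's own timer, a lost window update can no longer leave both sides waiting forever.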

congestion control

foreword

understand:

  • Prevent excessive data injection into the network
  • Avoid overloading routers or links in the network

Note: Congestion control is a global process involving all hosts, all routers, and every factor that can degrade network performance; it depends on everyone's joint effort. In contrast, flow control is point-to-point control between one sender and one receiver.

congestion control method

  • slow start
  • congestion avoidance
  • fast retransmit
  • fast recovery

basic knowledge

  • MSS (Maximum Segment Size): the maximum size of the data part of each segment. It is determined when the connection is established, and no segment from the sender may exceed it
  • cwnd (congestion window): the sender's estimate of how much data it can have in flight without congesting the network
  • rwnd (receive window): the receiver tells the sender the maximum amount of data it may send
  • swnd (send window): the window actually used when sending data

Understanding of congestion window, receiving window, and sending window

Assume the receiver's window size is 3000, so the sender can send at most 3000 bytes. The congestion window changes frequently. If the network is busy and cannot carry much, the congestion window might be only 2000 (below the receive window of 3000), and the send window equals the congestion window. If the network is clear and the congestion window is 5000 while the other party's receive window is 3000, the send window equals the receive window.

Summary: send window = min (receive window, congestion window)

Congestion Control - Slow Start

Understanding: The receiver tells the sender that it may send at most 3000 bytes and that the data part of each packet cannot exceed 100 bytes. The sender starts by sending 1 packet. When the other party acknowledges it, the sender concludes that one packet got through without congestion and that it can send more. It then sends 2 packets; when they are acknowledged it sends 4, and then 8.

Congestion window change relationship diagram with time

Summary: The initial value of cwnd (congestion window) is small. As packets are acknowledged by the receiver (ACKs are received), cwnd doubles each round, growing exponentially.

congestion avoidance

Notice:

  • ssthresh (slow start threshold): once cwnd reaches this threshold, it grows linearly
  • Congestion avoidance (additive increase): the congestion window grows slowly to keep the network from becoming congested too soon
  • Detecting congestion: if the sender receives no acknowledgment for data it has sent, it assumes packets were lost and retransmits; this suggests that the network may be congested
  • Multiplicative decrease: when the sender detects possible congestion, it sets ssthresh to half of the cwnd value at that moment and runs slow start again (cwnd returns to its initial value)
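
The interplay of slow start, congestion avoidance, and multiplicative decrease can be traced round by round. The sketch below is a textbook-style simplification, not how a real TCP stack is coded: cwnd is counted in whole segments, one step is one round trip, slow-start growth is capped at ssthresh, and the function names are made up.

```python
def send_window(rwnd: int, cwnd: int) -> int:
    """send window = min(receive window, congestion window)."""
    return min(rwnd, cwnd)


def next_cwnd(cwnd: int, ssthresh: int, timeout: bool = False, initial_cwnd: int = 1):
    """Advance one round trip and return the new (cwnd, ssthresh)."""
    if timeout:
        # multiplicative decrease, then slow start from the initial value
        return initial_cwnd, max(cwnd // 2, 2)
    if cwnd < ssthresh:
        # slow start: double per round, but do not overshoot the threshold
        return min(cwnd * 2, ssthresh), ssthresh
    # congestion avoidance: additive increase of one segment per round
    return cwnd + 1, ssthresh


def trace_cwnd(rounds: int, ssthresh: int, timeout_rounds=()):
    """Return the cwnd in effect during each round."""
    cwnd = 1
    history = []
    for round_number in range(rounds):
        history.append(cwnd)
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh,
                                   timeout=round_number in timeout_rounds)
    return history
```

`trace_cwnd(12, 16, timeout_rounds={7})` gives 1, 2, 4, 8, 16 (slow start), 17, 18, 19 (congestion avoidance), then drops back to 1 after the timeout and doubles again toward the new threshold of 9.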

Congestion Control - Fast Retransmission

understand:

  • Receiver: whenever an out-of-order segment arrives, it immediately sends a duplicate acknowledgment so that the sender learns early that a segment is missing, instead of waiting until it has data of its own to send and piggybacking the acknowledgment on it
  • Sender: as soon as it receives three consecutive duplicate acknowledgments (4 identical acknowledgments in total), it immediately retransmits the segment the other party has not received, instead of waiting for the retransmission timer to expire
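
Both sides of fast retransmit can be sketched in a few lines. The example assumes fixed-size segments identified by their starting sequence number; `receiver_acks` and `fast_retransmit_points` are hypothetical helper names.

```python
def receiver_acks(arrivals, segment_size: int, first_seq: int = 1) -> list:
    """Receiver: acknowledge every arrival with the next byte expected in order."""
    received = set()
    expected = first_seq
    acks = []
    for seq in arrivals:
        received.add(seq)
        while expected in received:
            expected += segment_size  # advance over everything now in order
        acks.append(expected)         # out-of-order arrivals repeat the old ack
    return acks


def fast_retransmit_points(acks, threshold: int = 3) -> list:
    """Sender: return the ack numbers that trigger a fast retransmit."""
    retransmits = []
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:  # third duplicate = fourth identical ack
                retransmits.append(ack)
        else:
            last_ack = ack
            duplicates = 0
    return retransmits
```

If segment 201 is delayed while 301, 401, and 501 arrive, the receiver answers 201 four times in a row, and the sender retransmits the segment starting at 201 without waiting for its timer.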

fast recovery

Understanding: When the sender receives three duplicate acknowledgments in a row, it applies multiplicative decrease and sets ssthresh to half of the current cwnd. Unlike the timeout case, slow start is not executed, so cwnd does not fall back to its initial value. Instead, cwnd is set to the halved ssthresh and congestion avoidance (additive increase) begins, so the congestion window grows slowly and linearly.
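
The difference between the two reactions fits in two small functions. This sketch follows the simplified description above, with cwnd in whole segments and hypothetical function names; actual implementations also inflate cwnd temporarily while recovering, which is left out here.

```python
def on_timeout(cwnd: int, initial_cwnd: int = 1):
    """Retransmission timer expired: halve ssthresh and restart slow start."""
    ssthresh = max(cwnd // 2, 2)
    return initial_cwnd, ssthresh


def on_triple_duplicate_ack(cwnd: int):
    """Three duplicate ACKs: halve ssthresh and continue from there linearly."""
    ssthresh = max(cwnd // 2, 2)
    return ssthresh, ssthresh
```

Starting from cwnd = 24, a timeout gives (cwnd, ssthresh) = (1, 12), while three duplicate acknowledgments give (12, 12), so the sender keeps half of its rate instead of starting over.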

TCP connection management

Note: In connection management, the segments of the three-way handshake and the four-way wave contain only a TCP header and no TCP data part.

TCP establishes a connection three-way handshake

understand:

  1. The client sends the server a request to establish a connection with SYN=1 and ACK=0, which marks it as a connection request. A random initial sequence number seq is generated for it; call it x.
  2. After receiving the request, the server sends back a confirmation with SYN=1, ACK=1, and acknowledgment number ack=x+1 (it expects the client's next byte to be numbered x+1). The server also randomly generates its own sequence number seq=y.
  3. The client receives the server's confirmation and replies with ACK=1. The client is no longer requesting a connection, so SYN is not set. The acknowledgment number is ack=y+1 (it expects the server's next byte to be numbered y+1) and seq=x+1.
  4. Data transfer starts
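
The numbers exchanged in the three steps can be generated with a short sketch. The dictionary layout and the function name are made up for illustration; sequence numbers are 32 bits wide, so the additions wrap around at 2**32.

```python
import random

SEQ_MODULUS = 2 ** 32  # sequence and acknowledgment numbers are 32 bits wide


def three_way_handshake(x=None, y=None) -> list:
    """Return the three handshake segments for initial sequence numbers x and y."""
    x = random.randrange(SEQ_MODULUS) if x is None else x
    y = random.randrange(SEQ_MODULUS) if y is None else y
    return [
        # 1. client -> server: connection request, ack field not yet valid
        {"from": "client", "SYN": 1, "ACK": 0, "seq": x, "ack": None},
        # 2. server -> client: confirms x and announces its own y
        {"from": "server", "SYN": 1, "ACK": 1, "seq": y,
         "ack": (x + 1) % SEQ_MODULUS},
        # 3. client -> server: confirms y, no SYN any more
        {"from": "client", "SYN": 0, "ACK": 1,
         "seq": (x + 1) % SEQ_MODULUS, "ack": (y + 1) % SEQ_MODULUS},
    ]
```

With x = 100 and y = 300, the server replies with ack = 101 and the client finishes with seq = 101, ack = 301.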

The reason why TCP establishes a connection three-way handshake

If only two handshakes were required, the following could happen: the client's first connection request is delayed in the network and reaches the server some time after the connection has already been released. The server responds to this stale request and considers a connection established. The client, however, has already received its data and has no further HTTP request to send, so the server waits forever in the connection-established state and wastes resources.

Note: If the third handshake fails, the server is in the SYN-RCVD state. If the client's ACK does not arrive, the server resends the SYN+ACK packet. If the ACK still has not arrived after several retransmissions, the server sends an RST packet to forcibly close the connection.

Connection process (take http get request as an example - the data part is 4 bytes)

Note: The seq and ack used here are relative

1. First, the client sends an HTTP GET request with seq 1 (relative to x from connection establishment) and ack 1 (relative to y from connection establishment), which means it expects the server to start sending data from its first byte. Let k be the number of data bytes in this request.

2. The server receives the request and sends 4 packets back to back (the continuous ARQ protocol described earlier).

In each packet, seq is the sequence number of the first data byte it carries, and ack is k+1 because the server expects the client's next byte to be byte k+1.

3. The client receives the 4 TCP segments, whose data parts are b1, b2, b3, and b4 bytes long, and replies with a segment whose TCP data part is 0 bytes. Its seq is k+1, the next byte the client would send to the server, and its ack is b1+b2+b3+b4+1, the next byte it expects from the server.
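
The relative seq and ack values in this exchange can be computed mechanically. The sketch below assumes one request followed by any number of response segments and a single final acknowledgment; the function name and dictionary layout are invented for illustration.

```python
def request_response_numbers(request_len: int, response_lens) -> list:
    """Relative seq/ack numbers for one request and a multi-segment reply."""
    # client request: first byte of each direction is relative number 1
    segments = [{"from": "client", "seq": 1, "ack": 1, "len": request_len}]
    server_seq = 1
    for length in response_lens:
        segments.append({
            "from": "server",
            "seq": server_seq,        # first data byte in this segment
            "ack": 1 + request_len,   # next byte expected from the client
            "len": length,
        })
        server_seq += length
    # client acknowledgment with an empty data part
    segments.append({"from": "client", "seq": 1 + request_len,
                     "ack": server_seq, "len": 0})
    return segments
```

For a 100-byte request answered by segments of 1000, 1000, 1000, and 500 bytes, the server segments start at 1, 1001, 2001, and 3001 with ack = 101, and the client's final segment has seq = 101 and ack = 3501.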

TCP releases the connection and waves four times

Foreword:

  • TCP is full-duplex communication
  • On the TCP/IP protocol stack, any party is allowed to initiate a disconnection request

understand:

  1. After the data transmission is complete, the client sends the server a connection release request with FIN=1 and sequence number seq=u (at this stage sequence numbers continue from the data already sent; they are not chosen randomly)
  2. The server receives the request and confirms it with ACK=1, ack=u+1, and sequence number seq=v
  3. When the server has nothing more to send to the client, it also sends a connection release request (FIN=1) with sequence number seq=w
  4. The client receives the server's request and confirms it with ACK=1 and ack=w+1 (telling the server it knows that nothing more will be sent), and both parties close the connection

Why is it necessary to wave four times to release the connection?

  • First wave: the client's FIN tells the server that the client has no more data to send, but the client can still receive data from the server
  • Second wave: the server's ACK means the server knows the client has no more data to send, but the server can still send data to the client
  • Third wave: the server's FIN tells the client that the server has no more data to send
  • Fourth wave: the client's ACK means the client knows the server has no more data to send, and the whole TCP connection is then closed
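
The four segments and the sequence numbers they carry can be listed in a short sketch. The function name and dictionary layout are hypothetical; u, v, and w follow the naming used in the steps above, and a FIN consumes one sequence number, which is why the acknowledgments are u+1 and w+1.

```python
def four_way_close(u: int, v: int, w: int) -> list:
    """Return the four release segments for sequence numbers u, v, and w."""
    return [
        # 1. client has no more data to send
        {"from": "client", "FIN": 1, "ACK": 0, "seq": u, "ack": None},
        # 2. server confirms; it may still send data after this
        {"from": "server", "FIN": 0, "ACK": 1, "seq": v, "ack": u + 1},
        # 3. server has no more data to send either
        {"from": "server", "FIN": 1, "ACK": 1, "seq": w, "ack": u + 1},
        # 4. client confirms, then waits before fully closing
        {"from": "client", "FIN": 0, "ACK": 1, "seq": u + 1, "ack": w + 1},
    ]
```

Segments 2 and 3 stay separate because the server may still have data to send between them; that is the reason the release takes four segments instead of three.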

The reason why the client needs to wait for some time after sending the ACK message

If the client released the connection immediately after sending the ACK, and the server did not receive that ACK because of network problems, the server would resend the FIN. The following situations could then occur:

  • The client does not respond, so the server keeps waiting and may resend the FIN several times, wasting resources
  • A new application on the client happens to be assigned the same port number. It receives the FIN and starts tearing down a connection, although it actually wanted to establish a new connection with the server.

confirmation of time wait

The wait is generally twice the MSL (Maximum Segment Lifetime), the longest time a TCP segment can survive on the internet. Each TCP implementation must choose a specific MSL value; RFC 1122 recommends 2 minutes. Waiting 2MSL prevents packets from this connection from being mistaken for packets of the next connection, because all packets of this connection will have disappeared within 2MSL.

Origin blog.csdn.net/m0_60027772/article/details/129135523