Transport layer introduction

From studying the IP network layer, we know that IP can deliver a packet sent by source host A to destination host B according to the destination address in the header. So why do we need a transport layer?
From the IP layer's point of view, the two ends of a communication are two hosts, which is often called host-to-host (point-to-point) communication. But the real communicating entities in a computer network are the processes running on the hosts, and it is they that exchange data. The IP layer only delivers a packet to the destination host, not to the process that is actually communicating. The transport layer is therefore needed to decide which process the data received from the network should be handed to. This is often referred to as end-to-end communication.
Since a host may be running several network applications at the same time, the transport layer must be able to tell which application an incoming network-layer packet belongs to. This requires an identifier, and in computer networks the port number plays that role. Multiple application processes can send data through the same transport layer protocol, that is, the transport layer performs multiplexing. On the receiving side, after stripping the header, it must deliver the data to the correct target process, so the transport layer must also perform demultiplexing. The following figure shows how information is delivered between processes on two hosts.
[Figure: information delivery between processes on two hosts]
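The multiplexing and demultiplexing idea described above can be sketched in a few lines of Python. This is only a toy model, not how a real kernel does it: the class name `TransportDemux`, its methods, and the process names are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


class TransportDemux:
    """Toy demultiplexer: maps a destination port to the process bound to it."""

    def __init__(self):
        self.bound = {}                  # port -> process name (hypothetical)
        self.inbox = defaultdict(list)   # process name -> payloads delivered so far

    def bind(self, port, process):
        """Register a process as the owner of a port."""
        if port in self.bound:
            raise OSError(f"port {port} already in use")
        self.bound[port] = process

    def deliver(self, dst_port, payload):
        """Hand a payload (header already stripped) to the process on dst_port.

        Returns the process name, or None if no process is bound to the port.
        """
        process = self.bound.get(dst_port)
        if process is None:
            return None
        self.inbox[process].append(payload)
        return process


demux = TransportDemux()
demux.bind(53, "dns_server")
demux.bind(8080, "web_app")
print(demux.deliver(8080, b"GET /"))   # delivered to the process bound to 8080
print(demux.deliver(9999, b"?"))       # nothing is bound to 9999
```

The only point of the sketch is that the host address gets the data to the machine, while the port number selects the process on that machine.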
The transport layer in TCP/IP uses a 16-bit port number to identify a port. Port numbers are divided into system (well-known) ports (0-1023), registered ports (1024-49151), and dynamic or private ports used by end-user client processes (49152-65535). The following are commonly used well-known port numbers:
[Table image: commonly used well-known port numbers]
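As a small illustration of the three port ranges listed above, the following sketch classifies a port number. The function name `classify_port` and the returned labels are my own choices, not part of any library.

```python
def classify_port(port: int) -> str:
    """Return which of the three ranges a 16-bit port number falls into."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("a port number must fit in 16 bits (0-65535)")
    if port <= 1023:
        return "system (well-known)"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


for p in (80, 3306, 50000):
    print(p, classify_port(p))
```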
The transport layer provides logical communication between application processes. Logical communication means that the application layer only needs to hand its messages down to the transport layer below it, and the transport layer appears to carry these messages directly across, in the horizontal direction, to the other party's transport layer. In fact there is no horizontal physical connection between the two transport layers.
The transport layer shields higher-level users from the details of the network core below. To the application process it looks as if there is an end-to-end logical communication channel between the two transport layer entities. The transport layer also has to check received messages for errors. When an IP packet arrives at the destination host and is passed up, and the destination port turns out to be wrong (that is, no process is using that port), a "port unreachable" error message is sent back to the sender (for UDP this is an ICMP message).
The two main protocols at the transport layer are the User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP).

UDP protocol

Main features of UDP:

  • UDP is connectionless: no connection needs to be established before sending data.
  • UDP uses best-effort delivery and does not guarantee reliable delivery.
  • UDP is message-oriented: a message handed down by the application is passed to the IP layer as soon as the header is added, and it is neither merged nor split.
  • UDP has no congestion control: congestion in the network does not reduce the sending rate of the source host.
  • UDP supports one-to-one, one-to-many, many-to-one, and many-to-many communication.
  • UDP header overhead is small, only 8 bytes.
    UDP's biggest advantages are its good real-time behavior and its simplicity. The UDP header format is shown below:
    [Figure: UDP header format]
  • The pseudo header is not a real part of the UDP user datagram. It is only used when computing the checksum, and it is passed neither up nor down the stack.
  • Port numbers (source and destination) identify the communicating processes.
  • Length: the length of the entire user datagram (header plus data).
  • Checksum: covers the entire user datagram, together with the pseudo header.
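To make the header fields and the pseudo header more concrete, here is a minimal sketch that builds a UDP datagram for IPv4 and computes its checksum. The helper names `internet_checksum` and `build_udp_datagram` are hypothetical, and the addresses and ports in the usage lines are made-up example values. The pseudo header layout (source IP, destination IP, a zero byte, protocol number 17, UDP length) follows the UDP specification.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of all 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"                              # pad to an even length
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)     # fold carries back in
    return ~total & 0xFFFF


def build_udp_datagram(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    """Return the 8-byte UDP header followed by the payload."""
    length = 8 + len(payload)                        # header plus data
    # 12-byte pseudo header: used only for the checksum, never transmitted.
    pseudo = struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                         socket.inet_aton(dst_ip), 0, 17, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = internet_checksum(pseudo + header + payload)
    if checksum == 0:
        checksum = 0xFFFF                            # 0 is reserved for "no checksum"
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


datagram = build_udp_datagram("192.168.0.1", "192.168.0.2", 5000, 53, b"hello")
print(struct.unpack("!HHHH", datagram[:8]))          # source port, destination port, length, checksum
```

A receiver can verify the datagram by running the same checksum over the pseudo header plus the received datagram; with the checksum field filled in, the result should be 0.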

TCP protocol

Main features of TCP

  • TCP is a connection-oriented transport layer protocol: an application must establish a TCP connection before using TCP, and must release the connection after the data has been transmitted.
  • Each TCP connection has exactly two endpoints, so every TCP connection is point-to-point (one-to-one).
  • TCP provides reliable delivery.
  • TCP is byte-stream-oriented. The number of bytes a segment carries is decided by TCP according to the window value advertised by the other party and the current level of network congestion.
    Each TCP connection has two endpoints, so what is an endpoint of a TCP connection? It is called a socket. A socket is formed by joining an IP address and a port number (IP address:port). In Linux, both TCP and UDP communicate through sockets.
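The "IP address plus port" idea of a socket, and the fact that a connection is identified by its two endpoints, can be modeled with plain tuples. This is only a sketch for illustration: the `Socket` and `Connection` types below are hypothetical and are not the operating system's socket API, and the addresses are documentation example addresses.

```python
from typing import NamedTuple


class Socket(NamedTuple):
    """An endpoint: IP address joined with a port number."""
    ip: str
    port: int

    def __str__(self):
        return f"{self.ip}:{self.port}"


class Connection(NamedTuple):
    """A TCP connection is determined by its two endpoints."""
    local: Socket
    remote: Socket


server = Socket("192.0.2.10", 80)
conn_a = Connection(server, Socket("198.51.100.7", 50001))
conn_b = Connection(server, Socket("198.51.100.7", 50002))

print(server)                         # the endpoint written as ip:port
print(conn_a != conn_b)               # two distinct connections
print(conn_a.local == conn_b.local)   # sharing the same server endpoint
```

This also shows why one server port can serve many clients at once: each connection differs in its remote endpoint even though the local endpoint is the same.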

Header format of TCP segment

[Figure: TCP segment header format]
The first 20 bytes of the TCP segment header are fixed. They are followed by 4N bytes of options added as needed (N is an integer). The meaning of each field is briefly described below.

  • Port numbers: the source port and the destination port, used to identify the upper-layer application processes.
  • Sequence number: sequence numbers are computed modulo 2^32. The value in the header is the sequence number of the first data byte carried in this segment. TCP is byte-stream-oriented, so every byte of the stream transmitted over a connection is numbered in order. The starting sequence number of the whole byte stream must be set when the connection is established.
  • Acknowledgment number: the sequence number of the first data byte that the sender of this segment expects to receive in the other party's next segment.
  • Data offset: how far the start of the data in the segment is from the start of the segment, in other words the length of the TCP header. It is measured in 4-byte units.
  • Urgent URG: when URG = 1, the urgent pointer field is valid. It tells the system that this segment contains urgent data that should be delivered as soon as possible. The sender places the urgent data at the front of the segment's data, and the data after it is still ordinary data. URG must be used together with the urgent pointer field in the header.
  • Acknowledgment ACK: the acknowledgment number field is valid only when ACK = 1. TCP stipulates that every segment sent after the connection is established must have ACK set to 1.
  • Push PSH: when the sending application requests a push operation, the sender sets PSH to 1 and immediately builds and sends a segment. Likewise, a receiver that gets a segment with PSH = 1 delivers it to the receiving application process as soon as possible, instead of waiting for the whole buffer to fill up first.
  • Reset RST: RST = 1 indicates that a serious error has occurred on the TCP connection; the connection must be released and then re-established. This flag is also used to reject an illegal segment or to refuse to open a connection.
  • Synchronize SYN: used to synchronize sequence numbers when the connection is established. SYN = 1 with ACK = 0 indicates a connection request segment. If the other party agrees to establish the connection, its response segment has SYN = 1 and ACK = 1.
  • Finish FIN: used to release a connection. FIN = 1 indicates that the sender of this segment has finished sending its data and requests that the transport connection be released.
  • Window: the receive window of the party sending this segment. The window value tells the other party how much data (in bytes), counted from the acknowledgment number in this segment's header, the sender of this segment is currently prepared to accept.
  • Checksum: covers the entire TCP segment (header and data). As with UDP, a 12-byte pseudo header is added in front when computing it.
  • Urgent pointer: meaningful only when URG = 1. It gives the number of bytes of urgent data in this segment (ordinary data follows the end of the urgent data). Note that urgent data can be sent even when the window is zero.
  • Options: variable length, up to 40 bytes. They include the maximum segment size (MSS), window scaling, timestamps, selective acknowledgment (SACK), and others.
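The field list above maps directly onto a fixed 20-byte layout, so it can be unpacked with Python's `struct` module. The following is a sketch under a few assumptions: the function name `parse_tcp_header` is hypothetical, and the sample segment (a SYN from port 50000 to port 80 with a single MSS option) is hand-built with made-up values rather than captured from a real network.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "data offset + flags" word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte header, then split off options and data."""
    (src_port, dst_port, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4        # data offset counts 4-byte units
    flags = [name for name, bit in FLAG_BITS.items() if off_flags & bit]
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Hand-built example: data offset 6 (24-byte header), SYN flag, MSS option 1460.
mss_option = struct.pack("!BBH", 2, 4, 1460)  # kind 2, length 4, value 1460
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0,
                     (6 << 12) | FLAG_BITS["SYN"], 65535, 0, 0) + mss_option

fields = parse_tcp_header(sample)
print(fields["flags"], fields["header_len"], fields["options"])
```

The checksum in the sample is left as 0 for brevity; a real segment would carry a checksum computed over the pseudo header, header, and data.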

TCP transport connection management

TCP is a connection-oriented protocol, and TCP segments are carried over a transport connection. As with circuit switching, a TCP exchange has three stages: connection establishment, data transmission, and connection release. I previously understood "connection-oriented" to mean that each TCP channel connects exactly two ports and that, for the duration of the communication, those two ports (together with the associated resources, such as the transmission control block, TCB) are reserved for the communicating processes and cannot be used by other applications. That understanding was partly mistaken, so I correct it here. Connection-oriented means that TCP must establish a connection before transmitting data, during which both parties settle values such as their sequence numbers. These values are needed later in the transfer to judge whether data has been transmitted successfully. The TCB mentioned earlier is not the point: each time a connection is established, the kernel allocates some resources for it, and in one-to-many communication on a server the listening socket is shared. This is my current understanding; if readers have a better one, corrections are welcome.

TCP connection establishment

TCP uses a three-way handshake to establish a connection. Connection establishment has to solve the following problems:

  • Let each party know that the other exists.
  • Allow both parties to negotiate parameters (such as sequence numbers, maximum window size, and whether to use options).
  • Allocate transport entity resources (such as buffer space).
    The handshake works as follows. The connection is established using the SYN flag. TCP specifies that a SYN segment cannot carry data but still consumes one sequence number. An ACK segment can carry data, and if it carries none it consumes no sequence number. Below, x and y are the initial sequence numbers chosen by A and B. Note that the second step of the handshake can also be split into two segments: first an acknowledgment (ACK = 1, ack = x + 1), then a synchronization segment (SYN = 1, seq = y). This turns the exchange into a four-segment handshake with the same effect. Here ACK is the header flag and ack is the acknowledgment number.
    [Figure: TCP three-way handshake]
    Once B has replied to A, both sides already know that the other exists. Why does A need to send a final acknowledgment? This is mainly to prevent a stale connection request segment from suddenly reaching B and causing an error. For example, the first connection request sent by A is held up for a long time at some network node. A receives no acknowledgment, so it retransmits the request; the connection is established, data is exchanged normally, and the connection is released. After the release, the request that was stuck in the network finally reaches B. B mistakes it for a new connection request from A, so it sends A an acknowledgment segment and agrees to establish a connection. Without the third step of the handshake, the new connection would be considered established as soon as B sent its acknowledgment. B would believe a new transport connection exists and keep waiting for A to send data, wasting many of B's resources. The same behavior can be exploited to attack a server: many hosts send SYN requests to the server and never complete the handshake. The server keeps waiting for their responses and wastes resources. Each listening socket has a queue of limited length, and requests beyond that length are discarded, so legitimate connections can no longer be established. This is often referred to as a SYN attack (SYN flood).
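The sequence and acknowledgment numbers exchanged during the handshake can be written out explicitly. The sketch below only models the header values of the three segments, with x and y as the initial sequence numbers chosen by A and B. The function name `three_way_handshake` and the dictionary layout are assumptions made for this example; no real packets are sent.

```python
MOD = 2 ** 32   # sequence numbers wrap around modulo 2^32


def three_way_handshake(x: int, y: int) -> list:
    """Return the three handshake segments as dictionaries of header values."""
    syn = {"from": "A", "SYN": 1, "ACK": 0, "seq": x}
    # B acknowledges A's SYN (which consumed one sequence number) and sends its own SYN.
    syn_ack = {"from": "B", "SYN": 1, "ACK": 1, "seq": y, "ack": (x + 1) % MOD}
    # A acknowledges B's SYN; this segment carries no data here.
    ack = {"from": "A", "SYN": 0, "ACK": 1,
           "seq": (x + 1) % MOD, "ack": (y + 1) % MOD}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```

Because a SYN consumes one sequence number, each side acknowledges the other's initial sequence number plus 1, even though no data byte has been transferred yet.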

TCP connection release

In the header of a connection release segment, the control bit FIN must be set to 1. As with SYN, TCP stipulates that a FIN segment consumes one sequence number even if it carries no data.
[Figure: TCP connection release process]
The process is:

  • A's application process first tells its TCP to release the connection, stops sending data, and actively closes the TCP connection. A sends the release segment (FIN = 1, seq = u, where u is the sequence number of the last byte of previously transmitted data plus 1) and enters the FIN-WAIT-1 state, waiting for B's acknowledgment.
  • After receiving the connection release segment, B sends an acknowledgment (ACK = 1, seq = v, ack = u + 1, where v is B's own current sequence number) and enters the CLOSE-WAIT state. At this point the TCP server process should inform the higher-level application process that the connection in the direction from A to B has been released. The TCP connection is now half-closed: B can still send data to A.
  • After receiving the acknowledgment from B, A enters the FIN-WAIT-2 state and waits for B's connection release segment.
  • When B has no more data to send to A, its application process tells TCP to release the connection. B then sends its release segment and enters the LAST-ACK state, waiting for A's acknowledgment.
  • After receiving B's connection release segment, A must acknowledge it. A sends the acknowledgment and enters the TIME-WAIT state. Note that the TCP connection has not been released yet: A enters the CLOSED state only after the time-wait timer, set to 2MSL, has expired. Each time A sends this ACK again, the time-wait timer is restarted.
  • B enters the CLOSED state as soon as it receives A's acknowledgment.
    Why does A have to wait 2MSL in the TIME-WAIT state?
    First, to make sure that the last ACK sent by A can reach B. This ACK may be lost. If B does not receive it within the expected time, B retransmits its FIN segment, and A must still be around during the waiting time to retransmit the acknowledgment. Otherwise B could never enter the CLOSED state normally. Second, it prevents the stale connection request segments mentioned in the three-way handshake discussion from appearing in a later connection, because after 2MSL all segments generated during the lifetime of this connection will have disappeared from the network.
    After the connection is released, A and B each delete the corresponding transmission control block (TCB). The maximum segment lifetime (MSL) mentioned above is commonly taken to be 2 minutes (the value suggested in RFC 793; implementations often use a shorter one). Besides the time-wait timer, TCP also has a keep-alive timer, which guards against the case where the peer fails during communication and can no longer communicate, leaving the other end waiting and wasting resources. The keep-alive timer is usually set to 2 hours. If no data is received from the peer within 2 hours, a probe segment is sent, and then another every 75 seconds. If there is still no response after 10 consecutive probes, the connection is closed.
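The state changes described in the release process can be summarized as a small transition table. The sketch below is a simplified model covering only the close path described above (it ignores simultaneous close and lost segments). The event names such as "send FIN" and the helper `run` are hypothetical labels chosen for this example.

```python
# (current state, event) -> next state, for the normal close path only.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",    # active closer (A)
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN"): "TIME-WAIT",      # A acknowledges B's FIN here
    ("TIME-WAIT", "2MSL timeout"): "CLOSED",
    ("ESTABLISHED", "recv FIN"): "CLOSE-WAIT",    # passive closer (B)
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]       # KeyError means an unexpected event
        path.append(state)
    return path


print("A:", run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"]))
print("B:", run(["recv FIN", "send FIN", "recv ACK"]))
```

Reading the two printed paths side by side shows the asymmetry: the side that closes first is the one that has to sit in TIME-WAIT for 2MSL.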

How TCP reliable transmission works

We know that the segments sent by TCP are handed to the IP layer for transmission, but the IP layer only provides a best-effort service, so TCP must take appropriate measures to make communication between the two transport layers reliable. The acknowledgment and retransmission mechanism used for this is often called automatic repeat request (ARQ).

  • Stop-and-wait protocol
    In the stop-and-wait protocol, the sender stops after sending each packet, waits for the other party's acknowledgment, and sends the next packet only after the acknowledgment arrives. The sender starts a timeout timer after sending a packet; if the timer expires, it retransmits the packet it just sent. If the acknowledgment arrives before the timer expires, the timer is cancelled. Reliable transmission protocols of this kind are often called automatic repeat request (ARQ) protocols.
    [Figure: stop-and-wait protocol]
    The advantage of the stop-and-wait protocol is its simplicity; the disadvantage is that channel utilization is very low. To improve transmission efficiency, the sender can use pipelined transmission instead of the inefficient stop-and-wait protocol. That is, the sender can send several packets in a row without pausing after each one to wait for the other party's acknowledgment. As shown in the following figure:
    [Figure: pipelined transmission]
    This is the continuous ARQ protocol. The sender maintains a send window and slides it forward by one packet each time a packet is acknowledged. The receiver generally uses cumulative acknowledgment: after receiving several packets, it sends one acknowledgment for the last packet that arrived in order.
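Two points from the discussion above can be illustrated numerically: why stop-and-wait wastes the channel, and what a cumulative acknowledgment looks like. The sketch below is a simplified model; the function names are hypothetical, packets are numbered from 1, and for clarity the receiver reports its cumulative acknowledgment after every arrival, whereas a real receiver may acknowledge less often.

```python
def stop_and_wait_utilization(t_data: float, rtt: float, t_ack: float = 0.0) -> float:
    """Fraction of time the sender actually spends transmitting data.

    t_data: time to transmit one packet, rtt: round-trip time,
    t_ack: time to transmit the acknowledgment (often negligible).
    """
    return t_data / (t_data + rtt + t_ack)


def cumulative_acks(arrivals):
    """For each arriving packet number, return the highest in-order packet so far."""
    received = set()
    highest_in_order = 0
    acks = []
    for number in arrivals:
        received.add(number)
        while highest_in_order + 1 in received:   # advance over any filled gap
            highest_in_order += 1
        acks.append(highest_in_order)
    return acks


# 1 ms to send a packet, 20 ms round trip: the channel is idle most of the time.
print(stop_and_wait_utilization(t_data=1.0, rtt=20.0))

# Packet 3 arrives late: the acknowledgment stays at 2 until the gap is filled.
print(cumulative_acks([1, 2, 4, 5, 3]))
```

In the second example the acknowledgment jumps from 2 straight to 5 once packet 3 arrives, which is exactly what "acknowledging the last packet that arrived in order" means.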

TCP has many mechanisms for keeping data transfer running smoothly, the most important of which is sliding window control. As with the pipelined transmission described above, segments are sent continuously when conditions allow. The receiver tells the sender the state of its receive buffer (the window size), and the sender sends data to the receiver according to that window.
In network communication, the application layer hands the data to be sent to the transport layer and then goes on to do other things. Since TCP is byte-stream-oriented, when does TCP actually send the data that it has buffered from the application layer?
TCP has three sending triggers. First, TCP maintains a variable equal to the maximum segment size (MSS), and when the data stored in the buffer reaches MSS bytes, a segment is sent. Second, push (PUSH) is used for data that needs to be delivered promptly: when TCP receives a PUSH instruction from the application layer, it sends a segment immediately. Third, TCP starts a timer, and when the timer expires, the buffered data is sent.
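The three triggers above can be modeled with a toy send buffer. This is a sketch for illustration only: the class `SendBuffer`, its methods, and the idea of passing the current time in as a parameter are all assumptions made to keep the example deterministic, not how a real TCP stack is structured.

```python
class SendBuffer:
    """Toy model of when buffered application data gets turned into segments."""

    def __init__(self, mss: int, timeout: float):
        self.mss = mss
        self.timeout = timeout
        self.buf = bytearray()
        self.timer_start = None      # when the oldest unsent byte was buffered

    def _take(self, n: int) -> bytes:
        segment = bytes(self.buf[:n])
        del self.buf[:n]
        return segment

    def write(self, data: bytes, now: float, push: bool = False) -> list:
        """Buffer data from the application; return any segments sent as a result."""
        if not self.buf and data:
            self.timer_start = now
        self.buf += data
        sent = []
        while len(self.buf) >= self.mss:     # trigger 1: a full MSS is available
            sent.append(self._take(self.mss))
        if push and self.buf:                # trigger 2: the application asked for PUSH
            sent.append(self._take(len(self.buf)))
        if not self.buf:
            self.timer_start = None
        return sent

    def tick(self, now: float) -> list:
        """Called periodically; sends leftover data once the timer has expired."""
        if self.buf and now - self.timer_start >= self.timeout:   # trigger 3
            self.timer_start = None
            return [self._take(len(self.buf))]
        return []


sb = SendBuffer(mss=4, timeout=0.2)
print(sb.write(b"ab", now=0.0))             # too little data, nothing sent yet
print(sb.write(b"cdef", now=0.05))          # buffer reached MSS
print(sb.tick(now=0.25))                    # timer expired, leftover flushed
print(sb.write(b"x", now=1.0, push=True))   # PUSH forces an immediate send
```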
Flow control in TCP means keeping the sender from sending too fast, so that the receiver has time to receive. In practice this is done by controlling the window size. To make TCP more efficient, the Nagle algorithm is often used: the sender first sends the first piece of data that arrives in its buffer and buffers whatever arrives afterwards; when the acknowledgment comes back, it sends all the data accumulated in the buffer as one segment, and again waits for the acknowledgment, and so on. If the buffered data reaches half the size of the send window or the maximum segment size, a segment is sent immediately. The receiver behaves similarly: it waits until it has enough space to hold one maximum-size segment, or until half of the receive buffer is free, and only then sends an acknowledgment.
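The sender-side rule of the Nagle algorithm as described above can be sketched as follows. This is a simplified model under stated assumptions: the class `NagleSender` and its methods are hypothetical, only one outstanding "unacknowledged" flag is tracked instead of real sequence numbers, and the threshold is taken as the smaller of the MSS and half the send window.

```python
class NagleSender:
    """Toy model of Nagle's rule: small writes wait while data is unacknowledged."""

    def __init__(self, mss: int, window: int):
        self.mss = mss
        self.window = window
        self.buf = bytearray()
        self.unacked = False          # True while sent data awaits an acknowledgment

    def _send(self) -> bytes:
        n = min(len(self.buf), self.mss)
        segment = bytes(self.buf[:n])
        del self.buf[:n]
        self.unacked = True
        return segment

    def write(self, data: bytes) -> list:
        """Buffer application data; return the segments sent immediately."""
        self.buf += data
        threshold = min(self.mss, self.window // 2)
        sent = []
        # Send if nothing is outstanding, or if enough data has accumulated.
        while self.buf and (not self.unacked or len(self.buf) >= threshold):
            sent.append(self._send())
        return sent

    def on_ack(self) -> list:
        """An acknowledgment arrived: flush whatever accumulated in the meantime."""
        self.unacked = False
        return [self._send()] if self.buf else []


s = NagleSender(mss=1460, window=4096)
print(s.write(b"h"))      # first byte goes out at once
print(s.write(b"e"))      # held back: earlier data is still unacknowledged
print(s.write(b"llo"))    # still held back
print(s.on_ack())         # acknowledgment arrives, buffered bytes leave together
```

The effect is that an application writing one byte at a time produces a few larger segments instead of many tiny ones, at the cost of some added delay.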
Flow control also involves a zero-window persistence timer. Whenever one side of a TCP connection receives a zero-window notification from the other side, it starts the persistence timer. When the timer expires, it sends a zero-window probe segment. This breaks the deadlock that would otherwise occur when the sender keeps waiting for a non-zero window notification that has been lost.
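The zero-window deadlock and the way the persistence timer breaks it can be shown with a tiny time-stepped simulation. Everything here is an assumption made for illustration: the function name, the integer time steps, the receiver reopening its window at step 2, and the simplification that a probe and its reply are never lost.

```python
def zero_window_scenario(window_update_lost: bool, use_persist_timer: bool,
                         persist_timeout: int = 5, max_time: int = 30):
    """Return the time step at which the sender learns the window reopened.

    Returns None if the sender never finds out within max_time (deadlock).
    """
    receiver_window = 0      # receiver has advertised a zero window
    sender_view = 0          # what the sender believes the window to be
    for t in range(1, max_time + 1):
        if t == 2:
            receiver_window = 1000              # application read data, space is free
            if not window_update_lost:
                sender_view = receiver_window   # window update reached the sender
        if use_persist_timer and sender_view == 0 and t % persist_timeout == 0:
            # Timer expired: the probe's reply carries the receiver's current window.
            sender_view = receiver_window
        if sender_view > 0:
            return t
    return None


print(zero_window_scenario(window_update_lost=False, use_persist_timer=False))
print(zero_window_scenario(window_update_lost=True, use_persist_timer=False))
print(zero_window_scenario(window_update_lost=True, use_persist_timer=True))
```

Without the timer, a single lost window update leaves both sides waiting forever; with it, the sender recovers at the next timer expiry.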
TCP congestion control is implemented in a way similar to flow control: when there is too much traffic in the network, the sending rate is limited through a window, which reduces the degree of congestion. But congestion control and flow control are not the same concept.


Origin blog.csdn.net/lzj_linux188/article/details/104500164