[Network] Transport layer protocol - TCP/UDP

Table of contents

Let's talk about the port number

Port number range division

netstat

 pidof

UDP protocol

UDP protocol header format

Features of UDP

Datagram Oriented

UDP buffer

Precautions for using UDP

Application layer protocol based on UDP

TCP protocol

Characteristics of TCP and its purpose

TCP protocol segment format

reliability issues

Acknowledgment response mechanism (ACK)

Timeout retransmission mechanism

How to determine the retransmission timeout

flow control

Connection management - 3 handshakes and 4 waves 

TCP three-way handshake

TCP four-way wave

Verify CLOSE_WAIT state

The second parameter of listen

Verify TIME_WAIT state

The method to solve the bind failure caused by the TIME_WAIT state

sliding window

congestion control

delayed response

piggybacking 

stream-oriented

sticky packet problem

Does UDP have the problem of sticky packets?

TCP exception

Realize reliable transmission with UDP

TCP summary

Protocol based on TCP application layer


The transport layer is responsible for data transfer from the sender to the receiver. TCP/IP has two representative transport layer protocols, TCP and UDP. TCP provides reliable communication, while UDP is often chosen when the application itself wants fine-grained control over the transfer, or for broadcast-style communication. In short, it is important to choose the appropriate transport layer protocol according to the characteristics of the communication.

Let's talk about the port number

The port number (Port) identifies a particular application among the many that may be communicating on a host at the same time. The transport layer protocol uses port numbers to decide which local application a piece of data belongs to, so that data is delivered accurately.

In the TCP/IP protocol, the five-tuple "source IP, source port number, destination IP, destination port number, protocol number" identifies one communication (it can be viewed with netstat -n).

Port number range division

  • 0-1023: well-known port numbers; widely used application layer protocols such as HTTP, FTP, and SSH have fixed port numbers in this range
  • 1024-65535: port numbers dynamically allocated by the operating system; a client program's port number is assigned from this range

Some servers are so commonly used that, by convention, they use the following fixed port numbers:

  • ssh server, using port 22
  • ftp server, using port 21
  • Telnet server, using port 23
  • http server, using port 80
  • https server, using port 443

cat /etc/services lists the well-known port numbers; when writing a program that uses a port number, we should avoid them.

A process can bind multiple port numbers, but a port number can only be bound by one process.

netstat

netstat is an important tool for viewing network status

Common options:

  • -n: show numeric addresses instead of resolving hostnames and service aliases
  • -l: list only sockets in the Listen (listening) state
  • -p: show the name of the program that owns each connection
  • -t: show only TCP-related entries
  • -u: show only UDP-related entries
  • -a: show all sockets (by default, listening sockets are not shown)

pidof

pidof makes it very convenient to look up a server's process id by its process name.

UDP protocol

UDP (User Datagram Protocol) provides connectionless communication on top of IP and has no complex control mechanisms. It simply hands the data it receives from the application to the network as-is, the moment it receives it.

UDP protocol header format

  • 16-bit UDP length: the total length of the entire datagram (UDP header + UDP data)
  • 16-bit checksum: if the checksum is wrong, the datagram is discarded directly

The first 8 bytes are the UDP header, and the data that follows is called the payload. If the application layer sends "Hello", then "Hello" is in the data part.

We can see that UDP uses a fixed-length header, so encapsulation just prepends an 8-byte UDP header to the payload, and de-encapsulation just reads off the first 8 bytes as the header and hands the rest up to the application layer.

The TCP/IP protocol stack is implemented in the kernel, and the kernel is written in C:

struct udp_hdr
{
    unsigned int src_port : 16;
    unsigned int dst_port : 16;
    unsigned int udp_len  : 16;
    unsigned int udp_check : 16;
};

These are the UDP header fields, declared as C bit-fields. Adjacent bit-fields are packed into units of the declared type (unsigned int here), so the four 16-bit fields occupy bit positions 0-31 of two 32-bit words, and the UDP header totals 8 bytes.

Features of UDP

The process of UDP transmission is similar to sending a letter

  • No connection: Know the IP and port number of the opposite end and transmit directly without establishing a connection
  • Unreliable: no confirmation mechanism, no retransmission mechanism, if I cannot send to the other party due to network failure, the UDP protocol layer will not return any error information to the application layer
  • Datagram-oriented: cannot flexibly control the number and quantity of reading and writing data

Datagram Oriented

Whatever message the application layer hands to UDP, UDP sends it as-is, neither splitting nor merging it.

Using UDP to transmit 100 bytes of data: if the sender calls sendto once to send 100 bytes, the receiver must also call the corresponding recvfrom once to receive all 100 bytes; it cannot call recvfrom 10 times in a loop, receiving 10 bytes each time.

UDP buffer

  • UDP has no real sending buffer. Calling sendto hands the data directly to the kernel, which passes it to the network layer for subsequent transmission.
  • UDP does have a receiving buffer, but it cannot guarantee that the order of received UDP packets matches the order in which they were sent, and if the buffer is full, newly arriving UDP data is discarded.

UDP sockets can both read and write; this capability is called full duplex.

Precautions for using UDP

Note that the UDP header has a 16-bit length field, which means the maximum size of a UDP datagram is 64K (including the UDP header). 64K is a very small number in today's Internet environment; if the data we need to transmit exceeds it, we have to split it into packets manually at the application layer, send them one by one, and reassemble them manually at the receiving end.

Application layer protocol based on UDP

  • NFS: Network File System
  • TFTP: Trivial File Transfer Protocol
  • DHCP: Dynamic Host Configuration Protocol
  • BOOTP: Boot protocol (for diskless device boot)
  • DNS: Domain Name System

TCP protocol

The full name of TCP is "Transmission Control Protocol". As the name suggests, it exercises detailed control over data transmission. TCP differs greatly from UDP: it implements a variety of control functions during data transmission. It can retransmit packets that are lost, and it can reorder packets that arrive out of order; UDP has none of this. In addition, as a connection-oriented protocol, TCP sends data only after confirming that the communication peer exists, which avoids wasting traffic.
Based on these mechanisms, TCP achieves highly reliable communication even on a connectionless network layer such as IP.

Characteristics of TCP and its purpose

In order to achieve reliable transmission over IP datagrams, many things need to be considered. TCP realizes reliable transmission through mechanisms such as checksum, sequence number, confirmation response, retransmission control, connection management, and window control.

TCP protocol segment format

  • Source/destination port number: Indicates which process the data comes from and which process it goes to
  • 32-bit sequence number / 32-bit acknowledgment number:
    • Sequence number (32 bits): marks the position of the data being sent; it increases by the number of bytes sent each time. It does not start from 0 or 1, but from a random initial value generated when the connection is established, which is delivered to the receiving host in the SYN packet.
    • Acknowledgment number (32 bits): the sequence number of the data that should be received next; equivalently, all data before this number has been received. After receiving this acknowledgment, the sender can consider all data before this sequence number to have been received normally.
  • 4-bit header length (data offset): 4 bits long, in units of 4 bytes, so the maximum header length is 15*4 = 60 bytes. This field tells you at which byte of the TCP segment the data part begins, which is the same as giving the length of the TCP header. Without the options field the TCP header is 20 bytes, so the minimum value of this field is 5; conversely, a value of 5 means the first 20 bytes of the segment are the TCP header and the rest is TCP data.
  • 6 reserved bits: This field is mainly used for future expansion.

reliability issues

Part of the reliability of TCP is reflected in the header field.

1. What is unreliable?

Packet loss, out-of-order delivery, checksum failure, and so on.

2. How to confirm whether a message is lost or not?

If we receive a response, we confirm that it is not lost; otherwise, we are not sure!

For example: you send a message to your friend, "Have you eaten?" Can you confirm he received it? Actually you cannot. But if he replies "Yes, I had dumplings!", then through his reply you can confirm that your message was received. As long as we get a response, we know for certain that the other party received the message we just sent.

In a long-distance interaction there will always be a latest piece of data that has no response yet, but as long as every message we send has a corresponding response, we can consider it received by the other party. This idea is the basis of TCP's reliability, and the mechanism is called the acknowledgment response mechanism.

Acknowledgment response mechanism (ACK)

TCP achieves reliable data transmission through positive acknowledgment responses (ACK). When the sender sends the data, it will wait for the confirmation response from the peer. If there is an acknowledgment response, it means that the data has successfully reached the peer, otherwise, the data may be lost.

Each segment carries a sequence number, which numbers every byte of the data sent, in order. The receiver looks at the sequence number and data length in the TCP header of the data it receives, and returns the sequence number it should receive next as the acknowledgment. In this way, through sequence numbers and acknowledgments, TCP implements reliable transmission.

(Figure: sent data, with sequence numbers and acknowledgment numbers)

When host A sends bytes 1-1000, host B, having received all 1000 bytes, replies with acknowledgment number 1001. When host A sees the acknowledgment 1001, it knows that host B successfully received the first 1000 bytes and it can continue sending the data that follows. Every ACK carries an acknowledgment number, which tells the sender: this is the data I have received; start from here next time.

So why are two sets of numbers needed, both a sequence number and an acknowledgment number?

Because TCP is full-duplex, I can receive messages while sending you messages.

Example: the client sends the server a message with sequence number 10; the server wants to both acknowledge it and send a message of its own, so its reply must carry an acknowledgment number (for the acknowledgment) and its own sequence number (so its own message can be sent reliably). Both fields are therefore needed at the same time.

Timeout retransmission mechanism

  • After host A sends data to B, the data may not reach host B due to network congestion and other reasons
  • If host A does not receive an acknowledgment from B within a certain time interval, it will retransmit 

However, host A's failure to receive an acknowledgment may also be because the ACK itself was lost.

That is, host B's acknowledgment is lost on the way back due to network congestion or other reasons and never reaches host A. Host A waits for a while, receives no acknowledgment within the timeout interval, and resends the data; host B then acknowledges it a second time. Since host B has already received bytes 1-1000, it discards the duplicate when the same data arrives again. How does host B know the data is duplicated? Through the sequence number: if the sequence number in the TCP header has already been received, the data is a duplicate. So another function of the sequence number is to deduplicate packets.

How to determine the retransmission timeout

The retransmission timeout is the time interval to wait for an acknowledgment before retransmitting data. If no acknowledgment has arrived when it expires, the sender resends the data. So how is this time determined?

  • Ideally, find a minimum time that guarantees "the acknowledgment will return within this time".
  • However, that time varies with the network environment.
  • If the timeout is set too long, overall retransmission efficiency suffers;
  • If the timeout is set too short, duplicate packets may be sent frequently.

In order to guarantee high-performance communication in any environment, TCP dynamically calculates this timeout.

In Linux (as in BSD Unix and Windows), the timeout is managed in units of 0.5 seconds, so the retransmission timeout is an integer multiple of 0.5 seconds. Since the round-trip time of the first data packet is not yet known, its retransmission timeout is generally set to about 6 seconds.

If no acknowledgment arrives after the data is resent, it is sent again, and the waiting time is extended exponentially: 2 times, then 4 times, and so on.

Data is not retransmitted indefinitely, though. After a certain number of retransmissions with no acknowledgment, TCP judges that the network or the peer host is abnormal, forcibly closes the connection, and notifies the application that communication was terminated abnormally.

flow control

When to send data, how much to send, what to do when something goes wrong, whether to apply strategies to improve efficiency: all of this is decided by the TCP in the OS. That is why TCP is called the Transmission Control Protocol.

What if the client sends faster than the server can receive? Once the server's receive buffer is full, excess packets can only be discarded, yet sending them still consumed network resources. So the sender should size its messages to the peer's receiving capacity, which means the client must be told how much the server can currently accept: the amount of free space left in the server's receive buffer. The server carries this number in a field of the TCP header of its responses: the 16-bit window size.

  • The receiver puts the amount of buffer space it can still receive into the "window size" field of the TCP header and sends it back in its ACK segments;
  • The larger the window size field, the higher the network throughput;
  • Once the receiver finds its buffer nearly full, it sets the window size to a smaller value and notifies the sender;
  • On receiving this window, the sender slows down its sending rate;
  • If the receiver's buffer is completely full, the window is set to 0; the sender then stops sending data, but it still sends window probe segments so the receiver can report its window size again;
  • The maximum value of a 16-bit field is 65535, so is the maximum TCP window 65535 bytes?
    In fact, the TCP options (up to 40 bytes of header options) include a window scale factor M, and the actual window size is the value of the window field shifted left by M bits.

 

If the receiver's buffer is full, the sender does not send data; instead it periodically sends window probe segments to learn the receiver's current receiving capacity.

Connection management - 3 handshakes and 4 waves 

TCP provides a connection-oriented communication transport. Connection-oriented refers to the preparation of both ends of the communication before the start of data communication.

Establishing and tearing down a connection normally requires at least 7 segments in total: the connection is established only after the three-way handshake completes, and only then can formal communication begin; after communication ends, four waves are needed to disconnect.

From the server's perspective, some received segments establish a connection, some tear it down, and some carry data, so segments have types, and the server must be able to distinguish them.

  1. SYN: set to 1 whenever the segment is a request to establish a connection
  2. FIN: set to 1 when the segment requests to close the connection; SYN and FIN are not set to 1 at the same time
  3. ACK: indicates the segment acknowledges previously received data; TCP requires this bit to be 1 in every segment except the initial SYN

  4. PSH: when 1, the received data should be delivered to the upper application layer immediately; when 0, it may be buffered first instead of being delivered immediately

  5. URG: indicates the segment contains data that needs urgent processing. TCP uses sequence numbers precisely so that data is delivered to the application in order; but what if some data has higher priority yet a later sequence number? To have such data processed first, the URG flag is set to 1. It works together with the 16-bit urgent pointer, which marks where the urgent data lies within the segment's data; so not all of the segment's data is urgent, only the byte the urgent pointer indicates.

    1. It is generally used when communication is interrupted midway, for example when you click the stop button in a web browser, or press Ctrl+C in telnet: a packet with URG set to 1 is sent. The urgent pointer is also used as a marker indicating a break in the data stream.

  6. RST: the reset flag; explained below.

tcp three-way handshake

Why the three-way handshake?

Because tcp is connection-oriented, a connection must be established before communication.

The client's state transitions in the three-way handshake (the client is the party that actively initiates the connection):

  1. [CLOSED -> SYN_SENT] The client calls connect and sends a SYN segment;
  2. [SYN_SENT -> ESTABLISHED] If connect succeeds, the client enters the ESTABLISHED state and can start reading and writing data.

The server-side state transition in the three-way handshake:

  1. [CLOSED -> LISTEN] The server enters the LISTEN state after calling listen, waiting for clients to connect;
  2. [LISTEN -> SYN_RCVD] On receiving a connection request (SYN segment), it puts the connection into the kernel's pending queue and sends a SYN+ACK segment to the client;
  3. [SYN_RCVD -> ESTABLISHED] Once the server receives the client's acknowledgment, it enters the ESTABLISHED state and can read and write data.

Why 3 times? Why not 1, 2, or 4?

One handshake -- no:

With a single handshake, the client only needs to send a SYN to the server. If a client sends SYNs to the server in a loop, the server must spend real resources maintaining each of those connections, so a single machine can make the server waste a huge amount of resources: a SYN flood. One handshake is therefore definitely not enough.

Two handshakes -- no:

With two handshakes, the client sends a SYN and the server replies with an ACK. This has essentially the same effect as one handshake: if the client keeps sending a SYN flood, the server establishes and maintains a connection as soon as it replies with the ACK, while the client simply discards the server's ACK. The result is very similar to a single handshake.

Three-way handshake -- yes:

One and two handshakes fail because the server is the first to believe the connection is established. The third handshake gives the server a final chance to confirm. With three handshakes, after the server replies SYN+ACK, the client also establishes and maintains the connection on its own side before returning the final ACK, so the client consumes resources just like the server: the server pulls the client into the water with it. Even if the last segment of the handshake is lost or forged, the cost can thus be shifted onto the client.

RST: if the last ACK is lost, the client considers the connection established while the server is still waiting for that ACK (and will retransmit SYN+ACK after a while). If, during this period, the client starts sending data, the server is confused: the connection is not established yet, so where did this data come from? The server then responds to the client's data with a segment whose RST flag is set to 1, telling the client to close this connection and reconnect.

tcp waved four times

The client's state transitions in the four-way wave (the client here is the party that actively disconnects):
  • [ESTABLISHED -> FIN_WAIT_1] When the client actively calls close, it sends a FIN segment to the server and enters FIN_WAIT_1;
  • [FIN_WAIT_1 -> FIN_WAIT_2] The client receives the server's acknowledgment of its FIN, enters FIN_WAIT_2, and waits for the server's own FIN;
  • [FIN_WAIT_2 -> TIME_WAIT] The client receives the server's FIN, enters TIME_WAIT, and sends the last ACK;
  • [TIME_WAIT -> CLOSED] The client waits 2MSL (MSL = Maximum Segment Lifetime) before entering the CLOSED state.

State transition of the server in four waves:

  • [ESTABLISHED -> CLOSE_WAIT] When the client actively closes the connection (calls close), the server receives the FIN segment, returns an acknowledgment, and enters CLOSE_WAIT;
  • [CLOSE_WAIT -> LAST_ACK] CLOSE_WAIT means the server is preparing to close the connection (remaining data still needs to be processed); when the server actually calls close, it sends a FIN to the client and enters LAST_ACK, waiting for the final ACK (the client's acknowledgment of the server's FIN);
  • [LAST_ACK -> CLOSED] The server receives the ACK for its FIN and fully closes the connection.

Why four waves?

The FIN sent by each side must be acknowledged by the other side, so two FINs plus their two ACKs make four waves. If the client and server happen to want to disconnect at the same time, the server can send its FIN together with the ACK it replies, and the whole teardown becomes three waves.

Verify CLOSE_WAIT state

The CLOSE_WAIT state appears when the peer has closed its end of the connection but our side has not called close; our side then stays in CLOSE_WAIT. Note that the server below never closes the sockets it accepts.

#pragma once

#include <iostream>
#include <string>
#include <cstring>
#include <sys/socket.h>
#include <sys/stat.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <pthread.h>
#include <unistd.h>
#include <signal.h>
class ServerTcp
{
public:
    ServerTcp(uint16_t port,const std::string& ip = "")
        :port_(port),ip_(ip),listenSock_(-1)
        {
            quit_ = false;
        }
    ~ServerTcp()
    {
        if(listenSock_>= 0)
        {
            close(listenSock_);
        }
    }
public:
    void init()
    {
        // 1. Create the listening socket
        listenSock_ = socket(AF_INET,SOCK_STREAM,0);
        if(listenSock_ < 0) exit(1);
        // 2. Bind the address
        // fill in the server address information
        struct sockaddr_in local;
        memset(&local,0,sizeof(local));
        local.sin_family = AF_INET;
        local.sin_port = htons(port_);
        ip_.empty()?(local.sin_addr.s_addr = INADDR_ANY):(inet_aton(ip_.c_str(),&local.sin_addr));
        if(bind(listenSock_,(const struct sockaddr*)&local,sizeof(local)) < 0) exit(2);
        // 3. Listen, so that clients can connect to us
        if(listen(listenSock_,2) < 0) exit(3);
    }
    void loop()
    {
        signal(SIGCHLD,SIG_IGN);//only linux
        while(!quit_)
        {
            sleep(1);
            struct sockaddr_in peer;
            socklen_t len = sizeof(peer);

            int serviceSock = accept(listenSock_,(struct sockaddr *)&peer,&len);
            if(quit_) break;
            if(serviceSock<0)
            {
                // failed to accept a connection
                std::cerr<<"accept error........." <<std::endl;
                continue;
            }
            std::cout<<"accepted a new connection"<<std::endl;
        }
    }

    bool quitServer()
    {
        quit_ = true;
        return true;
    }

private:
    int listenSock_;
    uint16_t port_;
    std::string ip_;
    bool quit_; // flag for a safe exit
};
#include "server.hpp"

static void Usage()
{
    std::cout << "Usage:\n\t ./server port" <<std::endl;
    exit(4);
}
int main(int argc,char* argv[])
{
    if(argc != 2) Usage();
    ServerTcp svr(atoi(argv[1]));
    svr.init();
    svr.loop();
    return 0;
}

While it is running, we can connect to the server from another machine (using a single machine is not recommended here: if the server and client are the same machine, the query will show two entries for each connection, which is inconvenient to read).

After starting the service, connect to the server on another machine

At this time:

We find that the foreign address of the connection on port 8082 is the .81 machine, which is indeed our other machine, and its status is ESTABLISHED. So we reach a conclusion: even if we never call accept, the three-way handshake still succeeds.

What happens if we let the .81 machine open 4 connections at the same time?

Checking the status, we find that one of the connections is in SYN_RECV. What is this state?

SYN_RECV means the server has received the connection request but has not yet completed the handshake for it. So why can't the fourth client's three-way handshake complete? This is related to the second parameter of listen.

The second parameter of listen

Implement the following test based on the ServerTcp just written: set the second parameter of listen to 2 and do not call accept.
Start 3 clients and connect them to the server at the same time; netstat shows everything is normal. But when the 4th client is started, the server-side status of the 4th connection is abnormal.

The second parameter of listen sets the length of the underlying full-connection queue. The rule is: with backlog n, the server can hold at most n+1 established connections that have not yet been accepted. With backlog 2, that means at most 3 clients can be connected to us at the same time without accept being called.

The clients' status is normal, but the server side shows SYN_RECV instead of ESTABLISHED.
This is because the Linux kernel protocol stack manages a TCP connection with two queues:
  1. the semi-connection queue (holds requests in the SYN_SENT / SYN_RECV states)
  2. the full-connection queue (the accept queue; holds connections that are established but that the application has not yet taken away with accept)
The length of the full-connection queue is governed by the second parameter of listen. When the full-connection queue is full, new connections cannot enter the ESTABLISHED state. As the experiment above shows, the queue length is the second parameter of listen plus 1.

When we close the first client and check the status again, we find its server-side connection has changed to CLOSE_WAIT. Combined with the four-way-wave flow chart, this means the four-way wave was not completed correctly.

summary:

A large number of CLOSE_WAIT connections on a server means the server is not closing its sockets correctly, so the four-way wave is never completed. This is a bug; adding the corresponding close call fixes it.

When all clients are closed, every connection's state becomes CLOSE_WAIT.

Verify TIME_WAIT state

Now do a test: first start the server, then start the client, then terminate the server with Ctrl-C and immediately run the server again. The result is that bind fails.

This is because although the server application has terminated, the connection at the TCP protocol layer is not yet completely released, so the same server port cannot be bound and listened on again right away.

  • The TCP protocol stipulates that the party that actively closes the connection must enter the TIME_WAIT state and wait for two MSLs before returning to CLOSED
  • Because we terminated the server with Ctrl-C, the server is the party that actively closed the connection, and while in TIME_WAIT it cannot listen on the same port again
  • RFC 1122 specifies MSL as 2 minutes, but each operating system implements it differently; the default on CentOS 7 is 60s
Why is the TIME_WAIT duration 2 MSL?
  1. MSL is the maximum lifetime of a TCP segment, so holding TIME_WAIT for 2 MSL ensures that any unreceived or late segments in both transmission directions have disappeared from the network (otherwise, if the server restarted immediately, a new process might receive late data meant for the old connection and misinterpret it);
  2. It also theoretically guarantees that the final ACK arrives reliably (if the last ACK is lost, the server, still in LAST_ACK, resends a FIN; although the client process is gone, its TCP connection is still held in TIME_WAIT, so the ACK can be resent).

The method to solve the bind failure caused by the TIME_WAIT state

Not allowing the server to listen again until its TCP connection is completely released may be unreasonable in some scenarios:

  • The server needs to handle a very large number of client connections (each connection's lifetime may be short, but there are many client requests per second).
  • If the server side actively closes connections (for example, cleaning up inactive clients), a large number of TIME_WAIT connections accumulate on the server.
  • Each such connection occupies a communication five-tuple (source IP, source port, destination IP, destination port, protocol). The server's IP, port, and protocol are fixed, so if a new client connection's IP and port collide with a five-tuple still held by a TIME_WAIT connection, problems arise.

solve:

Use setsockopt() to set the socket's SO_REUSEADDR option to 1. This allows multiple sockets bound to the same port to be created (as long as their bindings do not conflict), so the listening port can be re-bound while the old connection is still in TIME_WAIT.
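
A minimal sketch of this fix in Python (the function name and default backlog are illustrative): the key point is that SO_REUSEADDR must be set before bind().

```python
import socket

# Set SO_REUSEADDR on the listening socket *before* bind(), so the
# server can re-bind its port even while old connections linger in
# TIME_WAIT. Names and defaults here are examples, not a fixed API.
def make_listen_socket(port, backlog=5):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(), or it has no effect on this bind.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("0.0.0.0", port))
    s.listen(backlog)
    return s
```

With this option set, restarting the server immediately after Ctrl-C no longer fails with "Address already in use".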

 

sliding window

If TCP acknowledged every data segment and sent the next segment only after that ACK arrived, transmission would be correct, but performance would be poor, especially when the round-trip time is long.

To solve this problem, TCP introduces the concept of a sliding window, which limits the performance loss even when round-trip times are long. When acknowledgments cover a larger unit instead of every single segment, the total transfer time shrinks greatly: the sending host does not wait for an acknowledgment after each segment, but keeps sending.

  • The window size is the maximum amount of data that can be sent without waiting for an acknowledgment. In the figure above, the window size is 4000 bytes (four segments).
  • The first four segments can be sent directly, without waiting for any ACK.
  • After the first ACK arrives, the sliding window slides forward and the fifth segment is sent, and so on.
  • To maintain this sliding window, the operating system kernel keeps a send buffer recording which data has not yet been acknowledged; only acknowledged data can be deleted from the buffer.
  • The larger the window, the higher the network throughput.

So if a packet is lost, how is it retransmitted? There are two cases to consider.

Case 1: the data arrived, but the ACK was lost. This does not matter, because ACKs are cumulative: a later ACK confirms everything before it.

 Case 2: The data packet is directly lost

  • When a segment is lost, the sender keeps receiving ACKs like 1001, as if the receiver were reminding it, "I want 1001"
  • If the sending host receives the same "1001" ACK three times in a row, it resends the corresponding data, 1001-2000
  • After the receiver gets 1001-2000, the next ACK it returns is 7001, because it had already received 2001-7000 and buffered it in the kernel's receive buffer on the receiving end

This mechanism is called "fast retransmit" (also "high-speed retransmission control"). Note that the sliding window does not only slide to the right; its size can also grow or shrink.
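
The duplicate-ACK logic above can be sketched as a toy model (the function name and threshold handling are illustrative, not TCP's actual implementation): the sender counts repeats of the same ACK and retransmits the missing segment after the third duplicate, without waiting for the retransmission timer.

```python
# Toy model of fast retransmit: count duplicate ACKs (repeats after the
# first) and retransmit the segment starting at that ACK number once
# the duplicate count reaches the threshold.
def fast_retransmit(acks, dup_threshold=3):
    retransmitted = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                retransmitted.append(ack)  # resend bytes starting at `ack`
        else:
            last_ack, dup_count = ack, 0
    return retransmitted

# Receiver keeps asking for 1001 while 2001..7000 arrive out of order:
print(fast_retransmit([1001, 1001, 1001, 1001, 7001]))  # -> [1001]
```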

When does start_index move right?

start_index equals the latest acknowledgment number; end_index = start_index + window size.

congestion control

Although TCP's sliding window lets it send large amounts of data efficiently and reliably, blasting out a large amount of data right at the start can still cause problems. There are many computers on the network, and the network may already be congested; sending a large amount of data without knowing the current network state is likely to make things worse.

Therefore, in order to solve this problem, TCP introduces a slow start mechanism, which sends a small amount of data first, explores the path, finds out the current network congestion status, and then decides how fast to transmit data.

  • A concept called congestion window is introduced here
  • When sending starts, define the congestion window size as 1
  • Each time an ACK response is received, the congestion window is incremented by 1
  • Every time a data packet is sent, compare the congestion window with the window size fed back by the receiving host, and take the smaller value as the actual sending window

Therefore, the amount of data that the sender can send to the receiver at one time = min (the receiving capacity of the other party, the congestion window of the network) 

Therefore, the size of the sliding window = min (the remaining value of the opponent's window size, the congestion window of the network)

So end_index = start_index + min(window size, congestion window)

A congestion window like the one above grows exponentially. "Slow start" only means it starts small; its growth is actually very fast.
  • To keep it from growing that fast indefinitely, the congestion window cannot simply keep doubling
  • A threshold called the slow start threshold (ssthresh) is introduced
  • Once the congestion window exceeds this threshold, it no longer grows exponentially but grows linearly

  • When TCP starts, the slow start threshold equals the maximum window size
  • On each timeout retransmission, the slow start threshold becomes half of the current congestion window, and the congestion window is reset to 1

A small amount of packet loss just triggers timeout retransmission; heavy packet loss is taken to mean the network is congested. When TCP communication starts, network throughput gradually rises, and as the network becomes congested, throughput drops immediately. Congestion control is ultimately a compromise: TCP wants to deliver data to the peer as fast as possible while avoiding putting too much pressure on the network.
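
The growth rules above can be simulated with a toy model (units are "segments"; the initial ssthresh of 16 is just an example value, not a real kernel default): below the slow start threshold the congestion window doubles each round trip, above it the window grows linearly, and a timeout halves ssthresh and resets the window to 1.

```python
# Toy slow start / congestion avoidance model following the text.
def step(cwnd, ssthresh, timeout=False):
    if timeout:
        return 1, max(cwnd // 2, 2)               # reset cwnd, halve ssthresh
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh  # exponential growth phase
    return cwnd + 1, ssthresh                     # linear growth phase

cwnd, ssthresh = 1, 16
trace = [cwnd]
for _ in range(6):                                # six round trips
    cwnd, ssthresh = step(cwnd, ssthresh)
    trace.append(cwnd)
print(trace)           # -> [1, 2, 4, 8, 16, 17, 18]

cwnd, ssthresh = step(cwnd, ssthresh, timeout=True)
print(cwnd, ssthresh)  # -> 1 9
```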

delayed response

If the host receiving the data returns an ACK immediately, the window it advertises at that moment may be relatively small.

  • Suppose the receive buffer is 1M and 500K of data arrives at once; if the receiver responds immediately, the advertised window is 500K
  • But the receiving application may actually process data very fast, consuming the 500K from the buffer within 10ms
  • In that case the receiver is far from its limit; even with a larger window it could keep up
  • If the receiver waits a while before answering, say 200ms, the window it then advertises is the full 1M

We must remember that the larger the window, the greater the network throughput and the higher the transmission efficiency. Our goal is to maximize the transmission efficiency while ensuring that the network is not congested

So can every packet be acknowledged with a delay? Certainly not:

  1. Quantity limit: Respond every N data packets (generally take this)
  2. Time limit: answer once when the maximum delay time is exceeded
The specific count and timeout vary by operating system; typically N is 2 and the timeout is 200ms.
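
The two limits can be sketched as a single predicate (a simplification, with elapsed time passed in explicitly rather than tracked by a timer): acknowledge once every N segments, or once the delay since the first unacknowledged segment exceeds the timeout.

```python
# Sketch of the two delayed-ACK limits: count limit (N = 2) and
# time limit (200 ms), as described in the text.
def should_ack(pending_segments, elapsed_ms, n=2, timeout_ms=200):
    return pending_segments >= n or elapsed_ms >= timeout_ms

print(should_ack(pending_segments=1, elapsed_ms=50))   # -> False (keep waiting)
print(should_ack(pending_segments=2, elapsed_ms=50))   # -> True  (count limit)
print(should_ack(pending_segments=1, elapsed_ms=200))  # -> True  (time limit)
```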

piggybacking 

On top of delayed acknowledgment, notice that in many cases the client and server also "send and receive" at the application layer: the client says "How are you" to the server, and the server replies "Fine, thank you".
The ACK can then hitch a ride, returning to the client together with the server's "Fine, thank you" response.

stream-oriented

Create a TCP socket, and create a send buffer and a receive buffer in the kernel at the same time

  • When calling write, the data will be written to the send buffer first
  • If the data to be sent is long, it is split across multiple TCP segments
  • If it is short, it first waits in the buffer until the buffer is nearly full, or until some other suitable time to send
  • When receiving data, the data also arrives at the receiving buffer of the kernel from the network card driver
  • Then the application can call read to get data from the receive buffer
  • On the other hand, a TCP connection has both a sending buffer and a receiving buffer, so for this connection, data can be read or written. This concept is called full-duplex

Due to the existence of the buffer, the reading and writing of the TCP program does not need to match one by one, for example:

  • When writing a 100-byte data, you can call write once to write 100 bytes, or you can call write 100 times to write 1 byte each time
  • When reading 100 bytes of data, there is no need to consider how to write it, that is, you can read 100 bytes at a time, or you can read one byte at a time, repeating 100 times.

The application layer simply copies data into the send buffer, and TCP sends it according to its own rules. When and in what chunks the data is written has nothing to do with when and in what chunks it is read; sending and receiving are independent of any message format. This is what "byte-stream oriented" means.

TCP is byte-oriented and doesn't care about any data format at all. But to use this data correctly, it must have a specific format. This format is processed by the application layer (to ensure that a complete message is read)
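
The read/write mismatch above can be demonstrated with a pair of connected local stream sockets (socket.socketpair() creates a Unix-domain stream pair, standing in here for a TCP connection): 100 one-byte writes on one end come back as one continuous byte stream on the other.

```python
import socket

# The reader sees bytes, not the sender's write boundaries.
s1, s2 = socket.socketpair()   # connected pair of local stream sockets
for _ in range(100):
    s1.send(b"x")              # 100 separate 1-byte writes
s1.close()

data = bytearray()
while True:
    chunk = s2.recv(4096)      # read without regard to how it was written
    if not chunk:
        break
    data += chunk
s2.close()
print(len(data))  # -> 100
```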

sticky packet problem

  • The "packet" in the sticky packet problem refers to the data packet at the application layer
  • In the protocol header of TCP, there is no such field as "packet length" like UDP, but there is a field such as sequence number
  • From the perspective of the transport layer, TCP comes one by one, and puts them in the buffer according to the sequence number.
  • From the perspective of the application layer, what you see is only a series of continuous byte data
  • Then the application program sees such a series of byte data, and does not know which part to start from, it is a complete application layer data packet

Avoid the problem of sticky packets: clarify the boundaries between packets

  • For fixed-length messages, read a fixed size every time
  • For variable-length messages, a field at the start of the header can give the total packet length, so the end of the packet is known
  • For variable-length messages, an explicit delimiter between packets also works (as long as the delimiter cannot appear in the payload)
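
The length-field approach can be sketched as follows (the frame format, a 4-byte big-endian length prefix, is an example choice): the sender prefixes each message with its length, and the receiver splits the byte stream back into complete messages.

```python
import struct

# Length-prefixed framing: 4-byte big-endian length, then the payload.
def frame(payload: bytes) -> bytes:
    return struct.pack("!I", len(payload)) + payload

def unframe(stream: bytes):
    """Split a byte stream back into complete messages."""
    messages, offset = [], 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        if offset + 4 + length > len(stream):
            break  # incomplete message: wait for more bytes
        messages.append(stream[offset + 4 : offset + 4 + length])
        offset += 4 + length
    return messages

stream = frame(b"hello") + frame(b"world")  # two messages "stuck" together
print(unframe(stream))  # -> [b'hello', b'world']
```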

Does UDP have the problem of sticky packets?

No, it does not.

  • For the UDP protocol, the UDP header itself carries the datagram length, and UDP delivers data to the application layer one datagram at a time, so there are clear data boundaries
  • From the perspective of the application layer, when using UDP, either a complete UDP message is received, or it is not received, and there will be no "half" situation

TCP exception

Process termination: terminating a process releases its file descriptors, so a FIN is still sent; this is no different from a normal close.

Machine restart: Same as process termination

Machine power-off / network cable unplugged: the peer still believes the connection exists. Once the peer performs a write and discovers the connection is gone, it resets the connection. Even without writes, TCP has a built-in keep-alive timer that periodically asks whether the other side is still there, and releases the connection if it is not.

In addition, some protocols of the application layer also have some such detection mechanisms. For example, in the HTTP long connection, the status of the other party will be detected periodically.

Realize reliable transmission with UDP

UDP can borrow TCP's reliability mechanisms and implement similar logic at the application layer.

For example:

  • Introduce serial numbers to ensure data order
  • Introduce confirmation response to ensure that the peer has received the data
  • Introduce timeout retransmission, if there is no response after a period of time, the data will be resent.
  • ................
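
The first three points can be sketched as a stop-and-wait sender over real UDP on loopback (illustrative only: the 1-byte sequence number, fixed retry count, and echo-style ACK are simplifications; a real implementation also needs larger sequence numbers, reordering logic, and connection state).

```python
import socket
import threading

# Each datagram carries a 1-byte sequence number; the sender retransmits
# on timeout until the receiver echoes that number back as an ACK.
def send_reliable(sock, addr, seq, payload, retries=5, timeout=0.2):
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(bytes([seq]) + payload, addr)  # data + sequence number
        try:
            ack, _ = sock.recvfrom(16)
            if ack and ack[0] == seq:              # acknowledgment response
                return True
        except socket.timeout:
            continue                               # timeout retransmission
    return False

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))
addr = recv_sock.getsockname()

def receiver():
    data, peer = recv_sock.recvfrom(2048)
    recv_sock.sendto(bytes([data[0]]), peer)       # ACK: echo the seq number

t = threading.Thread(target=receiver)
t.start()

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ok = send_reliable(send_sock, addr, seq=0, payload=b"hello")
t.join()
print(ok)  # -> True
```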

Google developed QUIC on top of UDP. Document link: (English) RFC 9000: QUIC: A UDP-Based Multiplexed and Secure Transport (quicwg.org)

QUIC (Quick UDP Internet Connections) is an experimental network transport protocol proposed by Google, sitting at the transport layer of the OSI model. QUIC aims to fix TCP's shortcomings and eventually replace it: reducing the amount of data transmitted, shortening connection-establishment latency, and speeding up web page transfer.

 Features of QUIC

  1. Low-latency connection establishment
  2. Multiplexing
  3. No head-of-line blocking
  4. Flexible congestion control
  5. Connection migration
  6. Authentication and encryption of the packet header and payload
  7. FEC (forward error correction)
  8. Reliable transmission
  9. Others

TCP summary

TCP's strategy for ensuring reliability:

  • Checksum
  • Sequence numbers (in-order arrival)
  • Acknowledgment response
  • Timeout retransmission
  • Connection management
  • Flow control
  • Congestion control

Strategies to improve performance:

  • Sliding window
  • Fast retransmit
  • Delayed acknowledgment
  • Piggybacking

Protocol based on TCP application layer

  • HTTP/HTTPS
  • SSH
  • Telnet

etc.


Origin blog.csdn.net/qq_58325487/article/details/129400757