Common pitfalls in Linux network programming

TCP/IP protocol issues

1. Delayed ACK

When the protocol stack receives TCP data, it does not necessarily send an ACK immediately. It tends to hold the ACK until a timer expires or a special condition is met. In the Linux implementation, the ACK is sent right away if any of the following holds:

1) more than one full-sized frame of data has been received, or
2) the connection is in quick-ACK (fast reply) mode, or
3) out-of-order packets have arrived, or
4) there is enough data in the receive window.

If the receiver has data of its own to send back, the ACK is piggybacked on that data. When none of the above applies, the receiver delays the ACK by about 40 ms before sending it.

2. Nagle's algorithm

Nagle's algorithm requires that:

  • A TCP connection may have at most one unacknowledged small packet in flight. Until that packet is acknowledged, no other small packets may be sent.
  • While waiting for that ACK, TCP collects any subsequent small writes. When the ACK arrives, TCP merges the collected data into one larger packet and sends it.

Now consider how delayed ACK and Nagle's algorithm interact when the RTT is very small. If the RTT is 1 ms, then after sending a small packet such as the 'l' character, the sender may have to wait about 1 ms + 40 ms + 1 ms before its next small packet goes out, because the receiver is holding back the very ACK that Nagle's algorithm is waiting for. Transmission efficiency drops sharply. The usual solutions are:

  • Use writev to gather the pieces into a single write.
  • Copy the pieces (for example, the first 4 bytes and the last 396 bytes) into one application-level buffer, then call write once.
  • Set TCP_NODELAY to turn off Nagle's algorithm.
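
The first and third options can be sketched roughly as follows. This is only a minimal illustration: the helper names send_header_and_body and disable_nagle are made up for this example, while writev, setsockopt and TCP_NODELAY are the standard calls.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: hand a header and a body to the kernel in ONE call,
 * so they are not submitted as two separate small writes. */
ssize_t send_header_and_body(int fd, const void *hdr, size_t hdr_len,
                             const void *body, size_t body_len)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = body_len;

    /* writev may write fewer bytes than requested; callers should check. */
    return writev(fd, iov, 2);
}

/* Hypothetical helper: turn off Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int disable_nagle(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```

Gathering the write is usually the gentler fix, since it keeps Nagle's algorithm available for the rest of the connection. TCP_NODELAY affects every write on that socket.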

read/write functions

1. read and write are blocking by default and can be switched to non-blocking. Each socket has its own send buffer and receive buffer in the kernel. write only copies data into the kernel send buffer, and read only copies data out of the kernel receive buffer. When the data actually goes out onto the network is decided by the kernel, and the application cannot tell.

2. In non-blocking mode, calling read when the kernel receive buffer is empty, or write when the send buffer is full, returns -1 immediately with errno set to EAGAIN or EWOULDBLOCK (on Linux the two values are equal).
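
As a minimal sketch of point 2, the helpers below put a descriptor into non-blocking mode and separate "no data yet" from a real error. The names set_nonblocking and try_read, and the -2 return code, are assumptions made for this example only.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper around read:
 *   > 0  number of bytes read
 *     0  end of file (for a socket: the peer closed the connection)
 *    -1  a real error, see errno
 *    -2  nothing to read right now (EAGAIN / EWOULDBLOCK) */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

Both constants are checked because portable code should not rely on them being equal, even though they are on Linux.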

3. Suppose process a on machine A is communicating with process b on machine B, and at some moment a is blocked in read on the socket (or is polling it in non-blocking mode). When process b terminates, whether or not the application explicitly closed the socket, the OS closes all of the process's file descriptors, and for a socket that means sending a FIN to the peer. But things are far less simple than they seem. Closing a TCP connection gracefully requires not only that both applications follow the protocol, but also that nothing goes wrong along the way.

  3.1 If process b terminates abnormally, the OS sends the FIN on its behalf, and process b no longer exists. If machine B later receives more data for that socket, it replies with an RST, because the process that owned the socket is gone. When process a then calls write on a socket that has received an RST, the operating system sends SIGPIPE to process a, and the default action is to terminate the process.

  3.2 Process exit is one thing (the OS sends a FIN for every open socket), but when machine B's OS crashes, the host loses power, or the network becomes unreachable, process a receives no FIN at all to tell it the connection is gone. (An OS crash differs from an orderly shutdown, where every process still gets to exit normally.)

  • If process a is only blocked in read, it will simply wait forever.
  • If process a writes first and then blocks in read, TCP never receives an ACK from machine B's TCP/IP stack and keeps retransmitting (12 times over roughly 9 minutes on Berkeley-derived stacks; on Linux the limit is governed by net.ipv4.tcp_retries2). The blocked read then fails with ETIMEDOUT, EHOSTUNREACH, or ENETUNREACH. If machine B comes back and the path to machine A is restored in the meantime, machine B receives one of a's retransmitted packets, does not recognize the connection, and answers with an RST. The blocked read in process a then returns ECONNRESET.

4. When is a socket readable or writable?

      4.1 A socket is readable when:

  • The number of bytes in the socket's kernel receive buffer is greater than or equal to its low-water mark SO_RCVLOWAT. We can then read the socket without blocking, and read returns a value greater than 0.
  • The peer has closed the connection. read on the socket then returns 0.
  • There is a new connection request on a listening socket.
  • There is an unhandled error on the socket. We can use getsockopt with SO_ERROR to read and clear the error.

      4.2 A socket is writable when:

  • The free space in the socket's kernel send buffer is greater than or equal to its low-water mark SO_SNDLOWAT. We can then write to the socket without blocking, and write returns a value greater than 0.
  • The write half of the socket has been closed. Writing to such a socket raises SIGPIPE.
  • A non-blocking connect has completed, either successfully or with a failure (such as a timeout).
  • There is an unhandled error on the socket. We can use getsockopt with SO_ERROR to read and clear the error.
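
The writability conditions above are typically checked by waiting for the socket to become writable and then reading the pending error. Here is a minimal sketch of that pattern using poll. The helper name wait_writable and its return convention are assumptions for this example.

```c
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become writable,
 * then fetch (and thereby clear) the socket's pending error.
 *   0   writable with no pending error
 *       (e.g. a non-blocking connect succeeded)
 *  > 0  the pending error, as an errno value (e.g. ECONNREFUSED)
 *  -1   timeout, or poll/getsockopt itself failed */
int wait_writable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;
    int err = 0;
    socklen_t len = sizeof(err);

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return -1;                /* 0 = timeout, -1 = poll error */

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    return err;
}
```

After a non-blocking connect returns EINPROGRESS, calling wait_writable(fd, 5000) would report whether the connection was established within five seconds.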

5. EINTR (interrupted system call)

  read and write can be interrupted by signals. When that happens, the call returns -1 and sets errno to EINTR. If you pass the SA_RESTART flag to sigaction when installing the handler, an interrupted call is restarted automatically instead of returning an error.

accept function

1. After the client and server complete the three-way handshake, the connection enters the ESTABLISHED state. Suppose that before the server calls accept, the client closes the socket in a way that sends an RST instead of the normal FIN (this requires setting the SO_LINGER option on the socket). The server then calls accept. Depending on the operating system, one of the following happens:

  • The connection is silently dropped and accept never reports it.
  • accept fails with ECONNABORTED.
  • accept succeeds, but a later read or write fails with ECONNRESET.

close function

When we close a socket with close, what happens next depends on the SO_LINGER socket option.

Setting the SO_LINGER option uses the following structure:

struct linger {
    int l_onoff;   /* switch: 0 turns SO_LINGER off, nonzero turns it on */
    int l_linger;  /* linger time, in seconds */
};

By default, SO_LINGER is turned off, that is, l_onoff = 0, and l_linger has no effect. The three possible settings behave as follows:

  1. l_onoff == 0, {off, ~}: this is the system default. l_linger is ignored and close returns immediately. If data remains in the send buffer, the system sends it to the peer in the background and then sends a FIN.
  2. l_onoff != 0 && l_linger == 0, {on, 0}: close returns immediately, any data remaining in the send buffer is discarded, and an RST is sent to the peer right away.
  3. l_onoff != 0 && l_linger > 0, {on, >0}: close blocks until either of the following happens, and then returns 0:

      a) All data (including the FIN) has been sent and acknowledged by the peer. In this case close returns early.

      b) The linger time (l_linger, in seconds) has expired.

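How to set it

#include <stdio.h>
#include <sys/socket.h>

struct linger lgr;
lgr.l_onoff = 1;    /* turn SO_LINGER on */
lgr.l_linger = 5;   /* let close block for up to 5 seconds */

if (setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &lgr, sizeof(lgr)) == -1)
    perror("setsockopt(SO_LINGER)");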
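
The {on, 0} case, which makes close send an RST, can be wrapped up in the same way. This is only a sketch, and set_linger and close_with_rst are hypothetical helper names.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: set SO_LINGER to {onoff, seconds}.
 * Returns 0 on success, -1 on error. */
int set_linger(int fd, int onoff, int seconds)
{
    struct linger lgr;

    lgr.l_onoff = onoff;
    lgr.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lgr, sizeof(lgr));
}

/* Hypothetical helper: abortive close. With {on, 0}, close discards any
 * unsent data and, on a connected TCP socket, sends an RST, not a FIN. */
int close_with_rst(int fd)
{
    if (set_linger(fd, 1, 0) == -1)
        return -1;
    return close(fd);
}
```

An abortive close throws away unsent data, so it is normally reserved for error paths and tests, not for ordinary shutdown.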

Common accept errors

  • EAGAIN (11, Resource temporarily unavailable): the resource is temporarily unavailable. For a non-blocking listening socket this means no connection is ready yet. Try again later.
  • ECONNABORTED (103, Software caused connection abort): after the server and client completed the TCP three-way handshake, the client TCP sent an RST segment. From the server process's point of view, the connection had already been queued by TCP, and the RST arrived while it was waiting for the process to call accept. POSIX specifies that errno must be ECONNABORTED in this case. Ignore the error and call accept again.
  • EINTR (4, Interrupted system call): the call was interrupted by a signal. Retry the call.
  • EPROTO (71, Protocol error): a protocol error occurred. Ignore it and call accept again.
  • EPERM (1, Operation not permitted): the operation is not permitted. Ignore it and call accept again.
  • EMFILE (24, Too many open files): the process has reached its limit on open file descriptors. Retrying immediately will not help until a descriptor is closed or the limit is raised.
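
A minimal sketch of an accept wrapper that follows this list is shown below. The name accept_retry is hypothetical. Retrying only EINTR, ECONNABORTED and EPROTO is a choice made for this example, and the remaining errors are left to the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: accept one connection, retrying errors that only
 * concern a single failed or interrupted attempt.
 * Returns the new descriptor, or -1 with errno set for everything else
 * (EAGAIN on a non-blocking socket, EMFILE, EBADF, ...). */
int accept_retry(int listen_fd)
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);

        if (fd >= 0)
            return fd;

        switch (errno) {
        case EINTR:         /* interrupted by a signal */
        case ECONNABORTED:  /* peer reset the connection before accept */
        case EPROTO:        /* protocol error on that connection */
            continue;       /* try the next pending connection */
        default:
            return -1;      /* let the caller decide what to do */
        }
    }
}
```

EMFILE is deliberately not retried here, because looping on it would spin until some descriptor is closed.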


Origin blog.csdn.net/u014608280/article/details/85211190