Linux socket communication: a rough guide to non-blocking connect, select, recv and recvfrom, send and sendto, with non-blocking connect code and MSG_NOSIGNAL

The MSG_NOSIGNAL flag of send() on Linux

When I stopped the server's receiving process with Ctrl+C to simulate a server crash, the server socket was closed along with the process, as expected. Unexpectedly, the client process was killed as well.

After adding the MSG_NOSIGNAL flag to the client's send() call, recompiling, running, and interrupting the server again, the problem was neatly solved.

On Linux, if a program keeps sending data after the network connection has been lost, the failure is not only reflected in the return value of send(): the kernel also delivers a signal to the process. If the signal is not handled, its default action terminates the program with a "Broken pipe" message, which is a disaster for a server that is supposed to provide a stable service. To avoid this, pass MSG_NOSIGNAL as the last (flags) argument of send(), which stops send() from raising that signal.

In more detail, the first send() after the peer has gone away is answered with an RST. When send() is called again, it fails with the broken-pipe error code (EPIPE) and the kernel sends SIGPIPE to the process, whose default action is to terminate it. MSG_NOSIGNAL tells the kernel not to generate SIGPIPE for that call, so the process only sees the EPIPE error.
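As a minimal sketch of the fix described above (the helper name send_nosignal is made up for illustration), a thin wrapper can pass MSG_NOSIGNAL on every call so that a broken connection shows up only as a -1 return with errno set to EPIPE:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send without ever raising SIGPIPE.
 * Returns the byte count from send(), or -1 with errno set
 * (EPIPE when the peer has already closed the connection). */
ssize_t send_nosignal(int sockfd, const void *buf, size_t len)
{
    return send(sockfd, buf, len, MSG_NOSIGNAL);
}
```

The caller still has to check the return value; the flag only removes the signal, not the error.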

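The connect() function

connect headers:

        #include <sys/types.h>

        #include <sys/socket.h>

connect declaration:

        int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
What connect does:

        It uses the socket sockfd to establish a connection to the network address serv_addr. The addrlen parameter is the size of the memory that serv_addr points to, for example sizeof(struct sockaddr_in).

connect return value:

        1) On success it returns 0, meaning the connection has been established (this happens, for example, when the server and the client are two processes on the same machine).

        2) On failure it returns -1 (called SOCKET_ERROR in Winsock) and sets errno accordingly; read errno to find out what went wrong. Common errors are an unreachable peer host, a timeout, or no process listening on the target port of the peer host.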

Non-blocking connect

        A socket can perform I/O in either blocking or non-blocking mode:

        1) In blocking mode, the function performing the I/O operation waits until the operation completes and does not return before then, so the calling thread blocks at that point.

        2) In non-blocking mode, by contrast, the socket function returns immediately whether or not the I/O has completed, and the calling thread keeps running.

        The client calls connect() to initiate a connection to the server. If the client's socket descriptor is in blocking mode, connect() blocks until the connection is established or the attempt times out. The kernel's connect timeout is usually quoted as somewhere between 75 s and a few minutes: 75 s is the classic Berkeley-derived value, Solaris 9 takes a few minutes, and on Linux the limit depends on the net.ipv4.tcp_syn_retries setting. In non-blocking mode, connect() returns immediately; if the connection cannot be established right away it returns -1 with errno set to EINPROGRESS, and the TCP three-way handshake continues in the background. You can then call select() to detect whether the non-blocking connect has completed. The timeout passed to select() can be shorter than the connect timeout, which keeps the connecting thread from being stuck in connect() for a long time.
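The first half of that flow, starting the connect and telling "done", "still in progress" and "failed" apart, might look like the sketch below. The helper name start_connect_ipv4 and the CONN_* constants are invented for this example, and the socket is assumed to be non-blocking already (for instance created with SOCK_NONBLOCK on Linux):

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Result of starting a connect on a non-blocking socket. */
enum { CONN_DONE = 0, CONN_PENDING = 1, CONN_FAILED = -1 };

/* Hypothetical helper. sockfd must already be non-blocking. */
int start_connect_ipv4(int sockfd, const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;               /* not a dotted-quad IPv4 address */
        return CONN_FAILED;
    }

    if (connect(sockfd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        return CONN_DONE;             /* e.g. peer on the same machine */
    if (errno == EINPROGRESS)
        return CONN_PENDING;          /* handshake still running */
    return CONN_FAILED;               /* errno says why */
}
```

When the result is CONN_PENDING, the caller goes on to wait with select() using whatever timeout suits the application.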

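The select() function

select headers:

         #include <sys/time.h>

         #include <sys/types.h>

         #include <unistd.h>

select declaration:

         int select(int maxfdp, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);

What select does:

         It determines the state of one or more sockets. For each socket, the caller can query whether it is readable, whether it is writable, and whether it has a pending error.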

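select parameters:

         Two types need explaining first.
         First, fd_set can be thought of as a set that holds file descriptors (file handles). These can be files in the ordinary sense, but under Unix every device, pipe, FIFO and so on is also presented as a file, so a socket is a file too and a socket handle is a file descriptor. An fd_set is manipulated through a few macros:

                    FD_ZERO(fd_set *) clears the set

                    FD_SET(int, fd_set *) adds the given file descriptor to the set

                    FD_CLR(int, fd_set *) removes the given file descriptor from the set

                    FD_ISSET(int, fd_set *) checks whether the given file descriptor is in the set; after select() returns, that means it is ready for reading or writing

         Second, struct timeval is a commonly used structure that represents a time value. It has two members: tv_sec holds seconds and tv_usec holds microseconds (not milliseconds).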

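         1) int maxfdp is an integer giving the range of the file descriptors in the sets, that is, the highest-numbered file descriptor plus 1. It must not be wrong!

         2) fd_set *readfds points to an fd_set containing the file descriptors we want to watch for readability, i.e. we care whether data can now be read from them. If any descriptor in the set is readable, select returns a value greater than 0. If none is readable, select keeps waiting according to the timeout parameter and returns 0 once the timeout has expired. On error it returns a negative value. NULL may be passed to say that we do not care about read events on any file.

         3) fd_set *writefds points to an fd_set containing the file descriptors we want to watch for writability, i.e. we care whether data can now be written to them. If any descriptor in the set is writable, select returns a value greater than 0. If none is writable, select keeps waiting according to the timeout parameter and returns 0 once the timeout has expired. On error it returns a negative value. NULL may be passed to say that we do not care about write events on any file.

         4) fd_set *errorfds serves the same purpose as the two parameters above and is used to watch for exceptional conditions on the files.

         5) struct timeval *timeout is the timeout for select. This parameter is crucial because it puts select into one of three modes. First, if NULL is passed (no time structure at all), select blocks until some descriptor in the watched sets changes state. Second, if the time is set to 0 seconds and 0 microseconds, select becomes purely non-blocking: it returns immediately whether or not any descriptor has changed, returning 0 if nothing changed and a positive value otherwise. Third, if timeout is greater than 0, it is the maximum time to wait: select blocks for up to that long, returns as soon as an event arrives, and otherwise returns once the timeout has expired. The return values are the same as described above.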

Interpreting the result of select():

        1) select() returns 0

select() timed out, meaning the connection was not established within the timeout. You can call select() again to keep checking; after several timeouts, report a timeout error to the user.

        2) select() returns a value greater than 0

A readable or writable socket descriptor was detected. Berkeley-derived implementations follow two rules concerning select and non-blocking I/O:

        A) When the connection is established successfully, the socket descriptor becomes writable (the connection exists and the send buffer is free, so writing is possible).

        B) When the connection attempt fails, the socket descriptor becomes both readable and writable (because of the pending error).

        So when the socket descriptor is found to be readable or writable, a further check is needed to tell success from failure. Case B) must be distinguished from a perfectly normal situation: after the connection has been established, the server may already have sent data to the client, in which case select also reports the non-blocking socket descriptor as both readable and writable.

        □ In a Unix environment, you can call getsockopt to find out whether the descriptor connected successfully or failed (this is the method given in the book "Unix Network Programming"; I tested it under Linux and found it ineffective). Note that under Linux getsockopt returns 0, not -1, whether or not a network error occurred, so the result has to be read from the error value rather than from the return code, and len must be initialised to sizeof(error) before the call.

               A) If the connection was established successfully, the error value obtained through getsockopt(sockfd, SOL_SOCKET, SO_ERROR, (char *)&error, &len) is 0.

               B) If the connection attempt failed, the error value is the errno value of the connection error, such as ECONNREFUSED or ETIMEDOUT.

        □ A more effective way to decide, tested and verified to work in the Linux environment:

        Call connect again. If it fails and errno is EISCONN, the socket connection has already been established; otherwise the connection is considered to have failed.

        Here is what I tried: after one select reported the socket descriptor as readable or writable, I called connect again, and errno was still EINPROGRESS; increasing the select timeout gave the same result. So when select returns 0, or returns a positive value but errno after connect is still EINPROGRESS (115), I run select + connect once more to check the connection state again. This time errno is set to EISCONN (106) and the connect has succeeded. (A likely explanation: on Linux the first repeated connect returns 0 once the handshake has completed, and a successful call leaves errno untouched, so the stale EINPROGRESS is still visible. Checking the return value of connect before looking at errno avoids the confusion.)

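Common socket error codes on Linux:

EACCES, EPERM: the user tried to connect to a broadcast address without the socket broadcast flag set, or the connection failed because of a firewall rule.

EADDRINUSE 98: Address already in use (the local address is in use)

EAFNOSUPPORT 97: Address family not supported by protocol (the address in the serv_addr parameter is not a valid address)

EAGAIN: not enough free local ports.

EALREADY 114: Operation already in progress (the socket is non-blocking and a previous connection request has not yet completed)

EBADF 9: Bad file descriptor (invalid file descriptor)

ECONNREFUSED 111: Connection refused (nothing is listening on the remote address)

EFAULT: the address of the socket structure is invalid.

EINPROGRESS 115: Operation now in progress (the socket is non-blocking and the connection request could not be completed immediately)

EINTR: the system call was interrupted by a signal that was caught.

EISCONN 106: Transport endpoint is already connected (the socket is already connected)

ENETUNREACH 101: Network is unreachable

ENOTSOCK 88: Socket operation on non-socket (the file descriptor does not refer to a socket)

ETIMEDOUT 110: Connection timed out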

Switching a socket between blocking and non-blocking mode with fcntl

Setting non-blocking mode:

First get the flags with fcntl F_GETFL, then set flags | O_NONBLOCK with F_SETFL:

      flags = fcntl(sockfd, F_GETFL, 0); // get the file status flags

      fcntl(sockfd, F_SETFL, flags | O_NONBLOCK); // switch to non-blocking mode

The MSG_DONTWAIT flag when receiving and sending data

      Passing MSG_DONTWAIT as the flags argument of recv, recvfrom, send or sendto makes that one call non-blocking. It is not required once O_NONBLOCK is set on the socket, and it also works on a socket that is otherwise blocking.
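A tiny sketch of the per-call variant, with a made-up helper name try_recv, shows that the socket's own mode does not need to change:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: try to receive without blocking, even if the
 * socket itself is in blocking mode. Returns bytes received, 0 if the
 * peer closed, or -1 (errno is EAGAIN or EWOULDBLOCK when nothing is
 * queued yet). */
ssize_t try_recv(int sockfd, void *buf, size_t len)
{
    return recv(sockfd, buf, len, MSG_DONTWAIT);
}
```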

Setting blocking mode:

First get the flags with fcntl F_GETFL, then set flags & ~O_NONBLOCK with F_SETFL:

     flags = fcntl(sockfd, F_GETFL, 0); // get the file status flags

     fcntl(sockfd, F_SETFL, flags & ~O_NONBLOCK); // switch to blocking mode

The blocking flag when receiving and sending data

        Pass 0 as the flags argument of recv, recvfrom, send and sendto; the calls block by default.
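Both directions can be folded into one small function that also checks the fcntl return values, which the snippets above skip. This is only a sketch, and the name set_nonblocking is an assumption for illustration:

```c
#include <fcntl.h>

/* Hypothetical helper: nonblocking != 0 turns O_NONBLOCK on, 0 turns it
 * off. Returns 0 on success, -1 on error with errno set by fcntl(). */
int set_nonblocking(int fd, int nonblocking)
{
    int flags = fcntl(fd, F_GETFL, 0);   /* read the current status flags */
    if (flags == -1)
        return -1;

    if (nonblocking)
        flags |= O_NONBLOCK;
    else
        flags &= ~O_NONBLOCK;

    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

Reading the flags first matters because F_SETFL replaces the whole set; writing O_NONBLOCK alone would clear any other status flag that was already set.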

The send function and MSG_NOSIGNAL

UINT flag = MSG_NOSIGNAL; // stop send() from raising SIGPIPE when the connection is broken

Once a socket has been put into non-blocking mode, every operation on that sockfd is non-blocking.

Non-blocking mode:

connect

       = 0: a return value of 0 means the socket connection was established immediately.

       < 0: a return value of -1 means errno must be checked. EINPROGRESS means the connection is still being set up; any other value is a failure.

       In the EINPROGRESS case, use select or epoll to monitor the fd and find out whether the connection has been established.

        Below is an example that uses select to detect whether connect succeeded. Pay attention to the getsockopt verification, because the third ACK of the three-way handshake may be lost while the client already believes the connection is established:


//////////////////////////////////////////////////////////////////////
//
// func description	:	Establish a connection to a TCP server
//
//-----input------
// Parameters
//     	s 	 		: 	socket descriptor returned by socket()
//	   	servaddr 	:   pointer to the socket address structure
//     	addrlen		:  length of the address structure
//------output------
// Return 
//     BOOL 		:	TRUE on success, FALSE on error
//     Do not treat EINTR/EINPROGRESS/EAGAIN as fatal errors
//
BOOL SocketAPI::connect_ex(SOCKET s, const struct sockaddr* servaddr, UINT addrlen)
{
	DEBUG_TRY
    if (connect(s, servaddr, addrlen) == -1)
    {
        //LOGDEBUG("[SocketAPI::connect_ex] Error errno[%d] discription[%s]", errno, strerror(errno));
#if defined(__LINUX__)	
        switch(errno)
        {
            case EALREADY:      //#define EALREADY    114 /* Operation already in progress */
            case EINPROGRESS:   //#define EINPROGRESS 115 /* Operation now in progress */
            case EINTR:         //#define EINTR        4  /* Interrupted system call */
            case EAGAIN:        //#define EAGAIN      11  /* Try again */          
            {
                    //!alter by huyf: reworked the non-blocking connect flow
                    // The socket is non-blocking here, so connect normally returns -1 right away with errno (available via errno.h) set to EINPROGRESS,
                    // meaning the TCP three-way handshake is still in progress. Any other errno means the connection failed.
                    return reconnect_ex(s, servaddr, addrlen) == 0 ? TRUE : FALSE;
                    //return TRUE;
                    //!alter end: reworked the non-blocking connect flow
            }            
            //!alter end: reworked the non-blocking connect flow
            // Handle the already-connected case; success can be reported directly
            case EISCONN:   //#define EISCONN     106 /* Transport endpoint is already connected */
            {
                return TRUE;
            }
            //!alter end: reworked the non-blocking connect flow
            case ECONNREFUSED:
            case ETIMEDOUT:
            case ENETUNREACH:
            case EADDRINUSE:
            case EBADF:
            case EFAULT:
            case ENOTSOCK:
            default:
            {
                //LOGERROR("[SocketAPI::connect_ex] Is Error errno[%d] discription[%s]", errno, strerror(errno));
                break;
            }
        }//end of switch
#elif defined(__WINDOWS__)
        INT iErr = WSAGetLastError();
        switch(iErr)
        {
        case WSANOTINITIALISED: 
            {
                strncpy(Error, "WSANOTINITIALISED", ERROR_SIZE);
                break;
            }

        case WSAENETDOWN:
            { 
                strncpy(Error, "WSAENETDOWN", ERROR_SIZE);
                break;
            }
        case WSAEADDRINUSE: 
            { 
                strncpy(Error, "WSAEADDRINUSE", ERROR_SIZE);
                break;
            }
        case WSAEINTR: 
            { 
                strncpy(Error, "WSAEINTR", ERROR_SIZE);
                break;
            }
        case WSAEINPROGRESS: 
            { 
                strncpy(Error, "WSAEINPROGRESS", ERROR_SIZE);
                break;
            }
        case WSAEALREADY: 
            { 
                strncpy(Error, "WSAEALREADY", ERROR_SIZE);
                break;
            }
        case WSAEADDRNOTAVAIL: 
            { 
                strncpy(Error, "WSAEADDRNOTAVAIL", ERROR_SIZE);
                break;
            }
        case WSAEAFNOSUPPORT: 
            { 
                strncpy(Error, "WSAEAFNOSUPPORT", ERROR_SIZE);
                break;
            }
        case WSAECONNREFUSED: 
            { 
                strncpy(Error, "WSAECONNREFUSED", ERROR_SIZE);
                break;
            }
        case WSAEFAULT: 
            { 
                strncpy(Error, "WSAEFAULT", ERROR_SIZE);
                break;
            }
        case WSAEINVAL: 
            { 
                strncpy(Error, "WSAEINVAL", ERROR_SIZE);
                break;
            }
        case WSAEISCONN: 
            { 
                strncpy(Error, "WSAEISCONN", ERROR_SIZE);
                break;
            }
        case WSAENETUNREACH: 
            { 
                strncpy(Error, "WSAENETUNREACH", ERROR_SIZE);
                break;
            }
        case WSAENOBUFS: 
            { 
                strncpy(Error, "WSAENOBUFS", ERROR_SIZE);
                break;
            }
        case WSAENOTSOCK: 
            { 
                strncpy(Error, "WSAENOTSOCK", ERROR_SIZE);
                break;
            }
        case WSAETIMEDOUT: 
            { 
                strncpy(Error, "WSAETIMEDOUT", ERROR_SIZE);
                break;
            }
        case WSAEWOULDBLOCK: 
            { 
                strncpy(Error, "WSAEWOULDBLOCK", ERROR_SIZE);
                break;
            }
        default:
            {
                strncpy(Error, "UNKNOWN", ERROR_SIZE);
                break;
            }
        }//end of switch		
#endif
        return FALSE;
    }
    return TRUE;	
	DEBUG_CATCHF("SocketAPI::connect_ex");	
}


//////////////////////////////////////////////////////////////////////
//
// func description :   Check a non-blocking connect that did not complete immediately (TCP three-way handshake phase)
//
//-----input------
// Parameters
//      s         :   socket descriptor returned by socket()
//------output------
// Return 
//     int          :   0 on success, -1 on error
//     Do not treat EINTR/EINPROGRESS/EAGAIN as fatal errors
//
int SocketAPI::reconnect_ex(SOCKET s, const struct sockaddr* servaddr, UINT addrlen)
{   
    //LOGDEBUG("[SocketAPI::reconnect_ex] Get The Connect Result By Select() Errno=[%d] Discription=[%s]", errno, strerror(errno));    
    //if (errno == EINPROGRESS)    
    //{    
        //int nTimes = 0; 
        int nRet = -1;   
        //while (nTimes++ < 5)    
        {    
            fd_set rfds, wfds;    
            struct timeval tv;
            FD_ZERO(&rfds);    
            FD_ZERO(&wfds);    
            FD_SET(s, &rfds);    
            FD_SET(s, &wfds);    
                
            /* set select() time out */    
            tv.tv_sec = 1;     
            tv.tv_usec = 0;
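            /*
            Berkeley-derived implementations (and Posix.1g) have two rules about select and non-blocking I/O:
            A: when the connection is established successfully, the socket descriptor becomes writable;
            B: when the connection fails, the socket descriptor becomes both readable and writable;
            Note: when a socket has an error, select marks it as both readable and writable;
            A more effective check, tested and verified to work in the Linux environment:
                call connect again; if it fails with errno EISCONN, the socket connection is already established, otherwise the connection is considered to have failed.
            */    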
            int nSelRet = select(s+1, &rfds, &wfds, NULL, &tv);    
            switch (nSelRet)    
            {    
                case -1:    // error
                {
                    //LOGERROR("[SocketAPI::reconnect_ex] Select Is Error... nSelRet=[%d] Errno=[%d] Discription=[%s]", nSelRet, errno, strerror(errno)); 
                    nRet = -1; 
                }   
                break;    
                case 0:    // timeout
                {
                    //LOGWARNING("[SocketAPI::reconnect_ex] Select Is Time Out... nSelRet=[%d] Errno=[%d] Discription=[%s]", nSelRet, errno, strerror(errno));     
                    nRet = -1; 
                }   
                break;    
                default:    // the socket is ready
                {
                    //LOGDEBUG("[SocketAPI::reconnect_ex] nSelRet=[%d] Errno=[%d] Discription=[%s]", nSelRet, errno, strerror(errno));   
                    if (FD_ISSET(s, &rfds) || FD_ISSET(s, &wfds))    // readable or writable
                    {    
                        #if 0 // reported not usable in the Linux environment; suggested in <<Unix network programming>>, SO_ERROR not used
                            int errinfo = 0;
                            socklen_t errlen = sizeof(errinfo);    // must be initialised before calling getsockopt
                            if (-1 == getsockopt(s, SOL_SOCKET, SO_ERROR, &errinfo, &errlen))    
                            {    
                                nRet = -1;  
                                LOGERROR("getsockopt return -1.\n");  
                                break;    
                            }    
                            else if (0 != errinfo)    
                            {      
                                nRet = -1;   
                                LOGERROR("getsockopt return errinfo = %d.\n", errinfo); 
                                break;    
                            }                                       
                            nRet = 0;  
                            LOGDEBUG("connect ok?\n");     
                        #else    
                            #if 1    
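                                // Call connect again to check the socket state. A successful call returns 0
                                // and leaves errno untouched, so test the return value before errno.
                                if (connect(s, servaddr, addrlen) == 0 || errno == EISCONN)    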
                                {    
                                    //LOGDEBUG("[SocketAPI::reconnect_ex] Reconnect Finished...nSelRet=[%d]", nSelRet);    
                                    nRet = 0;    
                                }    
                                else    
                                {      
                                    //LOGWARNING("[SocketAPI::reconnect_ex] Reconnect Failed...FD_ISSET(s, &rfds)=[%d] FD_ISSET(s, &wfds)=[%d] nSelRet=[%d] Errno=[%d] Discription=[%s]", FD_ISSET(s, &rfds) , FD_ISSET(s, &wfds), nSelRet, errno, strerror(errno));     
                                    nRet = -1;    
                                }    
                            #else    //test
                                char buff[2];    
                                if (read(s, buff, 0) < 0)    
                                {    
                                    LOGERROR("connect failed. errno = %d\n", errno);    
                                    nRet = errno;    
                                }    
                                else    
                                {    
                                    LOGDEBUG("connect finished.\n");    
                                    nRet = 0;    
                                }    
                            #endif    
                        #endif    
                    } 
                }
                break;   
            } 
        }   
    //}  
    return nRet;
}

 

recv and recvfrom

       = 0: a return value of 0 means the peer has closed the connection, so we should close our end too with close(sockfd).

Also, because asynchronous operation relies on select or epoll to deliver events:

       1. If you use select, remove sockfd with FD_CLR(sockfd, fd_set) so that it is no longer monitored.

       2. If you use epoll, the system removes sockfd from the epoll set once it is closed, so it is no longer monitored.

       > 0: a return value greater than 0 and less than sizeof(buffer) means the data has definitely all been read.

(If it equals sizeof(buffer), there may be more data to read and reading should continue. It can never be greater.)

       < 0: a return value below 0, i.e. -1, has to be examined case by case:

        1. errno is EAGAIN or EWOULDBLOCK: keep reading

                There is no data to read for the moment. You can try again, or wait for the next notification from select or epoll. (Why EAGAIN or EWOULDBLOCK occurs: several processes may be reading the same sockfd, so one process gets the data and the others find nothing to read, similar to the thundering herd effect. It can of course also happen in a single process. For this error there is no need to close(sockfd); under select or epoll, wait for the next trigger and continue reading.)

         2. errno is EINTR: keep reading

                The call was interrupted by a signal. You can try again, or wait for the next notification from epoll or select.

                Any other errno means reading the data really failed. (At that point you should close(sockfd).)
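The recv cases above map naturally onto a small classifier. The sketch below uses invented names (recv_once and the RECV_* constants) and assumes a non-blocking socket and a buffer length greater than 0:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

enum recv_status { RECV_DATA, RECV_CLOSED, RECV_RETRY, RECV_ERROR };

/* Hypothetical helper: one recv() on a non-blocking socket, classified
 * as described above. *got receives the byte count when the result is
 * RECV_DATA. */
enum recv_status recv_once(int sockfd, void *buf, size_t len, size_t *got)
{
    ssize_t n = recv(sockfd, buf, len, 0);
    if (n > 0) {
        *got = (size_t)n;
        return RECV_DATA;     /* n == len means more data may be waiting */
    }
    if (n == 0)
        return RECV_CLOSED;   /* peer closed: caller should close(sockfd) */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return RECV_RETRY;    /* not fatal: wait for select/epoll, retry */
    return RECV_ERROR;        /* real failure: caller should close(sockfd) */
}
```

Keeping the close() in the caller lets the same code also remove the descriptor from its select set before closing it.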

 

send and sendto

        The return value is the number of bytes actually sent. Since we know the total length to send, we can keep sending if not everything went out.

          < 0: the return value is -1

We need to check errno:

                1. If errno is EAGAIN or EWOULDBLOCK, the send buffer is currently full. You can try writing again, or wait for the next notification from epoll or select: as soon as buffer space becomes available a write event is triggered, a feature that is used very often.

                 2. If errno is EINTR, the call was interrupted by a signal. You can try writing again, or wait for the next notification from epoll or select.

                       Any other errno (not EAGAIN, EWOULDBLOCK or EINTR) is a real error, and sockfd should be closed with close(sockfd).

          >= 0

 If the value is >= 0 but less than the length to send, keep sending the remainder; if it equals the length to send, the transmission is complete.
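Put together, the send rules amount to a loop like the one sketched here. The name send_all is hypothetical, MSG_NOSIGNAL is added so a broken connection shows up as EPIPE instead of a signal, and waiting with select and no timeout is a simplification that a real server would probably replace with its own event loop:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep sending until all len bytes are written.
 * Returns 0 on success, -1 on a real error (errno set by send/select). */
int send_all(int sockfd, const char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sockfd, buf + sent, len - sent, MSG_NOSIGNAL);
        if (n >= 0) {
            sent += (size_t)n;          /* partial send: loop for the rest */
            continue;
        }
        if (errno == EINTR)
            continue;                   /* interrupted: just retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            fd_set wfds;                /* buffer full: wait until writable */
            FD_ZERO(&wfds);
            FD_SET(sockfd, &wfds);
            if (select(sockfd + 1, NULL, &wfds, NULL, NULL) == -1 &&
                errno != EINTR)
                return -1;
            continue;
        }
        return -1;                      /* real error: caller closes sockfd */
    }
    return 0;
}
```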


Origin blog.csdn.net/Windgs_YF/article/details/94589497