Linux network programming (using the select function for I/O multiplexing)


Preface

This article introduces the select function and how to use it for I/O multiplexing.

1. What is I/O multiplexing?

1. I/O multiplexing is a technique for monitoring and processing multiple input/output (I/O) sources at the same time. It allows a single process to watch multiple file descriptors (sockets, files, pipes, etc.) and handle whichever ones are ready, which gives an efficient event-driven programming model.

2. In the traditional I/O model, reads and writes use blocking or non-blocking I/O, and a thread or process is created for each I/O source (such as a socket connection) to handle it. In high-concurrency scenarios, this leads to a sharp increase in the number of threads or processes, wasted system resources, and high context-switching overhead.

3. I/O multiplexing uses mechanisms provided by the operating system, such as select, poll, and epoll (on Linux), to let a program monitor multiple I/O sources at the same time. The application registers the I/O sources with the multiplexer and waits on it for events. Once an event is ready (for example, a descriptor becomes readable or writable), the program performs the corresponding operation. Strictly speaking this is still synchronous I/O: the multiplexer only reports readiness, and the program performs the read or write itself.

The main advantages of I/O multiplexing are as follows:

1. Efficiency: Through the event-driven approach, frequent switching of threads and processes is avoided, and system overhead is reduced.

2. Resource savings: Fewer threads or processes handle multiple I/O sources, which saves system resources and makes it possible to handle a large number of concurrent connections.

3. Simplified programming model: Compared with traditional multi-threaded or multi-process programming, I/O multiplexing can simplify the programming model, making the code clearer and easier to maintain.

Note that I/O multiplexing does not suit every situation; in particular, it may be a poor fit for scenarios with high latency and large data volumes. In addition, the mechanisms and performance of multiplexers differ across operating systems.

2. The select function explained

The select() function is an I/O multiplexing interface that monitors multiple file descriptors at the same time and returns when any of them is ready, so that the program can handle it. It is widely used for event-driven programming.

The prototype of the select() function is as follows:

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

Parameter description:

nfds: The highest-numbered monitored file descriptor plus 1.

readfds: Pointer to the set of file descriptors to monitor for readability.

writefds: Pointer to the set of file descriptors to monitor for writability.

exceptfds: Pointer to the set of file descriptors to monitor for exceptional conditions.

timeout: The maximum time select() may block. A NULL pointer means block indefinitely; a timeout of zero makes select() return immediately.

fd_set is a file descriptor set type. It is manipulated through the macros FD_ZERO, FD_SET, FD_CLR, and FD_ISSET.
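
The four macros can be tried out without calling select() at all. The following is a minimal sketch, not part of the original article: the helper names count_members and fd_set_demo are made up for illustration, and the descriptor numbers 0 and 5 are arbitrary.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical helper: count how many descriptors in [0, nfds) are in the set. */
int count_members(fd_set *set, int nfds)
{
    int count = 0;

    /* An fd_set can only describe descriptors below FD_SETSIZE. */
    assert(nfds >= 0 && nfds <= FD_SETSIZE);

    for (int fd = 0; fd < nfds; fd++) {
        if (FD_ISSET(fd, set))  /* test membership */
            count++;
    }
    return count;
}

/* Hypothetical demo: exercise all four macros and report what is left. */
int fd_set_demo(void)
{
    fd_set set;

    FD_ZERO(&set);     /* start from an empty set */
    FD_SET(0, &set);   /* add descriptor 0 */
    FD_SET(5, &set);   /* add descriptor 5 */
    FD_CLR(0, &set);   /* remove descriptor 0 again */

    return count_members(&set, 8);  /* only descriptor 5 remains */
}
```

Calling fd_set_demo() returns 1, because descriptor 5 is the only member left after FD_CLR removes descriptor 0.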

The steps for using the select() function are as follows:

Create and initialize a collection of file descriptors.

Add the file descriptors to be monitored to the corresponding file descriptor set.

Call the select() function and pass the set of file descriptors.

Check the return value: -1 means an error occurred, 0 means the timeout expired, and a value greater than 0 is the number of file descriptors that are ready.

Use the FD_ISSET() macro to check which file descriptors are ready and handle them accordingly.

Below is a simple example:

#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    fd_set readfds;
    int max_fd, ret;

    FD_ZERO(&readfds);
    FD_SET(STDIN_FILENO, &readfds);  // monitor standard input

    max_fd = STDIN_FILENO + 1;  // nfds: highest monitored descriptor plus 1

    struct timeval timeout;
    timeout.tv_sec = 5;  // time out after 5 seconds
    timeout.tv_usec = 0;

    ret = select(max_fd, &readfds, NULL, NULL, &timeout);

    if (ret == -1) {
        perror("select");
        exit(EXIT_FAILURE);
    } else if (ret == 0) {
        printf("Timeout\n");
    } else {
        if (FD_ISSET(STDIN_FILENO, &readfds)) {
            printf("Stdin is ready for reading\n");

            // read the data from standard input
            char buffer[100];
            if (fgets(buffer, sizeof(buffer), stdin) != NULL) {
                printf("Received: %s", buffer);
            }
        }
    }

    return 0;
}
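
The same steps can be wrapped in a small reusable function. This is a sketch under assumptions rather than the article's own code: the name wait_readable and the millisecond parameter are invented here, and the function simply passes through the three kinds of return value described in the steps.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable. Returns select()'s own result:
 *   -1 on error, 0 if the timeout expired, 1 if fd is ready to read.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval timeout;

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    /* Steps 1 and 2: create the set and add the descriptor. */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* Convert milliseconds to the seconds/microseconds pair select() expects. */
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    /* Step 3: only one descriptor is monitored, so nfds is fd + 1. */
    return select(fd + 1, &readfds, NULL, NULL, &timeout);
}
```

For example, wait_readable(STDIN_FILENO, 5000) behaves like the program above, and passing 0 as the timeout turns the call into a non-blocking readiness check.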

3. Writing a concurrent server with select

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/select.h>

int main()
{
    int server = 0;
    struct sockaddr_in saddr = {0};
    int client = 0;
    struct sockaddr_in caddr = {0};
    socklen_t asize = 0;
    int len = 0;
    char buf[32] = {0};
    int maxfd; // highest file descriptor being monitored
    int ret = 0;
    int i = 0;

    server = socket(PF_INET, SOCK_STREAM, 0);

    if( server == -1 )
    {
        printf("server socket error\n");
        return -1;
    }

    saddr.sin_family = AF_INET;
    saddr.sin_addr.s_addr = htonl(INADDR_ANY);
    saddr.sin_port = htons(8888);

    if( bind(server, (struct sockaddr*)&saddr, sizeof(saddr)) == -1 )
    {
        printf("server bind error\n");
        return -1;
    }

    if( listen(server, 128) == -1 )
    {
        printf("server listen error\n");
        return -1;
    }

    printf("server start success\n");

    maxfd = server;

    fd_set rets;    // working copy, overwritten by every select() call
    fd_set allrets; // master set of all monitored descriptors

    FD_ZERO(&allrets);
    FD_SET(server, &allrets);

    while( 1 )
    {
        rets = allrets;
        ret = select(maxfd + 1, &rets, NULL, NULL, NULL); // block until a descriptor is ready
        if(ret > 0)
        {
            if(FD_ISSET(server, &rets))
            {
                /* a new client has connected */
                asize = sizeof(caddr);
                client = accept(server, (struct sockaddr*)&caddr, &asize);
                if(client == -1)
                {
                    printf("server accept error\n");
                }
                else if(client >= FD_SETSIZE)
                {
                    /* the descriptor does not fit into an fd_set */
                    close(client);
                }
                else
                {
                    FD_SET(client, &allrets); // start monitoring the new client
                    if(maxfd < client)
                    {
                        /* update the highest file descriptor */
                        maxfd = client;
                    }
                }
            }

            /* client descriptors are assumed to be numbered above the listening socket */
            for(i = server + 1; i <= maxfd; i++)
            {
                if(FD_ISSET(i, &rets)) // find out which client has data
                {
                    len = read(i, buf, sizeof(buf) - 1);
                    if(len <= 0) // the client disconnected, or the read failed
                    {
                        FD_CLR(i, &allrets); // stop monitoring this client
                        close(i);            // close the client socket
                    }
                    else
                    {
                        buf[len] = '\0'; // terminate the data so it can be printed with %s
                        printf("read len : %d read buf : %s\n", len, buf);
                        write(i, buf, len); // echo the data back
                    }
                }
            }
        }
        else
        {
            printf("select is err\n");
        }
    }

    close(server);

    return 0;
}

Points to note:

With select(), the sets such as readfds are both input and output parameters: select() modifies the sets passed in so that, on return, they contain only the descriptors that are ready.

So keep a master copy of the set and hand select() a fresh copy on every call, as the server above does with allrets and rets. Otherwise the descriptors that were not ready are lost from the set.
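
This behaviour is easy to observe with two pipes. The sketch below is illustrative only and assumes a POSIX system; the function name select_overwrites_working_copy and the variable names master and working are invented here. Only the first pipe is given data, so after select() returns, the second pipe's descriptor should have disappeared from the working copy while the master set still holds both.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 1 if select() dropped the idle descriptor
 * from the working copy while the master set kept both, 0 otherwise.
 */
int select_overwrites_working_copy(void)
{
    int a[2], b[2];
    fd_set master, working;
    struct timeval timeout = {1, 0};  /* wait at most one second */
    int result = 0;

    if (pipe(a) == -1)
        return 0;
    if (pipe(b) == -1) {
        close(a[0]);
        close(a[1]);
        return 0;
    }
    assert(a[0] < FD_SETSIZE && b[0] < FD_SETSIZE);

    /* The master set holds the read ends of both pipes. */
    FD_ZERO(&master);
    FD_SET(a[0], &master);
    FD_SET(b[0], &master);

    /* Make only pipe a readable. */
    if (write(a[1], "x", 1) == 1) {
        int nfds = (a[0] > b[0] ? a[0] : b[0]) + 1;

        working = master;  /* select() gets the copy, never the master */
        if (select(nfds, &working, NULL, NULL, &timeout) == 1) {
            result = FD_ISSET(a[0], &working) && !FD_ISSET(b[0], &working)
                  && FD_ISSET(a[0], &master) && FD_ISSET(b[0], &master);
        }
    }

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return result;
}
```

If the master set itself had been passed to select(), the second pipe would no longer be monitored on the next call.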

4. Disadvantages of the select function

1. Inefficiency: select() uses a linear scan over the monitored file descriptor set, so it becomes less efficient as the number of descriptors grows. Each call passes the entire set to the kernel, and on return the program must check the entire set again to find out which descriptors are ready. The overhead of this scan increases with the number of descriptors being monitored.

2. Limit on the number of file descriptors: select() can only monitor descriptors whose value is below FD_SETSIZE, which is typically 1024 on Linux. If more descriptors need to be monitored, another mechanism such as poll or epoll is required.

3. Blocking mode: select() is a blocking call. With a NULL timeout it waits indefinitely until a descriptor is ready, and the program cannot do anything else in the meantime. Setting a timeout helps, but a timeout that is too short makes the program wake up frequently with nothing to do, while one that is too long may hurt responsiveness.

4. Restricted event types: select() can only monitor file descriptors for readability, writability, and exceptional conditions. Other kinds of events, such as timers or signals, cannot be monitored by select() directly.

5. Reinitialization for each call: Because select() modifies the sets it is given, the file descriptor sets must be rebuilt (or restored from a copy) before every call. This is cumbersome, especially when the set of monitored descriptors changes and has to be updated by hand.
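
Regarding the limit on the number of file descriptors: the limit applies to the descriptor's numeric value, and calling FD_SET with a descriptor at or above FD_SETSIZE is undefined behaviour. A defensive wrapper might look like the sketch below; the name fd_set_checked is hypothetical and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: add fd to the set only if an fd_set can represent it.
 * Returns 0 on success and -1 if fd is negative or not below FD_SETSIZE.
 */
int fd_set_checked(int fd, fd_set *set)
{
    assert(set != NULL);

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;  /* too large for select(); poll or epoll would be needed */

    FD_SET(fd, set);
    return 0;
}
```

A server could call this instead of FD_SET right after accept() and close the connection when it returns -1.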

Summary

This article explained how to use the select function for I/O multiplexing and how to write a concurrent server program with it.

Origin blog.csdn.net/m0_49476241/article/details/132351251