Let’s talk about common I/O models: BIO, NIO, AIO, DIO, I/O multiplexing, and more


I. Introduction

Common I/O models include blocking I/O (BIO), non-blocking I/O (NIO), asynchronous I/O (AIO), direct I/O (DIO), and I/O multiplexing. Each I/O model has its own characteristics and applicable scenarios. Today we will summarize the common I/O models. Even if you never apply these details directly, it is worth knowing what they are and why they exist.

Reference for this article

  1. "Boost application performance using asynchronous I/O" (the source of several figures in this article): https://developer.ibm.com/articles/l-async/
  2. https://notes.shichao.io/unp/ch6/

1. What is the IO model?

The I/O (Input/Output) model refers to a pattern or paradigm for processing input and output operations in a computer system. It describes how data is transferred and processed in a computer system.

In computer systems, input and output operations usually involve data interaction with external devices (such as hard drives, networks, keyboards, monitors, etc.). The I/O model defines how this data interaction occurs and how programs communicate with input and output devices.

Common I/O models include the following:

  1. Blocking I/O model (Blocking I/O): The application will be blocked when performing I/O operations until the operation is completed. When performing blocking I/O, the application waits until data is read from the device or until data is written to the device.

  2. Non-blocking I/O model (Non-blocking I/O): When an application performs an I/O operation and the device has no data to read or cannot accept data immediately, the call returns at once and the application continues with other tasks instead of waiting for the operation to complete.

  3. I/O multiplexing model (I/O Multiplexing): By using an I/O multiplexing mechanism (such as select, poll, or epoll), an application can monitor multiple I/O channels at the same time and perform reads and writes only when data is ready, avoiding blocking on any single channel. This model is suitable for situations where multiple I/O channels need to be handled simultaneously.

  4. Signal-driven I/O model: The application registers a signal processing function to receive the signal from the operating system when the data is ready, and then performs the corresponding I/O operation.

  5. Asynchronous I/O model (Asynchronous I/O): After the application initiates an I/O operation, it can continue to perform other tasks without waiting for the I/O operation to complete. When an I/O operation is completed, the system notifies the application, and the application can then handle the completed I/O operation.

Different I/O models are suitable for different application scenarios and requirements. Choosing the appropriate I/O model can improve system efficiency and performance.

2. Why is the IO model needed?

In computer systems, IO operations are relatively slow, and applications usually require frequent IO operations. Different IO models can provide different processing methods to meet different needs. Choosing an appropriate I/O model can improve the system's performance, concurrent processing capabilities, and user experience, allowing the system to efficiently handle input and output operations.
The I/O model is a pattern for processing input and output operations in a computer system. It exists for several important reasons:

  1. Efficient resource utilization: I/O operations often involve data interaction with external devices, which are relatively slow to access. Using an appropriate I/O model makes full use of system resources and avoids wasting CPU time waiting for I/O operations to complete.

  2. Improved system throughput: By choosing the I/O model sensibly, the system can handle other tasks while waiting for an I/O operation to complete, improving its concurrent processing capability and throughput.

  3. Multitasking support: An I/O model can enable an application to process multiple I/O channels at the same time and to work on other I/O operations while one is pending, improving the system's concurrency.

  4. Responsiveness and interactivity: With the non-blocking or asynchronous I/O model, the application can keep performing other tasks while an I/O operation is in progress, improving the system's responsiveness and the user experience.

II. Common IO models

1. Synchronous blocking IO (Blocking IO, BIO)

Synchronous blocking I/O (Blocking I/O, BIO) is a basic I/O model and the most common one.

In the synchronous blocking I/O model, when an application initiates an input or output operation, it is blocked (that is, execution is suspended) until the operation is completed. During an I/O operation, the application cannot perform other tasks and must wait for the reading or writing of data to complete before continuing to execute subsequent code.

The basic workflow of the synchronous blocking I/O model:

1. The application initiates an I/O operation (such as reading a file or sending a network request).

2. The operating system kernel receives the application's request and hands control to the device driver.

3. The device driver starts the I/O operation and sends the request to the device (such as a hard disk or network interface).

4. The device performs the read or write, which may take some time.

5. When the data operation completes, the device driver passes the data to the operating system kernel.

6. The kernel passes the data to the application and releases it from the blocked state.

7. The application continues executing subsequent code and processes the received data.

The main feature of the synchronous blocking I/O model is that it is simple and easy to understand, but its disadvantage is that it is less efficient. When an application initiates an I/O operation, it must wait for the operation to complete, which causes the application's execution to be blocked and unable to fully utilize CPU resources. Especially in high-concurrency environments, the synchronous blocking I/O model may cause system performance degradation, because a blocking I/O operation may block the execution of other tasks.

2. Synchronous non-blocking IO (Non-blocking IO, NIO)

Synchronous non-blocking I/O (Non-blocking I/O, NIO) is an improved model compared to synchronous blocking I/O, which provides a more efficient I/O processing method.

In the synchronous non-blocking I/O model, when an application initiates an input or output operation, it is not blocked waiting for the operation to complete, but returns immediately. The application can continue to perform other tasks without waiting for the I/O operation to complete.

EWOULDBLOCK and EAGAIN are common error codes used to indicate that a non-blocking I/O operation cannot be completed immediately.

EWOULDBLOCK indicates that the operation would block. When an application performs an I/O call in non-blocking mode and the call cannot complete immediately, the operating system returns EWOULDBLOCK. This does not mean an error occurred; it simply means the operation cannot proceed right now.

EAGAIN similarly indicates that the operation cannot be completed at the moment, typically because some system or network resource is temporarily exhausted. On Linux the two codes usually share the same value, and POSIX permits them to be identical.

Basic workflow of the synchronous non-blocking I/O model:

  1. The application initiates an I/O operation.

  2. If the operation can be completed immediately (without waiting), the operating system kernel passes the data to the application, and the application continues executing subsequent code.

  3. If the operation cannot be completed immediately (needs to wait), the operating system kernel will return an error code (such as EWOULDBLOCK or EAGAIN) to notify the application that the operation cannot be completed at this time.

  4. The application can repeatedly query the status of the operation, through polling or other means, to determine when it can proceed. This polling is why the model is called synchronous non-blocking.

  5. When the operation is complete, the operating system kernel passes the data to the application, and the application continues executing subsequent code.

The main feature of the synchronous non-blocking I/O model is that the application will not be blocked and can continue to perform other tasks, thus improving the concurrency performance of the system. Compared to the synchronous blocking I/O model, it allows the application to handle other tasks while waiting for the I/O operation to complete without wasting CPU time.

However, the synchronous non-blocking I/O model requires the application to actively query the status of the operation, which usually involves polling or looping and increases programming complexity. In addition, if the application queries the operation's status too frequently, CPU resources are wasted. To solve these problems, more advanced I/O models later emerged, such as the I/O multiplexing model, the signal-driven I/O model, and the asynchronous I/O model, which provide more elegant and efficient ways to handle I/O operations.

3. Asynchronous IO (AIO)

Asynchronous non-blocking I/O (AIO) is an advanced I/O model, which is different from synchronous blocking I/O and synchronous non-blocking I/O.

In the asynchronous non-blocking I/O model, the application can return immediately after initiating an I/O operation and continue performing other tasks without waiting for the operation to complete or querying the operation status. When an I/O operation is completed, the operating system notifies the application, and the application can handle the completed I/O operation through a callback function or other means.

The basic workflow of the asynchronous non-blocking I/O model:

  1. The application initiates an asynchronous I/O operation and specifies a callback function.

  2. The operating system kernel receives the application's request and begins performing the I/O operation.

  3. The application continues to perform other tasks without waiting for the operation to complete.

  4. When the I/O operation is completed, the operating system kernel will notify the application and call the pre-specified callback function.

  5. The application processes the completed I/O operation in the callback function, obtains the operation result or performs subsequent processing.

The main advantage of the asynchronous non-blocking I/O model is that it allows the application to perform I/O in a non-blocking, asynchronous manner, improving the system's concurrency and responsiveness. An application can initiate multiple I/O operations without waiting for each one to complete, making full use of the CPU for other tasks.

The key to using the asynchronous non-blocking I/O model is to handle completed I/O operations. The application needs to handle completed operations through callback functions or other methods, such as obtaining operation results, performing subsequent processing, or initiating new operations. This asynchronous processing may require more complex programming patterns and techniques, but it can provide higher performance and better scalability.

3.1. Linux AIO

In the Linux operating system, the asynchronous I/O (AIO) API mainly consists of the following components and functions

  1. aio.h header file: defines the structures and function prototypes related to asynchronous I/O.

struct aiocb {
  int aio_fildes;               // File descriptor
  int aio_lio_opcode;           // Valid only for lio_listio (r/w/nop)
  volatile void *aio_buf;       // Data buffer
  size_t aio_nbytes;            // Number of bytes in the data buffer
  struct sigevent aio_sigevent; // Notification structure

  /* Internal fields */
  ...
};

  2. struct aiocb: a structure describing one asynchronous I/O operation. It holds the operation's parameters and status information, such as the file descriptor, buffer address, and operation type.

  3. AIO API of the Linux operating system:

API / field       Description
aio_read()        Initiate an asynchronous read operation
aio_write()       Initiate an asynchronous write operation
aio_error()       Check the error status of an asynchronous I/O operation
aio_return()      Get the return value of a completed asynchronous I/O operation
aio_suspend()     Wait for a set of asynchronous I/O operations to complete
aio_cancel()      Cancel outstanding asynchronous I/O operations
aio_init()        Initialize the asynchronous I/O runtime environment
aio_fini()        Tear down the asynchronous I/O runtime environment
struct aiocb      Parameters and status information of an asynchronous I/O operation
aio_buf           Buffer address for the operation
aio_nbytes        Number of bytes for the operation
aio_offset        File offset for the operation
aio_lio_opcode    Operation type (for lio_listio)
aio_sigevent      Notification event delivered when the operation completes

By using the above functions and structures, developers can implement asynchronous I/O operations on Linux systems. These APIs can be called using corresponding system calls or library functions in a programming language, such as system calls in C language or using the libaio library.

3.2. Let’s write a simple example in C

To get a feel for the AIO calling style: at the application level, using the Linux AIO library is actually quite simple.

We use the aio_read function for an asynchronous read. Open a file named aaa.txt and fill in the relevant fields of a struct aiocb, including the file descriptor, buffer address, number of bytes to read, and offset. Then call aio_read to initiate the asynchronous read.

Before the read completes, the aio_error function can be used to check the operation's status. If it returns EINPROGRESS, the operation is still in progress and we must wait.

Once the asynchronous read completes, the aio_return function retrieves the number of bytes read; a return value of -1 means the read failed. Finally, print the data and close the file.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <aio.h>
#include <errno.h>
#include <unistd.h>

#define BUFFER_SIZE 1024

// Note: on older glibc versions, link with -lrt
int main() {
    int fileDescriptor;
    struct aiocb aioRequest = {0};   // zero the control block before filling it in
    char buffer[BUFFER_SIZE];

    // Open the file
    fileDescriptor = open("aaa.txt", O_RDONLY);
    if (fileDescriptor == -1) {
        perror("Failed to open file");
        exit(1);
    }

    // Set up the asynchronous I/O request
    aioRequest.aio_fildes = fileDescriptor;
    aioRequest.aio_buf = buffer;
    aioRequest.aio_nbytes = BUFFER_SIZE;
    aioRequest.aio_offset = 0;

    // Initiate the asynchronous read
    if (aio_read(&aioRequest) == -1) {
        perror("Failed to initiate aio_read");
        exit(1);
    }

    // Busy-wait until the asynchronous read completes
    while (aio_error(&aioRequest) == EINPROGRESS);

    // Check the result of the asynchronous read
    ssize_t bytesRead = aio_return(&aioRequest);
    if (bytesRead == -1) {
        perror("Failed to complete aio_read");
        exit(1);
    }

    // Print the data that was read
    printf("Read %zd bytes:\n", bytesRead);
    printf("%.*s", (int)bytesRead, buffer);

    // Close the file
    close(fileDescriptor);

    return 0;
}

Now that the basic functions of AIO should be clear, let's talk about the mechanisms that can be used for asynchronous notification. We will look at signals and callback functions to understand asynchronous notification further. As mentioned above, the core of AIO is the completion callback, so what exactly is a callback notification?

3.3. Implementation of asynchronous notification

Asynchronous notification refers to notifying the application through the notification mechanism when an asynchronous I/O operation is completed. It allows applications to perform other tasks while performing asynchronous I/O operations without actively polling or blocking waiting for operations to complete.

In asynchronous I/O, the implementation of asynchronous notification usually involves the following components:

3.3.1. Signal

The operating system can send signals to applications through the signaling mechanism to notify the completion of asynchronous I/O operations. Applications can use signal handling functions to handle received signals.
A signal is a mechanism used in UNIX and UNIX-like operating systems for inter-process communication (IPC) and for handling asynchronous events. It is a software interrupt that notifies a process that some event or exception has occurred.

Signals can be generated by the kernel or the process itself and sent to the target process. When the target process receives the signal, it can take appropriate actions to handle the signal. Common signal actions include executing predefined signal processing functions, ignoring signals, terminating processes, etc.

Each signal has a unique numeric identifier, usually represented by an integer. For example, SIGINT stands for terminal interrupt signal, which is usually generated by the user pressing Ctrl+C on the terminal. SIGSEGV stands for Segmentation Fault Signal, which is generated when a process accesses an invalid memory address.

Applications can interact with signals in the following ways:

  1. Capturing Signals Applications can register signal handling functions that the operating system calls when a specified signal occurs. By catching a signal, an application can perform custom actions in response to the signal's occurrence.

  2. Ignore signals An application can choose to ignore a signal. When a signal is ignored, the operating system takes no action and the signal is discarded.

  3. Default Action Each signal has a default action, such as terminating the process or generating a core dump file. Applications can choose to restore the signal's default action.

Some common signals include:

  • SIGINT: Terminal interrupt signal, usually generated by the user pressing Ctrl+C on the terminal.
  • SIGSEGV: Segmentation fault signal, which is generated when a process accesses an invalid memory address.
  • SIGTERM: Termination signal, used to request the process to terminate normally.
  • SIGKILL: Forced termination signal, used to terminate the process immediately and cannot be blocked or ignored.

Let's do a simple example in C that catches and handles SIGINT (the terminal interrupt signal).
Define a signal handler, handle_signal. When SIGINT is received, this function prints a message and exits the program.

In main, we use the signal function to associate SIGINT with handle_signal, that is, to register the signal handler.

Then the program enters a loop, waiting for a signal. When the user presses Ctrl+C in the terminal, a SIGINT is generated and the operating system calls the registered handler.

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

// Signal handler (note: printf/exit are not strictly async-signal-safe; fine for a demo)
void handle_signal(int signum) {
    if (signum == SIGINT) {
        printf("Received SIGINT signal. Exiting...\n");
        // Custom cleanup could be performed here
        exit(0);
    }
}

int main() {
    // Register the signal handler
    signal(SIGINT, handle_signal);

    printf("Press Ctrl+C to send SIGINT signal.\n");

    // Wait for signals without burning CPU
    while (1) {
        pause();   // sleep until a signal arrives
    }

    return 0;
}

3.3.2. Completion Port

Completion ports are a powerful asynchronous notification mechanism provided by the Windows operating system, especially well suited to workloads with intensive asynchronous I/O.
A completion port offers a scalable way to handle completion notification and result retrieval for asynchronous I/O operations.

In a model that uses completion ports, an application associates asynchronous I/O operations with a completion port. When an asynchronous I/O operation completes, the operating system sends the completion results to the associated completion port. Applications can obtain the results of operations from the completion port by calling specific functions.

The main advantage of the completion port is that it is very efficient for large-scale asynchronous I/O operations. It uses an event-driven model that allows applications to handle the completion events of multiple asynchronous operations simultaneously without blocking or polling to wait. In addition, the completion port supports multi-threaded concurrent processing of operation results, providing better performance and scalability.

When using the completion port, the application needs to perform the following steps:

  1. Create a completion port: The application creates a completion port by calling a specific function.

  2. Associating asynchronous I/O operations: The application associates the asynchronous I/O operation with the completion port, usually using an operation-related structure or handle to identify the operation.

  3. Wait for the operation to complete: The application waits for any one or more operations to complete by calling a specific function. This function blocks the application until an operation is completed.

  4. Processing operation results: Once the operation is completed, the application can obtain the results of the operation from the completion port, including read data, error codes, etc.

3.3.3. Event-driven

The event-driven model is an efficient way to handle asynchronous I/O operations, especially suitable for a large number of concurrent operations or high-throughput scenarios. In the event-driven model, applications use event loops to listen for and handle completion events of I/O operations. The following are the main components and workflow of the event-driven model:
Content reference and image source https://www.scylladb.com/glossary/event-driven-architecture/

  1. Event Loop: The event loop is the core component of the event-driven model. It runs continuously, listening for events in the event queue, and calling the corresponding callback function when the event occurs.

  2. Event: An event is a specific action or state change that occurs in the application, such as file reading completion, connection establishment success, etc. Events can be queued in the event queue, waiting for processing by the event loop.

  3. Callback function: A callback function is a piece of code for event handling. When the event loop detects that an event has occurred, it calls the corresponding callback function. Callback functions usually need to be registered with the event loop before the event occurs so that the event loop knows how to handle a specific event.

The workflow of the event-driven model is as follows:

  1. The application starts the event loop.

  2. The application registers a callback function with the event loop and initiates asynchronous I/O operations.

  3. The event loop listens for events in the event queue. Once an event is detected, the event loop calls the corresponding callback function.

  4. The callback function handles completed operations and initiates new asynchronous I/O operations if needed.

  5. The event loop continues to listen for events in the event queue until the application ends.

A major advantage of the event-driven model is the ability to handle large numbers of concurrent operations more efficiently because it avoids the overhead of creating separate threads for each operation. In high-concurrency and high-throughput scenarios, the event-driven model can significantly improve application performance.

3.3.4. Callback function (Callback)

Applications can specify a callback function when initiating asynchronous I/O operations. When the operation is completed, the system automatically calls this callback function to notify the application.
Using an asynchronous notification mechanism improves the efficiency and scalability of asynchronous I/O and avoids unnecessary polling and blocking waits. The application can process results or continue other tasks immediately, rather than waiting for each operation to complete. Java's asynchronous non-blocking I/O (AIO) model is implemented with NIO.2 (introduced in Java 7) classes such as AsynchronousFileChannel and AsynchronousSocketChannel. In the AIO model, the results of asynchronous operations are handled through callbacks by implementing the CompletionHandler interface.

Let's walk through a sample program.
ReadCompletionHandler is the callback class: when the asynchronous file read completes, its completed or failed method is invoked automatically. We then read the file asynchronously with AsynchronousFileChannel, passing a ReadCompletionHandler instance as a parameter to the read method. When the read finishes, the methods of ReadCompletionHandler are called automatically to process the result.

  1. Implement the CompletionHandler interface. This interface has two methods: completed, for handling a successful operation, and failed, for handling a failed one.
public class ReadCompletionHandler implements CompletionHandler<Integer, ByteBuffer> {

    @Override
    public void completed(Integer bytesRead, ByteBuffer buffer) {
        System.out.println("Asynchronous read completed, " + bytesRead + " bytes read");
        buffer.flip();
        byte[] data = new byte[buffer.limit()];
        buffer.get(data);
        System.out.println("File contents: " + new String(data));
    }

    @Override
    public void failed(Throwable exc, ByteBuffer attachment) {
        System.err.println("Asynchronous read failed: " + exc.getMessage());
    }
}
  2. Use AsynchronousFileChannel to read the file contents asynchronously. The notification callback is wired up by passing a ReadCompletionHandler instance to AsynchronousFileChannel.read.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class AsyncFileReader {
    public static void main(String[] args) {
        Path filePath = Paths.get("example.txt");
        try (AsynchronousFileChannel fileChannel = AsynchronousFileChannel.open(filePath, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            long position = 0;
            fileChannel.read(buffer, position, buffer, new ReadCompletionHandler());

            // Keep the program alive briefly so the asynchronous operation can complete
            Thread.sleep(3000);
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}

4. Direct memory IO (Direct IO, DIO)

  • Basic concepts and principles: Read data directly from disk to the memory space used by the application without going through the operating system kernel buffer.
  • Advantages: Reduce the number of data copies and improve read and write performance.
  • Disadvantages: Requires operating system and filesystem support and has limited applicability; the application must manage buffer alignment and caching itself.
  • Application scenarios: Suitable for scenarios with high performance requirements such as reading and writing large files.

III. IO multiplexing

3.1. Concept and principle of multiplexing

Select, poll, and epoll are all I/O multiplexing mechanisms. I/O multiplexing means that a single mechanism can monitor multiple descriptors; once a descriptor becomes ready (usually read-ready or write-ready), the program is notified so it can perform the corresponding read or write. However, select, poll, and epoll are all essentially synchronous I/O: after the readiness notification, the application is still responsible for doing the read or write itself, and that read/write step blocks. With asynchronous I/O, by contrast, the kernel copies the data from kernel space to user space on the application's behalf.

3.2. select model

The select function monitors multiple I/O events and returns when an event occurs; it is suitable for scenarios with a modest number of connections. By watching multiple file descriptors (such as sockets) for state changes, the select model lets us build concurrent network applications.

Select uses a descriptor set to save the file descriptors (usually sockets) that need to be monitored. When the status of a descriptor changes (for example, data is readable, writable, or abnormal), the select function returns to inform which descriptors have changed.

The main advantage of the select model is that it has good cross-platform compatibility and supports multiple operating systems. But it has some disadvantages when dealing with a large number of concurrent connections:

  1. Low efficiency: Each time the select function is called, all file descriptor sets need to be copied to the kernel space. When the number of monitored file descriptors is large, this will lead to more switching between kernel mode and user mode, affecting performance.
  2. The number of file descriptors that can be monitored is limited: Since select uses the fd_set structure to store file descriptors, the size of this structure is fixed, limiting the maximum number of file descriptors that select can monitor.
  3. After the event is triggered, it is necessary to traverse the entire file descriptor set to determine which descriptors have changed, which will cause a certain performance loss in the case of a large number of concurrent connections.

Despite this, select is still a simple and practical I/O multiplexing technology. For some application scenarios with a small number of concurrent connections, the performance of the select model is still acceptable.

3.3. poll model

The poll model is an advanced I/O multiplexing technology, similar to select, and is also used to monitor state changes of multiple file descriptors (such as sockets). The poll model has many similarities with the select model, but there are also some key improvements that make it perform better when handling large numbers of concurrent connections.

The poll model uses a polling structure array (pollfd structure array) to save the file descriptors that need to be monitored and the events of concern. When the status of a certain descriptor changes, the poll function returns to inform which descriptors have changed and their specific status changes.

The main advantages of the poll model are as follows:

  1. No limit on the number of file descriptors: Unlike select, the poll model does not use a fixed-size structure to store file descriptors, but uses a dynamically sized array of polling structures. Therefore, poll has no fixed limit on the number of file descriptors and can handle more concurrent connections.
  2. More efficient event processing: poll records each descriptor's state change directly in the polling structure array (the revents field), so after an event fires this information can be accessed directly, without rebuilding and rescanning the descriptor sets the way select requires.

However, the poll model still has some efficiency issues when handling a large number of concurrent connections:

  1. Execution efficiency is limited by linear polling: when the number of concurrent connections is large, the entire polling structure array needs to be traversed to find changed descriptors, which will result in a certain performance loss.
  2. A lot of data exchange with the kernel is required: every time the poll function is called, the entire array of polling structures needs to be copied to kernel space. When the number of monitored file descriptors is large, this will lead to more switching between kernel mode and user mode, affecting performance.

The poll model is a more efficient I/O multiplexing technology than select and is suitable for handling a larger number of concurrent connections. However, it may still face some performance issues in extremely high concurrency scenarios. To address these problems, you can consider using more efficient I/O reuse technology, such as epoll (Linux) or IOCP (Windows).

At the operating system level, the poll model is mainly implemented through system calls. In Linux and Unix-like systems, the poll model is implemented through the poll() system call. This system call allows applications to poll the I/O status of multiple file descriptors (including sockets, ordinary files, etc.), thereby achieving multiplexing. The following explains the working principle of the poll model in detail from the operating system level.

  1. File descriptors: In Unix and Unix-like systems, all open files, sockets, pipes, etc. are represented by a non-negative integer called a file descriptor. File descriptors uniquely identify an open file and provide a unified interface for reading, writing, and manipulating files.

  2. The pollfd structure: The poll model uses an array of pollfd structures to store the file descriptors to monitor and the events of interest. The pollfd structure contains the following fields:

    • int fd: represents the file descriptor.
    • short events: the events of interest; for example, POLLIN means interest in input events (readable), POLLOUT means interest in output events (writable).
    • short revents: Indicates events that actually occur on the file descriptor. When the poll() function returns, this field will be set to the event type that occurred.
  3. The poll() system call: The poll model queries the status of file descriptors through the poll() system call. The prototype of the poll() function is as follows:

    int poll(struct pollfd *fds, nfds_t nfds, int timeout);
    

    Among them, fds is a pointer to the pollfd structure array, nfds represents the number of file descriptors in the array, timeout represents the waiting time (in milliseconds), -1 represents infinite waiting, and 0 represents immediate return.

When we call the poll() function, the operating system copies the pollfd structure array to kernel space and then checks the status of each file descriptor. If the status of a file descriptor changes (such as readable or writable), the operating system will set the corresponding revents field to the corresponding event type. When all file descriptors have been checked, the poll() function returns and copies the pollfd structure array back to user space. At this point, we can iterate through the array and check the revents field of each file descriptor to determine what events occurred.

From an operating system perspective, the advantages and disadvantages of the poll model

advantage:

  1. Can handle large numbers of file descriptors: Since the poll model uses dynamically sized arrays to store file descriptors, there is no fixed limit on the number of file descriptors.

  2. Directly returns the file descriptor where the event occurred: Compared with the select model, the poll model directly stores the file descriptor where the event occurred and its specific event type in the array. This allows us to process events faster without having to iterate through the entire file descriptor set like select does.

shortcoming:

  1. Linear search efficiency is low: When there are a large number of file descriptors, finding the file descriptor where the event occurred requires traversing the entire array, resulting in low efficiency.

  2. Frequent switching between user mode and kernel mode: Each time the poll() function is called, the entire pollfd array needs to be copied to the kernel space, and then copied back to the user space. This will lead to more switching between user mode and kernel mode, affecting performance.

3.4. epoll model

epoll is an efficient I/O event notification mechanism in the Linux kernel and an enhanced version of the Linux multiplexed I/O interfaces select and poll. Its main advantages are high performance and the absence of fixed limits when handling a large number of concurrent connections. epoll's working principle, usage, and advantages and disadvantages are explained in detail below.
Many popular open source middleware projects use the epoll model to achieve high-performance concurrency, which is a large part of why they perform so well in high-concurrency environments:

  1. Nginx adopts the epoll model to achieve high concurrency and high throughput.
  2. Redis uses the epoll model to handle a large number of concurrent connections.
  3. Haproxy is an open source load balancer and proxy server that uses the epoll model to achieve high concurrency and high availability.

3.4.1. Working principle

epoll uses an event-driven mechanism: multiple I/O events are registered with an epoll object, and the kernel notifies the application when those events occur. Unlike select and poll, epoll does not need to traverse the entire set of monitored descriptors; it is implemented with kernel callbacks, which avoids the overhead of a linear scan. In addition, epoll returns only the events that have actually occurred, so processing is more efficient.

3.4.2. How to use

Here are the basic steps to use epoll:

  • Create an epoll object: call epoll_create or epoll_create1 to create an epoll object.
  • Register events: use epoll_ctl to add the file descriptors to monitor (sockets, files, etc.) and their events of interest (such as EPOLLIN, EPOLLOUT) to the epoll object.
  • Wait for events: call epoll_wait to wait for registered events to occur. When events occur, epoll_wait returns the set of ready events.
  • Process events: handle each event in the set returned by epoll_wait.
  • Deregister events: if a file descriptor no longer needs to be monitored, call epoll_ctl to remove it from the epoll object.
  • Close the epoll object: close it with the close function.

3.4.3. Advantages and Disadvantages

| Advantages | Disadvantages |
| --- | --- |
| Event-driven: only events that have occurred are returned, avoiding the overhead of a linear scan. | Linux-only, not cross-platform. |
| No fixed limits; can handle large numbers of concurrent connections. | Steeper learning curve: the epoll API is more complex than select and poll, so the learning cost is higher. |
| For large numbers of connections, epoll outperforms select and poll. | For small numbers of connections its advantages are not obvious, and it is comparatively complicated to use. |

3.5. kqueue model

Similar to the epoll model, but available on BSD-derived systems (such as FreeBSD and macOS) and suitable for high-concurrency scenarios.

3.6. IOCP model

IOCP (I/O Completion Ports) is a high-performance I/O model under the Windows operating system. IOCP provides efficient and scalable I/O operations for high-concurrency, high-throughput network applications. In the IOCP model, all I/O operations are performed asynchronously. The core idea is to separate I/O operations from the work threads that handle I/O completion, so that the threads can focus on processing logic.

3.6.1. Main components of IOCP

  1. Completion Port: A special operating system object used to manage and process completion notifications for asynchronous I/O operations.
  2. Completion Routine: A callback function that handles notification of completion of asynchronous I/O operations.
  3. OVERLAPPED structure: A data structure used to describe asynchronous I/O operations, including completion routines, completion keys and other related information.

3.6.2. Workflow of IOCP model

  1. Create a completion port: the application creates a completion port for receiving and processing I/O completion notifications.
  2. Associate a file descriptor: associate a socket (or other handle) with the completion port so that I/O operations on it are handled through the completion port.
  3. Initiate an asynchronous I/O operation: when an I/O operation is needed (such as sending or receiving data), the application initiates it asynchronously. The call returns immediately and the thread continues with other tasks.
  4. Wait for I/O completion: a worker thread waits for completion notifications by calling the GetQueuedCompletionStatus() function. When a notification arrives, the function returns with the relevant completion information.
  5. Process the completion notification: the worker thread invokes the corresponding completion routine based on the completion information, handles the result of the I/O operation, and proceeds to the next step (such as initiating further asynchronous I/O operations).

3.7. Comparison of advantages and disadvantages

| Implementation | Advantages | Disadvantages |
| --- | --- | --- |
| select | 1. Good cross-platform support and high portability. 2. Simple and easy to use, suitable for entry-level learning. | 1. The number of file descriptors is limited by FD_SETSIZE (1024 by default). 2. System calls are expensive and require traversing the entire file descriptor set. 3. Only level-triggered, which can easily cause performance problems. |
| poll | 1. Good cross-platform support and high portability. 2. No limit on the number of file descriptors. | 1. System calls are expensive and also require traversing the file descriptor set. 2. Only level-triggered, which can easily cause performance problems. |
| epoll | 1. Excellent performance on Linux. 2. No limit on the number of file descriptors. 3. Small system-call overhead; event-driven. 4. Supports both edge-triggered and level-triggered modes. | 1. Linux-only, poor portability. 2. Higher programming complexity in edge-triggered mode. |
| kqueue | 1. Excellent performance on BSD platforms. 2. No limit on the number of file descriptors. 3. Small system-call overhead; event-driven. 4. Supports both edge-triggered and level-triggered modes. | 1. Only available on BSD-derived platforms (such as FreeBSD and macOS), poor portability. 2. Higher programming complexity in edge-triggered mode. |
| IOCP (Windows) | 1. Excellent performance on Windows. 2. Completion-port-based event notification; event-driven. 3. Supports asynchronous operations, suitable for high-concurrency scenarios. | 1. Windows-only, poor portability. 2. High programming complexity. |

Reference documentation

  1. "Boost application performance using asynchronous I/O", https://developer.ibm.com/articles/l-async/ (the figures in this article are taken from it)

  2. https://notes.shichao.io/unp/ch6/

  3. http://www.linuxidc.com/Linux/2012-05/59873p3.htm

  4. http://xingyunbaijunwei.blog.163.com/blog/static/76538067201241685556302/

  5. http://blog.csdn.net/kkxgx/article/details/7717125

  6. https://banu.com/blog/2/how-to-use-epoll-a-complete-example-in-c/epoll-example.c

Origin blog.csdn.net/wangshuai6707/article/details/133306701