Five Linux IO models: blocking IO, non-blocking IO, IO multiplexing, asynchronous IO, and signal-driven IO

Table of Contents

1. Blocking IO

2. Non-blocking IO

3. IO multiplexing

4. Asynchronous IO

5. Signal-driven IO (SIGIO)

6. Comparison


1. Blocking IO

In Linux, all sockets are blocking by default. A typical read operation process is as follows:

When the user calls the read system call, the kernel starts the first stage of IO: preparing the data. For network IO, the data often has not arrived yet at the beginning (for example, a complete packet has not been received), so the kernel has to wait for enough data to arrive. On the user side, the whole process is blocked. When the kernel has waited until the data is ready, it copies the data from kernel space to user memory and returns the result; only then is the user process unblocked and able to run again.

Therefore, the characteristic of blocking IO is that both stages of IO execution (waiting for data and copying data) are blocked.

Interfaces such as send() and recv() are blocking by default. Using these interfaces, it is easy to build a server/client model. Below is a simple "one question, one answer" server.
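
A minimal sketch of such a blocking echo-style server; the port number and buffer size are arbitrary choices for illustration, and error handling is omitted:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(8000);            /* arbitrary example port */

    bind(listenfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listenfd, 10);

    for (;;) {
        int connfd = accept(listenfd, NULL, NULL);     /* blocks until a client connects */
        char buf[1024];
        ssize_t n = recv(connfd, buf, sizeof(buf), 0); /* blocks until data arrives */
        if (n > 0)
            send(connfd, buf, n, 0);                   /* send the "answer" back */
        close(connfd);
    }
}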

Most socket interfaces are blocking. A so-called blocking interface means that the system call (usually an IO call) does not return until it has obtained a result or a timeout error occurs, keeping the current thread blocked in the meantime.

In fact, unless otherwise specified, almost all IO interfaces (including socket interfaces) are blocking. This creates a big problem for network programming: while a thread is blocked in a call such as send(), it cannot perform any other operation or respond to any other network request.

A simple improvement is to use multiple threads (or processes) on the server side, giving each connection an independent thread (or process) so that the blocking of any one connection does not affect the others. There is no fixed rule for choosing between processes and threads. Traditionally the cost of a process is much higher than that of a thread, so if the server must serve many clients at once, multiple processes are not recommended; but if a single service body consumes significant CPU resources, for example large-scale or long-running computation or file access, a separate process is safer. Typically, pthread_create() is used to create a new thread and fork() to create a new process.

Now suppose we raise the requirements on the above server/client model: the server must provide the question-and-answer service to multiple clients at the same time. That leads to the following model.

In this model, the main thread loops waiting for client connection requests; whenever a connection arrives, it creates a new thread that provides the same question-and-answer service as in the previous example. A sketch follows.
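
A sketch of this thread-per-connection model; handle_client and serve_forever are hypothetical names, and error handling is omitted:

#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static void *handle_client(void *arg) {
    int connfd = *(int *)arg;
    free(arg);
    char buf[1024];
    ssize_t n;
    while ((n = recv(connfd, buf, sizeof(buf), 0)) > 0)  /* blocks only this thread */
        send(connfd, buf, n, 0);
    close(connfd);
    return NULL;
}

void serve_forever(int listenfd) {
    for (;;) {
        int *connfd = malloc(sizeof(int));
        *connfd = accept(listenfd, NULL, NULL);  /* main thread blocks here */
        pthread_t tid;
        pthread_create(&tid, NULL, handle_client, connfd);
        pthread_detach(tid);                     /* let the thread clean up after itself */
    }
}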

Many beginners may not understand why a single listening socket can be accepted multiple times. In fact, the designers of the socket API deliberately prepared for the multi-client case, which is why accept() returns a new socket. Here is the prototype of the accept interface:

int accept(int s, struct sockaddr *addr, socklen_t *addrlen);

The input parameter s is the socket descriptor carried over from socket(), bind(), and listen(). After bind() and listen() have executed, the operating system starts monitoring all connection requests on the specified port and appends each request to a queue. Calling accept() extracts the first connection from the request queue of socket s and creates a new socket, similar to s, whose descriptor it returns. This new descriptor is the input parameter for subsequent read() and recv() calls. If the request queue is currently empty, accept() blocks until a request enters the queue.

The multi-threaded server model above seems to perfectly solve the requirement of providing question-and-answer service to multiple clients, but not always. If you need to respond to hundreds or thousands of connection requests at the same time, both multi-threading and multi-processing will seriously occupy system resources, reduce the system's responsiveness to the outside world, and make the threads and processes themselves more likely to hang.

Many programmers may consider a "thread pool" or "connection pool". A "thread pool" aims to reduce the frequency of creating and destroying threads: it maintains a reasonable number of threads and lets idle threads take on new tasks. A "connection pool" maintains a cache of connections, reusing existing connections as much as possible and reducing the frequency of opening and closing them. Both techniques reduce system overhead very well and are widely used in many large systems, such as WebSphere, Tomcat, and various databases. However, "thread pool" and "connection pool" only relieve, to a certain extent, the resource cost of frequent IO calls. Moreover, a "pool" always has an upper limit; when requests greatly exceed that limit, a system built on a pool does not respond much better than one without it. So a "pool" must be sized with the scale of the requests it will face in mind.

For the thousands or even tens of thousands of simultaneous client requests that may occur in the example above, a "thread pool" or "connection pool" may relieve some of the pressure, but it cannot solve everything. In short, the multi-threaded model can easily and efficiently handle small-scale service requests, but in the face of large-scale requests it too hits a bottleneck. Non-blocking interfaces can be tried to work around this problem.

2. Non-blocking IO

Under Linux, a socket can be made non-blocking by setting a flag on it. When a read operation is performed on a non-blocking socket, the flow is as follows:

When the user process issues a read, if the data in the kernel is not yet ready, the call does not block the user process but immediately returns an error. From the user process's perspective, it gets a result immediately after initiating the read, without waiting; when it sees that the result is an error, it knows the data is not ready yet, so it can issue the read again. Once the data in the kernel is ready and the kernel receives another system call from the user process, it immediately copies the data to user memory and returns. So with non-blocking IO, the user process has to keep actively asking the kernel whether the data is ready.

In the non-blocking state, the recv() interface returns immediately when called, and its return value carries different meanings; a small sketch handling each case follows the list below.

  • recv() returns a value greater than 0: data has been received, and the return value is the number of bytes received;
  • recv() returns 0: the connection has been closed normally;
  • recv() returns -1 with errno equal to EAGAIN: the data is not ready yet, and the operation should be retried later;
  • recv() returns -1 with errno not equal to EAGAIN: the recv operation hit a genuine system error, described by errno.
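
A sketch of interpreting these return values on a non-blocking socket (try_recv is a hypothetical helper name):

#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

ssize_t try_recv(int fd, char *buf, size_t len) {
    ssize_t n = recv(fd, buf, len, 0);
    if (n > 0) {
        /* n bytes of data were received */
    } else if (n == 0) {
        /* the peer closed the connection normally */
    } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
        /* data not ready yet; try again later */
    } else {
        perror("recv");  /* a real system error, described by errno */
    }
    return n;
}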

The significant difference between a non-blocking interface and a blocking interface is that the call returns immediately. A descriptor fd can be put into the non-blocking state with fcntl():

int flags = fcntl(fd, F_GETFL);          /* read the current flags first */
fcntl(fd, F_SETFL, flags | O_NONBLOCK);  /* add O_NONBLOCK without clobbering the others */

The following gives a model that uses only one thread yet can detect at the same time whether data has arrived from multiple connections, and receive it.
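
A sketch of that polling model, assuming the connections have already been accepted and set non-blocking; conns and nconns are hypothetical names for the connection set:

#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

void poll_connections(int *conns, int nconns) {
    char buf[1024];
    for (;;) {                             /* busy loop: this is what burns CPU */
        for (int i = 0; i < nconns; i++) {
            if (conns[i] < 0)
                continue;
            ssize_t n = recv(conns[i], buf, sizeof(buf), 0);
            if (n > 0) {
                /* process the received data */
            } else if (n == 0 || errno != EAGAIN) {
                close(conns[i]);           /* peer closed, or a real error */
                conns[i] = -1;
            }
            /* n < 0 with EAGAIN: nothing ready on this socket yet */
        }
    }
}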

As can be seen, the server thread can call the recv() interface in a loop and receive data from all connections in a single thread. But this model is never recommended: calling recv() in a tight loop greatly increases CPU usage, and in this scheme recv() serves mostly to test "is the operation complete yet". The operating system actually provides more efficient interfaces for that test, such as the select() multiplexing mechanism, which can check in one call whether any of multiple connections is active.

3. IO multiplexing

The term IO multiplexing may be a bit unfamiliar, but select/epoll probably rings a bell. In some places this IO model is also called event-driven IO. As we know, the advantage of select/epoll is that a single process can handle the IO of multiple network connections at the same time. The basic principle is that the select/epoll function continuously polls all the sockets it is responsible for and notifies the user process when data arrives on any of them. The flow is as follows:

When the user process calls select, the whole process is blocked, and at the same time the kernel "monitors" all the sockets select is responsible for. As soon as the data in any socket is ready, select returns; the user process then calls the read operation to copy the data from the kernel to user space.

This flow is not much different from that of blocking IO; in fact it is even worse, because two system calls (select and read) are needed here while blocking IO issues only one (read). But the great advantage of select is that a user can handle IO on multiple sockets within one thread: the user registers multiple sockets and then repeatedly calls select to find the activated ones, which achieves handling multiple IO requests in the same thread at the same time. Under the synchronous blocking model, that goal can only be reached with multiple threads. (One more note: if the number of connections handled is not very high, a web server using select/epoll will not necessarily perform better than one using multi-threading plus blocking IO, and its latency may even be greater. The advantage of select/epoll is not handling a single connection faster, but handling more connections.)

In the multiplexing model, each socket is generally set to non-blocking; but, as described above, the entire user process is in fact blocked the whole time. It is just that the process is blocked by the select function rather than by socket IO. So from the process's point of view, select() behaves much like blocking IO.

Most Unix/Linux systems support the select function, which is used to detect status changes on multiple file descriptors. For the function prototype and examples of select, see another of the blogger's posts: Examples of select, poll, and epoll of I/O multiplexing

The following re-simulates the model of receiving data from multiple clients from the example above, this time with select().
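
A minimal sketch of such a select() loop; listenfd is assumed to be an already-listening socket, and error handling is omitted:

#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

void select_loop(int listenfd) {
    int conns[FD_SETSIZE];
    int nconns = 0;
    char buf[1024];

    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(listenfd, &readfds);        /* a client connect() shows up as readable here */
        int maxfd = listenfd;
        for (int i = 0; i < nconns; i++) {
            FD_SET(conns[i], &readfds);
            if (conns[i] > maxfd)
                maxfd = conns[i];
        }

        select(maxfd + 1, &readfds, NULL, NULL, NULL);  /* the only place we block */

        if (FD_ISSET(listenfd, &readfds) && nconns < FD_SETSIZE - 1)
            conns[nconns++] = accept(listenfd, NULL, NULL);

        for (int i = 0; i < nconns; i++) {
            if (!FD_ISSET(conns[i], &readfds))
                continue;
            ssize_t n = recv(conns[i], buf, sizeof(buf), 0);
            if (n <= 0) {                  /* connection closed or errored: drop it */
                close(conns[i]);
                conns[i--] = conns[--nconns];
            }
            /* n > 0: data received; a full server would prepare an answer here */
        }
    }
}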

The model above only describes the process of using the select() interface to receive data from multiple clients at the same time; since select() can simultaneously detect the read status, write status, and error status of multiple descriptors, it can easily be extended into a server system that provides independent question-and-answer service to multiple clients.

What needs to be pointed out here is that a connect() operation on the client side will trigger a "readable event" on the server side, so select() can also detect connect() behavior from the client side.

In the above model, the most critical part is how to dynamically maintain the three select() parameters readfds, writefds, and exceptfds. As an input parameter, readfds should mark all the descriptors on which "readable events" need to be detected, including the "parent" descriptor that detects connect(); meanwhile writefds and exceptfds should mark all the descriptors for "writable events" and "error events" (marked with FD_SET()). As output parameters, readfds, writefds, and exceptfds hold the descriptor values of all the events captured by select(). The programmer needs to check each flag bit (with FD_ISSET()) to determine exactly which descriptors had events.

The above model mainly simulates a "one question, one answer" service flow, so if select() finds that a descriptor has captured a "readable event", the server program should perform recv() promptly, prepare the data to send according to what was received, and add the corresponding descriptor to writefds in preparation for detecting a "writable event" in the next select() call. Likewise, if select() finds that a descriptor has captured a "writable event", the program should perform send() promptly and get ready for the next round of "readable event" detection.

The characteristic of this model is that each execution cycle will detect one or a group of events, and a specific event will trigger a specific response. We can classify this model as an "event-driven model."

Compared with other models, the event-driven model using select() executes with only a single thread (process), occupies few resources, does not consume much CPU, and can still serve multiple clients. If you are trying to build a simple event-driven server program, this model has real reference value.

But this model still has many problems. First of all, the select() interface is not the best choice for implementing "event driven", because when the number of descriptors to watch is large, the select() interface itself consumes a lot of time polling each descriptor. Many operating systems provide more efficient interfaces: Linux provides epoll, BSD provides kqueue, Solaris provides /dev/poll, and so on. If you need to implement a more efficient server program, an interface like epoll is recommended. Unfortunately, these specially provided interfaces differ greatly between operating systems, so using an epoll-like interface to implement a server with good cross-platform support is rather difficult.

Secondly, this model mixes event detection and event response together; once the body of an event response is large, it is catastrophic for the whole model. For example, a huge execution body responding to event 1 directly delays the execution body responding to event 2 and greatly reduces the timeliness of event detection.

Fortunately, there are many efficient event-driven libraries that shield these difficulties. Common ones include the libevent library and, as a libevent replacement, the libev library. These libraries select the most appropriate event detection interface according to the characteristics of the operating system and add techniques such as signals to support asynchronous response, which makes them the best choice for building an event-driven model.
In fact, starting from kernel 2.6, Linux has also introduced IO operations that support asynchronous response, such as aio_read and aio_write: this is asynchronous IO.

4. Asynchronous IO

Asynchronous IO under Linux is used for disk IO read and write operations rather than network IO, and was introduced in kernel 2.6. Let's look at its flow first.

After the user process initiates the read operation, it can immediately start doing other things. On the kernel side, when it receives an asynchronous read, it first returns immediately, so it does not block the user process at all. The kernel then waits for the data preparation to complete and copies the data to user memory. When all of that is done, the kernel sends a signal to the user process telling it that the read operation is complete.
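
A sketch of this flow using the POSIX AIO interface named above (aio_read). The file name and the busy-wait are illustrative simplifications, since a real program would do other work or request a completion signal instead; with glibc, the program links against -lrt:

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.txt", O_RDONLY);  /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    static char buf[4096];
    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

    /* The process is free to do other work here while the read proceeds. */

    while (aio_error(&cb) == EINPROGRESS)
        ;  /* simplified busy-wait; real code would do useful work or sleep */

    ssize_t n = aio_return(&cb);
    printf("read %zd bytes\n", n);
    close(fd);
    return 0;
}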

Asynchronous IO is truly non-blocking: it causes no blocking at all for the requesting process, which is very important for implementing high-concurrency web servers.

5. Signal-driven IO (SIGIO)

First, we enable signal-driven I/O on the socket and install a signal handler; the process keeps running without blocking. When the datagram is ready to be read, the kernel generates a SIGIO signal for the process. We can then either call the I/O function inside the signal handler to read the datagram and notify the main loop that the data is ready to be processed, or simply notify the main loop and let it read the datagram. However the SIGIO signal is handled, the advantage of this model is that the process is not blocked while waiting for the datagram to arrive (the first stage). It eliminates the blocking and polling of select, and an active socket is handled by the registered handler.
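
A minimal sketch of enabling signal-driven I/O on a socket; enable_sigio and the flag-based handler are illustrative choices rather than the only way to structure this:

#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t data_ready = 0;

/* SIGIO handler: only set a flag; the main loop does the actual read */
static void sigio_handler(int signo) {
    (void)signo;
    data_ready = 1;
}

/* Switch an existing socket sockfd into signal-driven I/O mode */
static void enable_sigio(int sockfd) {
    struct sigaction sa;
    sa.sa_handler = sigio_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGIO, &sa, NULL);              /* install the handler first */

    fcntl(sockfd, F_SETOWN, getpid());        /* deliver SIGIO to this process */
    int flags = fcntl(sockfd, F_GETFL);
    fcntl(sockfd, F_SETFL, flags | O_ASYNC);  /* turn on signal-driven I/O */
}

The main loop would then test data_ready and call recv() (or recvfrom() for a datagram socket) when it is set.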

For examples of the two important functions signal and sigaction for signal-driven IO, see another blog post by the blogger: linux signals: signal and sigaction

6. Comparison

After the introduction above, you will find that the difference between non-blocking IO and asynchronous IO is quite clear. With non-blocking IO, although the process is not blocked most of the time, it still has to actively check, and once the data preparation is complete it also has to actively call recvfrom again to copy the data to user memory.

Asynchronous IO is completely different. It is as if the user process hands the entire IO operation over to someone else (the kernel) to complete, and that someone sends a signal when it is finished. During this period, the user process neither needs to check the status of the IO operation nor actively copy the data.
