Several IO models

IO

Real IO operations involve interaction with IO devices, but the operating system does not let applications talk to devices directly. The IO we usually talk about is therefore the interaction between the application and the operating system, carried out through system calls: read() for reading and write() for writing. Different operating systems may implement them differently, but the functionality is the same. Calling read copies data from the operating system's kernel buffer into the application's buffer; calling write copies data from the application's buffer into the kernel buffer. The actual interaction with the IO device is handled by the operating system, which at an appropriate time uses the interrupt mechanism (looking up the interrupt vector) to communicate with the corresponding device.
IO is therefore divided into two stages: the interaction between the kernel and the device, and the interaction between the kernel and the application.
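The kernel/application stage can be sketched with Python's os module, whose os.write and os.read are thin wrappers over the write and read system calls. This is only an illustration: the helper name write_then_read and the use of a temporary file are assumptions made for the example, not part of the original text.

```python
import os
import tempfile


def write_then_read(payload: bytes) -> bytes:
    """Round-trip payload through a temporary file using raw system calls."""
    fd, path = tempfile.mkstemp()
    try:
        # os.write wraps write(): bytes are copied from our buffer into the
        # kernel buffer. The kernel decides when they reach the device.
        written = 0
        while written < len(payload):
            written += os.write(fd, payload[written:])

        # Move the file offset back to the start before reading.
        os.lseek(fd, 0, os.SEEK_SET)

        # os.read wraps read(): up to 4096 bytes are copied from the kernel
        # buffer into a new bytes object owned by the application.
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:  # empty result means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling write_then_read(b"hello io") returns the same bytes. Note that the application never touches the disk itself; it only exchanges buffers with the kernel.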

Synchronous IO and asynchronous IO

Synchronous IO means that the user thread is the party that actively initiates the IO request, and the operating system kernel is the party that passively accepts it.
Asynchronous IO is the opposite: the operating system kernel is the active party, and the user thread is the passive one. The user thread registers callback functions for various IO events with the kernel, and the kernel calls them when those events occur.

Blocking IO and non-blocking IO

Blocking IO means that the kernel IO operation must be fully complete before control returns to the user thread. "Blocking" describes the execution state of the user program.
Non-blocking IO means that the user program does not wait for the kernel IO operation to complete. After the IO call is made, the kernel immediately returns a status value to the user thread. The user thread can continue to execute, so it is in a non-blocking state.

BIO

BIO is blocking IO, or more precisely synchronous blocking IO. From the moment the application issues the IO system call until the call returns, the user thread is blocked. After the call returns successfully, the thread starts processing the data in the application buffer.
Suppose the BIO model is used to read a file on the disk:
(1) From the moment read is called, the user thread enters the blocked state.
(2) When the kernel receives the read system call, it begins to prepare the data. At first the data may not have reached the kernel buffer (a block may not have been read yet), so the kernel has to wait.
(3) Once the complete data has arrived, the kernel copies it from the kernel buffer to the application buffer, and read returns the result (the number of bytes copied into the user buffer).
(4) After read returns, the user thread recovers from blocking and continues to run.
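The steps above can be sketched with a blocking socket read. This is a minimal illustration, not a disk read: a socket pair stands in for the device, a helper thread stands in for the data arriving late, and the names blocking_read_demo and produce are made up for the example.

```python
import socket
import threading
import time


def blocking_read_demo(delay=0.2):
    """Return the data read and how long the calling thread was blocked."""
    reader, writer = socket.socketpair()

    def produce():
        # Simulates the kernel waiting for data: nothing is available
        # until `delay` seconds have passed.
        time.sleep(delay)
        writer.sendall(b"malatang")

    producer = threading.Thread(target=produce)
    producer.start()

    start = time.monotonic()
    # Sockets are blocking by default, so recv does not return until data
    # has arrived and been copied into the application's buffer.
    data = reader.recv(1024)
    waited = time.monotonic() - start

    producer.join()
    reader.close()
    writer.close()
    return data, waited
```

The caller can do nothing else between issuing recv and getting the data back, which is why the measured wait is roughly the producer's delay.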

Therefore, in the BIO model the user thread is blocked during both stages of the kernel's IO operation.
For example, Xiao Wang goes to a restaurant and orders a Malatang, then keeps staring at the pick-up counter until the boss places the Malatang there. Then he fetches his Malatang and eats it.
Here Xiao Wang is the application and the boss is the operating system kernel. Ordering corresponds to initiating an IO request, and fetching the food corresponds to copying data from the kernel buffer to the process buffer.

Since it is Xiao Wang who initiates the order, this is synchronous; and since Xiao Wang keeps staring at the pick-up counter without doing anything else, he is in a blocked state. So this is the synchronous blocking IO model.

NIO

NIO is non-blocking IO. In the NIO model, once the application thread issues the IO system call, one of two things happens:
(1) If there is no data in the kernel buffer, the system call returns immediately with a failure indication.
(2) If there is data in the kernel buffer, the call blocks while the data is copied from the kernel buffer to the user process buffer; the system call then returns successfully and the thread resumes running.
Suppose the NIO model is used to read a file on the disk:
(1) While the kernel data is not ready, an IO request from the user thread returns immediately. To eventually read the data, the user thread generally has to keep issuing IO system calls (in a while loop).
(2) Once the kernel data has arrived, the next system call from the user thread puts the thread into the blocked state. The kernel copies the data from the kernel buffer to the process buffer and returns the result.
(3) After the user thread has read the data, it leaves the blocked state and continues to run. In other words, the user process has to make several attempts before the data is actually read, and only then does it continue.
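The polling loop described above can be sketched with a non-blocking socket. As before, a socket pair and a helper thread stand in for the slow device, and the function name polling_read_demo and its parameters are assumptions made for the example.

```python
import socket
import threading
import time


def polling_read_demo(delay=0.2, interval=0.01):
    """Poll a non-blocking socket; return the data and the failed attempts."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)  # recv now returns immediately

    def produce():
        time.sleep(delay)  # data becomes available only after `delay`
        writer.sendall(b"malatang")

    producer = threading.Thread(target=produce)
    producer.start()

    failed_attempts = 0
    while True:
        try:
            # If the kernel buffer has data, it is copied out and returned.
            data = reader.recv(1024)
            break
        except BlockingIOError:
            # No data yet: the call failed immediately instead of blocking.
            failed_attempts += 1
            time.sleep(interval)  # the thread is free to do other work here

    producer.join()
    reader.close()
    writer.close()
    return data, failed_attempts
```

Each failed attempt is one "is the meal ready?" question. Without the sleep between attempts the loop would spin and burn CPU, which is the cost of this model.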

Therefore, in the NIO model the user thread is not blocked while the operating system interacts with the device, but it does block while the operating system copies the buffer to the application.
Again Xiao Wang goes to the restaurant and orders a Malatang. This time he asks the boss whether the meal is ready, the boss says no, and he asks again every 3 minutes until it is ready. Then he fetches his Malatang and eats it.

Since it is still Xiao Wang who initiates the order, this is synchronous; and since the boss answers immediately every time he is asked, it is non-blocking. So this is the synchronous non-blocking IO model.

AIO

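AIO is the asynchronous IO model. The basic flow is: the user thread registers a function with the kernel through a system call. After the entire IO operation (including data preparation and copying the buffer data) is complete, the kernel notifies the user thread, which then carries out the follow-up business logic. Throughout the kernel's data processing, covering both stages (reading data from the network card or disk into the kernel buffer, and copying data from the kernel buffer to the user buffer), the user thread never needs to block.
Suppose the AIO model is used to read a file on the disk:
(1) After the user thread issues the read system call, it can carry on executing. The user thread does not block, and it does not need to poll to check whether the kernel has the data ready.
(2) The kernel starts the first stage of the IO: interacting with the device to prepare the data. Once the data is ready, it performs the second stage, copying the data from the kernel buffer to the process buffer.
(3) The kernel sends the user process a signal, or calls back the function the user thread registered, to notify the user thread that the read operation has finished.
(4) The user thread reads the data in the process buffer and completes the follow-up business logic.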
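Python's standard library does not expose kernel-level AIO directly, so the sketch below only emulates the pattern: a worker thread plays the role of the kernel, performs the whole read, and then invokes a registered callback. The names async_read_demo, read_whole_file and on_complete are hypothetical, and a real AIO implementation would not need the extra thread.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


def async_read_demo(payload=b"fried noodles"):
    """Emulate register-callback-then-continue; return (data, other_work)."""
    fd, path = tempfile.mkstemp()
    os.write(fd, payload)
    os.close(fd)

    done = threading.Event()
    result = {}

    def read_whole_file():
        # Stands in for both kernel stages: preparing the data and copying
        # it into the application's buffer.
        with open(path, "rb") as f:
            return f.read()

    def on_complete(future):
        # The registered callback: runs once the whole read has finished.
        result["data"] = future.result()
        done.set()

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(read_whole_file)
        future.add_done_callback(on_complete)

        # The caller neither blocks nor polls; it simply does other work.
        other_work = sum(range(1000))

        # Only for the demo: wait so the result can be returned.
        done.wait(timeout=5)

    os.unlink(path)
    return result.get("data"), other_work
```

The important part is the order of events: the read is handed off, the caller continues immediately, and the data shows up in the callback without the caller ever asking for it.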

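Therefore, in the AIO model the user thread does not block in either IO stage: neither while the kernel waits for data nor while it copies the data.
Suppose the restaurant now offers a delivery service. Xiao Wang orders fried noodles in the delivery app and fills in his address; once he sees that the order has gone through, he goes off to do other things. The courier then delivers the meal to Xiao Wang's door, and Xiao Wang happily enjoys it.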

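Filling in the address corresponds to registering a function with the kernel. Xiao Wang did not sit waiting during either phase, from the cooking of the meal to the courier's delivery, so this is non-blocking, and overall it is the asynchronous IO model.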

IO multiplexing

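The IO multiplexing model introduces a new kind of system call, called a selector, whose job is to query the readiness of IO. On Linux the corresponding system calls are select/epoll; epoll is an enhanced version of select. Through such a call, one process can monitor multiple file descriptors; as soon as a descriptor becomes ready (generally meaning that the kernel buffer is readable or writable), the kernel can return the ready state to the application. The application then issues the appropriate IO system calls based on that state.
Suppose the IO multiplexing model is used to write a socket listening server:
(1) Register the IO operations. Each time a client socket connection is accepted, register it with the select/epoll selector.
(2) Start polling for readiness. Using the selector's query method, query the readiness of all registered socket connections. Through this query system call the kernel returns a list of ready sockets. Whenever the data of any registered socket is ready, that is, the kernel buffer has data, the kernel adds that socket to the ready list.
While polling, the user thread responsible for the polling stays blocked.
(3) After the user thread obtains the ready list, it issues a read system call for each socket connection in it. The user thread blocks while the kernel copies the data from the kernel buffer to the process buffer.
(4) After the copy completes, the kernel returns the result, and only then does the user thread recover from blocking and continue.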
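The four steps above can be sketched with Python's selectors module, which picks epoll on Linux and falls back to other mechanisms elsewhere. This is a minimal single-threaded illustration under stated assumptions: the function name multiplex_demo is made up, the clients live in the same process and connect over loopback, and each client closes its sending side after one message.

```python
import selectors
import socket


def multiplex_demo(messages):
    """One thread serves several clients; returns their messages, sorted."""
    sel = selectors.DefaultSelector()
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    server.listen()
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, data="listener")

    # Demo clients: connect, send one message, close the sending side.
    clients = []
    for message in messages:
        client = socket.create_connection(server.getsockname())
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)
        clients.append(client)

    buffers = {}
    finished = []
    while len(finished) < len(messages):
        # Step (2): block until the kernel reports at least one ready socket.
        events = sel.select(timeout=5)
        if not events:
            break  # safety net so the demo cannot hang forever
        for key, _mask in events:
            sock = key.fileobj
            if key.data == "listener":
                # Step (1): register every newly accepted connection.
                conn, _addr = sock.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="client")
                buffers[conn] = b""
            else:
                # Steps (3) and (4): read from a socket known to be ready.
                chunk = sock.recv(1024)
                if chunk:
                    buffers[sock] += chunk
                else:  # empty read: the client has finished sending
                    finished.append(buffers.pop(sock))
                    sel.unregister(sock)
                    sock.close()

    for client in clients:
        client.close()
    sel.unregister(server)
    server.close()
    sel.close()
    return sorted(finished)
```

For example, multiplex_demo([b"ramen", b"rice", b"soup"]) collects all three messages with a single thread, which blocks only in sel.select while no socket is ready.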
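Suppose Xiao Wang, Xiao Zhang and Xiao Li go to a restaurant. Xiao Wang orders ramen, Xiao Zhang orders rice with toppings, and Xiao Li orders Hulatang. They hire someone to watch the status of their meals and to tell each of them to fetch his meal as soon as it is ready.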

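In this process, the person the three of them hired acts as the selector: he has to keep checking the status of the meals. When Xiao Wang's meal is ready he notifies Xiao Wang, who then goes to fetch it, and the same goes for the other two. This is IO multiplexing.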

Summary

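BIO and NIO are not usable in high-concurrency scenarios: BIO needs one thread per IO connection, and NIO has to poll constantly, which consumes CPU. NIO is rarely used on its own.
AIO has the best throughput, followed by IO multiplexing.
At present the main mature AIO implementation is IOCP on Windows; AIO support on Linux is still incomplete, so most high-concurrency network applications running on Linux use the IO multiplexing model.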

Origin blog.csdn.net/qq_32076957/article/details/128825147