Five IO models

IO stands for Input and Output. For a computer, the instructions we give through the keyboard and mouse are a form of input, and the text the computer shows on the monitor after we type it is a form of output. When writing a blog post, the text the computer receives from the keyboard and sends to the platform is output; when we look something up and open a blog post, that can be understood as input to the computer.

The core of the operating system is the kernel. It is independent of ordinary applications, can access the protected memory space, and has full permission to access the underlying hardware devices. To guarantee that user processes cannot operate on the kernel directly, the operating system divides the virtual address space into two parts: kernel space and user space.

The IO data mentioned above is usually first buffered by the OS in kernel space and then copied into user space, so Input and Output can be simplified to:

External input --> OS kernel space --> user space --> user process

User process output --> user space --> OS kernel space --> external device
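
As a rough illustration (a minimal Python sketch; the peer address is just a placeholder), a single blocking recv() call hides both of these steps: the kernel first waits for the data to arrive, then copies it from kernel space into the user process's buffer:

```python
import socket

# Minimal sketch: one blocking recv() covers both phases described above.
# Phase 1: the kernel waits for data to arrive from the network (kernel space).
# Phase 2: the kernel copies that data into this process's buffer (user space).
sock = socket.create_connection(("example.com", 80))   # placeholder peer
sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
data = sock.recv(4096)   # returns only after both phases are complete
print(len(data), "bytes copied from kernel space into user space")
sock.close()
```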

Blocking IO

When a user process obtains IO data, the simplest way is to do it serially, which is blocking IO (BIO).

Example: after finishing development, we submit a merge request. We then walk over to the committer's desk and find he is not there, so we wait until he comes back and watch him review the code until the review is finished; only then do we relax and go do other things.

(figure: blocking IO)

Before JDK 1.4, network connections used the BIO model: once the server socket receives a request, it cannot receive or handle any other request until the current one is finished, which can simply be understood as serial processing. The model is simple, but standing around waiting for the committer after finishing our own code looks a lot like slacking off, and the efficiency is low. As the picture above shows, we do not actually have to stay by the committer's desk waiting for him to come back, which leads to non-blocking IO.
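
A minimal blocking echo server sketch in Python (the address and port are placeholders): while one client is being served, accept() and recv() block, and no other connection is handled, which is exactly the serial behavior described above.

```python
import socket

# Blocking (BIO-style) echo server: one client at a time.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 8000))        # placeholder address/port
server.listen()

while True:
    conn, addr = server.accept()        # blocks until a client connects
    with conn:
        while True:
            data = conn.recv(1024)      # blocks until data arrives (or the client closes)
            if not data:
                break
            conn.sendall(data)          # still serving only this one client
```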

Non-blocking IO

Since BIO suffers from low, serial efficiency, we optimize it: instead of waiting for the data to be ready, the user process actively asks the kernel again and again whether it is ready.

Example: after finishing development, we submit a merge request. We see the committer is not at his desk, so we know the review cannot be done yet. This time we do not wait beside his desk; we go off to other work and come back now and then to check whether he has returned. Once we find him back at his desk, we stay next to him and watch him review the code, and only after he finishes do we relax and do other things.

(figure: non-blocking IO)

Compared with BIO, in NIO the kernel returns immediately when the data is not ready. After it returns, the application process can go do other things, so it is not blocked in the first stage, but it has to keep actively asking whether the kernel's data is ready; the second stage (copying the data into user space) still blocks. NIO is better than BIO, but the application process still has to keep polling, which leads to the IO multiplexing model.
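
A minimal non-blocking sketch (the server address is a placeholder): recv() on a non-blocking socket returns immediately with BlockingIOError when the kernel has nothing ready, so the process keeps asking.

```python
import socket
import time

sock = socket.create_connection(("127.0.0.1", 8000))   # placeholder server
sock.sendall(b"hello")
sock.setblocking(False)                                 # switch to non-blocking mode

while True:
    try:
        data = sock.recv(1024)      # the first stage no longer blocks...
    except BlockingIOError:
        # ...the kernel says "not ready yet": do other work, then ask again.
        time.sleep(0.1)
        continue
    break                           # the second stage (the copy) happened inside recv()

print(data)
sock.close()
```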

IO multiplexing

The IO multiplexing model removes the application process's active polling and hands the job of checking whether data is ready over to the kernel: the kernel traverses the descriptors with select/poll to check readiness, or handles it through epoll's callbacks.

Example: this time we have to submit both front-end and back-end code, and several committers are responsible for reviewing. Before the review, a secretary records which merge request contains which code. After we submit the merge and find the committers are not there, the secretary notes it down, and instead of us waiting at the committers' desks, she walks around to each committer's desk to see whether he is back. As soon as she sees a committer at his desk, she notifies the person whose merge that committer is responsible for; that programmer goes over and watches him review the code, and only after the review is finished does he relax and do other things.

(figure: IO multiplexing)

IO multiplexing, also called event-driven IO, monitors multiple sockets in a single thread at the same time: select or poll polls all the sockets it is responsible for, and when data arrives on one of them, the user process is notified. IO multiplexing is essentially the same as non-blocking IO, except that with the select system call the kernel takes over the polling that the requesting process would otherwise have to do itself. It looks like one extra system call of overhead compared with non-blocking IO, but supporting many IO streams at once improves efficiency. The process first blocks on select/poll, and then blocks again in the second stage on the read operation.
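
A minimal select-based sketch in Python (address and port are placeholders): one thread registers the listening socket and every client socket, blocks on select(), and only touches the sockets the kernel reports as ready.

```python
import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 8000))        # placeholder address/port
server.listen()
server.setblocking(False)

watched = [server]
while True:
    readable, _, _ = select.select(watched, [], [])   # block here instead of busy-polling
    for s in readable:
        if s is server:
            conn, _ = s.accept()        # a new client connected
            conn.setblocking(False)
            watched.append(conn)
        else:
            data = s.recv(1024)         # data is already ready; only the copy remains
            if data:
                s.sendall(data)
            else:
                watched.remove(s)
                s.close()
```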

There are three main implementations of IO multiplexing: select, poll, and epoll. select records the application process's requests in a fixed-size array, so the number of descriptors it can monitor is limited; poll uses a linked list, fixing select's limit on the number of monitored descriptors. To cut down the kernel's redundant traversals, epoll turns the active check into passive notification and is implemented with callbacks, as shown below:

(figure: epoll workflow)

Compared with select/poll, epoll uses two extra system calls: epoll_create creates an epoll instance in the kernel, epoll_ctl registers the events of interest, and epoll_wait blocks the user process while it waits for IO events.
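
A minimal Linux-only sketch using Python's select.epoll wrapper (address and port are placeholders), where epoll() corresponds to epoll_create, register()/unregister() to epoll_ctl, and poll() to epoll_wait:

```python
import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 8000))              # placeholder address/port
server.listen()
server.setblocking(False)

ep = select.epoll()                            # epoll_create
ep.register(server.fileno(), select.EPOLLIN)   # epoll_ctl: interested in readability
conns = {}

while True:
    for fd, event in ep.poll():                # epoll_wait: block until events arrive
        if fd == server.fileno():
            conn, _ = server.accept()
            conn.setblocking(False)
            ep.register(conn.fileno(), select.EPOLLIN)
            conns[conn.fileno()] = conn
        elif event & select.EPOLLIN:
            conn = conns[fd]
            data = conn.recv(1024)
            if data:
                conn.sendall(data)
            else:
                ep.unregister(fd)              # epoll_ctl: stop watching this socket
                conn.close()
                del conns[fd]
```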

The differences between select, poll, and epoll are as follows:

(figure: comparison of select, poll, and epoll)

The drawback of IO multiplexing is that the application process still blocks after issuing the request (first on select/poll/epoll_wait, then on the read).

Signal-driven IO

The biggest difference between signal-driven IO and IO multiplexing is that the user process is not blocked during the data-preparation stage. As shown in the figure: when the user process needs to wait for data, it first tells the kernel what data it wants and then goes on doing other things; when the data in the kernel is ready, the kernel immediately sends a signal to the user process saying "the data is ready, come and read it". After receiving the signal, the user process immediately calls recvfrom to fetch the data.

Example: after we submit the merge request, the secretary again records it. We see the committer is not at his desk and go straight back to our own work. Once the committer returns to his desk, the secretary notifies me; I go over to the committer's desk and watch him review the code, and only after the review is finished do I relax and do other things.

(figure: signal-driven IO)

In short: signal-driven IO removes the blocking of the first stage, but it still blocks while the programmer watches the review, i.e. during the data copy.
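
A minimal, Linux-specific sketch of signal-driven IO on a UDP socket using Python's fcntl and signal modules (the address and port are placeholders): the descriptor is put into O_ASYNC mode so the kernel delivers SIGIO when data arrives, and the handler then calls recvfrom.

```python
import fcntl
import os
import signal
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 9000))               # placeholder address/port

def on_sigio(signum, frame):
    # The kernel says the data is ready; recvfrom now only performs the copy
    # from kernel space into user space (the stage that still blocks).
    try:
        data, addr = sock.recvfrom(1024)
    except BlockingIOError:
        return                               # spurious signal: nothing to read yet
    print("received", data, "from", addr)

signal.signal(signal.SIGIO, on_sigio)

# Deliver SIGIO for this descriptor to our process and enable async notification.
fcntl.fcntl(sock, fcntl.F_SETOWN, os.getpid())
flags = fcntl.fcntl(sock, fcntl.F_GETFL)
fcntl.fcntl(sock, fcntl.F_SETFL, flags | os.O_ASYNC | os.O_NONBLOCK)

while True:
    signal.pause()                           # do "other work" here; no polling needed
```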

Asynchronous IO

AIO (asynchronous IO) makes the entire IO process (both stages) truly non-blocking. The user process returns immediately after issuing the system call; the kernel waits for the data to be ready, copies it into the user process's buffer, and only then sends a signal telling the user process that the IO operation has completed (the difference from SIGIO: one signal says "the data is ready", the other says "the IO has finished").

Example: after we submit the merge request, we see the committer is not at his desk and go straight back to our own work. When the committer returns to his desk, he sees our merge request and reviews the code by himself; when the review is finished, he notifies me that the code has been merged and can be released.

(figure: asynchronous IO)
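
A sketch of the completion-style programming model using Python's asyncio (the port is a placeholder). One caveat: on Linux, asyncio is built on readiness-based multiplexing (epoll) rather than true kernel AIO such as io_uring or Windows IOCP, but the way it is used mirrors the "hand the whole operation over and be told when it is done" flow described above.

```python
import asyncio

async def handle(reader, writer):
    data = await reader.read(1024)   # the coroutine simply resumes when the read has completed
    writer.write(data)               # echo it back
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 8000)  # placeholder port
    async with server:
        await server.serve_forever()

asyncio.run(main())
```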

Summary

Finally, to summarize: the five IO models above improve on one another step by step. The overall classification is as follows:

(figure: classification of the five IO models)

Author: lazy AI patients | Reposted from: InfoQ
