Linux IO modes and select, poll, epoll explained

Some concepts:

Virtual address space: The set of all addresses visible to a process. It is a remapping of the physical addresses that have been assigned to the process.

The size of the addressable range depends on the word size of the machine. The address space is divided into kernel space and user space. On 32-bit Linux systems, the highest 1 GB is kernel space and the lowest 3 GB are user space.
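The 3 GB / 1 GB figures follow from simple arithmetic on a 32-bit address. The sketch below only does that arithmetic; it assumes the default split described above, and the variable names are made up for illustration.

```python
# Arithmetic for a 32-bit virtual address space with a 3 GB / 1 GB split.
GIB = 1 << 30                     # bytes in one gigabyte (2**30)

total = 1 << 32                   # 2**32 addressable bytes with 32-bit addresses
kernel_size = 1 * GIB             # highest 1 GB: kernel space
user_size = total - kernel_size   # lowest 3 GB: user space

user_top = user_size              # first address above user space

print(total // GIB)               # 4
print(user_size // GIB)           # 3
print(hex(user_top))              # 0xc0000000
```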

Process blocking: Blocking is an action taken by the process itself, for example when it waits for an event that has not happened yet. While a process is blocked, it does not consume CPU time.

File descriptor (fd): A non-negative integer used as an index. It points into the table of open files that the kernel maintains for each process.
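A small sketch can make this concrete. It uses a pipe only because that is an easy way to obtain two fresh descriptors; the same holds for files and sockets.

```python
import os

# os.pipe() asks the kernel for two new descriptors: a read end and a write end.
read_fd, write_fd = os.pipe()

# Both are plain non-negative integers that index this process's open-file table.
print(read_fd, write_fd)

os.write(write_fd, b"hello")
data = os.read(read_fd, 5)

os.close(read_fd)
os.close(write_fd)

# After close(), the number no longer refers to an open file.
try:
    os.fstat(read_fd)
    still_open = True
except OSError:
    still_open = False

print(data, still_open)
```

The exact numbers printed depend on which descriptors the process already has open.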

Cached IO: Most file systems use cached IO by default. On a read, the data is first copied into the operating system's kernel buffer (the page cache) and then copied from there into the application's address space.

Example:

When a read operation occurs, it goes through two stages:

1. Waiting for the data to be ready

2. Copying the data from the kernel to the process

There are 5 Linux IO modes:

Blocking IO

Non-blocking IO

IO multiplexing

Signal-driven IO (not commonly used)

Asynchronous IO

 

Blocking IO:

In Linux, all sockets are blocking by default. The diagram is as follows:

 

Of the two stages mentioned above, blocking IO blocks the process during both.
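A minimal sketch of this behavior, assuming a local socketpair in place of a real network connection and a helper thread (the hypothetical `send_later`) that plays the remote peer:

```python
import socket
import threading
import time

# socketpair() returns two connected sockets; both are blocking by default.
reader, writer = socket.socketpair()

def send_later():
    time.sleep(0.2)               # stage 1: the data is not ready yet
    writer.sendall(b"ping")

sender = threading.Thread(target=send_later)
sender.start()

start = time.monotonic()
data = reader.recv(4)             # blocks while waiting and while copying
elapsed = time.monotonic() - start

sender.join()
reader.close()
writer.close()

print(data, round(elapsed, 2))
```

The call to `recv` does not return until the peer has sent something, so the measured time is roughly the peer's delay.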

 

Non-blocking IO

The diagram is as follows:

The defining feature of non-blocking IO is that the user process must repeatedly ask the kernel whether the data is ready.
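The repeated asking can be sketched as a loop around `recv` on a socket put into non-blocking mode. The socketpair, the helper thread, and the 10 ms pause between attempts are assumptions made for the example.

```python
import socket
import threading
import time

reader, writer = socket.socketpair()
reader.setblocking(False)         # recv() now returns immediately

def send_later():
    time.sleep(0.2)
    writer.sendall(b"ping")

sender = threading.Thread(target=send_later)
sender.start()

attempts = 0
while True:
    try:
        data = reader.recv(4)     # succeeds only once the data is ready
        break
    except BlockingIOError:       # EAGAIN / EWOULDBLOCK: nothing to read yet
        attempts += 1
        time.sleep(0.01)          # the process is free to do other work here

sender.join()
reader.close()
writer.close()

print(data, attempts)
```

Each failed attempt is a system call that did no useful work, which is the cost of this model.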

 

IO multiplexing

Also known as event-driven IO; it includes select, poll, and epoll. It lets a single process handle IO for multiple network connections at the same time.

The principle is that select, poll, and epoll monitor all the sockets they are responsible for. When data arrives on any of those sockets, the user process is notified.

The diagram is as follows:

When the user process calls select, the whole process is blocked. Meanwhile, the kernel monitors all the sockets that select is responsible for. As soon as the data in any socket is ready, select returns. The user process then calls read to copy the data from the kernel to the user process.
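The select-then-read sequence might look like the following sketch. Three local socketpairs stand in for three client connections, and the one-second timeout is only a safety net for the example.

```python
import select
import socket

# Three connected pairs stand in for three client connections.
pairs = [socket.socketpair() for _ in range(3)]
readers = [r for r, _ in pairs]

# Only the second "client" sends something.
pairs[1][1].sendall(b"hello")

# Block until at least one of the monitored sockets is readable.
readable, _, _ = select.select(readers, [], [], 1.0)

received = {}
for sock in readable:
    # Second system call: copy the data from the kernel to the process.
    received[readers.index(sock)] = sock.recv(1024)

for r, w in pairs:
    r.close()
    w.close()

print(received)   # {1: b'hello'}
```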

Two system calls are used here, select and recvfrom, whereas blocking IO needs only one (recvfrom), so for a single connection this is less efficient. The advantage of select is not that it processes a single connection faster, but that it can handle many connections at the same time.

Therefore, when the number of connections is not very high, a web server using select/epoll is not necessarily better than a web server using multi-threading + blocking IO, and its latency may even be higher.

In the IO multiplexing model, each socket is generally set to non-blocking. Even so, the user process spends most of its time blocked; it is just blocked by the select call rather than by socket IO.
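As a rough illustration of that combination, the sketch below registers non-blocking sockets with epoll and blocks only in the epoll wait. It relies on Python's `select.epoll`, which is available on Linux only; the socketpairs and the bounded retry loop are assumptions made for the example.

```python
import select
import socket

pairs = [socket.socketpair() for _ in range(3)]
by_fd = {}

ep = select.epoll()
for reader, _ in pairs:
    reader.setblocking(False)                  # the sockets themselves never block
    ep.register(reader.fileno(), select.EPOLLIN)
    by_fd[reader.fileno()] = reader

# Two of the three "clients" send data.
pairs[0][1].sendall(b"a")
pairs[2][1].sendall(b"c")

received = []
for _ in range(5):                             # bounded so the sketch cannot hang
    # The process blocks here, in epoll, not in recv().
    for fd, events in ep.poll(1.0):
        if events & select.EPOLLIN:
            received.append(by_fd[fd].recv(1024))
    if len(received) >= 2:
        break

ep.close()
for r, w in pairs:
    r.close()
    w.close()

print(sorted(received))   # [b'a', b'c']
```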

 

Asynchronous IO

The process of asynchronous IO is as follows:

 

After initiating the read, the user process returns immediately and can do other things.

After the kernel receives the read request, it does not block the user process. Instead, the kernel waits until the data is ready, copies it into user memory, and then sends the user process a signal to say that the read operation is complete.
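Python's standard library does not expose kernel asynchronous IO directly, so the following is only an emulation of the calling pattern, not real kernel AIO: a worker thread stands in for the kernel, and a callback (the hypothetical `on_complete`) stands in for the completion signal.

```python
from concurrent.futures import ThreadPoolExecutor
import os
import threading

read_fd, write_fd = os.pipe()
done = threading.Event()
result = {}

def on_complete(future):
    # Plays the role of the kernel's completion signal.
    result["data"] = future.result()
    done.set()

with ThreadPoolExecutor(max_workers=1) as pool:
    # "Initiate the read" and return at once; the worker waits and copies.
    future = pool.submit(os.read, read_fd, 1024)
    future.add_done_callback(on_complete)

    other_work = sum(range(1000))     # the caller keeps running meanwhile
    os.write(write_fd, b"payload")    # the data arrives later

    done.wait(timeout=2.0)            # by now the data is already in user memory

os.close(read_fd)
os.close(write_fd)

print(result["data"], other_work)
```

The point to notice is that the caller never waits inside the read itself; it only learns afterwards that the data has already been delivered.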

 

Summary:

The difference between blocking and non-blocking

A blocking IO call blocks the calling process until the operation is complete, while a non-blocking IO call returns immediately even if the kernel is still preparing the data.

The difference between synchronous IO and asynchronous IO

The key to telling synchronous IO from asynchronous IO is whether the process is blocked during the real IO operation (such as recvfrom).

With non-blocking IO, the process is not blocked while the kernel's data is not ready. However, once the data is ready, recvfrom copies it from the kernel to user memory, and the process is blocked for the duration of that copy. So non-blocking IO, like blocking IO, is synchronous IO.

Asynchronous IO means that the process initiates the IO operation, returns immediately, and pays no further attention to it until it receives the IO completion signal sent by the kernel. The process is not blocked at any point.

Put differently, synchronous and asynchronous IO differ in two ways: how the user process learns that the data is ready (it checks actively, or it is told by a signal), and who copies the data to user memory once the kernel has it (the process itself, or the kernel).

Comparison diagram of each IO model:

This article is compiled from the following reference:

ref: https://segmentfault.com/a/1190000003063859

