Unix Network Communication Model

This article summarizes the main content of Chapter 6 of "Unix Network Programming: Volume 1", combined with several excellent blog articles.

Reference articles:

https://segmentfault.com/a/1190000003063859#articleHeader17

https://zhuanlan.zhihu.com/p/63179839

https://www.cnblogs.com/fysola/p/6146063.html

 

Classification

  • Blocking I/O
  • Non-blocking I/O
  • I/O multiplexing
  • Signal-driven I/O
  • Asynchronous I/O

The first four are synchronous I/O; the last one is asynchronous I/O.

I/O Model

Most file system I/O operations use cached I/O by default. In Linux's cached I/O mechanism, the operating system keeps I/O data in the file system's page cache: data is first copied into an operating system kernel buffer, and only then copied from that kernel buffer into the application's address space.

So an I/O operation goes through two stages:

  1. Waiting for the data to be ready: the data may come from another application or from the network; if no data has arrived yet, the operating system waits for it.
  2. Copying the data from the kernel to the process: once the data is ready, it is copied into the application's buffer.

In a Unix system, I/O is performed through system calls such as recvfrom(), and a recvfrom call includes both steps: waiting for the data to be ready and copying the data. For input operations on a socket, the first step usually involves waiting for data to arrive from the network. When the awaited packet arrives, it is copied to a buffer in the kernel. The second step is to copy the data from the kernel buffer to the application process's buffer.

Blocking I/O Model

The following figure shows the typical blocking I/O model. When a user process calls the recvfrom system call, the kernel begins the first phase of I/O: preparing the data (for network I/O, the data often has not arrived yet, for example a complete UDP datagram has not been received, so the kernel must wait for enough data to arrive). During this phase the data is being copied into a buffer in the operating system kernel, and on the user side the whole process is blocked (a block the process brought on itself by making the call). When the kernel has the data ready, it copies the data from kernel space into user memory and returns the result; only then does the user process leave the blocked state and run again.

(Figure: blocking I/O model)

Therefore, the defining feature of blocking I/O is that the process is blocked during both stages of the I/O operation.

Below is server code that uses a typical blocking call: recv is a blocking method, so when the program reaches recv it waits until data has been received before continuing.

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('localhost', 6999)) # bind the port to listen on
server.listen(5)
while True:
    conn, addr = server.accept() # block until a client connects
    print(conn, addr)
    while True:
        try:
            data = conn.recv(1024)  # block until data is received
            if not data:            # empty bytes means the client closed the connection
                break
            print('receive:', data) # print the received data
            conn.send(data.upper()) # send the data back, upper-cased
        except Exception as e:
            print('the active connection was closed!')
            break
    conn.close()
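
To exercise this server, a minimal client sketch along the same lines could look like this (it is not from the original article; it assumes the server above is running on localhost port 6999):

import socket

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('localhost', 6999))   # same host and port the server binds to
client.send(b'hello')
print(client.recv(1024))              # expect b'HELLO', the upper-cased echo
client.close()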

Non-blocking I/O

On Linux, a socket can be set to non-blocking mode. When a read operation is performed on a non-blocking socket, the flow is as shown in the figure below. When an application process repeatedly calls recvfrom on a non-blocking socket in a loop like this, we call it polling: the application process keeps asking the kernel whether the data is ready. This usually wastes a lot of CPU time.

 

(Figure: non-blocking I/O model)

Once the application issues an I/O request, the recvfrom() system call executes and returns immediately, but what it returns is not the result of a completed I/O operation; instead it returns a specific error indicating that the data is not ready, so no I/O is performed. The application keeps issuing (i.e., polling) the recvfrom() system call until the data is ready, at which point the operating system completes the I/O operation and recvfrom() returns successfully. In this model, the system call returns immediately when the data is not ready, so the application does not block inside the operating system waiting for data; it polls for the result instead.

The key difference between non-blocking I/O and blocking I/O is whether the recvfrom system call returns immediately. Because polling consumes a lot of CPU time, this model is rarely used on its own.

The following is server-side code using non-blocking calls:


#!/usr/bin/python
# -*- coding: UTF-8 -*-
import socket
sk = socket.socket()
sk.bind(('127.0.0.1', 8080))
sk.setblocking(False)   # make every normally blocking socket method non-blocking
sk.listen(10)
conn_list = []     # stores every connection that clients have made to the server
while True:
    try:
        conn, addr = sk.accept() # does not block, but raises BlockingIOError if no client is connecting
        print('conn:', addr, conn)
        conn_list.append(conn)
    except BlockingIOError:
        pass                    # no pending connection yet
    tmp_list = [conn for conn in conn_list]
    for conn in tmp_list:
        try:
            data = conn.recv(1024)  # receive up to 1024 bytes; raises BlockingIOError if no data yet
            if data:
                print('received: {}'.format(data.decode()))
                conn.send(data)
            else:
                print('close conn', conn)
                conn.close()
                conn_list.remove(conn)
                print('remaining connections:', len(conn_list))
        except Exception:
            pass                # no data ready yet (BlockingIOError) or the connection failed

Client-side code:

import socket

client = socket.socket()
client.connect(('127.0.0.1', 8080))   # must match the port the server binds to

while True:
    msg = input(">>>")
    if not msg:
        continue                      # sending empty bytes would make recv wait forever
    if msg != 'q':
        client.send(msg.encode())
        data = client.recv(1024)
        print('received: {}'.format(data.decode()))
    else:
        client.close()
        print('close client socket')
        break

Advantages of the non-blocking I/O model: a single thread can serve multiple clients at the same time, and it can do other work while waiting for the I/O to complete.

Disadvantages of the non-blocking I/O model: constantly polling recv consumes a lot of CPU, and handling the resulting BlockingIOError exceptions is also wasted CPU work. (Note: a blocked process does not consume CPU.)

 

I/O Multiplexing

I/O multiplexing is what we usually mean by select, poll, and epoll; in some places this approach is also called event-driven I/O. With I/O multiplexing we can call select or poll and block on one of those system calls instead of blocking on the real I/O system call.

The benefit of select/epoll is that a single process can handle many network connections at the same time. The basic principle is that select, poll, or epoll keeps checking all the sockets it is responsible for, and when data arrives on any of them it notifies the user process. As shown below:

 

(Figure: I/O multiplexing model)

As shown above, the application process blocks in the call to select, waiting for a socket to become readable. When select reports that a socket is readable, we call recvfrom to copy the datagram into the application process's buffer.

Compared with blocking I/O, I/O multiplexing does not appear to have any advantage; in fact it requires two system calls (select and then recvfrom), so a single I/O is slightly worse. Its real advantage is that we can use select to wait for many descriptors at once. We will look at this advantage of I/O multiplexing in more detail later. A small sketch of the idea follows below.
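
The article gives no code for this model, so here is a minimal sketch of an echo server built on select (the address, port, and buffer size are arbitrary choices for illustration, not from the book):

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('127.0.0.1', 8080))
server.listen(10)
server.setblocking(False)

inputs = [server]          # sockets that select watches for readability
while True:
    # block here, inside select, instead of inside recv
    readable, _, _ = select.select(inputs, [], [])
    for sock in readable:
        if sock is server:
            conn, addr = sock.accept()   # a new client is ready to be accepted
            conn.setblocking(False)
            inputs.append(conn)
        else:
            data = sock.recv(1024)       # will not block: select reported it readable
            if data:
                sock.sendall(data)       # echo the data back
            else:
                inputs.remove(sock)      # empty bytes: the client closed the connection
                sock.close()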

 

Signal-driven I/O

With signal-driven I/O, we first enable the socket's signal-driven I/O capability and install a signal handler with the sigaction system call. That system call returns immediately and the process keeps working (i.e., the application process is not blocked). When a datagram is ready to be read, the kernel delivers a SIGIO signal to the process. We can then call recvfrom in the signal handler to read the datagram.

As shown below:

 

 

(Figure: signal-driven I/O model)

Signal-driven I/O is also a non-blocking model, but compared with the non-blocking I/O model above it does not need to poll to check whether the underlying data is ready; instead it passively receives a signal and then calls recvfrom to perform the I/O.

Compared with the I/O multiplexing model, signal-driven I/O handles one complete I/O on its own, while the multiplexing model is aimed at scenarios with many simultaneous I/O operations. A rough sketch follows below.
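
The article gives no code for this model either, so here is a rough Python sketch for Linux only (it assumes the platform exposes signal.SIGIO, fcntl.F_SETOWN, and os.O_ASYNC; the UDP port 9000 is an arbitrary choice):

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import fcntl
import os
import signal
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('127.0.0.1', 9000))        # arbitrary port for illustration

def handle_sigio(signum, frame):
    # The kernel has signalled that data is ready; recvfrom now only has to copy it.
    try:
        data, addr = sock.recvfrom(1024)
        print('received', data, 'from', addr)
    except BlockingIOError:
        pass

signal.signal(signal.SIGIO, handle_sigio)                 # install the signal handler
fcntl.fcntl(sock.fileno(), fcntl.F_SETOWN, os.getpid())   # deliver SIGIO to this process
flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
fcntl.fcntl(sock.fileno(), fcntl.F_SETFL, flags | os.O_ASYNC | os.O_NONBLOCK)

while True:
    signal.pause()   # the process is free to do other work; here it just sleeps until a signal arrives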

Asynchronous I/O

Asynchronous I/O is defined by the POSIX specification. The POSIX asynchronous I/O functions begin with aio_ or lio_. They work by telling the kernel to start an operation and having the kernel notify us when the entire operation, including copying the data from the kernel into our own buffer, has finished. The main difference from the signal-driven model described earlier is this: with signal-driven I/O the kernel tells us when we can initiate an I/O operation, while with asynchronous I/O the kernel tells us when the I/O operation has completed. As shown below:

(Figure: asynchronous I/O model)

As shown above, after the user process initiates the read operation it can immediately go off and do other things. From the kernel's point of view, when it receives an asynchronous read it returns immediately, so the user process is never blocked. The kernel then waits for the data to be ready, copies it into user memory, and when everything is done it sends a signal to the user process telling it that the read has completed.
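
Python does not expose the POSIX aio_* interface, so purely as a loose illustration of the "start the operation and be told when it has fully completed" programming model, here is an asyncio echo server sketch (note the swap: CPython's asyncio event loop on Linux is itself built on I/O multiplexing via epoll, so this mimics the style at the application level rather than using kernel asynchronous I/O; the address and port are arbitrary):

import asyncio

async def handle(reader, writer):
    data = await reader.read(1024)   # this coroutine is suspended here, not blocked
    writer.write(data.upper())
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, '127.0.0.1', 8888)
    async with server:
        await server.serve_forever()

asyncio.run(main())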

 

Summary

Comparison of synchronous I/O and asynchronous I/O

POSIX defines the two terms as follows:

Synchronous I/O: the requesting process is blocked until the I/O operation completes.

Asynchronous I/O: the requesting process is not blocked.

By this definition, the first four models are all synchronous I/O, because the actual I/O operation (recvfrom) blocks the process.

Only the asynchronous I/O model matches the POSIX definition of asynchronous I/O.

 


Source: blog.csdn.net/qq_35462323/article/details/94562424