IO model (blocking IO, non-blocking IO, signal-driven IO, asynchronous IO, multiplexed IO)

1. Five typical IO models

  • Blocking IO, non-blocking IO, signal-driven IO, asynchronous IO, multiplexing model.
  • The process of IO:
    initiate an IO call, wait for the IO condition to be ready, and then copy the data to the buffer for processing - wait/copy.
1. Blocking IO:
  • An IO call is initiated to complete the IO. If the conditions for the call are not yet met, the call waits until they are, and returns only once the IO is complete.
  • The flow is very simple: the next IO call can be made only after the current one completes. Resources are not fully utilized, because the process spends most of its time waiting.
2. Non-blocking IO:
  • An IO call is initiated to complete the IO. If the IO conditions are not currently met, the call returns an error immediately. (The process usually goes off to do other work and retries the IO call later.)
  • Compared with blocking IO, resources are utilized more fully, but the IO is not handled in real time: data that becomes ready is only noticed on the next retry.
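The "return an error immediately" behaviour can be sketched in C as follows. This is a minimal sketch, not code from the original article: the helper names set_nonblocking and try_read and the return-code convention are assumptions made for illustration. It relies on the standard O_NONBLOCK flag and on read failing with EAGAIN/EWOULDBLOCK when nothing is ready.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch fd to non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* One non-blocking read attempt (hypothetical helper).
 * Returns the number of bytes read (0 means end of file),
 * -1 if no data is ready yet (EAGAIN/EWOULDBLOCK),
 * or -2 on any other error. */
long try_read(int fd, char *buf, size_t len)
{
    assert(buf != NULL);
    ssize_t n = read(fd, buf, len);
    if (n >= 0)
        return (long)n;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -1;          /* not ready: do other work and retry later */
    return -2;
}
```

A caller would typically loop: call try_read, do other work when it reports "not ready", and come back later. That retry gap is exactly why the IO is not handled in real time.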
3. Signal driven IO:
  • Define a handler for the IO signal and perform the IO operations in that handler. When the IO becomes ready, the kernel notifies the process with a signal, and the process makes the IO call at that point.
  • With signal-driven IO, the IO is handled closer to real time and resources are fully utilized, but the calling process is more complicated.
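A rough sketch of how signal-driven IO is usually switched on under Linux, assuming the SIGIO signal together with the F_SETOWN and O_ASYNC fcntl settings. The names on_sigio, enable_sigio and sigio_seen are made up for this illustration. As a design choice the handler here only sets a flag and leaves the actual read to the main flow; the text above describes doing the IO inside the handler, which is also possible.

```c
#define _GNU_SOURCE             /* exposes O_ASYNC on glibc */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t io_ready = 0;

/* Signal handler: only records that the kernel reported IO readiness. */
static void on_sigio(int signo)
{
    (void)signo;
    io_ready = 1;
}

/* Ask the kernel to send SIGIO to this process when fd becomes ready.
 * Returns 0 on success, -1 on error. */
int enable_sigio(int fd)
{
    assert(fd >= 0);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;

    if (fcntl(fd, F_SETOWN, getpid()) < 0)   /* who receives the signal */
        return -1;

    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    /* O_ASYNC turns on signal delivery; O_NONBLOCK keeps the later read
     * from blocking if the data is already gone. */
    if (fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK) < 0)
        return -1;
    return 0;
}

/* Non-zero once the handler has run at least once. */
int sigio_seen(void)
{
    return io_ready != 0;
}
```

The extra setup (handler, owner, flags) is the "more complicated calling process" mentioned above.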
4. Asynchronous IO:
  • An asynchronous IO call tells the operating system which IO data should be copied where. Both the waiting and the copying are done by the operating system.
  • Resources are fully utilized, but the process is more complex.

Note:
① Blocking: a call is initiated to complete a function. If the completion conditions are not currently met, the call waits until they are.
② Non-blocking: a call is initiated to complete a function. If the completion conditions are not currently met, the call returns an error immediately.
③ The difference between blocking and non-blocking: whether the call waits when it cannot be completed right away.
④ Synchronous: processing is sequential; one step must complete before the next begins, and all of the work is done by the process itself.
⑤ Asynchronous: the order of processing is not fixed, because the work is completed by the operating system.
⑥ Asynchronous blocking: the work is completed by someone else, and the call waits for them to finish.
⑦ Asynchronous non-blocking: the work is completed by someone else, and the call returns immediately.

2. Multiplex IO

  • IO events on a large number of descriptors are monitored centrally, so the programmer/process can be told which descriptors are ready for which events. The process then operates only on descriptors whose events are ready, which avoids performing IO on descriptors that are not ready and thereby losing efficiency or blocking the program flow.
  • IO events: readable events, writable events, exception events;
  • Multiplexing IO model: select/poll/epoll; used to monitor descriptors.
1.select model:
(1) Operation process :

① The programmer defines a descriptor set for each kind of event of interest (a set for readable events, a set for writable events, a set for exception events), initializes and clears each set, and then adds every descriptor to the set of the event it cares about.
② The sets are copied to the kernel for monitoring. The kernel monitors by polling: it traverses the descriptors and checks each one. Ready for a readable event: the amount of data in the receive buffer has reached the low-water mark (a threshold that usually defaults to one byte). Ready for a writable event: the free space in the send buffer has reached the low-water mark (also usually one byte by default). Ready for an exception event: the descriptor has an exceptional condition pending.
③ The monitoring call returns on a monitoring error, when a descriptor becomes ready, or when the wait times out. On return, descriptors that are not ready have been removed from the sets, so only ready descriptors remain. Because the sets are modified on return, the descriptors must be added to them again before the next round of monitoring.
④ The programmer checks which descriptors are still in which set to determine which descriptor is ready for which event, and then performs the corresponding operation. select does not hand the ready descriptors back directly; it returns the modified descriptor sets, so the programmer has to test each descriptor.

(2) Code operation :

① Define the set: the fd_set type, a structure whose member is an array used as a bitmap. Adding a descriptor sets the bit at the position given by the descriptor's value to 1, so the number of descriptors that select can monitor depends on the number of bits in the bitmap. That number is given by the macro FD_SETSIZE (__FD_SETSIZE in glibc, 1024 by default);

  •   void FD_ZERO(fd_set* set);--initializes and clears the set.
    
  •   void FD_SET(int fd, fd_set* set);--adds descriptor fd to the set.
    

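② Initiate the monitoring call:

  •   int select(int nfds,fd_set* readfds,fd_set* writefds,fd_set* exceptfds,struct timeval* timeout);
      nfds: the largest descriptor in the monitored sets + 1; this reduces the number of descriptors traversed.
      readfds/writefds/exceptfds: the sets for the readable/writable/exception events.
      timeout: a time structure, struct timeval{tv_sec;tv_usec;}; it decides whether select blocks, does not block, or blocks with a time limit;
      	-If timeout is NULL, select blocks until a descriptor is ready or a monitoring error occurs.
      	-If the members of timeout are 0, select does not block: if no descriptor is ready, it times out and returns immediately.
      	-If the members of timeout are non-zero, select returns with a timeout if nothing becomes ready within the given time.
      	Return value: greater than 0 is the number of ready descriptors; 0 means no descriptor became ready and the wait timed out; less than 0 means a monitoring error.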
    

③ When the call returns, the programmer gets back the sets containing only the ready descriptors, and determines which event is ready on which descriptor by checking whether each descriptor is still in the corresponding set.

  •   int FD_ISSET(int fd, fd_set* set);--checks whether descriptor fd is in the set;
    
  • Because the sets are modified when select returns, the descriptors must be re-added before every monitoring call.

④ If a descriptor no longer needs to be monitored, remove fd from the set:

  •   void FD_CLR(int fd, fd_set* set);--removes descriptor fd from the set;
    
(3) Using select to monitor standard input:

   #include<stdio.h>
   #include<unistd.h>
   #include<sys/select.h>

   int main()
   {
     // Monitor standard input (descriptor 0) for readable events
     // 1. Define the set for the event of interest
     fd_set rfds;
     while(1)
     {
       printf("start monitoring\n");
       // select(maxfd+1, readable set, writable set, exception set, timeout)
       // tv is reset on every iteration because select may modify it
       struct timeval tv;
       tv.tv_sec = 3;
       tv.tv_usec = 0;
       FD_ZERO(&rfds);// initialize and clear the set
       FD_SET(0,&rfds);// add descriptor 0 to the set
       int res = select(0+1,&rfds,NULL,NULL,&tv);
       if(res < 0){
         perror("select error");
         return -1;
       }else if(res == 0){
         printf("wait timeout\n");
         continue;
       }
       if(FD_ISSET(0,&rfds)){
         // still in the set after select returns, so the event is ready
         printf("ready to read from standard input:...\n");
         char buf[1024] = {0};
         int ret = read(0,buf,1023);
         if(ret<0)
         {
           perror("read error");
           return -1;
         }else if(ret == 0){
           // end of input: stop, otherwise select would keep reporting readable
           break;
         }
         printf("read buf:[%s]\n",buf);
       }
     }
     return 0;
   }
(4) Analysis of advantages and disadvantages:
  • Disadvantages :
    ① select has an upper limit on the number of descriptors it can monitor; the limit is set by the macro FD_SETSIZE, 1024 by default;
    ② Monitoring in the kernel is done by polling traversal, so performance degrades as the number of descriptors increases;
    ③ It can only return the ready sets; the process has to traverse them to find out which descriptor is ready for which event;
    ④ The descriptors must be re-added to the sets before every monitoring call, and the sets must be copied to the kernel again each time.
  • Advantages :
    It follows the POSIX standard, so cross-platform portability is relatively good.
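Disadvantages ③ and ④ become visible as soon as more than one descriptor is monitored. The following is a minimal sketch under that assumption; select_readable is a hypothetical helper, not part of any library. It rebuilds the set on every call and then walks the whole candidate list to find out who is ready.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait up to timeout_ms for any of fds[0..n-1] to become readable.
 * Stores the ready descriptors in ready[] and returns how many there are,
 * 0 on timeout, or -1 on error (including a descriptor that does not fit
 * in an fd_set). */
int select_readable(const int *fds, int n, int timeout_ms, int *ready)
{
    assert(fds != NULL && ready != NULL);

    fd_set rfds;
    int maxfd = -1;

    FD_ZERO(&rfds);                     /* the set is rebuilt on every call */
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;
        FD_SET(fds[i], &rfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int res = select(maxfd + 1, &rfds, NULL, NULL, &tv);
    if (res <= 0)
        return res;                     /* timeout (0) or error (-1) */

    /* select only says "something is ready"; every candidate is tested. */
    int count = 0;
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rfds))
            ready[count++] = fds[i];
    return count;
}
```

Both loops run over all n descriptors on every call, even when only one of them is active.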
2.poll model:
(1) Operation process:

① Define an array of descriptor event structures, and fill each node of the array with a descriptor to monitor and the events of interest.
② Initiate the call to start monitoring. The event structure array is copied to the kernel, which polls by traversing it. When a descriptor becomes ready or the wait times out, the call returns, and each ready node has its ready events filled in.
③ The process traverses the array and checks the ready events recorded in each node to determine whether the descriptor is ready, for which event, and how to handle it.

(2) Code operation:
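  •   int poll(struct pollfd* arry_fds,nfds_t nfds,int timeout)
      poll---monitors using event structures;
      struct pollfd {
      	int fd ;---the descriptor to monitor;
      	short events; --- the events to monitor, POLLIN/POLLOUT;
      	short revents; --- the ready events, filled in when the call returns
      };
      arry_fds---the event structure array, filled with the descriptors to monitor and their event information;
      nfds --- the number of valid nodes in the array;
      timeout---the monitoring timeout, in milliseconds;
      Return value: greater than 0 is the number of descriptors with ready events; 0 means the wait timed out; less than 0 means a monitoring error.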
    
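  #include <poll.h>
  #include <unistd.h>
  #include <stdio.h>
  int main() {
    struct pollfd poll_fd;// define the event structure
    poll_fd.fd = 0;
    poll_fd.events = POLLIN;// readable (input) event
    for (;;) {
      int ret = poll(&poll_fd, 1, 1000);
      if (ret < 0) {
        perror("poll");
        continue;
      }
      if (ret == 0) {
        printf("poll timeout\n");
        continue;
      }
      if (poll_fd.revents & POLLIN) {// revents is a bit mask, so test the bit
        char buf[1024] = {0};
        ssize_t n = read(0, buf, sizeof(buf) - 1);
        if (n <= 0) {
          break;// read error or end of input
        }
        printf("stdin:%s", buf);
      } else if (poll_fd.revents & (POLLERR | POLLHUP | POLLNVAL)) {
        break;// nothing more can be read from this descriptor
      }
    }
    return 0;
  }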
(3) Analysis of advantages and disadvantages:
  • Advantages :
    ① Monitoring with event structures simplifies the handling that select needs for its three separate event sets.
    ② There is no upper limit on the number of monitored descriptors.
    ③ The event nodes do not need to be redefined before every call.
  • Disadvantages :
    ① Cross-platform portability is poorer than select's.
    ② Each monitoring call still copies the monitoring data to the kernel.
    ③ Monitoring in the kernel still uses polling traversal, so performance decreases as the number of descriptors increases.
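Advantage ③ and disadvantage ③ can both be seen in a small multi-descriptor sketch. The helper names init_pollfds and poll_readable are assumptions made for this illustration. The array is filled once and reused across calls, but the process still scans every node after each call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Fill one pollfd node per descriptor, asking for readable events. */
void init_pollfds(struct pollfd *nodes, const int *fds, int n)
{
    assert(nodes != NULL && fds != NULL);
    for (int i = 0; i < n; i++) {
        nodes[i].fd = fds[i];
        nodes[i].events = POLLIN;
        nodes[i].revents = 0;
    }
}

/* Poll the prepared array once. Stores readable descriptors in ready[]
 * and returns how many there are, 0 on timeout, or -1 on error. */
int poll_readable(struct pollfd *nodes, int n, int timeout_ms, int *ready)
{
    assert(nodes != NULL && ready != NULL);

    int res = poll(nodes, (nfds_t)n, timeout_ms);
    if (res <= 0)
        return res;

    /* events is left untouched by poll; only revents is rewritten,
     * so the same array can be passed in again next time. */
    int count = 0;
    for (int i = 0; i < n; i++)
        if (nodes[i].revents & POLLIN)
            ready[count++] = nodes[i].fd;
    return count;
}
```

Calling poll_readable twice in a row on the same array without reading the data reports the descriptor both times, which matches the level-triggered behaviour of poll.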
3.epoll model:
  • The most useful and best-performing multiplexing model under Linux.
(1) Operation process:

① Initiate a call to create the epoll handle, an eventpoll structure in the kernel (this structure holds a lot of information, chiefly a red-black tree and a doubly linked list).
② Initiate calls to add, delete, or modify the monitoring information for a descriptor in the kernel's eventpoll structure.
③ Initiate the call to start monitoring. The kernel implements monitoring as an asynchronous blocking operation: the call returns when the wait times out or a descriptor becomes ready, and hands the event structures of the ready descriptors back to the user.
④ The process simply operates on the descriptor members of the returned ready event structures.

(2) Interface information:

① Create the epoll handle (eventpoll) in the kernel and return its descriptor:

  •   int epoll_create(int size); : returns the epoll handle
      	-size : ignored since Linux 2.6.8, but must be greater than 0.
      	Return value: a file descriptor, the epoll handle
    

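② epoll event registration function:

  •   int epoll_ctl(int epfd,int cmd, int fd, struct epoll_event* ev);
       -epfd : the handle returned by epoll_create;
       -cmd : the operation to perform on the monitoring information of fd--add/delete/modify: EPOLL_CTL_ADD/EPOLL_CTL_DEL/EPOLL_CTL_MOD;
       -fd : the descriptor to monitor;
       -ev : the event structure for fd;
      	struct epoll_event{
      		uint32_t events; // events to monitor on fd--EPOLLIN/EPOLLOUT;
      		union{
      			int fd; // the monitored descriptor;
      			void* ptr;// user data describing the descriptor;
      		} data; // other union members omitted
      	};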
    

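③ Start monitoring and collect the events that have occurred:

  •   int epoll_wait(int epfd,struct epoll_event *evs,int max_event,int timeout)
      	-epfd : the epoll handle;
      	-evs : the address of the first element of a struct epoll_event array;
      	-max_event : the maximum number of ready events to fetch in this call; it must not exceed the number of nodes in the evs array, to avoid out-of-bounds access.
      	-timeout : the wait timeout, in milliseconds.
      	Return value: < 0 means a monitoring error, == 0 means the wait timed out, > 0 is the number of ready events.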
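The three interfaces fit together roughly as in the sketch below. It is Linux-only, and the helper names epoll_watch_readable and epoll_collect, as well as the fixed 16-entry event array, are assumptions chosen for this illustration.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd for readable events on the epoll instance epfd.
 * Returns 0 on success, -1 on error. */
int epoll_watch_readable(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;            /* handed back unchanged by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Collect up to max (at most 16) ready descriptors into ready[].
 * Returns how many were ready, 0 on timeout, or -1 on error. */
int epoll_collect(int epfd, int *ready, int max, int timeout_ms)
{
    struct epoll_event evs[16];

    assert(ready != NULL && max > 0);
    if (max > 16)
        max = 16;               /* never ask for more than evs can hold */

    int n = epoll_wait(epfd, evs, max, timeout_ms);
    for (int i = 0; i < n; i++)
        ready[i] = evs[i].data.fd;   /* only ready entries are returned */
    return n;
}
```

Registration happens once per descriptor. Each later epoll_collect call passes nothing but the handle, and the loop runs only over entries that are actually ready.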
    
(3) epoll monitoring principle: asynchronous blocking operation:
  • Monitoring is done by the system. The descriptors and event structures added by the user are stored in the red-black tree of the kernel's eventpoll structure. When a descriptor is added, the operating system registers a callback function for it; when the descriptor becomes ready for an event of interest, the callback adds its event structure to the doubly linked list.
  • The process itself only has to check, each time, whether the doubly linked list is empty to know whether anything is ready.
    ① Create the handle;
    ② Add the monitored descriptors and the structures of their events to the kernel;
    ③ Start asynchronous blocking monitoring; the system adds the event structure of each ready descriptor to the doubly linked list;
    ④ If the doubly linked list is not empty, its event structures are returned to the process;
    ⑤ The process performs the operation indicated by the event information in each ready event structure on the descriptor carried in that structure.
(4) Advantages and disadvantages:
  • Advantages :
    ① There is no upper limit on the number of monitored descriptors;
    ② Monitoring information only needs to be added to the kernel once;
    ③ Monitoring is an asynchronous blocking operation, so performance does not decrease as the number of descriptors increases;
    ④ Ready event information is returned directly to the user; the process operates on the returned descriptors and events without having to check whether each one is ready.
  • Disadvantages :
    Poor cross-platform portability (epoll is specific to Linux).
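Advantage ④ is often combined with the ptr member of the event structure, so that the returned event leads straight to the program's own state for that descriptor. A small sketch of that idea follows; struct conn, watch_conn and service_one are hypothetical names invented for this example.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection state kept by the application. */
struct conn {
    int  fd;
    long bytes_seen;
};

/* Register the connection; the kernel stores the pointer and returns it
 * with every event for this descriptor. */
int watch_conn(int epfd, struct conn *c)
{
    assert(c != NULL);
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.ptr = c;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Wait for one ready connection, read what is available and update its
 * counter. Returns the connection that was served, or NULL on timeout
 * or error. */
struct conn *service_one(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
        return NULL;

    struct conn *c = ev.data.ptr;   /* no lookup or scan needed */
    char buf[256];
    ssize_t n = read(c->fd, buf, sizeof buf);
    if (n > 0)
        c->bytes_seen += n;
    return c;
}
```

Because fd and ptr share one union, a program uses either data.fd or data.ptr for a given registration, not both; here the descriptor is kept inside the pointed-to structure instead.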
(5) Triggering method in epoll:
  • Suppose there is such an example :
    We have added a TCP socket to the epoll descriptor.
    The peer then writes 2KB of data to the socket.
    We call epoll_wait, and it returns, indicating that the socket is ready for reading.
    We then call read, but read only 1KB of the data,
    and call epoll_wait again.

① Level-triggered (LT) method:

  • LT is the default working mode of epoll :
    When epoll detects that an event on the socket is ready, the event does not have to be processed immediately, or it may be processed only in part;
    In the example above, only 1KB was read, so 1KB is still left in the buffer. On the second call, epoll_wait still returns immediately and reports that the socket is ready for reading;
    epoll_wait stops returning immediately only once all the data in the buffer has been processed;
    LT supports both blocking and non-blocking reads and writes.

② Edge-triggered (ET) method:

  • If we set the EPOLLET flag when adding the socket to the epoll descriptor in step 1, epoll enters ET working mode :
    When epoll detects that an event on the socket is ready, it must be processed immediately;
    In the example above, although only 1KB was read and 1KB is still left in the buffer, the second call to epoll_wait does not return until new data arrives;
    That is, in ET mode there is only one chance to process an event after it becomes ready on a file descriptor;
    ET performs better than LT (epoll_wait returns far fewer times). Nginx uses epoll in ET mode by default;
    ET only supports non-blocking reads and writes.
  • select and poll actually work in LT mode. epoll supports both LT and ET.
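The 2KB/1KB scenario can be replayed in a few lines. The sketch below is Linux-only and makes two assumptions: a pipe stands in for the TCP socket, and the function names second_wait_result, set_nonblocking and drain_nonblocking are invented for this illustration. second_wait_result returns whatever the second epoll_wait reports after only half of the data has been read; on Linux the LT run is expected to report one ready event and the ET run none. drain_nonblocking shows the usual ET reading pattern: keep reading a non-blocking descriptor until EAGAIN.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Write 2 KB, wait, read 1 KB, wait again with a zero timeout.
 * Returns the result of the second epoll_wait, or -1 if setup failed. */
int second_wait_result(int edge_triggered)
{
    int p[2];
    char data[2048], half[1024];
    struct epoll_event ev = {0}, out;
    int result = -1;

    if (pipe(p) < 0)
        return -1;
    int epfd = epoll_create(1);
    if (epfd < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    memset(data, 'x', sizeof data);
    ev.events = EPOLLIN | (edge_triggered ? EPOLLET : 0);
    ev.data.fd = p[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], data, sizeof data) == (ssize_t)sizeof data &&
        epoll_wait(epfd, &out, 1, 100) == 1 &&           /* first notification */
        read(p[0], half, sizeof half) == (ssize_t)sizeof half)
        result = epoll_wait(epfd, &out, 1, 0);           /* 1 KB still unread */

    close(epfd);
    close(p[0]);
    close(p[1]);
    return result;
}

/* Switch fd to non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* ET-style read loop: consume everything until the kernel says EAGAIN.
 * fd must be non-blocking. Returns total bytes read, or -1 on error. */
long drain_nonblocking(int fd)
{
    char buf[512];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;                       /* writer closed */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;                       /* buffer is empty now */
        return -1;
    }
}
```

With a blocking descriptor the last read of such a drain loop would hang once the buffer is empty, which is why ET is paired with non-blocking IO.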

Origin blog.csdn.net/weixin_42357849/article/details/107785775