Linux---Advanced IO

IO process

A process initiates an IO call, waits for the IO condition to become ready, and then copies the data to or from the buffer for processing.
IO therefore consists of two main operations: waiting and copying data.

Five IO models

  • Blocking IO: Initiate an IO call; if the IO conditions are not met, the call blocks and waits until they are. Operations are executed sequentially.
  • Non-blocking IO: Initiate an IO call; if the IO conditions are not met, the call returns immediately, so the call has to be retried in a loop.
  • Signal-driven IO: Define a handler for an IO-ready signal and perform the IO call as soon as the signal is received. Operations are not performed in a fixed sequence.
  • Asynchronous IO: Define an IO completion signal and initiate an asynchronous IO call; the IO itself is completed by the system.
  • Multiplexed IO: Monitor events on a large number of descriptors in one place, so that the process only operates on descriptors that are ready. This improves efficiency and avoids blocking on descriptors that are not ready.

Blocking IO

To complete an operation, the process initiates an IO call; if the IO conditions are not currently met, the call waits until they are.
Advantages: The flow is very simple.
Disadvantages: The next IO can only start after the current one completes. Resources are not fully used, and much of the time is spent waiting.

Non-blocking IO

To complete an IO operation, the process initiates a call. If the IO conditions are not currently met, the call returns immediately, so the process has to retry the call in a loop.
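Advantages: Resources are used more fully than with blocking IO, because the process can do other work between attempts.
Disadvantages: The flow is relatively complicated, and IO is not handled promptly enough, since data can become ready between two attempts.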
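
The retry loop described above can be sketched roughly as follows. This is only a minimal sketch: the helper names set_nonblocking and read_with_retry are hypothetical, and the usleep call stands in for whatever other work a real program would do between attempts.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: switch a descriptor to non-blocking mode.
bool set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0) return false;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Hypothetical helper: retry read() until data arrives or max_tries is used up.
// Returns the number of bytes read, 0 on end of file, or -1 on error / no data.
// *tries receives the number of read() calls that were made.
ssize_t read_with_retry(int fd, char *buf, size_t len, int max_tries, int *tries)
{
	assert(buf != nullptr && tries != nullptr);
	*tries = 0;
	while (*tries < max_tries)
	{
		++*tries;
		ssize_t n = read(fd, buf, len);
		if (n >= 0) return n;   // data arrived (or end of file)
		if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
			return -1;          // a real error, not "try again later"
		usleep(1000);           // not ready yet: do other work, then retry
	}
	return -1;                  // still no data after max_tries attempts
}
```

The pause between attempts is where the "not real-time enough" drawback comes from: data that arrives right after a failed attempt is only noticed on the next one.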

Signal-driven IO

Define a handler for the IO signal and perform the IO operation inside that handler. When the IO becomes ready, the process is sent the signal, and it performs the IO operation at that moment.

Advantages: IO is handled more promptly, and resources are used more efficiently.
Disadvantages: The flow is more complex.
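
On Linux, one way to get this behaviour is the SIGIO signal together with the O_ASYNC flag. The following is only a sketch under that assumption: the helper names enable_sigio and wait_for_data and the global variables are hypothetical, and the descriptor is assumed to be a socket.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

static int g_fd = -1;                      // descriptor the handler reads from
static char g_buf[64];                     // data read by the handler
static volatile sig_atomic_t g_nread = 0;  // bytes read; stays 0 until data arrives

// Signal handler: the IO operation is performed here, once readiness is signalled.
static void on_sigio(int)
{
	int saved_errno = errno;                       // do not disturb the interrupted code
	ssize_t n = read(g_fd, g_buf, sizeof(g_buf));  // read() is async-signal-safe
	if (n > 0) g_nread = (sig_atomic_t)n;
	errno = saved_errno;
}

// Hypothetical helper: ask for SIGIO to be sent to this process when fd becomes ready.
bool enable_sigio(int fd)
{
	assert(fd >= 0);
	struct sigaction sa = {};
	sa.sa_handler = on_sigio;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGIO, &sa, nullptr) < 0) return false;

	g_fd = fd;
	g_nread = 0;
	if (fcntl(fd, F_SETOWN, getpid()) < 0) return false;  // who receives the signal
	int flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0) return false;
	// O_ASYNC turns on signal-driven IO; O_NONBLOCK keeps the handler from blocking.
	return fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK) == 0;
}

// Hypothetical helper: do "other work" (here: sleep) for at most max_ms milliseconds
// until the handler has read something. Returns the number of bytes read so far.
int wait_for_data(int max_ms)
{
	for (int i = 0; i < max_ms && g_nread == 0; ++i)
		usleep(1000);  // returns early when a signal is delivered
	return g_nread;
}
```

The main flow never calls read itself. It only registers the handler and carries on, which is why the order of operations is no longer fixed.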

Asynchronous IO

An analogy: queuing to buy tickets.
On holidays many people travel and the ticket queues are long. Asynchronous IO is like asking someone else to stand in the queue, buy the ticket, and hand it to you at the end.
The essence of asynchronous IO is therefore that the whole IO operation is done by someone else. The process only initiates a call that says where to copy the data and how much data to copy; both the waiting and the copy are done on its behalf.
Advantages: Resources are used more fully.
Disadvantages: The flow is more complex.
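
One possible way to try this out is the POSIX AIO interface (aio_read and friends). The sketch below makes several assumptions: the helper name async_read_file is hypothetical, the input is a regular file, and instead of a completion signal it waits with aio_suspend to stay short. On older glibc versions the program has to be linked with -lrt.

```cpp
#include <aio.h>
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: read up to len bytes from the start of a file with POSIX AIO.
// Returns the number of bytes read, or -1 on error.
ssize_t async_read_file(const char *path, char *buf, size_t len)
{
	assert(path != nullptr && buf != nullptr);
	int fd = open(path, O_RDONLY);
	if (fd < 0) return -1;

	struct aiocb cb;
	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;                          // which descriptor
	cb.aio_buf = buf;                            // where to copy the data
	cb.aio_nbytes = len;                         // how much data to copy
	cb.aio_offset = 0;                           // where in the file to start
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;   // no completion signal in this sketch

	if (aio_read(&cb) < 0)                       // returns at once; the IO runs elsewhere
	{
		close(fd);
		return -1;
	}

	// The process is free to do other work here. This sketch just waits for completion.
	const struct aiocb *list[1] = { &cb };
	while (aio_error(&cb) == EINPROGRESS)
		aio_suspend(list, 1, nullptr);

	ssize_t n = (aio_error(&cb) == 0) ? aio_return(&cb) : -1;
	close(fd);
	return n;
}
```

Note that the caller never waits for the data to be ready and never copies it: after aio_read returns, the buffer is filled in by someone else.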

Synchronous and asynchronous

Synchronous

Synchronous here is different from the synchronization between processes discussed previously. It refers to sequential handling in the processing flow: the next operation starts only after the previous one completes, and all of the work is done by the process itself.

Asynchronous

The order of processing is not fixed, and the main work is done by someone else, typically the operating system.

  • Asynchronous blocking: The work is done by someone else, but after making the call the process waits for it to finish.
  • Asynchronous non-blocking: The work is done by someone else, and the call returns immediately without waiting.

The main difference between synchronous and asynchronous in advanced IO:

Whether the process does the main work itself, and whether the order of completion is fixed.

Multiplexed IO

Multiplexed IO monitors IO events on a large number of descriptors and tells the process which descriptors are ready and for which events. The process can then operate only on the descriptors that are ready for the event it cares about, and avoids operating on descriptors that are not ready, which would block and reduce efficiency.

IO events
Readable events, writable events, abnormal events

Multiplexed IO model:

  • Role: Used to monitor descriptor events.
  • Implementation method: select, poll, epoll

select model

Operating procedures:

  • Define a descriptor set for each kind of event (readable, writable, exception) and initialize the set.
  • Add each descriptor you care about to the set for the event you care about.
  • Copy the sets into the kernel for monitoring. The kernel monitors by polling: it traverses the descriptors and checks each one.
    Readable event ready: the amount of data in the receive buffer has reached the low-water mark, usually one byte by default.
    Writable event ready: the free space in the send buffer has reached the low-water mark, usually one byte by default.
    Exception event ready: the descriptor has an exceptional condition pending.
  • The monitoring call returns when an error occurs, a descriptor becomes ready, or the wait times out. When the call returns, the descriptors that are not ready have been removed from each event set, so only ready descriptors remain in the sets.
  • The process checks which descriptor is still in which set to determine which descriptor is ready for which event, and then performs the corresponding operation. (If a descriptor is in the readable set, it is currently readable, and the process can read from it.)

Note

select does not hand the ready descriptors directly to the user. It returns the sets with only the ready descriptors left in them, so the process has to check each descriptor itself.
Because the descriptors that are not ready have been removed when select returns, they must be added to the sets again before the next call.
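
The note above can be observed directly with two descriptors. The following is a rough sketch; the helper name check_two is hypothetical, and a zero timeout is used so that the call never blocks.

```cpp
#include <cassert>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: monitor two descriptors for readability once, without blocking.
// On return, *ready_a / *ready_b say which of them select left in the set.
// Returns the value returned by select.
int check_two(int fd_a, int fd_b, bool *ready_a, bool *ready_b)
{
	assert(fd_a >= 0 && fd_a < FD_SETSIZE && fd_b >= 0 && fd_b < FD_SETSIZE);
	fd_set readfds;
	FD_ZERO(&readfds);           // the set is rebuilt before every call
	FD_SET(fd_a, &readfds);
	FD_SET(fd_b, &readfds);

	struct timeval tv = {0, 0};  // zero timeout: return immediately
	int maxfd = fd_a > fd_b ? fd_a : fd_b;
	int res = select(maxfd + 1, &readfds, nullptr, nullptr, &tv);

	// select has overwritten readfds: only ready descriptors are still in it.
	*ready_a = res > 0 && FD_ISSET(fd_a, &readfds);
	*ready_b = res > 0 && FD_ISSET(fd_b, &readfds);
	return res;
}
```

If only fd_a has data, fd_b is gone from the set after the call, which is why the set is filled again inside the helper on every call rather than once outside it.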

Code operation:

  1. Define the set (type fd_set). Its only member is an array used as a bitmap: adding a descriptor sets the corresponding bit to 1. The default capacity is 1024 descriptors.
  2. Initialize the set to empty: void FD_ZERO(fd_set *set)
  3. Add a descriptor to the set: void FD_SET(int fd, fd_set *set) adds the descriptor fd to set.
  4. Initiate the monitoring call.
    int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
  • nfds: The largest descriptor in the monitored sets plus 1. A set can hold 1024 descriptors by default, so nfds lets the kernel skip descriptors that are not in use and reduces the amount of traversal.
  • readfds: The set of descriptors monitored for readable events; NULL if not needed.
  • writefds: The set of descriptors monitored for writable events; NULL if not needed.
  • exceptfds: The set of descriptors monitored for exception events; NULL if not needed.
  • timeout: A time structure that decides whether select blocks, does not block, or blocks with a time limit. If timeout is NULL, select blocks and does not return until a descriptor is ready or an error occurs. If the members of timeout are all 0, select does not block: if no descriptor is ready, it returns immediately. If the members of timeout are not 0, select waits at most that long and returns with a timeout if nothing becomes ready.
    Return value: greater than 0 is the number of ready descriptors; 0 means no descriptor was ready and the call timed out; less than 0 means a monitoring error occurred.
  5. When the call returns, the sets contain only the ready descriptors. Traverse the descriptors and check which ones are still in which set, that is, which event is ready.
    int FD_ISSET(int fd, fd_set *set) checks whether the descriptor fd is in set. It is essentially a wrapper around testing the corresponding bit.

  6. If a descriptor no longer needs to be monitored, remove it.
    void FD_CLR(int fd, fd_set *set) removes the descriptor fd from set.
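
The three timeout modes described above could be wrapped in one small function. This is only an illustrative sketch: the helper name wait_readable and the convention of passing a negative number of milliseconds for "block forever" are assumptions made here, not part of the select interface.

```cpp
#include <cassert>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: wait until fd is readable.
//   timeout_ms < 0  : block until ready (timeout pointer is NULL)
//   timeout_ms == 0 : do not block (all members of the timeval are 0)
//   timeout_ms > 0  : block for at most timeout_ms milliseconds
// Returns 1 if fd is readable, 0 on timeout, -1 on error.
int wait_readable(int fd, int timeout_ms)
{
	assert(fd >= 0 && fd < FD_SETSIZE);
	fd_set readfds;
	FD_ZERO(&readfds);
	FD_SET(fd, &readfds);

	struct timeval tv;
	struct timeval *ptv = nullptr;  // NULL means block indefinitely
	if (timeout_ms >= 0)
	{
		tv.tv_sec = timeout_ms / 1000;
		tv.tv_usec = (timeout_ms % 1000) * 1000;
		ptv = &tv;
	}

	int res = select(fd + 1, &readfds, nullptr, nullptr, ptv);  // nfds = fd + 1
	if (res < 0) return -1;
	return (res > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```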

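#include <cstdio>
#include <iostream>
#include <sys/select.h>
#include <unistd.h>
using namespace std;

int main()
{
	fd_set readfds;
	while(1)
	{
		cout<<"Start monitoring"<<endl;
		// select modifies the set and (on Linux) the timeout, so reset both every time
		struct timeval tv;
		tv.tv_sec = 3;
		tv.tv_usec = 0;
		FD_ZERO(&readfds);
		FD_SET(0,&readfds); // add descriptor 0 (standard input) to the set
		int res = select(1,&readfds,NULL,NULL,&tv); // nfds = largest descriptor + 1
		if(res<0)
		{
			perror("select error");
			return -1;
		}
		else if(res == 0)
		{
			cout<<"Monitoring timed out"<<endl;
			continue;
		}
		if(FD_ISSET(0,&readfds))
		{
			cout<<"Monitoring succeeded, descriptor 0 is readable"<<endl;
			char buf[1024];
			if(read(0,buf,sizeof(buf)) <= 0) // consume the input so it is not reported again
				break;
		}
	}
	return 0;
}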

Advantages and disadvantages:

  • Advantage: Good cross-platform portability.
  • Disadvantage: The maximum number of descriptors that select can monitor is limited by FD_SETSIZE.
  • Disadvantage: Monitoring in the kernel is done by polling traversal, so performance decreases as the number of descriptors increases.
  • Disadvantage: Only the ready sets are returned, and the process has to traverse and check them to find out which descriptor is ready for which event.
  • Disadvantage: The descriptors have to be added to the sets again before every call, and the sets have to be copied to the kernel on every call.

poll model

Operating procedures:

  • Define an array of event structures for the descriptors to monitor, and fill each element of the array with a descriptor and the events to monitor on it.
  • Initiate the call to start monitoring. The array is copied into the kernel, which checks it by polling traversal. When a descriptor is ready or the wait times out, the call returns, and the events that are currently ready have been written into the event structure of each descriptor.
  • The process traverses the array, checks which events are ready in each element, and then decides how to operate on the descriptor.

Interface:

  • int poll(struct pollfd *arr, nfds_t nfds, int timeout)
    Monitoring uses an event structure:
    struct pollfd {int fd; short events; short revents;}
    fd: the descriptor to monitor. events: the events to monitor (POLLIN for readable, POLLOUT for writable). revents: filled in on return with the monitored events that are ready.
    nfds: the number of valid elements in the array
    timeout: how long to wait before timing out, in milliseconds
    Return value: >0 is the number of descriptors with ready events; 0 means the wait timed out; <0 means a monitoring error occurred.
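
A minimal sketch of this interface in use is shown below. The helper name poll_readable and the fixed limit of 16 descriptors are assumptions made to keep the example short.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: monitor n descriptors for readability with poll.
// ready[i] is set to true when fds[i] has POLLIN in revents.
// Returns the value returned by poll.
int poll_readable(const int *fds, bool *ready, int n, int timeout_ms)
{
	assert(n > 0 && n <= 16);
	struct pollfd arr[16];
	for (int i = 0; i < n; ++i)
	{
		arr[i].fd = fds[i];       // descriptor to monitor
		arr[i].events = POLLIN;   // event we care about
		arr[i].revents = 0;       // filled in by poll
	}

	int res = poll(arr, n, timeout_ms);

	// The process still has to traverse the whole array to find the ready ones.
	for (int i = 0; i < n; ++i)
		ready[i] = res > 0 && (arr[i].revents & POLLIN);
	return res;
}
```

Unlike select, the requested events and the returned events live in separate fields, so the same array could be passed to poll again without being rebuilt.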

Advantages and disadvantages:

  • Advantage: Monitoring with event structures is simpler than handling the three event sets in select.
  • Advantage: The number of monitored descriptors is not limited.
  • Advantage: The event elements do not have to be redefined before every call.
  • Disadvantage: Poor cross-platform portability.
  • Disadvantage: Every call still has to copy the data to the kernel.
  • Disadvantage: Monitoring in the kernel still uses polling traversal.

epoll model

Operating procedures:

  • Create an epoll handle in the kernel (an eventpoll structure containing a red-black tree and a doubly linked list).
  • Add, delete, or modify the monitored descriptors and their epoll_event structures in the kernel.
  • Initiate the call to start monitoring. The kernel monitors in an asynchronous blocking manner. When the wait times out or a descriptor has a ready event, the call returns, and the event structures of the ready descriptors are returned to the user.
  • The process can operate directly on the descriptor stored in each ready event structure.

Interface information:

  • int epoll_create(int size) creates an epoll handle. Any size greater than 0 will do.
    Return value: a file descriptor (the epoll operation handle).
  • int epoll_ctl(int epfd, int cmd, int fd, struct epoll_event *ev)
    epfd: the operation handle returned by epoll_create
    cmd: the operation to perform on the monitoring information of the descriptor fd: add, delete, or modify (EPOLL_CTL_ADD / EPOLL_CTL_DEL / EPOLL_CTL_MOD)
    fd: the descriptor to monitor
    ev: The event structure for the descriptor. During monitoring, if the descriptor becomes ready for an event the process cares about, epoll returns this event structure to the user, and the process operates through the descriptor stored in it. The descriptor stored in ev should therefore be the same descriptor as fd.
  • int epoll_wait(int epfd, struct epoll_event *evs, int max_event, int timeout)
    epfd: the epoll operation handle
    evs: the start address of an array of struct epoll_event, used to receive the event structures of the ready descriptors
    max_event: the maximum number of ready events to fetch in this call, which must not be greater than the size of the evs array
    timeout: how long to wait before timing out, in milliseconds
    Return value: >0 is the number of ready events; 0 means the wait timed out; <0 means a monitoring error occurred.
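
Put together, the three calls could be used roughly like this. It is a sketch only: the helper names epoll_add_readable and epoll_ready_fds and the limit of 16 events per call are hypothetical choices for the example.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical helper: add fd to the epoll instance epfd, monitoring readable events.
bool epoll_add_readable(int epfd, int fd)
{
	struct epoll_event ev = {};
	ev.events = EPOLLIN;   // readable event
	ev.data.fd = fd;       // same descriptor as the fd argument of epoll_ctl
	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Hypothetical helper: wait once and store the ready descriptors in out[].
// Returns the number of ready descriptors, 0 on timeout, -1 on error.
int epoll_ready_fds(int epfd, int *out, int max_out, int timeout_ms)
{
	assert(out != nullptr && max_out > 0 && max_out <= 16);
	struct epoll_event evs[16];
	int n = epoll_wait(epfd, evs, max_out, timeout_ms);
	for (int i = 0; i < n; ++i)
		out[i] = evs[i].data.fd;  // only ready descriptors are returned
	return n;
}
```

The descriptors are registered once with epoll_ctl, and each epoll_wait call hands back only the ready ones, so there is no per-call rebuilding of sets and no traversal of unready descriptors in the process.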

Monitoring principle

Asynchronous blocking operation.
The monitoring is done by the system. The descriptors and corresponding event structures added by the user are stored in the red-black tree in the kernel's eventpoll structure. Once the process initiates the call, the operating system sets up a callback for the events of each descriptor. When a descriptor becomes ready for an event, the event structure corresponding to that descriptor is added to the doubly linked list.
The process itself only checks at intervals whether the doubly linked list is empty to determine whether any event is ready.


Origin blog.csdn.net/qq_42708024/article/details/108702707