[Computer Network] Advanced IO: select

1. What is IO?

IO stands for input and output


When the peer has established a connection but has not sent any data,
a thread that calls read will block and wait for the data to arrive. That is,
if the read condition is not met, read or recv can only wait.


Whether the call is copying data that has arrived or waiting because no data has arrived, the time spent on both is charged to the user.

From the user's perspective, IO = waiting + data copy

What is efficient IO?

In unit time, the lower the proportion of waiting, the higher the IO efficiency.

When the IO condition is met, the IO event is said to be ready

2. Five models of IO

Conceptual understanding of five IO models

For example: assume fishing is divided into two steps, fishing = waiting + pulling in the fish

A fish float is hung on the fishing line and floats on the water, keeping the hook at a stable depth.
When the float bobs up and down, you know that a fish has taken the bait.


1. Zhang San likes to focus on one thing at a time, so while he waits he keeps staring at the float to see whether a fish has taken the bait.
After a while the float moves, Zhang San pulls up the rod, catches the fish, puts it in the bucket,
and continues fishing.
Throughout the process, Zhang San only stares at the float and does nothing else.


2. Li Si is naturally restless, so when fishing he does not watch the float closely and keeps looking around.
When he notices that a fish has bitten, he pulls it in, puts it in the bucket, and
continues fishing.
While fishing, Li Si does other things in between checks of the float.


3. Wang Wu is different. He hangs a bell on the tip of the fishing rod and waits for a fish to take the bait.
While waiting, Wang Wu does his own thing.
When he hears the bell ring, he pulls up the rod, catches the fish, puts it in the bucket, and
continues fishing as before.
Throughout the process, Wang Wu never looks at the float; he relies only on the bell to tell whether a fish has taken the bait.


4. Zhao Liu is the richest man in the area. He drives a pickup truck to go fishing, with 10 fishing rods mounted on it (the richest man is here to experience ordinary life).
He fishes with all 10 rods at once, checking them from first to last to see whether any float is moving.


5. Tian Qi is the richest man within a radius of 500 kilometers (richer than Zhao Liu).
Tian Qi is very busy and has meetings every day. He does not want to fish but likes to eat fish,
so he asks his driver, Xiao Wang, to fish for him.
When the bucket is full, Xiao Wang calls Tian Qi, who sends someone to collect the fish.
While Xiao Wang is fishing, Tian Qi carries on with his meetings.


Zhang San and Li Si catch fish in the same way. The difference lies in how they wait
for the fish to take the bait: Zhang San stays put until the float moves, whereas
Li Si checks the float and, if it is not moving, immediately goes back to doing other things.


Zhang San’s fishing method is called blocking IO
(the data is not ready, and the read interface called will not return)


Li Si's fishing method is called non-blocking IO
(if a check finds no data, the call returns immediately, and the check can be repeated after a while)


Before the fish is caught, Wang Wu knows that when the bell rings, he should pull the fishing rod.
Wang Wu's fishing method is called signal-driven IO.


Zhao Liu manages multiple fishing rods at once. Zhao Liu's fishing method is called IO multiplexing (also known as multi-way switching).

Among these people, Zhao Liu's fishing efficiency is relatively high:
with more rods in the water, the chance that some fish is hooked at any moment is higher,
so the share of time spent waiting is lower.


Synchronous IO and asynchronous IO

The first four people all take part in the fishing process themselves, so their methods are all called synchronous IO.

Tian Qi does not take part in the fishing process: he neither waits nor pulls in fish. He only initiates the fishing.
Tian Qi's fishing method is called asynchronous IO.


Overall understanding

Pulling in the fish can be seen as the data copy.
Zhang San, Li Si, and the others (including Tian Qi) can be seen as processes,
the driver Xiao Wang can be seen as the operating system,
the fishing rod can be seen as a file descriptor,
and the fish can be seen as data.
A fish biting the hook, the float moving, and the bell ringing can all be seen as the IO event becoming ready.


When a process reads data from a file descriptor and the data is not ready, the process can only be suspended and wait
until the IO event is ready; only then can the data be copied to the upper layer.

3. Blocking IO

Blocking IO: the data is not ready and the read interface called will not return


Read from the keyboard with the read function: once the program is running, if nothing is typed, nothing is displayed. This is blocking IO.

Run man 2 read to view the documentation

read reads up to count bytes from a file descriptor into the buffer buf.
On success, it returns the number of bytes read.
A return value of 0 means end of file was reached.
A return value of -1 means failure, and errno is set.


File descriptor 0 is the standard input stream.
The program reads at most the size of the buffer array from standard input into buffer.

After the executable is started with no input, read waits; the data is copied only once something is typed.
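The blocking behaviour described above can be sketched roughly as follows. This is a minimal illustration rather than the article's original listing; the helper name BlockingReadOnce is hypothetical and the 64-byte buffer size is an assumption.

```cpp
#include <unistd.h>
#include <string>

// Hypothetical helper: performs one blocking read on fd.
// read() does not return until data is available, end of file is
// reached, or an error occurs.
std::string BlockingReadOnce(int fd)
{
    char buffer[64];
    ssize_t n = read(fd, buffer, sizeof(buffer) - 1);
    if (n > 0)
    {
        // n bytes were copied from the kernel into buffer
        return std::string(buffer, static_cast<size_t>(n));
    }
    // n == 0: end of file; n == -1: failure, errno is set
    return std::string();
}
```

Calling BlockingReadOnce(0) from main would wait at the read call until a line is typed on the keyboard.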

4. Non-blocking IO

Non-blocking IO: if a check finds no data, the call returns immediately so the process can do other things, and the check can be repeated after a while.


Reading from the keyboard with read and typing nothing
simulates the situation where the read condition is not met, in which a blocking read can only wait.

The changes below build on the blocking IO code above


SetNonBlock function

Run man fcntl to view the documentation

The first parameter is the file descriptor.
The second parameter, cmd, indicates what to do with the file descriptor.
To get or set the file status flags, use cmd = F_GETFL or F_SETFL.

By setting the file status flags, you can make a file descriptor non-blocking

Use F_GETFL to get the current file status flags of the descriptor.
Use F_SETFL to set the file status flags, adding O_NONBLOCK (non-blocking) to them.

If fcntl returns -1, the call failed


Create a function SetNonBlock that sets a file descriptor to non-blocking mode.
First use F_GETFL to obtain the file status flags of the descriptor.
If that fails, print the error string and error code.
If it succeeds, use F_SETFL to set the flags with O_NONBLOCK added, which puts the descriptor in non-blocking mode.
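The two fcntl steps can be sketched as a small helper. This is one possible variant, not the article's own function: the name SetNonBlockChecked and the bool return value are assumptions made here so that the caller can see whether the change took effect.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical variant of SetNonBlock that reports success or failure.
// Returns false if either fcntl call fails (errno is set by fcntl).
bool SetNonBlockChecked(int fd)
{
    int fl = fcntl(fd, F_GETFL); // read the current file status flags
    if (fl < 0)
    {
        return false;
    }
    // Keep the existing flags and add O_NONBLOCK to them.
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK) != -1;
}
```

OR-ing O_NONBLOCK into the flags returned by F_GETFL, instead of passing O_NONBLOCK alone, leaves the other status flags of the descriptor untouched.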


Why does non-blocking IO make read report an error?

In main, switch the standard input stream to non-blocking mode
and print a message for each of the three outcomes of read: success, end of file, and error.


Once standard input is non-blocking and
the executable is run again, read fails immediately.
When read is called, the data is not ready yet (read runs long before anything can be typed), so it returns -1.

So when the underlying data is not ready, read returns in the form of an error, but this is not a real error.


The return value alone, however, cannot distinguish a real error from the case where the lower layer simply has no data,
so a further check is made on the error code (errno).

Further checks on the error code

EAGAIN and EWOULDBLOCK are both error codes defined by the system; on Linux they typically share the value 11.
If errno equals either of them, nothing actually went wrong: read returned in the form of an error
only because the data was not ready, and the check can simply be repeated later.


If errno is EINTR, the IO was interrupted by a signal and read should be retried
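The checks on errno can be gathered into one place, as in the sketch below. The enum ReadStatus and the function TryRead are hypothetical names introduced only for illustration; the sketch assumes the caller has already put fd into non-blocking mode if it wants the NotReady case.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical classification of the outcomes of one read() call.
enum class ReadStatus { Data, EndOfFile, NotReady, Interrupted, Error };

// Reads once from fd and classifies the result.
// *got receives the raw return value of read().
ReadStatus TryRead(int fd, char* buf, size_t cap, ssize_t* got)
{
    ssize_t n = read(fd, buf, cap);
    *got = n;
    if (n > 0)
    {
        return ReadStatus::Data;        // n bytes were copied
    }
    if (n == 0)
    {
        return ReadStatus::EndOfFile;   // the peer closed or the file ended
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK)
    {
        return ReadStatus::NotReady;    // not a real error: check again later
    }
    if (errno == EINTR)
    {
        return ReadStatus::Interrupted; // interrupted by a signal: retry
    }
    return ReadStatus::Error;           // a real error
}
```

A caller would typically do other work on NotReady, retry at once on Interrupted, and stop on EndOfFile or Error.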


Doing other things when the data is not ready

With non-blocking IO, the process can go and do other things whenever a check finds that the data is not ready.


Define a function wrapper type that takes no parameters and returns void, and name it func_t.
Define a vector whose element type is func_t.


Set up three tasks: PrintLog, OperMysql, and CheckNet


Create a LoadTask function that inserts the tasks into the funcs vector.


In main, call LoadTask to load the tasks


Create a HandlerALLTask function that traverses the vector, whose elements are the tasks, and runs each one.
When the data is not ready, the loop calls it to process the tasks.


Complete code

mytest.cc

#include<iostream>
#include<unistd.h>
#include<fcntl.h>
#include<cerrno>
#include<cstdio>
#include<cstring>
#include<vector>
#include<functional>
using namespace std;

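// Tasks
void PrintLog() // print a log entry
{
    cout<<"This is a log-printing routine"<<endl;
}

void OperMysql() // operate on the database
{
    cout<<"This is a database-operation routine"<<endl;
}

void CheckNet() // check the network status
{
    cout<<"This is a network-status-check routine"<<endl;
}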

using func_t =function<void(void)>;
vector<func_t>  funcs;

void LoadTask()
{
    funcs.push_back(PrintLog);
    funcs.push_back(OperMysql);
    funcs.push_back(CheckNet);
}

void HandlerALLTask()
{
    // Traverse the vector and run every task
    for(auto& func:funcs)
    {
        func();
    }
}

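void SetNonBlock(int fd) // set the file descriptor to non-blocking mode
{
    int fl=fcntl(fd,F_GETFL); // get the current file status flags
    if(fl<0) // failed to get the flags
    {
        cerr<<"error string: "<<strerror(errno)<<" error code: "<<errno<<endl;
        return; // do not call F_SETFL with an invalid flag value
    }
    fcntl(fd,F_SETFL,fl | O_NONBLOCK); // add O_NONBLOCK to the existing flags
}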

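int main()
{
    char buffer[64];
    SetNonBlock(0); // switch standard input to non-blocking mode
    LoadTask();     // load the tasks
    while(true)
    {
        // 0 is the standard input stream
        ssize_t n=read(0,buffer,sizeof(buffer)-1); // check whether data is ready
        if(n>0) // read succeeded
        {
            buffer[n]=0; // terminate the string
            if(buffer[n-1]=='\n') buffer[n-1]=0; // drop the trailing newline
            cout<<"echo# "<<buffer<<endl;
        }
        else if(n==0) // end of file
        {
            cout<<"end file"<<endl;
            break; // nothing more to read
        }
        else // read returned -1
        {
            if(errno==EAGAIN || errno==EWOULDBLOCK)
            {
                // Not a real error: read only returned in the form of an error.
                // The underlying data is not ready; check again next time.
                HandlerALLTask(); // traverse the vector and process the tasks
                sleep(1);
                cout<<"data not  ready"<<endl;
                continue;
            }
            else if(errno==EINTR)
            {
                // The IO was interrupted by a signal; check again
                continue;
            }
            else // a real error
            {
                cout<<"read error"<<" error string: "<<strerror(errno)<<" error code: "<<errno<<endl;
                break;
            }
        }
        sleep(1);
    }
    return 0;
}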

 

makefile

mytest:mytest.cc
	g++ -o $@ $^ -std=c++11
	
.PHONY:clean
clean:
	rm -f mytest

5. select

Why is select needed?

File interfaces such as read and recv operate on only one file descriptor at a time.
If you want a single call to wait on multiple file descriptors, interfaces such as read cannot do it.
For this, the operating system provides the select interface for IO multiplexing.


select does two things:
1. It waits on multiple file descriptors
2. It is only responsible for waiting (it cannot copy data)


select interface

Run man select to view the documentation

Since select only waits and does not copy data, its parameters include no buffer.

Understanding the first parameter, nfds

The first parameter nfds is an input parameter. It is the largest value among the file descriptors (fd) that select waits on, plus 1
(a file descriptor is essentially an array subscript, so nfds = largest file descriptor value + 1).
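As a small illustration of the rule nfds = largest file descriptor value + 1, the sketch below computes nfds from a list of descriptors. The helper ComputeNfds is a hypothetical name, and returning 0 for an empty list is a choice made here, not something the select interface prescribes.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: nfds is the largest descriptor value plus 1.
// Returns 0 when the list is empty (no descriptors to wait on).
int ComputeNfds(const std::vector<int>& fds)
{
    if (fds.empty())
    {
        return 0;
    }
    return *std::max_element(fds.begin(), fds.end()) + 1;
}
```

For descriptors 3, 7 and 5 the largest value is 7, so nfds would be 8.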

What are input-output parameters?

Through such a parameter the user hands data to the operating system, and the operating system hands results back to the user through the same parameter.
Parameters are made input-output so that information can pass in both directions between the user and the operating system.

Understanding the last parameter, timeout

timeout is an input-output parameter


The data type of timeout is struct timeval

It is a time structure: tv_sec holds seconds and tv_usec holds microseconds


timeout can be set in three ways

First: pass NULL, which means select waits in blocking mode (as long as none of the file descriptors is ready, select does not return)

Second: define a struct timeval object with both fields set to 0, which means non-blocking waiting
(if none of the file descriptors is ready, select returns immediately with a return value of 0)

Third: define a struct timeval object with the fields set to 5 and 0,
which means blocking for at most 5s; if nothing becomes ready in that time, select returns because of the timeout.
If a file descriptor becomes ready at 3s, select returns, and on Linux timeout is updated to the remaining time, 2s (5-3=2)
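The three ways of setting timeout can be sketched with one helper. The function name WaitReadable and the convention that a negative seconds value means "pass NULL" are assumptions made for this example only.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: waits for fd to become readable.
//   seconds <  0 : NULL timeout, block until fd is ready
//   seconds == 0 : {0, 0}, check once and return immediately
//   seconds >  0 : block for at most that many seconds
// Returns the return value of select().
int WaitReadable(int fd, int seconds)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    struct timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    struct timeval* timeout = (seconds < 0) ? nullptr : &tv;

    // nfds is the largest descriptor value + 1
    return select(fd + 1, &rfds, nullptr, nullptr, timeout);
}
```

Because timeout is an input-output parameter and may be modified by select, the sketch builds a fresh struct timeval on every call instead of reusing one.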


Understanding the readfds, writefds, and exceptfds parameters

The three parameters readfds, writefds, and exceptfds work in the same way:
readfds represents read events
writefds represents write events
exceptfds represents exceptional events

All three are pointers to fd_set


fd_set is a bitmap structure that represents multiple file descriptors with bits:
the position of a bit represents the value of a file descriptor.


The bits of this bitmap should not be manipulated directly with bitwise AND or OR; use the interfaces provided by the system instead.

FD_CLR: Remove the specified file descriptor from the specified set

FD_ISSET: Test whether the file descriptor is in the set

FD_SET: Add a file descriptor to the specified set

FD_ZERO: Clear the entire set
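A short sketch of how these macros fit together is given below. The helpers MakeSet and ListMembers are hypothetical names; they only wrap FD_ZERO, FD_SET and FD_ISSET and assume every descriptor is smaller than FD_SETSIZE.

```cpp
#include <sys/select.h>
#include <vector>

// Hypothetical helper: builds an fd_set containing the given descriptors.
fd_set MakeSet(const std::vector<int>& fds)
{
    fd_set set;
    FD_ZERO(&set);                 // start from an empty set
    for (int fd : fds)
    {
        if (fd >= 0 && fd < FD_SETSIZE)
        {
            FD_SET(fd, &set);      // set the bit at position fd
        }
    }
    return set;
}

// Hypothetical helper: lists the descriptors in [0, limit) whose bit is set.
// The set is taken by value, so the caller's copy is never modified.
std::vector<int> ListMembers(fd_set set, int limit)
{
    std::vector<int> members;
    for (int fd = 0; fd < limit && fd < FD_SETSIZE; ++fd)
    {
        if (FD_ISSET(fd, &set))
        {
            members.push_back(fd);
        }
    }
    return members;
}
```

After MakeSet({3, 5}), calling FD_CLR(3, &set) on the result would leave only descriptor 5 in the set.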


Take read events (readfds) as an example

On input, the user places file descriptors in the readfds set to tell the kernel which file descriptors' read events it should watch.
On return, the kernel uses the same set to tell the user which file descriptors' read events are ready.


Suppose you want the operating system to watch the events on eight file descriptors

To tell the kernel, the user defines an fd_set object rfds and sets the eight corresponding bits to 1.
The bit position indicates the file descriptor number,
and a bit set to 1 means the operating system should watch that file descriptor.
For example: to watch file descriptors 1 through 8, that is, to check whether they are ready, set bits 1 through 8.


When select returns, the kernel has overwritten rfds: only the bits corresponding to ready file descriptors are still set to 1

For example: if No. 3 and No. 5 are ready, only bits 3 and 5 are set to 1, indicating that file descriptors No. 3 and No. 5 are ready.
Because rfds is overwritten, it has to be set up again before every call to select.
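The input and output roles of readfds can be sketched as follows. ReadyForRead is a hypothetical helper; it assumes every descriptor in the list is open and smaller than FD_SETSIZE, and it uses a zero timeout so that it only checks and never waits.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <algorithm>
#include <vector>

// Hypothetical helper: returns the descriptors in fds that are ready to read.
std::vector<int> ReadyForRead(const std::vector<int>& fds)
{
    std::vector<int> ready;
    if (fds.empty())
    {
        return ready;
    }

    // Input: tell the kernel which descriptors to watch.
    fd_set rfds;
    FD_ZERO(&rfds);
    int maxfd = -1;
    for (int fd : fds)
    {
        FD_SET(fd, &rfds);
        maxfd = std::max(maxfd, fd);
    }

    struct timeval tv = {0, 0}; // check once, do not block
    int n = select(maxfd + 1, &rfds, nullptr, nullptr, &tv);
    if (n <= 0)
    {
        return ready;           // timeout or failure: nothing is ready
    }

    // Output: only the bits of ready descriptors are still set.
    for (int fd : fds)
    {
        if (FD_ISSET(fd, &rfds))
        {
            ready.push_back(fd);
        }
    }
    return ready;
}
```

The set is rebuilt from the list on every call, since select overwrites it with the result.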


Return value of select

The return value of select has three cases.
First: greater than 0,
which is the number of file descriptors that are ready.

Second: equal to 0,
which means the wait timed out, that is, no file descriptor became ready within the timeout (5s in the example above).

Third: less than 0;
select returns -1 when the wait fails.
For example: if you ask it to wait on file descriptors 1 and 2 but file descriptor 2 is not open at all, the wait fails (errno is EBADF).
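The three cases of the return value can be handled as in the sketch below. DescribeSelect is a hypothetical helper that watches a single descriptor for reading and turns the result into a short string; the wording of the strings is an assumption made for the example.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper: waits up to `seconds` for fd to become readable
// and reports which of the three cases occurred.
std::string DescribeSelect(int fd, int seconds)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    struct timeval tv = {seconds, 0};

    int n = select(fd + 1, &rfds, nullptr, nullptr, &tv);
    if (n > 0)
    {
        return "ready: " + std::to_string(n); // n descriptors are ready
    }
    if (n == 0)
    {
        return "timeout";                     // nothing became ready in time
    }
    // n == -1: the wait failed, for example EBADF for a closed descriptor
    return std::string("error: ") + std::strerror(errno);
}
```

Passing a descriptor that has already been closed should take the third branch, which matches the example of waiting on a file descriptor that does not exist.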


Origin blog.csdn.net/qq_62939852/article/details/133524323