How to understand efficient IO

Table of contents

Preface

1. How to understand efficient IO

2. Five IO models

3. Non-blocking IO

4. Writing non-blocking code

Summary


Preface

Hello, and nice to meet you all! Today's topic is IO. IO is one of the most common operations in computing, for example writing data to a peripheral device or sending data from host A to host B. Because it is so common, reducing the time spent on IO has become an important performance concern. This article introduces how to make IO more efficient.

1. How to understand efficient IO

IO is essentially copying data from one place to another. For example, calling read/recv copies data out of a kernel buffer. Data can only be copied once it exists, so every IO operation consists of two parts: waiting for the data to be ready, and copying the data. The copy speed is largely determined by the hardware, so achieving efficient IO means reducing the time spent waiting.

2. Five IO models

Illustration: a fishing example
Zhang San sits motionless by the river, waiting for a fish to take the bait.
Li Si places his fishing rod by the river, goes off to do other things, and comes back every so often to check whether a fish has taken the bait.
Wang Wu hangs a bell on his rod, places the rod by the river, and goes to do other things. When the bell rings, a fish has taken the bait, and he comes back to reel it in.
Zhao Liu brings a whole set of fishing rods, sets them all up by the river, and polls them in turn to see whether any fish has taken the bait.
Tian Qi finds a helper, Xiao Liu, to fish for him. When a fish is caught, Xiao Liu notifies Tian Qi, who simply comes to take the fish away.

Zhang San corresponds to blocking IO.
Li Si corresponds to non-blocking IO.
Wang Wu corresponds to signal-driven IO.
Zhao Liu corresponds to IO multiplexing.
Tian Qi corresponds to asynchronous IO.
In terms of fishing (IO) efficiency, there is no difference among Zhang San, Li Si, and Wang Wu: each of them waits on a single rod. Looking at the whole picture, though, Li Si and Wang Wu can do other things while they wait.
Signal-driven IO: although the data is copied only after the signal arrives, the process is still essentially waiting for the data.

In the first four methods, the person fishing takes part in the IO himself (waiting, copying, or both), so they all count as synchronous IO.
In the fifth method, the initiator takes part in neither stage of the IO, so it counts as asynchronous IO.
The difference between blocking IO and non-blocking IO:
Similarity: both copy the data themselves.
Difference: they wait in different ways.

Blocking IO model:

The system call does not return until the kernel has the data ready and has copied it. All sockets are blocking by default.
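To make the "wait, then copy" behavior concrete, here is a minimal sketch using a pipe instead of a socket. The helper name BlockingReadDemo, the message "fish", and the delay are illustrative assumptions, not part of the original article. A child process writes only after a delay, and the parent measures how long its blocking read() sat waiting.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the article): returns how many milliseconds
// a blocking read() took to return, and stores the bytes it read in `out`.
// Returns -1 if the pipe or the child process could not be created.
// delay_ms is assumed to be below 1000, the portable limit for usleep().
long BlockingReadDemo(std::string& out, int delay_ms) {
    int fds[2];
    if (pipe(fds) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {
        // Child: the "fish" only shows up after delay_ms.
        close(fds[0]);
        usleep(delay_ms * 1000);
        const char msg[] = "fish";
        ssize_t n = write(fds[1], msg, sizeof(msg) - 1);
        _exit(n < 0 ? 1 : 0);
    }

    // Parent: read() blocks (the waiting stage) until the child writes,
    // then copies the bytes into buf (the copying stage).
    close(fds[1]);
    char buf[64];
    auto start = std::chrono::steady_clock::now();
    ssize_t n = read(fds[0], buf, sizeof(buf));
    auto end = std::chrono::steady_clock::now();

    if (n > 0) out.assign(buf, static_cast<size_t>(n));
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

Calling BlockingReadDemo(s, 200) should report a time close to the delay: almost all of it is waiting, and the copy itself is negligible.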

Non-blocking IO model:

If the kernel does not have the data ready, the system call still returns immediately, with the error code EWOULDBLOCK (or EAGAIN).
Non-blocking IO usually requires the programmer to retry the read or write on the file descriptor in a loop. This process is called polling. Polling wastes a great deal of CPU time, so it is generally used only in specific scenarios.
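The polling loop described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name PollingReadDemo is made up, a pipe stands in for a real data source, and the function writes to the pipe itself after a chosen number of empty polls to simulate data arriving.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper (not from the article): polls a non-blocking pipe
// until data arrives. Returns the number of polls that found no data,
// or -1 on a real error. The read data is stored in `out`.
int PollingReadDemo(int empty_polls_before_data, std::string& out) {
    int fds[2];
    if (pipe(fds) < 0) return -1;

    // Switch the read end to non-blocking mode.
    int fl = fcntl(fds[0], F_GETFL);
    if (fl < 0 || fcntl(fds[0], F_SETFL, fl | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    int empty_polls = 0;
    int result = -1;
    bool written = false;
    char buf[64];
    for (;;) {
        ssize_t n = read(fds[0], buf, sizeof(buf));
        if (n > 0) {                       // data was ready: copy succeeded
            out.assign(buf, static_cast<size_t>(n));
            result = empty_polls;
            break;
        }
        if (n == 0) { result = empty_polls; break; }         // end of file
        if (errno == EINTR) continue;                        // interrupted: retry
        if (errno != EAGAIN && errno != EWOULDBLOCK) break;  // real error

        // No data yet: this is where other work could be done.
        ++empty_polls;
        if (!written && empty_polls >= empty_polls_before_data) {
            // Simulate the data finally arriving.
            const char msg[] = "hello";
            if (write(fds[1], msg, sizeof(msg) - 1) < 0) break;
            written = true;
        }
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Every empty poll is a system call that accomplishes nothing, which is why a tight polling loop burns CPU unless useful work is done between attempts.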

Signal-driven IO model:

When the kernel has the data ready, it sends the SIGIO signal to notify the application that it can perform the IO operation.

IO multiplexing model:

Although its flow chart looks similar to that of blocking IO, the key point is that IO multiplexing can wait for the ready status of multiple file descriptors at the same time.
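As a small preview of waiting on several descriptors at once, the sketch below uses the POSIX poll() call on a few pipes. The helper name MultiplexDemo and the pipe setup are assumptions made for illustration only. One pipe gets data, and a single poll() call finds out which one is readable.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper (not from the article): creates `count` pipes, writes
// one byte to the pipe at `ready_index`, and returns the index that poll()
// reports as readable, or -1 on failure or invalid arguments.
int MultiplexDemo(int count, int ready_index) {
    if (count <= 0 || ready_index < 0 || ready_index >= count) return -1;

    std::vector<int> rfd(count, -1), wfd(count, -1);
    std::vector<pollfd> pfds(count);
    bool ok = true;
    for (int i = 0; i < count; ++i) {
        int fds[2];
        if (pipe(fds) < 0) { ok = false; break; }
        rfd[i] = fds[0];
        wfd[i] = fds[1];
        pfds[i].fd = fds[0];
        pfds[i].events = POLLIN;  // we care about "readable"
        pfds[i].revents = 0;
    }

    int found = -1;
    if (ok && write(wfd[ready_index], "x", 1) == 1) {
        // One call waits on every descriptor at once (timeout: 1000 ms).
        int n = poll(pfds.data(), pfds.size(), 1000);
        if (n > 0) {
            for (int i = 0; i < count; ++i) {
                if (pfds[i].revents & POLLIN) { found = i; break; }
            }
        }
    }

    for (int i = 0; i < count; ++i) {
        if (rfd[i] >= 0) close(rfd[i]);
        if (wfd[i] >= 0) close(wfd[i]);
    }
    return found;
}
```

This is Zhao Liu's approach: the waiting for many rods overlaps, so the average wait per descriptor goes down even though each individual call still blocks.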

Asynchronous IO:

The kernel notifies the application when the data copy has completed (whereas signal-driven IO tells the application when it can start copying the data).

Summary:
Every IO operation has two steps: waiting, then copying. In real applications, the time spent waiting is often much greater than the time spent copying, so the core way to make IO more efficient is to reduce the waiting time as much as possible.

3. Non-blocking IO

By default, IO on a file descriptor is blocking. The fcntl() function can change this.

The fcntl() function prototype is as follows.

#include <unistd.h>
#include <fcntl.h>
int fcntl(int fd, int cmd, ... /* arg */ );

The value passed as cmd determines which additional arguments follow.
The fcntl function has five uses:

Duplicate an existing descriptor (cmd=F_DUPFD).
Get/set the file descriptor flags (cmd=F_GETFD or F_SETFD).
Get/set the file status flags (cmd=F_GETFL or F_SETFL).
Get/set asynchronous IO ownership (cmd=F_GETOWN or F_SETOWN).
Get/set record locks (cmd=F_GETLK, F_SETLK, or F_SETLKW).

Only the third use is needed here: getting and setting the file status flags is how a file descriptor is switched to non-blocking mode.
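For contrast with the file status flags, here is a brief sketch of the first two uses in the list: F_DUPFD and F_GETFD/F_SETFD. The helper names DupAtLeast, SetCloseOnExec, and HasCloseOnExec are hypothetical and exist only for this example. Note that FD_CLOEXEC is a file descriptor flag, while O_NONBLOCK is a file status flag, so they are read and written with different cmd values.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: duplicates fd onto the lowest free descriptor
// that is greater than or equal to min_fd. Returns -1 on failure.
int DupAtLeast(int fd, int min_fd) {
    return fcntl(fd, F_DUPFD, min_fd);
}

// Hypothetical helper: sets the close-on-exec file descriptor flag.
bool SetCloseOnExec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0) return false;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}

// Hypothetical helper: reports whether close-on-exec is set on fd.
bool HasCloseOnExec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    return flags >= 0 && (flags & FD_CLOEXEC) != 0;
}
```

A descriptor created with F_DUPFD starts with close-on-exec cleared, so HasCloseOnExec returns false until SetCloseOnExec is called on it.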

Implementing SetNoBlock
Based on fcntl, we implement a SetNoBlock function that sets a file descriptor to non-blocking.

void SetNoBlock(int fd) {
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0) {
        perror("fcntl");
        return;
    }
    fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

Use F_GETFL to get the current file status flags of the descriptor (this is a bitmap).
Then use F_SETFL to write the flags back, with O_NONBLOCK OR-ed in.
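Because the flags are a bitmap, the same get-then-set pattern can also clear O_NONBLOCK or check whether it is set. The sketch below is one possible variation; the names SetNonBlocking and IsNonBlocking are assumptions and are not the article's SetNoBlock. Unlike SetNoBlock, it reports failure to the caller.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: turns O_NONBLOCK on or off while leaving the other
// status flags untouched. Returns true on success.
bool SetNonBlocking(int fd, bool enable) {
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0) return false;
    fl = enable ? (fl | O_NONBLOCK) : (fl & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, fl) != -1;
}

// Hypothetical helper: reports whether O_NONBLOCK is currently set.
bool IsNonBlocking(int fd) {
    int fl = fcntl(fd, F_GETFL);
    return fl >= 0 && (fl & O_NONBLOCK) != 0;
}
```

Reading the flags first matters: calling F_SETFL with only O_NONBLOCK would silently drop any other status flags, such as O_APPEND, that were already set.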

4. Writing non-blocking code

Description: the user types data, and the program reads it and prints it to the display. Standard input (the keyboard) is essentially a file, and its file descriptor is 0. Here, file descriptor 0 is set to non-blocking: data is read when it is available, and when there is none the program can handle other work instead of blocking.

main.cc

#include "util.hpp"
#include <cstdio>
#include <functional>
#include <vector>
using func_t = std::function<void()>;

// Register the tasks to run while there is no input.
#define INIT(v) do {\
    v.push_back(PrintLog);\
    v.push_back(Download);\
}while(0)
// Run every registered task once.
#define callback(cal) do{\
    for(auto& e: cal) e();\
}while(0)
int main()
{
    std::vector<func_t> cbs;
    INIT(cbs);
    setNoBlock(0);
    while(true) {
        char buf[1024];
        printf(">>> ");
        fflush(stdout);
        ssize_t ret = read(0,buf,sizeof(buf)-1);
        if(ret == 0) {
            std::cout<< "read end" << std::endl;
            break;
        }
        else if(ret > 0){
            buf[ret] = 0;
            if(buf[ret-1] == '\n') buf[ret-1] = 0; // drop the trailing newline
            std::cout << "echo# " << buf << std::endl;
        }
        else {
            // With no input there is simply no data in the kernel yet. That is
            // not a real error; it is only reported in the form of an error.
            // To tell a real error from "no data", check errno:
            // EAGAIN and EWOULDBLOCK both mean no data is available.
            if(errno == EAGAIN || errno == EWOULDBLOCK) {
                std::cout << "no data" << std::endl;
                callback(cbs);
            }
            else if(errno == EINTR) continue; // interrupted by a signal: retry
            else {
                // A real error.
                std::cout << "read failed, errno: "<< strerror(errno) << std::endl;
                break;
            }
        }
        sleep(1);
    }
    return 0;
}

util.hpp:

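#pragma once
#include <iostream>
#include <cstring>
#include <cerrno>
#include <unistd.h>
#include <fcntl.h>

// Set fd to non-blocking mode.
inline void setNoBlock(int fd) {
    int fl = fcntl(fd,F_GETFL);
    if(fl < 0) {
        std::cerr<< "fcntl fail: " << strerror(errno) << std::endl;
        return; // do not pass an invalid flag value to F_SETFL
    }
    if(fcntl(fd,F_SETFL,fl | O_NONBLOCK) < 0)
        std::cerr<< "fcntl fail: " << strerror(errno) << std::endl;
}
inline void PrintLog() {
    std::cout << "this is a LOG" << std::endl;
}
inline void Download() {
    std::cout << "this is a Download" << std::endl;
}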


Summary

I hope that after reading this article you understand that achieving efficient IO means reducing the waiting time. How can the waiting time be reduced? The usual answer is IO multiplexing, which includes the select, poll, and epoll models. I will introduce these three models one by one in later articles. Thank you for reading; that is all for today.

Origin blog.csdn.net/qq_65307907/article/details/132323483