Pipe communication in Linux inter-process communication

Processes are independent of one another, so transmitting data between them requires inter-process communication (IPC). IPC mechanisms fall into three broad categories: pipes, which are built on the file system; System V IPC, which is for communication on the local host; and mechanisms based on the POSIX standard (cross-host communication is also possible, through sockets). This article covers the first category, pipes. Before looking at pipe communication, we first need to know what a pipe is.

The concept of a pipe

Take a water pipe as an example: water flows from one end to the other. In the computer world, there is a process at each end of the pipe, and the water corresponds to the data transmitted between the two processes. So what exactly is a pipe? It can be pictured with a diagram:

Pipes belong to the kernel, so the OS provides system calls for creating them, which we will cover later. First, let's talk about why we use pipes.

People often say that a pipe is a memory-level file based on the file system. How should we understand this statement?

Before answering, consider how two processes could communicate through an ordinary file. It is simple: create a file on disk. The writing process loads the file into memory, writes its data there, and the data is periodically flushed to disk. The reading process loads the file into memory and reads the data from memory. All of this file I/O greatly reduces the efficiency of communication between the processes. A pipe is based on the file system so that it can be read and written like any other file. It is also a memory-level file because it is created in memory: each process only touches memory to transfer data and the disk is never involved, which makes the communication more efficient.
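
To make the "file-like, but memory-level" idea concrete, here is a minimal sketch. The helper names isPipeFd and isSeekable are made up for illustration. It checks two things: fstat works on a pipe descriptor and reports the FIFO file type, and lseek fails with ESPIPE because a pipe is a byte stream with no file offset to move.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: true if fd refers to a pipe/FIFO object.
// fstat works because a pipe is exposed through the file interface.
bool isPipeFd(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0) return false;
    return S_ISFIFO(st.st_mode);
}

// Hypothetical helper: true if the descriptor has a file offset that can be queried.
// For a pipe, lseek fails and sets errno to ESPIPE.
bool isSeekable(int fd)
{
    return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}
```

Calling isPipeFd on either descriptor returned by pipe should give true, while isSeekable should give false for both.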

Anonymous pipes

Pipes are divided into anonymous pipes and named pipes. When a process creates a pipe through a system call, the pipe is opened like a file, because it is a memory-level file based on the file system, and the process's file descriptor table stores its descriptors. A pipe has two descriptors, one for the read end and one for the write end:

But processes are independent of each other, so how can another process see this pipe? We create a child process with fork. The child inherits a copy of the parent's file descriptor table, so it can reach the same pipe through those descriptors:

So that each of the two processes performs only its own read or write task, each closes the file descriptor it does not need:
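
A minimal sketch of this fork-then-close pattern, assuming a one-shot message is enough to show the idea. The function name sendFromChild is hypothetical. The child keeps only the write end, the parent keeps only the read end and reads until end of file.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the child writes msg into the pipe, the parent reads it back.
// Returns the text received by the parent, or an empty string on failure.
std::string sendFromChild(const std::string &msg)
{
    int fds[2];
    if (pipe(fds) != 0) return "";

    pid_t id = fork();
    if (id < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return "";
    }

    if (id == 0)
    {
        close(fds[0]); // the child only writes
        ssize_t n = write(fds[1], msg.data(), msg.size());
        (void)n;
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]); // the parent only reads; closing its write end lets read see EOF
    std::string received;
    char buffer[256];
    ssize_t s;
    while ((s = read(fds[0], buffer, sizeof buffer)) > 0)
        received.append(buffer, s);
    close(fds[0]);
    waitpid(id, nullptr, 0);
    return received;
}
```

If the parent forgot to close fds[1], its read loop would never see end of file, because one write end (its own) would still be open.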

Therefore, a pipe used for communication between processes with a blood relationship (such as parent and child) is called an anonymous pipe. Its system call interface is pipe:

int pipe(int pipefd[2]);

The pipefd array receives the pipe's two file descriptors: pipefd[0] is the read-end descriptor and pipefd[1] is the write-end descriptor. pipe returns 0 on success and -1 on failure. Now let's implement communication between a parent and a child process:


#include <iostream>
#include <cstdio>
#include <cstring>
#include <string>
#include <cassert>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

using namespace std;

// The parent process reads; the child process writes
int main()
{
    // Step 1: create the pipe, which opens both the read end and the write end
    int fds[2];
    int n = pipe(fds);
    assert(n == 0);

    // Step 2: fork
    pid_t id = fork();
    assert(id >= 0);
    if (id == 0)
    {
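        // The child writes, so it closes the read end
        close(fds[0]);
        // Child-side communication code
        // string msg = "hello , i am child";
        const char *s = "I am the child process, and I am sending you a message";
        int cnt = 0;
        while (true)
        {
            cnt++;
            char buffer[1024]; // visible only to the child
            snprintf(buffer, sizeof buffer, "child->parent say: %s[%d][%d]", s, cnt, getpid());
            // Once the pipe is full, the next write blocks until the other side reads
            write(fds[1], buffer, strlen(buffer));
            cout << "count: " << cnt << endl;
            // sleep(50); // detail: pause between writes
            // break;
        }

        // Not reached while the loop above runs forever: the child is expected
        // to be killed by SIGPIPE once the parent closes the read end
        close(fds[1]); // the child closes its write end
        cout << "The child closes its write end" << endl;
        // sleep(10000);
        exit(0);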
    }
    // The parent reads, so it closes the write end
    close(fds[1]);
    // Parent-side communication code
    while (true)
    {
        sleep(2);
        char buffer[1024];
        // cout << "AAAAAAAAAAAAAAAAAAAAAA" << endl;
        // If the pipe holds no data, read blocks the calling process by default
        ssize_t s = read(fds[0], buffer, sizeof(buffer) - 1);
        // cout << "BBBBBBBBBBBBBBBBBBBBBB" << endl;
        if (s > 0)
        {
            buffer[s] = 0;
            cout << "Get Message# " << buffer << " | my pid: " << getpid() << endl;
        }
        else if(s == 0)
        {
            // End of file: every write end has been closed
            cout << "read: " << s << endl;
            break;
        }
        break; // read only once, then close the read end while the child is still writing

        // Detail: the parent does not sleep here
        // sleep(5);
    }
    close(fds[0]);
    cout << "The parent closes its read end" << endl;

    int status = 0;
    n = waitpid(id, &status, 0);
    assert(n == id);

    // The low 7 bits of status hold the signal that terminated the child (13 is SIGPIPE on Linux)
    cout <<"pid->"<< n << " : "<< (status & 0x7F) << endl;

    return 0;
}
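
The program above prints status & 0x7F to show which signal ended the child. As a hedged sketch of the same experiment, the hypothetical function signalThatKilledWriter below uses the standard WIFSIGNALED and WTERMSIG macros instead of a manual mask. It assumes the default SIGPIPE action, which it restores in the child in case the caller had ignored the signal.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that writes forever, closes every read end,
// and returns the signal that terminated the child.
// Returns 0 if the child exited normally and -1 on setup errors.
int signalThatKilledWriter()
{
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t id = fork();
    if (id < 0) return -1;

    if (id == 0)
    {
        close(fds[0]);
        signal(SIGPIPE, SIG_DFL); // use the default action: terminate the process
        const char msg[] = "x";
        while (true) // keep writing until the signal arrives
        {
            if (write(fds[1], msg, 1) < 0) _exit(1);
        }
    }

    close(fds[1]);
    close(fds[0]); // now no process holds a read end

    int status = 0;
    if (waitpid(id, &status, 0) != id) return -1;
    if (WIFSIGNALED(status)) return WTERMSIG(status);
    return 0;
}
```

On Linux the return value should equal SIGPIPE, which matches the number the original program prints.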

Pipe read and write rules:

When there is no data to read:
O_NONBLOCK disabled: read blocks, that is, the process suspends execution until data arrives.
O_NONBLOCK enabled: read returns -1 and errno is set to EAGAIN.
When the pipe is full:
O_NONBLOCK disabled: write blocks until a process reads data from the pipe.
O_NONBLOCK enabled: write returns -1 and errno is set to EAGAIN.
If every file descriptor for the write end of the pipe has been closed, read returns 0 once any remaining data has been read.
If every file descriptor for the read end of the pipe has been closed, write raises the SIGPIPE signal, whose default action terminates the writing process.
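
The rules above can be checked in a single process. The following is a sketch with hypothetical helper names, not a definitive test suite. One detail goes beyond the list: the last helper ignores SIGPIPE so the process survives, in which case write fails with errno set to EPIPE.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Turns on O_NONBLOCK for fd while keeping its other status flags.
bool setNonBlocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Empty pipe, O_NONBLOCK enabled: returns the errno from read (0 if read did not fail).
int errnoOfReadOnEmptyPipe()
{
    int fds[2];
    if (pipe(fds) != 0) return -1;
    setNonBlocking(fds[0]);
    char c;
    ssize_t s = read(fds[0], &c, 1);
    int err = (s < 0) ? errno : 0;
    close(fds[0]);
    close(fds[1]);
    return err;
}

// O_NONBLOCK enabled: fill the pipe until write fails, then return that errno.
int errnoOfWriteOnFullPipe()
{
    int fds[2];
    if (pipe(fds) != 0) return -1;
    setNonBlocking(fds[1]);
    char block[4096] = {0};
    int err = 0;
    while (true)
    {
        if (write(fds[1], block, sizeof block) < 0)
        {
            err = errno;
            break;
        }
    }
    close(fds[0]);
    close(fds[1]);
    return err;
}

// Every write end closed: returns what read returns (expected 0, end of file).
ssize_t readAfterWriterClosed()
{
    int fds[2];
    if (pipe(fds) != 0) return -1;
    close(fds[1]);
    char c;
    ssize_t s = read(fds[0], &c, 1);
    close(fds[0]);
    return s;
}

// Every read end closed and SIGPIPE ignored (an assumption of this sketch):
// returns the errno from write.
int errnoOfWriteWithoutReader()
{
    int fds[2];
    if (pipe(fds) != 0) return -1;
    close(fds[0]);
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);
    ssize_t n = write(fds[1], "x", 1);
    int err = (n < 0) ? errno : 0;
    signal(SIGPIPE, old); // restore the previous disposition
    close(fds[1]);
    return err;
}
```

The two blocking cases (O_NONBLOCK disabled) are not covered here, since a single process that blocks on its own pipe would simply hang.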

Implementing a process pool with anonymous pipes:

#include <iostream>
#include <string>
#include <vector>
#include <cstdlib>
#include <cassert>
#include <ctime>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MakeSeed() srand((unsigned long)time(nullptr) ^ getpid() ^ 0x171237 ^ rand() % 1234)

#define PROCSS_NUM 10

// Tasks for the child processes to perform (simulated)
// Function pointer type
typedef void (*func_t)();

void downLoadTask()
{
    std::cout << getpid() << ": download task\n"
              << std::endl;
    sleep(1);
}

void ioTask()
{
    std::cout << getpid() << ": IO task\n"
              << std::endl;
    sleep(1);
}

void flushTask()
{
    std::cout << getpid() << ": flush task\n"
              << std::endl;
    sleep(1);
}

void loadTaskFunc(std::vector<func_t> *out)
{
    assert(out);
    out->push_back(downLoadTask);
    out->push_back(ioTask);
    out->push_back(flushTask);
}

// The code below is a multi-process program
class subEp // Endpoint
{
public:
    subEp(pid_t subId, int writeFd)
        : subId_(subId), writeFd_(writeFd)
    {
        char nameBuffer[1024];
        snprintf(nameBuffer, sizeof nameBuffer, "process-%d[pid(%d)-fd(%d)]", num++, subId_, writeFd_);
        name_ = nameBuffer;
    }

public:
    static int num;
    std::string name_;
    pid_t subId_;
    int writeFd_;
};

int subEp::num = 0;

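int recvTask(int readFd)
{
    int code = 0;
    ssize_t s = read(readFd, &code, sizeof code);
    if(s == (ssize_t)sizeof(code)) return code; // a complete command code arrived
    else if(s <= 0) return -1;                  // EOF or error: tell the caller to quit
    else return 0;                              // short read: falls back to task 0
}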

void sendTask(const subEp &process, int taskNum)
{
    std::cout << "send task num: " << taskNum << " send to -> " << process.name_ << std::endl;
    int n = write(process.writeFd_, &taskNum, sizeof(taskNum));
    assert(n == sizeof(int));
    (void)n;
}

void createSubProcess(std::vector<subEp> *subs, std::vector<func_t> &funcMap)
{
    std::vector<int> deleteFd;
    for (int i = 0; i < PROCSS_NUM; i++)
    {
        int fds[2];
        int n = pipe(fds);
        assert(n == 0);
        (void)n;
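        // Files opened by the parent are inherited by the child.
        // Think through several iterations of this loop: each later child
        // also inherits the write ends of the earlier children's pipes.
        pid_t id = fork();
        if (id == 0)
        {
            // Close the inherited write ends of earlier pipes; otherwise the
            // earlier children would not see EOF when the parent closes its copies
            for(size_t j = 0; j < deleteFd.size(); j++) close(deleteFd[j]);
            // Child process: handle tasks
            close(fds[1]);
            while (true)
            {
                // 1. Get a command code; if none has been sent, the child blocks in read
                int commandCode = recvTask(fds[0]);
                // 2. Run the task
                if (commandCode >= 0 && commandCode < (int)funcMap.size())
                    funcMap[commandCode]();
                else if(commandCode == -1) break;
            }
            exit(0);
        }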
        close(fds[0]);
        subEp sub(id, fds[1]);
        subs->push_back(sub);
        deleteFd.push_back(fds[1]);
    }
}

void loadBlanceContrl(const std::vector<subEp> &subs, const std::vector<func_t> &funcMap, int count)
{
    int processnum = subs.size();
    int tasknum = funcMap.size();
    bool forever = (count == 0 ? true : false);

    while (true)
    {
        // 1. Pick a child process: a random index into std::vector<subEp>
        int subIdx = rand() % processnum;
        // 2. Pick a task: an index into std::vector<func_t>
        int taskIdx = rand() % tasknum;
        // 3. Send the task to the chosen process
        sendTask(subs[subIdx], taskIdx);
        sleep(1);
        if(!forever)
        {
            count--;
            if(count == 0) break;   
        }
    }
    // Closing the write ends makes each child's read return 0, so the children quit
    for(int i = 0; i < processnum; i++) close(subs[i].writeFd_); // waitpid();
}

    
void waitProcess(std::vector<subEp> processes)
{
    int processnum = processes.size();
    for(int i = 0; i < processnum; i++)
    {
        waitpid(processes[i].subId_, nullptr, 0);
        std::cout << "wait sub process success ...: " << processes[i].subId_ << std::endl;
    }
}

int main()
{
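    MakeSeed();
    // 1. Create the child processes and the channels to them (there is a bug here, but it does not affect what we write later)
    // 1.1 Load the task table
    std::vector<func_t> funcMap;
    loadTaskFunc(&funcMap);
    // 1.2 Create the child processes and keep track of the parent-to-child channels
    std::vector<subEp> subs;
    createSubProcess(&subs, funcMap);

    // 2. From here on only the parent runs: it controls the children and
    //    sends command codes to them in a load-balanced way
    int taskCnt = 3; // 0 means run forever
    loadBlanceContrl(subs, funcMap, taskCnt);

    // 3. Reap the child processes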
    waitProcess(subs);

    return 0;
}
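
The heart of the pool is a small protocol: the parent writes a fixed-size int command code, the child reads exactly sizeof(int) bytes per command, and closing the write end tells the child to quit. Below is a stripped-down sketch of that protocol with one child. The name runOneCommand and the "exit with code + 10" convention are inventions for this illustration only, so the result can be observed from the parent.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: sends one command code to a fresh child.
// The child exits with (last code received + 10) once it sees EOF.
// Returns the child's exit status, or -1 on failure.
int runOneCommand(int code)
{
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t id = fork();
    if (id < 0) return -1;

    if (id == 0)
    {
        close(fds[1]);
        int last = -1;
        int received = 0;
        // Each command is exactly sizeof(int) bytes.
        // read returns 0 once the parent closes its write end.
        while (read(fds[0], &received, sizeof received) == (ssize_t)sizeof received)
            last = received;
        close(fds[0]);
        _exit(last + 10);
    }

    close(fds[0]);
    ssize_t n = write(fds[1], &code, sizeof code);
    close(fds[1]); // "quit" message: the child's next read returns 0

    int status = 0;
    if (waitpid(id, &status, 0) != id) return -1;
    if (n != (ssize_t)sizeof code) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The full pool repeats this for several children, which is exactly why each new child must close the write ends it inherited from earlier pipes.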

The process pool code is somewhat difficult and is best understood by implementing it yourself, so we won't go into more detail here. Next come the pipes that do not depend on a blood relationship between processes: named pipes.

Named pipes:

Named pipes let two unrelated processes (with no blood relationship) communicate. The pipe file is created by calling the mkfifo interface, and we call this kind of pipe a named pipe. A named pipe is a special type of file.

The interface for creating a named pipe:

int mkfifo(const char *filename,mode_t mode);
mode sets the permissions of the pipe file (the process's umask is applied to it). In the code below it is 0600, so only the owning user can read and write this pipe file.
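
As a quick illustration of mkfifo and its mode argument, here is a hedged sketch. The helper name makePrivateFifo is hypothetical. It clears the umask temporarily so the requested bits are applied unchanged, then uses stat to confirm that the new file is a FIFO with permission bits 0600.

```cpp
#include <cassert>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: creates a FIFO at path with mode 0600 and reports
// whether the result is a FIFO with exactly those permission bits.
// Returns false if mkfifo fails, for example because the path already exists.
bool makePrivateFifo(const std::string &path)
{
    mode_t old = umask(0); // make sure the umask does not strip any requested bits
    int n = mkfifo(path.c_str(), 0600);
    umask(old);            // restore the caller's umask
    if (n != 0) return false;

    struct stat st;
    if (stat(path.c_str(), &st) != 0) return false;
    return S_ISFIFO(st.st_mode) && (st.st_mode & 0777) == 0600;
}
```

A second call with the same path returns false, because mkfifo refuses to overwrite an existing file. The FIFO stays in the file system until it is removed with unlink.
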
With named pipes we can implement communication between a client and a server:
comm.hpp:
#pragma once

#include <iostream>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cerrno>
#include <cassert>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define NAMED_PIPE "/tmp/mypipe.106"

bool createFifo(const std::string &path)
{
    umask(0);
    int n = mkfifo(path.c_str(), 0600);
    if (n == 0)
        return true;
    else
    {
        std::cout << "errno: " << errno << " err string: " << strerror(errno) << std::endl;
        return false;
    }
}

void removeFifo(const std::string &path)
{
    int n = unlink(path.c_str());
    assert(n == 0); // active in debug builds only; compiled out in release builds (NDEBUG)
    (void)n;
}

client.cc:

#include "comm.hpp"

// Exercise: can you rewrite the earlier program to use a named pipe?
int main()
{
    std::cout << "client begin" << std::endl;
    int wfd = open(NAMED_PIPE, O_WRONLY); // blocks until the server opens the FIFO for reading
    std::cout << "client end" << std::endl;
    if(wfd < 0) exit(1); 

    //write
    char buffer[1024];
    while(true)
    {
        std::cout << "Please Say# ";
        if(fgets(buffer, sizeof(buffer), stdin) == nullptr) break; // EOF or input error
        size_t len = strlen(buffer);
        if(len > 0 && buffer[len-1] == '\n') buffer[--len] = 0; // strip the trailing newline
        ssize_t n = write(wfd, buffer, len);
        assert(n == (ssize_t)len);
        (void)n;
    }

    close(wfd);
    return 0;
}

server.cc:

#include "comm.hpp"

int main()
{
    bool r = createFifo(NAMED_PIPE);
    assert(r);
    (void)r;

    std::cout << "server begin" << std::endl;
    int rfd = open(NAMED_PIPE, O_RDONLY);
    std::cout << "server end" << std::endl;
    if(rfd < 0) exit(1);

    //read
    char buffer[1024];
    while(true)
    {
        ssize_t s = read(rfd, buffer, sizeof(buffer)-1);
        if(s > 0)
        {
            buffer[s] = 0;
            std::cout << "client->server# " << buffer << std::endl;
        }
        else if(s == 0)
        {
            std::cout << "client quit, me too!" << std::endl;
            break;
        }
        else
        {
            std::cout << "err string: " << strerror(errno) << std::endl;
            break;
        }
    }

    close(rfd);

    // sleep(10);
    removeFifo(NAMED_PIPE);
    return 0;
}

The difference between anonymous pipes and named pipes:

1. An anonymous pipe is created and opened by the pipe function. A named pipe is created by the mkfifo function and opened with open.
2. The only difference between a FIFO (named pipe) and a pipe (anonymous pipe) is how they are created and opened; once that is done, they have the same semantics.
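
To illustrate that the two kinds of pipe behave the same once opened, the sketch below sends a message through a FIFO using the same read-until-EOF loop as with an anonymous pipe. The function name fifoRoundTrip is hypothetical, and fork is used only for convenience: the two processes find the pipe through its path, not through inherited descriptors.

```cpp
#include <cassert>
#include <csignal>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: creates a FIFO at path, lets a child write msg into it,
// reads the message back in the parent, and removes the FIFO again.
std::string fifoRoundTrip(const std::string &path, const std::string &msg)
{
    if (mkfifo(path.c_str(), 0600) != 0) return "";

    pid_t id = fork();
    if (id < 0)
    {
        unlink(path.c_str());
        return "";
    }

    if (id == 0)
    {
        int wfd = open(path.c_str(), O_WRONLY); // blocks until a reader opens the FIFO
        if (wfd < 0) _exit(1);
        ssize_t n = write(wfd, msg.data(), msg.size());
        (void)n;
        close(wfd);
        _exit(0);
    }

    std::string received;
    int rfd = open(path.c_str(), O_RDONLY); // blocks until the writer opens the FIFO
    if (rfd >= 0)
    {
        char buffer[256];
        ssize_t s;
        while ((s = read(rfd, buffer, sizeof buffer)) > 0)
            received.append(buffer, s);
        close(rfd);
    }
    else
    {
        kill(id, SIGKILL); // the child would otherwise wait for a reader forever
    }
    waitpid(id, nullptr, 0);
    unlink(path.c_str());
    return received;
}
```

After open returns, nothing in the read and write code distinguishes this from the anonymous pipe case.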

Opening rules for named pipes

If the current open operation opens the FIFO for reading:
O_NONBLOCK disabled: block until some process opens the FIFO for writing.
O_NONBLOCK enabled: return success immediately.
If the current open operation opens the FIFO for writing:
O_NONBLOCK disabled: block until some process opens the FIFO for reading.
O_NONBLOCK enabled: fail immediately with the error code ENXIO.
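
The two non-blocking rules are easy to observe from a single process, since neither call waits for a peer. This is a minimal sketch with hypothetical helper names; each helper returns 0 if open succeeded and the errno value otherwise.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: non-blocking open for reading.
// Expected to succeed even though no writer has opened the FIFO.
int nonBlockingReadOpen(const std::string &path)
{
    int fd = open(path.c_str(), O_RDONLY | O_NONBLOCK);
    if (fd < 0) return errno;
    close(fd);
    return 0;
}

// Hypothetical helper: non-blocking open for writing.
// Expected to fail with ENXIO while no process has the FIFO open for reading.
int nonBlockingWriteOpen(const std::string &path)
{
    int fd = open(path.c_str(), O_WRONLY | O_NONBLOCK);
    if (fd < 0) return errno;
    close(fd);
    return 0;
}
```

The blocking variants are what the client and server above rely on: each open waits until the other side has opened the FIFO too.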

That concludes pipe communication as a form of inter-process communication. I hope everyone will support us so that we can make progress together!

Origin blog.csdn.net/m0_69005269/article/details/130446585