[Interprocess Communication: Pipes]

Table of contents

1 Introduction to interprocess communication

1.1 Purpose of interprocess communication

1.2 Development of interprocess communication

1.3 Classification of interprocess communication

2 Pipes

2.1 What is a pipe

2.2 Anonymous pipes

2.2.1 Using anonymous pipes

2.2.2 Creating a process pool with anonymous pipes

2.3 Pipe read and write rules

2.4 Features of anonymous pipes

2.5 Named pipes

2.5.1 Concept

2.5.2 Usage


1 Introduction to interprocess communication

1.1 Purpose of interprocess communication

  • Data transfer: One process needs to send its data to another process.
  • Resource sharing: The same resource is shared between multiple processes.
  • Event notification: A process needs to send a message to another process or a group of processes to inform them that some event has occurred (for example, notifying the parent when a child process terminates).
  • Process control: Some processes want full control over the execution of another process (for example, a debugger). The controlling process wants to intercept all traps and exceptions of the other process and to learn of its state changes in time.

1.2 Development of interprocess communication 

  • Pipes
  • System V interprocess communication
  • POSIX interprocess communication

1.3 Classification of interprocess communication

Pipes

  • Anonymous pipes
  • Named pipes

System V IPC

  • System V message queues
  • System V shared memory
  • System V semaphores

POSIX IPC

  • Message queues
  • Shared memory
  • Semaphores
  • Mutexes
  • Condition variables
  • Read-write locks

2 Pipes

2.1 What is a pipe

Pipes are the oldest form of interprocess communication on Unix.
A flow of data from one process to another is called a "pipe".

For example, consider the familiar `|` in shell commands: each command we execute is essentially a process on Linux, and `|` connects those processes with pipes. Pipes are divided into anonymous pipes and named pipes; pipes like the one above, which have no name in the filesystem, are called anonymous pipes.

2.2 Anonymous pipes

2.2.1 Using anonymous pipes

When we studied file descriptors, we saw that a child process inherits the parent's file descriptor table, but does not copy the parent's open files: parent and child actually refer to the same underlying files. This gives us the precondition for interprocess communication: two processes seeing the same resource. We can then divide the work between parent and child as needed (for example, the parent writes to the file while the child reads from it).

#include <unistd.h>
// Function: create an anonymous pipe
// Prototype:
int pipe(int fd[2]);
// Parameter:
//   fd: file descriptor array; fd[0] is the read end, fd[1] is the write end
// Return value: 0 on success; on failure, -1 is returned and errno is set

We can use the pipe function to create an anonymous pipe. (Note that the pipe must be created before the parent process forks the child, so that the child can inherit the parent's file descriptors.)

Example code:

#include<iostream>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>
#include<cerrno>
#include<cstring>
#include<string>
using namespace std;

int main()
{
    int pipefd[2] = {0};
    int n = pipe(pipefd);
    if (n < 0)
    {
        cout << "error" << ":" << strerror(errno) << endl;
        return 1;
    }

    pid_t id = fork();
    if (id == 0)
    {
        // child: the child reads, the parent writes
        close(pipefd[1]);
        char buffer[1024];
        while (true)
        {
            ssize_t n = read(pipefd[0], buffer, 9);
            if (n > 0)
            {
                buffer[n] = '\0';
                cout << "child :" << buffer << endl;
            }
            else if (n == 0)
            {
                // all write ends closed: end of file
                cout << "read file end" << endl;
                break;
            }
            else
            {
                cout << "read error" << endl;
                break;
            }
        }

        close(pipefd[0]);
        exit(0);
    }

    // parent: the child reads, the parent writes
    close(pipefd[0]);
    const char *str = "hello bit";
    for (int i = 0; i < 5; i++)
    {
        write(pipefd[1], str, strlen(str));
        sleep(1);
    }
    close(pipefd[1]); // let the child see EOF so it can exit

    int status = 0;
    waitpid(id, &status, 0);
    cout << "signal:" << (status & 0x7f) << endl;
    return 0;
}

In this way, we have written a basic example of communicating through an anonymous pipe.

Noteworthy details in the code are:

  1. By convention, array index 0 is the read end and array index 1 is the write end.
  2. Parent and child split the roles: one writes and the other reads. The writing process should close its unused read end, and likewise the reading process should close its unused write end.

2.2.2 Creating a process pool with anonymous pipes

Anonymous pipes let us do more elegant things: for example, build a process pool in which one parent process manages multiple child processes:

contralProcess.cc:

#include <iostream>
#include <string>
#include <vector>
#include <cassert>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/types.h>
#include "Task.hpp"
using namespace std;

const int gnum = 3;
Task t;

class EndPoint
{
private:
    static int number;
public:
    pid_t _child_id;
    int _write_fd;
    std::string processname;
public:
    EndPoint(int id, int fd) : _child_id(id), _write_fd(fd)
    {
        //process-0[pid:fd]
        char namebuffer[64];
        snprintf(namebuffer, sizeof(namebuffer), "process-%d[%d:%d]", number++, _child_id, _write_fd);
        processname = namebuffer;
    }
    std::string name() const
    {
        return processname;
    }
    ~EndPoint()
    {
    }
};

int EndPoint::number = 0;

// The routine each child process runs
void WaitCommand()
{
    while (true)
    {
        int command = 0;
        int n = read(0, &command, sizeof(int));
        if (n == sizeof(int))
        {
            t.Execute(command);
        }
        else if (n == 0)
        {
            std::cout << "parent closed my pipe, child exiting: " << getpid() << std::endl;
            break;
        }
        else
        {
            break;
        }
    }
}

void createProcesses(vector<EndPoint> *end_points)
{
    vector<int> fds;
    for (int i = 0; i < gnum; i++)
    {
        // 1.1 Create the pipe
        int pipefd[2] = {0};
        int n = pipe(pipefd);
        assert(n == 0);
        (void)n;

        // 1.2 Create the process
        pid_t id = fork();
        assert(id != -1);
        // This branch is the child
        if (id == 0)
        {
            // Close the write ends inherited from previously created pipes
            for (auto &fd : fds) close(fd);

            // 1.3 Close the fd we do not need
            close(pipefd[1]);
            // We want every child to read its "commands" from standard input
            // 1.3.1 Redirect stdin (this step is optional)
            dup2(pipefd[0], 0);
            // 1.3.2 The child starts waiting for commands
            WaitCommand();
            close(pipefd[0]);
            exit(0);
        }

        // This branch is the parent
        // 1.3 Close the fd we do not need
        close(pipefd[0]);

        // 1.4 Build an object from the new child and its pipe's write end
        end_points->push_back(EndPoint(id, pipefd[1]));

        fds.push_back(pipefd[1]);
    }
}


int ShowBoard()
{
    std::cout << "##########################################" << std::endl;
    std::cout << "|   0. Run log task      1. Run DB task  |" << std::endl;
    std::cout << "|   2. Run request task  3. Quit         |" << std::endl;
    std::cout << "##########################################" << std::endl;
    std::cout << "Please choose# ";
    int command = 0;
    std::cin >> command;
    return command;
}

void ctrlProcess(const vector<EndPoint> &end_points)
{
    // 2.1 This could be automated; here we make it interactive
    int cnt = 0;
    while (true)
    {
        // 1. Choose a task
        int command = ShowBoard();
        if (command == 3) break;
        if (command < 0 || command > 2) continue;

        // 2. Choose a process (simple round-robin)
        int index = cnt++;
        cnt %= end_points.size();
        std::string name = end_points[index].name();
        std::cout << "chose process: " << name << " | task: " << command << std::endl;

        // 3. Dispatch the task
        write(end_points[index]._write_fd, &command, sizeof(command));

        sleep(1);
    }
}

void waitProcess(const vector<EndPoint> &end_points)
{
    // 1. Make every child exit --- the parent only needs to close all the write fds!
    for (size_t end = 0; end < end_points.size(); end++)
    {
        std::cout << "asking child to exit: " << end_points[end]._child_id << std::endl;
        close(end_points[end]._write_fd);

        // 2. Reap the child so it does not linger as a zombie
        waitpid(end_points[end]._child_id, nullptr, 0);
        std::cout << "reaped child: " << end_points[end]._child_id << std::endl;
    }
}



int main()
{
    vector<EndPoint> end_points;
    // 1. Build the control structure: parent writes, children read
    createProcesses(&end_points);

    // 2. What have we got now? end_points
    ctrlProcess(end_points);

    // 3. Handle all the exit and cleanup work
    waitProcess(end_points);
    return 0;
}

Task.hpp:

#pragma once

#include <iostream>
#include <vector>
#include <unistd.h>
#include <unordered_map>

// typedef std::function<void ()> func_t;

typedef void (*fun_t)(); // function pointer

void PrintLog()
{
    std::cout << "pid: " << getpid() << ", log task is being executed..." << std::endl;
}

void InsertMySQL()
{
    std::cout << "database task is being executed..." << std::endl;
}

void NetRequest()
{
    std::cout << "network request task is being executed..." << std::endl;
}


// Convention: every command is exactly 4 bytes
#define COMMAND_LOG 0
#define COMMAND_MYSQL 1
#define COMMAND_REQEUST 2

class Task
{
public:
    Task()
    {
        funcs.push_back(PrintLog);
        funcs.push_back(InsertMySQL);
        funcs.push_back(NetRequest);
    }
    void Execute(int command)
    {
        if(command >= 0 && command < funcs.size()) funcs[command]();
    }
    ~Task()
    {}
public:
    std::vector<fun_t> funcs;
};

You may have noticed something in the code: when we create a child process, the child first walks through a vector of file descriptors and closes them. What is stored in that vector?

Let's think it through. After the first fork, the parent holds the write end of the first pipe (say fd 4) and the first child holds the read end (say fd 3): the parent writes into the first pipe through pipefd[1] and the child reads through pipefd[0]. But when we fork the second child, it inherits the parent's file descriptor table, which means the second child also inherits the still-open write end (fd 4) of the first pipe. Why is that harmful? Suppose we close the write end of the first pipe and then try to reap the first child: we block forever. The reason is that the second child still holds an inherited write end pointing at the first pipe, so the first pipe's write side is not fully closed, the first child never sees end-of-file and never exits, and the parent blocks in waitpid. Every later pipe has the same problem, analyzed the same way.

There are two workarounds: close all the pipes' write ends first, and only then reap the children one by one (once every write end is closed, all the children see EOF and exit naturally); or close and reap starting from the last pipe, since the last pipe's write end is held only by the parent, so the last child exits normally, and we can work backwards.

But those only treat the symptom, not the root cause: at fork time the child has already inherited the write ends of the earlier pipes. Can the child instead close those inherited descriptors the moment it is created? Yes, and this is exactly what the vector handles: every time the parent creates a pipe, it records the write end in the vector, and every newly forked child immediately closes all the previously recorded write ends it inherited. (The fd relationships here are a bit intricate; it is worth drawing them out for yourself.)

So, treat pipes the way you treat files! Using a pipe is consistent with using a file, which fits Linux's "everything is a file" philosophy.

2.3 Pipe read and write rules

When there is no data to read:

  • O_NONBLOCK disabled: the read call blocks, that is, the process suspends until data arrives.
  • O_NONBLOCK enabled: the read call returns -1 with errno set to EAGAIN.

When the pipe is full:

  • O_NONBLOCK disabled: the write call blocks until some process reads data out of the pipe.
  • O_NONBLOCK enabled: the write call returns -1 with errno set to EAGAIN.

Other rules:

  • If all file descriptors referring to the write end of the pipe are closed, read returns 0.
  • If all file descriptors referring to the read end of the pipe are closed, a write raises the signal SIGPIPE, which by default terminates the writing process.
  • When the amount of data to be written is no greater than PIPE_BUF, Linux guarantees the write is atomic.
  • When the amount of data to be written is greater than PIPE_BUF, Linux no longer guarantees atomicity.

The earlier rules are easy to understand, so let's focus on what atomicity means: if we write a string such as "hello world" into the pipe, either the whole string is written or none of it is. The write has only two possible outcomes, all data written or no data written; there is never a half-written message.

2.4 Features of anonymous pipes

  • They can only be used between processes with a common ancestor (related processes). Typically, one process creates the pipe and then calls fork; afterwards the pipe can be used between parent and child.
  • Pipes provide a byte-stream service.
  • Generally speaking, a pipe is released when the processes using it exit, so the lifetime of the pipe follows the process.
  • In general, the kernel synchronizes and mutually excludes operations on the pipe.
  • A pipe is half-duplex: data flows in only one direction. When two parties need to talk to each other, two pipes must be created.

 

2.5 Named Pipes

2.5.1 Concept

A limitation of anonymous pipes is that they can only be used between processes that share a common ancestor (related processes).
If we want to exchange data between unrelated processes, we can use FIFO files, often called named pipes. A named pipe is a special type of file.

2.5.2 Usage

We know that anonymous pipes let related processes establish communication, while named pipes let processes with no relationship communicate. So how can unrelated processes come to see the same resource?

Let's first look at the mkfifo command: we created a pipe file named fifo with the mkfifo command, and when we redirected a string into this file, the cursor got stuck and did not move; the output only appears once some process reads from the file, and both ends terminate when either side is terminated:

This is a named pipe created on the command line. A named pipe is a memory-level file and is not flushed to disk: the generated named-pipe file is just an identifier for a kernel buffer, used to let multiple processes find the same buffer. In essence, both anonymous pipes and named pipes are buffers in the kernel. This answers how unrelated processes can see the same resource: the file path plus file name uniquely identifies the FIFO, so every process that opens that path reaches the same pipe.

Instead of the command-line approach, we can also use the system call:

int mkfifo(const char *filename, mode_t mode);

 Through this we can implement a simple server-client communication program:

server.cc:

#include<iostream>
#include<cstring>
#include<string>
#include<cerrno>
#include<sys/types.h>
#include<sys/stat.h>
#include<unistd.h>
#include<fcntl.h>
using namespace std;


int main()
{
    const string fileName("myfile");
    umask(0);
    int n=mkfifo(fileName.c_str(),0666);
    if(n<0)
    {
        cerr<<"errno:"<<errno<<", "<<strerror(errno)<<endl;
        return 1;
    }

    cout<<"server create fifo success"<<endl;
    int rop=open(fileName.c_str(), O_RDONLY);
    if(rop<0)
    {
        cerr<<"errno:"<<errno<<", "<<strerror(errno)<<endl;
        return 1;
    }
    cout<<"server open fifo success,begin ipc"<<endl;
    char buffer[1024];
    while(true)
    {
        buffer[0]=0;
        int n=read(rop,buffer,sizeof(buffer)-1);
        if(n>0)
        {
            buffer[n]=0;
            cout<<buffer<<endl;
        }
        else if(n==0)
        {
            cout<<"client exit,server also exit"<<endl;
            break;
        }
        else
        {
            cerr<<"errno:"<<errno<<", "<<strerror(errno)<<endl;
            return 1;
        }
    }
    close(rop);
    unlink(fileName.c_str());
    return 0;
}

client.cc:

#include<iostream>
#include<unistd.h>
#include<sys/types.h>
#include<sys/stat.h>
#include<fcntl.h>
#include<string>
#include<cerrno>
#include<cstring>
using namespace std;
int main()
{
    const string fileName("myfile");
    int wop=open(fileName.c_str(),O_WRONLY);
    if(wop<0)
    {
        cerr<<"errno:"<<errno<<", "<<strerror(errno)<<endl;
        return 1;
    }
    string myinfo;
    while(true)
    {
        cout<<"Please enter your message"<<endl;
        getline(cin,myinfo);
        if(!cin) break; // stop on end of input instead of looping forever
        write(wop,myinfo.c_str(),myinfo.size());
    }
    close(wop);
    return 0;
}

Makefile:

.PHONY:all
all:client server

client:client.cc
	g++ -o $@ $^ -std=c++11
server:server.cc
	g++ -o $@ $^ -std=c++11

.PHONY:clean
clean:
	rm -rf client server

Effect: the server blocks in open until the client opens the write end, then prints every message the client sends; when the client exits, the server reads 0, prints its exit message, and quits.

Since the server is the one that created the named pipe, it is best to have the server remove (unlink) the pipe file when it exits.

By the way, a question: when multiple processes are communicating through a named pipe, does deleting the pipe file stop the communication?

Obviously not. The pipe file only serves as an identifier; processes that have already opened the pipe can still communicate normally.


Origin blog.csdn.net/m0_68872612/article/details/130131513