[Linux] Inter-process communication - pipes

Table of contents

Preface

What is inter-process communication

Why inter-process communication is needed

Understanding the nature of inter-process communication

Ways of inter-process communication

Pipes

System V IPC

POSIX IPC

Pipes

What is a pipe

Anonymous pipes

What is an anonymous pipe

How anonymous pipe communication works

Using pipe()

Characteristics of anonymous pipe communication

Extension code

Named pipes

What is a named pipe

How named pipe communication works

Using mkfifo

Simulating named pipe communication in code


Preface

This chapter introduces inter-process communication for the first time, so it starts with the related concepts and the overall framework.

The article first covers the basic concepts of inter-process communication, and then focuses on the first communication mechanism: the pipe.

What is inter-process communication

Inter-Process Communication (IPC) refers to the mechanisms an operating system provides for processes to exchange data and coordinate with one another. A process is an independently running instance of a program, so IPC is what allows otherwise independent processes to cooperate, share resources and exchange data.

Why inter-process communication is needed

As discussed earlier, processes are independent of each other. Doesn't communication break that independence?

It does. The purpose of communication is data sharing, synchronization and cooperation between processes, so processes that communicate are no longer completely independent; they depend on and relate to one another to some degree.

That loss of independence is deliberate and necessary. In real systems, processes often have to work together, share resources and exchange data to complete complex tasks. IPC provides the mechanism for that collaboration, along with the synchronization and protection needed to keep the data correct and consistent.

So this is a trade-off: processes remain independent by default, and communicate only where cooperation is required.


In summary, inter-process communication mainly serves the following purposes:

  • Data transfer: one process needs to send its data to another process.
  • Resource sharing: multiple processes share the same resources (both local and remote).
  • Process control: one process wants full control over the execution of another (a debugger, for example). The controlling process wants to intercept all traps and exceptions of the other process and to learn of its state changes promptly.
  • Event notification: a process needs to tell another process, or a group of processes, that some event has occurred (for example, notifying the parent when a process terminates).

Understanding the nature of inter-process communication

1. Processes are independent. That independence is maintained through each process's virtual address space and page-table mapping, so communication comes at a relatively high cost.

2. To communicate, different processes must first be able to see the same piece of "memory" (organized in some specific structure). This "memory" cannot belong to any one of the processes; the emphasis is on it being shared.

Inter-process communication is not an end, but a means!
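To make point 1 concrete, here is a minimal sketch showing that a child's write to an ordinary variable never reaches the parent. The names g_value and valueSeenByParentAfterChildWrite are made up for this illustration and do not appear elsewhere in the article.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical global used only for this illustration.
int g_value = 100;

// Forks a child that overwrites g_value in its own address space, then
// returns the value the parent still sees after the child has exited.
int valueSeenByParentAfterChildWrite()
{
    g_value = 100;
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        g_value = 200; // changes only the child's private copy of the page
        _exit(0);
    }
    waitpid(id, nullptr, 0); // the child's write has definitely happened by now
    return g_value;          // still the parent's own copy
}
```

Calling valueSeenByParentAfterChildWrite() from main returns 100, not 200. This is why the processes need a resource that belongs to neither of them before they can exchange anything.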


Ways of inter-process communication

There are generally three families of mechanisms:

  • Pipes

    • Anonymous pipes
    • Named pipes
  • System V IPC

    • System V message queues
    • System V shared memory
    • System V semaphores

System V IPC can only be used for communication on a single machine (local communication).

  • POSIX IPC

    • Message queues
    • Shared memory
    • Semaphores
    • Mutexes
    • Condition variables
    • Read-write locks

POSIX IPC is likewise meant for processes on the same machine; communication across a network is normally done with sockets instead.

The following chapters explain these mechanisms one by one.

Today we start with the first of them: the pipe.

Pipes

What is a pipe

A pipe is an inter-process communication mechanism for transferring data between related processes. It is a kernel-managed buffer accessed through file descriptors: it connects the output (write end) of one process to the input (read end) of another, so the two processes can pass data through it.

In other words, a pipe carries data in one direction only. The natural gas and oil pipelines of everyday life work the same way: they are basically one-way.

Anonymous pipes

What is an anonymous pipe

An anonymous pipe is an IPC mechanism for transferring data between related processes, such as a parent and its child, or sibling processes created by the same parent.

An anonymous pipe is a one-way data channel. Usually one process acts as the write end and writes data into the pipe, while the other acts as the read end and reads data out of it.

Anonymous pipes are created with the pipe() system call, whose use is discussed later.

How anonymous pipe communication works

Pipe communication rests on how the kernel tracks the files a process has open.

When a program runs, it is loaded into memory and the kernel creates a task_struct for it. The task_struct points to a files_struct, which contains the fd_array[] array. Each element of that array points to a struct file describing one open file.

When we call fork(), the kernel creates a new task_struct for the child, with its content inherited from the parent. The fd_array[] entries are inherited too, so every file the parent has open is also open in the child, and both sets of descriptors point to the same files.

Suppose file descriptor 3 in the parent is open for reading and file descriptor 4 is open for writing. After fork(), the child's fd=3 is also for reading and its fd=4 is also for writing.

If we want the parent to write through fd=4 and the child to read through fd=3, the parent closes the read end fd=3 and the child closes the write end fd=4.

In this way, the two processes see the same resource (inherited through fork), and one-way communication runs through the file descriptors.

In summary, a pipe is set up roughly as follows:

        1. The parent process opens the same pipe file for reading and for writing

        2. fork() creates a child process

        3. Each side closes the file descriptor it does not need

[Figure: parent and child sharing the pipe after fork(); image not available]

Using pipe()

Now that the idea is clear, we can put it into code.

First, how does the parent open the pipe for reading and for writing separately?

This is what the pipe function does. It is declared in <unistd.h> with the prototype int pipe(int pipefd[2]), and returns 0 on success and -1 on failure.

The parameter pipefd is an output parameter. We define the array ourselves and pass it in; on return, pipefd[0] holds the read end and pipefd[1] holds the write end.

Second, we call fork to create a child process.

Finally, the child reads, so it closes the write end pipefd[1]; the parent writes, so it closes the read end pipefd[0].
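The three steps can be sketched as one small function. This is only an illustration under a few assumptions: the function name bytesReceivedByChild is made up, and the child reports how many bytes it read through its exit status, which limits the message to fewer than 256 bytes.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sends msg from the parent to a forked child through an anonymous pipe.
// The child counts the bytes it reads until end-of-file and reports the
// count as its exit status. Returns that count, or -1 on error.
int bytesReceivedByChild(const std::string &msg)
{
    assert(msg.size() < 256); // an exit status only carries 8 bits

    int pipefd[2];
    if (pipe(pipefd) == -1) // step 1: create the pipe
        return -1;

    pid_t id = fork(); // step 2: create the child
    if (id == -1)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (id == 0)
    {
        close(pipefd[1]); // step 3: the child only reads
        char buffer[64];
        int total = 0;
        ssize_t n;
        while ((n = read(pipefd[0], buffer, sizeof(buffer))) > 0)
            total += static_cast<int>(n);
        close(pipefd[0]);
        _exit(total);
    }

    close(pipefd[0]); // step 3: the parent only writes
    ssize_t written = write(pipefd[1], msg.data(), msg.size());
    close(pipefd[1]); // closing the write end gives the child end-of-file

    int status = 0;
    if (waitpid(id, &status, 0) == -1 || written != static_cast<ssize_t>(msg.size()))
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the parent closes its write end before waiting. If it waited first, the child would never see end-of-file and both processes would block forever.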

Characteristics of anonymous pipe communication

Here is a small demo:

#include <iostream>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cassert>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
using namespace std;

int main()
{
    // 1. Create the pipe
    int pipefd[2] = {0}; // pipefd[0]: read end, pipefd[1]: write end
    int n = pipe(pipefd);
    assert(n != -1);
    (void)n; // avoids an unused-variable warning when NDEBUG is defined
    // cout << pipefd[0] << "  " << pipefd[1] << endl;

    // 2. Create the child process
    pid_t id = fork();
    assert(id != -1);

    if (id == 0)
    {
        // Child process
        // 3. Build a one-way channel: the parent writes, the child reads
        // 3.1 Close the fd the child does not need
        close(pipefd[1]);
        char buffer[1024];
        while (true)
        {
            ssize_t s = read(pipefd[0], buffer, sizeof(buffer) - 1);
            if (s > 0)
            {
                buffer[s] = 0;
                cout << "child get a message [" << getpid() << "] Father# " << buffer << endl;
            }
            else
            {
                // s == 0: the write end was closed (end of file); s < 0: read error
                break;
            }
        }
        close(pipefd[0]);
        exit(0);
    }

    // Parent process
    // Build the one-way channel
    // 3.1 Close the fd the parent does not need
    close(pipefd[0]);
    string message = "I am the parent process, sending you a message";
    int count = 0;
    char send_buffer[1024];
    while (count < 10) // send a fixed number of messages so the cleanup below is reached
    {
        // 3.2 Build a string that changes on every round
        snprintf(send_buffer, sizeof(send_buffer), "%s[%d] : %d", message.c_str(), getpid(), count++);
        // 3.3 Write it into the pipe
        write(pipefd[1], send_buffer, strlen(send_buffer));
        // 3.4 sleep
        sleep(1);
    }

    // Close the write end first so the child reads end-of-file and exits
    close(pipefd[1]);
    pid_t ret = waitpid(id, nullptr, 0);
    assert(ret > 0);
    (void)ret;
    return 0;
}

Now we compile and run the program.

The child process reads everything the parent writes. A screenshot cannot show the timing, but the messages are printed once per second.

Notice that only the parent sleeps; we added no sleep to the child. In theory the child should print continuously, yet it prints only once per second.

The display is a file, and so is the pipe. When parent and child write to the display at the same time, neither waits for the other; each prints its own messages and the output can interleave. That is a lack of access control. A pipe, by contrast, provides access control: when the pipe is empty, read blocks, so the child has to wait for the parent to write before it can read.

Now let the child sleep for 10 seconds before it starts reading, while the parent keeps writing once per second. After 10 seconds the child starts to read, and a single read returns everything the parent wrote in those 10 writes. So the number of writes and the number of reads are not directly related. In other words, a pipe is byte-stream oriented; where one message ends and the next begins has to be defined by a custom protocol, which will be described later.
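The same effect can be reproduced without any sleeping. The sketch below (the function name readAfterTwoWrites is made up for this example) writes twice into a pipe and then reads once, all inside one process:

```cpp
#include <cassert>
#include <string>
#include <unistd.h>

// Writes two separate messages into a pipe, then issues a single read().
// Returns whatever that one read() delivered, or an empty string on error.
std::string readAfterTwoWrites()
{
    int pipefd[2];
    int rc = pipe(pipefd);
    assert(rc == 0);
    (void)rc;

    // Two write() calls. Both are far smaller than the pipe buffer, so
    // neither blocks even though nobody is reading yet.
    bool ok = write(pipefd[1], "abc", 3) == 3 && write(pipefd[1], "def", 3) == 3;
    close(pipefd[1]);

    char buffer[64];
    ssize_t n = ok ? read(pipefd[0], buffer, sizeof(buffer)) : -1;
    close(pipefd[0]);
    return n > 0 ? std::string(buffer, static_cast<size_t>(n)) : std::string();
}
```

The single read returns "abcdef": the pipe kept the bytes in order but did not remember that they arrived in two writes.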

Here is a summary of the characteristics of pipes:

  • Pipes are used for communication between related processes, most often parent and child
  • Pipes provide access control, which lets the processes coordinate
  • Pipes provide a byte-stream-oriented service; message boundaries have to be defined by a custom protocol
  • A pipe is file-based, and the lifetime of such a file follows the process, so the lifetime of a pipe also follows the processes that hold it open
  • A pipe is one-way communication, a special case of half-duplex communication
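As for the custom protocol mentioned in the third item, one common choice is to send the length of each message before the message itself. The sketch below is only one possible framing, not the protocol this series will use later; sendFrame, recvFrame and readExact are hypothetical names, and the 4-byte length is written in host byte order, which is fine as long as both ends run on the same machine.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unistd.h>

// Reads exactly len bytes, looping because read() may return fewer.
static bool readExact(int fd, void *buf, size_t len)
{
    char *p = static_cast<char *>(buf);
    while (len > 0)
    {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return false; // end-of-file or error before the frame was complete
        p += n;
        len -= static_cast<size_t>(n);
    }
    return true;
}

// Hypothetical framing: a 4-byte length, then the payload bytes.
bool sendFrame(int fd, const std::string &payload)
{
    assert(payload.size() <= 0xffffffffu);
    uint32_t len = static_cast<uint32_t>(payload.size());
    if (write(fd, &len, sizeof(len)) != static_cast<ssize_t>(sizeof(len)))
        return false;
    return write(fd, payload.data(), payload.size()) == static_cast<ssize_t>(payload.size());
}

// Receives one frame; returns false on end-of-file or on a truncated frame.
bool recvFrame(int fd, std::string &payload)
{
    uint32_t len = 0;
    if (!readExact(fd, &len, sizeof(len)))
        return false;
    payload.assign(len, '\0');
    return len == 0 || readExact(fd, &payload[0], len);
}
```

With this framing, two sendFrame calls followed by two recvFrame calls give back the two original messages, even though the pipe itself only sees one continuous run of bytes.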

The last item mentions half-duplex. Here is an explanation:

Half-duplex: data can travel in only one direction at any given moment, so the two parties cannot send at the same time. They must take turns on the shared channel. For example, when one person is speaking on a walkie-talkie, the other has to wait until the first has finished before replying. Walkie-talkies are the typical half-duplex device.

Full-duplex: data can travel in both directions at the same time. Both parties can send and receive simultaneously without taking turns, and the two directions do not interfere with each other. A telephone call is a typical full-duplex scenario: both parties can speak and listen at the same time.

By the way, here are the four situations a pipe can be in:

a. The writer is fast and the reader is slow: once the pipe is full, write blocks until the reader takes data out

b. The writer is slow and the reader is fast: when the pipe is empty, read blocks until data arrives

Both behaviors come from the pipe's access control.
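Blocking is hard to observe directly, but in non-blocking mode the kernel reports "this call would have blocked" as an error instead. The sketch below relies on that; the helper names are made up for this example, and the number of bytes a pipe accepts before it is full depends on the system.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Puts fd into non-blocking mode so that "would block" shows up as EAGAIN.
static bool setNonBlocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Case b: returns true if reading from an empty pipe would have blocked.
bool emptyPipeWouldBlockReader()
{
    int pipefd[2];
    int rc = pipe(pipefd);
    assert(rc == 0);
    (void)rc;

    bool wouldBlock = false;
    if (setNonBlocking(pipefd[0]))
    {
        char c;
        ssize_t n = read(pipefd[0], &c, 1);
        wouldBlock = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return wouldBlock;
}

// Case a: keeps writing until the pipe is full and write() would block.
// Returns the number of bytes the pipe accepted, or -1 on error.
long bytesUntilPipeIsFull()
{
    int pipefd[2];
    int rc = pipe(pipefd);
    assert(rc == 0);
    (void)rc;

    long total = -1;
    if (setNonBlocking(pipefd[1]))
    {
        total = 0;
        char chunk[1024] = {0};
        while (true)
        {
            ssize_t n = write(pipefd[1], chunk, sizeof(chunk));
            if (n == -1)
                break; // the pipe is full: a blocking write would wait here
            total += n;
        }
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}
```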

        c. The write end is closed while the reader keeps reading: once the remaining data is consumed, read returns 0, which marks end-of-file

        d. The read end is closed while the writer keeps writing: the OS terminates the writing process with the SIGPIPE signal
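Cases c and d can be checked in a single process. In the sketch below (function names are hypothetical), SIGPIPE is ignored for the duration of the test so that the failed write shows up as the error EPIPE instead of killing the program:

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Case c: returns what read() reports once the write end has been closed.
ssize_t readAfterWriterClosed()
{
    int pipefd[2];
    int rc = pipe(pipefd);
    assert(rc == 0);
    (void)rc;

    close(pipefd[1]); // no writer is left
    char c;
    ssize_t n = read(pipefd[0], &c, 1);
    close(pipefd[0]);
    return n;
}

// Case d: returns the errno of write() once the read end has been closed.
int errnoOfWriteAfterReaderClosed()
{
    int pipefd[2];
    int rc = pipe(pipefd);
    assert(rc == 0);
    (void)rc;

    // By default SIGPIPE would terminate this process, so ignore it here
    // and restore the previous handler afterwards.
    void (*previous)(int) = signal(SIGPIPE, SIG_IGN);
    close(pipefd[0]); // no reader is left
    ssize_t n = write(pipefd[1], "x", 1);
    int err = (n == -1) ? errno : 0;
    close(pipefd[1]);
    signal(SIGPIPE, previous);
    return err;
}
```

readAfterWriterClosed() returns 0 (end-of-file) and errnoOfWriteAfterReaderClosed() returns EPIPE. This is also why the demo programs close the write end when they are done: it is the only way the reader learns that no more data will come.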

Extension code

Using anonymous pipes, the parent creates several child processes and then dispatches random tasks to them.

The overall flow: the parent first calls load() to register the task handlers, then creates the child processes in a for loop. Each time a child is created, the parent records the child's pid together with the write end of its pipe (pipefd[1]), so that it can manage the children later.

Each child calls waitCommand, which blocks in read until the parent writes. The parent then starts dispatching tasks; the child that receives a command executes the corresponding task and goes back to waiting in its while loop.

There are two files. The first is ProcessPool.cc:

#include <iostream>
#include <vector>
#include <cstdint>
#include <cstdlib>
#include <ctime>
#include <cassert>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include "Task.hpp"
using namespace std;

#define PROCESS_NUM 5
// Blocks until the parent sends a command.
// Sets quit to true and returns -1 once the parent has closed the write end.
int waitCommand(int waitFd, bool &quit)
{
    uint32_t command = 0;
    ssize_t s = read(waitFd, &command, sizeof(command));
    if (s == 0)
    {
        quit = true;
        return -1;
    }
    assert(s == static_cast<ssize_t>(sizeof(command)));
    (void)s;

    return command;
}
void sendAndWakeup(pid_t who, int fd, uint32_t command)
{
    write(fd, &command, sizeof(command));
    cout << "main process: call process " << who << " execute " << desc[command] << " through " << fd << endl;
}

int main()
{
    load();
    // pid : pipefd
    vector<pair<pid_t, int>> slots;
    // create multiple child processes
    for (int i = 0; i < PROCESS_NUM; i++)
    {
        // Create the pipe
        int pipefd[2] = {0};
        int n = pipe(pipefd);
        assert(n == 0);

        pid_t id = fork();
        assert(id != -1);
        // The child reads
        if (id == 0)
        {
            // Close the write end
            close(pipefd[1]);
            // Also close the write ends inherited from earlier iterations,
            // so this child does not keep its siblings' pipes open
            for (auto &slot : slots)
                close(slot.second);
            // child process
            while (true)
            {
                // Wait for a command
                bool quit = false;
                int command = waitCommand(pipefd[0], quit); // blocks until the parent sends a task
                if (quit)
                    break;
                // Execute the corresponding command
                if (command >= 0 && command < handlerSize())
                {
                    callbacks[command]();
                }
                else
                {
                    cout << "invalid command" << endl;
                }
            }
            exit(0);
        }
        // father process
        close(pipefd[0]);
        slots.push_back(pair<pid_t, int>(id, pipefd[1]));
    }
    // The parent dispatches tasks
    srand((unsigned long)time(nullptr) ^ getpid() ^ 2311156L);
    // Dispatch a fixed number of tasks so that the cleanup code below is reached
    for (int round = 0; round < 10; round++)
    {
        // Pick a task
        int command = rand() % handlerSize();
        // Pick a process at random; random selection spreads the load across the children
        int choice = rand() % slots.size();
        // Hand the task to the chosen process
        sendAndWakeup(slots[choice].first, slots[choice].second, command);
        sleep(1);
        // Manual dispatch, as an alternative to the random dispatch above:
        // int select;
        // int command = 0;
        // cout << "######################################" << endl;
        // cout << "1.show functions        2.send command" << endl;
        // cout << "######################################" << endl;
        // cout << "Please Select > ";
        // cin >> select;
        // if (select == 1)
        //     showHandler();
        // else if (select == 2)
        // {
        //     cout << "Enter your Command > ";
        //     // Pick a task
        //     cin >> command;
        //     // Pick a process
        //     int choice = rand() % slots.size();
        //     // Hand the task to the chosen process
        //     sendAndWakeup(slots[choice].first, slots[choice].second, command);
        // }
        // else
        // {
        // }
    }

    // Close the write ends; each child then reads end-of-file and exits
    for (auto &slot : slots)
    {
        close(slot.second);
    }
    // Reap all child processes
    for (auto &slot : slots)
    {
        waitpid(slot.first, nullptr, 0);
    }

    return 0;
}

The second file is Task.hpp, which contains the task registration and task execution functions.

        

#pragma once
#include<iostream>
#include<vector>
#include<string>
#include<unordered_map>
#include<unistd.h>
#include<functional>
using namespace std;

typedef function<void()> func;

vector<func> callbacks;
unordered_map<int,string> desc;


void readMySQL()
{
    cout << "sub process[" << getpid() << "] running the database access task\n" << endl;
}
void executeURL()
{
    cout << "sub process[" << getpid() << "] running the URL parsing task\n" << endl;
}
void cal()
{
    cout << "sub process[" << getpid() << "] running the encryption task\n" << endl;
}
void save()
{
    cout << "sub process[" << getpid() << "] running the data persistence task\n" << endl;
}

// Registers every task: callbacks[i] is the handler, desc[i] its description
void load()
{
    desc.insert({callbacks.size(),"readMySQL: read the database"});
    callbacks.push_back(readMySQL);

    desc.insert({callbacks.size(),"executeURL: parse a URL"});
    callbacks.push_back(executeURL);

    desc.insert({callbacks.size(),"cal: perform encryption"});
    callbacks.push_back(cal);

    desc.insert({callbacks.size(),"save: save data to a file"});
    callbacks.push_back(save);
}
void showHandler()
{
    for(auto& iter: desc)
    {
        cout << iter.first << "\t" << iter.second << endl;
    }
}
int handlerSize()
{
    return callbacks.size();
}

 

With this in place, on every round the parent dispatches a random task to a randomly chosen child.

 

Named pipes

Unlike anonymous pipes, named pipes do not require the processes to be related. Any process that can open the pipe's path for reading or writing can communicate through it.

What is a named pipe

A named pipe is a mechanism for data transfer between independent, unrelated processes.

Named pipes communicate through a special file created in the file system. This special file is called a FIFO (first in, first out) or named pipe.

How named pipe communication works

As with anonymous pipes, for two parties to communicate they must first see the same resource. A named pipe is essentially the same as an anonymous pipe; only the way the processes find the resource differs.

With anonymous pipes, the processes see the same resource, the pipe file, through parent-child inheritance. That file exists purely in memory, so it has no name, hence "anonymous pipe".

A named pipe is a special file with an entry on disk. The file can be opened, but the data passing through it stays in memory and is never flushed to disk. The file has a path, and the path is unique, so both parties can reach the same pipe file through that path.

Here is the flow for named pipes:

  1. Create the named pipe: call mkfifo() to create a special file in the file system. This file is the named pipe. Creating it requires a path name and the permissions.

  2. Open the named pipe: a process calls open() on the path and gets a file descriptor. One process opens the path for reading and another opens the same path for writing.

  3. Process communication: once the named pipe is open, the processes communicate through their file descriptors. Each process chooses the read side or the write side.

  4. Data transfer: the reading process calls read() to take data out of the named pipe, and the writing process calls write() to put data into it.

  5. Close the named pipe: when a process has finished communicating, it calls close() on its file descriptor to release the resource. The pipe's file system entry remains until it is removed explicitly, for example with unlink().
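The five steps can be sketched in one function. To keep the example in a single program, it forks a child to play the writer, but the two sides find each other only through the path, exactly as two unrelated programs would. The function name fifoRoundTrip and the permission 0600 are choices made for this illustration.

```cpp
#include <cassert>
#include <csignal>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Creates a FIFO at path, lets a forked child write msg into it, reads the
// message back in the parent and removes the FIFO again.
// Returns the text that was read, or an empty string on error.
std::string fifoRoundTrip(const std::string &path, const std::string &msg)
{
    assert(!path.empty());
    if (mkfifo(path.c_str(), 0600) == -1) // step 1: create
        return "";

    pid_t id = fork();
    if (id == -1)
    {
        unlink(path.c_str());
        return "";
    }
    if (id == 0)
    {
        // Writer side. open() blocks until the reader has opened the path.
        int wfd = open(path.c_str(), O_WRONLY); // step 2: open
        if (wfd == -1)
            _exit(1);
        ssize_t n = write(wfd, msg.data(), msg.size()); // step 4: transfer
        close(wfd);                                     // step 5: close
        _exit(n == static_cast<ssize_t>(msg.size()) ? 0 : 1);
    }

    // Reader side
    std::string received;
    int rfd = open(path.c_str(), O_RDONLY); // step 2: open
    if (rfd == -1)
    {
        kill(id, SIGKILL); // the writer would otherwise wait for a reader forever
    }
    else
    {
        char buffer[128];
        ssize_t n;
        while ((n = read(rfd, buffer, sizeof(buffer))) > 0) // step 4: transfer
            received.append(buffer, static_cast<size_t>(n));
        close(rfd); // step 5: close
    }
    waitpid(id, nullptr, 0);
    unlink(path.c_str()); // closing is not enough; remove the file explicitly
    return received;
}
```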

Using mkfifo

As mentioned above, mkfifo creates the special file that lets independent processes communicate. Let's look at how to use it.

mkfifo [options] filename

It is simple to use. We generally do not need any options, so the command is just mkfifo followed by a file name.

We create a file named name_pipe in the current directory with mkfifo name_pipe.

Note that in the output of ls -l the permission string starts with p, which marks a pipe file.
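The same check can be made from code: stat() reports the file type, and the S_ISFIFO macro tests for a pipe file. The sketch below uses made-up function names and creates its own temporary FIFO so that it can be run anywhere.

```cpp
#include <cassert>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Returns true if path exists and is a FIFO (the 'p' shown by ls -l).
bool isFifo(const std::string &path)
{
    struct stat st;
    if (stat(path.c_str(), &st) == -1)
        return false;
    return S_ISFIFO(st.st_mode);
}

// Creates a FIFO at path, checks its type and removes it again.
bool freshFifoShowsAsPipe(const std::string &path)
{
    assert(!path.empty());
    if (mkfifo(path.c_str(), 0600) == -1)
        return false;
    bool result = isFifo(path);
    unlink(path.c_str());
    return result;
}
```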


Now we echo a message into the pipe file, for example with echo "hello" > name_pipe.

The command blocks. Opening a FIFO for writing waits until some process opens it for reading, and nobody has done so yet. So we open a new terminal window and read from name_pipe there, for example with cat < name_pipe.

The message is read out and the echo command returns. That is the basic use of mkfifo.
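The waiting happens in open(), and it can be made visible from code. If a FIFO is opened for writing with O_NONBLOCK while no reader exists, open() does not wait and fails with ENXIO instead. A minimal sketch, with a made-up function name:

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Creates a FIFO at path and tries to open it for writing in non-blocking
// mode while no reader exists. Returns the errno of that open(), or 0 if
// the open unexpectedly succeeded. The FIFO is removed afterwards.
int errnoOfWriterOpenWithoutReader(const std::string &path)
{
    int rc = mkfifo(path.c_str(), 0600);
    assert(rc == 0);
    (void)rc;

    // A blocking open(O_WRONLY) would wait here, just like echo did.
    int fd = open(path.c_str(), O_WRONLY | O_NONBLOCK);
    int err = (fd == -1) ? errno : 0;
    if (fd != -1)
        close(fd);
    unlink(path.c_str());
    return err;
}
```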

Simulating named pipe communication in code

The process is similar to the anonymous pipe; only the way both sides find the same resource differs.

The mkfifo above is a shell command. To create the pipe from code, use the mkfifo function, declared in <sys/stat.h> with the prototype int mkfifo(const char *pathname, mode_t mode).

The first parameter is the path of the pipe file to create; the second is its permissions.

mkfifo returns 0 on success and -1 on failure.
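One failure is worth handling separately: if the pipe file is left over from an earlier run, mkfifo fails and sets errno to EEXIST. The helper below is a hypothetical variation, not part of the server code in this article, that accepts an existing file as long as it really is a FIFO.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: makes sure a FIFO exists at path.
// Returns true if it was created now or was already there as a FIFO.
bool ensureFifo(const std::string &path, mode_t mode)
{
    assert(!path.empty());
    if (mkfifo(path.c_str(), mode) == 0)
        return true; // created
    if (errno != EEXIST)
        return false; // e.g. missing directory or no permission

    // Something already exists at path; accept it only if it is a FIFO.
    struct stat st;
    return stat(path.c_str(), &st) == 0 && S_ISFIFO(st.st_mode);
}
```

Whether reusing a leftover pipe is the right choice depends on the program; exiting with an error, as the server below does, is the stricter option.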

The overall process looks like this:

We split the program into a server and a client. The server is responsible for:

        1. Creating the pipe file and opening it

        2. Communicating with the client

        3. Finally closing and deleting the pipe file

The client:

        1. Opens the pipe file

        2. Communicates with the server

For convenience we also add a log, so that every step is visible.

So there are four files in total: comm.hpp, client.cc, server.cc and Log.hpp.

comm.hpp

#pragma once
#include<iostream>
#include<cstdio>
#include<cstdlib>
#include<cstring>
#include<unistd.h>
#include<vector>
#include<string>
#include<sys/types.h>
#include<sys/stat.h>
#include<fcntl.h>

using namespace std;

#define MODE 0666
#define SIZE 128
string ipcPath = "./fifo.ipc";

client.cc

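#include "comm.hpp"
int main()
{
    // 1. Open the pipe file
    int fd = open(ipcPath.c_str(),O_WRONLY);
    if(fd < 0)
    {
        perror("open");
        exit(1);
    }
    // 2. Communicate
    string buffer;
    while(true)
    {
        cout << "Please Enter Message Line :> ";
        if(!getline(cin,buffer))
            break; // end of input (e.g. Ctrl+D): stop sending
        if(write(fd,buffer.c_str(),buffer.size()) < 0)
        {
            perror("write");
            break;
        }
    }

    // 3. Close the file; the server then reads end-of-file
    close(fd);
    return 0;
}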

server.cc

#include"comm.hpp"
#include"Log.hpp"
int main()
{
    // 1. Create the pipe file
    if(mkfifo(ipcPath.c_str(),MODE) < 0)
    {
        perror("mkfifo");
        exit(1);
    }
    Log("pipe file created",Debug) << " step 1" << endl;
    // 2. Ordinary file operations; open blocks until a client opens the pipe for writing
    int fd = open(ipcPath.c_str(),O_RDONLY);
    if(fd < 0)
    {
        perror("open");
        exit(2);
    }
    Log("pipe file opened",Debug) << " step 2" << endl;

    // 3. The communication itself
    char buffer[SIZE];
    while(true)
    {
        memset(buffer,'\0',sizeof(buffer));
        ssize_t s = read(fd,buffer,sizeof(buffer)-1);
        if(s > 0)
        {
            cout << "client say> " << buffer << endl;
        }
        else if(s == 0)
        {
            // end of file: every writer has closed the pipe
            cerr << "read end of file, client quit, server quit too!" << endl;
            break;
        }
        else
        {
            // read error: report it and stop instead of looping forever
            perror("read");
            break;
        }
    }
    // 4. Close the file
    close(fd);
    Log("pipe file closed",Debug) << " step 3" << endl;
    unlink(ipcPath.c_str()); // delete the pipe file once communication is over
    Log("pipe file deleted",Debug) << " step 4" << endl;
    return 0;
}

Log.hpp

#pragma once
#include <iostream>
#include <ctime>
#include<string>
using namespace std;

#define Debug 0
#define Notice 1
#define Warning 2
#define Error 3

string msg[] = {
    "Debug ",
    "Notice",
    "Warning",
    "Error"
};

ostream& Log(string message,int level)
{
    cout << " | " << (unsigned)time(NULL) << " | " << msg[level] << " | " << message;

    return cout;
}

To compile, we can write a Makefile that builds both programs at once. Its content is as follows:

.PHONY:all
all:client server

client:client.cc
	g++ -o $@ $^ -std=c++11
server:server.cc
	g++ -o $@ $^ -std=c++11

.PHONY:clean
clean:
	rm -rf client server

Now we can simply run make, which produces two executables: client and server.

We open two terminal windows and run server in the first.

The server logs that the pipe file was created (step 1). Then we run client in the other window.

Once the client is running, the server logs that the pipe file was opened. Whatever we type into the client is now read by the server.

Then we press Ctrl+C to exit the client. The server reads end-of-file, breaks out of its loop and exits.

That completes the code flow for communicating through a named pipe.

 


Origin blog.csdn.net/weixin_47257473/article/details/132058820