[Linux] Implementing a stand-alone QQ: a look at inter-process communication with pipes

Once you've learned about pipes, you can build a simple QQ-style program~

Foreword

Why do we need inter-process communication? Because processes often need to do the following:

Data transfer: one process needs to send its data to another process.
Resource sharing: multiple processes share the same resource.
Event notification: a process needs to tell another process (or a group of processes) that some event has occurred (for example, a child notifying its parent when it terminates).
Process control: one process wants to fully control the execution of another process (for example, a debugger). The controlling process wants to intercept all traps and exceptions of the other process and learn about its state changes in time.

1. Pipeline

1. Anonymous pipes

First of all, as we said earlier, processes are independent. Yet to make two processes communicate, both must be able to see the same resource. How can two independent processes see the same resource? This facility is provided, directly or indirectly, by the operating system. Let's look at pipes first, and then draw a picture to explain how a pipe lets two processes see the same resource.

In a command line such as `who | wc -l`, the who command runs as one process and wc as another, and they communicate through a pipe. How can we prove that these really are two processes? Let's check:

First we start three processes with the sleep command, appending an & so they run in the background (otherwise a foreground process would block further input). We then see that all three sleep processes exist and that the parent of each is bash. The same reasoning applies to the who and wc commands above: they are separate processes, and the pipe between them is inter-process communication.

Let's explain the principle of the pipeline:

First there is a process, i.e. a task_struct. Each process has a file descriptor table, which contains an array: slots 0, 1, and 2 hold standard input, standard output, and standard error, and each slot stores the address of the corresponding struct file.

For inter-process communication the system provides a memory-resident file that is never flushed to disk. Since it has no name on disk, it is called an anonymous file. When we open such a file and then fork a child, the child gets its own task_struct but inherits a copy of the parent's file descriptor table (the file objects opened by the parent are not copied). The file addresses stored in the child's table are the same as in the parent's, so the child's table points to the same file as the parent's. Because the parent opened the file for reading and writing, and the child inherits those descriptors, the two processes now see the same resource.

However, this kind of pipe only supports one-way communication. For example, we can close the write end in the parent and the read end in the child, and let the child write while the parent reads. The reason a pipe can only communicate in one direction is that the file has a single buffer, with one writing position and one reading position. If you want two-way communication, open two pipes! The pipe described above is an anonymous pipe.

Let's write a pipeline program:

First of all, descriptors 0, 1, and 2 are the standard input, standard output, and standard error opened by default, so after a successful pipe() call, 3 is the read end and 4 is the write end. To implement a one-way channel, one side of the parent-child pair must read and the other must write.

First, in VSCode we create the .cc file and the makefile we will use later.

To create a pipe, we first need to know the pipe function:

Its prototype is `int pipe(int pipefd[2]);`, returning 0 on success and -1 on error. Its single parameter is a one-dimensional array with exactly two elements; it is an output parameter, and after the call the two elements are the read end and the write end of the pipe.

Now we complete the main body:

#include <iostream>
#include <string>
#include <cassert>
#include <cstring>    // strlen, strerror
#include <cstdio>     // snprintf
#include <cerrno>     // errno
#include <unistd.h>
#include <sys/wait.h>
#include <sys/types.h>
using namespace std;
int main()
{
    //1.1 create the pipe
    int pipefd[2] = {0};
    int n = pipe(pipefd);
    if (n<0)
    {
       std::cout<<"pipe error,"<<errno<<":"<<strerror(errno)<<std::endl;
       return 1;
    } 
    std::cout<<"pipefd[0]:"<<pipefd[0]<<std::endl;
    std::cout<<"pipefd[1]:"<<pipefd[1]<<std::endl;
    //1.2 create the child process
    pid_t id = fork();
    if (id==-1)
    {
       //failed to create the child process
       exit(-1);
    }
    if (id==0)
    {
       //child process
       close(pipefd[0]);
       const std::string namestr = "hello, I am the child process";
       int cnt = 1;
       char buffer[1024];
       while (true)
       {
           snprintf(buffer,sizeof buffer,"%s, counter: %d, my pid: %d\n",namestr.c_str(),cnt++,getpid());
           write(pipefd[1],buffer,strlen(buffer));
           sleep(1);
       }
       //after writing, close the write end
       close(pipefd[1]);
       exit(0);
    }
    //parent process
    close(pipefd[1]);
    //close the fd we don't need: the parent reads, the child writes.
    char buffer[1024];
    while (true)
    {
       int n = read(pipefd[0],buffer,sizeof(buffer)-1);
       if (n>0)
       {
           buffer[n] = '\0';
           std::cout<<"I am the parent; the child sent me a message: "<<buffer<<std::endl;
       }
    }
    return 0;

}

Our program is simple. First we create the pipe; if creation fails, we report the error and print the error code. Then we print the read end and write end (if all is well, the read end is 3 and the write end is 4). The second step is to fork a child process, again checking for failure. On success, the child first closes its read end, then fills a 1024-byte buffer with a message string and writes it to the pipe; the sleep makes the behaviour easier to observe. After writing, to keep the pipe safe, the child closes the write end it just used. The parent does the mirror image and reads. Let's run the code:

We do see two processes, and simple one-way communication works: the child passes its data to the parent. What characteristics of pipes can we learn from the example code above?

1. One-way communication

2. A pipe is essentially a file. Since the lifetime of an fd follows its process, the lifetime of the pipe also follows the process.

3. Pipe communication is usually used between processes with a "blood relationship", most often parent and child. pipe() opens the pipe without giving it a name, which is why it is called an anonymous pipe.

Next, let's modify the code. Just now the child sent slowly (sleeping 1 second per write); now let's make the parent read slowly instead and see what happens:

 Now let's run it:

From the results we find that the number of writes and the number of reads do not have to match one-to-one. This is another feature of the pipe: it delivers a byte stream, not discrete messages.

Next, let's modify the code again: restore the parent's reading speed and make the child write more slowly, and observe the effect:

 

From the output we find that when writing slows down, reading slows down too; in other words, the writer's pace constrains the reader.

Next, modify the code once more: the child writes a single character over and over, while the parent waits 10 seconds before reading:

 

 

From the figure above we can see that the child stops after writing about 65536 bytes (2 to the 16th power, i.e. 64 KB, the default pipe capacity on modern Linux), well before the 10 seconds are up. The reason is that the pipe has a bounded capacity: once the write end fills the pipe file, it cannot continue to write and must wait. Symmetrically, once the read end has consumed all the pipe's data, if the write end sends nothing more the reader can only wait. So another feature of the pipe emerges: it has a built-in coordination ability, letting reader and writer proceed in step (the pipe carries its own synchronization mechanism).

Next, let's see what happens if we close the write end, or close the read end.

The return value of read tells us how much was read: a value greater than 0 is the number of bytes actually read, a value of 0 means end of file has been reached, and a negative value means a read error. Let's run the code:

The result: after the child writes a character and exits, the parent reads that character, then read returns 0 (end of file) and the parent exits too. So what happens if we keep writing on the write end but close the read end? That behaviour is meaningless, and the operating system will not maintain meaningless, inefficient, or wasteful work: it kills the process that keeps writing by terminating it with a signal (signal No. 13, SIGPIPE). How can we prove it was killed by the OS? Let the parent wait for the child and inspect the result:

From the results we find this is indeed the case: the child process was terminated by the operating system with signal No. 13. That covers anonymous pipes. Next, we use a batch of pipes to implement code in which one process controls a group of others:

First we create a ctrlProcess.cc file and write a makefile. With the preparation done, we can code up the idea. The first step is to build the control structure: the parent writes and the children read. Since there are multiple pipes, we use a loop:

When checking whether the child was created successfully, we simply use an assert. Strictly we should use an if statement here, but since fork basically never fails and this is only a demo, an assertion will do. When the return value equals 0 it must be the child, otherwise the parent. For communication we let the parent write and the child read, so each side first closes the file descriptor it does not need.

Now that the preparatory work is done, how does the parent manage the pipes and processes it created? Remember the principle we stated earlier: first describe, then organize? That is the answer.

To describe the created pipes, we encapsulate them in a class. The member variables are the pid of the child process and the write-end fd the parent needs; we initialize these in the constructor.

class EndPoint
{
public:
     pid_t _child_id;
     int _write_fd;
     EndPoint(int id,int fd)
          :_child_id(id)
          ,_write_fd(fd)
     {
    
     }
     ~EndPoint()
     {

     }

};

Because there are multiple pipes, we put the objects we just defined into a vector for management. Isn't this exactly "first describe, then organize"? Remember to add the required header files. Next we define the vector right at the start of main.

Our goal is for the parent to write and the children to read, so each child must be described on the parent's side:

After pushing each child and its write end into the vector, the parent holds a batch of EndPoint objects, each containing the write-end file descriptor and the new child process. The next job is to let the child read instructions. We want every child to read its instructions from standard input; reading from standard input uses the input redirection we learned earlier. There are several ways to redirect input; here we mainly use the function approach:

Remember the dup2 function we learned before; it comes in handy here. Our goal is for the child to read from standard input, so the first argument (oldfd) is pipefd[0], because we want fd 0 (standard input) to become a copy of pipefd[0]; the second argument (newfd) is therefore 0. Then we let the child start waiting for commands. We write a function WaitCommand() to hold all that code, and put the process-creation code into a CreatProcess function; letting each function do one job keeps the code tidy:

Next, inside the child's function we first write a read call inside an infinite loop. Let's check that the code can create the children normally before writing the rest:

 

After running, we can see the process-creation code works. Next let's design the functions the children will execute. Before that, create a new header file task.hpp with some simple print tasks:

In the Task class we first define a vector whose element type is a function pointer; each pointer refers to one task function, so we can invoke different methods by command. We define three macro values to represent the three print tasks, and the command the parent sends selects which function to call (in the code below the command is simply used as an index into the vector). We include the header in the .cc file and define a global Task object:

Next, we continue to implement the subprocess task function:

In the child's function, the command is an int. We judge whether the read succeeded by read's return value: if it equals 4 bytes (sizeof(int)), a command was read, so we call the corresponding function; if it equals 0, the child should exit, so we break; anything else is a read failure and we also exit. Of course, one pass is not enough, so we wrap it all in a loop so the child performs tasks repeatedly:

So what does the parent still need to do? It needs to select a task, select a process, and then dispatch the task. Let's implement that:

The task could be chosen at random, but for testing we only use the LOG task; for choosing a process we simply call the random function. Dispatching means the parent writes the command corresponding to the chosen task. We also add getpid to the code promised earlier so the child's pid gets printed:

 The following is the code after running:

With that, the required functions are complete. Now we optimize them so that task selection becomes interactive. The first step is to let the user choose a task, so we first implement a selection panel:

 After finishing the selection panel, we can let the user choose the corresponding task:

Next, we make process selection sequential (round-robin), so a randomly chosen process is no longer executed each time:

Next, let's print which process is handling which task every time a task is dispatched:

We first add a static int member to serve as a process counter, initialized to 0 outside the class. A string member holds the process name formatted into the buffer, and a name() function returns it. Then we also print the name in the task-selection block:

When a child exits there should also be a prompt, so we add a printed message in the child:

When n equals 0, the parent has closed the write end, so we print a message to tell the user. Then we simplify the code again by moving the control logic just written in main into a ctrlProcess function:

Writing up to here, I hit an error at line 124: the name() function needs to be const-qualified so it can be called through a const object (because the parameter of our function is a const reference).

 

 Let's run the program below:

After the program runs, we find that entering 3 does not exit, and nothing is printed for other invalid commands, so we add if statements to handle both cases:

Besides that problem, we also need to handle child exit. Remember zombie processes: if a child is not waited on by its parent, it becomes a zombie. To prevent this, we write one more function that waits for the child processes:

Before reclaiming the children we must close all of the parent's write ends; then, after a 10-second wait, the parent reaps each child, completing a simple child-recycling function.

When we run it, we find that our requirements can be fulfilled, and the child process can be recycled normally after exiting.

You may be wondering: why can't we close each write end and reap its child in the same loop iteration, as in the figure below?

 Let's run the code like this first:

We find that, written this way, the program gets stuck during reclamation. Why? It relates to the children: as we have said all along, a child inherits the file descriptors opened by its parent (this is the very principle behind the pipe). The pitfall here is that the second and later children each inherit the write ends of all the pipes created before them; that is, later children hold more and more inherited write ends, as the figure below shows:

In other words, the reason the program got stuck is that we tried to reap a child before all the write ends reaching it were closed, which caused waitpid to block. When we instead close the parent's write ends in their own loop and reap the children only after all of them are closed, the problem does not occur.

Since the explanations above were all given as screenshots, here is the complete example code:

The first is in the ctrlProcess.cc file:

#include <iostream>
#include <string>
#include <cassert>
#include <cstdio>    // snprintf
#include <unistd.h>
#include <vector>
#include "task.hpp"
#include <sys/wait.h>
#include <sys/types.h>
using namespace std;
const int gnum = 3;
Task t;
class EndPoint
{
private:
     static int number;
public:
     pid_t _child_id;
     int _write_fd;
     std::string processname;
     EndPoint(int id,int fd)
          :_child_id(id)
          ,_write_fd(fd)
     {
         char namebuffer[64];
         snprintf(namebuffer,sizeof(namebuffer),"process-%d[%d:%d]",number++,_child_id,_write_fd);
         processname = namebuffer;
     }
     std::string &name()
     {
        return processname;
     }
     ~EndPoint()
     {

     }

};
int EndPoint::number = 0;
//the method the child processes execute
void WaitCommand()
{
    while (true)
    {
        int command = 0;
        int n = read(0,&command,sizeof(int));
        if (n==sizeof(int))
        {
            t.Execute(command);
        }
        else if(n==0)
        {
            std::cout<<"parent told me to quit, so I'm quitting: "<<getpid()<<std::endl;
            break;
        }
        else 
        {
            break;
        }
    }
   
}
void CreatProcess( vector<EndPoint>* end_points)
{
     //1. build the control structure: the parent writes, the children read
    for (int i = 0;i<gnum;i++)
    {
        //1.1 create the pipe
        int pipefd[2] = {0};
        int n = pipe(pipefd);
        assert(n==0);
        (void)n;
        //1.2 create the child process
        pid_t id = fork();
        assert(id!=-1);
        //a return value of 0 means we are in the child
        if (id==0)
        {
            //first close the fd the child doesn't need
            close(pipefd[1]);
            //we want every child to read its commands from standard input
            //1.3.1 input redirection
            dup2(pipefd[0],0);  //reading fd 0 now reads the pipe
            //1.3.2 let the child start waiting for commands
            WaitCommand();
            close(pipefd[0]);
            exit(0);
        }

        //must be the parent
        //1.3 close the fd the parent doesn't need
        close(pipefd[0]);

        //1.4 build an object from the new child and its pipe's write end
        end_points->push_back(EndPoint(id,pipefd[1]));

    }
}
int ShowBoard()
{
    std::cout<<"######################################"<<std::endl;
    std::cout<<"## 0. run log task   1. run MySQL task"<<std::endl;
    std::cout<<"## 2. run request task   3. quit      "<<std::endl;
    std::cout<<"######################################"<<std::endl;
    std::cout<<"Please select# ";
    int command = 0;
    std::cin>>command;
    return command;
}
void ctrlProcess(vector<EndPoint>& end_points) 
{
    //2.1 this could be automated; here we make it interactive
    int cnt = 0;
    int num = 0;
    while (true)
    {
        //1. select a task
        int command = ShowBoard();
        if (command==3)
        {
            break;
        }
        if (command<0|| command>2)
        {
            continue;
        }
        //2. select a process (round-robin)
        int index = cnt++;
        cnt%=end_points.size();
        std::cout<<"selected process: "<<end_points[index].name()<<" | handling task "<<command<<std::endl;
        //3. dispatch the task
        write(end_points[index]._write_fd,&command,sizeof(command));

        //sleep(1);
    }
}
void ExitProcess()
{
    exit(0);
}
void waitProcess(const vector<EndPoint>& end_points)
{
     //1. we need all the children to exit -- the parent just has to close every write end
     //for (const auto &ep:end_points)
     for (int end = end_points.size()-1;end>=0;end--)
     {
        std::cout<<"parent asks the child to exit"<<std::endl;
        //close the last write end first and work backwards: children inherit the
        //parent's file descriptors, so later children also hold the write ends of
        //earlier pipes; closing only from the front would leave some open and block
        close(end_points[end]._write_fd);
        std::cout<<"parent reaps the child"<<std::endl;
        waitpid(end_points[end]._child_id,nullptr,0);

     }
     //std::cout<<"parent asks all the children to exit"<<std::endl;
     sleep(10);
     //2. the parent must reap the children from their zombie state
     //for (const auto& ep:end_points)
     //{
     //   waitpid(ep._child_id,nullptr,0);
     //}
     //std::cout<<"parent reclaimed all the children"<<std::endl;
     //sleep(10);
}
int main()
{
    //1. build the control structure: the parent writes, the children read
    vector<EndPoint> end_points;
    CreatProcess(&end_points);
    //2. what did we get?
    ctrlProcess(end_points);
    //3. handle all the exit issues
    waitProcess(end_points);
    return 0;
}

The following is in the header file task.hpp:

#pragma once
#include <iostream>
#include <vector>
#include <unistd.h>
typedef void(*fun_t)();  //function pointer type


void PrintLog()
{
    std::cout<<"pid:"<<getpid()<<", log-printing task is being executed..."<<std::endl;
}
void InsertMySQL()
{
    std::cout<<"database task is being executed..."<<std::endl;
}
void NetRequest()
{
    std::cout<<"network request task is being executed..."<<std::endl;
}
#define COMMAND_LOG 0
#define COMMAND_MYSQL 1
#define COMMAND_REQUEST 2
class Task
{
public:
    Task()
    {
        funcs.push_back(PrintLog);
        funcs.push_back(InsertMySQL);
        funcs.push_back(NetRequest);
    }
    ~Task()
    {

    }
    void Execute(int command)
    {
        if (command >= 0 && command < (int)funcs.size())
        {
            funcs[command]();
        }
    }

public:
    std::vector<fun_t> funcs;
};

Summary

Pipe read and write rules:

When there is no data to read:
O_NONBLOCK disabled: the read call blocks, that is, the process suspends execution and waits until data arrives.
O_NONBLOCK enabled: the read call returns -1 and errno is set to EAGAIN.
When the pipe is full:
O_NONBLOCK disabled: the write call blocks until some process reads data out of the pipe.
O_NONBLOCK enabled: the write call returns -1 and errno is set to EAGAIN.
If all file descriptors referring to the pipe's write end are closed, read returns 0 (end of file).
If all file descriptors referring to the pipe's read end are closed, a write generates the signal SIGPIPE, which by default terminates the writing process.
When the amount of data written is no greater than PIPE_BUF, Linux guarantees that the write is atomic.
When the amount of data written is greater than PIPE_BUF, Linux no longer guarantees atomicity.

Pipe features:
A pipe can only be used between processes with a common ancestor (related processes); usually one process creates the pipe and then calls fork, after which the pipe can be used between parent and child.
Pipes provide a byte-stream service.
Generally, the pipe is released when the processes exit, so the pipe's lifetime follows the process.
Generally, the kernel synchronizes pipe operations and provides mutual exclusion between readers and writers.
A pipe is half-duplex: data flows in only one direction. When two parties need two-way communication, two pipes must be created.

Origin blog.csdn.net/Sxy_wspsby/article/details/130393865