Learning System Programming No.19 [Controlling Processes via Inter-process Communication]

introduction:

Beijing time: 2023/4/13, 8:00. Up early again, and without noticing it I have reached the 19th article of this Linux series, already overtaking the 18 articles of my C++ series. The difficulty keeps climbing, maybe because classes at school keep piling up, maybe because I am slacking, but that may just be life: it can't always be smooth. I firmly believe in the goal and will just keep pushing forward. Classes will end one day, debts will be repaid one day, university will end one day, so cherish the present and grasp the present, and go forward bravely! OK, enough talk; before class, let's write down the learning content. In this blog we will go deep into pipes and, as far as we can, touch on shared memory. The knowledge of pipes is broad and profound; we are only getting started!


In-depth pipeline knowledge

In the last blog, we got acquainted with one of the most classic scenarios of inter-process communication, the anonymous pipe, and wrote code to build that scenario. But what we learned was only the shallowest layer of pipe knowledge. In this blog, let us go deeper into pipes and thoroughly understand how processes communicate, and in which scenarios and environments.

In the last blog on anonymous pipes, we used an anonymous pipe to build an inter-process communication environment. The essence is to let blood-related processes share the same "resource": here, the anonymous pipe's memory-level file object. Through this file object, related processes can communicate. So the essence of inter-process communication is not how to talk, but how to build the communication channel in the first place.

Inter-process communication code implementation (simple)

(The original code appears here only as a screenshot.) The code implements communication between the parent process and the child process, letting the child pass data to the parent.

Phenomena and conclusions of pipeline communication

(Assuming throughout that the child process writes data and the parent process reads it)

  1. The write side writes at a steady pace, while the read side reads more slowly

The phenomenon is shown in the figure (screenshot omitted). Conclusion: in pipe communication, the number of writes and the number of reads are not strictly matched. Reading and writing are largely independent: the writer writes at its own pace, and the reader picks up from wherever it left off.

  2. The read side reads at a steady pace, while the write side writes more slowly

The phenomenon is shown in the figure (screenshot omitted). Conclusion: when writing slows down, reading slows down with it. Once read has drained all the data in the pipe, if the other side does not send more data, the reader can only wait.

  3. The write end keeps writing data (with no extra user-space buffering), while the read end waits

The phenomenon is shown in the figure (screenshot omitted). Conclusion: the pipe file has a finite size (it lives in memory), so once write fills the pipe, writing cannot continue; the file is simply full. Only after the reader consumes data (and at that point it may drain everything in the pipe in one go) can the writer continue.

  4. The write end closes after writing data once, while the read end keeps reading

The phenomenon is shown in the figure (screenshot omitted). Conclusion: on the first read, the reader gets the data the writer wrote. On the second read, because the write end is closed, the pipe is empty, so read returns 0, meaning 0 bytes were read: the end of the file has been reached.

  5. The write end keeps writing, while the read end is closed

The phenomenon is shown in the figures (screenshots omitted).

Conclusion: the operating system terminates the writing process directly, because it will not keep a meaningless, resource-wasting process alive. It terminates the process with a signal: signal 13, SIGPIPE.

One more note: pipe() gives us two file descriptors referring to the same pipe object, one opened for writing and one for reading, so each end is write-only or read-only respectively. Also, data in a pipe is consumed as it is read: once the reader has taken bytes out of the pipe, they are gone. This ties in well with the phenomena above!

Pipeline read and write rules

1. When there is no data to read — O_NONBLOCK disabled: the read call blocks, i.e. the process suspends until data arrives. O_NONBLOCK enabled: the read call returns -1 and errno is set to EAGAIN.
2. When the pipe is full — O_NONBLOCK disabled: the write call blocks until some process reads data. O_NONBLOCK enabled: the write call returns -1 and errno is set to EAGAIN.
3. If all file descriptors referring to the pipe's write end are closed, read returns 0.
4. If all file descriptors referring to the pipe's read end are closed, a write generates the signal SIGPIPE, which by default terminates the writing process.
5. When the amount of data to be written is no greater than PIPE_BUF, Linux guarantees the write is atomic.
6. When the amount of data to be written is greater than PIPE_BUF, Linux no longer guarantees atomicity of the write.

Interested students can follow this link to learn what an atomicity problem is: Detailed Explanation of Atomic Problems

Characteristics of pipeline communication (obtained from the above phenomenon)

  1. One-way communication.
  2. The lifetime of a file descriptor is tied to its process: when a process is destroyed its files are closed, and when it is created its files are opened. Since a pipe is essentially a file, the pipe's lifetime is likewise determined by the processes using it.
  3. Pipe communication is used between processes with a "blood relationship", most commonly a parent and its children: a child's PCB is inherited from the parent, including the pointer to the file descriptor table, so as long as processes are related, they share (copies of) the same file descriptor table.
  4. An anonymous pipe has no name we can refer to: it is essentially a memory-level file, created by the kernel and maintained by the operating system.
  5. In pipe communication, the number of writes and the number of reads are not strictly matched.
  6. Pipes give inter-process communication a certain coordination ability, letting reads and writes proceed in step: pipes come with their own synchronization mechanism.

If the pipe is full, you cannot continue writing; the purpose is to avoid data being overwritten. Interested students can learn about the concepts of mutual exclusion and synchronization via this link: Mutual exclusion and synchronization understanding

Note: a single pipe is half-duplex (data flows one way at a time); it is not full-duplex (simultaneous two-way reading and writing).

Process control through pipes

With the knowledge above, plus the pipe read/write rules and the characteristics of pipe communication, we now understand pipes well enough to control processes ourselves: we will use pipes to let one parent process control multiple child processes.

The schematic diagram (screenshot omitted) shows one parent process holding the write ends of several pipes, with each child process reading from its own pipe.

The specific code is as follows:

#include <iostream>
#include <string>
#include <vector> // used to manage all EndPoint objects as one array
#include <cassert>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>
#include <sys/types.h>

using namespace std;

const int pipename = 5; // number of child processes (and pipes) to create and control in the loop below

class EndPoint // purpose: manage pipes and processes by "describe first, then organize"
{
private:
    static int number; // a per-class counter; remember to initialize it outside the class

public:                 // member names carry a leading _ so they stay distinct in copy constructors
    pid_t _child_id;    // the child process this endpoint manages
    int _write_fd;      // the write end of this child's pipe: the parent only needs to know where to write
    string processname; // a human-readable name for the process
public:
    EndPoint(int id, int fd) : _child_id(id), _write_fd(fd) // bind the endpoint to one pipe and one child
    {
        // name format: process-0[pid:fd]
        char namebuffer[64];
        snprintf(namebuffer, sizeof(namebuffer), "process-%d[%d:%d]", number++, _child_id, _write_fd);
        processname = namebuffer;
    }
    const string &name() const // expose the name through an accessor (the member is public, so direct access also works)
    {
        return processname;
    }
    ~EndPoint()
    {
    }
};
int EndPoint::number = 0;

//---------------------------------------------------------------------------------------------------------
// Purpose: use function pointers so that a child can invoke whichever task an operation code selects

typedef void (*func_t)(); // define a function-pointer type; note the alias name sits inside the declarator
// These functions simulate different jobs that the parent hands off to its children

void PrintLog()
{
    cout << "pid:" << getpid() << ", log-printing task is being executed..." << endl;
}
void InsertMySQL()
{
    cout << "pid:" << getpid() << ", database task is being executed..." << endl;
}
void NetRequest()
{
    cout << "pid:" << getpid() << ", network-request task is being executed..." << endl;
}

// Convention: every command is 4 bytes (an enum would also work)
#define COMMAND_LOG 0 // these codes are simply the indices at which the functions are pushed into the vector below
#define COMMAND_MYSQL 1
#define COMMAND_REQUEST 2

class Task
{
public:
    Task() // load the task table, i.e. initialize the function-pointer vector
    {
        funcs.push_back(PrintLog);
        funcs.push_back(InsertMySQL);
        funcs.push_back(NetRequest);
    }
    void Execute(int command) // run the task whose operation code was handed over
    {
        if (command >= 0 && command < (int)funcs.size())
        {
            funcs[command](); // legal request: call the function stored at index `command`
        }
    }
    ~Task()
    {
    }

public:
    vector<func_t> funcs;
};

//----------------------------------------------------------------------------------------------------------

Task task; // a global task table, available (by copy, after fork) to every child

void WaitCommand() // lets a child loop forever, executing whatever command arrives
{
    while (true) // a child does not wait for just one command; it loops, waiting and executing repeatedly
    {
        int command = 0;
        ssize_t n = read(0, &command, sizeof(int)); // read exactly 4 bytes: one command code, i.e. one vector index
        if (n == (ssize_t)sizeof(int))              // a full command was read (reads can fail, so check instead of assert)
        {
            task.Execute(command);
        }
        else if (n == 0) // 0 means end of file: the other side (the write end) has closed the pipe
        {
            cout << "the parent asked me to quit, so I quit: " << getpid() << endl;
            break; // note: break applies to loops (and switch), not to a bare if
        }
        else
        {
            break;
        }
    }
}

void CreateProcess(vector<EndPoint> *end_points)
{
    for (int i = 0; i < pipename; ++i)
    {
        // 1.1 create a pipe
        int pipefd[2] = {0};  // a fresh pair of fds each iteration, since the previous pair was handed off or closed
        int n = pipe(pipefd); // question: which read/write fds match which process? The EndPoint struct answers this,
                              // managing pipes and processes together ("describe first, then organize")
        assert(n == 0);
        // 1.2 create a process
        pid_t id = fork();
        assert(id != -1);
        if (id == 0)
        {
            // definitely the child
            // 1.3 build one-way communication: close the fd the child does not need
            close(pipefd[1]);
            // because processes are independent (copy-on-write), the parent's vector has nothing to do with the child
            // 1.3.1 redirect standard input
            dup2(pipefd[0], 0); // redirect the pipe's read end onto stdin so the child simply reads fd 0
                                // (skipping the redirection and passing the fd along would also work)
            // 1.3.2 the child starts waiting for commands (i.e. reading from the pipe)
            WaitCommand(); // no argument needed: after the redirection it reads straight from stdin

            close(pipefd[0]);
            exit(0); // exit here so the child never falls through into the parent's code below
        }
        // definitely the parent
        close(pipefd[0]);

        // 1.4 bundle the new child and its pipe's write end into one object (think of it as one-to-one delivery):
        //     the key is knowing which fd to write to and which process will read it
        end_points->push_back(EndPoint(id, pipefd[1]));
        // after the loop, indices 0..4 of the vector each hold one (write fd, child pid) pair,
        // so writing to a particular child, or sending it data, is just a matter of indexing into the vector
    }
}

int ShowBoard()
{
    cout << "------------------------------------------------" << endl;
    cout << "------0. log task----------1. database task-----" << endl;
    cout << "------2. request task------3. quit program------" << endl;
    cout << "------------------------------------------------" << endl;
    cout << "please choose a task to execute: ";

    int command = 0;
    cin >> command;
    return command;
}

void CtrlProcess(const vector<EndPoint> &end_points) // a const reference is better than a raw pointer here
{
    int cnt = 0;
    // the command source could be automatic or interactive; here we make it interactive
    while (true) // drive the vector index in a loop so the children execute tasks in turn
    {
        // 1. choose a task
        // int command = COMMAND_LOG; // thanks to the #defines, a command is just an index into the task table
        int command = ShowBoard(); // interactive: pick the command from the menu
        if (command < 0 || command > 3)
        {
            cout << "invalid choice, please choose again:" << endl;
            continue;
        }
        if (command == 3)
        {
            break;
        }

        // 2. choose a process
        // int index = rand() % end_points.size(); // a random number modulo the process count would also work
        int index = cnt++;        // here we use round-robin instead, so the children take turns
        cnt %= end_points.size(); // the classic round-robin reset back to zero
        cout << "chose process: " << end_points[index].name() << " | handling task: " << command << endl;

        // 3. dispatch the task
        write(end_points[index]._write_fd, &command, sizeof(command)); // write the command into the chosen child's pipe

        sleep(3);
    }
}
void WaitProcess(const vector<EndPoint> &end_points) // make sure every child exits and is reaped (no zombies)
{
    for (int end = end_points.size() - 1; end >= 0; --end) // reap in reverse: each later child inherited the write
                                                           // ends of the earlier pipes, so closing front-to-back
                                                           // would leave copies open and block the earlier children
    {
        // 1. to make a child exit, the parent only has to close that child's write end (per the read/write rules)
        cout << "parent asks child to exit: " << end_points[end]._child_id << endl;
        close(end_points[end]._write_fd); // the pipe's write end is stored in the EndPoint object

        // 2. reap the child so it does not stay a zombie
        waitpid(end_points[end]._child_id, 0, 0);
        cout << "parent reaped child: " << end_points[end]._child_id << endl;
    }

    sleep(10); // no hurry to exit
}

int main()
{
    // 1. build the control structure (essentially: create 5 pipes and 5 processes)
    vector<EndPoint> end_points; // every write end the parent manages, i.e. which child each write goes to
    CreateProcess(&end_points);

    // 2. every pipe fd and child pid is now recorded in the vector, so the parent can control the children through it
    CtrlProcess(end_points); // interactive process control

    // 3. reaching here means we are done controlling processes and want to quit (clean up properly)
    WaitProcess(end_points);

    return 0;
}

Note that the above code touches a few less common knowledge points:

Detailed explanation of function pointers — interested students can review it.

Typedef-ing a function pointer:
C requires the form typedef void (*func_t)(); — the alias name sits inside the declarator, just as if you were declaring a pointer variable. The intuitive form typedef void(*)() func_t; does not compile. When typedef-ing a function pointer, you must follow the syntax the language specifies rather than your own intuition. Interested students can refer to the following link: Redefine function pointer


Summary: I haven't written such a long piece of code in a while. This process-control code is the most advanced I have come into contact with so far. At first I thought it would be hard to understand, but after putting in the time it turned out to be genuinely fun to write, with a real sense of accomplishment. Writing it left me feeling very satisfied. Sleep time, over and out!


Origin blog.csdn.net/weixin_74004489/article/details/130120929