Linux articles [7]: Process program replacement

Table of contents

1. Process program replacement

1. Concept (the principle is below)

2. Why program replacement

3. The principle of program replacement:

4. How to perform program replacement

Let's use the first execl as a demonstration:

example:

(1) After execl succeeds, none of the code that follows it is executed

(2) Do we need to check the return value of execl? Why?

Failure example 1: the path is wrong

Failure example 2: the option is wrong

(3) Introducing process creation: does program replacement in the child affect the parent?

5. Extensive testing of various interfaces

Naming comprehension (with v and with l)

memory skills:

with e and with p

(1)execv

(2)execlp

(3)execvp

(4)execle

Use system interface to call cpp program in c program

6. Simulate the implementation of the shell

hello.c

makefile

myshell.c

2. Built-in commands

1. Built-in commands - take chdir as an example

3. Environment variables in process replacement

Supplementary simulation shell to color ls:

The complete code of the simulated shell:


1. Process program replacement

1. Concept (the principle is below)

A child process created with fork executes a fragment of the parent's code. What if we want the child to execute a brand-new program instead?
For that we need process program replacement.
 

2. Why program replacement

When designing servers (Linux programming), we often need child processes to do one of two things:
1. Execute a fragment of the parent's code (the server's own code).
2. Execute a brand-new program from disk (as a shell does: the client asks for some program, and our process runs code written by someone else). The caller is C/C++; the program being run may be written in C, C++, Python, Shell, PHP, Java, and so on.

3. The principle of program replacement:

1. Load the program from disk into memory.
2. Re-establish the page table mappings. Whichever process performs the replacement gets the new mappings (here, the child process).
Effect: the parent and the child are completely separated, and the child executes a brand-new program.

 

Does program replacement create a new process?
No. The child's PCB and other kernel structures stay the same (the PID does not change); only the page table mappings change.

After a successful replacement, the original process does not exit: the new program runs inside it. When the new program finishes, the process exits.

Why can we only do this through a system interface?
Because replacement means moving data from one piece of hardware to another (from disk into memory), and only the operating system can do that.

4. How to perform program replacement

Run man execl to see the functions used for program replacement:

18f55bb0eb2b496ab9b2ba4019dd723d.png

Let's use the first execl as a demonstration:

ae86d3781266486f9c5187757178b16f.png

Specific example: execl("/usr/bin/ls", "ls", "-l", "-a", NULL);
To execute a brand-new program (which is essentially a file on disk), we need to answer two questions:
1. Where is the program? (Example: which ls prints the path of ls.)
2. How should it be executed? A program may or may not take options, so we must tell the OS exactly how we want to run it and which options to pass.

In the figure, the part underlined in red answers the first question and the part underlined in green answers the second.

The arguments are filled in exactly as the command line is written: ls -l -a becomes "ls", "-l", "-a". The last argument must be NULL, which marks the end of the argument list.

example:

327e672cdcb340948f5d97fc69d29fed.png

(1) After execl succeeds, none of the code that follows it is executed

4956d8944de84cc59e4e73d71f1bd175.png

The printf comes after execl in the source. Why is it never executed?

Because once execl succeeds, all of the code and data of the current process are replaced.
The printf that followed the call no longer exists in the process.

(2) Do we need to check the return value of execl? Why?

int ret = execl(...);

Answer: there is no need to check the return value, although the function still has one. On success execl never returns: ret and the statements after the call belong to the code and data of the current process, and all of that has been replaced by the code of ls. So if execl returns at all, the replacement must have failed (it returns -1), and execution continues with the code after the call. At most, the return value and errno tell us why it failed.

Failure example 1: the path is wrong

b76db24524954fb3af3fe63ae3332ce0.png

Failure example 2: the option is wrong

1e3b7015951744f3997a89bbbf49043e.png

(3) Introducing process creation: does program replacement in the child affect the parent?

37fb8e3fed9c477898a78fd854b1d72c.png

If the child process performs program replacement, does that affect the parent process?

No, because processes are independent.
How is that independence kept? Through copy-on-write. We already know copy-on-write happens for data; at program replacement we can think of it as happening for both code and data, which completes the separation of parent and child.

5. Extensive testing of various interfaces

Naming comprehension (with v and with l)

These function prototypes look easy to confuse, but they are easy to remember once you know the naming rules.

l (list): the arguments are passed as a list

v (vector): the arguments are passed as an array

p (path): the function searches the PATH environment variable for the program

e (env): the caller supplies the environment variables itself

memory skills:

execl ends in l for list: the arguments are passed one by one as a variable-length argument list.

execv ends in v for vector: the arguments are passed as an array of pointers.

8a3abef935e14dc9911ce63a5beae742.png

with e and with p

Functions with e (execle, execvpe) let you pass environment variables, but the array you pass replaces the environment the process would otherwise inherit, so the new program sees only what you hand over. Functions without e inherit the current environment by default. Functions with p (execlp, execvp, execvpe) do not need a path: pass just the command name and PATH is searched.

18f55bb0eb2b496ab9b2ba4019dd723d.png

(1)execv

int execv(const char *path, char *const argv[]);

path is still the path of the program. argv[] is a NULL-terminated array of pointers to the argument strings, that is, the command to run and its options.

4b2b1eff01d34b1b9713c06a40dd71cc.png

execv and execl differ only in how the arguments are passed: execl takes a variable argument list, execv takes an array of pointers.

f74fcc9112954b268110a1b08e832a22.png

(2)execlp

int execlp(const char *file, const char *arg, ...);

file: the program to execute. With p in the name you do not need the path; just say which program you want, and it is looked up in the directories listed in the environment variable PATH, the same default search path used when you type a command.
execlp("ls", "ls", "-a", "-l", NULL);

There are two ls here with different meanings: the first tells the system which program to execute; the second, together with -a and the other options, describes how to execute it.

f27b940a28bd404eb8aa58826d36259f.png

(3)execvp

int execvp(const char *file, char *const argv[]);

Same as execlp, except that the arguments are passed as an array.

5b579586b62442b7898514a90f1b9755.png

(4)execle

int execle(const char *path, const char *arg, ..., char *const envp[]);

char *const envp[]: the environment variables handed to the target process. This replaces the environment rather than adding to it. If you call execle("./mycmd", "mycmd", NULL, env_); the new program sees only the variables in env_, and all the original environment variables are gone. To keep them, pass the global variable environ instead; any variables of your own must then be added to the environment manually.

// declaration of the environment pointer
    extern char **environ;
    ...
    execle("./mycmd", "mycmd", NULL, environ);

042b5959402f4f728dc508855c6e2186.png

Overview code:

mycmd.cpp:

#include <iostream>
#include <cstdlib>

int main()
{
    // getenv returns NULL for an unset variable, and streaming a NULL
    // char* into std::cout is undefined behavior, so print a placeholder.
    const char *path = std::getenv("PATH");
    const char *mypath = std::getenv("MYPATH");

    std::cout << "PATH:" << (path ? path : "(not set)") << std::endl;
    std::cout << "-------------------------------------------\n";
    std::cout << "MYPATH:" << (mypath ? mypath : "(not set)") << std::endl;
    std::cout << "-------------------------------------------\n";

    // print the greeting eight times, as in the original listing
    for (int i = 0; i < 8; ++i)
        std::cout << "hello c++" << std::endl;
    return 0;
}
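
myexec.c:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

// declaration of the environment pointer
extern char **environ;

int main()
{
    printf("I am the parent process, my pid is: %d\n", getpid());
    pid_t id = fork();
    if(id == 0){
        //child
        //we want the child to run a brand-new program; before, it ran a fragment of the parent's code

        printf("I am the child process, my pid is: %d\n", getpid());
        char *const env_[] = {
            (char*)"MYPATH=YouCanSeeMe!!",
            NULL
        };
        (void)env_; //only used by the commented-out call below
        //env_: hands environment variables to the target process; it replaces the environment
        //execle("./mycmd", "mycmd", (char*)NULL, env_);
        //passing environ instead keeps the existing environment variables:
        execle("./mycmd", "mycmd", (char*)NULL, environ);

        exit(1); //reaching exit means the exec* call failed
    }
    // only the parent gets here
    int status = 0;
    int ret = waitpid(id, &status, 0);
    if(ret == id)
    {
        sleep(2);
        printf("parent: wait succeeded!\n");
    }
    return 0;
}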

861411a85968406b99a76a93d23ce03d.png

Use system interface to call cpp program in c program

So far every program we have executed has been a system command. What if we want to execute a C/C++ program we wrote ourselves?
And how do we execute programs written in other languages?
 

f1e9555620c44a5197c3a92ec8569e87.png

275a091b81ee4720a9ecec35888a678c.png

    // only the parent gets here
    int status = 0;
    int ret = waitpid(id, &status, 0);
    if(ret == id)
    {
        sleep(2);
        printf("parent: wait succeeded!\n");
    }
    return 0;
}

Why are there so many interfaces? Because each one fits a different usage scenario.

Why is execve documented separately? Because only execve is a real system call. The others are library wrappers around it, and all of them end up calling execve.

e1d7cd5dc710416c8f18399739849877.png

6. Simulate the implementation of the shell

hello.c

#include <stdio.h>

int main()
{
    printf("hello my shell\n");
    return 0;
}

makefile

# add -std=c99 if it does not compile without it
myshell:myshell.c
	gcc -o $@ $^ -std=c99
.PHONY:clean
clean:
	rm -f myshell

myshell.c

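#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define SEP " "    //may hold several delimiters, e.g. " ,." splits on space, comma and period
#define NUM 1024
#define SIZE 128

char command_line[NUM];
char *command_args[SIZE];

int main()
{
    //a shell is essentially an endless loop
    while(1)
    {
        //the interfaces for getting the user name, host name and directory are not the point here; look them up
        //1. print the prompt
        printf("[zhangsan@myhostname currentdir]# ");
        fflush(stdout);
        //2. read the user's input
        memset(command_line, '\0', sizeof(command_line));
        //fgets reads at most NUM-1 bytes from stdin (the keyboard) into
        //command_line and appends '\0'; it returns NULL at end of input
        if(fgets(command_line, NUM, stdin) == NULL) break;
        //fgets also stores the newline typed by the user; overwrite it with '\0'
        command_line[strcspn(command_line, "\n")] = '\0';
        //3. split the string: "ls -a -l -i" -> "ls" "-a" "-l" "-i"
        command_args[0] = strtok(command_line, SEP);
        if(command_args[0] == NULL) continue; //empty line: nothing to run
        int index = 1;
        // (1) the single = is intentional: assign, then test the result
        // strtok returns the start of the next token, or NULL when there are no more
        // (2) strtok has already returned "ls" above; to keep splitting the same string, pass NULL as the first argument
        while(index < SIZE - 1 && (command_args[index] = strtok(NULL, SEP)) != NULL) index++;
        command_args[index] = NULL; //the argument array must end with NULL

        //for debug: print the tokens to check that the input was stored in command_args
        //for(int i = 0 ; i < index; i++)
        //{
        //    printf("%d : %s\n", i, command_args[i]);
        //}
        // 4. TODO: built-in commands go here
        // 5. create a process to run the command
        pid_t id = fork();
        if(id < 0)
        {
            perror("fork");
            continue;
        }
        if(id == 0)
        {
            //child
            // 6. program replacement
            //command_args[0] holds the name of the program to run
            execvp(command_args[0], command_args);
            exit(1); //reached only if the replacement failed
        }
        int status = 0;
        pid_t ret = waitpid(id, &status, 0);
        if(ret > 0)
        {
            printf("wait for child succeeded: sig: %d, code: %d\n", status&0x7F, (status>>8)&0xFF);
        }
    }// end while
    return 0;
}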

2. Built-in commands

1. Built-in commands - take chdir as an example

If cd were executed through exec*, at most the child process would change directory. The child exits as soon as it has run, so changing its directory is pointless. In a shell we want the directory of the parent process, the shell itself, to change. Once the parent's directory changes, every child created afterwards inherits it.

Some actions must therefore be carried out by the parent shell itself. For those, no child process may be created; the parent runs the corresponding code directly. A command that the shell executes itself is called a built-in command (builtin).
A built-in command is equivalent to a function inside the shell.

(Add this at step 4, the TODO, in myshell.c above.)

The built-in command is executed by the parent process itself.

chdir: takes the path you want to change to

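//implements the built-in command cd
int ChangeDir(const char * new_path)
{
    //chdir returns 0 on success and -1 on failure
    if(chdir(new_path) != 0)
    {
        perror("cd");
        return -1;
    }
    return 0; // success
}

while(1)
{
    ...
    // 4. built-in commands
    if(strcmp(command_args[0], "cd") == 0 && command_args[1] != NULL)
    {
        ChangeDir(command_args[1]); //the caller, i.e. the parent process, changes directory
        continue;
    }
}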

Examples of built-in commands: cd command, export, echo

3. Environment variables in process replacement

Environment variable data lives in the context of the process.
1. Environment variables are inherited by child processes, which is what gives them their global character.
2. Program replacement does not replace the environment variables of the current process: the new program keeps the ones inherited from the parent, because environment variables are system data, not part of the program's own code and data.

Functions with e (execle, execvpe) can pass environment variables, but the ones you pass overwrite the environment the new program would otherwise inherit; functions without e inherit the current environment by default. If we do not want to overwrite anything (so the child cannot simply use execle or execvpe with its own array) and only want to add a variable, we add it in the parent: the parent shell runs a built-in command that calls putenv, and child processes then inherit the new variable (by default, environment variables are inherited by every child process).

How to add your own environment variable inside the shell: putenv. Note: the string passed to putenv needs storage of its own that stays valid afterwards.

putenv: imports the given environment variable string into the calling process's own context

3fbad74f509a4a79a3c0a5c0d6a2bee2.png

Function description: getenv() returns the value of the environment variable called name (the Linux command env lists the environment variables). If the variable exists, getenv returns a pointer to its value; otherwise it returns NULL.

void PutEnvInMyShell(char * new_env)
{
    putenv(new_env);
}

// 4. built-in commands
      if(strcmp(command_args[0], "export") == 0 && command_args[1] != NULL)
        {
            // the variable text currently lives in command_line, which memset clears on every loop iteration,
            // so we keep our own copy in the global env_buffer
            strcpy(env_buffer, command_args[1]);
            PutEnvInMyShell(env_buffer); //export myval=100; BUG? a second export reuses env_buffer and overwrites the first
            continue;
        }

Supplementary simulation shell to color ls:

[zsh@ecs-78471 ~]$ which ls
alias ls='ls --color=auto'
	/usr/bin/ls

Explanation: alias defines an alias. alias ls='ls --color=auto' means that on this system ls is an alias for the command ls --color=auto, so the ls we normally type is really ls --color=auto. To get colored output from ls in our shell, we have to add this option ourselves.

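        //3. split the string: "ls -a -l -i" -> "ls" "-a" "-l" "-i"
        command_args[0] = strtok(command_line, SEP);
        if(command_args[0] == NULL) continue; //empty line: nothing to run
        int index = 1;
        // add color to the ls command
        if(strcmp(command_args[0]/*program name*/, "ls") == 0 ) //if the command is ls, add the color option
            command_args[index++] = (char*)"--color=auto";

        // the single = is intentional: assign, then test the result
        // strtok returns the start address of the next token on success
        // and NULL when there are no more tokens
        while(index < SIZE - 1 && (command_args[index] = strtok(NULL, SEP)) != NULL) index++;
        command_args[index] = NULL; //the argument array must end with NULL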

Explanation: after command_args[0] = strtok(command_line, SEP); the program name is in command_args[0]. The test if(strcmp(command_args[0], "ls") == 0) checks whether the command is ls. If it is, command_args[index++] = (char*)"--color=auto"; inserts --color=auto as the first option (nothing is inserted for other commands). The resulting array is "ls" "--color=auto" "-a" "-l" "-i".

The complete code of the simulated shell:

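#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define SEP " "
#define NUM 1024
#define SIZE 128

char command_line[NUM];
char *command_args[SIZE];

char env_buffer[NUM]; //for test

extern char**environ;

//implements the built-in command cd
int ChangeDir(const char * new_path)
{
    //chdir returns 0 on success and -1 on failure
    if(chdir(new_path) != 0)
    {
        perror("cd");
        return -1;
    }
    return 0; // success
}

void PutEnvInMyShell(char * new_env)
{
    putenv(new_env);
}

int main()
{
    //a shell is essentially an endless loop
    while(1)
    {
        //the interfaces for getting the user name, host name and directory are not the point here; look them up
        //1. print the prompt
        printf("[zhangsan@myhostname currentdir]# ");
        fflush(stdout);

        //2. read the user's input
        memset(command_line, '\0', sizeof(command_line));
        //fgets reads from stdin (the keyboard) and produces a C string ending in '\0'; NULL means end of input
        if(fgets(command_line, NUM, stdin) == NULL) break;
        command_line[strcspn(command_line, "\n")] = '\0';// remove the trailing \n

        //3. split the string: "ls -a -l -i" -> "ls" "-a" "-l" "-i"
        command_args[0] = strtok(command_line, SEP);
        if(command_args[0] == NULL) continue; //empty line: nothing to run
        int index = 1;
        // add color to the ls command
        if(strcmp(command_args[0]/*program name*/, "ls") == 0 ) //if the command is ls, add the color option
            command_args[index++] = (char*)"--color=auto";

        // the single = is intentional: assign, then test the result
        // strtok returns the start address of the next token on success
        // and NULL when there are no more tokens
        while(index < SIZE - 1 && (command_args[index] = strtok(NULL, SEP)) != NULL) index++;
        command_args[index] = NULL; //the argument array must end with NULL

        //for debug
        //for(int i = 0 ; i < index; i++)
        //{
        //    printf("%d : %s\n", i, command_args[i]);
        //}

        // 4. built-in commands
        if(strcmp(command_args[0], "cd") == 0 && command_args[1] != NULL)
        {
            ChangeDir(command_args[1]); //the caller, i.e. the parent process, changes directory
            continue;    //a built-in ends with continue, so no child process is created
        }
        if(strcmp(command_args[0], "export") == 0 && command_args[1] != NULL)
        {
            // the variable text currently lives in command_line, which gets cleared,
            // so we keep our own copy of it here
            strcpy(env_buffer, command_args[1]);
            PutEnvInMyShell(env_buffer); //export myval=100; BUG? a second export reuses env_buffer and overwrites the first
            continue;    //a built-in ends with continue, so no child process is created
        }

        // 5. create a process to run the command
        pid_t id = fork();
        if(id < 0)
        {
            perror("fork");
            continue;
        }
        if(id == 0)
        {
            //child
            // 6. program replacement
            //command_args[0] holds the name of the program to run
            execvp(command_args[0], command_args);
            exit(1); //reached only if the replacement failed
        }
        int status = 0;
        pid_t ret = waitpid(id, &status, 0);
        if(ret > 0)
        {
            printf("wait for child succeeded: sig: %d, code: %d\n", status&0x7F, (status>>8)&0xFF);
        }
    }// end while
    return 0;
}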

Testing cd .. shows that the parent process (the shell) moves to the parent directory as expected

a8c12675341c43509df42fe30df1c55c.png

Testing export shows that environment variables are added as expected

cc60a1557517492fbfe0bb69cfb271c4.png

Origin blog.csdn.net/zhang_si_hang/article/details/127401753