Linux | Program Replacement

Preface

        This article records some of the problems I ran into while learning program replacement. I am sharing them here in the hope that they help others.

1. A first look at program replacement

        Program replacement means replacing the code and data of a process with those of a new program and then running the new program's code. As we saw when discussing the process address space, a child process gets a copy of the parent's PCB and other kernel data structures, including the page table mappings, so parent and child initially share the same code and data. When the child modifies data, copy-on-write occurs. Normally only data is copied on write and the code is never changed; the program replacement we study today changes the code as well.

        The functions used for program replacement are mainly the following six: execl, execlp, execle, execv, execvp and execvpe.

        That looks like a lot, but once we have learned a few of them the rest follow easily. Let us go through them one by one.

2. How to perform program replacement

1. execl function

        The function is declared as follows;

int execl(const char *path, const char *arg, ...);

Parameter one: path

        This parameter is the path of the program we want to run, for example the ls command. We can look up the path of a command with which or whereis.

        We then pass that path as this parameter. In other words, the first parameter tells the system where to find the program that will replace the current one.

Parameter two: arg, ...

        The second parameter is actually a variable argument list, which is what the l (list) in the function name stands for. We list the arguments exactly as we would type the command on the command line, and end the list with NULL. Taking ls again, we can add options to change its behavior; for example, we can pass "ls", "-a", "-l", NULL. These arguments tell the OS how to run the program.

Return value:

        The return value of this function is unusual. If the call fails, it returns -1 and sets errno; if it succeeds, it does not return at all.

Question: Why is there no return value on success?

        First think about whether a successful call could return anything. Recall the nature of program replacement: it remaps the code and data of the current process. It does not create a new PCB, process address space, page table or other kernel data structures. Once the replacement succeeds, the page table mappings have changed and the original code and data are gone, so the code that would have received the return value no longer exists.

Note: All the other program replacement functions return in the same way.
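The failure case is easy to observe. The following is a minimal sketch, not the author's code: try_exec_missing is an illustrative name and "/no/such/program" is a made-up path that is assumed not to exist. Because execl fails, control comes back to the caller with -1 and errno set.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Try to replace the current program with one that does not exist.
 * "/no/such/program" is a made-up placeholder path.
 * Returns the errno value left behind by the failed execl call. */
int try_exec_missing(void)
{
    int ret = execl("/no/such/program", "program", (char *)NULL);

    /* We only get here because execl failed. Had it succeeded, this
     * code would already have been replaced by the new program. */
    int saved = errno;
    printf("execl returned %d, errno = %d\n", ret, saved);
    return saved;
}
```

Since a return from execl can only mean failure, there is no need to test the return value; any code placed after the call is the error path.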

Seeing once is better than hearing a hundred times:

        Write a main function that prints main begin, calls execl to replace itself with ls, and then prints main end. Compile and run it.

        The result is as expected: main begin is printed but main end never is, because once the replacement succeeds the entire code of the program, including the final print, has been replaced.

Note: The following functions are similar to this one, so I will not explain them in as much detail.

2. execv function

        The difference between this function and the previous one is that the l becomes a v. The v can be understood as vector, that is, an array. The second parameter is therefore not written out as a list; we pass an array of strings instead. The array must still end with NULL, and the terminator cannot be omitted. The function is declared as follows;

int execv(const char *path, char *const argv[]);

        With the same ls arguments stored in an array, it behaves exactly like the execl example: the process is replaced by ls.

3. execlp function

        This function adds a p to execl. It is declared as follows;

int execlp(const char *file, const char *arg, ...);

        Only the first parameter has changed: it is now file, the file name. We can pass just the name of the executable, and the function searches each directory listed in the PATH environment variable for a file with that name (a name that contains a slash is used as a path and PATH is not searched). If the file is found it is executed; otherwise an error is returned. The added p stands for PATH. Readers who are not familiar with environment variables can read the following blog;

Linux | Process-CSDN Blog

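        Some readers may wonder why ls appears twice in a call such as execlp("ls", "ls", "-a", "-l", NULL). The arguments should be read in two parts: the first argument says which program to run, and the remaining arguments say how to run it, starting with argv[0], which by convention is the program name.

        This is the same split we emphasized for execl, so it should feel familiar.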

4. execvp function

        If you can use execlp, you can use this function. The only difference is that execlp takes its arguments as a list, while execvp takes them as an array of character pointers terminated by NULL. For example, with an array argv holding "ls", "-a", "-l", NULL, the call is execvp("ls", argv).

5. execle function

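        The only difference between this function and execl is that it lets us pass environment variables to the new program. The environment is given as a NULL-terminated array of "NAME=VALUE" strings, placed after the NULL that ends the argument list, as the following declaration shows;

int execle(const char *path, const char *arg, ..., char *const envp[]);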
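To see what the new program receives, the sketch below runs env in a child process with a hand-built environment and captures what it prints. It assumes that env is installed at /usr/bin/env; MYVAL and the function name env_output_with_execle are made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run /usr/bin/env in a child with a hand-built environment and copy
 * what it prints into out. Returns 0 on success, -1 on error.
 * Assumptions: env is at /usr/bin/env; MYVAL is a made-up variable. */
int env_output_with_execle(char *out, size_t size)
{
    int fds[2];
    if (size == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: send stdout into the pipe, then replace the program. */
        char *const envp[] = { "MYVAL=hello", NULL };
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execle("/usr/bin/env", "env", (char *)NULL, envp);
        _exit(127); /* execle failed */
    }

    /* Parent: read everything the child printed. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used + 1 < size &&
           (n = read(fds[0], out + used, size - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The captured text contains MYVAL=hello and nothing inherited from the caller, such as PATH: with the e variants the new program sees exactly the environment in envp.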

6. execvpe function

        Here v means the arguments are passed as an array of character pointers, p means the program is searched for in the PATH environment variable by default, and e means we can pass in the environment variables. Note that execvpe is a GNU extension. The declaration is as follows;

int execvpe(const char *file, char *const argv[], char *const envp[]);
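A short sketch that combines all three letters. It assumes a glibc-style system where execvpe is declared once _GNU_SOURCE is defined, and that sh can be found through PATH; MYVAL and run_sh_with_execvpe are made-up names.

```c
#define _GNU_SOURCE /* execvpe is a GNU extension */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "sh" in a child with an array of arguments (v), found through
 * PATH (p), and with a hand-built environment (e). The shell exits
 * with 0 only if it sees MYVAL=hello.
 * Returns the child's exit code, or -1 on error. */
int run_sh_with_execvpe(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *const argv[] = { "sh", "-c", "test \"$MYVAL\" = hello", NULL };
        char *const envp[] = { "MYVAL=hello", NULL };
        execvpe("sh", argv, envp);
        _exit(127); /* execvpe failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A return value of 0 means the shell was found without a full path and saw the variable passed through envp.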

7. execve function

        The man page lists one more function. Careful readers will have noticed that at the beginning we only mentioned 6 program replacement functions, and execve was not among them. That is because execve is the real system call; the 6 functions above are library wrappers around it. Its declaration is as follows;

int execve(const char *path, char *const argv[], char *const envp[]);

        It is used in the same way as the functions above, so we will not demonstrate it separately.

3. Deep understanding of program replacement

1. Program replacement code form

        There are two ways to write program replacement code. One is to call a program replacement function directly; the other is to call fork first and let the child process call the program replacement function. All the examples so far can be rewritten in the second form.
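The second form can be sketched as a small helper. run_in_child is an illustrative name, not a library function; the exit code 127 for a program that cannot be executed follows the usual shell convention.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up in PATH) in a child process and wait for it.
 * Returns the child's exit code, or -1 if fork or waitpid failed or
 * the child was killed by a signal. */
int run_in_child(char *const argv[])
{
    fflush(stdout); /* keep buffered output from being duplicated in the child */

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process with the requested program. */
        execvp(argv[0], argv);
        perror("execvp"); /* reached only if execvp failed */
        _exit(127);
    }

    /* Parent: its own code and data are untouched; just wait. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent keeps running whatever happens in the child, which is what makes this form the basis of a shell.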

        What is the benefit of writing the code this way? The child process performs the program replacement, so even if the new program has a bug and crashes, only the child crashes and the parent is not affected. Keep in mind that program replacement is not limited to system commands; we can also run executables we wrote ourselves. For example, we can write a small program sub.c and compile it into an executable named sub.

        We then change the original main function so that the child process calls the sub program we just wrote, passing the path ./sub to the program replacement function.

        Compile and run, and the result is as expected: program replacement can also run a program we wrote ourselves.

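2. Probing questions

Question: Will our environment variables be replaced?

        No. Program replacement only replaces code and data and by default leaves our environment variables as they are. The environment changes only when we pass a new one explicitly through one of the e variants (execle, execvpe, execve).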
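This can be checked with a short sketch: set a variable, replace the child with a shell using a function that takes no envp, and let the shell report through its exit code whether it still sees the variable. MYVAL and env_survives_exec are made-up names, and sh is assumed to be reachable through PATH.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set MYVAL, then replace a child process with a shell that checks it.
 * Returns the shell's exit code (0 if MYVAL survived), or -1 on error. */
int env_survives_exec(void)
{
    if (setenv("MYVAL", "hello", 1) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* No envp argument: the new program keeps the current environment. */
        execlp("sh", "sh", "-c", "test \"$MYVAL\" = hello", (char *)NULL);
        _exit(127); /* execlp failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A return value of 0 means the variable set before the replacement was still visible to the new program.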

Question: How does program replacement differ from creating a child process?

        Creating a child process first creates kernel data structures for it, such as the PCB, process address space and page table. These kernel data structures, together with the code and data they map, are what we call a process. Program replacement replaces the code and data of the current process with those of another program. No new kernel data structures are created; the existing ones are only updated, for example the page table mappings are changed to point to the new physical addresses, and the PID stays the same.
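One visible consequence is that the PID does not change across a replacement. The sketch below is an assumption-laden illustration rather than the author's code: the child records its own PID, replaces itself with sh, and the shell compares its PID ($$) with the recorded value passed as $1. pid_survives_exec is a made-up name.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the program running after the replacement has the same
 * PID as the child that called execlp, a nonzero exit code otherwise,
 * or -1 on error. */
int pid_survives_exec(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char self[32];
        snprintf(self, sizeof self, "%ld", (long)getpid());
        /* In the shell, $$ is its own PID, $0 is "sh" and $1 is self. */
        execlp("sh", "sh", "-c", "test \"$$\" -eq \"$1\"",
               "sh", self, (char *)NULL);
        _exit(127); /* execlp failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

fork gives the child a new PID; the replacement that follows keeps it, because the process itself is the same and only its code and data have changed.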

4. Simulate the shell command line

        With this background we can implement a small program that simulates a shell command line. First, what must a command line do?

1. Run in an infinite loop, repeatedly reading, parsing and executing our commands;

2. Print a command line prompt;

3. Read the command the user types;

4. Parse the command;

5. Handle built-in commands, that is, commands that the parent process must carry out itself;

6. Create a child process and perform the program replacement in it.

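#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>

// Size of the buffer that holds the raw command line
#define NUM 128
// Size of the array of pointers to the parsed arguments
#define SIZE 32
// Separator between arguments
#define SEP " "
// Uncomment to print debugging output
//#define DEBUG


// Raw command line typed by the user
char cmd_line[NUM];
// Parsed arguments, terminated by NULL
char* args[SIZE];


int main(void)
{
    while(1)
    {
        // Clear the command buffer
        memset(cmd_line, 0, sizeof(cmd_line));

        // 1. Print the prompt
        printf("[root@localhost]# ");
        fflush(stdout); // the prompt has no newline, so flush explicitly

        // 2. Read the user's input; stop on EOF (Ctrl-D) or a read error
        if(fgets(cmd_line, sizeof(cmd_line), stdin) == NULL)
        {
            printf("\n");
            break;
        }
        // Strip the trailing newline, if there is one
        cmd_line[strcspn(cmd_line, "\n")] = '\0';
#ifdef DEBUG
        printf("%s\n", cmd_line);
#endif
        // 3. Split the command line into arguments.
        // Keep one slot for the NULL terminator and one for the "-l"
        // that the ll alias may insert.
        int argc = 0;
        char* token = strtok(cmd_line, SEP);
        while(token != NULL && argc < SIZE - 2)
        {
            args[argc++] = token;
            token = strtok(NULL, SEP);
        }
        args[argc] = NULL;
        // Empty line: show the prompt again
        if(argc == 0)
        {
            continue;
        }
#ifdef DEBUG
        for(int i = 0; args[i]; i++)
        {
            printf("args[%d]: %s\n", i, args[i]);
        }
#endif

        // 4. Built-in commands and aliases
        if(strcmp(args[0], "cd") == 0)
        {
            if(args[1] != NULL)
            {
                // Change the working directory of the shell itself
                if(chdir(args[1]) < 0)
                {
                    perror("cd");
                }
            }
            else
            {
                printf("usage: cd <directory>\n");
            }
            continue;
        }
        if(strcmp(args[0], "export") == 0)
        {
            if(args[1] != NULL && strchr(args[1], '=') != NULL)
            {
                // putenv keeps the pointer it is given, so every variable
                // needs its own copy that outlives cmd_line
                char* entry = strdup(args[1]);
                if(entry == NULL || putenv(entry) != 0)
                {
                    perror("export");
                    free(entry);
                }
            }
            else
            {
                printf("usage: export NAME=VALUE\n");
            }
            continue;
        }
        if(strcmp(args[0], "ll") == 0)
        {
            // Treat "ll ..." as "ls -l ...": shift the arguments
            // (and the NULL terminator) right by one
            for(int i = argc; i >= 1; i--)
            {
                args[i + 1] = args[i];
            }
            args[0] = (char*)"ls";
            args[1] = (char*)"-l";
        }
        // 5. Create a child process and let it run the command
        pid_t id = fork();
        if(id < 0)
        {
            // fork failed
            perror("fork");
            continue;
        }
        else if(id == 0)
        {
            // Child process
#ifdef DEBUG
            const char* val = getenv("MYVAL");
            printf("MYVAL:%s\n", val ? val : "(unset)");
            fflush(stdout);
#endif
            execvp(args[0], args);
            // Reached only if execvp failed
            perror(args[0]);
            _exit(127);
        }
        else
        {
            // Parent process: wait for the child and report how it ended
            int status = 0;
            waitpid(id, &status, 0);
            if(WIFEXITED(status))
            {
                // Normal exit
                printf("exited normally, exit code: %d\n", WEXITSTATUS(status));
            }
            else if(WIFSIGNALED(status))
            {
                // Abnormal exit: report the terminating signal
                printf("crashed, terminated by signal: %d\n", WTERMSIG(status));
            }
        }
    }
    return 0;
}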

Origin blog.csdn.net/Nice_W/article/details/134090024