Linux process control (3) --- process replacement + simple shell implementation

Table of contents

execl()

execv()

execlp()

execvp()

How to use execl to execute the C/C++ executable program written by myself?

How to use makefile to compile two files at the same time

execle()

execvpe()

Writing a simple shell


What is process replacement?

After a fork, the parent and child processes each execute part of the same program: the code is shared between them, and each gets its own copy of the data only when it writes to it (copy-on-write).

But what if the child process wants to execute a completely new program? Then the child needs code of its own, and this is where program replacement comes in.

Program replacement loads a brand-new program (code and data) from disk into the address space of the calling process through a specific interface.

For example, consider a single process, ignoring parent and child for now.

The executable is first loaded into memory; the process then finds the code through its address space and page table mapping, and executes it.

If a program replacement occurs at this point, say with another executable other.exe on disk, the new program is loaded from disk into memory and mapped through the current page table. The process that was running myproc.exe itself hardly changes.

All of this is done through the exec family of interfaces.

So the principle of process replacement is:

After fork creates a child process, the child executes the same program as the parent (though possibly a different code branch), and it often needs to call an exec function to run another program. When a process calls an exec function, its user-space code and data are completely replaced by the new program, and execution starts from the new program's startup routine. Calling exec does not create a new process, so the process ID is the same before and after the call.

In essence, the exec family of functions are program loaders.

Let's demonstrate with two examples: a single process, and a parent and child process.

First, a single process:

Compile with make and run:

The output is as expected.

Next we perform process replacement, using the exec family of functions mentioned above.

execl()

We check the usage with man execl:

 

There are six related functions here, but once you learn one, the rest are basically no problem.

Let's start with the first one, execl.

path is the path of the new program to load: directory path + target file name.

The second parameter is a string argument, explained in detail below.

... is a variable parameter list: a variable number of arguments can be passed in. The last argument must be NULL, marking the end of the argument list.

The rule is: fill in the arguments here exactly as you would type them on the command line. The following example shows what that means.

Open the code from before and call execl. We start with a system program: fill in the path of ls first, then the arguments. To run ls on the command line we type ls, so the argument is written as ls, as follows:

After saving and exiting, compile with make and run.

First, the code before execl runs normally, but the code after execl does not run, which means the program has been replaced by ls at that point. execl replaces all of the current program's code and data, both the part already executed and the part not yet executed; once the replacement happens, the code that follows never runs.

Second, the ls command was executed, so the replacement program runs normally too.

This is one usage of execl.

Of course, more arguments can be added, for example:

Then run:

This time the program is equivalent to running ls -a -l, showing hidden files and detailed information.

It can also be replaced with other commands such as which, pwd, top, and so on. Just fill in the path.

That covers the usage, but we have not yet discussed the return value.

According to the man page, execl returns only on failure, and the return value is then -1.

In other words, when the replacement succeeds, execl does not return at all.

Thinking about it carefully, this makes sense: once the replacement succeeds, even the line of code that called execl has been replaced by a brand-new program, so a return value would be meaningless.

Next we demonstrate an example of a parent-child process.

The expected result is that after the child process runs ls, the parent process reports that the wait succeeded and prints the child's exit code.

The result is as expected.

Why create child processes?

So that the parent process is not affected: the parent focuses on reading data, parsing data, and dispatching child processes to execute the code.

The parent process is responsible for forking and managing the child processes; each child is responsible for program replacement and does its own work.

What is the relationship between parent and child after execl?

Before a new program is loaded, the parent and child share their code, and their data is copied on write.

When the child process calls execl to load a new program, the code can no longer be shared: in effect the code is separated just as data is by copy-on-write, so the parent and child end up completely separate in both code and data.

execv()

Again we check the usage with man first.

Note the difference from execl:

execl passes its arguments as a list,

while execv takes an array of pointers. There is no essential difference from execl, only the way arguments are passed: execv takes a pointer to an argv array that we fill with the options in advance (ending with NULL), whereas execl takes each option as a separate argument.

That is how execv takes its arguments. Let's see how the code is written.

Note the difference from execl: with execv the array is written out first, and then passed in.

 

After running, the result is still correct.

execlp()

Again we check the usage with man:

What is the difference between file here and the path parameter from before?

As we said above, to find the replacement program we had to write out its path. Can the program be found without the path?

It can, thanks to environment variables. For example, ls normally runs without a path, while our own executables need the ./ prefix.

So the file parameter of execlp means the program is looked up automatically in the directories listed in the PATH environment variable; we don't have to say where it is.

Write it in code like this:

It first looks for "ls" in the directories of PATH automatically, and then executes ls -a -l.

Again, the result is correct.

Also note the difference between the two ls arguments: the first says which program to execute (the name searched for in PATH), and the second says how to execute it (it becomes argv[0], just as on the command line).

execvp()

execvp relates to execlp the same way execv relates to execl: the essence is the same, only the way of passing arguments differs.

Note the difference from the previous execlp:

Only the second parameter is different: the arguments are passed as an array instead of a list:

 

It still works fine:

How to use execl to execute the C/C++ executable program written by myself?

We can call execl in the myproc.c file to run the mycmd executable compiled from the mycmd.c file.

Then use make to compile both files, and run myproc, which calls the C program (mycmd) we wrote ourselves.

Continuing from the previous code, write the path of mycmd in myproc.c. To make it easy to change later, we #define it, pass it to execl, and run mycmd with the -a option.

Then write the mycmd.c file:

It uses the command-line parameters of main. If exactly two arguments are not given, the program ends immediately. If the input is

mycmd -a, it outputs a.

mycmd -b, it outputs b.

Then save and exit, and compile with make. Now there is a problem:

How to use makefile to compile two files at the same time

As we said in the earlier make/makefile article, running make with no target scans the makefile from top to bottom and builds only the first target, so only one file gets compiled. Compiling many files one by one would be tedious.

This is where a pseudo-target comes in, which was also covered in that article.

Define a pseudo-target all as the first target; in our case the two final executables are mycmd and myproc.

all carries only a dependency line and no commands: it depends on mycmd and myproc.

When make processes all, it finds the rules for these two dependencies and builds each of them, so both files get compiled.

After compiling with make, we run myproc:

The C program we wrote ourselves runs successfully.

execle()

The first two parameters are exactly the same as in the previous functions, so we only need to look at the last one.

The last parameter, envp, passes environment variables to the new program.

We define an environment variable in the original program, myproc.c:

Then read the environment variable in mycmd.c:

Now we compile with make again and run.

 

The environment variables were successfully obtained.

If we do not pass the environment variable in, the new program cannot see it, and looking it up returns NULL.

execvpe()

This interface simply combines the parameters of the previous interfaces. Once you know those, you can use this one too.

file is searched for in the PATH environment variable, argv holds the arguments and options, and envp holds the environment variables to pass in. All of these were covered above, so there is no demonstration here.

Strictly speaking, none of these six functions is a system call; they are wrappers provided by the C library. The real system call is execve.

filename must be the path of the file (it is not searched for in PATH), argv is the same as above (the command and its options), and the last parameter is again the environment variables.

That concludes process replacement. There are many functions with similar names and slightly different behavior, which makes them hard to remember.

In fact, if you look closely, the function names follow a pattern.

l (list): the arguments are passed as a list, each one a separate function argument, as in execl, execlp, execle.
v (vector): the arguments are passed as an array: write the options into an array first, then pass the array to the function, as in execv, execvp.
p (path): the PATH environment variable is searched automatically, so the full path of the file is not needed, as in execlp, execvp.
e (env): the caller supplies the environment itself by passing in environment variables, as in execle.

int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char *const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[], char *const envp[]);

The prototypes above summarize everything so far. You don't need to memorize them by rote: once you understand what the letters mean, you will know how to pass the arguments as soon as you see the function name.

Writing a simple shell

Writing a simple shell draws on basically everything covered so far in process control: process creation, process termination, process waiting, and today's process replacement. For details, see the source code below, which is commented in detail.

The one thing to note is built-in commands such as cd. If the parent process (our shell, playing the role of bash) creates a child and lets the child carry out the command through process replacement, the cd only changes the child's working directory, and the child then exits, so it is pointless.

So the parent process has to do it itself, using the chdir function (see man chdir for usage). The complete code is as follows:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define NUM 1024
    #define SIZE 32
    #define SEP " "

    // Holds the command line after it has been split into tokens
    char *g_argv[SIZE];
    // Holds the complete command line string
    char cmd_line[NUM];

    // How a shell works: the parent parses commands and waits,
    // while a child process executes each command
    int main()
    {
      // 0. A command interpreter is a long-running process: it keeps looping
      while (1)
      {
        // 1. Print the prompt: [root@localhost myShell]#
        printf("[root@localhost myShell]# ");
        fflush(stdout);
        memset(cmd_line, '\0', sizeof(cmd_line));
        // 2. Read the user's input (commands and options, e.g. "ls -a -l")
        if (fgets(cmd_line, sizeof(cmd_line), stdin) == NULL)
        {
          break; // EOF (Ctrl-D) or read error: leave instead of looping forever
        }
        // Strip the trailing newline: "ls -a -l\n" -> "ls -a -l"
        cmd_line[strcspn(cmd_line, "\n")] = '\0';
        // 3. Split the command line: "ls -a -l" -> "ls" "-a" "-l"
        g_argv[0] = strtok(cmd_line, SEP); // first call: pass the original string
        if (g_argv[0] == NULL)
        {
          continue; // empty line: nothing to run
        }
        int index = 1;
        // For the ls command, add color
        if (strcmp(g_argv[0], "ls") == 0)
        {
          g_argv[index++] = "--color=auto";
        }
        // Later calls pass NULL to keep splitting the same string.
        // Stop at the last token, or when g_argv is full (last slot stays NULL).
        while (index < SIZE - 1 && (g_argv[index] = strtok(NULL, SEP)) != NULL)
        {
          index++;
        }
        g_argv[index] = NULL;
        // 4. Built-in commands: commands the parent (the shell) runs itself.
        // A built-in command is essentially a function call inside the shell.
        if (strcmp(g_argv[0], "cd") == 0) // must run in the parent, not a child
        {
          if (g_argv[1] != NULL && chdir(g_argv[1]) != 0)
          {
            perror("cd");
          }
          continue;
        }
        // 5. fork()
        pid_t id = fork();
        if (id < 0)
        {
          perror("fork");
          continue;
        }
        if (id == 0)
        {
          // child process
          printf("command executed by the child process\n");
          execvp(g_argv[0], g_argv); // e.g. ls -a -l
          perror("execvp");          // reached only if the replacement failed
          exit(1);
        }
        else
        {
          // parent process
          int status = 0;
          pid_t ret = waitpid(id, &status, 0);
          if (ret > 0)
          {
            printf("exit code:%d\n", WEXITSTATUS(status));
          }
        }
      }
      return 0;
    }

After compiling with make, we run it:

The shell works as expected.

Origin blog.csdn.net/weixin_47257473/article/details/131827129