[Linux] Process Control--Process Creation/Process Termination/Process Waiting/Process Program Replacement/Simple Shell Implementation

Article directory

1. Process creation
2. Process termination
3. Process waiting
4. Process program replacement
5. Implement a simple shell

1. Process creation

1.fork function

The fork function is one of the most important system calls in Linux. It creates a new process, which becomes a child of the calling process. Its documentation is in section 2 of the manual (man 2 fork):

// Header file
#include <unistd.h>

// Create a child process
pid_t fork(void);

// Return value: 0 in the child, the child's pid in the parent, -1 on error

When a process calls fork and control passes to the fork code in the kernel, the kernel does the following:

Allocates new memory blocks and kernel data structures for the child process

Copies part of the parent's data structure contents into the child

Adds the child process to the system process list

Returns from fork and lets the scheduler start scheduling

After a process calls fork, there are two processes running the same binary code, and both continue from the same point. From there, each process goes its own way, as the following program shows:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>

int main()
{
    pid_t id = fork();

    if(id == -1)
    {
        printf("fork fail\n");
        exit(1);
    }
    else if(id == 0)
    {
        // child
        while(1)
        {
            printf("I am the child, pid:%d,ppid:%d,id:%d\n",getpid(),getppid(),id);
            sleep(1);
        }
    }
    else
    {
        // parent
        while(1)
        {
            printf("I am the parent, pid:%d,ppid:%d,id:%d\n",getpid(),getppid(),id);
            sleep(1);
        }
    }

    return 0;
}

Conclusion: before fork, only the parent process runs. After fork, the parent and the child are two separate execution flows. Which of them runs first after fork is decided entirely by the scheduler.

A small tip: in the recipe of a Makefile rule, "$@" stands for the target, that is, whatever is to the left of ":" in the rule, and "$^" stands for the target's dependencies, that is, whatever is to the right of ":".

2.fork function return value

The fork function has two return values: it returns 0 in the child process and the pid of the child in the parent process.

In C/C++ a function returns at most one value, so how can fork have two return values?

fork is a system call, so it is implemented by the operating system, and it is the operating system that creates the child process on our behalf. By the time any function reaches its return statement, its main work is already done. The job of fork is to create the child, so the child already exists before fork returns. At that point there are two processes, and each of them returns from fork once. That is why fork appears to have two return values.

Why does fork return the child's pid to the parent and 0 to the child?

A parent may have many children, but a child has exactly one parent. The parent needs each child's pid to tell its children apart, whereas the child never needs to identify its parent this way: it can call getppid at any time to get the parent's pid.

How can the same variable id hold two different values, so that the if branch and the else if branch both run?

The child starts with a copy of the parent's PCB, kernel data structures and page table, so both processes initially map the same physical memory. When either process writes to its data, copy-on-write takes place: the data is copied to a new location in physical memory and that process's page table mapping is updated. Storing the return value of fork into id is itself a write, and whichever process returns first writes first, so each process ends up with its own copy of id. Because the two processes are independent, one takes the if branch while the other takes the else if branch.
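To make the point about return values concrete, here is a minimal sketch in which a parent forks several children, records the pid that fork returns for each one, and later uses those pids to wait for each child individually. The function name spawn_and_collect and the limit MAX_CHILDREN are made up for this illustration, and waitpid with its status macros is covered in the process waiting section.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16  /* illustrative limit, not from the article */

/* Forks n children; child i exits with code i.
 * Returns the sum of the exit codes the parent collected, or -1 on error. */
int spawn_and_collect(int n)
{
    pid_t pids[MAX_CHILDREN];
    int created = 0;
    int sum = 0;
    int failed = 0;

    if (n < 0 || n > MAX_CHILDREN)
        return -1;

    for (int i = 0; i < n; i++) {
        pid_t id = fork();
        if (id == -1) {
            failed = 1;
            break;
        }
        if (id == 0)
            _exit(i);          /* child: fork returned 0 */
        pids[created++] = id;  /* parent: fork returned the child's pid */
    }

    /* reap every child that was created, even if a later fork failed */
    for (int i = 0; i < created; i++) {
        int status = 0;
        if (waitpid(pids[i], &status, 0) == pids[i] && WIFEXITED(status))
            sum += WEXITSTATUS(status);
        else
            failed = 1;
    }
    return failed ? -1 : sum;
}
```

Calling spawn_and_collect(3) creates children that exit with 0, 1 and 2. The child side never needs a pid of its own to report back, which mirrors the asymmetry described above.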

3. Copy-on-write

Let's look at the following program:

#include <stdio.h>
#include <unistd.h>

int global_val = 100;

int main()
{
    pid_t id = fork();
    if(id < 0)
    {
        printf("fork error\n");
        return 1;
    }
    else if(id == 0)
    {
        int cnt = 0;
        while(1)
        {
            printf("I am the child,pid:%d,ppid:%d | global_val:%d,&global:%p\n", getpid(),getppid(),global_val,(void*)&global_val);
            sleep(1);
            cnt++;
            if(cnt == 10)
            {
                printf("The child has changed the global variable.....\n");
                global_val = 300;
            }
        }
    }
    else
    {
        while(1)
        {
            printf("I am the parent,pid:%d,ppid:%d | global_val:%d,&global:%p\n", getpid(),getppid(),global_val,(void*)&global_val);
            sleep(2);
        }
    }

    return 0;
}

The address of global_val is the same in the child and in the parent, yet the values differ. The operating system creates a process address space and a page table for every process, and the page table maps that address space onto physical memory.

After fork the parent and the child share code and data. To keep the processes independent, copy-on-write takes place as soon as one of them needs to modify data: the operating system allocates new space in physical memory, copies the data from the original space into it, updates the mapping, and only then lets the process modify the data.

So the two global_val addresses only look the same. They are the same virtual address, but they map to different physical addresses, which is why the parent and the child see different values. The same holds for the variable id that receives the return value of fork: the process that returns first triggers copy-on-write on id, so the parent and the child end up with different id values.
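The effect can also be checked without watching printed addresses. The following sketch, with the made-up names demo_val and cow_demo, lets the child overwrite a global and report the value it sees through its exit code, while the parent reads its own copy after the child has finished. It assumes the value fits in the 8 bits an exit code can carry.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int demo_val = 100;  /* plays the role of global_val above */

/* The child overwrites demo_val and reports what it sees via its exit code.
 * On success returns 0 and stores both views; returns -1 on error. */
int cow_demo(int *child_view, int *parent_view)
{
    demo_val = 100;
    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0) {
        demo_val = 77;       /* this write lands in the child's private copy */
        _exit(demo_val);
    }

    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;

    *child_view = WEXITSTATUS(status);  /* value seen by the child */
    *parent_view = demo_val;            /* the parent's copy is untouched */
    return 0;
}
```

After a successful call the child's view is 77 while the parent's view is still 100, even though both used the same virtual address.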


4. Common uses of fork

The fork function is generally used in the following two scenarios:

1. A parent process wants to copy itself so that the parent and child processes execute different code segments at the same time. For example, a parent process waits for a client request and spawns a child process to handle the request.

2. A process wants to execute a different program. For example, after the child process returns from fork, it calls the exec function

5. Reasons for failure of fork call

There are two reasons why the fork function call may fail:

1. There are too many processes in the system

2. The number of processes owned by the real user exceeds the limit

We can write a program that creates processes in an infinite loop to test how many processes our current operating system can create:

#include <stdio.h>
#include <unistd.h>

int main()
{
    int cnt = 0;
    while(1)
    {
        int ret = fork();
        if(ret < 0){
            printf("fork error!, cnt: %d\n", cnt);
            break;
        }
        else if(ret == 0){
            // child
            while(1) sleep(1);
        }
        // parent
        cnt++;
    }
    return 0;
}

The above code may cause the server or virtual machine to hang directly, so I will not test it here. Interested friends can test it.
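A less disruptive way to learn the second limit is to ask the kernel for it. The sketch below reads the per-user process limit with getrlimit and RLIMIT_NPROC, which is available on Linux; the function name user_process_limit is made up for this illustration. In bash, ulimit -u reports the same soft limit.

```c
#include <sys/resource.h>

/* Stores the soft per-user process limit in *limit and returns 0,
 * or returns -1 if the limit cannot be read.
 * *unlimited is set to 1 when no limit is configured. */
int user_process_limit(unsigned long long *limit, int *unlimited)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NPROC, &rl) == -1)
        return -1;
    *unlimited = (rl.rlim_cur == RLIM_INFINITY);
    *limit = (unsigned long long)rl.rlim_cur;
    return 0;
}
```

When fork fails because this limit is reached, it returns -1, so checking the limit beforehand gives a rough idea of how far the loop above would get.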

2. Process termination

1. Process exit code

We run a process so that it completes some task for us, so we care how the task turned out and need a way to learn the result. This is what the process exit code is for: it indicates whether the process produced the correct result, and different exit codes indicate different outcomes. Broadly, a process ends in one of three ways:

1. The code ran to completion and the result is correct: the process returns 0

2. The code ran to completion but the result is incorrect: the process returns a nonzero value

3. The code did not run to completion because the program terminated abnormally: the exit code is meaningless

For nonzero codes, different numbers correspond to different errors. We can adopt the numbering the system already provides, or define our own mapping from exit codes to error messages. The C library function strerror prints the system's mapping from error numbers to messages (strictly speaking these are errno values, which programs may choose to reuse as exit codes):

#include <stdio.h>
#include <string.h>

int main()
{
    int i = 0;
    for(i = 0; i < 100; ++i)
    {
        printf("%d:%s\n",i,strerror(i));
    }

    return 0;
}

In the shell there is a special variable "?" that always holds the exit code of the most recently finished command. We can view it with "echo $?".

When we run "echo $?" a second time, it prints 0. That is because echo is itself an executable command: the first "echo $?" ran successfully, so when we check $? again it holds echo's own exit code, 0.

2. Process exit scenario

There are three scenarios when a process exits:

1. The code has been run and the result is correct – the exit code is 0 at this time

2. The code runs but the result is wrong - the exit code is non-0 at this time

3. The code terminates abnormally – the exit code is meaningless at this time

3. Common exit methods for processes

There are three ways to exit a process:

1. Return from the main function

2. Call exit to terminate the program

3. Call _exit to terminate the program

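Most often we exit a program by returning from main, but we can also terminate it directly with the library function exit or the system call _exit.

The exit library function

Header file: stdlib.h
Function prototype: void exit(int status);

status: defines the termination status of the process; the parent obtains this value through wait

Purpose: terminates the program

exit terminates the process immediately, whether or not the rest of the code has run.

The _exit system call

Header file: unistd.h

Function prototype: void _exit(int status);

status: defines the termination status of the process; the parent obtains this value through wait

Purpose: terminates the process

[Note]

Although status is an int, only its low 8 bits are available to the parent. So after _exit(-1), running echo $? in the terminal shows 255.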
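The 8-bit truncation is easy to observe from a parent process. In this sketch (the function name observed_exit_code is made up for illustration) a child calls _exit with an arbitrary int and the parent reports the exit code it actually receives:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that calls _exit(code) and returns the exit code
 * the parent actually observes, or -1 on error. */
int observed_exit_code(int code)
{
    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0)
        _exit(code);

    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);   /* only the low 8 bits of code survive */
}
```

With this helper, a child that exits with -1 is seen as 255 and a child that exits with 256 is seen as 0, which is why exit codes are normally kept in the range 0 to 255.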

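Difference between exit and _exit

exit terminates the process and flushes the buffers first; _exit terminates the process without flushing them.

The following example illustrates this:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main()
{
    printf("process is running");
    exit(1);
    // _exit(1);
    printf("process is running done\n");
    return 0;
}

exit flushes the buffer before terminating the process and _exit does not. The analysis is as follows:

1. exit is a C library function and _exit is a system call, so exit is built on top of _exit: exit is a wrapper around _exit.

2. Output from printf is first written to a buffer and only reaches the display when the buffer is flushed. The program above has no "\n" to trigger a flush. With exit, "process is running" is still printed; with _exit, nothing is printed. So exit flushes the buffer as part of terminating the program, and _exit does not.

3. Since exit ends up calling _exit, and _exit does not flush the buffer, the buffer cannot live inside the operating system. It is in user space.

A process can also exit abnormally, for example when it is terminated with Ctrl+C, or when the program divides by zero or dereferences a wild or null pointer.

exit eventually calls _exit, but before that it does some additional work:

1. Runs the cleanup functions the user registered with atexit or on_exit.

2. Closes all open streams, writing out any buffered data.

3. Calls _exit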
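The flushing difference can be measured instead of eyeballed. In the sketch below (the function name bytes_after_exit is made up for illustration) the child redirects its stdout to a pipe, prints a string without a newline, and then terminates with either exit or _exit; the parent counts how many bytes actually arrive. It assumes stdout has its default buffering, that is, nobody has switched it to unbuffered mode.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child points stdout at a pipe, prints without a newline, then
 * terminates with exit() or _exit(). Returns the number of bytes that
 * reached the pipe, or -1 on error. */
long bytes_after_exit(int use_library_exit)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    fflush(stdout);               /* do not let the child inherit pending output */
    pid_t id = fork();
    if (id == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (id == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        printf("process is running");   /* stays in the user-space buffer */
        if (use_library_exit)
            exit(0);              /* flushes stdio buffers, then terminates */
        _exit(0);                 /* buffered text is discarded */
    }

    close(fds[1]);
    long total = 0;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(id, NULL, 0);
    return total;
}
```

With exit the parent receives the text, and with _exit it receives nothing, which matches the three steps listed above.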


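3. Process waiting

1. Why wait for a process

There are several reasons to wait for a process:

We create a process so that it completes some task for us, and a process that performs a task should report the result before it ends so that its parent or the operating system can read it. For that reason a process cannot release all of its resources the moment it exits. The operating system can free its code and data, since the process will never run again, but its PCB has to be kept, because the PCB stores the state of the process, including its exit status. If a child exits and the parent never reads its exit status, the child stays a zombie. A zombie is untouchable: even kill -9 is powerless, because nobody can kill a process that is already dead. The leftover PCB is therefore a memory leak.

So the parent needs to wait for the child. Waiting retrieves the child's exit information and lets the operating system reclaim the child's remaining resources (free its PCB).

The essence of process waiting

The child's exit information is stored in the child's task_struct, so waiting for a process essentially means reading the exit information from the child's task_struct and saving it into the corresponding variable.

2. How to wait for a process

1. The wait method

We can wait for a process with the wait system call.

Header files: sys/types.h  sys/wait.h

Function prototype: pid_t wait(int* status);

status: output parameter that receives the child's exit status

Return value: the pid of the child that was waited for on success, -1 on failure

The following example shows how to use wait: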

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        // child
        int cnt = 5;
        while(cnt)
        {
            printf("I am the child: %d, parent: %d, cnt: %d\n", getpid(), getppid(), cnt--);
            sleep(1);
        }
        exit(0); // the child exits
    }
    sleep(15);
    int status = 0;
    pid_t ret = wait(&status);
    if(ret > 0)
    {
        printf("wait success: %d, sig number: %d, child exit code: %d\n", ret, (status & 0x7F), (status>>8)&0xFF);
    }

    sleep(5);
    return 0;
}

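We can use a monitoring script to watch how the child's state changes from creation, through termination, to being reaped by the parent:

while :; do ps axj | head -1 && ps axj | grep mytest | grep -v grep; sleep 1; done

At first both the parent and the child are sleeping. The child finishes after 5 s, but the parent still has 10 s of sleep left and has not reaped it, so the child becomes a zombie. After those 10 s the parent calls wait, and the child goes from the zombie state to the dead state.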

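2. The waitpid method

We can also wait for a process with waitpid.

Header files: sys/types.h sys/wait.h

Function prototype: pid_t waitpid(pid_t pid, int* status, int options);

pid: pid = -1 waits for any child, which is equivalent to wait; pid > 0 waits for the child whose process ID equals pid

status: output parameter that receives the child's exit status; pass NULL if you do not care about it

options: how to wait; options = 0 means blocking wait, options = WNOHANG means non-blocking wait

Return value: on success waitpid returns the pid of the child that was waited for; if WNOHANG is set and no exited child is available, it returns 0; on failure it returns -1

The following example shows how to use waitpid: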

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>


int main()
{
    
    
    pid_t id = fork();
    if(id == 0)
    {
    
    
        //子进程
        int cnt = 5;
        while(cnt)
        {
    
    
            printf("我是子进程: %d, 父进程: %d, cnt: %d\n", getpid(), getppid(), cnt--);
            sleep(1);
        }
        exit(12); //进程退出
    }
    // 父进程
    sleep(10);
    int status = 0; // 不是被整体使用的，有自己的位图结构
    pid_t ret = waitpid(id, &status, 0);
    if(ret > 0)
    {
    
    
        printf("wait success: %d, sig number: %d, child exit code: %d\n", ret, (status & 0x7F), (status>>8)&0xFF);
    }

    sleep(5);
    return 0;
}

Insert image description here

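As we can see, waitpid differs from wait in important ways: it can take an id to wait for one specific child, and it can take options to choose how to wait.

[Summary]

pid_t waitpid(pid_t pid, int *status, int options);

Return value:

On normal return, waitpid returns the process ID of the child it collected;

If the WNOHANG option is set and waitpid finds no exited child to collect, it returns 0;

If an error occurs, it returns -1 and errno is set to indicate the error;

Parameters:

pid:

pid = -1: wait for any child. Equivalent to wait.

pid > 0: wait for the child whose process ID equals pid.

status:

WIFEXITED(status): true if the child terminated normally. (Checks whether the process exited normally.)

WEXITSTATUS(status): if WIFEXITED is nonzero, extracts the child's exit code. (Reads the exit code of the process.)

options:

WNOHANG: if the child specified by pid has not finished, waitpid() returns 0 without waiting. If it has finished, waitpid() returns the child's ID.

If the child has already exited when wait/waitpid is called, the call returns immediately, releases the child's resources, and obtains its exit information.

If wait/waitpid is called while the child exists and is still running, the caller may block.

If no such child exists, the call fails and returns immediately.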

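3. Obtaining the child's status

In the programs above we printed the sig number and the child exit code as (status & 0x7F) and (status>>8)&0xFF. This follows from the bitmap layout of status.

Both wait and waitpid take a status parameter. It is an output parameter filled in by the operating system. If NULL is passed, the caller does not care about the child's exit status; otherwise the operating system uses it to report the child's exit information to the parent.

status should not be treated as a plain integer but as a bitmap, and only its low 16 bits matter here.

The low two bytes of status are divided into parts: in the low byte, the low seven bits hold the terminating signal and the top bit is the core dump flag; the second byte holds the exit status, that is, the exit code of the process.

For a process that exits normally, the signal and the core dump flag are both 0 and the exit status equals the exit code. For a process that terminates abnormally, the signal field holds the number of the signal that ended it, and the exit status is meaningless.

So status is read correctly like this:

printf("exit signal:%d,exit code:%d\n",(status & 0x7f),(status>>8 & 0xff));

Here, status & 0x7f keeps the low seven bits and clears the rest, which yields the terminating signal.

Shifting status right by 8 bits yields the exit status; the & 0xff afterwards guards against the high bits being filled with 1 during the shift.

The WIFEXITED and WEXITSTATUS macros

Linux provides the WIFEXITED and WEXITSTATUS macros so that we can check how the child ended and read its exit code without doing the bit operations ourselves.

WIFEXITED(status): true if the child terminated normally. (Checks whether the process exited normally.)

WEXITSTATUS(status): if WIFEXITED is nonzero, extracts the child's exit code. (Reads the exit code of the process.)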
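As a sketch of how the macros replace the manual bit operations, the helper below decodes a child's status into a small struct. It also uses WIFSIGNALED and WTERMSIG, the standard companions of the two macros above for children killed by a signal. The struct child_outcome and the function run_child are made-up names for this illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct child_outcome {
    int exited;    /* 1 if the child terminated normally */
    int code;      /* exit code, valid only when exited == 1 */
    int signaled;  /* 1 if the child was killed by a signal */
    int sig;       /* signal number, valid only when signaled == 1 */
};

/* If kill_signal is nonzero the child raises it; otherwise it exits
 * with exit_code. The parent decodes the resulting status. */
struct child_outcome run_child(int exit_code, int kill_signal)
{
    struct child_outcome out = {0, 0, 0, 0};
    pid_t id = fork();
    if (id == -1)
        return out;
    if (id == 0) {
        if (kill_signal != 0)
            raise(kill_signal);
        _exit(exit_code);
    }

    int status = 0;
    if (waitpid(id, &status, 0) != id)
        return out;
    if (WIFEXITED(status)) {
        out.exited = 1;
        out.code = WEXITSTATUS(status);
    } else if (WIFSIGNALED(status)) {
        out.signaled = 1;
        out.sig = WTERMSIG(status);
    }
    return out;
}
```

A child that exits with 12 yields exited = 1 and code = 12, while a child killed by SIGKILL yields signaled = 1 and sig = SIGKILL, in which case the exit code field should be ignored.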

4. Blocking and non-blocking waiting

The third parameter of waitpid specifies how the parent waits.

options = 0 means blocking wait, and options = WNOHANG means non-blocking wait.

Blocking wait means that if the child has not exited when the parent calls waitpid, the parent blocks inside waitpid until the child exits. Only after waitpid has read the exit information can the parent go on to the code that follows.

Non-blocking wait means that waitpid checks the child's state and returns immediately, so the parent continues with the following code whether or not the child has exited.

Polling

Polling means that the parent, using non-blocking wait, checks on the child repeatedly in a loop and leaves the loop only once the child has exited.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>

#define NUM 10

typedef void (*func_t)(); // function pointer

func_t handlerTask[NUM];

// sample tasks
void task1()
{
    printf("handler task1\n");
}
void task2()
{
    printf("handler task2\n");
}
void task3()
{
    printf("handler task3\n");
}

void loadTask()
{
    memset(handlerTask, 0, sizeof(handlerTask));
    handlerTask[0] = task1;
    handlerTask[1] = task2;
    handlerTask[2] = task3;
}

int main()
{
    pid_t id = fork();
    // if fork returned -1, abort via assert
    assert(id != -1);
    if(id == 0)
    {
        // child
        int cnt = 10;
        while(cnt)
        {
            printf("child running, pid: %d, ppid: %d, cnt: %d\n", getpid(), getppid(), cnt--);
            sleep(1);
        }

        exit(10);
    }

    loadTask();
    // parent
    int status = 0;
    while(1)
    {
        pid_t ret = waitpid(id, &status, WNOHANG); // WNOHANG: non-blocking, returns immediately if the child has not exited
        if(ret == 0)
        {
            // waitpid succeeded && the child has not exited
            // the wait did not fail; it only detected that the child is still running
            printf("wait done, but child is running...., parent running other things\n");
            for(int i = 0; i < NUM && handlerTask[i] != NULL; i++)
            {
                handlerTask[i](); // callbacks: work the parent does while it is idle
            }
        }
        else if(ret > 0)
        {
            // waitpid succeeded && the child has exited
            printf("wait success, exit code: %d, sig: %d\n", (status>>8)&0xFF, status & 0x7F);
            break;
        }
        else
        {
            // waitpid failed
            printf("waitpid call failed\n");
            break;
        }
        sleep(1);
    }
    return 0;
}

5. Process waiting summary

1. We wait for a process in order to read the child's exit result and reclaim its resources.

2. The essence of process waiting is that the exit information is read from the child's task_struct and saved into status.

3. We obtain the exit information, and thereby complete the wait, through the wait and waitpid system calls.

4. The status parameter is an output parameter: through wait/waitpid, the operating system writes the child's exit information into status for the parent.

5. status is stored as a bitmap containing the exit status and the terminating signal. If the signal is not 0, the exit status is meaningless.

6. We can use the system-provided macros WIFEXITED and WEXITSTATUS to check whether the child exited normally and to read its exit status.

7. There are two waiting modes, blocking and non-blocking. Blocking wait is selected with 0 and non-blocking wait with the macro WNOHANG.

8. Since a non-blocking wait does not wait for the child to exit, we need polling to keep checking for the child's exit information.

4. Process program replacement

1. The purpose of creating a child process

Creating a child process serves two purposes:

1. We want the child to run part of the parent's code, that is, part of the code of the same on-disk program the parent was loaded from.

2. We want the child to run a completely new program: the child loads the specified program from disk and executes that program's code and data.

2. What is process program replacement?

The second purpose, having the child run a different program, is exactly program replacement.

Process program replacement means that after the parent creates a child with fork, the child runs another program by calling one of the exec functions. When a process calls an exec function, its user-space code and data are completely replaced by those of the new program, and the new program then starts running.

The process keeps its task_struct and its pid; what changes is the mapping in its page table. So calling an exec function does not create a new process. It makes the existing process run the code and data of a new program.

3. Principle of process program replacement

A child created with fork runs the same program as its parent (possibly a different branch of the code), so the child often calls an exec function to run another program. When a process calls an exec function, its user-space code and data are completely replaced by the new program, which starts executing from its startup routine. Calling exec does not create a new process, so the process ID is the same before and after the call. In other words, process program replacement swaps the code and data of the original process in physical memory for the code and data of the new program.


4. How to replace process program

(1) Replacement function

Linux provides a family of exec functions for process program replacement: six library functions and one system call.

Only one of them is a system call, execve. The other exec functions are all wrappers around execve that cover different calling scenarios; underneath, they still call execve.

The six library functions are as follows

#include <unistd.h>

extern char **environ;

int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg,..., char * const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[],char *const envp[]);

Once one of these functions succeeds, the code and data of the original program have been replaced by the new program, so the statements that follow the call in the original program never run. That is why there is no return value on success: nothing is left to receive it. The return value only matters when the exec call fails and the original program continues to run.

[Summary] If the call succeeds, the new program is loaded and starts executing from its startup code, and the call does not return. If the call fails, it returns -1. So the exec functions have an error return value but no success return value.
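Because exec only ever returns on failure, the code after the call can be written as an unconditional error path. The sketch below shows that pattern; the function name run_program is made up for illustration, and the exit code 127 for a failed exec is borrowed from the shell convention for "command not found", not something the exec functions require.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] (searched in PATH) in a child and returns its exit code.
 * Returns 127 if exec failed and -1 on fork/wait errors. */
int run_program(char *const argv[])
{
    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0) {
        execvp(argv[0], argv);
        /* reached only if execvp failed, so no need to test its return value */
        _exit(127);
    }

    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent cannot tell from the return value of exec whether replacement succeeded, because exec runs in the child; it learns the outcome through the child's exit code.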

(2) Understanding function naming

These function prototypes seem easy to confuse, but once you master the rules, they are easy to remember.

l (list): the arguments are passed as a list

v (vector): the arguments are passed as an array

p (path): the function searches the directories in the PATH environment variable for the file, so we can omit the path when the program is one of the usual Linux commands

e (env): we supply the environment variables ourselves

(3)Use of functions

To execute a program we first have to find the executable, and then we have to say how to run it. In the exec functions, the presence or absence of 'p' determines how the program is found, 'l' and 'v' determine how the arguments are passed, and 'e' specifies the environment variables.

execl && execlp

The exec functions are simple to use. The first parameter is the path of the program we want to run. If the program is in a directory listed in PATH and the exec function has a "p" in its name, we can write just the program name without the path.

Take the Linux command "ls" as an example. ls is an executable in the "/usr/bin" directory, and that directory is in the PATH environment variable. So to run it, the first parameter of the exec function is:

execl("/usr/bin/ls",...); // the path is required
execlp("ls",..); // the path can be omitted

Note that an exec function with "p" can omit the path only if the program is in a directory listed in PATH. If it is not, we still have to give the path.

The remaining parameters say how to run the program, and the rule is simple: pass the arguments exactly as you would on the command line. On the command line the command and its options form one string separated by spaces; for the exec functions we split it so that every option is its own string, which is why these functions take a variable argument list "...". The last variable argument must be NULL to mark the end of the arguments.

// pass the arguments exactly as on the command line:  ls -a -l
execl("/usr/bin/ls","ls","-a","-l",NULL);
execlp("ls","ls","-a","-l",NULL);

Note that in Linux ls is usually set up as an alias with the alias command, so running ls in the shell includes the "--color=auto" option by default, which shows different file types in different colors.

So if we want colored output when we run ls through process program replacement, we have to pass the "--color=auto" option explicitly:

execl("/usr/bin/ls","ls","-a","-l","--color=auto",NULL);
execlp("ls","ls","-a","-l","--color=auto",NULL);

Below we use a specific example to demonstrate how to perform process program replacement:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t id = fork();
    if(id == -1)
    {
        perror("fork");
        exit(1);
    }
    else if(id == 0)
    {
        printf("pid:%d,child process running..\n",getpid());
        int ret = execl("/usr/bin/ls","ls","-a","-l","--color=auto",NULL);

        // execl returns only on failure
        if(ret == -1)
        {
            printf("process exec fail\n");
            exit(1);
        }
        printf("pid:%d,child process done..\n",getpid());
        return 0;
    }

    int status;
    pid_t ret = waitpid(id, &status, 0);
    if(ret == -1)
    {
        perror("waitpid");
        return 1;
    }
    else
    {
        printf("wait success: exit code: %d, sig: %d\n", (status>>8)&0xFF, status & 0x7F);
    }

    return 0;
}

We can see that when we use "ls -a -l" on the command line, we get the same result as when we use process program replacement.

execv && execvp

The "v" in the function name means the arguments are passed as an array: argv is an array of pointers, each pointing to one argument (a string). As before, the last element must be NULL to mark the end of the arguments.

Let's use the ls command again:

char* argv[]={
    (char*)"ls",
    (char*)"-a",
    (char*)"-l",
    (char*)"--color=auto",
    NULL
};
execv("/usr/bin/ls",argv);
execvp("ls",argv);

String literals such as "ls" and "-a" are constant strings, while the elements of argv have type char* rather than const char* (the parameter is declared char *const argv[]), so we cast them here. Leaving the cast out is not a big problem either.

execle && execvpe

The "e" in the function name stands for environment variables. Like argv, envp is an array of pointers, each pointing to one environment variable (a string) and ending with NULL. We can initialize envp explicitly to pass our own environment variables, but that also means giving up the system's default environment variables.

char *const envp_[] = {
    (char*)"MYENV=11112222233334444",
    NULL
};
execle("./mybin","mybin",NULL,envp_);

mybin.c

#include <stdio.h>
#include <stdlib.h>

int main()
{
    // provided by the system
    printf("PATH:%s\n", getenv("PATH"));
    printf("PWD:%s\n", getenv("PWD"));
    // custom
    printf("MYENV:%s\n", getenv("MYENV"));

    printf("I am another C program\n");
    printf("I am another C program\n");
    printf("I am another C program\n");
    printf("I am another C program\n");
    printf("I am another C program\n");
    printf("I am another C program\n");
    printf("I am another C program\n");
    printf("I am another C program\n");
    printf("I am another C program\n");

    return 0;
}

myexec.c

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    // custom environment table, as defined above
    char *const envp_[] = {
        (char*)"MYENV=11112222233334444",
        NULL
    };

    printf("process is running...\n");
    pid_t id  = fork();
    assert(id != -1);

    if(id == 0)
    {
        execle("./mybin", "mybin", NULL, envp_); // custom environment variables
        exit(1); // only reached if exec failed
    }

    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if(ret>0) printf("wait success: exit code: %d, sig: %d\n", (status>>8)&0xFF, status & 0x7F);
    return 0;
}

Here only the custom environment variable MYENV is obtained; the system environment variables PATH and PWD are not.

We can get the system's environment variables by passing environ:

if(id == 0)
{
    extern char** environ;
    execle("./mybin", "mybin", NULL, environ); // pass the system environment variables
    exit(1); // only reached if exec failed
}

But now the custom environment variable is missing. To get the custom and the system environment variables at the same time, we can use putenv to add the custom variable to the system environment, and then pass environ:

putenv((char*)"MYENV=4443332211"); // adds the variable to the environment table that environ points to
execle("./mybin", "mybin", NULL, environ); // in fact the child gets the default environment even if we do not pass it explicitly
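The two variants can be compared directly. In the sketch below the child runs a small /bin/sh test that succeeds only if both a custom variable and an inherited variable are visible after exec. The names both_visible and INHERITED_DEMO are made up for this illustration; INHERITED_DEMO stands in for system variables such as PATH, and the function sets it in the caller's environment as a side effect.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Runs a shell test in a child and returns its exit code:
 * 0 if both MYENV and INHERITED_DEMO are visible after exec. */
int both_visible(int merge_into_environ)
{
    char *const custom_only[] = { (char *)"MYENV=1122", NULL };
    const char *check = "[ -n \"$MYENV\" ] && [ -n \"$INHERITED_DEMO\" ]";

    if (setenv("INHERITED_DEMO", "1", 1) == -1)  /* stands in for PATH, PWD, ... */
        return -1;

    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0) {
        if (merge_into_environ) {
            putenv((char *)"MYENV=1122");   /* add to the inherited table */
            execle("/bin/sh", "sh", "-c", check, (char *)NULL, environ);
        } else {
            execle("/bin/sh", "sh", "-c", check, (char *)NULL, custom_only);
        }
        _exit(127);                         /* exec failed */
    }

    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With merge_into_environ set, the shell test succeeds because putenv extended the table that environ points to; with the custom-only table, the inherited variable is gone and the test fails.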

5. Implement a simple shell

1.Initial implementation of shell

To implement a simple command line interpreter, we can divide the work into the following steps:

1. Print the prompt, that is, the text shown to the left of where we normally type commands.

2. Read the command line from the terminal.

3. Parse the input command.

4. Create a child process.

5. Process program replacement.

6. Process waiting.

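#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>

#define NUM 1024 // maximum length of one command line
#define OPT_NUM 64  // maximum number of arguments in one command

char lineCommand[NUM];  // buffer that holds the input command line
char* myargv[OPT_NUM];  // array that holds the parsed arguments

int main()
{
    while(1)
    {
        // Print the prompt
        printf("[username@hostname current-path]$");
        fflush(stdout);

        // Read one command line from the keyboard (stdin); fgets always leaves room for the trailing \0
        char* s = fgets(lineCommand,sizeof(lineCommand),stdin);
        if(s == NULL)
        {
            // End of input (Ctrl-D) or a read error: leave the shell
            if(ferror(stdin)) perror("fgets");
            break;
        }

        // Remove the trailing newline from the command line
        lineCommand[strcspn(lineCommand,"\n")] = '\0';

        // Split the input string into substrings stored in myargv
        myargv[0] = strtok(lineCommand," ");
        if(myargv[0] == NULL) continue; // empty line: print the prompt again
        int i = 1;
        while(i < OPT_NUM - 1 && (myargv[i] = strtok(NULL," ")) != NULL) i++;
        myargv[i] = NULL; // execvp requires a NULL-terminated array

        // Create the child process
        pid_t id = fork();

        if(id == -1)
        {
            perror("fork");
            exit(1);
        }
        else if(id == 0)
        {
            // The child performs process program replacement
            execvp(myargv[0],myargv);
            perror("execvp"); // reached only if the replacement fails
            exit(1);
        }
        else
        {
            // The parent waits for the child
            int status = 0;
            pid_t ret = waitpid(id, &status,0);

            if(ret == -1)
            {
                perror("waitpid");
                exit(1);
            }
        }
    }

    return 0;
}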

With this, our shell can already run basic Linux commands. However, ls prints its output without colors (in an ordinary shell, ls is usually an alias that adds the color option). We can special-case the ls command in the program and manually add the "--color=auto" option:

if(myargv[0] != NULL && strcmp(myargv[0],"ls") == 0)
{
    myargv[i++] = (char*)"--color=auto"; // i is the index of the next free slot in myargv
}

2. What is the current path?

We will find a problem when running our above program. When we use cd to change the path, the pwd command still displays our original path.

Before we solve this problem, we need to understand what the current path is:

After the test program is started, its directory /proc/<pid> contains two links of interest. exe points to the test executable on disk, and cwd (current working directory) points to the working directory of the process, which is what we usually call the current path.

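In Linux, we can use the chdir system call to change the working directory of a process. It is declared in <unistd.h> as int chdir(const char *path) and returns 0 on success and -1 on error.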

After understanding what is the working directory of the current process, we can explain why the directory will not change after our shell executes the cd command.

Myshell executes each command on the command line by creating a child process. In other words, the cd command is executed by the child process, so only the working directory of the child process is changed; the working directory of the parent process does not change.

When we then use the pwd command to view the current path, the child process that ran cd has already finished and exited. Myshell creates a new child process for pwd, and the working directory of this child is inherited from the parent process, so the working directory printed by pwd does not change.

If we want to solve this problem, we need to use chdir to change the working directory of the parent process to the specified directory, so here we need to make separate judgments on the instructions.

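// cd: change the working directory of the parent process (the shell itself)
if(myargv[0] != NULL && strcmp(myargv[0],"cd") == 0)
{
    if(myargv[1] != NULL)
    {
        // myargv[1] holds the target path
        if(chdir(myargv[1]) == -1) perror("cd");
    }
    continue;  // nothing below needs to run, because cd has already done its job
}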

3. Built-in commands/external commands

There are two types of commands in Linux: built-in commands and external commands.

A built-in command is part of the shell program itself; its implementation lives in the bash source code. It does not need to spawn a child process and does not rely on an external program file. Instead, it is carried out by the internal logic of the shell process. An external command, by contrast, is executed by creating a child process and then performing process program replacement with an external program file.

We can use the type command to distinguish built-in commands in Linux from external commands

We handle cd as a built-in command: when myshell encounters cd, it changes its own working directory and then continues with the next command instead of creating a child process. pwd, however, is not handled as a built-in command in our shell.

We also find that echo is a built-in command. This explains why "echo $variable" can show local variables and why "echo $?" can show the exit code of the most recent process:

Local variables are only valid in the current process. When echo is used to view a local variable, the shell does not create a child process but looks the variable up in its own process, so the local variable is visible.

The shell obtains the exit status of a child process by waiting for it and saves the result in the variable ?. When "echo $?" is entered on the command line, the shell prints the content of ? directly, without creating a child process, and then sets ? to 0 (because echo itself exited normally with exit code 0).

So we can also add the echo command to our shell program:

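// Declare these outside the loop (they are globals in the complete code)
int  lastCode = 0; // exit code of the most recent child
int  lastSig = 0; // signal that terminated the most recent child
if(myargv[0] != NULL && myargv[1] != NULL && strcmp(myargv[0], "echo") == 0)
{
    if(strcmp(myargv[1], "$?") == 0)
    {
        printf("%d, %d\n", lastCode, lastSig);
    }
    else
    {
        printf("%s\n", myargv[1]);
    }
    continue;
}

// Added in the parent after waitpid returns
lastCode = ((status >> 8) & 0xff);
lastSig = (status & 0x7f);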

4.shell complete code

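#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>

#define NUM 1024 // maximum length of one command line
#define OPT_NUM 64  // maximum number of arguments in one command

char lineCommand[NUM];  // buffer that holds the input command line
char* myargv[OPT_NUM];  // array that holds the parsed arguments

int  lastCode = 0; // exit code of the most recent child
int  lastSig = 0;  // signal that terminated the most recent child

int main()
{
    while(1)
    {
        // Print the prompt
        printf("[username@hostname current-path]$");
        fflush(stdout);

        // Read one command line from the keyboard (stdin), including the \n;
        // fgets always leaves room for the trailing \0
        char* s = fgets(lineCommand,sizeof(lineCommand),stdin);
        if(s == NULL)
        {
            // End of input (Ctrl-D) or a read error: leave the shell
            if(ferror(stdin)) perror("fgets");
            break;
        }
        // Remove the trailing newline from the command line
        lineCommand[strcspn(lineCommand,"\n")] = '\0';

        // Split the input string into substrings stored in myargv
        // "ls -a -l -i" -> "ls" "-a" "-l" "-i"
        myargv[0] = strtok(lineCommand," ");
        if(myargv[0] == NULL) continue; // empty line: print the prompt again
        int i = 1;
        if(strcmp(myargv[0],"ls") == 0)
        {
            myargv[i++] = (char*)"--color=auto";
        }
        // When no substring is left, strtok returns NULL and the array ends with NULL
        while(i < OPT_NUM - 1 && (myargv[i] = strtok(NULL," ")) != NULL) i++;
        myargv[i] = NULL;

        // cd: change the working directory of the parent process
        // cd does not need a child process; the shell runs it itself by calling a system interface.
        // Commands that the shell executes itself instead of in a child are built-in commands.
        if(strcmp(myargv[0],"cd") == 0)
        {
            if(myargv[1] != NULL)
            {
                // myargv[1] holds the target path
                if(chdir(myargv[1]) == -1) perror("cd");
            }
            continue;  // nothing below needs to run, because cd has already done its job
        }

        if(myargv[1] != NULL && strcmp(myargv[0], "echo") == 0)
        {
            if(strcmp(myargv[1], "$?") == 0)
            {
                printf("%d, %d\n", lastCode, lastSig);
            }
            else
            {
                printf("%s\n", myargv[1]);
            }
            continue;
        }

        // Create the child process
        pid_t id = fork();

        if(id == -1)
        {
            perror("fork");
            exit(1);
        }
        else if(id == 0)
        {
            // The child performs process program replacement
            execvp(myargv[0],myargv);
            perror("execvp"); // reached only if the replacement fails
            exit(1);
        }
        else
        {
            // The parent waits for the child and records how it ended
            int status = 0;
            pid_t ret = waitpid(id, &status,0);

            if(ret == -1)
            {
                perror("waitpid");
                exit(1);
            }
            lastCode = ((status >> 8) & 0xff);
            lastSig = (status & 0x7f);
        }
    }

    return 0;
}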