【Linux】Process control, process replacement

1. Process creation

First introduction to fork function

The fork function is a very important function in Linux. It creates a new process from an existing process. The new process is the child process, and the original process is the parent process.

#include <unistd.h>
pid_t fork(void);
Return value: 0 in the child process, the child's PID in the parent process, and -1 on error.

When a process calls fork, control is transferred to the fork code in the kernel, which does the following:

Allocate new memory blocks and kernel data structures for the child process

Copy part of the parent process's data structure contents to the child process

Add the child process to the system process list

Return from fork and let the scheduler start scheduling

After a process calls fork, there are two processes running the same binary code, and both resume from the same point. From there, each process goes its own way.

Before fork, the parent process executes alone. After fork, the parent and the child run as two separate execution flows.

Note that which of the two runs first after fork is decided entirely by the scheduler.

fork function return value

The child process returns 0, and
the parent process returns the pid of the child process.

Why does the fork function return 0 to the child process and the PID of the child process to the parent process?

The reason is simple. A father can have many children, but a child has only one father. Likewise, a parent process can create many child processes, while each child process has exactly one parent. The child therefore does not need fork to identify its parent (it can call getppid), but the parent may have to manage several children, so it needs each child's PID in order to operate on that child later.

Why does the fork function have two return values?

When the parent process calls fork, the function performs a series of operations internally, including but not limited to creating the child's process control block (task_struct), its process address space (mm_struct), and its page tables. In other words, the child process already exists before fork reaches its return statement, so that return statement is executed by both the parent and the child. This is why fork appears to have two return values.

copy on write

Usually, the parent and the child share their code. As long as neither of them writes, the data is shared as well. When either one tries to write, it gets its own copy through copy-on-write. See the figure below for details:

Why does data need to be copied on write?

Independence is one of the defining characteristics of processes. Without copy-on-write, a modification made in one process would affect the data seen by the other process. Copy-on-write is what preserves the independence of each process.

Why not copy the data when creating the child process?

In many cases the child process does not use all of the parent's data, and if the child only reads the parent's data there is no need to copy it at all. The right approach is to allocate on demand, copying only when data actually has to be modified, so that memory is used efficiently.

Will the code be copied on write?

In most cases the code does not need to be copied, because it is not modified. That is not guaranteed, though: during process replacement, for example, the code is replaced as well, so it has to be separated from the parent's copy.

Common usage of fork

A parent process wants to duplicate itself so that the parent and child processes execute different code segments at the same time. For example, a parent process waits for a client request and spawns a child process to handle the request.

A process wants to execute a different program. For example, after the child process returns from fork, it calls the exec function.

Reasons why a fork call fails

Usually nothing goes wrong, unless:

There are too many processes in the system

The number of processes owned by the real user exceeds the limit

2. Process termination

Process exit scenario

The code runs to completion and the result is correct

The code runs to completion and the result is incorrect

The code terminates abnormally

Common exit methods for processes

Normal termination (you can check the process exit code through echo $?):

Return from main

Call exit

Call _exit (system call)

Abnormal termination:

Ctrl+C, or termination by any other signal

_exit function

#include <unistd.h>
void _exit(int status);
Parameter: status defines the termination status of the process; the parent process obtains this value through wait.

Note: Although status is an int, only its low 8 bits are visible to the parent process. So after a program calls _exit(-1), running echo $? in the terminal prints 255.

Note: Exit codes have corresponding string meanings that help users identify why execution failed. These meanings are defined by convention, so the same exit code may map to a different string in a different environment.

The strerror function in C returns the message string that corresponds to a given error code.

The _exit function is not the usual way to exit a process. Like exit, it can terminate the process from anywhere in the code, but it terminates the process immediately and does no cleanup work beforehand.

For example, if a program prints text without a trailing newline and then terminates with _exit, the data still sitting in the buffer is never output.

exit function

#include <stdlib.h>
void exit(int status);

In fact, exit eventually calls _exit, but before doing so it performs some additional work:

Run the user-defined cleanup functions (registered with atexit or on_exit).

Close all open streams and write out all buffered data.

Call _exit.

For example, if a program prints text without a trailing newline and then calls exit, the data in the buffer is output before the process terminates.

Exiting with return

return is the more common way to exit a process. Executing return n in main is equivalent to executing exit(n), because the runtime function that calls main passes main's return value to exit.

The differences and connections between return, exit and _exit

A return statement exits the process only when it is in main; a return in any other function just returns to its caller. The exit and _exit functions terminate the process no matter where in the code they are called.

Executing return num in main is equivalent to executing exit(num), because once main returns, its return value is passed as the argument to a call to exit.

3. Process waiting

The necessity of process waiting

If a child process exits and the parent ignores it, the child can become a zombie process, which leaks memory. Moreover, once a process is in the zombie state, nothing can kill it. Not even kill -9 helps, because a process that is already dead cannot be killed.

Moreover, child processes are generally created by the parent process to carry out tasks, so the parent process needs to know whether the child finished and what the result was.

By waiting for the child, the parent process reclaims the child's resources and obtains its exit information.

Process waiting method

wait method

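#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *status);
Return value:
    On success, the PID of the child that was waited for; on failure, -1.
Parameter:
    Output parameter that receives the child's exit status; pass NULL if you do not care about it.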

waitpid method

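pid_t waitpid(pid_t pid, int *status, int options);
Return value:
    On normal return, waitpid returns the process ID of the child that was collected;
    if WNOHANG is set and waitpid finds no exited child to collect, it returns 0;
    if the call fails, it returns -1 and errno is set to indicate the error.
Parameters:
    pid:
        pid = -1: wait for any child process; equivalent to wait.
        pid > 0: wait for the child whose process ID equals pid.
    status:
        WIFEXITED(status): true if the child terminated normally. (Checks whether the process exited normally.)
        WEXITSTATUS(status): if WIFEXITED is nonzero, extracts the child's exit code. (Reads the exit code of the process.)
    options:
        WNOHANG: if the child specified by pid has not finished, waitpid() returns 0 without waiting. If it has finished, waitpid() returns that child's ID.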

If the child process has already exited when wait/waitpid is called, the call returns immediately, releases the child's resources, and obtains its exit information.

If wait/waitpid is called while the child process exists and is still running, the calling process may block.

If there is no such child process, the call returns an error immediately.

Get child process status

The two functions wait and waitpid used by the process to wait have a status parameter, which is an output parameter and is filled in by the operating system.

If you pass NULL for the status parameter, it means that you do not care about the exit status information of the child process. Otherwise, the operating system will feed back the exit information of the child process to the parent process through this parameter.

status is an integer variable, but it cannot simply be treated as a plain integer: different bits of status carry different information. Only the low 16 bits are considered here. Bits 8 to 15 hold the exit code, bits 0 to 6 hold the number of the signal that terminated the process, and bit 7 is the core dump flag.

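// In code, the two fields can be extracted like this
exitCode = (status >> 8) & 0xFF; // exit code
exitSignal = status & 0x7F;      // terminating signal

// The system also provides two macros for this.
// WIFEXITED(status): checks whether the process exited normally; in essence it checks whether a signal was received.
// WEXITSTATUS(status): extracts the exit code of the process.
exitNormal = WIFEXITED(status);  // did the process exit normally?
exitCode = WEXITSTATUS(status);  // exit code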

However, it should be noted that if a process exits abnormally, it means that the process was killed by a signal, and the corresponding exit code has no meaning. Just look at its exit signal.

// Test program for the wait function
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
  pid_t pid = fork();
  if (pid == -1) {
    perror("fork");
    exit(1);
  } else if (pid == 0) {
    // Child process
    int cnt = 5;
    while (cnt--) {
      printf("I am the child, my pid is %d, my ppid is %d\n", getpid(), getppid());
      sleep(1);
    }
    exit(2);
  } else {
    // Parent process
    int status;
    pid_t ret = wait(&status);
    if (ret == -1) {
      perror("wait");
      exit(3);
    } else {
      int exitSignal = status & 0x7F;      // terminating signal
      int exitCode = (status >> 8) & 0xFF; // exit code
      printf("exitSignal:%d exitCode:%d \n", exitSignal, exitCode);
    }
  }
  return 0;
}

We can use the following monitoring script to monitor the process in real time:

while :;do ps -axj | head -1 && ps -axj | grep test | grep -v grep ; echo "#############"; sleep 1;done

Blocking and non-blocking waiting methods

Blocking wait mode of process

After creating a child process, the parent process can call waitpid with the third parameter set to 0. The call then blocks until the child process exits, at which point the parent reads the child's exit information.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
  pid_t pid = fork();
  if (pid == -1) {
    perror("fork");
    exit(-1);
  }
  if (pid == 0) {
    // Child process
    int cnt = 5;
    while (cnt--) {
      printf("I am the child, my pid:%d, my ppid:%d\n", getpid(), getppid());
      sleep(1);
    }
    exit(1);
  }
  // Parent process
  int status;
  pid_t ret = waitpid(pid, &status, 0); // blocking wait
  if (ret == -1) {
    perror("wait");
    return 1;
  } else {
    printf("wait success,exitSignal:%d exitCode:%d\n", status & 0x7F, (status >> 8) & 0xFF);
  }

  return 0;
}

// Output
// I am the child, my pid:5776, my ppid:5775
// I am the child, my pid:5776, my ppid:5775
// I am the child, my pid:5776, my ppid:5775
// I am the child, my pid:5776, my ppid:5775
// I am the child, my pid:5776, my ppid:5775
// wait success,exitSignal:0 exitCode:1

Similarly, while the program is running, we can use kill -9 to kill the child process. The parent process still waits for the child successfully. (The exit code of a process killed by a signal is meaningless.)

Non-blocking wait mode of process

In the example above, as long as the child process has not exited, the parent process keeps waiting for it and cannot do anything else in the meantime. This kind of waiting is called blocking waiting.

The parent process does not have to wait this way. It can do work of its own while the child is still running and read the child's exit information once the child has exited. This is non-blocking waiting.

The method is simple: pass WNOHANG as the third parameter (options) of waitpid. If the child being waited for has not finished, waitpid returns 0 immediately without waiting. If the child has finished, waitpid returns the child's PID.

The parent process can call waitpid at intervals. If the child being waited for has not exited yet, the parent does some other work first and calls waitpid again a little later to read the child's exit information.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
  pid_t pid = fork();
  if (pid == -1) {
    perror("fork");
    exit(-1);
  }
  if (pid == 0) {
    // Child process
    int cnt = 5;
    while (cnt--) {
      printf("I am the child, my pid:%d, my ppid:%d\n", getpid(), getppid());
      sleep(1);
    }
    exit(1);
  }
  // Parent process
  int status;
  while (1) {
    pid_t ret = waitpid(pid, &status, WNOHANG); // non-blocking wait
    if (ret == -1) {
      perror("wait");
      return 1;
    } else if (ret == 0) {
      // Child still running: do other work, then poll again
      printf("Waiting without blocking... doing other work...\n");
      sleep(1);
    } else {
      printf("wait success,exitSignal:%d exitCode:%d\n", status & 0x7F, (status >> 8) & 0xFF);
      break;
    }
  }
  return 0;
}

4. Process program replacement

replacement principle

A child process created with fork usually runs the same program as its parent, possibly along a different branch of the code. To have the child run an entirely different program, it calls one of the exec functions.

When a process calls an exec function, the user space code and data of the process are completely replaced by the new program, and execution starts from the new program's startup routine. Calling exec does not create a new process, so the ID of the process does not change before and after calling exec.

After the child process replaces the process program, will it affect the code and data of the parent process?

No. As discussed above, independence is one of the defining characteristics of processes. When the child's code and data are replaced, copy-on-write separates them from the parent's, so the parent's code and data are left untouched.

replacement function

There are six functions whose names start with exec, collectively called the exec functions:

#include <unistd.h>
int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char *const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execve(const char *path, char *const argv[], char *const envp[]);

(1) int execl(const char *path, const char *arg, ...);

The first parameter is the path of the program to execute. The second is a variable-length argument list that specifies how the program is to be run, just as on the command line; the list must end with NULL.

For example, to execute the ls program:

execl("/usr/bin/ls", "ls", "-a", "-l", NULL);

Once an exec function succeeds, none of the code after the exec call is executed, because the code and data of the process have all been replaced.

(2) int execlp(const char *file, const char *arg, ...);

The first parameter is the name of the program to execute. The second is a variable-length argument list that specifies how the program is to be run; the list must end with NULL.

For example, to execute the ls program:

execlp("ls", "ls", "-a", "-l", NULL);

(3) int execle(const char *path, const char *arg, ..., char *const envp[]);

The first parameter is the path of the program to execute. The second is a variable-length argument list that specifies how the program is to be run and must end with NULL. The last parameter is the set of environment variables that you supply yourself.

For example, if you set the MYVAL environment variable this way, the mycmd program can use it:

char *myenvp[] = { "MYVAL=2023", NULL };
execle("./mycmd", "mycmd", NULL, myenvp);

(4) int execv(const char *path, char *const argv[]);

The first parameter is the path of the program to execute. The second is an array of pointers whose contents specify how the program is to be run; the array must end with NULL.

For example, to execute the ls program:

char *myargv[] = { "ls", "-a", "-l", NULL };
execv("/usr/bin/ls", myargv);

(5) int execvp(const char *file, char *const argv[]);

The first parameter is the name of the program to execute. The second is an array of pointers whose contents specify how the program is to be run; the array must end with NULL.

For example, to execute the ls program:

char *myargv[] = { "ls", "-a", "-l", NULL };
execvp("ls", myargv);

(6) int execve(const char *path, char *const argv[], char *const envp[]);

The first parameter is the path of the program to execute. The second is an array of pointers whose contents specify how the program is to be run; the array must end with NULL. The third parameter is the set of environment variables that you supply yourself.

For example, if you set the MYVAL environment variable this way, the mycmd program can use it:

char *myargv[] = { "mycmd", NULL };
char *myenvp[] = { "MYVAL=2023", NULL };
execve("./mycmd", myargv, myenvp);

Of all the exec functions, only execve is an actual system call:

#include <unistd.h>
int execve(const char *filename, char *const argv[],
                  char *const envp[]);

The exec family also includes execl, execlp, execle, execv, and execvp. These are library functions implemented on top of execve. They differ in how the path name, the arguments, and the environment variables are specified, and the naming rules below distinguish them along these three aspects:

Understanding the names

These function prototypes look easy to confuse, but they are easy to remember once you know the rules.

l (list): the arguments are passed as a list

v (vector): the arguments are passed as an array

p (path): the program is searched for automatically in the PATH environment variable

e (env): you supply the environment variables yourself

Function explanation

If these functions are called successfully, the specified program will be loaded and executed from the startup code without returning.
If the call fails, -1 is returned.

In other words, as long as the exec series function returns, it means that the call failed.

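Building a simple command-line interpreter (shell)

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <string.h>

#define LEN 1024 // maximum length of a command line
#define NUM 32   // maximum number of tokens in a command

int main(void)
{
  char cmd[LEN];
  char *myargv[NUM];
  const char *user = getenv("USER"); // user name
  const char *path = getenv("PWD");  // current directory
  while (1)
  {
    printf("[%s@ %s]# ", user ? user : "?", path ? path : "?");
    fflush(stdout);
    if (fgets(cmd, LEN, stdin) == NULL) break; // EOF or read error: leave the shell
    cmd[strcspn(cmd, "\n")] = '\0';            // strip the trailing newline, if any
    // Split the command line into tokens
    int i = 0;
    myargv[i] = strtok(cmd, " ");
    while (myargv[i] != NULL && i < NUM - 1) myargv[++i] = strtok(NULL, " ");
    myargv[i] = NULL;                // the argument array must end with NULL
    if (myargv[0] == NULL) continue; // empty line: show the prompt again
    pid_t pid = fork();
    if (pid == -1)
    {
      perror("fork");
      continue;
    }
    if (pid == 0)
    {
      // Child process: replace itself with the requested program
      execvp(myargv[0], myargv);
      perror("execvp"); // reached only if execvp failed
      exit(1);
    }
    int status;
    pid_t ret = waitpid(pid, &status, 0); // reclaim the child's resources
    if (ret == -1)
    {
      perror("wait");
    }
    else
    {
      printf("exit code:%d\n", (status >> 8) & 0xFF);
    }
  }
  return 0;
}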