process control

Talking about fork

The fork function can create a new process from an existing process. The new process is the child process, and the original process is the parent process.

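#include <unistd.h>
pid_t fork(void); // the return value has type pid_t
Return value: on success, fork returns the child's PID to the parent and 0 to the child. On failure, it returns -1 to the parent and no child process is created.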

When a process calls fork, the kernel first allocates a new memory block and kernel data structures for the child process, then copies part of the parent's data structures into the child, then adds the child to the system process list. Finally fork returns and the scheduler starts scheduling.

After fork, it is entirely up to the scheduler to decide whether the parent or the child runs first.

copy-on-write

Usually the parent and child share the same code, and their data initially maps to the same physical memory as well. When either process tries to write, the operating system first copies the affected data so that each process gets its own private copy, updates the page table mapping, and only then lets the write proceed. This is copy-on-write.

Why a fork call can fail

Reasons: there are too many processes in the system, or the number of processes owned by the real user exceeds the limit.

The code below shows how many processes your system lets you create. Only try it on a virtual machine or a disposable cloud server, because it can make the system unresponsive. If that happens, log out, wait a while, and restart the machine.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int num = 0;                 // number of children created so far
    while (1)
    {
        pid_t ret = fork();
        if (ret < 0)             // creating the child process failed
        {
            printf("fork error!,%d \n", num);
            break;
        }
        else if (ret == 0)
        {
            // child: sleep forever so that it keeps occupying a process slot
            while (1)
                sleep(1);
        }
        // parent: count the child that was just created
        num++;
    }
    return 0;
}

process termination

When we write C or C++ code, we usually start from the main function and end it with return 0. So what does this return 0 mean? The value returned from main is the process exit code, which records the result of the process when it exits.

Scenario of process exit

There are only three scenarios for process exit:

The code runs to completion and the result is correct

The code runs to completion but the result is incorrect

The code terminates abnormally

So when the code is finished running, where can the result be seen?

Common ways to exit a process

View process exit code

echo $?: prints the exit code of the most recently executed command

First I wrote a program, mytest.c, whose main function returns exit code 1 if num equals 5050 and 0 otherwise.

After running it, the first echo $? prints 1, the exit code returned by the main function of mytest.c. A second echo $? prints 0. Why? Because $? always refers to the most recent command, which by then is the first echo, and that echo succeeded.

In return 0, the 0 indicates that the code ran to completion and the result is correct. A non-zero value indicates that the code ran to completion but the result is incorrect.

Moreover, different non-zero values indicate different errors.

Here I want to mention the strerror function, which converts an error number into a string describing it. Strictly speaking, strerror describes errno values rather than exit codes, but a program can reuse the same numbers as its exit codes. I wrote a small program that prints the description of every number below 200.

I copied the beginning of the result below. On my Linux system the defined error numbers run from 0 to 133 (41 and 58 are unused), and everything from 134 up is reported as an unknown error. 0 means success; every other number describes a reason for failure.

0: Success
 1: Operation not permitted
 2: No such file or directory
 3: No such process
 4: Interrupted system call
 5: Input/output error
 6: No such device or address
 7: Argument list too long
 8: Exec format error
 9: Bad file descriptor
 10: No child processes
 11: Resource temporarily unavailable
 12: Cannot allocate memory
 13: Permission denied
 14: Bad address
 15: Block device required
 16: Device or resource busy
 17: File exists
 18: Invalid cross-device link
 19: No such device
 20: Not a directory
 21: Is a directory
 22: Invalid argument
 23: Too many open files in system
 24: Too many open files
 25: Inappropriate ioctl for device
 26: Text file busy
 27: File too large
 28: No space left on device
 29: Illegal seek
 30: Read-only file system
 31: Too many links
 32: Broken pipe
 33: Numerical argument out of domain
 34: Numerical result out of range
 35: Resource deadlock avoided
 36: File name too long
 37: No locks available
 38: Function not implemented
 39: Directory not empty
 40: Too many levels of symbolic links
 41: Unknown error 41
 42: No message of desired type
 43: Identifier removed
 44: Channel number out of range
 45: Level 2 not synchronized
 46: Level 3 halted
 47: Level 3 reset
 48: Link number out of range
 49: Protocol driver not attached
 50: No CSI structure available
 51: Level 2 halted
 52: Invalid exchange
 53: Invalid request descriptor
 54: Exchange full
 55: No anode
 56: Invalid request code
 57: Invalid slot
 58: Unknown error 58
 59: Bad font file format
 60: Device not a stream
 61: No data available
 62: Timer expired
 63: Out of streams resources
 64: Machine is not on the network
 65: Package not installed
 66: Object is remote
 67: Link has been severed
 68: Advertise error
 69: Srmount error
 70: Communication error on send
 71: Protocol error
 72: Multihop attempted
 73: RFS specific error
 74: Bad message
 75: Value too large for defined data type
 76: Name not unique on network
 77: File descriptor in bad state
 78: Remote address changed
 79: Can not access a needed shared library
 80: Accessing a corrupted shared library
 81: .lib section in a.out corrupted
 82: Attempting to link in too many shared libraries
 83: Cannot exec a shared library directly
 84: Invalid or incomplete multibyte or wide character
 85: Interrupted system call should be restarted
 86: Streams pipe error
 87: Too many users
 88: Socket operation on non-socket
 89: Destination address required
 90: Message too long
 91: Protocol wrong type for socket
 92: Protocol not available
 93: Protocol not supported
 94: Socket type not supported
 95: Operation not supported
 96: Protocol family not supported
 97: Address family not supported by protocol
 98: Address already in use
 99: Cannot assign requested address
 100: Network is down
 101: Network is unreachable
 102: Network dropped connection on reset
 103: Software caused connection abort
 104: Connection reset by peer
 105: No buffer space available
 106: Transport endpoint is already connected
 107: Transport endpoint is not connected
 108: Cannot send after transport endpoint shutdown
 109: Too many references: cannot splice
 110: Connection timed out
 111: Connection refused
 112: Host is down
 113: No route to host
 114: Operation already in progress
 115: Operation now in progress
 116: Stale file handle
 117: Structure needs cleaning
 118: Not a XENIX named type file
 119: No XENIX semaphores available
 120: Is a named type file
 121: Remote I/O error
 122: Disk quota exceeded
 123: No medium found
 124: Wrong medium type
 125: Operation canceled
 126: Required key not available
 127: Key has expired
 128: Key has been revoked
 129: Key was rejected by service
 130: Owner died
 131: State not recoverable
 132: Operation not possible due to RF-kill
 133: Memory page has hardware error
 134: Unknown error 134
 135: Unknown error 135
 136: Unknown error 136
 137: Unknown error 137

A process terminates normally in one of three ways: the main function returns (a return in any other function only ends that function call), the process calls exit, or the process calls the system call _exit.

exit and _exit

exit: calling this function makes the process exit immediately; its argument is the process exit code.

I wrote a program in which main enters an infinite loop unless the call to exit terminates the process.

After running it, you can see that the process has indeed exited.

Then I wrote another program. Normally, when a function other than main finishes, only the function call ends. Here I call exit just before the return statement of an addtosum function, to see whether it ends the function call or the whole process.

It turns out that exit terminates the process no matter which function it is called from!

exit is a library function and _exit is a system call; exit is ultimately implemented on top of _exit. So what is the difference between the two?

I wrote a small test program.

After running it, hello bug is printed two seconds later, when the process exits.

So what happens if exit is changed to _exit?

Nothing is printed at all.

This comparison shows the difference between the library function exit and the system call _exit: exit flushes the output buffer before terminating the process, while _exit does not. It also follows that this buffer is not maintained by the operating system; it lives in user space.

process waiting

If a child process exits and the parent ignores it, the child stays a zombie process, which in turn leaks memory. Moreover, once a process has become a zombie, even kill -9 cannot remove it, because the process is already dead. So how should the parent manage an exiting child?

After the child finishes running, the parent needs to wait for it. Waiting reclaims the child's resources and obtains the child's exit information, so that the child does not linger as a zombie.

How the process waits

wait

Once the parent calls wait, it checks whether any child of the current process has exited. If it finds a child that has become a zombie, wait collects that child's information, destroys it completely, and returns. If there is no such child yet, wait blocks until one appears.

wait takes a single parameter, status, which is a pointer to int. In many cases NULL is passed. On success it returns the pid of the collected child; on failure it returns -1.

I wrote a program that creates a child process. The child prints its pid and ppid once per second and exits after five seconds. Because the parent is still sleeping at that point, the child enters the zombie state. A few seconds later the parent reclaims the child and prints the return value of wait.

This is indeed the case after running

waitpid

I wrote a piece of code in which ret receives the pid of the child process and status receives the child's exit information.

So what exactly is status?

Getting the child process status

1. Both wait and waitpid have a status parameter, which is an output parameter filled in by the operating system. Passing NULL means the caller does not care about the child's exit status. Otherwise the operating system reports the child's exit information to the parent through it.

2. status should not be read as a plain integer but as a bitmap, because it has to represent all three exit scenarios (the code ran to completion with a correct result; the code ran to completion with an incorrect result; the code terminated abnormally).

The integer status has 32 bits; only the low 16 bits (0-15) matter here.

Bits 8-15 (the second-lowest byte) store the exit status, that is, the exit code of the child process, which tells you whether the result is correct. An exit code of 0 means the result is correct; a non-zero value corresponds to an error.

Bits 0-6 (the low seven bits) store the signal that terminated the process, and bit 7 (the eighth bit) stores the core dump flag (as shown in the figure: the binary structure of status). A termination signal of 0 means the process exited normally; a non-zero value means it was terminated abnormally. You can list the signals with kill -l.

Get the termination signal with status & 0x7F (binary 0111 1111), and the exit code with (status >> 8) & 0xFF (binary 1111 1111).

Accordingly, if the child is killed with kill, the parent sees the corresponding termination signal.

For example, if I run kill -3 on the child, the termination signal that the parent obtains is also 3.

Macros for checking whether the process exited normally and for reading the exit code

WIFEXITED(status): true if the child terminated normally (checks whether the process exited normally)

WEXITSTATUS(status): if WIFEXITED is non-zero, extracts the exit code of the child process (reads the exit code of the process)

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child
        int num = 30;
        while (num)
        {
            printf("child running,pid: %d ,ppid: %d ,num: %d\n", getpid(), getppid(), num--);
            sleep(1);
        }
        exit(10);
    }
    // parent
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if (ret > 0)
    {
        // check whether the child exited normally
        if (WIFEXITED(status))
        {
            // normal exit: report the result of the child via its exit code
            printf("exit code: %d\n", WEXITSTATUS(status));
        }
        else
        {
            // the child terminated abnormally
            printf("child exit abnormally!\n");
        }
        // printf("wait success,exit code: %d, sig: %d\n", (status >> 8) & 0xFF, status & 0x7F);
    }
    return 0;
}

More on zombie processes

When a child process exits, its code and data are released, but its exit information (exit code and termination signal) must be kept in the child's PCB. At this point the child is in the Z state. When the parent calls wait or waitpid, it uses the child's pid to fetch the exit information from the child's PCB into status.

Blocking wait and non-blocking wait

Xiaoshuai -> parent process; girlfriend -> child process; phone call -> system call wait/waitpid

Xiaoshuai has a girlfriend. One day he invited her to the movies, and they agreed that he would wait for her downstairs at 9 o'clock. Xiaoshuai arrived on time but did not see her, so he called. She said she was still putting on her makeup and asked him to wait a while. Xiaoshuai agreed but asked her not to hang up. He waited, asked two minutes later whether she was done, asked again two minutes after that, and did not hang up until she came downstairs.

Checking on the girlfriend's status without ever hanging up is a blocking wait.

Another day, Xiaoshuai invited his girlfriend to dinner and again waited downstairs at 9 o'clock. He arrived on time and called her. She said she was still putting on her makeup and he would have to wait. This time Xiaoshuai hung up and said he would call again later. While waiting, he watched a football game for a while, read a book for a while, and then called again to ask whether she was ready. She was not, so he hung up and went back to doing other things while he waited.

Each call is a status check: if she is not ready, the call returns immediately (he hangs up). Each such call is one non-blocking wait, and repeating non-blocking waits is polling.

The WNOHANG macro: non-blocking wait

Passing the macro WNOHANG to waitpid makes the wait non-blocking.

WNOHANG: if the child process specified by pid has not terminated yet, waitpid() returns 0 without waiting. If the child has terminated, it returns the child's pid. If the call to waitpid fails, it returns -1.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child
        int num = 3;
        while (num)
        {
            printf("child running,pid: %d ,ppid: %d ,num: %d\n", getpid(), getppid(), num--);
            sleep(3);
        }
        exit(10);
    }
    // parent
    int status = 0;
    while (1)
    {
        // WNOHANG: non-blocking; if the child has not exited, waitpid returns immediately
        pid_t ret = waitpid(id, &status, WNOHANG);
        if (ret == 0)
        {
            // waitpid succeeded, but the child has not exited yet
            printf("wait done,but child is running...\n");
        }
        else if (ret > 0)
        {
            // waitpid succeeded and the child has exited
            printf("wait success,exit code: %d,sig: %d\n", (status >> 8) & 0xFF, status & 0x7F);
            break;
        }
        else
        {
            // the call to waitpid failed
            printf("waitpid call failed!\n");
            break;
        }
        sleep(1);
    }
    return 0;
}

The figure below shows the parent polling with non-blocking waits until the child finally exits normally.

Here I pass a wrong id to waitpid in the parent, so the waitpid call fails.

You can see that the failure message is printed and that the parent exits while the child is still running; the orphaned child is adopted by the operating system (typically re-parented to the init process).

The point of non-blocking waiting: the parent is not tied up by the wait and can do other work between polls.

While the parent waits for the child in a non-blocking manner, it can do other things. I wrote several tasks here and let the parent run them through callback functions.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NUM 10
// function pointer type: func_t points to a function that takes no arguments and returns void
typedef void (*func_t)(void);

func_t handerTask[NUM];

void task1(void)
{
    printf("hander task1\n");
}
void task2(void)
{
    printf("hander task2\n");
}
void task3(void)
{
    printf("hander task3\n");
}
void loadTask(void)
{
    // clear the table, then register the tasks; a NULL entry marks the end
    memset(handerTask, 0, sizeof(handerTask));
    handerTask[0] = task1;
    handerTask[1] = task2;
    handerTask[2] = task3;
}
int main(void)
{
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child
        int num = 3;
        while (num)
        {
            printf("child running,pid: %d ,ppid: %d ,num: %d\n", getpid(), getppid(), num--);
            sleep(3);
        }
        exit(10);
    }
    // parent
    loadTask();
    int status = 0;
    while (1)
    {
        // WNOHANG: non-blocking; if the child has not exited, waitpid returns immediately
        pid_t ret = waitpid(id, &status, WNOHANG);
        if (ret == 0)
        {
            // waitpid succeeded, but the child has not exited yet
            printf("wait done,but child is running...\n");
            for (int i = 0; i < NUM && handerTask[i] != NULL; i++)
            {
                // callbacks: the work the parent does while it waits without blocking
                handerTask[i]();
            }
        }
        else if (ret > 0)
        {
            // waitpid succeeded and the child has exited
            printf("wait success,exit code: %d,sig: %d\n", (status >> 8) & 0xFF, status & 0x7F);
            break;
        }
        else
        {
            // the call to waitpid failed
            printf("waitpid call failed!\n");
            break;
        }
        sleep(1);
    }
    return 0;
}

process program replacement

The essence of program replacement: load the code and data of a specified program into the calling process, overwriting the previous code and data.

A process finds its own virtual address space through its PCB and reaches physical memory through the page table to execute its own code and data. Program replacement uses an exec function to load the code and data of the new program from disk into the physical memory of the current process, overwriting what was there. The process then goes on to execute the new code and data.

At this point only the code and data have been replaced; no new process is created.

Replacement functions

The functions whose names start with exec are collectively called the exec functions.

If the call succeeds, the program is replaced; if it fails, nothing is replaced.

execl

The l stands for list: you pass the path of the program, followed by the arguments one after another, strung together like a list.

Which program to execute: pass the relative or absolute path of the program.

How to execute it: the same way you would type it on the command line: "program name", "option 1", "option 2", ..., NULL [the argument list of every exec function must end with NULL].

Variable argument list: the function accepts a varying number of arguments.

If execl fails it returns -1; if it succeeds it does not return at all. On success the code after the call has already been replaced, so a return value would be meaningless.

Q: When the child process calls an exec function to replace its program, does that affect the parent process?

A: No. At that moment the parent and child still share the same code and data. When the child calls an exec function, the operating system performs copy-on-write for the child and then carries out the replacement. The parent and child do not affect each other, which again reflects the independence of processes.

This reflects the two purposes of creating a child process:

1. Let the child process execute part of the parent's code.

2. Let the child process perform program replacement and execute a brand new program.

execlp

For the functions with a p in the name, you only need to pass the program name.

It runs as expected.

So are execl and execlp redundant when both are available?

No. The former locates the program through the path you pass, while the latter finds the program by name through the environment variable PATH.

execv

execvp

So how do you make an exec function run a program you wrote yourself?

Here I wrote a small program that prints some identifying output.

With a phony target all in the Makefile, a single make can build multiple targets.

Now I want myexec to run the mycom program I wrote.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/wait.h>
#include <sys/types.h>

int main(void)
{
    printf("process is running...\n");
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child
        sleep(1);
        // char *const argv_[] = {"ls", "-a", "-l", "--color=auto", NULL};
        execl("./mycom", "mycom", (char *)NULL);
        exit(1); // reached only if execl failed
        // execvp("ls", argv_);
        // execv("/usr/bin/ls", argv_);
        // execl("/usr/bin/ls" /* program path */, "ls", "-a", "-l", "--color=auto", NULL /* how to run it */);
        // execlp("ls" /* program name */, "ls", "-a", "-l", "--color=auto", NULL /* how to run it */);
        // the argument list of every exec function ends with NULL
    }
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if (ret > 0) // waitpid succeeded
    {
        printf("wait success:%d ,sig number: %d,child exit code:%d\n", ret, status & 0x7F, (status >> 8) & 0xFF);
    }
    printf("process running done...\n");
    return 0;
}

In the same way, program replacement can run the executable of a program written in any other language.

execle

mycom.c prints the environment variables PATH and PWD and a custom variable MYENV.

First, the execle call in myexec.c passes an environment that contains only the custom variable.

You can see that the system environment variables are not visible in mycom, but the custom variable is.

This time the execle call passes the system environment variables.

You can see that the system environment variables are visible, but the custom variable is not.

So what if you want to pass both the custom variable and the system environment variables?

putenv

putenv: adds a custom variable to the environment variable table of the current process

Add the custom variable MYENV to the environment variable table, then pass that table to execle.

Load first or call the function first?

The main function takes parameters, namely the command-line arguments and the environment variables. So is main called first, or are these loaded into memory first?
Load first! main is a function that receives parameters, so the command-line arguments and environment variables are loaded into memory first and are then passed to main.

In fact, the exec functions described above are all wrappers around the system call execve, and the different wrappers suit different usage scenarios.

To summarize, this article covered process termination (the three exit scenarios, viewing the process exit code, and the ways a process terminates); process waiting (how to wait, how to obtain the status and exit code of a child process, and the macros that check whether a process exited normally); the difference between blocking and non-blocking wait; and program replacement (the five exec functions used here and the putenv function). This article took several days to write, so please give it a like~~~
