process control
Talking about fork
The fork function can create a new process from an existing process. The new process is the child process, and the original process is the parent process.
#include<unistd.h>
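pid_t fork(void); // the return value has type pid_t
Return value: on success, fork returns the child's PID to the parent and 0 to the child. On failure it returns -1 to the parent and no child is created.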
When a process calls fork, the kernel first allocates new memory and kernel data structures for the child, copies part of the parent's data structures into the child, and adds the child to the system's process list. Then fork returns and the scheduler starts scheduling.
After fork returns, the scheduler alone decides whether the parent or the child runs first.
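The two return values can be observed with a small sketch like the one below. The helper name fork_once is made up for illustration; it prints what each side sees and reaps the child before returning.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once and reports what each side sees.
 * Returns the child's PID in the parent, or -1 if fork failed. */
pid_t fork_once(void)
{
    fflush(stdout);                 /* keep buffered output from being duplicated in the child */
    pid_t id = fork();
    if (id < 0) {
        perror("fork");
        return -1;
    }
    if (id == 0) {
        /* child: fork returned 0 here */
        printf("child:  fork returned 0, my pid is %d\n", (int)getpid());
        fflush(stdout);
        _exit(0);                   /* leave without running the caller's remaining code */
    }
    /* parent: fork returned the child's PID */
    printf("parent: fork returned %d\n", (int)id);
    waitpid(id, NULL, 0);           /* reap the child so it does not stay a zombie */
    return id;
}
```

Calling fork_once() from main prints one line from each process. Which line appears first depends on the scheduler.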
copy-on-write
Normally the parent and child share the same code, and their data initially maps to the same physical pages as well. When either process tries to write, the operating system first copies the affected page so that each process has its own copy, updates the page table mapping, and only then lets the write proceed. This is copy-on-write.
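Copy-on-write itself happens inside the kernel and cannot be observed directly, but its visible effect can: a write in the child never changes the parent's data. Here is a minimal sketch, with the names shared_value and value_after_child_write invented for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared_value = 100;      /* same initial value in parent and child */

/* Hypothetical helper: returns the parent's value after the child has
 * written to its own copy, or -1 on error. */
int value_after_child_write(void)
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        shared_value = 200;         /* the write gives the child a private copy of the page */
        _exit(shared_value == 200 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return -1;
    return shared_value;            /* the parent still sees 100 */
}
```

The parent gets 100 back even though the child stored 200 in a variable with the same name and address.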
Why a fork call can fail
Reasons: there are too many processes in the system, or the number of processes owned by the real user exceeds the limit.
The code below keeps forking until fork fails, so you can see how many processes your system will let you create. It deliberately exhausts the process limit, so only try it on a virtual machine or a cloud server. If the system stops responding, log out, wait a while, and restart.
#include <stdio.h>
#include <unistd.h>

int main()
{
    int num = 0; // number of children created so far
    while (1)
    {
        pid_t ret = fork();
        if (ret < 0) // creating the child process failed
        {
            printf("fork error!,%d \n", num);
            break;
        }
        else if (ret == 0)
        {
            // child process: sleep forever so it keeps occupying a process slot
            while (1)
                sleep(1);
        }
        // parent process: count the child and fork again
        num++;
    }
    return 0;
}
Process termination
When we write C or C++ code, we usually start from the main function and end it with return 0;. So what does this return 0 mean? The value returned from main is the process exit code, which records the result with which the process exited.
Scenario of process exit
A process can exit in only three scenarios:
The code runs to completion and the result is correct
The code runs to completion and the result is incorrect
The code terminates abnormally
So once the code has finished running, where can we see the result?
Common ways to exit a process
View process exit code
echo $? : prints the exit code of the most recently executed command
First I wrote a program in which main returns 1 if num is equal to 5050 and 0 otherwise.
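That program is not reproduced in the text. A minimal sketch of the same logic, assuming num is the sum of the integers from 1 to 100 (the helper name exit_code_for_sum is hypothetical), might look like this:

```c
/* Hypothetical reconstruction: sum 1..n and map the result to an exit code. */
int exit_code_for_sum(int n)
{
    int num = 0;
    for (int i = 1; i <= n; i++)
        num += i;
    return num == 5050 ? 1 : 0;     /* 1 when the sum is 5050, otherwise 0 */
}
```

If main simply returns exit_code_for_sum(100), the shell sees exit code 1, because 1 + 2 + ... + 100 is 5050.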
After running it, the first echo $? prints 1, the exit code returned by main in mytest.c. A second echo $? prints 0. Why? Because $? always refers to the most recent command, and by then that command is the previous echo, which succeeded.
By convention, return 0 means the code ran to completion and the result is correct, while a non-zero value means the code ran to completion but the result is incorrect!
Moreover, different non-zero values indicate different errors.
Here I want to mention the function strerror, which converts an error number into a string describing it. Strictly speaking these numbers are errno values rather than exit codes (an exit code means whatever the program defines it to mean), but a program may choose to reuse them as exit codes. I wrote a small program that prints the string for every number below 200.
I copied part of the output below. On this Linux system, messages are defined for numbers up to 133 (41 and 58 are unassigned), and anything beyond that is reported as an unknown error. 0 means success, and every other number describes a specific reason for failure.
0: Success
1: Operation not permitted
2: No such file or directory
3: No such process
4: Interrupted system call
5: Input/output error
6: No such device or address
7: Argument list too long
8: Exec format error
9: Bad file descriptor
10: No child processes
11: Resource temporarily unavailable
12: Cannot allocate memory
13: Permission denied
14: Bad address
15: Block device required
16: Device or resource busy
17: File exists
18: Invalid cross-device link
19: No such device
20: Not a directory
21: Is a directory
22: Invalid argument
23: Too many open files in system
24: Too many open files
25: Inappropriate ioctl for device
26: Text file busy
27: File too large
28: No space left on device
29: Illegal seek
30: Read-only file system
31: Too many links
32: Broken pipe
33: Numerical argument out of domain
34: Numerical result out of range
35: Resource deadlock avoided
36: File name too long
37: No locks available
38: Function not implemented
39: Directory not empty
40: Too many levels of symbolic links
41: Unknown error 41
42: No message of desired type
43: Identifier removed
44: Channel number out of range
45: Level 2 not synchronized
46: Level 3 halted
47: Level 3 reset
48: Link number out of range
49: Protocol driver not attached
50: No CSI structure available
51: Level 2 halted
52: Invalid exchange
53: Invalid request descriptor
54: Exchange full
55: No anode
56: Invalid request code
57: Invalid slot
58: Unknown error 58
59: Bad font file format
60: Device not a stream
61: No data available
62: Timer expired
63: Out of streams resources
64: Machine is not on the network
65: Package not installed
66: Object is remote
67: Link has been severed
68: Advertise error
69: Srmount error
70: Communication error on send
71: Protocol error
72: Multihop attempted
73: RFS specific error
74: Bad message
75: Value too large for defined data type
76: Name not unique on network
77: File descriptor in bad state
78: Remote address changed
79: Can not access a needed shared library
80: Accessing a corrupted shared library
81: .lib section in a.out corrupted
82: Attempting to link in too many shared libraries
83: Cannot exec a shared library directly
84: Invalid or incomplete multibyte or wide character
85: Interrupted system call should be restarted
86: Streams pipe error
87: Too many users
88: Socket operation on non-socket
89: Destination address required
90: Message too long
91: Protocol wrong type for socket
92: Protocol not available
93: Protocol not supported
94: Socket type not supported
95: Operation not supported
96: Protocol family not supported
97: Address family not supported by protocol
98: Address already in use
99: Cannot assign requested address
100: Network is down
101: Network is unreachable
102: Network dropped connection on reset
103: Software caused connection abort
104: Connection reset by peer
105: No buffer space available
106: Transport endpoint is already connected
107: Transport endpoint is not connected
108: Cannot send after transport endpoint shutdown
109: Too many references: cannot splice
110: Connection timed out
111: Connection refused
112: Host is down
113: No route to host
114: Operation already in progress
115: Operation now in progress
116: Stale file handle
117: Structure needs cleaning
118: Not a XENIX named type file
119: No XENIX semaphores available
120: Is a named type file
121: Remote I/O error
122: Disk quota exceeded
123: No medium found
124: Wrong medium type
125: Operation canceled
126: Required key not available
127: Key has expired
128: Key has been revoked
129: Key was rejected by service
130: Owner died
131: State not recoverable
132: Operation not possible due to RF-kill
133: Memory page has hardware error
134: Unknown error 134
135: Unknown error 135
136: Unknown error 136
137: Unknown error 137
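A listing like the one above can be produced with a short loop over strerror. This is only a sketch; the helper name print_error_strings is made up, and the exact messages depend on the C library in use.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints "number: message" for 0 .. limit-1
 * and returns how many lines were printed. */
int print_error_strings(int limit)
{
    int printed = 0;
    for (int i = 0; i < limit; i++) {
        printf("%d: %s\n", i, strerror(i));   /* strerror maps an errno value to text */
        printed++;
    }
    return printed;
}
```

Calling print_error_strings(200) from main covers the range described in the text.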
Normally a process terminates in one of three ways: the main function returns (a return in any other function only ends that function call), the process calls exit, or it makes the system call _exit.
exit and _exit
exit: calling this function terminates the process immediately; its argument is the process exit code.
I wrote a program whose main function would enter an infinite loop if the call to exit did not end the process.
After running it, you can see that the process has indeed exited.
Then I wrote another program. Normally, reaching the end of a function other than main only ends that function call. Here I call exit just before the return statement of an addtosum function, to see whether it ends the function call or the whole process.
It turns out that exit terminates the process no matter which function it is called from!
exit is a library function and _exit is a system call; exit is itself implemented on top of _exit. So what is the difference between them?
I wrote this code.
After running it, you can see that hello bug is printed after two seconds.
So what happens if exit is changed to _exit?
Nothing is printed at all.
This comparison shows the difference between the library function exit and the system call _exit: exit flushes the output buffers before terminating the process, while _exit does not. It also follows that the buffer is not maintained by the operating system; it must live in user space.
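The difference can also be measured rather than watched. In the sketch below (the helper name bytes_received is hypothetical), the child prints hello bug without a newline into a pipe and then leaves with either exit or _exit, and the parent counts how many bytes actually arrive.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes the parent receives
 * from the child, or -1 on error. */
int bytes_received(int use_underscore_exit)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    fflush(stdout);                         /* start the child with an empty stdio buffer */
    pid_t id = fork();
    if (id < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (id == 0) {
        dup2(fds[1], STDOUT_FILENO);        /* the child's stdout now goes to the pipe */
        close(fds[0]);
        close(fds[1]);
        printf("hello bug");                /* no newline: the text stays in the user-space buffer */
        if (use_underscore_exit)
            _exit(0);                       /* terminates without flushing the buffer */
        exit(0);                            /* flushes the buffer, then terminates */
    }
    close(fds[1]);                          /* so that read() sees end-of-file */
    char buf[64];
    int total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += (int)n;
    close(fds[0]);
    waitpid(id, NULL, 0);
    return total;
}
```

With exit the parent receives the nine bytes of hello bug; with _exit it receives nothing, because the text never left the child's user-space buffer.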
Process waiting
If a child process exits and the parent ignores it, the child stays a zombie process, and the resources it still holds are leaked. Moreover, once a process has become a zombie, even kill -9 cannot remove it, because it is already dead. So how should the parent manage a child that has exited?
After the child finishes running, the parent needs to wait for it. Waiting reclaims the child's resources and retrieves its exit information, which keeps zombie processes from piling up.
How the process waits
wait
Once the parent calls wait, it blocks, and wait checks whether any child of the current process has exited. If it finds a child that has become a zombie, wait collects that child's information, destroys it completely, and returns. If no such child is found, wait keeps blocking until one appears.
wait takes a single parameter, status, which is a pointer to int; in many cases NULL is passed. On success wait returns the PID of the child it collected, and on failure it returns -1.
Then I wrote a program that creates a child process. The child prints its PID and its parent's PID once a second and exits after five seconds. Because the parent is still sleeping at that point, the child enters the zombie state; a few seconds later the parent reaps the child and prints the return value of wait.
Running it confirms this.
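The program itself is not shown in the text. A shortened sketch of the same idea follows, with the timings reduced and the helper name wait_reaps_child invented for the example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if wait() returned the PID of the child
 * created here, 0 otherwise. */
int wait_reaps_child(void)
{
    fflush(stdout);
    pid_t id = fork();
    if (id < 0)
        return 0;
    if (id == 0) {
        printf("child running, pid: %d, ppid: %d\n", (int)getpid(), (int)getppid());
        fflush(stdout);
        _exit(0);
    }
    sleep(1);                    /* the child has most likely exited and is a zombie until reaped */
    pid_t ret = wait(NULL);      /* blocks until a child can be collected */
    printf("wait returned %d\n", (int)ret);
    return ret == id;
}
```

During the parent's sleep, the child can be seen in state Z with a tool such as ps; after wait returns it is gone.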
waitpid
I wrote a piece of code in which ret receives the PID of the child process and status receives the child's exit information.
So what exactly is status?
Getting the child's status
1. Both wait and waitpid take a status parameter. It is an output parameter filled in by the operating system. Passing NULL means the parent does not care about the child's exit status; otherwise the operating system reports the child's exit information to the parent through it.
2. status should not be read as a plain integer but as a bitmap, because it has to represent all three exit scenarios (the code finishes with a correct result; the code finishes with an incorrect result; the code terminates abnormally).
status is a 32-bit integer; only the low 16 bits (bits 0-15) matter here.
Bits 8-15 (the second-lowest byte) store the exit status, that is, the child's exit code, which tells you whether the result is correct: 0 means correct, and a non-zero value identifies an error.
Bits 0-6 (the low seven bits) store the termination signal, which tells you whether the process exited normally, and bit 7 stores the core dump flag. A termination signal of 0 means the process exited normally; a non-zero value means it was terminated abnormally. Most signals can be listed with kill -l.
Get the termination signal with status&0x7F (binary 01111111), and get the exit code with (status>>8)&0xFF (binary 11111111).
Accordingly, if the child is killed with kill, the parent receives the corresponding termination signal.
For example, if I run kill -3 on the child, the termination signal reported to the parent is 3.
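Both cases can be checked with a sketch like the following, which assumes the Linux status layout described above. The helper name child_status is hypothetical; it either lets the child exit with a given code or kills it with a given signal, and returns the raw status.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: if sig is non-zero the child is killed with that
 * signal, otherwise it exits with exit_code.
 * Returns the raw status filled in by waitpid, or -1 on error. */
int child_status(int sig, int exit_code)
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        if (sig == 0)
            _exit(exit_code);
        while (1)
            pause();                 /* sleep until the parent's signal arrives */
    }
    if (sig != 0)
        kill(id, sig);               /* same effect as "kill -<sig> <pid>" in the shell */
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return -1;
    return status;
}
```

For child_status(0, 10), (status>>8)&0xFF is 10 and status&0x7F is 0. For child_status(SIGQUIT, 0), status&0x7F is 3, matching the kill -3 example.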
Macros for checking whether the process exited normally and for reading the exit code
WIFEXITED(status): true if the child terminated normally (checks whether the process exited normally)
WEXITSTATUS(status): if WIFEXITED is non-zero, extracts the child's exit code (reads the exit code of the process)
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
int main()
{
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child process
        int num = 30;
        while (num)
        {
            printf("child running,pid: %d ,ppid: %d ,num: %d\n", getpid(), getppid(), num--);
            sleep(1);
        }
        exit(10);
    }
    // parent process
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if (ret > 0)
    {
        // check whether the child exited normally
        if (WIFEXITED(status)) // the child exited normally
        {
            // check the result the child produced
            printf("exit code: %d\n", WEXITSTATUS(status));
        }
        else
        {
            // the child terminated abnormally
            printf("child exit abnormally!\n");
        }
        // printf("wait success,exit code: %d, sig: %d\n",(status>>8)&0xFF,status&0x7F);
    }
    return 0;
}
More on zombie processes
When a child exits, its code and data are released, but its exit information (exit code and termination signal) has to stay in the child's PCB. At this point the child is in the Z state. When the parent makes the wait/waitpid system call, the kernel locates the child's PCB by the child's PID and copies the exit information from it into status.
Blocking wait and non-blocking wait
Xiaoshuai -> parent process; girlfriend -> child process; phone call -> system call wait/waitpid
Xiaoshuai has a girlfriend. One day he invited her to the movies and agreed to wait for her downstairs at 9 o'clock. He arrived on time but did not see her, so he called. She said she was still putting on makeup and asked him to wait a while. Xiaoshuai agreed but told her not to hang up. He stayed on the line, asked after two minutes whether she was ready, asked again two minutes later, and did not hang up until she came downstairs.
Checking on the girlfriend's status without ever hanging up is a blocking wait.
Another day, Xiaoshuai invited his girlfriend to dinner and again arrived downstairs at 9 o'clock. He called her, and she said she was putting on makeup and he would have to wait. This time Xiaoshuai hung up and said he would call back later. While waiting he watched a football game for a while, read a book for a while, and called again after some time to ask whether she was ready. She was not, so he hung up and went on waiting, doing other things in the meantime.
Each call is a status check that returns immediately (he hangs up) if she is not ready, so each call is a non-blocking wait. Repeating non-blocking waits is polling.
The macro WNOHANG: non-blocking wait
Passing the macro WNOHANG as the options argument of waitpid makes the wait non-blocking.
WNOHANG: if the child specified by pid has not exited yet, waitpid() returns 0 immediately instead of waiting. If the child has exited, it returns the child's PID. If the call to waitpid fails, it returns -1.
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
int main()
{
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child process
        int num = 3;
        while (num)
        {
            printf("child running,pid: %d ,ppid: %d ,num: %d\n", getpid(), getppid(), num--);
            sleep(3);
        }
        exit(10);
    }
    // parent process
    int status = 0;
    while (1)
    {
        // WNOHANG: non-blocking -> if the child has not exited, the check returns immediately
        pid_t ret = waitpid(id, &status, WNOHANG);
        if (ret == 0)
        {
            // waitpid succeeded, but the child has not exited yet
            printf("wait done,but child is running...\n");
        }
        else if (ret > 0)
        {
            // waitpid succeeded and the child has exited
            printf("wait success,exit code: %d,sig: %d\n", (status >> 8) & 0xFF, status & 0x7F);
            break;
        }
        else
        {
            // the waitpid call failed
            printf("waitpid call failed!\n");
            break;
        }
        sleep(1);
    }
    return 0;
}
The figure below shows the parent polling with non-blocking waits until the child finally exits normally.
Here I pass a wrong id to waitpid in the parent, so the waitpid call fails.
You can see that the failure message is printed and the parent exits while the child is still running; the orphaned child is then adopted by the system (typically the init process, PID 1).
The point of non-blocking waiting: the parent is not tied up waiting and can do other work between polls.
While the parent waits for the child in a non-blocking manner, it can do other things. I wrote a few tasks here and had the parent run them through function-pointer callbacks.
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>

#define NUM 10
// func_t is the type "pointer to a function that takes no arguments and returns void"
typedef void (*func_t)(void);

func_t handerTask[NUM];

void task1(void)
{
    printf("hander task1\n");
}
void task2(void)
{
    printf("hander task2\n");
}
void task3(void)
{
    printf("hander task3\n");
}
void loadTask(void)
{
    // clear the table, then register the tasks; unused slots stay NULL
    memset(handerTask, 0, sizeof(handerTask));
    handerTask[0] = task1;
    handerTask[1] = task2;
    handerTask[2] = task3;
}
int main()
{
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child process
        int num = 3;
        while (num)
        {
            printf("child running,pid: %d ,ppid: %d ,num: %d\n", getpid(), getppid(), num--);
            sleep(3);
        }
        exit(10);
    }
    // parent process
    loadTask();
    int status = 0;
    while (1)
    {
        // WNOHANG: non-blocking -> if the child has not exited, the check returns immediately
        pid_t ret = waitpid(id, &status, WNOHANG);
        if (ret == 0)
        {
            // waitpid succeeded, but the child has not exited yet
            printf("wait done,but child is running...\n");
            for (int i = 0; i < NUM && handerTask[i] != NULL; i++)
            {
                handerTask[i](); // callbacks: the work the parent does while waiting without blocking
            }
        }
        else if (ret > 0)
        {
            // waitpid succeeded and the child has exited
            printf("wait success,exit code: %d,sig: %d\n", (status >> 8) & 0xFF, status & 0x7F);
            break;
        }
        else
        {
            // the waitpid call failed
            printf("waitpid call failed!\n");
            break;
        }
        sleep(1);
    }
    return 0;
}
Process program replacement
The essence of program replacement: load the code and data of a specified program into the process, overwriting the code and data that were there before.
A process finds its virtual address space through its PCB, and the page table maps that space to the physical memory holding its code and data. Program replacement uses an exec function to load the new program's code and data from disk into the current process's physical memory, overwriting the old contents. The process then executes the new code and data.
Only the code and data are replaced; no new process is created.
Replacement functions
The functions whose names start with exec are collectively called the exec functions.
If the call succeeds, the program is replaced; if it fails, nothing is replaced.
execl
The l in the name stands for list: you pass the path, followed by the arguments one after another, like a list.
Which program to execute: pass the program's relative or absolute path.
How to execute it: the same as on the command line: "program name", "option 1", "option 2", ..., NULL [the argument list of an exec function must end with NULL]
Variable argument list: the function accepts a varying number of arguments.
If execl fails it returns -1; if it succeeds it does not return. On success the code that followed the call has been replaced, so there is nothing to return to.
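Put together, a call to execl in a child process might look like the sketch below. It assumes ls is installed at /bin/ls, and the helper name run_ls is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "ls -a -l" in a child process and returns
 * its exit code, or -1 on error. Assumes ls lives at /bin/ls. */
int run_ls(void)
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        /* first the path, then argv[0], the options, and the terminating NULL */
        execl("/bin/ls", "ls", "-a", "-l", (char *)NULL);
        _exit(127);                  /* only reached if execl failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The line after execl runs only when the replacement failed, which is why it can report the failure unconditionally.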
Q: When the child calls an exec function to replace its program, does it affect the parent?
A: No. Before the call, the parent and child share the same code and data. When the child calls exec, the operating system separates the child's code and data from the parent's, as with copy-on-write, and the replacement happens only in the child. The parent and child do not affect each other, which again reflects the independence of processes.
This shows the two purposes of creating a child process:
1. Let the child execute part of the parent's code.
2. Let the child perform program replacement and run a brand-new program.
execlp
For the functions with p in the name, you only need to pass the program name.
It runs as expected.
So if execl and execlp are both available, are the two redundant?
No. The former takes the path of the program, while the latter looks up the program name in the directories listed in the PATH environment variable.
execv
execvp
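The v variants take the arguments as a NULL-terminated array instead of a list: execv expects the program's path, while execvp looks the name up in PATH. Here is a minimal sketch using execvp, with the helper name run_argv invented for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (found through PATH) with the given
 * NULL-terminated argument vector. Returns the program's exit code,
 * 127 if it could not be executed, or -1 on error. */
int run_argv(char *const argv[])
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        execvp(argv[0], argv);       /* the "v" form: arguments in an array */
        _exit(127);                  /* only reached if execvp failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, an array such as {"ls", "-a", "-l", NULL} runs the same command as the execl form. Replacing execvp(argv[0], argv) with execv("/usr/bin/ls", argv) would give the path explicitly instead.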
So how do you make an exec function run a program you wrote yourself?
Here I wrote a small program that just prints something identifiable.
With a phony target all in the Makefile, several executables can be built by a single make.
Now I want myexec to run the mycom program I wrote.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/wait.h>
#include <sys/types.h>
int main()
{
    printf("process is running...\n");
    pid_t id = fork();
    assert(id != -1);
    if (id == 0)
    {
        // child process
        sleep(1);
        // char *const argv_[] = {"ls", "-a", "-l", "--color=auto", NULL};
        execl("./mycom", "mycom", NULL);
        exit(1); // only reached if execl failed
        // execvp("ls", argv_);
        // execv("/usr/bin/ls", argv_);
        // execl("/usr/bin/ls" /* program path */, "ls", "-a", "-l", "--color=auto", NULL /* how to run it */);
        // execlp("ls" /* program name */, "ls", "-a", "-l", "--color=auto", NULL /* how to run it */);
        // the argument list of every exec function ends with NULL
    }
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if (ret > 0) // waitpid succeeded
    {
        printf("wait success:%d ,sig number: %d,child exit code:%d\n", ret, status & 0x7F, (status >> 8) & 0xFF);
    }
    printf("process running done...\n");
    return 0;
}
Program replacement can likewise be used to run an executable written in any other language.
execle
In mycom.c, print the environment variables PATH and PWD and a custom variable MYENV.
Then the execle call in myexec.c passes a custom environment that contains the custom variable.
You can see that the system environment variables are missing and only the custom variable is printed.
This time the execle call passes the system environment variables instead.
You can see that the system environment variables are printed, but the custom variable is not.
So what if you want both the custom variable and the system environment variables?
putenv
putenv: adds a custom variable to the environment variable table
Add the custom variable MYENV to the environment variable table.
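Combining the two steps could look like the following sketch. Since mycom is not shown, /bin/sh stands in for it here and simply tests whether MYENV arrived. The value hello and the helper name child_sees_myenv are assumptions made for the example.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;               /* the process's environment variable table */

/* Hypothetical helper: returns 1 if a program started with
 * execle(..., environ) sees MYENV, 0 if it does not, -1 on error. */
int child_sees_myenv(void)
{
    /* putenv stores this pointer, so the string must stay valid afterwards */
    static char entry[] = "MYENV=hello";
    if (putenv(entry) != 0)
        return -1;
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        /* /bin/sh stands in for mycom; the test succeeds only if MYENV was passed on */
        execle("/bin/sh", "sh", "-c", "test \"$MYENV\" = hello",
               (char *)NULL, environ);
        _exit(127);                  /* only reached if execle failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Because putenv adds MYENV to the table that environ points to, passing environ to execle hands the new program both the system variables and the custom one.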
Load first or call the function first?
The main function takes parameters: the command-line arguments, the environment variables, and so on. So is main called first, or are these parameters loaded into memory first?
Load first! main is a function that receives parameters like any other, so the command-line arguments and environment variables must already be in memory before main is called, and they are then passed to it.
In fact, the exec functions mentioned above are all wrappers around the system call execve, and the different wrappers suit different usage scenarios.
To summarize, this article covered process termination (the three exit scenarios, viewing the exit code, and the ways a process can terminate); process waiting (how to wait, how to obtain a child's status and exit code, the macros that check for a normal exit, and the difference between blocking and non-blocking waits); and program replacement (the use of the five exec functions and of putenv). This article took several days to write, so please give it a like~~~