Process-related concepts
1. Concurrency
2. Single-programming
3. Multi-programming
4. CPU/MMU
5. Process control block
6. Process status
Environment variables
1. The role of commonly used environment variables
2. Functions
Process control primitives
1. The fork function: creating child processes in a loop
2. The exec functions: the parameters and usage of each function
3. wait/waitpid: the general way to reap child processes
Programs and Processes
A program is a compiled binary file, such as a.out, that is stored on disk and occupies no system resources (CPU, memory, open files, devices, locks...).
A process is an abstract concept closely tied to operating system principles. A process is a running program: it occupies system resources and executes in memory (running a program creates a process).
Concurrency
Concurrency: within a given period of time, multiple processes have been started and are somewhere between start and completion, but on a single CPU only one of them is actually running at any instant.
Single-programming model: Microsoft's DOS could run only one program on the CPU at a time. The next program had to queue up and wait until the current one finished, so execution efficiency was very low.
Multi-programming model: multiple processes appear to run at the same time, but on a single CPU they do not. Each process is divided into task segments, CPU time is divided into time slices, and each time slice is assigned to one task segment for execution.
Clock interrupt: the hardware mechanism that divides CPU time into slices. When a clock interrupt arrives, the running process has no choice but to give up the CPU so that other processes can run.
In the multi-programming model, multiple processes take turns using the CPU. A typical modern CPU works at the nanosecond level and can execute about 1 billion instructions per second, while human perception works at the millisecond level, so the processes seem to run at the same time.
Simple Architecture of CPU
Cache: the buffer.
Prefetcher: takes only one instruction from the buffer at a time.
Decoder: analyzes what the instruction does and which registers it needs, sets them up, and hands the operation over to the ALU.
ALU: in this simplified model, the arithmetic logic unit performs only addition and left-shift operations, then returns the result to a register.
The basic working principle of the MMU
The MMU is a piece of hardware located inside the CPU.
Virtual address: the available address space is 4 GB, but the memory actually occupied is much smaller.
For example, int a = 10; may sit at virtual address 0x804a4000 while its physical address is 1000.
The MMU maintains the mapping between virtual addresses and physical addresses.
It also sets the CPU's memory access privilege levels, 3 2 1 0, where 3 is the lowest and 0 is the highest. Linux only uses levels 3 and 0.
In terms of virtual memory, user space runs at level 3 and kernel space at level 0.
Processes are independent of each other. When two a.out processes run at the same time, each is allocated its own physical memory, but only one copy of the kernel is needed: both processes share the same kernel space.
Process Control Block (PCB)
Each process has a process control block (PCB) in the kernel that maintains process-related information. In the Linux kernel the process control block is a task_struct structure with many members. The following are the important ones to master:
Process ID. Each process in the system has a unique ID, represented in C by the pid_t type, which is actually a non-negative integer
The state of the process: ready, running, suspended, stopped, etc.
The CPU registers that need to be saved and restored on a process switch
Information describing the virtual address space
Information describing the controlling terminal
The current working directory
The umask
The file descriptor table, which contains many pointers to file structures
Signal-related information
User ID and group ID
Session and process group
The resource limits that apply to the process (Resource Limit)
Status of the process:
Introduction to Environment Variables
Linux is a multi-tasking, multi-user, open source operating system.
Multi-tasking: concurrency.
Multi-user: multiple users can log in to one computer at the same time.
Environment variables are parameters that specify the runtime environment of the operating system. They usually have the following characteristics:
1. They are strings (in essence)
2. They share a unified format: name=value[:value]
3. The value describes process environment information
PATH
Records the search paths for executable files. View it with echo $PATH
SHELL
Records the current command interpreter (the current shell). Its value is usually /bin/bash
TERM
The current terminal type. In a graphical interface it is usually xterm. The terminal type determines how some programs display their output. For example, graphical terminals can display Chinese characters, but character terminals generally cannot.
LANG
The language and locale. It determines the character encoding and the display format of time, currency, and similar information
HOME
The path of the current user's home directory. Many programs save configuration files in the home directory so that each user has their own configuration when running the program
View process environment variables
#include <stdio.h>
// Environment variable table of the current process
extern char **environ;
int main(void){
for(int i=0;environ[i];i++){
printf("%s\n",environ[i]);
}
return 0;
}
After compiling and running, the program prints the environment variable table of its own process.
Environment variable manipulation functions
The getenv function
Gets the value of an environment variable.
char *getenv(const char *name); Success: returns the value of the environment variable. Failure: returns NULL (name does not exist).
The setenv function
Sets the value of an environment variable.
int setenv(const char *name, const char *value, int overwrite); Success: 0. Failure: -1.
The overwrite parameter: 1: overwrite the existing value of the environment variable.
0: do not overwrite an existing value. setenv is often used to set new environment variables, such as ABC=haha-day-night.
The unsetenv function
Deletes the definition of the environment variable name.
int unsetenv(const char *name); Success: 0. Failure: -1.
Note: if name does not exist, unsetenv still returns 0 (success). A name that contains '=', such as "ABC=", causes an error.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
char *val;
const char *name = "ABD";
val = getenv(name);
printf("1.%s = %s\n",name,val);
setenv(name, "haha-day-night",1); //1: overwrite the value if the variable already exists
val = getenv(name);
printf("2.%s = %s\n",name,val);
#if 0
int ret = unsetenv("ABDFGH");
printf("ret = %d\n",ret);
val = getenv(name);
printf("3.%s = %s\n",name,val);
#else
int ret = unsetenv("ABD");
printf("ret = %d\n",ret);
val = getenv(name);
printf("3.%s = %s\n",name,val);
#endif
return 0;
}
Output result:
1.ABD = (null) //No variable named ABD exists in the current environment
2.ABD = haha-day-night //ABD has been set to haha-day-night and printed
ret = 0 //0 means ABD was deleted successfully
3.ABD = (null) //Getting ABD again returns NULL
Create a single child process
The fork function
Creates a child process.
pid_t fork(void); On failure it returns -1. On success it returns twice: the pid of the child (a positive integer) in the parent process, and 0 in the child process.
When the parent process calls fork, a child process is created and both processes continue executing from the fork call. This is why there are two return values: in the parent, fork returns the child's process ID, and in the child, fork returns 0.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main(void){
pid_t pid;
printf("xxxxxxxxxxxx\n");
pid = fork();
if(pid == -1){
perror("fork error");
exit(1);
}else if(pid == 0){
printf("i am child, pid = %d, ppid = %d\n",getpid(),getppid());
}else{
printf("i am parent, pid = %d, ppid = %d\n",getpid(),getppid());
sleep(1);
}
printf("YYYYYYYYYYYYYYY\n");
return 0;
}
Output result:
xxxxxxxxxxxx
i am parent, pid = 16715, ppid = 2302
i am child, pid = 16716, ppid = 16715
YYYYYYYYYYYYY
The parent's ppid, 2302, can be looked up with ps aux | grep 2302. It turns out to be bash, which shows that bash launched this program and is therefore its parent process.
Create N child processes in a loop
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main(void){
pid_t pid;
printf("xxxxxxxxxxxx\n");
// Create 5 child processes in a loop
for(int i=0;i<5;i++){
pid = fork();
if(pid == -1){
perror("fork error");
exit(1);
}else if(pid == 0){
printf("i am %dth child, pid = %d, ppid = %d\n",i+1,getpid(),getppid());
}else{
printf("i am %dth parent, pid = %d, ppid = %d\n",i+1,getpid(),getppid());
sleep(1);
}
}
printf("YYYYYYYYYYYYYYY\n");
return 0;
}
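⚠️ However, the output shows that more than 5 child processes are created. Each child stays inside the loop and calls fork itself, so the number of processes doubles on every iteration: 2^5 = 32 processes in total, 31 of them children.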
Solution:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main(void){
int i;
pid_t pid;
printf("xxxxxxxxxxxx\n");
// Create 5 child processes in a loop
for(i=0;i<5;i++){
pid = fork();
if(pid == -1){
perror("fork error");
exit(1);
}else if(pid == 0){
break; // The child leaves the loop immediately, so it never forks
}
}
if(i<5){
sleep(i);
printf("i am %d child, pid = %d\n",i+1,getpid());
}else{
sleep(i);
printf("i am parent\n");
}
return 0;
}
Output result:
xxxxxxxxxxxx
i am 1 child, pid = 17065
i am 2 child, pid = 17066
i am 3 child, pid = 17067
i am 4 child, pid = 17068
i am 5 child, pid = 17069
i am parent
Summary: the loop runs five times and creates five child processes. Those five children and the parent, six processes in total, then compete for the CPU. Because each process sleeps for i seconds, the output order is controlled: the first child (i = 0) prints first, the other children follow in order, and the parent prints last because it leaves the loop with i = 5 and sleeps for 5 seconds.
Parent-child process sharing
What are the similarities and differences between the parent and child processes after fork?
Immediately after fork:
Similarities between parent and child: global variables, .data, .text, stack, heap, environment variables, user ID, home directory, working directory, signal handling dispositions...
Differences between parent and child: 1. Process ID 2. Return value of fork 3. Parent process ID 4. Process running time 5. Alarms (timers) 6. Pending signal set
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int var = 34;
int main(void){
pid_t pid;
pid = fork();
if(pid == -1){
perror("fork error");
exit(1);
}else if(pid >0){
sleep(2);
var = 55;
printf("i am parent pid = %d, parentID = %d, var = %d\n",getpid(),getppid(),var);
}else if(pid == 0){
var = 100;
printf("i am child pid = %d, parentID = %d, var = %d\n",getpid(),getppid(),var);
}
printf("var = %d\n",var);
return 0;
}
Output result:
i am child pid = 6493, parentID = 6492, var = 100
var = 100
i am parent pid = 6492, parentID = 6371, var = 55
var = 55
This shows that global variables are not shared: the parent and the child each have their own copy.
In practice, the parent and child follow the principle of share-on-read, copy-on-write: they share the same physical memory until one of them writes to it, and only then is a private copy made.
What the parent and child do share: 1. File descriptors 2. Mapping areas established by mmap
After fork, which of the two processes runs first depends on the scheduling algorithm used by the kernel.
gdb debugging
gdb can only track one process at a time. Before fork is called, use a gdb command to choose whether to follow the parent process or the child process. By default gdb follows the parent process.
set follow-fork-mode child makes gdb follow the child process after fork
set follow-fork-mode parent makes gdb follow the parent process
Note that the setting only takes effect if it is made before the fork call
When several child processes are created in a loop, conditional breakpoints can be used to select the child process to debug