Process - CPU and MMU / environment variables / create child process

Process-related concepts

        1. Concurrency
        2. Single-programming
        3. Multi-programming
        4. cpu/mmu
        5. Process control block
        6. Process status

Environment variables

        1. The role of commonly used environment variables
        2. Functions

Process control primitives

        1. fork: creating child processes in a loop
        2. exec: the parameters and usage of each function in the family
        3. wait/waitpid: the usual way to reap child processes

Programs and Processes

A program is a compiled binary file, such as a.out. It is stored on disk and does not occupy system resources (CPU, memory, open files, devices, locks...).

A process is an abstract concept closely tied to operating system principles. A process is a program in execution: it occupies system resources and runs in memory (running a program creates a process).

Concurrency

In an operating system, concurrency means that within a given period several processes have been started and are in progress, but on a single CPU only one of them is actually running at any instant.

Single-programming model: Microsoft's DOS could execute only one program at a time on the CPU. The next program had to queue and wait until the current one finished, so efficiency was very low.

Multiprogramming model: multiple processes appear to execute simultaneously, but on a single CPU they do not. Each process's work is divided into segments, CPU time is divided into time slices, and each time slice is assigned to one of the segments.

Clock interrupt: the hardware mechanism that divides CPU time into slices. When a clock interrupt arrives, the running process has no choice but to give up the CPU so that another process can run.

In the multiprogramming model, processes take turns using the CPU. A typical modern CPU works on a nanosecond scale and can execute roughly 1 billion instructions per second, while human perception works on a millisecond scale, so the processes appear to run at the same time.

Simple Architecture of CPU

  

Cache: a buffer.
Prefetcher: takes one instruction at a time from the cache.
Decoder: analyzes what the instruction does and which registers are needed to complete it, then hands it to the ALU.
ALU (arithmetic logic unit): in this simplified model it performs only addition and left-shift operations, then writes the result back to a register.

The basic working principle of mmu

The MMU is a piece of hardware located inside the CPU.

Virtual address: the available address space is 4 GB (on a 32-bit system), but the memory actually occupied is far smaller. For example, the variable in int a = 10; might have the virtual address 0x804a4000 while its physical address is 1000.

The MMU does two jobs:
1. It maintains the mapping between virtual addresses and physical addresses.
2. It sets the CPU's memory access privilege level. The levels are 3, 2, 1 and 0, where 3 is the lowest and 0 is the highest. Linux uses only levels 3 and 0: in the virtual address space, user space runs at level 3 and kernel space at level 0.

Processes are independent of each other. If two a.out instances run at the same time, each needs its own allocation of physical memory for its user space, but the kernel needs only one copy: both processes share the same kernel space.

Process Control Block PCB

Each process has a process control block (PCB) in the kernel that holds information about the process. In the Linux kernel the PCB is a task_struct structure with many members. The following are the important ones to know:

Process ID. Each process in the system has a unique ID, represented in C by the pid_t type, whose value is a non-negative integer.

Process state, including ready, running, suspended, stopped, etc.

Some cpu registers that need to be saved and restored during process switching

Information describing the virtual address space

Information describing the controlling terminal

Current working directory

umask (file mode creation mask)

File descriptor table, containing pointers to file structures

Information about signals

User ID and group ID

Session and process group

Resource limits that apply to the process


Introduction to Environment Variables

Linux is an open-source, multi-tasking, multi-user operating system.
Multi-tasking: processes run concurrently.
Multi-user: multiple users can be logged in to one computer at the same time.

Environment variables are parameters that describe the environment in which the operating system runs processes. They usually have the following characteristics:
1. They are strings (in essence).
2. They have a uniform format: name=value[:value]
3. The value describes the process's environment.

PATH
Records the directories that are searched for executable files. View it with echo $PATH

SHELL
Records the current command interpreter (the current shell). Its value is usually /bin/bash

TERM
The current terminal type. Under a graphical interface it is usually xterm. The terminal type determines how some programs display their output. For example, graphical terminals can display Chinese characters, but character terminals generally cannot.

LANG
The language and locale, which determine the character encoding and the display format of times, currency and similar information.

HOME
The path of the current user's home directory. Many programs save their configuration files in the home directory so that each user has a separate configuration when running the program.

View process environment variables

#include <stdio.h>

// environment variable table of the current process
extern char **environ;

int main(void){
    // the table is terminated by a NULL pointer
    for(int i=0;environ[i];i++){
        printf("%s\n",environ[i]);
    }
    return 0;
}

After compiling and running, the program prints the environment variable table of its own process.

Environment variable manipulation function

The getenv function
Gets the value of an environment variable.
        char *getenv(const char *name);
        Success: returns the value of the environment variable. Failure: NULL (name does not exist).

The setenv function
Sets the value of an environment variable.
        int setenv(const char *name, const char *value, int overwrite);
        Success: 0. Failure: -1.
        The overwrite parameter: 1: overwrite the existing value.
                                 0: do not overwrite. (setenv is often used to create a new environment variable such as ABC=haha-day-night.)

The unsetenv function
Deletes the definition of the environment variable name.
        int unsetenv(const char *name);
        Success: 0. Failure: -1.
        Note: if name does not exist, unsetenv still returns 0 (success). If name contains "=", as in "ABC=", it fails with an error.

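#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// passing NULL to printf("%s") is undefined behavior, so substitute a marker
static const char *show(const char *val){
    return val ? val : "(null)";
}

int main(void){
    char *val;
    const char *name = "ABD";

    val = getenv(name);
    printf("1.%s = %s\n",name,show(val));

    setenv(name, "haha-day-night",1); // 1 means overwrite the value if name already exists

    val = getenv(name);
    printf("2.%s = %s\n",name,show(val));

#if 0
    int ret = unsetenv("ABDFGH"); // a name that does not exist: still returns 0
    printf("ret = %d\n",ret);

    val = getenv(name);
    printf("3.%s = %s\n",name,show(val));

#else
    int ret = unsetenv("ABD"); // delete the variable that was just set
    printf("ret = %d\n",ret);

    val = getenv(name);
    printf("3.%s = %s\n",name,show(val));

#endif

    return 0;
}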

Output result:

1.ABD = (null) //ABD does not exist in the current environment

2.ABD = haha-day-night //setenv created ABD with the value haha-day-night

ret = 0 //0 means ABD was deleted successfully

3.ABD = (null) //getenv on ABD returns NULL again

Create a single child process

The fork function
Creates a child process.
pid_t fork(void); On failure it returns -1. On success it returns twice: the child's pid (a positive integer) in the parent, and 0 in the child.

When the parent process calls fork, a child process is created, and both processes continue executing from the point where fork returns. The two processes see different return values: in the parent, fork returns the child's process ID; in the child, fork returns 0.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(void){
    pid_t pid;
    printf("xxxxxxxxxxxx\n");
    pid = fork();

    if(pid == -1){
        perror("fork error");
        exit(1);
    }else if(pid == 0){
        // child: fork returned 0
        printf("i am child, pid = %d, ppid = %d\n",getpid(),getppid());
    }else{
        // parent: fork returned the child's pid
        printf("i am parent, pid = %d, ppid = %d\n",getpid(),getppid());
        sleep(1);
    }
    // both processes reach this line
    printf("YYYYYYYYYYYYYYY\n");
    return 0;
}

Output result:

xxxxxxxxxxxx

i am parent, pid = 16715, ppid = 2302

i am child, pid = 16716, ppid = 16715

YYYYYYYYYYYYYYY

The final line is printed by both processes, because both continue past the if/else: once by the child and, after its sleep(1), once by the parent.

The parent's ppid, 2302, can be looked up with ps aux | grep 2302, which shows that it is bash. bash is the parent process of this program because it launched it.

Create N child processes in a loop 

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(void){
    pid_t pid;
    printf("xxxxxxxxxxxx\n");

    // try to create 5 child processes in a loop
    for(int i=0;i<5;i++){
        pid = fork();
        if(pid == -1){
            perror("fork error");
            exit(1);
        }else if(pid == 0){
            // the child does not leave the loop, so it forks again on the next iteration
            printf("i am %dth child, pid = %d, ppid = %d\n",i+1,getpid(),getppid());
        }else{
            printf("i am %dth parent, pid = %d, ppid = %d\n",i+1,getpid(),getppid());
            sleep(1);
        }
    }

    printf("YYYYYYYYYYYYYYY\n");
    return 0;
}

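⚠️: The output shows far more than 5 child processes. Each child continues the for loop after printing and calls fork itself, so the 5 iterations produce 2^5 = 32 processes in total instead of 1 parent and 5 children.

Solution: make each child break out of the loop immediately after fork, so that only the original parent keeps forking.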

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(void){
    int i;
    pid_t pid;
    printf("xxxxxxxxxxxx\n");

    // create 5 child processes in a loop
    for(i=0;i<5;i++){
        pid = fork();
        if(pid == -1){
            perror("fork error");
            exit(1);
        }else if(pid == 0){
            break;  // the child leaves the loop at once
        }
    }
    if(i<5){
        // child: i is the iteration in which it was created
        sleep(i);
        printf("i am %d child, pid = %d\n",i+1,getpid());
    }else{
        // parent: the loop ran to completion, so i == 5
        sleep(i);
        printf("i am parent\n");
    }
    return 0;
}

Output result:

xxxxxxxxxxxx

i am 1 child, pid = 17065

i am 2 child, pid = 17066

i am 3 child, pid = 17067

i am 4 child, pid = 17068

i am 5 child, pid = 17069

i am parent

Summary: the loop runs five times and creates five child processes. The five children and the parent, six processes in total, then compete for the CPU. Because each process calls sleep(i), the output order is controlled: the first child sleeps for 0 seconds and prints first, each later child sleeps one second longer, and the parent prints last because it leaves the loop with i == 5 and sleeps for 5 seconds.

Parent-child process sharing 

What are the similarities and differences between the parent and child processes just after fork?
Same in parent and child: global variables, .data, .text, stack, heap, environment variables, user ID, home directory, working directory, signal handling settings...
Different between parent and child: 1. Process ID 2. Return value of fork 3. Parent process ID 4. Process running time 5. Alarms (timers) 6. Pending signal set

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int var = 34;

int main(void){
    pid_t pid;

    pid = fork();
    if(pid == -1){
        perror("fork error");
        exit(1);
    }else if(pid >0){
        sleep(2);
        var = 55;
        printf("i am parent pid = %d, parentID = %d, var = %d\n",getpid(),getppid(),var);
    }else if(pid == 0){
        var = 100;
        printf("i am child pid = %d, parentID = %d, var = %d\n",getpid(),getppid(),var);
    }
    printf("var = %d\n",var);

    return 0;
}

Output result:

i am child pid = 6493, parentID = 6492, var = 100

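var = 100

i am parent pid = 6492, parentID = 6371, var = 55

var = 55

This shows that a global variable is not shared between parent and child: each process ends up with its own copy.

Parent and child follow the principle of share-on-read, copy-on-write: they share the same physical pages until one of them writes, at which point the writer gets its own copy.

What parent and child do share: 1. File descriptors 2. Mapping areas established by mmap (when created with MAP_SHARED)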

After fork, which of the parent and child runs first depends on the scheduling algorithm used by the kernel.

gdb debugging

gdb can follow only one process at a time. Before the fork call is reached, use a command to tell gdb whether to follow the parent or the child. By default it follows the parent.

set follow-fork-mode child tells gdb to follow the child process after fork

set follow-fork-mode parent tells gdb to follow the parent process

Note that the setting takes effect only if it is made before fork is called.

When child processes are created in a loop, a conditional breakpoint can be used to select the child process to debug.

Origin blog.csdn.net/weixin_43754049/article/details/126065508