[Linux] Process learning (2) --- understand process operation


Viewing processes

View through the system directory

The root directory contains a system folder named proc, which holds a large amount of process information. Some of its subfolders have purely numeric names. Each number is the PID of a process, and the folder records various information about that process. For example, to view the information of the process with PID 1, look in the folder /proc/1.

View by ps command

1. Running ps axj on its own, without any filtering, displays information about all processes.

[nan@VM-8-10-centos test_23_4_23]$ ps axj

2. Use the ps command together with the grep command to display only the information of a certain process.

ps axj | head -1 && ps axj | grep myproc | grep -v grep
// head -1 keeps the header row of the ps output.
// grep -v grep: grep is itself a process, so this filters the grep process out of the results.

3. Terminate the process

Method 1: Ctrl+C
Method 2: kill -9 [PID]

Get the process identifier through a system call

  • process id (PID)
  • parent process id (PPID)

The system calls getpid and getppid return the PID and PPID of the calling process, respectively.
They require the header files #include<unistd.h> and #include<sys/types.h>.
Let's observe this through a piece of code:

#include<stdio.h>
#include<unistd.h>
#include<sys/types.h>
int main()
{
    while(1)
    {
        // Print this process's PID and its parent's PID once per second
        printf("Hello, I am now a process. My PID is %d and my parent's PID is %d\n",getpid(),getppid());
        sleep(1);
    }
    return 0;
}

After running the executable program generated by the code, the PID and PPID of the process can be printed cyclically. We can view the information of the process through the ps command, and we can find that the PID and PPID of the process obtained through the ps command are the same as those obtained by using the system call functions getpid and getppid.
Another observation: if we run the program repeatedly from the command line, the PID of the process is different each time, but the PPID stays the same. Looking up the parent process with the ps command shows that the parent of this process is bash.

Conclusion: every program started from the command line becomes a process, and the parent of that process is bash, so the bash command-line interpreter is itself a process. Bash runs a program by forking a child process. If the program has a bug and crashes, only the child is affected; bash itself is not. If we use the kill command to terminate the bash process, the command line stops working (interested readers can try it; it is restored after reconnecting or restarting).

Create a process through a system call

Getting to know the fork function

  • fork is a system call that creates a child process
  • Run man fork to view the manual page of the fork system call
  • fork has two return values: one in the parent process and one in the child process
  • Parent and child share code; data is logically separate, with private copies made on write (copy-on-write)

Let's first look at a piece of test code:

#include<stdio.h>
#include<unistd.h>
#include<sys/types.h>
int main()
{
    printf("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA:PID:%d,PPID:%d\n",getpid(),getppid());
    fork();
    printf("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB:PID:%d,PPID:%d\n",getpid(),getppid());
    sleep(1);

    return 0;
}

Running result: one line of A is printed, followed by two lines of B.

Seeing the result of this operation, some people will definitely wonder, why are two lines of B information printed?

Explanation: In the output, the PID in the first B line is 22063 (the process PID), and the PPID (parent process ID) in the second B line is also 22063. This shows that once the fork call executed, a child process was created. The parent of this child is the myproc process we ran, and the parent of myproc, process 20166, is bash. Two lines of B are printed because the current process splits into two once fork creates the child: one is the original myproc process and the other is the child process, and both continue executing from the statement after fork.

Code sharing between parent and child processes, separate space for data, private copy (copy-on-write)

Before fork, the parent process has its own PCB as well as its code and data. When fork creates a child process, it does not immediately copy the parent's code and data; instead, the kernel creates another PCB for the new process. Most attributes of the child are copied from the parent as a template, while a small number are private to the child, such as its PID and PPID. In other words, right after fork the parent and child share one copy of the (parent's) code and data, until one of them writes to the data.

The return value of the fork function

  1. If the child process is created successfully, fork returns the child's PID in the parent process and returns 0 in the child process.
  2. If creation fails, fork returns -1 in the parent process and no child is created.

The A/B test code above shows that the child created by fork shares its code with the parent. However, having the parent and child do the same thing is pointless. In practice, an if-else statement is generally used after fork to branch on the return value, so that the parent and child, which are independent of each other, can perform different tasks.

The test code is as follows :

#include<stdio.h>
#include<unistd.h>
#include<sys/types.h>
int main()
{
    printf("I am running...\n");
    // Receive the return value of fork
    pid_t id=fork();

    if(id==0)
    {
        // Child process
        while(1)
        {
            printf("I am the child process...\n");
            sleep(1);
        }
    }
    else if(id>0)
    {
        // Parent process
        while(1)
        {
            printf("I am the parent process...\n");
            sleep(1);
        }
    }
    else
    {
        // fork failed: report the error
        perror("fork");
        return 1;
    }
    return 0;
}

Running result: the parent and child processes each print their own message in a loop.

From the code above we can see that after fork creates the child process, the two processes each take their own branch of the if-else statement. There are two myproc processes running.

Conclusion:
a. After fork, one execution flow becomes two execution flows.
b. After fork, the order in which the OS schedules the two processes is not fixed; it depends on the operating system's scheduling algorithm.
c. After fork, the code following the fork call is shared, and we use if and else to split the two execution flows.

Note: The parent and child processes are independent of each other

A running process is independent, and this also holds between parent and child. If we run kill -9 [child PID], we can see that the parent process keeps running normally. Their code is shared, and their data becomes private through copy-on-write.

How does fork treat code and data?

Code: code is read-only, so the parent and child can share it safely.
Data: when an execution flow tries to modify data, the operating system automatically triggers copy-on-write for that process, so data modifications in the parent and child do not affect each other.

Process state

Blocked and running state

To understand the various states of the process, we need to first understand what is blocking and what is running.

Question: when we open a piece of software, is it running all the time?

The answer is no. The CPU does not finish one process before starting the next. Instead, processes take turns on the CPU; the switching is so fast that we do not notice it.

Blocked state: the state of a process that is waiting for some resource to become available.

A process must wait until the resource it needs is free before it can use it, so its task_struct is queued under the resource managed by the OS. The state in which a process makes no progress because it is waiting for some condition to become ready, in other words the process is stuck, is called blocking. For example, if you go to a bank counter and the clerk asks you to step aside and fill out a form first, you are in a blocked state.

Processes use not only CPU resources but also other hardware resources. The CPU can service a process's request quickly, but hardware devices are much slower. Take the network card: Thunder, Baidu Netdisk, QQ, and other processes may all need it at the same time. For this reason, each structure that describes a hardware device also contains a task_struct* queue pointer, which points to the head PCB of the processes waiting for that device.

Given the huge speed gap between the CPU and hardware devices, how does the system balance the two? When a process in the running state needs to access a hardware resource, the operating system moves it into the wait queue of that device, and the CPU goes on to execute the next process.

The state of a process that has been moved off the CPU's run queue into a device's wait queue is called the blocked state. Once the process finishes accessing the hardware, its state is changed back to running, that is, it returns to the CPU's run queue.

Summary: PCBs can be maintained in different queues.

Blocked and suspended states: hardware is slow, but a large number of processes need to access it, which inevitably produces many blocked processes. The code and data of these processes will not be executed for a while, and if all of it stays in memory, it occupies memory for nothing.

To deal with this, when there are too many blocked processes and memory is running short, the operating system moves their code and data to disk and keeps only the PCB structures in memory. A process in this situation is said to be suspended. Moving process data out to disk and loading it back into memory is called swapping out and swapping in.

A blocked process is not necessarily suspended, and some operating systems also have states such as new-suspended or ready-suspended.

Process state in the Linux kernel source code

In order to figure out what a running process means, we need to know the different states of the process. A process can have several states (in the Linux kernel, a process is sometimes called a task).
The following states are defined in the kernel source code:

/*
* The task state array is a strange "bitmap" of
* reasons to sleep. Thus "running" is zero, and
* you can test for combinations of others with
* simple bit tests.
*/
static const char *task_state_array[] = {
    "R (running)",       /*  0*/
    "S (sleeping)",      /*  1*/
    "D (disk sleep)",    /*  2*/
    "T (stopped)",       /*  4*/
    "T (tracing stop)",  /*  8*/
    "Z (zombie)",        /* 16*/
    "X (dead)"           /* 32*/
};

Note: the current state of a process is stored in its own process control block (PCB), which in the Linux operating system is the task_struct.

Content classification of task_struct process control block

Next, let's analyze each state in detail.

Command for viewing process status

ps axj | head -1 && ps axj | grep [PID] | grep -v grep

Running state-R

Test code:

#include<stdio.h>
int main()
{
    // Busy loop: the process never waits for anything, so it stays runnable
    while(1)
    {
    }
    return 0;
}

Querying the process status with the ps command above shows that the process is in state R.
R, running state (running): being in state R does not mean the process is executing on the CPU at this very moment. It may be executing, or it may be waiting in the run queue. All processes in the R state are placed in the run queue, and when the operating system switches processes, it picks the next one to run from that queue.

Light sleep state-S

The essence of the sleep state is the blocking state.

Test code:

#include <stdio.h>
int main()
{
    int a=0;
    while(1)
    {
        printf("%d\n",a++);
    }
    return 0;
}

Viewing the process status with ps: most queries show state S.

Light sleep state S: the process is sleeping, which means it is waiting for some event to complete. A process in light sleep can be woken up at any time and can also be killed (light sleep is also called interruptible sleep).

Some people may wonder: the code is obviously running, so why is it in a blocked state?
The difference from the running-state test is the printf call. Printing requires access to a peripheral (the display), so the PCB of the mytest process spends most of its time in the peripheral's wait queue (blocked state). Because the CPU is much faster than the peripheral, there is only a small chance of catching the process in state R; most queries show state S.

Note: a + sign after the state indicates a foreground process; no + sign indicates a background process.
A foreground process can be terminated with Ctrl+C. A background process does not receive Ctrl+C from the terminal, so Ctrl+C cannot terminate it.

Deep sleep state-D

Deep sleep state D: also called uninterruptible sleep. A process in this state cannot be killed, not even by the operating system, because forcibly ending it could bring the system down. It leaves this state only when it is woken up by the event it is waiting for. A process in this state is usually waiting for I/O to finish.

Stopped state-T

The stopped state is essentially also a blocked state.

Stopped state T (stopped): in Linux, we can stop a process by sending it SIGSTOP (kill -19 [PID]) and resume a stopped process by sending it SIGCONT (kill -18 [PID]).

Tracing stop state t: when we debug an executable with gdb, set a breakpoint with b, and start the program with run, the program stops at the breakpoint and enters the t state (tracing stop), which indicates that the process is being traced.

Zombie state-Z

Zombie state Z (zombie): when a process exits, the operating system does not release all of its resources immediately. It keeps the process's exit information until the parent process (or the operating system) reads the child's exit result, that is, its exit code. If the exit code is never read, a zombie process results. A zombie stays in the process table in a terminated state, waiting for the parent to read its exit status. So whenever a child exits while its parent is still running and the parent does not read the child's status, the child enters the zombie state Z.

For example, C/C++ programs usually end main with return 0. This 0 is the exit code, and it is the result the parent process obtains when it waits for the child. The exit code is temporarily saved in the child's process control block.
A parent process creates a child process to complete some task, so the child needs to report how the task went back to the parent when it finishes.
In Linux, the echo $? command prints the exit code of the most recently exited process.

[nan@VM-8-10-centos test_23_4_27]$ echo $?

Simulating the zombie state with test code: in the following code, the child process prints once and then exits when it reaches exit(1), while the parent process keeps printing. In other words, the child has exited and the parent is still running, but the parent never reads the child's exit result, so the child enters the zombie state.

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <stdlib.h>
int main()
{
    pid_t id=fork();
    if(id==0)
    {
        // Child process: print once, then exit with code 1
        printf("Child process, PID=%d, PPID=%d\n",getpid(),getppid());
        sleep(1);
        exit(1);
    }
    else if(id>0)
    {
        // Parent process: keeps running and never reads the child's exit status
        while(1)
        {
            printf("Parent process, PID=%d, PPID=%d\n",getpid(),getppid());
            sleep(1);
        }
    }
    else
    {
        // fork failed
        perror("fork");
        exit(EXIT_FAILURE);
    }
    return 0;
}

Test result: while the program runs, querying with ps shows that the child process is in state Z and is marked <defunct>, while the parent process keeps running.

The dangers of zombie processes

  1. The exit status of a zombie process must be kept, because the process has to tell the process that cares about it (the parent) how the entrusted task went. If the parent never reads it, the child stays in the zombie state Z indefinitely.
  2. Keeping the exit status requires storing data. It is part of the basic information of the process, so the exit information of a zombie is stored in its task_struct (PCB). As long as the parent has not read the child's exit result, the Z state persists and the PCB has to be kept.
  3. If a parent creates many children and reaps none of them, memory is wasted, because each task_struct object itself occupies memory. Unreclaimed zombies therefore amount to a memory leak, which is a serious problem.

Dead state-X

X, dead state (dead): this is only a transient return state, and you will not see it in the task list. A dead process is reclaimed almost immediately, which is too fast for us to observe.


Origin blog.csdn.net/weixin_63449996/article/details/130322571