Assignment 1: Analysis of the process model based on the Linux 2.6 source code

1. Introduction

This article analyzes the process model in depth, based on the source code of Linux 2.6. It covers:

  • What a process is
  • How the operating system organizes processes
  • How process states transition
  • How processes are scheduled
  • My own views on the operating system process model

2. What is a process

  A process is a program in execution together with all the resources it holds, including a virtual processor, virtual address space, registers, stack, global data segment, and so on.

 [Figure: processes on Windows 10]

3. How the operating system organizes processes

 

1. Describing the process: the PCB

Information about a process is kept in a structure called the process control block (PCB), which can be understood as the collection of the attributes of the process. In Linux the PCB is a structure called task_struct, which stores the information about the process.

Content classification of task_struct structure:

①Identifier (PID): a unique identifier that describes the process and distinguishes it from other processes

There are several ways to get the PID. The recommended one is the system call getpid().

②Status: task status, exit code, exit signal, etc.

Status classification: R running (running), S sleep (sleeping), D disk sleep (disk sleep), T stopped (stopped), X dead (dead), Z zombie (zombie)

③ Priority: the priority relative to other processes

PRI: The priority of the process execution, the smaller the value, the higher the priority

NI: the nice value, a correction that is applied to the priority of a runnable process

④Program counter: The address of the next instruction to be executed in the program

⑤ Memory pointers: pointers to program code and process-related data, as well as pointers to memory blocks shared with other processes

⑥Context data: the data in the registers of the processor when the process is executed

⑦I/O status information: including outstanding I/O requests, the I/O devices allocated to the process and the list of files used by the process

⑧ Accounting information: may include the total processor time, the total clock time used, time limits, account numbers, etc.

⑨Other information

2. Organizing processes

Processes are created and destroyed all the time, which makes a linked list a natural data structure for them: all processes in the system exist in the kernel as a linked list of task_struct structures. This is how the processes are organized.

3. Viewing processes

Now that the processes have been described and organized, what are some ways we can view them?

① Process information can be viewed through the /proc file system; for example, to view the process whose PID (process identifier) is 1, look in the /proc/1 directory

②Get the process identifier through the system call:

 

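#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    /* pid_t is a signed integer type; cast it for printf */
    printf("pid:%d\n", (int)getpid());    /* PID of this process */
    printf("ppid:%d\n", (int)getppid());  /* PID of its parent */
    return 0;
}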

 

The getpid() function returns the identifier of the calling process, and the getppid() function returns the identifier of its parent process

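4. Creating a process

A process is created through a system call, namely the fork() function. On success fork() returns twice: it returns the PID of the child in the parent process and 0 in the child process.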

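#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    /* fork() returns the child's PID in the parent, 0 in the child, -1 on error */
    pid_t ret = fork();

    /* on success both processes execute this line, so it prints twice */
    printf("hello proc:%d,ret = %d\n", (int)getpid(), (int)ret);
    return 0;
}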

5. Process states:

R running state (running): the process is either running or in the run queue

S sleep state (sleeping): the process is waiting for an event to complete; this sleep is also called interruptible sleep

D disk sleep state (disk sleep): a process in this state is usually waiting for I/O to finish; this sleep is also called uninterruptible sleep

T stopped state (stopped): sending the SIGSTOP signal to a process stops it, and sending the SIGCONT signal lets the stopped process continue running

X dead state (dead): this state is only a return state; you will not see it in the task list

Z zombie state (zombie): when a child process exits and the parent process has not read its exit code, the child process enters the Z state

Zombie process:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t id = fork();

    if (id < 0) {
        perror("fork failed");
        return 1;
    } else if (id > 0) {
        /* parent: sleep for 30 s without reading the child's exit code */
        printf("father pid:%d\n", (int)getpid());
        sleep(30);
    } else {
        /* child: sleep for 5 s, then exit; it stays in the Z state until the parent exits */
        printf("child pid:%d\n", (int)getpid());
        sleep(5);
        exit(EXIT_SUCCESS);
    }
    return 0;
}

 

The dangers of zombie processes:

①The exit status is part of the basic information of a process, so it is stored in the PCB. As long as the parent process has not read it, the child stays in the Z state and the kernel has to keep its PCB.

② The PCB of a zombie process keeps occupying memory, which wastes memory resources

③ If zombies accumulate, the effect is a memory leak

4. How process states transition

1. The states of a Linux process are:

TASK_RUNNING: ready or running; the process is able to run but is not necessarily occupying the CPU. Corresponds to process state R

TASK_INTERRUPTIBLE: sleeping; the process is in a light sleep and can respond to signals. A process is usually in this state because it went to sleep voluntarily. Corresponds to process state S

TASK_UNINTERRUPTIBLE: sleeping; the process is in a deep sleep and does not respond to signals. A typical case is a process blocked while acquiring a semaphore. Corresponds to process state D

TASK_ZOMBIE: zombie; the process has exited or ended, but the parent process does not know it yet and has not reclaimed it. Corresponds to process state Z

TASK_STOPPED: stopped or being debugged. Corresponds to process state T

2. Process scheduling timing:

Process scheduling causes process state transitions. As the Linux process state machine figure shows, the following situations trigger scheduling: a process terminates or goes to sleep and thereby gives up the CPU voluntarily; a process in light sleep is woken up and selected by the scheduler (CFS in kernels from 2.6.23 onward); a process in deep sleep, blocked on a semaphore or lock, is woken up when the resource it is waiting for is released; a process receives a signal; and, most commonly, an interrupt or exception occurs.

5. How processes are scheduled

Goals of Linux process scheduling

    1. Efficiency: efficiency means completing more tasks in the same amount of time. The scheduler itself runs frequently, so it should be as efficient as possible;

    2. Interactive performance: even under considerable load, the response time of the system should be guaranteed;

    3. Fairness: every process should get its share of the CPU, and starvation must be avoided;

    4. SMP scheduling: the scheduler must support multiprocessor systems;

    5. Soft real-time scheduling: the system must schedule real-time processes effectively, but it does not guarantee that their requirements are always met.

Linux process priority

Linux provides two kinds of priority: the normal process priority and the real-time priority. The former uses the SCHED_NORMAL scheduling policy, and the latter uses the SCHED_FIFO or SCHED_RR scheduling policy. At any time, a real-time process has a higher priority than an ordinary process, and a real-time process can only be preempted by a higher-priority real-time process.

Scheduling of real-time processes

  A real-time process has only a static priority, because the kernel does not adjust it according to factors such as sleep time. Its range is 0 to MAX_RT_PRIO-1. MAX_RT_PRIO is 100 by default, so the default real-time priority range is 0~99. The nice value affects processes whose priority lies in the range MAX_RT_PRIO~MAX_RT_PRIO+39, that is, 100~139.

  Unlike ordinary processes, real-time processes with a higher priority always run before those with a lower priority, until the higher-priority real-time process can no longer run. Real-time processes are always considered active. If several real-time processes have the same priority, the system selects them in the order in which they appear on the queue. Suppose the real-time process A running on the current CPU has priority a, and a real-time process B with priority b becomes runnable. As long as b<a, the system interrupts A and runs B first, until B can no longer run (regardless of which real-time policy A and B use).

   The scheduling policy only makes a difference between real-time processes that have the same priority:

   1. For FIFO processes, other processes of the same priority get their turn only when the current process finishes or gives up the CPU. It is quite domineering.

   2. For RR processes, once the time slice is exhausted the process is placed at the end of the queue and other processes of the same priority run; if there are no other processes of the same priority, the process continues to run.

   In short, for real-time processes the high-priority process is the boss. It runs until it can no longer run, and only then is it the turn of a lower-priority process. The hierarchy is strict.

Non-real-time process scheduling

Linux schedules ordinary processes according to their dynamic priority, which is obtained by adjusting the static priority (static_prio). In Linux the static priority is hidden inside the kernel and is not visible to the user. The kernel gives users an interface that affects the static priority, namely the nice value. The two are related as follows:

  static_prio=MAX_RT_PRIO +nice+ 20

  The range of nice values is -20~19, so the static priority range is 100~139. The larger the nice value, the larger the static_prio, and the lower the final process priority.

  In the output of the ps -el command, the NI column shows the nice value of each process, and PRI is the priority of the process (the static priority for a real-time process, the dynamic priority for a non-real-time process).

  The time slice of a process is determined entirely by static_prio, as shown in the figure below, taken from "Understanding the Linux Kernel":

  

   As mentioned earlier, the system considers other factors when scheduling, so it computes a dynamic priority for each process and schedules on that basis. Besides the static priority, the nature of the process is taken into account. For example, if the process is interactive, its priority can be raised somewhat to make the interface more responsive and give the user a better experience. Linux 2.6 improved greatly in this respect. It judges whether a process is interactive by its average sleep time: the more time a process has spent sleeping in the past, the more likely it is to be interactive. When scheduling, the system gives such a process a larger bonus so that it gets more opportunities to run. The bonus ranges from 0 to 10.

  The system arranges process execution strictly in order of dynamic priority. A process with a lower dynamic priority gets its turn only after the higher-priority process enters a non-running state or exhausts its time slice. The dynamic priority is calculated from two factors: the static priority and the bonus, which is derived from the average sleep time of the process. It is calculated as follows:

     dynamic_prio = max (100, min (static_prio - bonus + 5, 139))

  When scheduling, Linux 2.6 uses a small trick, the classic algorithmic idea of trading space for time, so that the best process to run can be found in O(1) time.

Linux process state machine

 

 

6. My views on the operating system process model

Processes are one of the oldest and most important abstractions provided by operating systems; they turn a single CPU into multiple virtual CPUs. Without the process abstraction, modern computing could not exist. Because the process is so important, it has to be studied well before going on to the rest of the operating system.

 
