Assignment 1: An in-depth source code analysis of the process model (based on Linux kernel 2.6)

Foreword

This article analyzes the process model in depth, working from the Linux kernel 2.6 source code. The process is one of the core concepts of an operating system and the main means by which the system carries out its work, so it is analyzed here to strengthen our understanding of operating systems.

The Linux kernel 2.6 source code can be downloaded from: https://mirrors.edge.kernel.org/pub/linux/kernel/v2.6/

First, the concept and characteristics of the process

The first is Baidu Encyclopedia's explanation of the process:

A process is a running activity of a program in a computer on a data set, the basic unit of resource allocation and scheduling in the system, and the basis of the operating system structure. In the early computer structure of process-oriented design, the process is the basic execution entity of the program; in the contemporary computer structure of thread-oriented design, the process is the container of the thread. A program is a description of instructions, data, and their organization, and a process is the entity of a program.

Next is Wikipedia's explanation of the process:

In computing, a process is an instance of a computer program that is being executed. It contains the program code and its current activity. Depending on the operating system (OS), a process may be made up of multiple threads of execution that execute instructions concurrently.

The characteristics of a process are described below:

Dynamic: A process is, in essence, one execution of a program in a multiprogramming system; it is created dynamically and terminates dynamically.
Concurrency: Any process can execute concurrently with other processes.
Independence: A process is a basic unit that can run independently, and it is also the independent unit by which the system allocates resources and schedules.
Asynchrony: Because processes constrain one another, each executes intermittently; that is, each process advances at its own independent and unpredictable speed.
Structure: A process consists of three parts: program, data, and process control block.
Multiple different processes can contain the same program: a program run on different data sets forms different processes and can produce different results; but the program cannot be changed during execution.

With this basic understanding of the process, we can proceed to the next step of the analysis.

Second, how the operating system organizes processes to run

In Linux, a process is defined in the header file /linux/include/linux/sched.h as the task_struct structure, and each instance of that structure is a process. task_struct is composed of many fields. Some important elements are listed below for analysis.


Identifier: A unique identifier associated with a process that distinguishes it from other processes.
State: Describes the state of the process. Because a process can be in one of several states, such as suspended, blocked, or running, a field records its current execution state.
Priority: When several processes are runnable, the order in which they execute depends on each process's priority.
Program counter: The address of the next instruction in the program to be executed.
Memory pointers: Pointers to program code and process-related data.
Context data: The data in the processor's registers while the process is executing.
I/O status information: Includes outstanding I/O requests, I/O devices allocated to the process, and a list of files used by the process.
Accounting information: Includes the total processor time used, account numbers, and so on.
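
The elements above can be pictured as fields of one structure. The following is a minimal sketch under stated assumptions: `struct pcb`, `enum pcb_state`, and `pcb_init` are hypothetical names invented for illustration, and the field sizes are arbitrary. It is not the kernel's task_struct, which is far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified process control block; NOT the kernel's task_struct. */
enum pcb_state { PCB_READY, PCB_RUNNING, PCB_BLOCKED };

struct pcb {
    int            pid;             /* identifier */
    enum pcb_state state;           /* state */
    int            priority;        /* priority */
    unsigned long  program_counter; /* address of the next instruction */
    void          *code;            /* memory pointers: program code ... */
    void          *data;            /* ... and process data */
    unsigned long  regs[8];         /* context data: saved registers */
    int            open_files[4];   /* I/O status: descriptors in use, -1 = unused */
    unsigned long  cpu_time;        /* accounting: processor time consumed */
};

/* Fill in a fresh PCB with neutral defaults. */
void pcb_init(struct pcb *p, int pid, int priority)
{
    size_t i;

    assert(p != NULL);
    memset(p, 0, sizeof(*p));
    p->pid = pid;
    p->state = PCB_READY;
    p->priority = priority;
    for (i = 0; i < sizeof(p->open_files) / sizeof(p->open_files[0]); i++)
        p->open_files[i] = -1;
}
```

Grouping everything in one structure is what lets the system save and restore a process as a unit: switching processes amounts to storing the running process's context in its block and loading another's.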

Third, the relationship between processes

/*
 * pointers to (original) parent process, youngest child, younger sibling,
 * older sibling, respectively.  (p->father can be replaced with
 * p->parent->pid)
 */
struct task_struct *real_parent; /* real parent process (when being debugged) */
struct task_struct *parent; /* parent process */
/*
 * children/sibling forms the list of my children plus the
 * tasks I'm ptracing.
 */
struct list_head children; /* list of my children */
struct list_head sibling; /* linkage in my parent's children list */
struct task_struct *group_leader; /* threadgroup leader */

In a Linux system, all processes are directly or indirectly linked, and each process has its parent process and may have zero or more child processes. All processes that have the same parent process are siblings.

real_parent points to the process that created this process, or to the init process (PID 1) if that parent no longer exists. parent points to the current parent process, to which a signal must be sent when this process terminates; its value is usually the same as real_parent. children is the head of a linked list whose elements are all of this process's child processes. sibling links the current process into its parent's list of children, that is, the list of its siblings. group_leader points to the leader of the process's thread group.
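
The way children and sibling work together is easier to see in a small user-space model. The sketch below is an illustration only: `struct task`, `task_init`, `task_add_child`, `count_children`, and `first_child` are hypothetical names, and `struct list_head` and `container_of` are re-declared here as simplified stand-ins for the kernel's own helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list node. */
struct list_head { struct list_head *next, *prev; };

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical cut-down task holding only the relationship fields. */
struct task {
    int pid;
    struct task *parent;
    struct list_head children; /* head of the list of my children */
    struct list_head sibling;  /* my node in my parent's children list */
};

void task_init(struct task *t, int pid)
{
    t->pid = pid;
    t->parent = NULL;
    t->children.next = t->children.prev = &t->children;
    t->sibling.next = t->sibling.prev = &t->sibling;
}

/* Append child's sibling node to the tail of parent's children list. */
void task_add_child(struct task *parent, struct task *child)
{
    struct list_head *head = &parent->children;
    struct list_head *node = &child->sibling;

    assert(child->parent == NULL);
    child->parent = parent;
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk the circular list until we are back at the head. */
int count_children(const struct task *t)
{
    const struct list_head *pos;
    int n = 0;

    for (pos = t->children.next; pos != &t->children; pos = pos->next)
        n++;
    return n;
}

/* The list stores sibling nodes, so container_of maps a node back to its task. */
struct task *first_child(struct task *t)
{
    if (t->children.next == &t->children)
        return NULL;
    return container_of(t->children.next, struct task, sibling);
}
```

Note that the list rooted at a parent's children field is threaded through each child's sibling field, not through the child's own children field. This is why the two fields always appear as a pair.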

Fourth, the transition between process states

1. Process status

In the task_struct structure, the statement that defines the process state is:

volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
The volatile keyword tells the compiler not to optimize away accesses to this variable: its value must be read from memory each time it is used, so that code always sees the current state of the process.
In the /linux/include/linux/sched.h header file we can find the possible values of state, as follows:

/*
 * Task state bitmask. NOTE! These bits are also
 * encoded in fs/proc/array.c: get_task_state().
 *
 * We have two separate sets of flags: task->state
 * is about runnability, while task->exit_state are
 * about the task exiting. Confusing, but this way
 * modifying one set can't modify the other one by
 * mistake.
 */
#define TASK_RUNNING 0
#define TASK_INTERRUPTIBLE 1
#define TASK_UNINTERRUPTIBLE 2
#define TASK_STOPPED 4
#define TASK_TRACED 8

/* in tsk->exit_state */
#define EXIT_ZOMBIE 16
#define EXIT_DEAD 32

/* in tsk->state again */
#define TASK_NONINTERACTIVE 64
#define TASK_DEAD 128

According to the comment after state, state<0 means that the process is unrunnable, state=0 means that it is runnable (running or ready to run), and state>0 means that it is stopped.
The following lists some common values of state:

| state | description |
| :---------------------- | :----------------------------------------------------------- |
| 0 (TASK_RUNNING) | The process is running or ready to run |
| 1 (TASK_INTERRUPTIBLE) | The process is in an interruptible sleep state and can be woken up by a signal |
| 2 (TASK_UNINTERRUPTIBLE) | The process is in an uninterruptible sleep state and cannot be woken up by a signal |
| 4 (TASK_STOPPED) | The process is stopped |
| 8 (TASK_TRACED) | The process is being traced (monitored by another process) |
| 16 (EXIT_ZOMBIE) | Zombie state: the process has terminated, but its parent process has not yet collected its termination information |
| 32 (EXIT_DEAD) | The process is dead; this is the final state of the process |
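
Because the values are bit flags split across two fields, decoding them takes a few ordered checks. The sketch below uses the constants from the excerpt above; `struct task_view` and `task_state_letter` are hypothetical names, and the single-letter codes follow the convention commonly shown by ps, which is an assumption here and not something taken from the kernel source quoted in this article.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the sched.h excerpt above. */
#define TASK_RUNNING         0
#define TASK_INTERRUPTIBLE   1
#define TASK_UNINTERRUPTIBLE 2
#define TASK_STOPPED         4
#define TASK_TRACED          8
#define EXIT_ZOMBIE          16
#define EXIT_DEAD            32

/* Hypothetical view of the two state fields of a task. */
struct task_view {
    long state;      /* runnability flags */
    long exit_state; /* exit flags, kept in a separate field */
};

/* Map the two fields to a one-letter code (letters are an assumed convention). */
char task_state_letter(const struct task_view *t)
{
    assert(t != NULL);
    if (t->exit_state & EXIT_DEAD)
        return 'X';
    if (t->exit_state & EXIT_ZOMBIE)
        return 'Z';
    if (t->state == TASK_RUNNING) /* 0 is not a bit, so compare for equality */
        return 'R';
    if (t->state & TASK_INTERRUPTIBLE)
        return 'S';
    if (t->state & TASK_UNINTERRUPTIBLE)
        return 'D';
    if (t->state & (TASK_STOPPED | TASK_TRACED))
        return 'T';
    return '?';
}
```

TASK_RUNNING is zero, so it cannot be tested with a bitwise AND like the other flags; it has to be compared for equality.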

2. The relationship diagram of the mutual conversion between the various states of the process:

(The picture comes from the Internet)

Fifth, process scheduling

1. Data structures related to process scheduling

Before understanding how processes are scheduled, we need to understand some data structures related to process scheduling.

① Runnable queue (runqueue)

In the /kernel/sched.c file, the run queue is defined as struct rq. Each CPU has its own struct rq, which mainly stores basic information used for scheduling, including real-time scheduling and CFS scheduling. struct rq is a very important data structure in Linux kernel 2.6. Some of its important fields are introduced next; note that the fields shown here belong to the O(1) scheduler, which was used until CFS replaced it in 2.6.23:

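                            /*   A selection of fields, with comments   */
    // Spinlock for the runqueue; it must be held while operating on the runqueue.
    // Because each CPU has its own runqueue, contention is greatly reduced.
    spinlock_t lock;

    // Records the time at which the first task in the active array used up its time slice
    unsigned long expired_timestamp;
    // Total number of ready processes on this CPU: the sum of the processes in the active array and the expired array
    unsigned long nr_running;
    // Number of process switches that have occurred on this CPU since it started running
    unsigned long long nr_switches;
    // Number of processes on this CPU in the uninterruptible state
    unsigned long nr_uninterruptible;
    // This is the most important part of rq; it is analyzed in detail below
    struct prio_array *active, *expired, arrays[2];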

② Priority array (prio_array)

In Linux kernel version 2.6, two arrays sorted by priority were added to rq: the active array and the expired array.
Both have the type struct prio_array, which is defined in /kernel/sched.c. Its data structure is:

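struct prio_array {
    unsigned int nr_active; // number of tasks in this array
    DECLARE_BITMAP(bitmap, MAX_PRIO+1); /* include 1 bit for delimiter */
    /* Reserves MAX_PRIO + 1 bits. When a task of some priority is in the
       TASK_RUNNING state, the bit for that priority is set to 1, so finding
       the highest priority that needs to run only requires finding which
       bit of bitmap is set to 1. */
    struct list_head queue[MAX_PRIO]; // one list head per priority
};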

The active array is the queue of runnable processes from which the CPU selects the process to execute. The processes in this queue still have time slice remaining, and the *active pointer always points to it.
The expired array stores the processes from the active array that have used up their time slices, and the *expired pointer always points to it.
Once an ordinary process in the active array uses up its time slice, the scheduler recalculates the time slice and priority of the process, removes it from the active array, and inserts it into the queue of the corresponding priority in the expired array.
When all tasks in the active array have used up their time slices, the scheduler only needs to swap the *active and *expired pointers to switch run queues.
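
The pointer swap described above can be sketched in a few lines. This is a miniature model that tracks only task counts, under the assumption that nothing else matters for the illustration; `struct mini_array`, `struct mini_rq`, and the three functions are hypothetical names, not kernel code.

```c
#include <assert.h>

/* Hypothetical miniature of a priority array: only the task count is kept. */
struct mini_array {
    unsigned int nr_active;
};

/* Hypothetical miniature of the run queue's two-array scheme. */
struct mini_rq {
    struct mini_array *active, *expired;
    struct mini_array arrays[2];
};

void mini_rq_init(struct mini_rq *rq)
{
    rq->arrays[0].nr_active = 0;
    rq->arrays[1].nr_active = 0;
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* A task whose time slice ran out moves from the active to the expired array. */
void mini_rq_expire_one(struct mini_rq *rq)
{
    assert(rq->active->nr_active > 0);
    rq->active->nr_active--;
    rq->expired->nr_active++;
}

/* When the active array is empty, swap the two pointers. Returns 1 if swapped. */
int mini_rq_switch_if_empty(struct mini_rq *rq)
{
    struct mini_array *tmp;

    if (rq->active->nr_active != 0)
        return 0;
    tmp = rq->active;
    rq->active = rq->expired;
    rq->expired = tmp;
    return 1;
}
```

The swap costs the same no matter how many tasks are queued, because no task is moved at that point: each task was already placed in the expired array, with a fresh time slice, at the moment its slice ran out.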

③ Scheduler main function (schedule())

The schedule function is located in /kernel/sched.c and is a very important function of the Linux kernel. It selects the next process that should be executed and performs the switch to that process; it is the main executor of process scheduling.

2. Scheduling algorithm (O(1) algorithm)

① Introduction to the O(1) algorithm

What is the O(1) algorithm? It always selects and runs the highest-priority process in constant time, regardless of how many runnable processes there are in the system; hence the name O(1).

② The principle of the O(1) algorithm

Earlier we mentioned the two arrays sorted by priority, the active array and the expired array. These two arrays are the key to implementing the O(1) algorithm.
Each time, the O(1) scheduling algorithm selects the process with the highest priority in the active array to run.
So how does the algorithm find the process with the highest priority? Recall the prio_array field DECLARE_BITMAP(bitmap, MAX_PRIO+1); this is where it comes into play (see the code comments for details). Finding the first bit of bitmap that is set to 1 gives the highest priority among the currently runnable tasks (idx, computed by the sched_find_first_bit() function). The scheduler then looks up the process list (queue) that corresponds to idx. Every process in that queue is currently runnable and has the highest priority, and these processes are executed in turn.
This logic is in the schedule function; the main code is as follows:

struct task_struct *prev, *next;
struct list_head *queue;
struct prio_array *array;
int idx;

prev = current;
array = rq->active;
idx = sched_find_first_bit(array->bitmap); // index of the first nonzero bit in the bitmap
queue = array->queue + idx; // head of the queue list for that priority
next = list_entry(queue->next, struct task_struct, run_list); // get the process descriptor
if (prev != next) // if the selected process is not the current one, switch context
    context_switch(); // arguments omitted in this excerpt
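
To make the bitmap lookup concrete, here is a user-space sketch of finding the first set bit. It is not the kernel's sched_find_first_bit(), which is architecture-specific; `struct prio_bitmap`, `prio_set`, `prio_clear`, and `prio_find_first` are hypothetical names, and MAX_PRIO is taken to be 140, the value used by 2.6 kernels, which should be treated as an assumption here.

```c
#include <assert.h>
#include <limits.h>

#define MAX_PRIO      140 /* assumed value; lower index means higher priority */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((MAX_PRIO + 1 + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the bitmap field of prio_array. */
struct prio_bitmap {
    unsigned long words[BITMAP_WORDS];
};

/* Mark a priority level as having at least one runnable task. */
void prio_set(struct prio_bitmap *b, unsigned int prio)
{
    assert(prio <= MAX_PRIO);
    b->words[prio / BITS_PER_WORD] |= 1UL << (prio % BITS_PER_WORD);
}

/* Mark a priority level as empty. */
void prio_clear(struct prio_bitmap *b, unsigned int prio)
{
    assert(prio <= MAX_PRIO);
    b->words[prio / BITS_PER_WORD] &= ~(1UL << (prio % BITS_PER_WORD));
}

/* Return the index of the lowest set bit, or -1 if no bit is set. */
int prio_find_first(const struct prio_bitmap *b)
{
    unsigned int w, bit;

    for (w = 0; w < BITMAP_WORDS; w++) {
        unsigned long word = b->words[w];

        if (word == 0)
            continue;
        for (bit = 0; !(word & 1UL); bit++)
            word >>= 1;
        return (int)(w * BITS_PER_WORD + bit);
    }
    return -1;
}
```

The loops are bounded by the size of the bitmap, which is fixed by MAX_PRIO and does not depend on how many tasks are runnable. That fixed bound is what makes the lookup constant-time.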

 
