first job
1. Summary
This article is based mainly on Linux kernel version 2.6.28 and describes the concept of a process and how processes are scheduled.
Linux Kernel source code reference address: https://elixir.bootlin.com/linux/v4.6/source/include/linux/types.h
2. What is a process
2.1 The concept of process
An official definition of a process:
A process is one execution of a program with some independent function over a particular data set, and it is also the independent unit by which the operating system allocates and schedules resources.
In short, a process is a management instance established by the operating system for a running program.
A process consists of five parts:
- P: the data structure the OS uses to manage the running program
- C: the in-memory code of the running program
- D: the in-memory data of the running program
- R: the general register contents of the running program
- PSW: the program status word, which the OS uses to control program execution
2.2 Visible processes
2.2.1 Processes on Windows:
2.2.2 Processes on Ubuntu
3. How the process is organized
In the Linux kernel, a single structure is used to describe a process and tie its information together: task_struct. It is defined in include/linux/sched.h, and its definition runs to as many as 400 lines.
3.1 Process ID
The process ID types are defined in include/linux/pid.h:
enum pid_type
{
    PIDTYPE_PID,
    PIDTYPE_PGID,
    PIDTYPE_SID,
    PIDTYPE_MAX
};
Here we explain the most important of these, the PID, in detail.
3.1.1 Process Identifier (PID)
In Linux, every process is assigned a unique process ID, or PID. It is the unique identifier of a process in the system, but a PID is not permanently owned by a process: running the same program at different times generally yields different PIDs. When a process is created with the fork or clone system call, the kernel allocates a new unique PID value for it.
pid_t pid;
As the code above shows, the pid field in task_struct has type pid_t, which is actually an int, so a PID is essentially just a number.
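To see from user space that a PID really is just an integer handed out by the kernel, here is a minimal sketch using the POSIX calls fork, waitpid and getpid. The helper name spawn_and_reap is hypothetical and only used for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately and return the PID the kernel
 * assigned to it, or -1 on failure.
 * spawn_and_reap() is a hypothetical helper name for this sketch.
 */
pid_t spawn_and_reap(void)
{
    pid_t child = fork();

    if (child < 0)
        return -1;      /* fork failed */
    if (child == 0)
        _exit(0);       /* child: terminate at once */

    /* parent: wait for the child so it does not stay a zombie */
    if (waitpid(child, NULL, 0) < 0)
        return -1;
    return child;       /* a plain integer, different from getpid() */
}
```

Calling spawn_and_reap() several times in a row will usually return different values, since each fork receives a newly allocated PID.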
3.1.2 Scope of PID
In include/linux/threads.h, the system limits the maximum PID value:
#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)
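It can be seen that the default limit is 0x8000, that is 32768 (or 0x1000, that is 4096, when CONFIG_BASE_SMALL is set). Since allocated PID values stay below this limit, a Linux system can by default hold at most about 32768 processes at the same time.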
3.1.3 Generation of PID
So where does the PID come from? The answer to this question is given in kernel/pid.c:
static int alloc_pidmap(struct pid_namespace *pid_ns)
{
    int i, offset, max_scan, pid, last = pid_ns->last_pid;
    struct pidmap *map;
    pid = last + 1;
    if (pid >= pid_max)
        pid = RESERVED_PIDS;
    offset = pid & BITS_PER_PAGE_MASK;
    map = &pid_ns->pidmap[pid/BITS_PER_PAGE];
    max_scan = (pid_max + BITS_PER_PAGE - 1)/BITS_PER_PAGE - !offset;
    for (i = 0; i <= max_scan; ++i) {
        if (unlikely(!map->page)) {
            void *page = kzalloc(PAGE_SIZE, GFP_KERNEL);
            /*
             * Free the page if someone raced with us
             * installing it:
             */
            spin_lock_irq(&pidmap_lock);
            if (map->page)
                kfree(page);
            else
                map->page = page;
            spin_unlock_irq(&pidmap_lock);
            if (unlikely(!map->page))
                break;
        }
        if (likely(atomic_read(&map->nr_free))) {
            do {
                if (!test_and_set_bit(offset, map->page)) {
                    atomic_dec(&map->nr_free);
                    pid_ns->last_pid = pid;
                    return pid;
                }
                offset = find_next_offset(map, offset);
                pid = mk_pid(pid_ns, map, offset);
            /*
             * find_next_offset() found a bit, the pid from it
             * is in-bounds, and if we fell back to the last
             * bitmap block and the final block was the same
             * as the starting point, pid is before last_pid.
             */
            } while (offset < BITS_PER_PAGE && pid < pid_max &&
                    (i != max_scan || pid < last ||
                        !((last+1) & BITS_PER_PAGE_MASK)));
        }
        if (map < &pid_ns->pidmap[(pid_max-1)/BITS_PER_PAGE]) {
            ++map;
            offset = 0;
        } else {
            map = &pid_ns->pidmap[0];
            offset = RESERVED_PIDS;
            if (unlikely(last == offset))
                break;
        }
        pid = mk_pid(pid_ns, map, offset);
    }
    return -1;
}
The alloc_pidmap function allocates PIDs. The matching function for recycling PIDs, free_pidmap, is defined in the same file:
static void free_pidmap(struct upid *upid)
{
    int nr = upid->nr;
    struct pidmap *map = upid->ns->pidmap + nr / BITS_PER_PAGE;
    int offset = nr & BITS_PER_PAGE_MASK;
    clear_bit(offset, map->page);
    atomic_inc(&map->nr_free);
}
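The core idea of the two kernel functions, a bitmap with one bit per PID that is scanned starting just after the last allocated PID and wraps around to RESERVED_PIDS, can be sketched in user space as follows. This is a simplified illustration, not the kernel's implementation: it uses a single flat bitmap instead of per-page maps, has no locking or atomic operations, and the names pid_bitmap, pid_bitmap_init, pid_bitmap_alloc and pid_bitmap_free are hypothetical. The value 300 for RESERVED_PIDS is the one used in the kernel source.

```c
#include <limits.h>
#include <string.h>

#define PID_MAX        0x8000   /* mirrors PID_MAX_DEFAULT without CONFIG_BASE_SMALL */
#define RESERVED_PIDS  300      /* after a wrap, PIDs below this are skipped */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical flat bitmap: one bit per PID, set means "in use". */
struct pid_bitmap {
    unsigned long bits[PID_MAX / (sizeof(unsigned long) * CHAR_BIT)];
    int last_pid;               /* most recently allocated PID */
};

void pid_bitmap_init(struct pid_bitmap *m)
{
    memset(m, 0, sizeof(*m));
}

/* Set the bit for pid and report whether it was already set. */
static int test_and_set(struct pid_bitmap *m, int pid)
{
    unsigned long mask = 1UL << (pid % BITS_PER_WORD);
    unsigned long *word = &m->bits[pid / BITS_PER_WORD];
    int was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/* Return the next free PID after last_pid, or -1 if none is free. */
int pid_bitmap_alloc(struct pid_bitmap *m)
{
    int pid = m->last_pid + 1;
    int scanned;

    for (scanned = 0; scanned < PID_MAX; scanned++, pid++) {
        if (pid >= PID_MAX)
            pid = RESERVED_PIDS;    /* wrap around, skipping low PIDs */
        if (!test_and_set(m, pid)) {
            m->last_pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Mark pid as free again, as free_pidmap does with clear_bit. */
void pid_bitmap_free(struct pid_bitmap *m, int pid)
{
    if (pid <= 0 || pid >= PID_MAX)
        return;
    m->bits[pid / BITS_PER_WORD] &= ~(1UL << (pid % BITS_PER_WORD));
}
```

Because the scan starts after last_pid, a freed PID is not handed out again immediately. It only becomes a candidate once the counter has wrapped around.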
3.2 The state of the process
3.2.1 Process state definition
In Linux, there are 6 main process states:
code | name | description |
---|---|---|
R | TASK_RUNNING | runnable state |
S | TASK_INTERRUPTIBLE | interruptible sleep state |
D | TASK_UNINTERRUPTIBLE | uninterruptible sleep state |
T | TASK_STOPPED or TASK_TRACED | stopped state or traced state |
Z | TASK_DEAD - EXIT_ZOMBIE | exit state; the process becomes a zombie process |
X | TASK_DEAD - EXIT_DEAD | exit state; the process is about to be destroyed |
They are defined in include/linux/sched.h:
#define TASK_RUNNING 0
#define TASK_INTERRUPTIBLE 1
#define TASK_UNINTERRUPTIBLE 2
#define TASK_STOPPED 4
#define EXIT_ZOMBIE 16
#define EXIT_DEAD 32
- In some operating system textbooks, the RUNNING state means a process that is currently executing on the CPU, while a process that is runnable but has not yet been scheduled is in the READY state. Linux combines both into the single TASK_RUNNING state.
- During normal operation, most processes in the system are in the TASK_INTERRUPTIBLE state. This is natural: a sleeping process can be woken quickly, yet it uses no CPU time while it waits.
- Why is the sleep state divided into interruptible and uninterruptible? Probably so that a process is not interrupted while it is interacting with a device, which could leave the machine in an uncontrollable state.
- A process is in the TASK_DEAD state while it is exiting. At this point most of the resources it occupied have been reclaimed, except for a few special ones such as task_struct. Because only this empty shell is left behind, the state is called ZOMBIE.
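The mapping between the state values and the one-letter codes in the table can be sketched as a small lookup function. This is an illustration only: the function name task_state_letter is hypothetical, and it covers just the values listed in this article (TASK_TRACED is not among them, so it is not mapped here).

```c
/* State values as listed in this article's excerpt of include/linux/sched.h */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_STOPPED          4
#define EXIT_ZOMBIE           16
#define EXIT_DEAD             32

/*
 * Translate a single state value into the letter used in the table.
 * Returns '?' for any value not listed in the article.
 * task_state_letter() is a hypothetical helper for this sketch.
 */
char task_state_letter(unsigned int state)
{
    switch (state) {
    case TASK_RUNNING:         return 'R';
    case TASK_INTERRUPTIBLE:   return 'S';
    case TASK_UNINTERRUPTIBLE: return 'D';
    case TASK_STOPPED:         return 'T';
    case EXIT_ZOMBIE:          return 'Z';
    case EXIT_DEAD:            return 'X';
    default:                   return '?';
    }
}
```

Note that every non-zero value is a distinct power of two, so several states could in principle be combined or tested with a bit mask.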
3.2.2 Process state transition
The following diagram provides a brief overview of the transitions of process states in the system:
Although there are 6 different process states in the system, every state transition is essentially a transition between TASK_RUNNING and a non-TASK_RUNNING state.
For example, when a process in the TASK_INTERRUPTIBLE state receives a termination signal, it does not change directly to the TASK_DEAD state. It is first woken up and enters the TASK_RUNNING state, and only then moves from TASK_RUNNING to TASK_DEAD. A process in the TASK_RUNNING state has only two ways to leave it: enter the TASK_STOPPED or TASK_DEAD state in response to a signal, or enter the TASK_INTERRUPTIBLE state by executing a system call.
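The rule that every transition has TASK_RUNNING on one side can be written down as a tiny check. This is a simplified model of the description above, not kernel code: the enum, its ST_ names and the function transition_allowed are hypothetical, and the zombie and dead exit states are folded into one ST_DEAD value.

```c
#include <stdbool.h>

/* Hypothetical, simplified set of states for this sketch. */
enum task_state {
    ST_RUNNING,
    ST_INTERRUPTIBLE,
    ST_UNINTERRUPTIBLE,
    ST_STOPPED,
    ST_DEAD
};

/*
 * Return true if a process may move directly from one state to another
 * under the simplified rule described in the text.
 */
bool transition_allowed(enum task_state from, enum task_state to)
{
    if (from == to)
        return false;       /* not a transition */
    if (from == ST_DEAD)
        return false;       /* an exited process never comes back */

    /* every legal change has the running state on exactly one side */
    return from == ST_RUNNING || to == ST_RUNNING;
}
```

With this model, transition_allowed(ST_INTERRUPTIBLE, ST_DEAD) is false, while the two-step path through ST_RUNNING is accepted.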
4. How processes are scheduled
4.1 CFS Scheduler
As the kernel evolved, the O(1) scheduler was replaced by CFS (Completely Fair Scheduler), starting with Linux kernel 2.6.23.
CFS uses vruntime to measure the priority of a process. It is calculated as follows:
vruntime = running time allocated to the process * NICE_0_LOAD / process weight
Here, NICE_0_LOAD is the weight of a process whose nice value is 0, which is 1024. Each nice value corresponds to exactly one process weight, and the conversion is done through the global array prio_to_weight:
static const int prio_to_weight[40] = {
/* -20 */ 88761, 71755, 56483, 46273, 36291,
/* -15 */ 29154, 23254, 18705, 14949, 11916,
/* -10 */ 9548, 7620, 6100, 4904, 3906,
/* -5 */ 3121, 2501, 1991, 1586, 1277,
/* 0 */ 1024, 820, 655, 526, 423,
/* 5 */ 335, 272, 215, 172, 137,
/* 10 */ 110, 87, 70, 56, 45,
/* 15 */ 36, 29, 23, 18, 15,
};
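As a minimal sketch of how the weight table feeds into the vruntime formula, the following code looks up the weight for a nice value and scales a running time by NICE_0_LOAD / weight. The function names nice_to_weight and vruntime_delta are hypothetical, and the plain integer division is a simplification chosen for readability. The kernel's own arithmetic is more elaborate.

```c
#include <stdint.h>

#define NICE_0_LOAD 1024

/* Weight table copied from the article; index 0 is nice -20, index 39 is nice 19. */
static const int prio_to_weight[40] = {
 /* -20 */ 88761, 71755, 56483, 46273, 36291,
 /* -15 */ 29154, 23254, 18705, 14949, 11916,
 /* -10 */  9548,  7620,  6100,  4904,  3906,
 /*  -5 */  3121,  2501,  1991,  1586,  1277,
 /*   0 */  1024,   820,   655,   526,   423,
 /*   5 */   335,   272,   215,   172,   137,
 /*  10 */   110,    87,    70,    56,    45,
 /*  15 */    36,    29,    23,    18,    15,
};

/* Return the weight for a nice value in [-20, 19], or -1 if out of range. */
int nice_to_weight(int nice)
{
    if (nice < -20 || nice > 19)
        return -1;
    return prio_to_weight[nice + 20];
}

/*
 * vruntime = runtime * NICE_0_LOAD / weight, in the same unit as runtime.
 * Returns 0 for an invalid nice value.
 */
uint64_t vruntime_delta(uint64_t runtime_ns, int nice)
{
    int weight = nice_to_weight(nice);

    if (weight <= 0)
        return 0;
    return runtime_ns * NICE_0_LOAD / (uint64_t)weight;
}
```

For one millisecond of running time (1000000 ns), a nice 0 process advances its vruntime by 1000000, a nice -20 process by only 11536, and a nice 19 process by 68266666. The heavier the weight, the more slowly vruntime grows.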
But how do we know the running time of a process?
It is calculated as: actual running time of the process = scheduling period * process weight / sum of the weights of all processes
The scheduling period is the time needed for every process in the TASK_RUNNING state to be scheduled once.
In the ideal case, the actual running time of a process equals the running time the system allocated to it, so the two equations can be combined to obtain
vruntime = (scheduling period * process weight / sum of the weights of all processes) * 1024 / process weight = scheduling period * 1024 / sum of the weights of all processes
From the above formula, we can see that the process weight cancels out: even if different processes have different weights, their vruntime values should be the same. So if the vruntime value of a process is small, it has not received the running time it deserves, and the operating system should run it preferentially.
The above is the main idea of CFS.
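The cancellation of the process weight can be checked numerically. The sketch below computes each process's time slice and the resulting ideal vruntime, then compares it against scheduling period * 1024 / total weight. It uses floating point to avoid rounding noise. The function names and the 20 ms period in the usage note are assumptions made for illustration, not values taken from the kernel.

```c
#include <stddef.h>

#define NICE_0_LOAD 1024.0

/* Running time a process receives in one scheduling period. */
double time_slice(double period, double weight, double total_weight)
{
    return period * weight / total_weight;
}

/* vruntime gained by a process that ran for exactly its time slice. */
double ideal_vruntime(double slice, double weight)
{
    return slice * NICE_0_LOAD / weight;
}

/*
 * Return 1 if every process ends the period with the same vruntime,
 * namely period * NICE_0_LOAD / total_weight, and 0 otherwise.
 */
int vruntime_is_equal(const double *weights, size_t n, double period)
{
    double total = 0.0, expected;
    size_t i;

    for (i = 0; i < n; i++)
        total += weights[i];
    if (total <= 0.0)
        return 0;

    expected = period * NICE_0_LOAD / total;
    for (i = 0; i < n; i++) {
        double slice = time_slice(period, weights[i], total);
        double diff = ideal_vruntime(slice, weights[i]) - expected;

        if (diff < 0.0)
            diff = -diff;
        if (diff > 1e-6 * expected)
            return 0;   /* this process deviates from the common value */
    }
    return 1;
}
```

For example, with weights 3121, 1024 and 335 (nice -5, 0 and 5) and an assumed period of 20 ms, the nice 0 process gets a slice of roughly 4.57 ms, yet all three processes end the period with the same vruntime.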
vruntime is stored in the sched_entity structure, a scheduling entity defined in include/linux/sched.h:
struct sched_entity {
    struct load_weight load; /* for load-balancing */
    struct rb_node run_node;
    struct list_head group_node;
    unsigned int on_rq;
    u64 exec_start;
    u64 sum_exec_runtime;
    u64 vruntime;
    u64 prev_sum_exec_runtime;
    u64 last_wakeup;
    u64 avg_overlap;
#ifdef CONFIG_SCHEDSTATS
    u64 wait_start;
    u64 wait_max;
    u64 wait_count;
    u64 wait_sum;
    u64 sleep_start;
    u64 sleep_max;
    s64 sum_sleep_runtime;
    u64 block_start;
    u64 block_max;
    u64 exec_max;
    u64 slice_max;
    u64 nr_migrations;
    u64 nr_migrations_cold;
    u64 nr_failed_migrations_affine;
    u64 nr_failed_migrations_running;
    u64 nr_failed_migrations_hot;
    u64 nr_forced_migrations;
    u64 nr_forced2_migrations;
    u64 nr_wakeups;
    u64 nr_wakeups_sync;
    u64 nr_wakeups_migrate;
    u64 nr_wakeups_local;
    u64 nr_wakeups_remote;
    u64 nr_wakeups_affine;
    u64 nr_wakeups_affine_attempts;
    u64 nr_wakeups_passive;
    u64 nr_wakeups_idle;
#endif
#ifdef CONFIG_FAIR_GROUP_SCHED
    struct sched_entity *parent;
    /* rq on which this entity is (to be) queued: */
    struct cfs_rq *cfs_rq;
    /* rq "owned" by this entity/group: */
    struct cfs_rq *my_q;
#endif
};
4.2 Red-Black Trees
The different sched_entity instances are organized together by a time-ordered red-black tree:
The processes with the smallest vruntime values are stored on the left side of the tree, so that the process with the smallest vruntime value can be found quickly.
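The "leftmost node has the smallest vruntime" property can be illustrated with a much simpler structure. The sketch below uses a plain, unbalanced binary search tree keyed by vruntime instead of a real red-black tree, so it shows only the ordering idea and not the rebalancing. The struct entity and the functions enqueue and pick_next are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduling entity, keyed by vruntime. */
struct entity {
    uint64_t vruntime;
    struct entity *left, *right;
};

/* Insert se into the tree: smaller vruntime goes left, otherwise right. */
void enqueue(struct entity **root, struct entity *se)
{
    se->left = se->right = NULL;
    while (*root) {
        if (se->vruntime < (*root)->vruntime)
            root = &(*root)->left;
        else
            root = &(*root)->right;
    }
    *root = se;
}

/* Return the leftmost entity, i.e. the one with the smallest vruntime. */
struct entity *pick_next(struct entity *root)
{
    if (!root)
        return NULL;
    while (root->left)
        root = root->left;
    return root;
}
```

A red-black tree adds rebalancing on top of this ordering so that insertion and lookup stay logarithmic even when entities arrive in sorted order, which a plain binary search tree does not guarantee.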
5. Views on the operating system process model
For a long time, operating systems have tried to define fairness. Must interactive processes always have the final say? CFS gave its own answer. It no longer attempts to single out interactive processes, but treats all processes equally, just as its name, Completely Fair, suggests. Its arrival made the famous O(1) scheduler only a flash in the pan, and CFS went on to remain the default Linux scheduler across many kernel versions, holding its place through the strength of its design.