1. Introduction
This article analyzes the Linux process model, based mainly on the Linux 2.6 source code. The source can be browsed at: https://elixir.bootlin.com/linux/v2.6.39/source
2. Process
Definition: A process is one execution of a program on a set of data. It is the basic unit of resource allocation and scheduling in the system, and the foundation of the operating system's structure.
3. Organization of Linux system processes
The process is composed of three parts: process control block (PCB), program segment and data segment.
3.1 Process Control Block
The Process Control Block (PCB) is a data structure in the operating system kernel that records the state of a process. It is the structure the system uses to manage the process. In Linux, this structure is called task_struct and is defined in include/linux/sched.h.
The PCB contains the following information:
- Process identifier: each process must have a unique identifier
pid_t pid;   // The unique identifier of the process
pid_t tgid;  // The value of the pid member of the lead thread of the thread group
- Process state
volatile long state;
- Process priority
int prio, static_prio, normal_prio;
unsigned int rt_priority;
prio is the dynamic priority of the process, static_prio is its static priority, normal_prio is the priority calculated from the static priority and the scheduling policy, and rt_priority is the priority of a real-time process.
- CPU context save area
- The corresponding program and data addresses of the process
- Process resource list
- Signal handling information
- File system information
- Additional information about the process
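The relationship between the priority fields listed above can be sketched in user space. This is a minimal sketch, not kernel code: struct toy_prio_task, nice_to_prio() and toy_normal_prio() are hypothetical names, and the constants are modeled on the MAX_RT_PRIO family of macros in the 2.6 sources. It assumes that a lower number means a higher priority, that normal tasks occupy 100..139, and that real-time tasks occupy 0..99.

```c
#include <assert.h>

/* Constants modeled on the 2.6 kernel headers (assumption: unchanged values). */
#define MAX_USER_RT_PRIO 100
#define MAX_RT_PRIO      MAX_USER_RT_PRIO
#define MAX_PRIO         (MAX_RT_PRIO + 40)

/* Hypothetical user-space stand-in for the priority fields of task_struct. */
struct toy_prio_task {
    int is_realtime;           /* nonzero for a real-time scheduling policy */
    int static_prio;           /* derived from the nice value */
    unsigned int rt_priority;  /* 1..99 for real-time tasks, 0 otherwise */
};

/* Map a nice value (-20..19) to a static priority (100..139). */
int nice_to_prio(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return MAX_RT_PRIO + nice + 20;
}

/* Priority derived from the static priority and the scheduling policy. */
int toy_normal_prio(const struct toy_prio_task *t)
{
    if (t->is_realtime)
        return MAX_RT_PRIO - 1 - (int)t->rt_priority;
    return t->static_prio;
}
```

For example, a task with nice 0 gets static priority 120, while a real-time task with rt_priority 50 gets normal priority 49 and therefore runs ahead of every normal task.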
3.2 Program segment
Program segment: program code executed by the CPU
3.3 Data segment
Data segment: The original data in the program corresponding to the process or the result generated after the program is executed.
4. The state and transition of the process in the Linux system
4.1 State of the process
- A process has three basic states:
Running state (the process actually occupies the CPU at this moment)
Ready state (runnable, but temporarily stopped because other processes are running)
Blocking state (the process cannot run unless some external event occurs)
- In Linux, the process states are the following, defined in include/linux/sched.h.
Source code address: https://elixir.bootlin.com/linux/v2.6.37/source/include/linux/sched.h#L182
#define TASK_RUNNING            0   // executable state
#define TASK_INTERRUPTIBLE      1   // interruptible sleep state
#define TASK_UNINTERRUPTIBLE    2   // uninterruptible sleep state
#define __TASK_STOPPED          4   // suspended state
#define __TASK_TRACED           8   // trace state
/* in tsk->exit_state */            // Termination state
#define EXIT_ZOMBIE             16
#define EXIT_DEAD               32
TASK_RUNNING covers both the running and the ready state: the process is either executing on a CPU or sitting on a run queue waiting only for the CPU.
Both TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE are blocked states, in which the process is waiting for a system resource other than the CPU; the former can be woken up by a signal, the latter cannot.
EXIT_ZOMBIE marks a zombie process: the process has finished running, but its process control block has not yet been released because the parent has not collected its exit status.
__TASK_STOPPED is the stopped (suspended) state. A process enters it after receiving a stop signal such as SIGSTOP and resumes running after receiving SIGCONT. __TASK_TRACED is the related state used while a process is stopped by a debugger.
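Because the state values are powers of two, they can be tested with bit masks. The following is a minimal user-space sketch of that idea. The TOY_ enumerators mirror the values listed above, and toy_state_name() and toy_is_blocked() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>

/* Values mirror the kernel definitions quoted above; names are hypothetical. */
enum toy_task_state {
    TOY_TASK_RUNNING         = 0,
    TOY_TASK_INTERRUPTIBLE   = 1,
    TOY_TASK_UNINTERRUPTIBLE = 2,
    TOY_TASK_STOPPED         = 4,
    TOY_TASK_TRACED          = 8,
    TOY_EXIT_ZOMBIE          = 16,
    TOY_EXIT_DEAD            = 32
};

/* Return a readable description of a single state value. */
const char *toy_state_name(long state)
{
    switch (state) {
    case TOY_TASK_RUNNING:         return "running or ready";
    case TOY_TASK_INTERRUPTIBLE:   return "interruptible sleep";
    case TOY_TASK_UNINTERRUPTIBLE: return "uninterruptible sleep";
    case TOY_TASK_STOPPED:         return "stopped";
    case TOY_TASK_TRACED:          return "traced";
    case TOY_EXIT_ZOMBIE:          return "zombie";
    case TOY_EXIT_DEAD:            return "dead";
    default:                       return "unknown";
    }
}

/* Nonzero if the state is one of the two sleeping (blocked) states. */
int toy_is_blocked(long state)
{
    return (state & (TOY_TASK_INTERRUPTIBLE | TOY_TASK_UNINTERRUPTIBLE)) != 0;
}
```

Note that TASK_RUNNING is 0, so "runnable" is tested as state == 0 rather than with a mask, which is also how schedule() checks prev->state.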
4.2 Process Creation
Linux provides three main ways to create a process: fork(), vfork(), and clone().
fork() creates a new process from an existing one. The new process is the child and the original process is the parent. The child is a copy of the parent: it inherits the parent's address space, including the process context, code segment, process stack, memory information, open file descriptors, signal control settings, process priority, process group ID, current working directory, root directory, resource limits, and controlling terminal. What is unique to the child is its process ID, its resource usage accounting, and its timers. Because the child is an almost complete copy of the parent, both processes continue running the same program. vfork(), by contrast, does not copy the parent's address space: the child gets its own task_struct and kernel stack but shares the parent's memory, and the parent is suspended until the child calls exec() or exits. clone() lets the caller choose, through flags, which resources are shared between parent and child.
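The "child is a copy of the parent" behavior can be observed directly from user space. The following is a minimal sketch using the POSIX fork() and waitpid() calls. The function name fork_copy_demo() and the value 42 are illustrative assumptions.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that changes its own copy of `value` and reports it through
 * its exit status. The parent stores what it still sees in *parent_view.
 * Returns the child's exit status, or -1 on failure.
 */
int fork_copy_demo(int *parent_view)
{
    int value = 1;
    int status = 0;
    pid_t pid;

    assert(parent_view);

    pid = fork();
    if (pid < 0)
        return -1;               /* fork failed, no child was created */

    if (pid == 0) {
        /* Child: this write affects only the child's copy of the variable. */
        value = 42;
        _exit(value);
    }

    /* Parent: wait for the child, then look at our own copy. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    *parent_view = value;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child reports 42 while the parent still sees 1, which shows that after fork() the two processes no longer share ordinary variables.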
4.3 State transitions
The transitions between process states are as follows:
1. Running to blocked: the process blocks waiting for input
2. Running to ready: the scheduler chooses another process
3. Ready to running: the scheduler selects this process
4. Blocked to ready: the awaited input becomes available
5. Running to terminated: the process ends
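The five transitions above can be written down as a small state machine. This is a minimal sketch of the textbook three-state model plus termination, not of the kernel's own state handling. All names are hypothetical, and it assumes that transitions 1, 2 and 5 start from the running state, 3 from the ready state, and 4 from the blocked state.

```c
#include <assert.h>

enum toy_state { TOY_RUNNING, TOY_READY, TOY_BLOCKED, TOY_TERMINATED };

/* Events are numbered to match the five transitions listed above. */
enum toy_event {
    EV_WAIT_INPUT      = 1,  /* process blocks waiting for input      */
    EV_PREEMPTED       = 2,  /* scheduler chooses another process     */
    EV_SELECTED        = 3,  /* scheduler selects this process        */
    EV_INPUT_AVAILABLE = 4,  /* the awaited input becomes available   */
    EV_EXIT            = 5   /* process ends                          */
};

/* Return the next state, or the unchanged state if the event is not valid. */
enum toy_state toy_transition(enum toy_state s, enum toy_event e)
{
    switch (e) {
    case EV_WAIT_INPUT:      return s == TOY_RUNNING ? TOY_BLOCKED    : s;
    case EV_PREEMPTED:       return s == TOY_RUNNING ? TOY_READY      : s;
    case EV_SELECTED:        return s == TOY_READY   ? TOY_RUNNING    : s;
    case EV_INPUT_AVAILABLE: return s == TOY_BLOCKED ? TOY_READY      : s;
    case EV_EXIT:            return s == TOY_RUNNING ? TOY_TERMINATED : s;
    }
    return s;
}
```

One consequence visible in the table is that a blocked process never goes straight back to running: it becomes ready first and must then be selected by the scheduler.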
4.4 Termination of the process
- Normal exit
- Error exit
- Fatal error
- Killed by another process
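From the parent's point of view, these cases can be told apart with the status returned by waitpid(). The following is a minimal sketch using POSIX calls. The toy_ names and MODE_ constants are hypothetical, the fatal error is simulated with abort(), and the kill is performed by the parent itself. Note that a fatal error and a kill both show up as termination by a signal.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum toy_mode { MODE_EXIT_OK, MODE_EXIT_ERROR, MODE_FATAL, MODE_KILLED };
enum toy_outcome { TOY_NORMAL_EXIT, TOY_ERROR_EXIT, TOY_KILLED_BY_SIGNAL, TOY_FAILED };

/* Run a child that terminates in the requested way and classify the result. */
enum toy_outcome toy_run_child(enum toy_mode mode)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return TOY_FAILED;

    if (pid == 0) {
        switch (mode) {
        case MODE_EXIT_OK:    _exit(EXIT_SUCCESS);  /* normal exit            */
        case MODE_EXIT_ERROR: _exit(EXIT_FAILURE);  /* exit reporting an error */
        case MODE_FATAL:      abort();              /* terminated by SIGABRT  */
        case MODE_KILLED:     for (;;) pause();     /* wait to be killed      */
        }
        _exit(EXIT_FAILURE);
    }

    if (mode == MODE_KILLED)
        kill(pid, SIGKILL);      /* another process (the parent) kills it */

    if (waitpid(pid, &status, 0) < 0)
        return TOY_FAILED;

    if (WIFSIGNALED(status))
        return TOY_KILLED_BY_SIGNAL;
    if (WIFEXITED(status))
        return WEXITSTATUS(status) == 0 ? TOY_NORMAL_EXIT : TOY_ERROR_EXIT;
    return TOY_FAILED;
}
```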
5. Process scheduling
In a multiprogramming system, several processes usually compete for the CPU at the same time. This happens whenever two or more processes are in the ready state. If only one CPU is available, the system must choose which process runs next. The part of the operating system that makes this choice is called the scheduler, and the algorithm it uses is called the scheduling algorithm.
5.1 When to schedule
A key question about scheduling is when scheduling decisions are made.
- After a new process is created, the scheduler must decide whether to run the parent or the child. Both are in the ready state, so either may be chosen.
- A scheduling decision must be made when a process exits. The scheduler selects one of the ready processes; if none is ready, a system-provided idle process usually runs.
- When a process blocks on I/O, on a semaphore, or for some other reason, another process must be chosen to run.
- When an I/O interrupt occurs, a scheduling decision must be made.
5.2 How to schedule
schedule() is the main scheduling function. It works as follows. First it disables kernel preemption and finds the run queue rq of the current CPU. It then checks the state of prev, the process that is currently running: if prev is no longer runnable and is not being preempted in the kernel, it is removed from rq, unless it has a pending signal, in which case its state is set back to TASK_RUNNING and it stays on rq. Next, the scheduler is notified that prev is about to be replaced, and the highest-priority runnable process is picked as next. If next differs from prev, the information stored in the run queue is updated and the context switch is performed. Finally, the scheduling class is notified that the switch is complete.
Source code address: https://elixir.bootlin.com/linux/v2.6.39/source/kernel/sched.c
asmlinkage void __sched schedule(void)
{
    struct task_struct *prev, *next;  // The current process and the next process
    unsigned long *switch_count;      // Points to the context switch counter to update
    struct rq *rq;                    // Run (ready) queue
    int cpu;

need_resched:
    preempt_disable();                // Disable kernel preemption
    cpu = smp_processor_id();
    rq = cpu_rq(cpu);                 // Find the run queue rq of the current CPU
    rcu_note_context_switch(cpu);
    prev = rq->curr;                  // Save the running process in prev

    schedule_debug(prev);

    if (sched_feat(HRTICK))
        hrtick_clear(rq);

    raw_spin_lock_irq(&rq->lock);

    switch_count = &prev->nivcsw;     // Default: count an involuntary switch
    if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
        // prev is no longer runnable and this is not a kernel preemption
        if (unlikely(signal_pending_state(prev->state, prev))) {
            // A signal is pending: put prev back into the ready state
            prev->state = TASK_RUNNING;
        } else {
            // No signal is pending: remove prev from the run queue
            /*
             * If a worker is going to sleep, notify and
             * ask workqueue whether it wants to wake up a
             * task to maintain concurrency.  If so, wake
             * up the task.
             */
            if (prev->flags & PF_WQ_WORKER) {
                struct task_struct *to_wakeup;

                to_wakeup = wq_worker_sleeping(prev, cpu);
                if (to_wakeup)
                    try_to_wake_up_local(to_wakeup);
            }
            deactivate_task(rq, prev, DEQUEUE_SLEEP);

            /*
             * If we are going to sleep and we have plugged IO queued, make
             * sure to submit it to avoid deadlocks.
             */
            if (blk_needs_flush_plug(prev)) {
                raw_spin_unlock(&rq->lock);
                blk_schedule_flush_plug(prev);
                raw_spin_lock(&rq->lock);
            }
        }
        switch_count = &prev->nvcsw;  // prev gave up the CPU: count a voluntary switch
    }

    pre_schedule(rq, prev);           // Notify the scheduler that a process switch is about to happen

    if (unlikely(!rq->nr_running))
        idle_balance(cpu, rq);

    put_prev_task(rq, prev);          // Notify the scheduler that prev is about to be replaced
    next = pick_next_task(rq);        // Pick the next runnable task
    clear_tsk_need_resched(prev);     // Clear the TIF_NEED_RESCHED flag of prev
    rq->skip_clock_update = 0;

    if (likely(prev != next)) {       // If not the same process
        rq->nr_switches++;
        rq->curr = next;              // Make the selected process the current one
        ++*switch_count;              // Update the number of switches

        context_switch(rq, prev, next); /* unlocks the rq */  // Process context switch
        /*
         * The context switch have flipped the stack from under us
         * and restored the local variables which were saved when
         * this task called schedule() in the past. prev == current
         * is still correct, but it can be moved to another cpu/rq.
         */
        cpu = smp_processor_id();
        rq = cpu_rq(cpu);
    } else
        raw_spin_unlock_irq(&rq->lock);

    post_schedule(rq);                // Notify the scheduling class that the switch is complete

    preempt_enable_no_resched();
    if (need_resched())               // TIF_NEED_RESCHED was set again: run the scheduler once more
        goto need_resched;
}
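The core decision logic of schedule() can be imitated in a few lines of user-space C. This is a toy sketch under strong simplifying assumptions, not the kernel algorithm: there is one run queue held in an array, no locking, no scheduling classes, and the pending-signal check ignores the distinction that signal_pending_state() makes between sleep states. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for task_struct and struct rq. */
struct toy_task {
    int  pid;
    long state;           /* 0 = runnable, nonzero = wants to sleep */
    int  prio;            /* lower value = higher priority */
    int  on_rq;           /* nonzero while the task is on the run queue */
    int  signal_pending;  /* nonzero if a signal is waiting for the task */
};

struct toy_rq {
    struct toy_task *tasks;     /* all tasks known to this queue */
    size_t           ntasks;
    struct toy_task *curr;      /* task currently on the CPU */
    struct toy_task *idle;      /* runs when nothing else is runnable */
    unsigned long    nr_switches;
};

/* Pick the queued task with the best (lowest) prio, or the idle task. */
static struct toy_task *toy_pick_next(struct toy_rq *rq)
{
    struct toy_task *best = NULL;
    size_t i;

    for (i = 0; i < rq->ntasks; i++) {
        struct toy_task *t = &rq->tasks[i];
        if (t->on_rq && (!best || t->prio < best->prio))
            best = t;
    }
    return best ? best : rq->idle;
}

/* Toy counterpart of schedule(): returns the task that runs next. */
struct toy_task *toy_schedule(struct toy_rq *rq)
{
    struct toy_task *prev = rq->curr;
    struct toy_task *next;

    assert(prev != NULL);

    if (prev->state != 0) {
        if (prev->signal_pending)
            prev->state = 0;     /* pending signal: stay runnable */
        else
            prev->on_rq = 0;     /* going to sleep: leave the queue */
    }

    next = toy_pick_next(rq);
    if (next != prev) {
        rq->nr_switches++;
        rq->curr = next;         /* a real kernel would context switch here */
    }
    return next;
}
```

In this sketch, a task that asks to sleep while a signal is pending keeps the CPU if it is still the best candidate, which mirrors the branch in schedule() that resets prev->state to TASK_RUNNING.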
6. Views on the Operating System Model