First assignment: Analysis of Linux processes

0. Summary analysis of Linux system processes

  This article gives a brief introduction to Linux, processes, process states, and process scheduling, together with my own understanding of each.

1. Linux

  1.1 What is Linux

    Linux is a family of Unix-like operating systems that are free to use and redistribute. It is a multi-user, multi-tasking operating system that supports multithreading and multiple CPUs, and it follows the POSIX standard. It can run the major UNIX tools, applications, and network protocols, and it supports both 32-bit and 64-bit hardware. Linux inherits the network-centric design philosophy of Unix and is a stable multi-user network operating system.

2. Process

  2.1 What is a process

    A process is the basic unit that can run independently in the system and to which resources are allocated. It consists of a set of machine instructions, data, and a stack, and it is an active entity.

    • A process is one execution of a program.
    • A process can execute in parallel with other computations.
    • A process is the activity of a program running on a data set, and it is the independent unit to which the system allocates resources and which it schedules.

  2.2 Characteristics of the process

  1. Dynamic: A process is, in essence, one execution of a program. It is created dynamically and terminates dynamically.
  2. Concurrency: Any process can execute concurrently with other processes.
  3. Independence: A process is a basic unit that can run independently, and it is also the independent unit to which the system allocates resources and which it schedules.
  4. Asynchrony: Because processes constrain one another, each one executes intermittently; that is, each advances at its own independent and unpredictable speed.

  2.3 Organization of the process

    task_struct is the Linux kernel data structure that describes a process. It is defined in include/linux/sched.h, is held in RAM, and stores the following information for each process:

  • Identifier: A unique identifier for the process, used to distinguish it from other processes.
  • Status: The task state, exit code, exit signal, and so on.
  • Priority: The priority relative to other processes.
  • Program counter: The address of the next instruction to be executed in the program.
  • Memory pointers: Pointers to the program code and process-related data, as well as pointers to memory blocks shared with other processes.
  • Context data: The data held in the processor's registers while the process is executing.
  • I/O status information: Outstanding I/O requests, the I/O devices allocated to the process, and the list of files the process is using.
  • Accounting information: May include the processor time and clock time used, time limits, the account number, and so on.

    Every process running in the system therefore exists in the kernel as a task_struct, and these structures are linked together in a list.
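The list of process descriptors can be pictured with a small user-space sketch. It is not the kernel's definition: the struct toy_task, its four fields, and the helpers toy_add and toy_find are hypothetical names chosen for illustration, and the real task_struct is far larger and is linked through the kernel's own list type.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for task_struct. */
struct toy_task {
    int pid;                 /* identifier */
    long state;              /* status, e.g. 0 for runnable */
    int prio;                /* priority relative to other tasks */
    struct toy_task *next;   /* link to the next task in the list */
};

/* Push a task onto the front of the list and return the new head. */
struct toy_task *toy_add(struct toy_task *head, struct toy_task *t)
{
    t->next = head;
    return t;
}

/* Walk the list and return the task with the given pid, or NULL. */
struct toy_task *toy_find(struct toy_task *head, int pid)
{
    for (struct toy_task *t = head; t != NULL; t = t->next)
        if (t->pid == pid)
            return t;
    return NULL;
}
```

A linear walk like toy_find is the reason a separate lookup structure becomes attractive once the list holds many entries.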

      

    To manage process creation and termination (for example, cleaning up zombie processes), the kernel records parent-child and sibling relationships.

    To handle the same signal uniformly across the threads of a process, it records thread-group relationships; to make global lookup fast, it uses a hash table.

    For the scheduler, it uses run queue and wait queue data structures.
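The parent-child relationship is what lets a parent collect a terminated child. The sketch below shows this from user space with the standard fork() and waitpid() calls; the function name spawn_and_reap is a hypothetical one introduced for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, then reap it in the parent.
 * Returns the child's exit status, or -1 on error. */
int spawn_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    /* Parent: from the child's exit until waitpid() returns,
     * the terminated child remains a zombie. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the parent never called waitpid(), the child's entry would stay in the process table as a zombie until the parent itself exited.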

  2.4 Classification of processes

    Linux divides processes into real-time processes and non-real-time (ordinary) processes; non-real-time processes are further divided into interactive processes and batch processes.

| Type | Description | Examples |
|---|---|---|
| Interactive process | These processes interact with the user frequently and therefore spend much of their time waiting for keyboard and mouse input. Once input arrives, the process must be woken quickly, or the user will find the system unresponsive. | Shells, text editors, and graphical applications |
| Batch process | These processes need no user interaction and therefore often run in the background. Because they do not need to respond quickly, the scheduler often deprioritizes them. | Compilers for programming languages, database search engines, and scientific computing |
| Real-time process | These processes have strict scheduling requirements: they must never be blocked by lower-priority processes, and their response time should be as short as possible. | Video and audio applications, robot control programs, and programs that collect data from physical sensors |
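From user space, the real-time versus ordinary split is visible through the POSIX scheduling policy of a process. The sketch below assumes the standard <sched.h> interface, where SCHED_FIFO and SCHED_RR are the real-time policies and SCHED_OTHER is the default; the helper names are hypothetical. It says nothing about whether an ordinary process is interactive or batch, since that distinction is not reported by this call.

```c
#include <sched.h>

/* Return 1 if the POSIX scheduling policy is a real-time one, else 0. */
int is_realtime_policy(int policy)
{
    return policy == SCHED_FIFO || policy == SCHED_RR;
}

/* Query the calling process's policy; returns -1 if the query fails. */
int current_is_realtime(void)
{
    int policy = sched_getscheduler(0);   /* 0 means "the calling process" */
    return policy < 0 ? -1 : is_realtime_policy(policy);
}
```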

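  2.5 Process Identifier

    Once we know how processes are created, a natural question follows: with so many processes running in the system at the same time, how can the operating system manage them all effectively?

    To manage them, Linux uses the task_struct data structure to describe each process completely. Its definition runs to some 300 lines of code and declares a great many members, which fall roughly into the following groups:

    • Process state (State)
    • Process scheduling information (Scheduling Information)
    • Various identifiers (Identifiers)
    • Inter-process communication information (IPC)
    • Time and timer information (Times and Timers)
    • Process link information (Links)
    • File system information (File System)
    • Virtual memory information (Virtual Memory)
    • Page management information (Page)
    • Symmetric multiprocessing (SMP) information
    • Processor-specific environment (context) information (Processor Specific Context)
    • Other information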

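3. Process states

  3.1 Process state

    volatile long state;    /* scheduling state: one of the TASK_* values */
    int exit_state;         /* exit state: EXIT_ZOMBIE or EXIT_DEAD */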

  3.2 Possible values of the state member

    #define TASK_RUNNING            0
    #define TASK_INTERRUPTIBLE      1
    #define TASK_UNINTERRUPTIBLE    2
    #define __TASK_STOPPED          4
    #define __TASK_TRACED           8
    /* in tsk->exit_state */
    #define EXIT_ZOMBIE             16
    #define EXIT_DEAD               32
    /* in tsk->state again */
    #define TASK_DEAD               64
    #define TASK_WAKEKILL           128
    #define TASK_WAKING             256
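Because each non-zero value in the listing occupies its own bit, a state can be tested with a bitwise AND. As a rough illustration, the hypothetical helper state_letter below maps a value to the one-letter code that tools such as ps display (R, S, D, T, t, Z, X). The constants are copied from the listing; the function itself is a sketch, not kernel code.

```c
/* Values copied from the listing above. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define __TASK_STOPPED        4
#define __TASK_TRACED         8
#define EXIT_ZOMBIE           16
#define EXIT_DEAD             32
#define TASK_WAKEKILL         128

/* Map a state value (state and exit_state OR-ed together)
 * to the letter that ps shows for it. */
char state_letter(long state)
{
    if (state & EXIT_ZOMBIE)          return 'Z';
    if (state & EXIT_DEAD)            return 'X';
    if (state & __TASK_TRACED)        return 't';
    if (state & __TASK_STOPPED)       return 'T';
    if (state & TASK_UNINTERRUPTIBLE) return 'D';
    if (state & TASK_INTERRUPTIBLE)   return 'S';
    return 'R';   /* TASK_RUNNING is 0: no bit is set */
}
```

Note that TASK_RUNNING cannot be tested with AND because its value is 0, which is why it is the fall-through case.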

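  3.3 The individual process states

  TASK_RUNNING

    The process is either executing or ready to execute.

  TASK_INTERRUPTIBLE

    The process is blocked (suspended) while it waits for some condition. Once the condition is met, the process moves from this state to the ready state. A signal can also wake it.

  TASK_UNINTERRUPTIBLE

    Similar to TASK_INTERRUPTIBLE, except that delivering a signal does not wake the process; it is woken only when the resource it is waiting for becomes available.

  TASK_STOPPED

    The process has been stopped, for example after receiving SIGSTOP.

  TASK_TRACED

    The process is being monitored by another process, such as a debugger.

  EXIT_ZOMBIE

    The process has terminated, but its parent has not yet called wait() or a similar system call to collect its termination information. The process is then a zombie.

  EXIT_DEAD

    The process has been killed; this is its final state.

  TASK_KILLABLE

    A newer sleep state in which the process can still be killed. It works like TASK_UNINTERRUPTIBLE, except that the process does respond to fatal signals.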
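The difference between the three sleep states comes down to which bits are set. In the kernel, TASK_KILLABLE is defined as TASK_WAKEKILL combined with TASK_UNINTERRUPTIBLE. The helper signal_would_wake below is a hypothetical simplification of the kernel's wake-up check, written only to show the bit logic; the constant values are those from section 3.2.

```c
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_WAKEKILL         128
/* The kernel defines TASK_KILLABLE as this combination. */
#define TASK_KILLABLE (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/* Hypothetical helper: would a signal wake a task sleeping in `state`?
 * Any signal wakes an interruptible sleeper; only a fatal signal
 * wakes a sleeper that has the TASK_WAKEKILL bit set. */
int signal_would_wake(long state, int fatal)
{
    long mask = TASK_INTERRUPTIBLE;
    if (fatal)
        mask |= TASK_WAKEKILL;
    return (state & mask) != 0;
}
```

Under this model a plain TASK_UNINTERRUPTIBLE sleeper is never woken by a signal, fatal or not, which matches the descriptions above.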

  3.4 State transition diagram

    [Figure: process state transition diagram]

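4. Process scheduling

  4.1 What is process scheduling

    The scheduler uses certain information to decide which process in the system most deserves to run, and combines it with each process's state to keep the system fair and efficient.

    This information usually includes the class of the process (ordinary or real-time), its priority, and so on.

    Put simply, process scheduling means using this information, according to some rule, to decide sensibly which process runs. To organize process execution sensibly and efficiently, the rule is clearly crucial.

    This rule is what we call the scheduling algorithm, which the following sections describe in more detail.

    It should be stressed that in Linux's Completely Fair Scheduler (CFS), the scheduler always picks the process with the smallest vruntime, that is, the one whose virtual runtime has advanced the least.

    This is what "completely fair" means. To distinguish processes of different priority, the vruntime of a higher-priority process grows more slowly, so it gets more chances to run.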
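The "smallest vruntime runs next" rule can be tried out with a toy simulation. Everything here is a simplified assumption: struct toy_entity and the functions pick_next, account, and simulate are hypothetical names, the scan is a plain loop rather than the kernel's tree lookup, and 1024 is used as the weight of a default-priority task.

```c
#include <stddef.h>

#define NICE_0_WEIGHT 1024   /* assumed weight of a default-priority task */

/* Hypothetical toy entity; the kernel's scheduling entity holds much more. */
struct toy_entity {
    unsigned long weight;         /* larger weight = higher priority */
    unsigned long long vruntime;  /* weighted runtime accumulated so far */
    unsigned long runs;           /* how many times it was picked */
};

/* Return the index of the entity with the smallest vruntime
 * (the lowest index wins ties). */
size_t pick_next(const struct toy_entity *e, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (e[i].vruntime < e[best].vruntime)
            best = i;
    return best;
}

/* Charge `delta` units of real runtime: heavier entities
 * accumulate vruntime more slowly. */
void account(struct toy_entity *e, unsigned long long delta)
{
    e->vruntime += delta * NICE_0_WEIGHT / e->weight;
    e->runs++;
}

/* Run `ticks` rounds of pick-then-charge with a fixed tick length. */
void simulate(struct toy_entity *e, size_t n, unsigned ticks,
              unsigned long long tick_len)
{
    for (unsigned t = 0; t < ticks; t++)
        account(&e[pick_next(e, n)], tick_len);
}
```

With two entities of weight 1024 and 2048, the heavier one is picked twice as often, because each of its runs adds only half as much vruntime.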

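  4.2 CFS design

    The idea can be understood simply as allocating run time according to each process's weight. The time allocated to a process is computed as:

      time allocated to a process = scheduling period * weight of the process / sum of the weights of all runnable processes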
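The formula translates directly into integer arithmetic. The function name timeslice_ns and the choice of nanoseconds as the unit are assumptions made for this sketch.

```c
/* slice = period * weight / total_weight, in nanoseconds.
 * Returns 0 when no weight is registered, to avoid dividing by zero. */
unsigned long long timeslice_ns(unsigned long long period_ns,
                                unsigned long weight,
                                unsigned long total_weight)
{
    if (total_weight == 0)
        return 0;
    return period_ns * weight / total_weight;
}
```

As an example with made-up inputs, a 6 ms period shared by three processes of weight 1024, 1024, and 2048 gives them 1.5 ms, 1.5 ms, and 3 ms respectively, so the slices add up to the full period.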

  4.3 CFS data structures

    [Figure: CFS data structures]

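5. Views on the operating system's process model

    Through continual refinement, Linux's scheduling algorithm has moved from the O(1) scheduler of the earlier 2.6 kernels to an algorithm built on a red-black tree, which has given users a very good experience. As the algorithm matures, there is less and less room left to optimize it, but I believe that by studying operating systems we will eventually find a breakthrough that improves it, or develop a better algorithm altogether.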

