Understanding the operating system in five minutes: processes and threads

Processes and threads are the foundation of the operating system. Many articles on the topic are purely theoretical, so this post walks through processes and threads with diagrams, keeping the text as simple and clear as possible and pairing each complex concept with an illustration. Many people come away from an operating systems course with only a rough idea of processes and threads; the goal here is to make the picture concrete.

process

A process is the core concept of the operating system: an abstraction of a running program. Even with only one CPU available, multiple processes can be started, which gives the operating system the ability to support concurrency.

process model

A process is an instance of a program in execution. Each process has its own virtual CPU, program counter, registers, memory address space, and so on; these are private to the process and cannot be read or modified by other processes. The real CPU switches back and forth among the processes.

Suppose there are four programs. Each runs with its own virtual program counter and registers, while the single physical CPU runs them in turn; when the CPU switches away from a program, data such as the physical program counter is saved to memory. Observed over a long enough interval, all of the processes appear to be running, but at any given instant only one program is actually executing. Note that if a program is run twice, it counts as two processes.

The key idea is that a process is an activity of some kind: it has a program, input, output, and a state. A single processor can be shared among several processes, with a scheduling algorithm deciding when to stop work on one process and service another.

When a process creates a new process, the creator is called the parent and the new process is called the child, and these processes form a hierarchy. In UNIX, a special process called init is present in the boot image. When init runs, it reads a file telling it how many terminals there are and creates a new process for each one. These processes wait for a user to log in; when a login succeeds, a shell process is run to accept user commands. Throughout the system, every process belongs to a single tree rooted at init. Windows has no such hierarchy: all processes are equal. When a process is created, the parent is given a handle with which it can control the child, but it may pass that handle to another process.

Process creation and termination

The operating system needs a way to create processes. There are four principal events that cause processes to be created:

  1. System initialization
  2. Execution of a process-creation system call by a running process
  3. A user request to create a new process
  4. Initiation of a batch job

New processes are created by an existing process executing a syscall used to create the process. This system call notifies the operating system to create a new process and specifies, directly or indirectly, the program to run in that process.

On UNIX systems, the fork system call is used to create an exact copy of the calling process. After a successful call, the two processes have the same memory image, the same environment strings, and the same open files. Usually the child process then executes one of the exec family of system calls to replace its memory image and run a new program; this two-step design lets the child manipulate its file descriptors and other state before calling exec.
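As a minimal sketch (not part of the original article), the usual fork-then-exec pattern looks roughly like this; the program being launched (ls) is just an illustrative choice:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
  pid_t pid = fork();                    /* create an identical copy of this process */
  if (pid < 0) {
    perror("fork");
    exit(EXIT_FAILURE);
  }
  if (pid == 0) {
    /* child: adjust file descriptors here if needed, then replace the memory image */
    execlp("ls", "ls", "-l", (char *)NULL);
    perror("execlp");                    /* only reached if exec fails */
    _exit(EXIT_FAILURE);
  }
  int status;
  waitpid(pid, &status, 0);              /* parent: wait for the child to finish */
  printf("child %d exited with status %d\n", (int)pid, WEXITSTATUS(status));
  return 0;
}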

The child's initial address space in UNIX is a copy of the parent's, but the two distinct address spaces may share their non-writable parts. In some UNIX implementations the child shares all of the parent's memory through copy-on-write (COW): when either process modifies a page of that memory, the page is first explicitly copied, so the modification ends up in a private area of the modifying process.

Once created, a process runs and does its work. At some point it finishes its task and terminates, usually due to one of the following conditions:

  1. Normal exit (voluntary)
  2. Error exit (voluntary)
  3. Fatal error (involuntary)
  4. Killed by another process (involuntary)

The first two are terminations handled by the process itself, either because the work is done or because a handleable error occurred at run time. In these cases the process terminates by calling exit and returning a status value to its parent.

Fatal errors, such as dividing by zero or referencing invalid memory, usually crash and terminate the process directly. In UNIX, however, a process can tell the operating system that it wishes to handle certain errors itself, in which case it receives a signal (is interrupted) instead of being terminated when the error occurs.

A process can also be killed by another process, which requires the kill system call. kill works by sending the target process a signal telling it to perform some action. A process can catch some signals and handle them itself, but the KILL signal (SIGKILL) cannot be caught: a process is terminated immediately upon receiving it.
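A small hedged sketch of catching a signal with the standard POSIX API (the choice of SIGTERM and the message are illustrative):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_term = 0;

static void on_term(int signo) {
  (void)signo;
  got_term = 1;                    /* only record the signal; do real work outside the handler */
}

int main(void) {
  struct sigaction sa = {0};
  sa.sa_handler = on_term;
  sigaction(SIGTERM, &sa, NULL);   /* SIGTERM can be caught and handled... */
  /* ...but SIGKILL cannot: an attempt to install a handler for it is rejected */
  while (!got_term)
    pause();                       /* sleep until some signal arrives */
  printf("received SIGTERM, cleaning up and exiting\n");
  return 0;
}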

process states

Although a process is an independent entity, processes sometimes need to interact. A process that is waiting for input from another process, for example, becomes blocked, and its state changes to blocked. A process currently executing on the CPU is in the running state, and a process that has lost the CPU and is waiting to be scheduled is in the ready state.

The blocked state resembles the ready state in that the process is not using the CPU, but the two are fundamentally different. A blocked process needs some condition to hold before it can continue, such as input becoming available or a disk read completing; it gives up the CPU voluntarily so other processes can use it. A ready process could run right now, but the operating system has given the CPU to another process, so it waits for the CPU. In short, a blocked process must wait for its condition to be satisfied before it can continue, while a ready process is merely waiting for the scheduler; the CPU simply has not been allocated to it yet.

These three states can be converted into one another, as the classic three-state diagram shows. When the operating system finds that a process must wait for something and cannot continue, it moves the process from running to blocked. Transitions between running and ready are caused by the scheduler and are invisible to the process itself. When the event a blocked process has been waiting for completes, the process moves to the ready state and waits to be scheduled.

process implementation

The operating system maintains the process model with a process table, in which each process occupies one entry, the process control block. This block holds the important information about the process: program counter, stack pointer, memory allocation, open-file status, scheduling information, and so on, so that when the process is later scheduled again it can restart exactly as if it had never been interrupted.
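As an illustration only (the fields and names below are hypothetical, not any real kernel's layout), a process control block might be sketched like this:

/* Illustrative sketch of a process control block; all field names are hypothetical. */
struct pcb {
  int            pid;             /* process identifier */
  int            state;           /* running, ready, or blocked */
  unsigned long  program_counter; /* saved program counter */
  unsigned long  registers[16];   /* saved general-purpose registers */
  unsigned long  stack_pointer;   /* saved stack pointer */
  void          *page_table;      /* memory-management information */
  int            open_files[64];  /* open-file status / file descriptors */
  int            priority;        /* scheduling information */
  struct pcb    *parent;          /* link to the parent process */
};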

Associated with each I/O device class is a location called the interrupt vector, a fixed area near the bottom of memory that contains the entry address of the interrupt service routine. When an interrupt occurs, the operating system performs a series of steps, saving the context of the current process and then running the interrupt service routine. The steps of handling an interrupt are:

  1. The hardware pushes the program counter and program status word onto the stack
  2. The hardware loads a new program counter from the interrupt vector
  3. An assembly-language routine saves the registers
  4. An assembly-language routine sets up a new stack
  5. The interrupt service routine runs (typically written in C)
  6. The scheduler decides which process to run next
  7. Return from the interrupt
  8. The new current process starts running

A process may be interrupted thousands of times during its execution, but the point is that after each interruption, the interrupted process returns to exactly the same state it was in before the interruption occurred.

thread

In traditional operating systems, each process has one address space and one thread of control. It is often desirable, however, to have multiple threads of control running quasi-parallel in the same address space, behaving almost like separate processes while sharing that address space with one another.

We often introduce threads for the following reasons:

  1. Parallel activities that share one address space and all of its data are hard to express with the multi-process model
  2. Threads are lighter weight than processes: they are easier and faster to create and destroy, which helps when the set of threads must change dynamically and rapidly
  3. If the threads are all CPU bound there is no performance gain, but when substantial computing is mixed with substantial I/O, threads allow these activities to overlap and speed up the application
  4. On systems with multiple CPUs, multithreading is beneficial because true parallelism becomes possible

threading model

A process gathers related resources together: the program text and data, the address space, open files, timers, and so on. A process also has a thread of execution, and the thread has its own program counter, registers, and stack for keeping track of instructions and local variables. Like traditional processes, threads pass through several states for CPU scheduling, and thread state transitions mirror those of processes. In short, the process is the unit that pools resources, while the thread is the entity scheduled for execution on the CPU.

Multiple largely independent threads are allowed to execute simultaneously within the same process environment, which simulates the multi-process model. Since all threads in the multithreading model share exactly the same address space, they also share the same global variables. There is no protection between threads: one thread can read, write, or even wipe out another thread's stack. This lack of protection is also what makes it easy for threads to cooperate on a task.

To make threaded programs portable, IEEE 1003.1c defines a thread standard called pthreads, which most UNIX systems support. Every pthread has an identifier, a set of registers (including the program counter), and a set of attributes stored in a structure, including the stack size, scheduling parameters, and so on.

A new thread is created by calling pthread_create, which returns the new thread's identifier as its return value, much like fork does, so other threads can refer to it. When a thread has finished its work, it can terminate by calling pthread_exit, which is analogous to exit: it stops the thread and releases its stack. A thread can also call pthread_yield to voluntarily give up the CPU to another thread.
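A minimal hedged example of these calls (the worker function and its message are invented for illustration):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg) {
  printf("hello from thread %ld\n", (long)arg);
  pthread_exit(NULL);              /* terminate this thread and free its stack */
}

int main(void) {
  pthread_t tid;
  if (pthread_create(&tid, NULL, worker, (void *)1L) != 0) {
    perror("pthread_create");
    return 1;
  }
  pthread_join(tid, NULL);         /* wait for the worker thread to finish */
  return 0;
}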

kernel thread

With kernel threads, thread creation, scheduling, and management are done by the kernel, so creating or operating on a thread requires a system call. The kernel keeps a thread table recording every thread in the system, with each thread's registers, state, and other information.

Because creating and destroying threads in the kernel is relatively expensive, some systems recycle threads to reduce the overhead: when a thread is destroyed it is only marked as not runnable, its kernel data structures are left untouched, and when a new thread must be created an old one is reactivated.

user-mode threads

User-mode threads put the entire thread package in user space; the kernel knows nothing about it and simply manages ordinary single-threaded processes. Threads can therefore be implemented even on operating systems that do not support them. When threads are managed in user space, each process needs its own thread table to keep track of the threads in that process, and the thread table and the scheduling are handled by the run-time system.

When a user-mode thread is switched, saving its state and invoking the scheduler are purely local operations: no trap into the kernel, no context switch, and none of the related overhead is needed, which makes thread switching at least an order of magnitude faster than with kernel threads. In addition, each process is free to use its own customized scheduling algorithm.

One obvious problem with user-mode threads is how to handle blocking system calls. If a thread makes a blocking call, the whole process blocks until the call completes, which stops all the other threads as well. One solution is to have the run-time system wrap potentially blocking calls: using I/O multiplexing or non-blocking I/O, first check whether the call would block, and if so switch to another thread and retry later when the call is safe to make. This requires rewriting parts of the system-call library as wrappers so that user-mode threads keep running correctly.
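A rough sketch, assuming a hypothetical run-time that provides a thread_yield() primitive, of what such a wrapper around read() might look like:

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

void thread_yield(void);   /* hypothetical: switch to another user-mode thread */

/* Wrapper the run-time system could substitute for read(): it never blocks the whole process. */
ssize_t rt_read(int fd, void *buf, size_t count) {
  int flags = fcntl(fd, F_GETFL);
  fcntl(fd, F_SETFL, flags | O_NONBLOCK);  /* make the descriptor non-blocking */
  for (;;) {
    ssize_t n = read(fd, buf, count);
    if (n >= 0)
      return n;                            /* data (or end of file) was ready */
    if (errno != EAGAIN && errno != EWOULDBLOCK)
      return -1;                           /* a real error */
    thread_yield();                        /* would block: let another thread run instead */
  }
}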

Another problem is that once a thread starts running, no other thread in that process can run unless the first thread voluntarily gives up the CPU. Within a single process there are no clock interrupts, so threads cannot be scheduled round robin; unless a thread enters the run-time system of its own accord, the scheduler never gets a chance to run. One could have the run-time system request a clock signal (interrupt) once per second, but a high-frequency periodic interrupt carries considerable overhead, and if a thread also uses clock interrupts of its own it may interfere with the run-time system's use of the clock.

Various ways of combining the advantages of user-level and kernel-level threads have been studied. One of them multiplexes user-mode threads onto some or all of the kernel threads, with the programmer deciding how many kernel threads and how many user threads to use. The kernel is aware only of the kernel threads and schedules only those; each kernel thread has a set of user threads that take turns using it.

interprocess communication

Processes sometimes need to communicate with each other; this is called inter-process communication (IPC). Three issues have to be considered:

  1. How one process passes information to another
  2. How to make sure two or more processes do not get in each other's way during critical activities
  3. How to maintain proper ordering when there are dependencies among processes

The first problem is easy for threads, since they share an address space; the other two problems, however, apply to threads just as much as to processes.

Race Conditions and Critical Sections

Cooperating processes (or threads) may communicate through a shared storage area. When two or more of them read and write shared data and the final result depends on the precise timing of who runs when, the situation is called a race condition.

Take the simplest increment loop as an example: two threads each add one to the same variable in memory, each looping 10,000,000 times, yet the final result may well not be 20,000,000. The increment is where the two threads race; they compete for the same memory address.

#include <pthread.h>
#include <stdio.h>

int sum = 0; // a possible final result: 13049876, not 20000000
void* add(void* args) {
  for (int i = 0; i < 10000000; i++) ++sum; // unsynchronized increment: the race
  return NULL;
}
int main(void) {
  pthread_t t1;
  pthread_t t2;
  pthread_create(&t1, NULL, add, NULL); // create thread 1
  pthread_create(&t2, NULL, add, NULL); // create thread 2
  pthread_join(t1, NULL); // wait for thread 1 to finish
  pthread_join(t2, NULL); // wait for thread 2 to finish
  printf("%d\n", sum);
}

In fact, whenever memory, files, or any other resource is shared, errors caused by race conditions must be avoided by preventing more than one process from reading and writing the shared data at the same time. In other words, we need mutual exclusion: a way to ensure that while one process is using a shared resource, other processes are kept from doing the same thing. Choosing the primitives that implement mutual exclusion is a major design issue in any operating system.
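For instance, the racing counter above can be made correct with a pthread mutex; this is the standard library-level fix, shown here as a sketch rather than anything specific to the original article:

#include <pthread.h>
#include <stdio.h>

int sum = 0;
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void* add(void* args) {
  for (int i = 0; i < 10000000; i++) {
    pthread_mutex_lock(&lock);    // enter the critical region
    ++sum;
    pthread_mutex_unlock(&lock);  // leave the critical region
  }
  return NULL;
}

int main(void) {
  pthread_t t1, t2;
  pthread_create(&t1, NULL, add, NULL);
  pthread_create(&t2, NULL, add, NULL);
  pthread_join(t1, NULL);
  pthread_join(t2, NULL);
  printf("%d\n", sum);            // now reliably prints 20000000
  return 0;
}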

The problem of avoiding race conditions can also be stated abstractly. Part of the time a process is doing internal computation or other work that cannot lead to race conditions; at certain moments, however, it needs to access shared data or do other things that can cause races. The part of the program where shared memory is accessed is called the critical region or critical section. If we can arrange matters so that no two processes are ever in their critical regions at the same time, race conditions are avoided.

Although this requirement avoids race conditions, it is not by itself enough for concurrent processes that share data to cooperate correctly and efficiently. A good solution must satisfy the following four conditions:

  1. No two processes can be in critical section at the same time
  2. No assumptions should be made about the speed and number of CPUs
  3. A process running outside a critical section must not block other processes
  4. A process must not be made to wait indefinitely to enter a critical section

mutual exclusion with busy waiting

disabling interrupts

On a uniprocessor system, the simplest approach is to have each process disable all interrupts just after entering its critical region and re-enable them just before leaving. With interrupts disabled no clock interrupts can occur, and since the CPU switches processes only as the result of clock or other interrupts, it will not switch to another process while interrupts are off.

It is convenient for the kernel itself to disable interrupts for a few instructions while it updates variables or lists. However, giving user processes the power to disable interrupts is unsafe: if a user process disabled them and never re-enabled them, the whole system would grind to a halt. Moreover, on a multiprocessor, disabling interrupts affects only the CPU that executes the instruction; the other CPUs keep running. So this is not a suitable mutual-exclusion mechanism for user processes.

lock variable

Suppose there is a single shared (lock) variable, initially 0. When a process wants to enter its critical region it tests the lock: if the lock is 0 it sets it to 1 and enters, otherwise it waits. The flaw is that the process may be descheduled between the test and the set, which can leave multiple processes inside the critical region at the same time.

strict alternation

Keep a shared variable that records which process may enter the critical region next. Before entering, a process checks whether the variable equals its own ID; on leaving, it sets the variable to the next process's ID, so the competing processes take turns. Continuously testing a variable until some value appears is called busy waiting. Because it wastes CPU time, it should normally be avoided and used only when there is good reason to believe the wait will be very short. A lock that uses busy waiting is called a spin lock.
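The classic two-process version of strict alternation can be sketched like this (the variable name and process numbering follow the usual textbook presentation):

/* Strict alternation for two processes numbered 0 and 1. */
volatile int turn = 0;            /* whose turn it is to enter the critical region */

void enter_region(int process) {
  while (turn != process)         /* busy-wait until it is our turn */
    ;                             /* spin */
}

void leave_region(int process) {
  turn = 1 - process;             /* hand the turn to the other process */
}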

Peterson's solution

Peterson's solution is a simple software mutual-exclusion algorithm that does not require strict alternation. Before entering its critical region, a process sets its element of the interested array to TRUE and sets the shared turn variable to its own number. It then loops as long as it is still its turn and the other process is also interested; once that condition fails, it enters the critical region. On leaving the critical region, the process sets its interested element back to FALSE.

#define N 2                     // number of processes
#define TRUE 1
#define FALSE 0
int turn;                       // whose turn is it?
int interested[N];              // all values initially FALSE
void enter_region(int process) {
  int other = 1 - process;      // the number of the other process
  interested[process] = TRUE;   // show that we are interested
  turn = process;               // set the flag
  while (turn == process && interested[other] == TRUE); // busy-wait
}
void leave_region(int process) {
  interested[process] = FALSE;  // we have left the critical region
}

the TSL instruction

TSL is a hardware-supported solution. The instruction TSL RX,LOCK (test and set lock) reads the memory word lock into register RX and then stores a nonzero value at that memory address. The read and the write are indivisible: no other processor may access the memory word until the instruction finishes. The CPU executing TSL locks the memory bus so that other CPUs cannot touch the word before the instruction is done.

To enter the critical region, a process uses TSL to set the LOCK variable to 1; if the old value was 0 it enters, and if LOCK was already 1 some other process is inside, so it loops and tests again. On leaving the critical region the process writes 0 back into LOCK. A request to enter the critical region therefore busy-waits until the lock becomes free.
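The same idea can be sketched in C using a compiler-provided atomic test-and-set; here GCC/Clang's __atomic builtins stand in for the TSL instruction:

#include <stdbool.h>

static bool lock = false;                    /* false = free, true = held */

void enter_region(void) {
  /* Atomically set lock to true and return the old value, like TSL RX,LOCK. */
  while (__atomic_test_and_set(&lock, __ATOMIC_ACQUIRE))
    ;                                        /* old value was true: spin until free */
}

void leave_region(void) {
  __atomic_clear(&lock, __ATOMIC_RELEASE);   /* store false (0) back into the lock */
}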

XCHG is an alternative to TSL: it atomically exchanges the contents of a register and a memory word. All Intel x86 CPUs use the XCHG instruction for low-level synchronization.

sleep and wakeup

Both Peterson's solution and TSL are correct, but both share the defect of busy waiting: when a process wants to enter its critical region it checks whether entry is allowed, and if not it simply loops in place until it is. Spinning like this keeps the CPU busy doing nothing useful and wastes CPU resources.

Busy waiting also suffers from the priority inversion problem. Consider two processes, H with high priority and L with low priority, where the scheduling rule is that H runs whenever it is ready. Suppose L is in its critical region at the moment H becomes ready. H begins busy waiting, but because of the scheduling rule L is never scheduled, so L can never leave its critical region and H loops forever.

We now look at interprocess communication primitives that block a process when it cannot enter its critical region, instead of busy waiting. sleep is a system call that blocks the caller until some other process wakes it up; wakeup wakes the process named by its parameter.

As a practical example, consider the producer-consumer problem (also known as the bounded-buffer problem). Two processes share a common, fixed-size buffer. One of them, the producer, puts items into the buffer; the other, the consumer, takes them out. If the producer wants to add an item while the buffer is full, it calls sleep and blocks until the consumer removes an item and wakes it up. Likewise, if the consumer wants to take an item while the buffer is empty, it blocks until the producer puts an item in and wakes it up.

int size = 0;                       // number of items currently in the buffer
const int capacity = 100;           // buffer capacity
void producer(void) {
  while (true) {
    int item = produce_item();      // generate the next item
    if (size == capacity) sleep();  // buffer full: go to sleep
    insert_item(item);              // put the item into the buffer
    ++size;
    if (size == 1) wakeup(consumer);            // buffer was empty: wake the consumer
  }
}
void consumer(void) {
  while (true) {
    if (size == 0) sleep();         // buffer empty: go to sleep
    int item = remove_item();       // take an item out of the buffer
    --size;
    if (size == capacity - 1) wakeup(producer); // buffer was full: wake the producer
    consume_item(item);
  }
}

Although this code expresses the producer-consumer model, it contains a race: access to the shared counter size is unrestricted. Suppose the buffer is empty and the consumer has just read size as 0 when the scheduler suspends the consumer and runs the producer. The producer inserts an item, increments size to 1, and calls wakeup to wake the consumer; but the consumer is not asleep yet, so the wakeup is lost. The next time the consumer runs, it acts on the 0 it read earlier and goes to sleep. Sooner or later the producer fills the buffer and also goes to sleep, and both sleep forever.

The essence of the problem is that a wakeup sent to a process that is not (yet) sleeping is lost; if it were not lost, everything would work. A quick fix is to add a wakeup-waiting bit to each process: when a wakeup is sent to a process that is still awake, the bit is set to 1. Later, when the process is about to go to sleep, it first checks the bit; if the bit is 1, the process clears it and stays awake.

semaphores

The semaphore was proposed by E. W. Dijkstra in 1965: an integer variable that counts the number of wakeups saved for later use. A semaphore's value can be 0 (no wakeups saved) or a positive integer (one or more wakeups pending). A semaphore that takes only the values 0 and 1 is called a binary semaphore. Semaphores support two operations: down (consume one wakeup, blocking if the value is 0) and up (add one wakeup). In Dijkstra's paper the operations are called P (from Dutch proberen, to try; the down operation) and V (from Dutch verhogen, to increase; the up operation). Another important use of semaphores is synchronization: guaranteeing that certain sequences of events do or do not occur. They were first introduced in the programming language Algol 68.

All operations on a semaphore are atomic. (The word comes from the Greek ἄτομος, atomos, indivisible.) An atomic operation is a group of related operations that either all execute without interruption or do not execute at all; the whole is indivisible. Atomic operations are central in computer science to solving synchronization and race problems. To make the semaphore operations atomic, implementations use instructions such as TSL or XCHG so that only one CPU manipulates the semaphore at a time. This is quite different from application-level busy waiting, which may last arbitrarily long: here the semaphore is locked only for a few instructions, so the wait is extremely short.
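A hedged sketch of the producer-consumer problem solved with semaphores, here using POSIX unnamed semaphores (the buffer, item counts, and thread setup are simplified placeholders):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define CAPACITY 100

int buffer[CAPACITY];
int in = 0, out = 0;

sem_t empty;    /* counts empty slots */
sem_t full;     /* counts filled slots */
sem_t mutex;    /* binary semaphore guarding the buffer */

void* producer(void* arg) {
  for (int item = 0; item < 1000; item++) {
    sem_wait(&empty);              /* down(empty): wait for a free slot */
    sem_wait(&mutex);
    buffer[in] = item;
    in = (in + 1) % CAPACITY;
    sem_post(&mutex);
    sem_post(&full);               /* up(full): one more filled slot */
  }
  return NULL;
}

void* consumer(void* arg) {
  for (int i = 0; i < 1000; i++) {
    sem_wait(&full);               /* down(full): wait for an item */
    sem_wait(&mutex);
    int item = buffer[out];
    out = (out + 1) % CAPACITY;
    sem_post(&mutex);
    sem_post(&empty);              /* up(empty): one more free slot */
    printf("consumed %d\n", item);
  }
  return NULL;
}

int main(void) {
  sem_init(&empty, 0, CAPACITY);
  sem_init(&full, 0, 0);
  sem_init(&mutex, 0, 1);
  pthread_t p, c;
  pthread_create(&p, NULL, producer, NULL);
  pthread_create(&c, NULL, consumer, NULL);
  pthread_join(p, NULL);
  pthread_join(c, NULL);
  return 0;
}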

mutex

When the counting ability of a semaphore is not needed, a simplified binary version called a mutex is used. A mutex is suitable only for managing mutual exclusion over a shared resource or a small piece of code, but because it is easy and efficient to implement, it is especially useful in user-space thread packages.

Effective synchronization and locking mechanisms become critical to performance as parallelism increases. A spin lock is fast when the wait is short, but wastes many CPU cycles when the wait grows long. With heavy contention it is efficient to block the process and let the kernel unblock it only when the lock is released; with very little contention, however, the overhead of constantly trapping into the kernel and switching threads becomes enormous, and the amount of contention is hard to predict in advance.

Linux therefore provides the futex (fast userspace mutex), which implements basic locking while avoiding traps into the kernel whenever possible. A futex has two parts: a kernel service and a user-space library. The kernel service provides a wait queue on which multiple processes can wait for a lock; they do not run unless the kernel explicitly unblocks them. Putting a process on the wait queue requires a system call, so it is avoided as much as possible: in the absence of contention, the futex works entirely in user space.

Besides locks for preventing data races, pthreads also provides a synchronization mechanism: condition variables. Mutexes are useful for allowing or blocking access to a critical region, while condition variables allow a thread to block because some condition has not yet been met. In the vast majority of cases the two are used together.

Note that, unlike semaphores, condition variables have no memory: if a signal is sent to a condition variable on which no thread is waiting, the signal is lost.
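A small hedged sketch of the usual mutex-plus-condition-variable pattern (the flag name and the task are invented for illustration):

#include <pthread.h>
#include <stdio.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t ready = PTHREAD_COND_INITIALIZER;
int data_available = 0;              /* the condition the waiting thread cares about */

void* waiter(void* arg) {
  pthread_mutex_lock(&m);
  while (!data_available)            /* always re-check: wakeups can be spurious */
    pthread_cond_wait(&ready, &m);   /* atomically unlock the mutex and block */
  printf("data is available\n");
  pthread_mutex_unlock(&m);
  return NULL;
}

void* notifier(void* arg) {
  pthread_mutex_lock(&m);
  data_available = 1;                /* the state change must be recorded here... */
  pthread_cond_signal(&ready);       /* ...because the signal itself is not remembered */
  pthread_mutex_unlock(&m);
  return NULL;
}

int main(void) {
  pthread_t w, n;
  pthread_create(&w, NULL, waiter, NULL);
  pthread_create(&n, NULL, notifier, NULL);
  pthread_join(w, NULL);
  pthread_join(n, NULL);
  return 0;
}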

Monitor

Writing multi-process and multi-threaded code correctly is hard; the typical errors are race conditions, deadlocks, and other unpredictable, irreproducible behavior. To make correct programs easier to write, a higher-level synchronization primitive, the monitor, was introduced. A monitor is a collection of procedures, variables, and data structures grouped into a special kind of module or package. Processes may call the procedures in a monitor whenever they want, but they cannot directly access its internal data structures from outside the monitor.

Monitors have one very important property: only one process can be active inside a monitor at any instant, which makes them an effective mutual-exclusion primitive. Because the monitor is a programming-language construct, the compiler can treat calls to monitor procedures specially to guarantee this mutual exclusion.

Messaging and barriers

Semaphores are a low-level mechanism for interprocess communication, and monitors exist in only a few languages, so neither can be used for communication between processes on different computers. For that, the operating system provides another primitive, message passing, which consists of two calls: send (send a message to a given destination) and receive (receive a message from a given source). If no message is currently available, the receiver can block until one arrives or return immediately with an error code.

Because the data travels over a network, message passing must deal with messages being lost when the network misbehaves. It must also solve the process-naming problem, so that the processes referred to in send and receive are unambiguous, and authentication is an issue as well.

In message passing, each process can be given a unique address so that messages are addressed to processes, or a new data structure, the mailbox, can be introduced to buffer a certain number of messages. With a buffered mailbox, the sender blocks when the mailbox is full and the receiver blocks when it is empty. With no buffering, the sender blocks until the receiver calls receive, and vice versa; this scheme is known as a rendezvous.

A barrier is a synchronization mechanism for groups of processes. Some applications are divided into phases, with the rule that no process may enter the next phase until all processes are ready to do so; this is implemented by placing a barrier at the end of each phase.
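A brief hedged sketch using POSIX barriers (pthread_barrier_t), with a made-up two-phase computation:

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

pthread_barrier_t barrier;

void* phase_worker(void* arg) {
  long id = (long)arg;
  printf("thread %ld finished phase 1\n", id);
  pthread_barrier_wait(&barrier);    /* nobody starts phase 2 until everyone arrives */
  printf("thread %ld starting phase 2\n", id);
  return NULL;
}

int main(void) {
  pthread_t t[NTHREADS];
  pthread_barrier_init(&barrier, NULL, NTHREADS);
  for (long i = 0; i < NTHREADS; i++)
    pthread_create(&t[i], NULL, phase_worker, (void*)i);
  for (int i = 0; i < NTHREADS; i++)
    pthread_join(t[i], NULL);
  pthread_barrier_destroy(&barrier);
  return 0;
}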

scheduling

Multiple processes or threads often compete for the CPU at the same time; whenever two or more of them are in the ready state, a scheduling decision must be made. If only one CPU is available, the next process to run has to be chosen. The part of the operating system that makes this choice is called the scheduler, and the algorithm it uses is called the scheduling algorithm.

In the early batch systems, with jobs arriving as card images on magnetic tape, the scheduling algorithm was simple: run the jobs on the tape one after another. Multiprogrammed systems made scheduling more complex because there are usually multiple users waiting for service, and some mainframes still combine batch and timesharing, so the scheduler must decide whether a batch job or an interactive user at a terminal goes next. The CPU is a scarce resource, so a good scheduler can make a large difference in perceived performance and user satisfaction. On a personal computer the scheduler matters less: most of the time only one process is active, and the limiting factor is the rate at which the user supplies input rather than the CPU, with heavy computation arising mainly in tasks such as rendering high-resolution video. On network servers, multiple processes routinely compete for the CPU, so scheduling matters a great deal again. On mobile devices resources are scarcer, the CPU is weaker, and power consumption is an important constraint, so the scheduling algorithm there mainly optimizes for power.

In choosing which process to run, the scheduler must also take CPU utilization into account, because a process switch is relatively expensive. The machine must first switch from user mode to kernel mode, then save the state of the current process, including its registers in the process table; in many systems the memory map (the memory reference bits in the page table) must be saved as well. Next, the new process's memory map is loaded into the MMU, and only then does the new process start running. In addition, a process switch invalidates the entire cache, forcing it to be dynamically reloaded from memory twice.

A process that spends most of its time computing, with long CPU bursts and infrequent I/O waits, is called CPU-bound. A process with short CPU bursts and frequent I/O waits is called I/O-bound. I/O-bound processes are I/O-bound not because their individual I/O operations are especially long but because they issue them frequently; it takes the same time to issue the hardware request regardless of how much or how little data is involved. As CPUs get faster, processes tend to become more I/O-bound, so good scheduling of I/O-bound processes will only become more important.

A key question about scheduling is when scheduling decisions are made.

  1. When a new process is created, a decision is needed on whether to run the parent or the child; both are ready, so the scheduler may legally choose either.
  2. A scheduling decision must be made when a process exits: some ready process must be chosen, and if none is runnable a system-supplied idle process normally runs.
  3. When a process blocks on I/O, on a semaphore, or for some other reason, another process must be selected to run; sometimes the reason for blocking plays a role in the choice.
  4. When an I/O interrupt occurs, a scheduling decision may be made. If the interrupt came from a device that has now finished its work, some process that was blocked waiting for that I/O becomes ready, and the scheduler decides whether to run the newly ready process or the one that was running when the interrupt occurred.

If a hardware clock provides periodic interrupts at 50 Hz, 60 Hz, or some other frequency, a scheduling decision can be made at each clock interrupt or at every k-th clock interrupt. Scheduling algorithms can be divided into two categories according to how they deal with clock interrupts.

  1. A nonpreemptive scheduling algorithm picks a process and lets it run until it blocks or voluntarily releases the CPU. It never forcibly suspends a process, even one that has been running for hours, and it is the only option when no clock is available.
  2. A preemptive scheduling algorithm picks a process and lets it run for at most some fixed maximum time. If the process is still running at the end of that interval, it is suspended and another process is scheduled. Preemptive scheduling requires a clock interrupt at the end of the interval so that control of the CPU can be returned to the scheduler.

Because different application fields and operating systems have different goals, different scheduling algorithms are required in different environments, and the optimization of scheduling programs is also different. The environment is mainly divided into three types:

  1. batch processing
  2. interactive
  3. real time

The goals of the scheduling algorithm will also vary depending on the environment.

  1. All systems
    1. Fairness: give each process a fair share of the CPU
    2. Policy enforcement: see that stated policy is carried out
    3. Balance: keep all parts of the system busy
  2. Batch systems
    1. Throughput: maximize the number of jobs per hour
    2. Turnaround time: minimize the time between submission and termination
    3. CPU utilization: keep the CPU busy all the time
  3. Interactive systems
    1. Response time: respond to requests quickly
    2. Proportionality: meet users' expectations
  4. Real-time systems
    1. Meeting deadlines: avoid losing data
    2. Predictability: avoid quality degradation in multimedia systems

Scheduling in batch systems

first-come, first-served

Of all scheduling algorithms, the simplest nonpreemptive one is first-come, first-served (FCFS). Using a simple queue, the process that requests the CPU first is scheduled first.

The main advantages of FCFS are that it is easy to understand and easy to implement: a single linked list holds all the ready processes, selecting a process means taking the one at the head of the list, and a newly arriving job or an unblocked process is simply appended to the tail. Its drawback is just as obvious: the average waiting time can be far too long. If the process at the front has a long CPU burst, every process queued behind it waits much longer, driving up the average waiting time.

For example, suppose there are three processes: P_1 runs for 24 ms, while P_2 and P_3 run for 3 ms each. Executed in FCFS order, the average waiting time is (0 + 24 + 27) / 3 = 17 ms.

shortest job first

When run times can be predicted in advance, the nonpreemptive shortest job first (SJF) algorithm can be used: keep the runnable processes in a priority queue ordered by run time, schedule the process with the shortest run time first and the one with the longest run time last. When a set of jobs with known run times is available simultaneously, SJF is optimal: it gives the minimum average waiting time.

Taking the example in FCFS to calculate, the average waiting time using the SJF algorithm is (0 + 3 + 6) / 3 = 3 ms.

The preemptive version of SJF is shortest remaining time next (SRTN): the scheduler always chooses the process whose remaining run time is shortest. In this way, newly arriving short jobs get good service.

Scheduling in Interactive Systems

round robin

Round robin is one of the oldest, simplest, fairest, and most widely used scheduling algorithms. Each process is assigned a time interval, called its quantum, during which it is allowed to run. If the process is still running when its quantum expires, the CPU is taken away and given to another process; if the process blocks or finishes before the quantum expires, the switch happens immediately.

The most interesting issue in round-robin scheduling is the length of the quantum. Switching from one process to another takes a certain amount of administrative time: the saving and loading of registers and memory maps mentioned earlier. Suppose a process switch (context switch) takes 1 ms and the quantum is 4 ms; then 20% of the CPU time is wasted on switching overhead. Raising the quantum to 100 ms cuts the waste to about 1%, but if 50 runnable processes each use their full quantum, the last one may wait 5 seconds for its turn at the CPU. If the quantum is set longer than the mean CPU burst, preemption rarely happens: most processes block before the quantum runs out, causing a switch anyway. Making preemption rare improves performance, because process switches then occur only when they are logically necessary, that is, when the running process blocks and cannot continue.

In short, a quantum that is too short causes frequent process switches and lowers CPU efficiency, while a quantum that is too long can mean poor response to short interactive requests. A compromise of around 20-50 ms is therefore common.

priority scheduling

Round-robin scheduling makes the implicit assumption that all processes are equally important, and the people who own and operate multiuser systems often see it differently. The need to take such external factors into account leads to priority scheduling. The basic idea is simple: each process is assigned a priority, and the runnable process with the highest priority runs first. To keep high-priority processes from running forever and starving the low-priority ones, the scheduler may lower the priority of the running process at every clock interrupt; if its priority then falls below that of the next-highest process, a process switch occurs. Alternatively, each process may be given a maximum quantum, and when it is used up the process with the next-highest priority is scheduled.

Priorities can be assigned statically or dynamically, including dynamically by the system itself. For example, some I/O-bound processes spend most of their time waiting for I/O to finish; when such a process wants the CPU it should get it at once, so that it can issue its next I/O request, which then proceeds while some other process is computing. Making an I/O-bound process wait a long time for the CPU only keeps it occupying memory for an unnecessarily long time. A simple way to give I/O-bound processes good service is to set the priority to 1/f, where f is the fraction of its last quantum that the process actually used. Another approach is to let user processes adjust their own priorities, so that the scheduling policy can be adapted to the current environment.

compatible time-sharing system

The Compatible Time Sharing System (CTSS), developed at MIT for the IBM 7094, was one of the first systems to use priority scheduling. Process switching in CTSS was very slow because the 7094 could hold only one process in memory at a time. Its designers quickly realized that it was more efficient to give CPU-bound processes a large quantum once in a while than to give them small quanta frequently, but a large quantum hurts response time. Their solution was priority classes: processes in the highest class run for 1 quantum, those in the next class for 2 quanta, the next for 4, and so on.

Whenever a process uses up the quanta allocated to it, it is moved down one class. In this way a process that previously would have needed 100 small runs can get by with only 7 scheduling passes, and as its priority keeps dropping it runs less and less often, leaving the CPU to short interactive processes. A process that has been running a long time but then becomes interactive can be moved back up to a high-priority class, so that it is not punished forever and its response time recovers.

shortest process first

The shortest-job idea always gives the minimum average response time, so it would be nice to apply it to interactive processes as well. The problem is figuring out which of the currently runnable processes is the shortest one.

One approach is to estimate run times from past behavior and run the process with the shortest estimated time. Suppose the estimated run time per command for some terminal is T_0, and the next run is measured to be T_1. The estimate can be updated by taking a weighted sum of these two values, \alpha T_0 + (1-\alpha) T_1. By choosing \alpha we decide whether to forget old run times quickly or remember them for a long time. With \alpha = 1/2, we get the following sequence of estimates:

T_0, \quad \frac{T_0}{2}+\frac{T_1}{2}, \quad \frac{T_0}{4}+\frac{T_1}{4}+\frac{T_2}{2}, \quad \frac{T_0}{8}+\frac{T_1}{8}+\frac{T_2}{4}+\frac{T_3}{2}, \quad \dots

After three new measurements, the weight of T_0 in the estimate has dropped to 1/8. This technique of estimating the next value as a weighted average of the current measurement and the previous estimate is called aging; it applies whenever a prediction must be based on a series of earlier measurements.
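A tiny sketch of aging with \alpha = 1/2, expressed as a running exponential average (the numbers are made up):

#include <stdio.h>

/* One aging step: new estimate = alpha * old estimate + (1 - alpha) * new measurement. */
static double age(double estimate, double measurement, double alpha) {
  return alpha * estimate + (1.0 - alpha) * measurement;
}

int main(void) {
  double estimate = 40.0;                    /* T0: initial estimate, in ms */
  double runs[] = { 20.0, 60.0, 10.0 };      /* T1, T2, T3: measured run times */
  for (int i = 0; i < 3; i++) {
    estimate = age(estimate, runs[i], 0.5);
    printf("estimate after run %d: %.2f ms\n", i + 1, estimate);
  }
  return 0;
}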

Other scheduling algorithms

a. Guaranteed scheduling

A quite different approach is to make explicit performance promises to the users and then live up to them. If n users are logged in, each can be guaranteed 1/n of the CPU power; similarly, on a single-user system with n processes running, each process can be promised 1/n of the CPU time. To keep this promise, the system must track how much CPU time each process has received since it was created and then compute how much each one is entitled to.

b. Lottery scheduling

Making promises to users and then living up to them is a fine idea, but hard to implement. Lottery scheduling gives similarly predictable results and is simple to implement. The basic idea is to hand out lottery tickets for each system resource to the processes; whenever a scheduling decision has to be made, a ticket is drawn at random and the process holding it gets the resource. Applied to CPU scheduling, the system might hold 50 drawings per second, with each winner receiving 20 ms of CPU time as its prize. All processes are equal, but some processes are more equal: important processes can be given extra tickets to raise their chance of winning. For example, if 100 tickets are outstanding and one process holds 20 of them, it will win about 20% of the drawings and, over a long run, get about 20% of the CPU. In general, a process holding a fraction f of the tickets gets roughly a fraction f of the resource.

c. Fair Share Scheduling

So far we have assumed that processes are scheduled on their own, without regard to who owns them. As a result, if user A has 9 processes and user B has 1, then under round robin or equal priorities user A gets 90% of the CPU and user B only 10%. To avoid this, the scheduler can take into account which user owns each process before scheduling, so that every user gets a fair share of the CPU; processes are then picked in a way that enforces that share.

Scheduling in real-time systems

A real-time system is one in which time plays an essential role. Typically, one or more external physical devices send the computer service requests, and the computer must react appropriately within a fixed amount of time. For example, a music player must turn the incoming bit stream into music within very tight intervals or the output will sound strange; a response that is correct but arrives too late is often as bad as no response at all.

Real-time systems are usually divided into hard real time, where absolute deadlines must be met, and soft real time, where missing a deadline occasionally is undesirable but tolerable. Real-time behavior is achieved by dividing the program into a set of processes whose behavior is predictable and known in advance; these processes are generally short-lived and finish extremely quickly. When an external event is detected, the scheduler's job is to schedule the processes in such a way that all deadlines are met.

The events a real-time system must respond to can be further classified as periodic or aperiodic, and a system may have to respond to several periodic event streams. Depending on how much processing each event requires, it may not even be possible to handle them all. If there are m periodic events, and event i occurs with period P_i and requires C_i seconds of CPU time to handle, the load can be handled only if

\sum_{i=1}^{m} \frac{C_i}{P_i} \leq 1

A real-time system that meets this criterion is said to be schedulable, meaning it can actually be implemented. A set of processes that fails the test cannot be scheduled, because the total CPU time they demand exceeds what the CPU can provide. The analysis assumes that the overhead of process switching is small enough to be ignored.
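The test is easy to apply; a minimal sketch (the event set below is an arbitrary example):

#include <stdbool.h>
#include <stdio.h>

/* Returns true if m periodic events with CPU demand c[i] and period p[i] are schedulable. */
static bool schedulable(const double c[], const double p[], int m) {
  double load = 0.0;
  for (int i = 0; i < m; i++)
    load += c[i] / p[i];               /* accumulate C_i / P_i */
  return load <= 1.0;
}

int main(void) {
  /* Example: 50 ms of work every 100 ms, 30 ms every 200 ms, 100 ms every 500 ms. */
  double c[] = { 0.050, 0.030, 0.100 };
  double p[] = { 0.100, 0.200, 0.500 };
  printf("schedulable: %s\n", schedulable(c, p, 3) ? "yes" : "no");  /* load = 0.85 */
  return 0;
}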

The scheduling algorithm of a real-time system can be static or dynamic, the former makes scheduling decisions before the system starts running, and the latter makes scheduling decisions during the running process. Static scheduling works only if all information about the work being done and the deadlines that must be met is known in advance, whereas dynamic algorithms do not require these constraints.

Classic IPC problems

dining philosophers problem

Dijkstra posed and solved the dining philosophers synchronization problem in 1965; it models the competition for mutually exclusive access to a limited set of resources. Ever since, everyone inventing a new synchronization primitive has liked to demonstrate its elegance by solving the dining philosophers problem.

The most obvious solution is to have each philosopher pick up a designated fork (say, the left one) when it becomes available. The problem is equally obvious: if all philosophers pick up their left fork at the same time, none can get a right fork, and a deadlock occurs. A slight modification is to check, after picking up the left fork, whether the right fork is available, and to put the left one back down if it is not. But then all philosophers might pick up their left forks simultaneously, find their right forks taken, put everything down, wait a while, and repeat forever. All the programs keep running yet none makes progress; this is called starvation.

With a single binary semaphore protecting the whole activity (a P operation before picking up the forks and a V operation after eating) there is neither deadlock nor starvation, but then only one philosopher can eat at a time. The improved solution keeps an array recording each philosopher's state plus one semaphore per philosopher, which achieves the maximum degree of parallelism.
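A condensed sketch of that improved solution with semaphores, following the classic textbook structure (the thinking and eating bodies are omitted; mutex is assumed initialized to 1 and each s[i] to 0):

#include <semaphore.h>

#define N        5                    /* number of philosophers */
#define LEFT     ((i + N - 1) % N)    /* i's left neighbor */
#define RIGHT    ((i + 1) % N)        /* i's right neighbor */
#define THINKING 0
#define HUNGRY   1
#define EATING   2

int   state[N];                       /* state of each philosopher */
sem_t mutex;                          /* protects the state array; initialized to 1 */
sem_t s[N];                           /* one semaphore per philosopher; initialized to 0 */

static void test(int i) {
  /* Philosopher i may eat only if hungry and neither neighbor is eating. */
  if (state[i] == HUNGRY && state[LEFT] != EATING && state[RIGHT] != EATING) {
    state[i] = EATING;
    sem_post(&s[i]);                  /* grant both forks; wakes i if it was blocked */
  }
}

void take_forks(int i) {
  sem_wait(&mutex);
  state[i] = HUNGRY;
  test(i);                            /* try to acquire both forks */
  sem_post(&mutex);
  sem_wait(&s[i]);                    /* block here if the forks were not granted */
}

void put_forks(int i) {
  sem_wait(&mutex);
  state[i] = THINKING;
  test(LEFT);                         /* a neighbor may now be able to eat */
  test(RIGHT);
  sem_post(&mutex);
}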

reader-writer problem

The readers-writers problem models access to a database: any number of reader processes may be active at the same time, but readers and writers exclude each other, and a writer inside its critical region excludes every other process.
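A condensed sketch of the classic readers-priority solution with semaphores (the actual reading and writing are left as comments; mutex and db are assumed initialized to 1):

#include <semaphore.h>

sem_t mutex;       /* protects rc; initialized to 1 */
sem_t db;          /* controls access to the database; initialized to 1 */
int   rc = 0;      /* number of processes currently reading */

void reader(void) {
  sem_wait(&mutex);
  rc++;
  if (rc == 1) sem_wait(&db);     /* the first reader locks out writers */
  sem_post(&mutex);

  /* ... read the database ... */

  sem_wait(&mutex);
  rc--;
  if (rc == 0) sem_post(&db);     /* the last reader lets writers in again */
  sem_post(&mutex);
}

void writer(void) {
  sem_wait(&db);                  /* exclusive access to the database */
  /* ... update the database ... */
  sem_post(&db);
}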


Origin blog.csdn.net/qq_62464995/article/details/129777191