Operating System Notes - Process Management

2. Process management

2.1 Processes and Threads

2.1.1 Introduction to processes

In a computer operating system, the process is the basic unit of resource allocation and the basic unit of independent operation.

Precedence graph


A precedence graph is a directed acyclic graph in which each node may represent a statement, a program segment, or a process, and a directed edge between two nodes represents the partial order, or precedence relation, between them.

→ = {(Pi, Pj) | Pi must be completed before Pj can start execution}


Sequential execution of programs

A program usually consists of several program segments that must be executed in a fixed order; a later operation can begin only after the previous one has finished. This kind of computation is the sequential execution of a program. For example, when a job is processed, the user's program and data are always input first, then the computation is performed, and finally the results are printed.

  • Sequentiality. The operations of the processor are executed strictly in the order specified by the program, that is, each operation must end before the next operation starts.
  • Closedness. Once the program starts running, its results are not affected by outside factors, because the program has exclusive use of the system's resources while it runs; the state of these resources (apart from the initial state) can be changed only by this program.
  • Reproducibility. As long as the initial conditions and the execution environment are the same, repeated executions of the program produce the same result (that is, the result is independent of time).

Concurrent execution of programs

Concurrent execution means that several programs (or program segments) are running in the system at the same time and their executions overlap in time: the execution of one program (or program segment) has not yet finished when the execution of another has already begun.

Although concurrent execution improves the system's processing capacity and resource utilization, it also brings some new problems and gives program execution characteristics that differ from sequential execution:

  • Intermittence. When programs execute concurrently, they share resources or cooperate to complete the same task, so they form mutually constraining relationships. In Figure 2-1, if C1 has not completed, P1 cannot proceed and the printing of job 1 is suspended; this is a direct constraint caused by cooperating on the same task. If I1 has not completed, I2 cannot proceed and the input of job 2 stops; this is an indirect constraint caused by sharing resources. These constraints give concurrently executing programs an intermittent "execute–pause–execute" activity pattern.

  • Loss of closedness. When programs execute concurrently, multiple programs share the system's resources, so the state of these resources can be changed by several programs and program execution loses its closedness. A program's run is then inevitably affected by other programs; for example, while one program occupies the processor, the others must wait.

  • Irreproducibility. When programs execute concurrently, the loss of closedness also causes the loss of reproducibility of their results. For example, suppose two loop programs A and B share a variable N: each pass of A performs N = N + 1, and each pass of B performs print(N) followed by N = 0. Because A and B advance at independent speeds, A's N = N + 1 may occur before B's print(N), after B's N = 0, or between the two operations. If N has the value n at some moment, then depending on whether N = N + 1 happens before or after B's two operations (see Figure 2-2), the value printed after one cycle is n + 1 or n (a small demonstration in code follows).

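The following is a minimal sketch of the A/B example above using POSIX threads; the shared variable N and the loop counts are illustrative choices, not part of the original notes. Because the two threads advance at independent speeds, repeated runs print different values, which is exactly the irreproducibility described above.

```c
#include <stdio.h>
#include <pthread.h>

static int N = 0;                       /* the shared variable */

static void *program_A(void *arg) {     /* A: repeatedly performs N = N + 1 */
    (void)arg;
    for (int i = 0; i < 100000; i++)
        N = N + 1;
    return NULL;
}

static void *program_B(void *arg) {     /* B: prints N, then resets it to 0 */
    (void)arg;
    for (int i = 0; i < 5; i++) {
        printf("N = %d\n", N);
        N = 0;
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, program_A, NULL);
    pthread_create(&b, NULL, program_B, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;                           /* compile with: gcc demo.c -pthread */
}
```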

  • Conditions for concurrent execution of programs

    When programs are executed concurrently, their results may not be reproducible, which is not what users want. Concurrently executing programs must therefore retain closedness and reproducibility. Since the loss of closedness under concurrent execution is caused by shared resources, the task is to eliminate this influence.


If two programs P1 and P2 satisfy the following three conditions, they can be executed concurrently and their results are reproducible. (The three conditions are commonly known as Bernstein's conditions; stated with R(P) for the read set and W(P) for the write set of a program P, they require R(P1) ∩ W(P2) = ∅, W(P1) ∩ R(P2) = ∅, and W(P1) ∩ W(P2) = ∅.)


2.1.2 Definition and description of process

In a multiprogramming environment, the concurrent execution of programs destroys their closedness and reproducibility: programs and computations no longer correspond one to one, program activity no longer takes place in a closed system, and program execution shows many new features. The static concept of a "program" can no longer faithfully reflect these characteristics of program activity, so a new concept, the process, is introduced.

Definitions of a process

  • A process is an execution of a program on a processor.
  • A process is a computation that can be performed in parallel with other processes.
  • A process is the execution of a program over a data set and is an independent unit of resource allocation and scheduling in the system.
  • A process can be defined as a data structure and a program that can operate on it.
  • A process is the activity produced when a program executes over a data set on a processor.

Characteristics of the process

  • Dynamic. A process is an execution of a program on a processor, so it is dynamic. This is also reflected in its life cycle: it comes into being through creation, runs when scheduled, pauses when it lacks resources, and disappears when cancelled.
  • Concurrency. Concurrency means that multiple processes reside in memory at the same time and can run during the same period. Processes are introduced precisely so that programs can execute concurrently with one another and improve resource utilization.
  • Independence. A process is a basic unit that can run on its own, and is also the system's independent unit of resource allocation and scheduling.
  • Asynchronicity. Asynchronicity means that processes advance at independent and unpredictable speeds.
  • Structure. To describe and track the changes of a process and make it run correctly, each process is given a process control block (PCB). Structurally, every process consists of a program segment, a data segment, and a process control block.

The relationship between process and program

  • The process is dynamic and the program is static. A process is an execution of a program; each process contains a program segment, a data segment, and a process control block (PCB). A program is an ordered collection of code and has no notion of execution by itself.
  • Processes are temporary, programs are permanent. A process is a process of state changes, and the program can be saved for a long time.
  • The components of processes and programs are different.The process consists of program segments, data segments and process control blocks.
  • Through multiple executions, one program can give rise to multiple different processes, and through calls, one process can execute multiple programs. Processes can create other processes, but programs cannot form new programs.
  • Processes have parallel characteristics (independence, asynchronousness), but programs do not.

The process image (also called the process entity) is composed of three parts: the program segment, the related data segment, and the PCB. The process image is static, while the process is dynamic: a process is the execution of the process entity.

The difference between process and job

A job is the collection of work that a user asks the computer to do to complete a certain task. A job goes through four stages: submission, admission, execution, and completion. A process is the execution of a submitted job and is the basic unit of resource allocation.

  • A job is the task entity that a user submits to the computer. After the user submits a job, the system places it in the job waiting queue on external storage to wait for execution. A process is the execution entity that completes the user's task and is the basic unit that applies to the system for resources; once a process is created, some part of it always resides in memory.
  • A job may be composed of multiple processes and must contain at least one process, but a process cannot belong to multiple jobs.
  • The concept of jobs is mainly used in batch processing systems. Time-sharing systems like UNIX do not have the concept of jobs; the concept of processes is used in almost all multiprogramming systems.

Process composition

  • Process control block (PCB). Each process has a PCB, a data structure that both marks the existence of the process and characterizes its state of execution at any moment. When a process is created, the system allocates and fills in a corresponding PCB for it.
  • Program segment. The program segment is the code in the process that can be scheduled onto the CPU for execution and that implements the process's functions.
  • Data segment. A process's data segment may hold the original data that the process's program operates on, or the intermediate and result data produced while the program executes.

The PCB is the only sign of the existence of a process.

PCB includes:

  • Process identifier (PID). Each process has a unique process identifier to distinguish it from other processes within the system. When a process is created, the system assigns it a unique process identification number.
  • The current status of the process. Describes the current state of the process as a basis for the process scheduler to allocate processors.
  • Process queue pointer. Used to record the address of the next PCB in the PCB queue. PCBs in the system may be organized into multiple queues, such as ready queue, blocking queue, etc.
  • Program and data addresses. The addresses where the process's program and data reside.
  • Process priority. Reflects the urgency of the process's CPU requirements. Processes with higher priority get the processor first.
  • CPU context save area. When the process gives up the processor for some reason, the CPU context (program counter, status register, general-purpose registers, and so on) is saved in this area of the PCB so that the process can continue executing from where it stopped once it regains the processor.
  • Communication information. Record the information exchange that occurs between a process and other processes during execution.
  • Family connections. Some systems allow processes to create child processes, thus forming a process family tree. In PCB, the relationship between this process and its family must be specified, such as the identification of its child process and parent process.
  • Resource list. A list of the resources the process needs and of the resources currently allocated to it (a sketch of a possible PCB structure follows this list).
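
The following is a minimal, illustrative sketch of how a PCB holding the fields listed above might be declared in C. All field names and types are assumptions made for illustration; real kernels (for example, Linux's task_struct) are far more elaborate.

```c
typedef enum { CREATED, READY, RUNNING, BLOCKED, TERMINATED } proc_state_t;

typedef struct pcb {
    int           pid;            /* unique process identifier         */
    proc_state_t  state;          /* current state of the process      */
    struct pcb   *next;           /* link to the next PCB in its queue */
    void         *text_base;      /* address of the program segment    */
    void         *data_base;      /* address of the data segment       */
    int           priority;       /* scheduling priority               */
    struct {                      /* CPU context save area             */
        unsigned long pc;         /* program counter                   */
        unsigned long sp;         /* stack pointer                     */
        unsigned long regs[16];   /* general-purpose registers         */
    } context;
    int           parent_pid;     /* family relationship               */
    int           resources[8];   /* identifiers of resources held     */
} pcb_t;
```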

A system usually contains many processes, some in the ready state and some in the blocked state, blocked for different reasons. To make scheduling and management convenient, the PCBs of the processes must be organized in an appropriate way; the commonly used organizations are linked queues and index tables.

The role of the PCB is to make concurrent execution of programs possible. Creating a process essentially means creating its PCB, and cancelling a process essentially means deleting its PCB.

Why is the PCB the only sign of the existence of a process?

During the entire life cycle of a process, the system always controls the process through the PCB, that is, the system perceives the existence of the process based on the PCB of the process. Therefore, the PCB is the only sign of the existence of the process.

2.1.3 Process status and transitions

Five basic states of a process

  • Ready state. The process has obtained all resources except the processor. Once it obtains the processor, it can be executed immediately. At this time, the process is in the ready state.
  • Execution status (running status). When a process has obtained the necessary resources and is executing on the CPU, the process is in the executing state.
  • Blocking state (waiting state). The process being executed is temporarily unable to execute due to the occurrence of an event (such as waiting for I/O to complete). At this time, the process is in a blocked state. When a process is blocked, it cannot run even if the processor is assigned to it.

When solving problems, take special care to distinguish the ready state from the blocked state. The key is whether the process could execute immediately if it were given the processor: if it could, it is in the ready state; otherwise, it is in the blocked state.

  • Creation state. The process is being created and has not yet moved to the ready state. The system applies for a blank PCB and fills in the information needed to control and manage the process, then allocates the resources the process needs to run, and finally moves the process to the ready state.
  • Termination state. The process is disappearing from the system, either because it terminated normally or because it was aborted for some other reason.

Transitions between process states


  • Ready state → running state: the process is selected by the process scheduler.
  • Running state → blocked state: the process requests and must wait for an event.
  • Running state → ready state: the time slice is used up, or, under preemptive scheduling, a higher-priority process becomes ready.
  • Blocked state → ready state: the event the process was waiting for occurs and the process is woken up.

From the above we can conclude the following:

  • Not all process state transitions are reversible. A process can move neither from the blocked state directly to the running state nor from the ready state to the blocked state.
  • Not all state transitions are initiated by the process itself; many are passive. Only the transition from running to blocked is the program's own action (it actively calls the blocking primitive because of some event); the rest are imposed on the process.
  • A process's state is unique: at any given moment a process is in exactly one state (a small encoding of these rules follows).
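
As a small illustration of the rules above, the sketch below encodes the five-state model and its legal transitions. The state names and the helper function are illustrative, not any particular system's API; note that legal_transition(BLOCKED, RUNNING) and legal_transition(READY, BLOCKED) both return 0, matching the first point above.

```c
typedef enum { CREATED, READY, RUNNING, BLOCKED, TERMINATED } state_t;

/* Returns 1 if the transition from one state to another is legal. */
static int legal_transition(state_t from, state_t to) {
    switch (from) {
    case CREATED: return to == READY;                 /* creation finished      */
    case READY:   return to == RUNNING;               /* selected by scheduler  */
    case RUNNING: return to == READY                  /* time slice used up     */
                      || to == BLOCKED                /* waits for an event     */
                      || to == TERMINATED;            /* finishes or aborts     */
    case BLOCKED: return to == READY;                 /* awaited event occurs   */
    default:      return 0;                           /* no way out of TERMINATED */
    }
}
```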

2.1.4 Process control

The responsibility of process control is to effectively manage all processes in the system. Its functions include process creation, process cancellation, process blocking and awakening, etc. These functions are generally implemented by the kernel of the operating system.

Creation of process

  • Process precedence graph


  • Create primitives

    • User login. In a time-sharing system, a user types login information at a terminal; after the system checks it and the login passes, a new process is created for that terminal user and inserted into the ready queue.
    • Job scheduling. In a batch processing system, when the job scheduler schedules a job according to a certain algorithm, it loads the job into the memory, allocates resources to it, creates a process, and inserts it into the ready queue.
    • Request service. A running process, based on its own needs, creates a new process to complete a specific task.

    The creation primitive carries out process creation. Its main steps are as follows (a user-level illustration using fork() follows the list):

    • First apply for a free PCB from the system and specify a unique process identifier (PID).
    • Allocate necessary resources to the new process.
    • Initialize the PCB of the new process. Fill in the process name, family information, program data address, priority and other information for the PCB of the new process.
    • Insert the new process's PCB into the ready queue.
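
In POSIX systems, process creation is exposed to user programs as the fork() system call. The sketch below is a user-level illustration only; the kernel-side steps listed above (allocating a PCB, assigning a PID, allocating resources, and queueing the new process as ready) happen inside fork().

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();                 /* kernel allocates a PCB, assigns a PID,
                                           and puts the child on the ready queue */
    if (pid < 0) {
        perror("fork");                 /* creation failed (e.g., no free PCB)   */
    } else if (pid == 0) {
        printf("child:  pid = %d\n", getpid());
    } else {
        waitpid(pid, NULL, 0);          /* parent waits for the child to finish  */
        printf("parent: created child %d\n", pid);
    }
    return 0;
}
```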

Cancellation of a process

A process should be cancelled after it has completed its task, so that the resources it occupies can be released promptly. The cancel primitive can adopt one of two strategies: cancel only the process with the specified identifier, or cancel the specified process together with all of its descendant processes. Events that lead to cancellation include normal termination, abnormal termination, and outside intervention.

The function of the cancel primitive is to cancel a process. Its main operation process is as follows:

  • First locate the PCB of the process to be cancelled in the PCB collection.
  • If the process is in the running state, stop its execution immediately and set the rescheduling flag so that the processor can be given to another process once the cancellation is done.
  • Under the second strategy, if the cancelled process has descendant processes, cancel those descendants as well.
  • Reclaim the resources occupied by the cancelled process, returning them either to the parent process or to the system. Finally, reclaim its PCB.

Process blocking and awakening

The function of the blocking primitive (P primitive) is to change the process from the execution state to the blocking state, while the function of the wake-up primitive (V primitive) is to change the process from the blocking state to the ready state.

The main operation process of blocking primitive is as follows:

  • First stop the current process from running. Because the process is executing, the processor should be interrupted.
  • Saves the CPU context of the process so that the process can be recalled later and execution resumes from the point of interruption.
  • Stop the process, change the process state from execution state to blocking state, and then insert the process into the waiting queue for the corresponding event.
  • Go to the process scheduler and select a new process from the ready queue to run.

The main operation process of the wake-up primitive is as follows:

  • Remove the awakened process from the corresponding waiting queue.
  • Change the status to ready and insert into the corresponding ready queue.
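
A toy sketch of what the block and wakeup primitives might look like is shown below. It is deliberately simplified: the queues are plain singly linked lists, and the context saving, interrupt masking, and scheduler internals described above are omitted; all names are assumptions made for illustration.

```c
#include <stddef.h>

typedef enum { READY, RUNNING, BLOCKED } pstate_t;

typedef struct pcb {
    int          pid;
    pstate_t     state;
    struct pcb  *next;
} pcb_t;

static pcb_t *ready_queue;              /* head of the ready queue        */
static pcb_t *running;                  /* the currently running process  */

static void schedule(void) { /* select the next ready process (omitted) */ }

/* block(): the running process stops and joins the event's waiting queue */
static void block(pcb_t **wait_queue) {
    pcb_t *p = running;
    p->state = BLOCKED;                 /* running -> blocked             */
    p->next  = *wait_queue;             /* insert into the waiting queue  */
    *wait_queue = p;
    schedule();                         /* hand the processor to another process */
}

/* wakeup(): move one process from the waiting queue back to the ready queue */
static void wakeup(pcb_t **wait_queue) {
    pcb_t *p = *wait_queue;
    if (p == NULL) return;
    *wait_queue = p->next;              /* remove from the waiting queue  */
    p->state = READY;                   /* blocked -> ready               */
    p->next  = ready_queue;             /* insert into the ready queue    */
    ready_queue = p;
}
```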

Process switching

Process switching refers to the processor turning from running one process to running another. During this switch, the running environment of the process changes substantially.

The process of process switching is as follows:

  • Save the processor context, including the program counter and other registers.
  • Update PCB information.
  • Move the PCB of the process into the corresponding queue, such as ready, blocking queue of a certain event, etc.
  • Select another process to execute and update its PCB.
  • Update memory management data structures.
  • Restore processor context.

Process switching always involves an interrupt and a processor mode switch, from user mode to kernel mode and then back to user mode. However, a processor mode switch does not necessarily cause a process switch: a system call also goes from user mode to kernel mode and back, yet logically the same process continues to occupy the processor.

2.1.5 Process communication

Process communication refers to the exchange of information between processes. Mutual exclusion and synchronization are themselves a form of communication between processes, but because the amount of information they exchange is small and their efficiency is low, they are called low-level process communication methods.

Correspondingly, the P and V primitives can also be called two low-level process communication primitives.

Advanced process communication methods can be divided into three major categories: shared memory systems, message passing systems and pipeline communication systems .

Shared memory systems

In order to transfer large amounts of data, a shared storage area is allocated in the memory, and multiple processes can communicate by reading and writing to the shared storage area. Before communication, the process applies to the system to establish a shared storage area and specifies the keyword of the shared storage area. If the shared storage area has been established, the descriptor of the shared storage area is returned to the applicant. The applicant then attaches the obtained shared storage area to the process. In this way, the process can read and write the shared memory area just like reading and writing ordinary memory.
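
On System V UNIX-like systems this style of communication maps onto shmget/shmat: a process asks the system for a region identified by a key, attaches it to its own address space, and then reads and writes it like ordinary memory. The key value and size below are illustrative choices.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void) {
    key_t key = 0x1234;                              /* keyword of the shared region  */
    int shmid = shmget(key, 4096, IPC_CREAT | 0666); /* create or look up the region  */
    if (shmid < 0) { perror("shmget"); return 1; }

    char *addr = shmat(shmid, NULL, 0);              /* attach it to this process     */
    if (addr == (char *)-1) { perror("shmat"); return 1; }

    strcpy(addr, "hello via shared memory");         /* read/write like normal memory */
    printf("%s\n", addr);

    shmdt(addr);                                     /* detach when finished          */
    return 0;
}
```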

Message passing systems

In a message passing system, data is exchanged between processes in units of messages, and users implement communication directly with a set of communication commands (primitives) provided by the system. The operating system hides the implementation details of communication, which simplifies communication programming, so this method is widely used.

Depending on how they are implemented, messaging systems can be divided into the following two categories:

  • direct communication method. The sending process directly sends the message to the receiving process and hangs it on the message buffer queue of the receiving process. The receiving process obtains the message from the message buffer queue.
  • Indirect communication method. The sending process sends the message to an intermediate entity (usually called a mailbox), and the receiving process obtains the message from it. This communication method is also called mailbox communication. This communication method is widely used in computer networks, and the corresponding communication system is called an email system.
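
As one concrete realization of the indirect (mailbox) method, the sketch below uses POSIX message queues. The queue name and sizes are illustrative, and for brevity a single process both sends and receives here, whereas normally two different processes would open the same mailbox; on Linux, link with -lrt.

```c
#include <stdio.h>
#include <fcntl.h>
#include <mqueue.h>

int main(void) {
    struct mq_attr attr = { .mq_flags = 0, .mq_maxmsg = 10,
                            .mq_msgsize = 128, .mq_curmsgs = 0 };

    /* open (or create) the mailbox; other processes would open it by the same name */
    mqd_t box = mq_open("/demo_box", O_CREAT | O_RDWR, 0644, &attr);
    if (box == (mqd_t)-1) { perror("mq_open"); return 1; }

    mq_send(box, "ping", 5, 0);               /* the sender drops a message in    */

    char buf[128];
    mq_receive(box, buf, sizeof(buf), NULL);  /* the receiver takes a message out */
    printf("received: %s\n", buf);

    mq_close(box);
    mq_unlink("/demo_box");                   /* remove the mailbox               */
    return 0;
}
```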

Pipe communication systems

A pipe is a shared file that connects a reading process and a writing process to enable communication between them. The sending process that feeds the pipe (the writing process) sends large amounts of data into the pipe as a character stream, and the process that receives the pipe's output (the reading process) takes data out of the pipe.
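
A minimal sketch of pipe communication between a writing process (the parent) and a reading process (the child); the message text is illustrative.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];
    if (pipe(fd) < 0) { perror("pipe"); return 1; }

    if (fork() == 0) {                       /* child: the reading process  */
        char buf[64];
        close(fd[1]);                        /* close the unused write end  */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        buf[n > 0 ? n : 0] = '\0';
        printf("reader got: %s\n", buf);
        close(fd[0]);
    } else {                                 /* parent: the writing process */
        const char *msg = "data through the pipe";
        close(fd[0]);                        /* close the unused read end   */
        write(fd[1], msg, strlen(msg));
        close(fd[1]);
        wait(NULL);                          /* wait for the reader         */
    }
    return 0;
}
```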

2.1.6 Threads

Threads are an important technology that has emerged in operating systems in recent years, and their importance is no less than that of processes. Introducing threads raises the degree of concurrent execution and thus further improves system throughput.

The concept of thread

  • Introduction of threads: if the purpose of introducing processes into the operating system was to let multiple programs execute concurrently, improving resource utilization and system throughput, then the purpose of further introducing threads is to reduce the time and space overhead that concurrent execution incurs, so that the operating system has even better concurrency.

  • Definition of thread

    • A thread is an execution unit within a process and is smaller than a process.
    • A thread is a schedulable entity within a process.
    • A thread is a relatively independent control flow sequence in a program (or process).
    • A thread itself cannot run alone, it can only be included in a process and can only be executed within a process.

    To sum up, a thread can be defined as a relatively independent, schedulable execution unit within a process. A thread itself owns essentially no resources, only the few that are essential at run time (a program counter, a set of registers, and a stack), but it shares all the resources owned by the process with the other threads of the same process.

  • Thread implementation

    There are several ways to support threads in an operating system. The most natural is for the operating system kernel to provide the thread control mechanism. In an operating system that only has the concept of processes, a user-level function library can provide the thread control mechanism instead. A third approach provides thread control mechanisms at both the kernel level and the user level.

    • Kernel-level threads. These depend on the kernel and are created and destroyed by the operating system kernel. In a system that supports kernel-level threads, the kernel maintains context information for processes and threads and performs thread switching. When a kernel-level thread blocks on an I/O operation, the other threads are not affected. Processor time slices are allocated to threads, so a process with many threads receives more processor time.
    • User-level threads. These do not depend on the operating system kernel; the application process uses a thread library that provides functions for creating, synchronizing, scheduling, and managing threads. Because user-level threads are maintained by the application process itself, the kernel does not need to know they exist, so they can be used on multi-process operating systems that do not support kernel-level threads, and even on single-user operating systems. Switching user-level threads requires no kernel privileges, and the user-level scheduling algorithm can be tailored to the application; many applications have their own user-level threads. Because user-level thread scheduling happens inside the application process, it usually uses simpler, non-preemptive rules and needs no switch between user mode and kernel mode, so it is especially fast. Of course, since the kernel is unaware of user threads, when one thread blocks the entire process must wait. Processor time slices are allocated to the process, so when a process has many threads, each thread's share of execution time becomes smaller.
  • Thread locks

    Thread locks include mutual-exclusion (mutex) locks, condition locks, spin locks, and read-write locks. Generally speaking, the more powerful the lock, the lower the performance (a short example using a mutex follows this list).

    • Mutex lock. A mutex is a semaphore used to control multiple threads' mutually exclusive access to resources shared between them.
    • Conditional lock. A conditional lock is a condition variable. A conditional lock can be used to block a thread when a certain condition is met. Once the condition is met, a thread blocked due to the condition is awakened in the form of a "semaphore".
    • Spin lock. Similar to a mutex lock, with one difference: when a thread fails to acquire a mutex it gives up the processor and switches to other work, whereas when it fails to acquire a spin lock it keeps looping and re-checking (busy waiting).
    • Read-write lock. A lock implementing the reader–writer model: multiple readers may hold it at the same time, but a writer requires exclusive access.
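
The sketch below shows threads protecting shared data with a mutual-exclusion lock, using POSIX threads; the counter and loop counts are illustrative. With the lock, the final value is always 200000, unlike the unsynchronized example in Section 2.1.1.

```c
#include <stdio.h>
#include <pthread.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);      /* acquire the mutex               */
        counter++;                      /* exclusive access to shared data */
        pthread_mutex_unlock(&lock);    /* release the mutex               */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);  /* two threads of one process */
    pthread_create(&t2, NULL, worker, NULL);  /* share its address space    */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);       /* prints 200000              */
    return 0;
}
```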

Thread status and transitions


  • Six states of threads
    • Initial (NEW): A new thread object is created, but the start() method has not been called yet.
    • Ready state (READY): after the thread object is created, another (running) thread calls its start() method; the thread enters the pool of runnable threads and becomes runnable, waiting only to be granted the CPU. In other words, a thread in the ready state has obtained every resource needed to run except the CPU. Since a newly created thread cannot be running before it enters the ready state, it cannot call start() on itself; start() is invoked by another thread that is running.
    • Running state (RUNNING): After the thread in the ready state obtains the CPU, it starts executing the thread's code.
    • BLOCKED: The thread gives up the right to use the CPU for some reason and temporarily stops running.
    • Waiting (WAITING): When a thread uses the wait() method, it enters the waiting state (enters the waiting queue). Entering this state will release the occupied resources, which is different from blocking.This state cannot be awakened automatically and must rely on other threads to call the notify() method to be awakened.
    • Timeout waiting (TIMED_WAITING): This state is different from but similar to the waiting state. The only difference is whether there is a time limit, that is, the thread in this state will be awakened after waiting for a certain period of time. Of course, it can also be awakened before this time. Previously awakened by notify() method.
    • TERMINATED: Indicates that the thread has completed execution.

Comparison of threads and processes

  • Scheduling. In traditional operating systems, the process is the basic unit both of resource ownership and of independent scheduling. In operating systems that introduce threads, the thread is the basic unit of independent scheduling and the process is the basic unit of resource ownership. Within the same process, switching threads does not cause a process switch; switching from a thread in one process to a thread in another process does cause a process switch.
  • Resource ownership. Whether in a traditional operating system or one with threads, the process is the basic unit that owns resources. A thread does not own system resources (it has only a few essential ones), but it can access the resources of the process to which it belongs.
  • Concurrency. In an operating system that introduces threads, not only processes can execute concurrently, but multiple threads within the same process can also execute concurrently. This allows the operating system to have better concurrency and greatly improves the throughput of the system.
  • System overhead. When a process is created or cancelled, the system must allocate or reclaim resources for it, such as memory space and I/O devices, so the overhead the operating system pays is much greater than when creating or destroying a thread. Likewise, switching processes involves saving the whole CPU environment of the current process and setting up the CPU environment of the newly scheduled one, whereas switching threads only requires saving and restoring a small set of registers, so the overhead is small. In addition, because threads in the same process share the process's address space, synchronization and communication among them are easy to achieve, often without the involvement of the operating system.

Multi-threading models

Some systems support both user-level threads and kernel-level threads, so there are three different multi-threading models based on the connection methods of user-level threads and kernel-level threads.

  • Many-to-one model. The many-to-one model maps multiple user-level threads to one kernel-level thread. In systems using this model, threads are managed in user space and are relatively efficient. However, since multiple user-level threads map to one kernel-level thread,As long as one user-level thread blocks, it will cause the entire process to block. And because the system can only recognize one thread (kernel-level thread), even if there are multiple processors, several user-level threads of the process can only run one at the same time and cannot be executed in parallel.
  • One-to-one model. The one-to-one model maps kernel-level threads to user-level threads one-to-one. The advantage of doing this isWhen a thread blocks, it does not affect the running of other threads, so the concurrency of the one-to-one model is better than the many-to-one model.. And by doing so, multi-thread parallelism can be achieved on multiple processors. The disadvantage of this model is that creating a user-level thread requires creating a corresponding kernel-level thread.
  • Many-to-many model. The many-to-many model maps multiple user-level threads to multiple kernel-level threads (the number of kernel-level threads is not more than the number of user-level threads, and the number of kernel-level threads is determined based on specific circumstances).Adopting such a model can break the limitations of the first two models on user-level threads, not only allowing multiple user-level threads to execute in parallel in a true sense, but also not limiting the number of user-level threads.. Users are free to create the required user-level threads. Multiple kernel-level threads call user-level threads as needed. When a user-level thread blocks, other threads can be scheduled for execution.

2.2 Processor Scheduling

2.2.1 Three-level scheduling of processors

Scheduling is a basic function of the operating system, and almost all resources need to be scheduled before use. Since the CPU is the primary resource of the computer, the scheduling design revolves around how to efficiently utilize the CPU.

In a multiprogramming environment, a job usually undergoes multi-level scheduling from submission to execution, such as high-level scheduling, intermediate scheduling and low-level scheduling. The operating performance of the system depends to a large extent on scheduling, so scheduling becomes the key to multiprogramming.

High-level scheduling (job scheduling)

High-level scheduling is also called macro scheduling, job scheduling, or long-term scheduling. Its main task is to select, according to certain principles, one or more jobs in the backup state on external storage, allocate the necessary resources such as memory and input/output devices to them, and create the corresponding processes, so that these jobs obtain the right to compete for the processor. (A job is the sum of the work that the user asks the computer to do in one computation or one transaction.) Job scheduling runs infrequently, typically once every few minutes.

The scheduler must decide how many jobs the operating system can admit.

How many jobs are admitted into memory at a time by job scheduling depends on the degree of multiprogramming, that is, how many jobs are allowed in memory at the same time. If too many jobs run in memory at once, the quality of service may suffer, for example turnaround times may grow too long; if too few run at once, resource utilization and throughput drop. The degree of multiprogramming should therefore be chosen according to the system's size and speed.

The scheduler must decide which jobs to admit.

Which jobs are moved from external storage into memory depends on the scheduling algorithm used. The simplest is the first-come-first-served algorithm, which first loads the jobs that arrived on external storage earliest. A more commonly used algorithm is shortest-job-first, which first loads the jobs on external storage with the shortest estimated execution time. Other scheduling algorithms exist as well.

Intermediate Scheduling

Intermediate scheduling is also called mid-term scheduling or swap scheduling. It is introduced to improve memory utilization and system throughput. Its main task is, according to given principles and strategies, to move processes in the swap area of external storage that are able to run back into memory, change their state to ready, and place them in the ready queue; or to swap processes in memory that temporarily cannot run out to the swap area of external storage, in which case the process is said to be in the suspended state. Intermediate scheduling is mainly a matter of memory management and swapping (it can be understood as moving content between external storage and memory, much as pages are moved in a paging system).

Low-level scheduling (process scheduling)

Low-level scheduling is also called micro scheduling, process scheduling, or short-term scheduling. Its main task is to select a process from the ready queue according to some strategy and method and allocate the processor to it. Process scheduling runs very frequently, typically once every few tens of milliseconds.

The difference between high-level scheduling and low-level scheduling:

  • Job scheduling prepares a process to run; process scheduling actually lets it run. In other words, the result of job scheduling is that a process is created for the job, while the result of process scheduling is that a process is executed.
  • The number of job scheduling is small and the frequency of process scheduling is high.
  • Some systems do not need to set job scheduling, but process scheduling is required.

2.2.2 Basic principles of scheduling

Different scheduling algorithms have different scheduling strategies, which also determines that scheduling algorithms have different impacts on different types of jobs. When choosing a scheduling algorithm, we must consider the characteristics of different algorithms. In order to measure the performance of scheduling algorithms, some evaluation criteria have been proposed.

CPU utilization

The CPU is the most important and expensive resource of the system, and its utilization is an important indicator for evaluating scheduling algorithms. In batch processing and real-time systems, CPU utilization is generally required to reach a relatively high level. However, for PCs and some systems that do not emphasize utilization, CPU utilization is not the most important.

System throughput

System throughput is the number of jobs the CPU completes per unit time. Long jobs occupy the CPU for a long time and therefore lower the throughput; for short jobs the opposite is true.

Response time

Compared with system throughput and CPU utilization, response time is a user-oriented criterion. In interactive systems, and especially in multi-user systems, many users operate the system at the same time and all of them require a response within a certain time; no user's process should go unscheduled for too long. From the user's point of view, the scheduling policy should therefore keep the response time as short as possible, and at least within a range the user can accept.

Turnaround time

From the perspective of each job, the time it takes to complete the job is critical and is usually measured by turnaround time or weighted turnaround time.

  • Turnaround time: the interval from the submission of a job to its completion, including waiting time and execution time. The turnaround time of job i is expressed as Ti = completion time of job i − submission time of job i.

  • Average turnaround time: the average of the turnaround times of several jobs (say n jobs), expressed as T = (T1 + T2 + … + Tn) / n.

  • Weighted turnaround time: the ratio of a job's turnaround time to its running time. The weighted turnaround time of job i is expressed as Wi = turnaround time of job i / running time of job i.

  • Average weighted turnaround time: the average of the weighted turnaround times of several jobs (a small worked example follows this list).
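
A small worked example of these formulas: three hypothetical jobs run in first-come-first-served order, and their turnaround times, weighted turnaround times, and the averages are computed. The submission and run times are made-up values.

```c
#include <stdio.h>

int main(void) {
    double submit[] = {0, 1, 2};               /* submission times of 3 jobs  */
    double run[]    = {5, 3, 2};               /* run times of the jobs       */
    int n = 3;

    double clock = 0, sum_T = 0, sum_W = 0;
    for (int i = 0; i < n; i++) {              /* jobs run in arrival order   */
        if (clock < submit[i]) clock = submit[i];
        clock += run[i];                       /* completion time             */
        double T = clock - submit[i];          /* turnaround time Ti          */
        double W = T / run[i];                 /* weighted turnaround time Wi */
        sum_T += T;
        sum_W += W;
        printf("job %d: T = %.1f, W = %.2f\n", i + 1, T, W);
    }
    printf("average T = %.2f, average W = %.2f\n", sum_T / n, sum_W / n);
    return 0;
}
```

For these values the turnaround times are 5, 7, and 8, so the average turnaround time is about 6.67 and the average weighted turnaround time is about 2.44.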

2.2.3 Process Scheduling

In a multiprogramming system, the number of user processes is usually greater than the number of processors, so user processes compete for the processor; system processes also need the processor. The system must therefore dynamically allocate the processor to some process in the ready queue, according to a certain strategy, so that it can execute. The task of allocating the processor is carried out by the process scheduler.

Process scheduling function

  • Record the relevant information and status characteristics of all processes in the system.
  • Select the process to get the processor.
  • Processor allocation.

What causes process scheduling

  • The currently running process ends. It ends normally because the task is completed, or it ends abnormally because an error occurred.
  • The currently running process moves from the running state to the blocked state for some reason, such as an I/O request, a P operation, or a call to the blocking primitive.
  • After executing system calls and other system programs, return to the user process. At this time, it can be regarded that the system process has been executed, and a new user process can be scheduled.
  • In a system that uses preemptive scheduling, if a process with a higher priority requests to use the processor, the currently running process will enter the ready queue (this is related to the scheduling method).
  • In a time-sharing system, the time slice allocated to the process has been exhausted (this depends on the system type).

When process scheduling cannot be performed

  • In the process of handling interrupts. The interrupt processing process is complex and it is difficult to switch processes in implementation. Moreover, interrupt processing is part of the system work and does not logically belong to a certain process and should not be deprived of processor resources.
  • In the critical section of the operating system kernel program. After the process enters the critical section, it needs to access the shared data exclusively and theoretically must be locked to prevent other parallel programs from entering. It should not switch to other processes before unlocking to speed up the release of the shared data.
  • During other atomic operations that require complete masking of interrupts. Atomic operations such as locking, unlocking, interrupting on-site protection, and recovery. Atomic operations cannot be subdivided and must be completed once, and process switching is not possible.

How processes are scheduled

The process scheduling method refers to how the processor should be allocated when a process is executing on the processor and a more important or urgent process needs to be handled (that is, a higher-priority process enters the ready queue).

  • Preemptive method. Also called the deprivable method: while a process is executing on the processor, if a higher-priority process enters the ready queue, the executing process is suspended immediately and the processor is given to the new process.
  • Non-preemptive method. Also called the non-deprivable method: while a process is executing on the processor, even if a higher-priority process enters the ready queue, the executing process keeps running until it finishes or enters the blocked state because of some event; only then is the processor given to the new process.

2.2.4 Common scheduling algorithms

First come, first served scheduling algorithm (job scheduling, process scheduling)

The first-come-first-served (FCFS) scheduling algorithm is the simplest scheduling algorithm and can be used for both job scheduling and process scheduling. The basic idea is to allocate the processor in the order in which processes enter the ready queue. FCFS is non-preemptive: once a process (or job) occupies the processor, it keeps running until it completes its work or cannot continue because it is waiting for an event, and only then is the processor released.

On the surface, the first-come-first-served algorithm is fair to all processes (or jobs): it serves them in the order in which they arrive. But suppose there are equal numbers of long processes (taking 10t) and short processes (taking t). Because the numbers are equal, a long process is as likely to arrive first as a short one. When the long process arrives first, the short process waits 10t; when the short process arrives first, the long process waits only t. Therefore the first-come-first-served algorithm favors long processes (jobs) and is unfavorable to short processes (jobs).

Today the first-come-first-served algorithm is rarely used as the main scheduling strategy, and in particular it cannot serve as the main strategy of time-sharing or real-time systems, but it is often combined with other strategies. For example, in a system that schedules by priority, processes or jobs with the same priority are often handled first come, first served.

Short job priority scheduling algorithm (job scheduling, process scheduling)

The short job first (SJF) scheduling algorithm is called the short process first scheduling algorithm when used for process scheduling. The algorithm can be used for both job scheduling and process scheduling.

The basic idea of ​​the short job (or process) priority scheduling algorithm is to allocate the processor to the fastest completed job (or process) . In job scheduling, the short job priority scheduling algorithm selects one or several jobs with the shortest estimated running time from the backup job queue each time and transfers them into the memory, allocates resources, creates processes and puts them into the ready queue. In process scheduling, the short process priority scheduling algorithm selects the process with the shortest estimated running time from the ready queue each time and assigns the processor to it, allowing the process to run and not releasing the processor until it is completed or blocked for some reason.

When all jobs arrive at the same time, SJF is the algorithm with the shortest average turnaround time (if the short jobs run first, the long jobs wait somewhat longer, but if the long jobs ran first the short jobs would wait much longer; since each job's run time is fixed, putting short jobs first minimizes the total waiting time). The algorithm is, however, clearly unfavorable to long jobs: when short jobs keep entering the ready queue, a long job may be unable to get scheduled for a long time and "starve" (starvation means that a process cannot be scheduled, or cannot obtain the resources it needs, for a long period).

Priority scheduling algorithm (job scheduling, process scheduling)

The priority scheduling algorithm is a commonly used scheduling algorithm that can be used for both job scheduling and process scheduling. The basic idea is to allocate the processor to the process with the highest priority. The core question of the algorithm is how to determine process priorities; a process's priority expresses how important the process is, that is, how urgently it should run.

The priority of a process is usually divided into two types: static priority and dynamic priority.

  • Static priority is determined when the process is created and does not change throughout the entire process. The basis for determining static priority is as follows:
    • Determined by process class. System process priority > User process priority
    • Determined based on the resource requirements of the job. Processes that apply for more resources > processes that apply for less resources
    • Determined by user type and requirements. The higher the user's charging standard, the higher the priority of the process corresponding to the user's job.
  • Dynamic priority means that a priority is assigned when the process is created, based on the characteristics of the process and related conditions, and is then adjusted as conditions change while the process runs. Dynamic priority is usually determined as follows:
    • By how long the process has occupied the CPU. The longer a process has held the CPU, the lower its priority and the less likely it is to be scheduled again; conversely, the shorter the time it has held the CPU, the higher its priority and the more likely it is to be scheduled again.
    • By how long the process has been waiting for the CPU. The longer a ready process has waited in the ready queue, the higher its priority and the more likely it is to be scheduled; conversely, the shorter it has waited, the lower its priority and the less likely it is to be scheduled.

Priority-based scheduling algorithms can also be divided into non-preemptive priority scheduling algorithms and preemptive priority scheduling algorithms according to different scheduling methods.

  • The non-preemptive priority scheduling algorithm works as follows: once the system assigns the processor to the highest-priority process in the ready queue, that process keeps running until it gives up the processor of its own accord (for example, its task completes or it requests a device); only then is the processor assigned to the process that now has the highest priority.
  • The preemptive priority scheduling algorithm works as follows: the processor is allocated to the highest-priority process and that process runs; but as soon as another process with a higher priority appears (for example, a higher-priority process becomes ready because the event it was waiting for has occurred), the process scheduler stops the current process and assigns the processor to the new, higher-priority process.

Time slice round-robin scheduling algorithm (process scheduling)

Process scheduling usually uses the time slice round-robin scheduling algorithm. In this algorithm the system arranges all ready processes in a queue in order of arrival. The process scheduler always selects the first process in the queue to run, and limits it to a fixed amount of execution time called a time slice (for example, 100 ms). When the process has used up its time slice (even if it has not finished), the system moves it to the tail of the ready queue and gives the processor to the next ready process. In this way every process in the ready queue gets a time slice of processor time in turn, returns to the tail of the queue, and waits again, and the cycle repeats until the process completes.

If the time slice is set too large and all processes can be executed within one time slice, then the time slice rotation scheduling algorithm will degenerate into a first-come, first-served scheduling algorithm; if the time slice is set too small, then the processor will be in the process. By switching frequently, the time the processor actually spends running user processes will be reduced. Therefore, the time slice size should be set appropriately.

The size of the time slice is determined by the following factors:

  • System response time. The time-sharing system must meet the system's response time requirements. The relationship between the system response time and the time slice can be expressed as

    T = N × q
    where T is the system response time, q is the size of the time slice, and N is the number of processes in the ready queue. From this relation, if the number of processes in the system is fixed, the time-slice size is directly proportional to the system response time (for example, with N = 10 ready processes and q = 100 ms, the response time is about 1 s).

  • The number of processes in the ready queue. When the response time is fixed, the number of processes in the ready queue is inversely proportional to the size of the time slice.

  • System processing capability. The commands that users frequently type should normally be handled within one time slice. Therefore, the faster the computer, the more commands it can process per unit time, and the smaller the time slice can be.

High response ratio priority scheduling algorithm (job scheduling)

The high response ratio priority scheduling algorithm combines the features of first-come-first-served and shortest-job-first: it takes both the job's waiting time and the job's running time into account, remedying the shortcoming that each of the previous two algorithms considers only one of these factors.

The high response ratio priority algorithm is mainly used for job scheduling. The basic idea is that each time a job is to be scheduled, the response ratio of every job in the backup queue is computed first, and the job with the highest response ratio is selected and put into operation.

Response ratio = job response time / estimated running time

that is, Response ratio = (job waiting time + estimated running time) / estimated running time

This algorithm favors short jobs (for the same waiting time, the shorter the estimated running time, the higher the response ratio) while still taking care of long jobs (as long as a job has waited long enough, its response ratio becomes the highest). The algorithm thus considers both short and long jobs, but it increases system overhead because the response ratio of every backup job must be calculated (a small example of the calculation follows).
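
A tiny sketch of selecting the next job by highest response ratio; the waiting times and estimated run times are made-up values.

```c
#include <stdio.h>

int main(void) {
    double wait[] = {6, 4, 9};      /* time each backup job has waited */
    double run[]  = {3, 1, 10};     /* estimated run time of each job  */
    int n = 3, best = 0;
    double best_rr = 0;

    for (int i = 0; i < n; i++) {
        double rr = (wait[i] + run[i]) / run[i];   /* response ratio   */
        printf("job %d: response ratio = %.2f\n", i + 1, rr);
        if (rr > best_rr) { best_rr = rr; best = i; }
    }
    printf("schedule job %d next\n", best + 1);    /* here: the short job
                                                      that has waited a while */
    return 0;
}
```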

Multi-level queue scheduling algorithm (process scheduling)

The basic idea of the multi-level queue scheduling algorithm is to split the ready queue into several independent queues according to the nature or type of the processes, each process belonging permanently to one queue. Each queue has its own scheduling algorithm, and different queues may use different algorithms. For example, one ready queue may be set up for interactive tasks and use the time slice round-robin algorithm, while another is set up for batch tasks and uses first-come-first-served.

Multi-level feedback queue scheduling algorithm (process scheduling)

The multi-level feedback queue scheduling algorithm is the synthesis and development of the time slice rotation scheduling algorithm and the priority scheduling algorithm. By dynamically adjusting the process priority and time slice size, the multi-level feedback queue scheduling algorithm can take into account multiple system goals.


First, you should set up multiple ready queues and give each queue a different priority.The first queue has the highest priority, the second queue has the second highest priority, and the priorities of the remaining queues decrease successively.

Secondly, the size of the time slice differs from queue to queue: the higher the priority of the queue a process is in, the shorter the corresponding time slice. Usually the time slice of queue i+1 is twice the time slice of queue i.

When a new process enters the system, it is placed at the tail of the first queue and waits to be scheduled first come, first served. When its turn comes, if it finishes within that queue's time slice, it leaves the system; if it has not finished when the time slice expires, the scheduler moves it to the tail of the second queue, where it again waits to be scheduled first come, first served. If it still has not finished after running for one time slice in the second queue, it is moved to the third queue in the same way, and so on. In the last queue, the time slice round-robin algorithm is used.

Finally, the scheduler runs processes in the second queue only when the first queue is empty, and runs processes in the i-th queue only when queues 1 through i−1 are all empty. If, while the processor is serving a process in the i-th queue, a new process enters a higher-priority queue, the new process preempts the processor: the scheduler puts the currently running process back at the tail of its queue and assigns the processor to the new process.

Insert image description here

Insert image description here

Insert image description here

2.3 Synchronization and mutual exclusion

2.3.1 Basic concepts of process synchronization

Two forms of constraints

  • Indirect mutual restriction relationship (mutual exclusion): if a process requires a certain resource that is currently being used by another process, and the resource must not be used by two processes at the same time, the process has to wait until the occupying process releases the resource before it can use it. The basic form of this restriction is "process-resource-process".

    This restriction stems from the need for multiple processes of the same type to share certain system resources (such as printers). Mutual exclusion is set up between processes of the same type to achieve mutually exclusive access to such resources (for example, in the producer-consumer problem, producers need mutually exclusive access to the buffer pool).

  • Direct mutual restriction relationship (synchronization)

    A process cannot continue to run if it does not receive the necessary information provided by another process. This situation indicates that the two processes need to exchange information at certain points and communicate with each other about their operation status. The basic form of this restriction relationship is "process-process" .

    This restriction mainly stems from cooperation between processes. Synchronization is set up between different processes to coordinate them (for example, in the producer-consumer problem, the producer puts products into the buffer pool and the consumer takes products out of it for consumption; if the producer has not produced a product, the consumer cannot consume one).

In general, processes of the same type are in a mutually exclusive relationship, while processes of different types are in a synchronization relationship. For example, consumers and consumers are mutually exclusive, while consumers and producers are synchronized.

Critical resources and critical sections

When a process is running, it generally shares some resources with other processes, while other resources are used exclusively. A resource that is only allowed to be used by one process at a time is called a critical resource. Many physical devices are critical resources, such as printers and plotters.

Access to critical resources can be divided into 4 parts:

Insert image description here

  • Entry section. In order to enter the critical section and use the critical resource, the entry section checks whether the critical section may be entered; if it may, a flag meaning "a process is in its critical section" is usually set to prevent other processes from entering at the same time.
  • Critical section. The code in a process that accesses the critical resource.
  • Exit section. The part after the critical section that clears the "a process is in its critical section" flag.
  • Remainder section. The rest of the process apart from the three parts above.

Simply put, a critical resource is a system resource that different processes must access mutually exclusively, while a critical section is the piece of code within each process that accesses the critical resource; entry and exit sections are placed before and after the critical section to do the checking and clean-up. Critical sections and critical resources are different things: a critical resource can only be used by one process at a time, yet more than one process needs it, so access to it must be managed, and this is what gives rise to the concept of the critical section.

The critical section code of each process can be different. What operations a process performs on the critical resource is irrelevant to the critical resource itself and to the mutual exclusion and synchronization management.

Mutually exclusive concepts and requirements

According to the definition of mutual exclusion, when one process has entered its critical section to use the critical resource, any other process must wait until the occupying process exits its critical section before it is allowed to access the critical resource.

In order to prevent two processes from entering the critical section at the same time, the software algorithm or synchronization mechanism should follow the following guidelines:

  • Idle, let in. When no process is in the critical section, a process that requests to enter may be allowed to enter its critical section immediately.
  • Busy, wait. When a process has already entered its critical section, other processes trying to enter must wait.
  • Bounded waiting. A process requesting access to the critical resource should be guaranteed to enter its critical section within a bounded time.
  • Yield while waiting. When a process cannot enter its critical section for the moment, it should release the processor to other processes.

Synchronization concept and implementation mechanism

Generally speaking, the speed at which one process runs relative to another is undefined; in other words, processes run in an asynchronous environment. But cooperating processes need to coordinate their work at certain key points. So-called process synchronization means that multiple cooperating processes may need to wait for each other or exchange information with each other at such key points; this mutual restriction relationship is called process synchronization. Synchronization can be achieved using semaphores.

2.3.2 Mutual exclusion implementation method

Mutual exclusion can be implemented using either software or hardware methods.

Software methods

Algorithm 1: Set a shared integer variable turn to indicate which process is allowed to enter the critical section. If turn is 0, process P0 is allowed to enter; otherwise P0 checks the variable in a loop until turn becomes its own identifier. In its exit section, P0 sets turn to 1, allowing P1 to enter. The algorithm for process P1 is symmetric. The program structure of the two processes is as follows:

Insert image description here

This method guarantees mutually exclusive access to the critical resource, but the problem is that it forces the two processes to enter the critical section in strictly alternating order, which easily leads to poor resource utilization.

For example, when process P0 exits the critical section, turn is set to 1 to allow process P1 to enter. However, if P1 temporarily does not need the critical resource while P0 wants to access it again, P0 will be unable to enter the critical section. It can be seen that this algorithm cannot guarantee the "idle, let in" criterion.

Algorithm 2: Set a flag array flag to indicate whether each process is executing in its critical section, with all initial values false. Before accessing the critical resource, each process first checks whether the other process is in its critical section; if not, it sets its own flag to true and enters the critical section, and it sets its own flag back to false in the exit section. The program structure of the two processes is as follows:

Insert image description here

This algorithm solves the "idle, let in" problem, but a new problem arises: when neither process is in the critical section, both flags are false. If both processes now want to enter at the same time, each may find the other's flag still false (when the two processes execute the check statements alternately, both satisfy the flag[]==false condition), so both enter their critical sections simultaneously, which violates the "busy, wait" rule of critical section access.

Algorithm 3: This algorithm also uses the flag array flag, but here the flag indicates whether a process wants to enter the critical section. Before accessing the critical resource, each process first sets its own flag to true to show that it hopes to enter, and then checks the other process's flag: if the other flag is true, the process waits; otherwise it enters the critical section. The program structure of the two processes is as follows:

Insert image description here

This algorithm does prevent the two processes from entering the critical section at the same time. However, it creates the opposite problem: if both processes want to enter at the same time, each sets its own flag to true and then sees that the other also wants to enter, so both block themselves and neither can ever enter the critical section. This "dead wait" phenomenon violates the "bounded waiting" principle.

Algorithm 4: The idea of ​​this algorithm is a combination of Algorithm 3 and Algorithm 1 . The flag array flag[] indicates whether the process wants to enter the critical section or execute in the critical section. In addition, a turn variable is set to represent the process ID that is allowed to enter the critical section. The program structure of the two processes is as follows:

Insert image description here

Insert image description here

At this point, Algorithm 4 works correctly: flag[] ensures mutually exclusive access to the critical resource, and turn resolves the "starvation" (dead wait) phenomenon.
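Algorithm 4 is essentially Peterson's algorithm. Below is a minimal runnable sketch for two threads standing in for processes P0 and P1; sequentially consistent C11 atomics stand in for the indivisible flag accesses the textbook version assumes.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_bool flag[2];   /* flag[i]: process i wants to enter the critical section */
static atomic_int  turn;      /* which process must yield when both want to enter       */
static long counter;          /* the shared "critical resource"                          */

static void *worker(void *arg) {
    int me = (int)(long)arg, other = 1 - me;
    for (int i = 0; i < 100000; i++) {
        atomic_store(&flag[me], true);          /* entry section: announce intent  */
        atomic_store(&turn, other);             /* politely let the other go first */
        while (atomic_load(&flag[other]) && atomic_load(&turn) == other)
            ;                                   /* busy wait                       */
        counter++;                              /* critical section                */
        atomic_store(&flag[me], false);         /* exit section                    */
    }
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, worker, (void *)0L);
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("counter = %ld (expected 200000)\n", counter);
    return 0;
}
```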

Hardware methods

There are great limitations in completely using software methods to achieve process mutual exclusion, and software methods alone are rarely used now. The main idea of ​​the hardware method is to use one instruction to complete the two operations of checking and modifying the flag, thus ensuring that the checking and modifying operations are not interrupted; or to ensure that the checking and modifying are executed as a whole through interrupt masking .

There are two main hardware methods: one is interrupt masking; the other is hardware instructions.

Advantages of the hardware approach :

  • Wide range of applications. The hardware approach works with any number of processes and is identical in uniprocessor and multiprocessor environments.
  • Simple. The flags of hardware methods are simple to set, have clear meanings, and are easy to verify their correctness.
  • Supports multiple critical sections. When there are multiple critical sections in a process, you only need to set up a Boolean variable for each critical section.

The hardware approach has many advantages, but it also has shortcomings it cannot overcome by itself. Chiefly, a process still consumes processor time while waiting to enter the critical section, so "yield while waiting" cannot be achieved (software cooperation is required for that); and the choice of which waiting process enters the critical section next is left open by the hardware, so some processes may never be selected, leading to "starvation".
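As an illustration of the hardware-instruction idea, the sketch below uses C11's atomic_flag, whose test-and-set operation behaves like the TS instruction: it reads the old value of the lock and sets it to "busy" in one indivisible step. The busy-wait loop is exactly the drawback described above.

```c
#include <stdatomic.h>
#include <stdio.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* clear = critical section is free */

static void enter_region(void) {
    /* Test-and-set: returns the previous value and sets the flag atomically.
     * If it was already set, another process is in the critical section. */
    while (atomic_flag_test_and_set(&lock))
        ;                                     /* busy wait (cannot yield)          */
}

static void leave_region(void) {
    atomic_flag_clear(&lock);                 /* reset the flag on exit            */
}

int main(void) {
    enter_region();
    puts("in critical section");
    leave_region();
    return 0;
}
```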

2.3.3 Semaphores

Although the software and hardware methods explained earlier can solve the mutual exclusion problem, they all have shortcomings. The software algorithms are complex, inefficient and unintuitive, and they busy-wait (the flag variables are tested continuously in the entry section). As for the hardware methods, interrupt masking is not a suitable mutual exclusion mechanism for user processes, and the hardware-instruction method still cannot achieve "yield while waiting".

Semaphores and synchronization primitives

A semaphore is a pair (s, q), where s is an integer variable with a non-negative initial value and q is a queue that is initially empty. The integer s represents the amount of a certain type of resource in the system: when its value is greater than 0, it is the number of currently available resources; when its value is less than 0, its absolute value is the number of processes blocked waiting for this type of resource. Apart from initialization, the value of a semaphore can only be changed by the P operation (also called the wait operation) and the V operation (also called the signal operation). The operating system uses the semaphore's state to manage processes and resources.

When a semaphore is created, its meaning and the initial value of s must be stated precisely (note: the initial value is never negative). Each semaphore has a corresponding queue, which is empty when the semaphore is created.

Let s be a semaphore,

When P(s) is executed, the following actions are performed: first s = s - 1; if s >= 0, the process continues to run; if s < 0, the process is blocked and inserted into the semaphore's waiting queue.

When V(s) is executed, the following actions are performed: first s = s + 1; if s > 0, the process continues to execute; if s <= 0, the first process is removed from the semaphore's waiting queue, made ready and inserted into the ready queue, and then the original process continues execution.

Insert image description here

Insert image description here

Both the P and V operations are indivisible atomic operations, which ensures that an operation on a semaphore is never interrupted part-way through. The P operation corresponds to requesting a resource, and the V operation corresponds to releasing a resource. P and V operations must appear in pairs in the system, but the pair need not be in one process; they can be distributed across different processes.

Classification of semaphores

  • Integer semaphore: an integer semaphore is simply an integer s that, apart from initialization, can only be accessed through the standard atomic operations P and V. The integer semaphore introduces the P and V operations, but when a P operation finds no available resource, the process keeps testing the semaphore, causing a "busy wait" and violating the "yield while waiting" principle.

  • Record semaphore (resource semaphore): to solve the busy-waiting problem of the integer semaphore, a linked list is added to hold all the processes waiting for the resource. The record semaphore takes its name from the record (structure) data type it uses.

    When a process performs a P operation on the semaphore and no resource remains, the process blocks itself, gives up the processor and inserts itself into the waiting list; this mechanism therefore follows the "yield while waiting" principle. When a process performs a V operation on the semaphore and there are still processes waiting for the resource in the list, the first waiting process in the list is awakened. (A runnable sketch of these P and V operations follows this list.)

    If the initial value of the semaphore is 1, it means that the resource is a critical resource that only one process can access at the same time.
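A minimal runnable sketch of a record-style counting semaphore built from a POSIX mutex and condition variable. Note one deliberate simplification: unlike the textbook record semaphore, value never goes negative here; the condition variable's own queue keeps track of the blocked processes instead.

```c
#include <pthread.h>
#include <stdio.h>

typedef struct {
    int value;                  /* number of free resources             */
    pthread_mutex_t lock;       /* makes P and V behave atomically      */
    pthread_cond_t  queue;      /* processes blocked on this semaphore  */
} semaphore;

void sema_init(semaphore *s, int initial) {
    s->value = initial;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->queue, NULL);
}

void P(semaphore *s) {                        /* wait: acquire one resource     */
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)                     /* nothing free: block the caller */
        pthread_cond_wait(&s->queue, &s->lock);
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

void V(semaphore *s) {                        /* signal: release one resource   */
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->queue);           /* wake one waiter, if any        */
    pthread_mutex_unlock(&s->lock);
}

int main(void) {
    semaphore s;
    sema_init(&s, 1);                         /* initial value 1: a critical resource */
    P(&s); puts("resource acquired"); V(&s);
    return 0;
}
```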

Application of semaphore

  • Realize process synchronization

    Suppose there are concurrent processes P1 and P2. P1 contains a statement S1 and P2 contains a statement S2, and S1 must be executed before S2. This synchronization problem is easily solved with a semaphore.

Insert image description here

Insert image description here

  • Implement process mutual exclusion

    Assume that there are processes P1 and P2, each with its own critical section, and the system requires that only one of them be inside its critical section at a time. Semaphores make this mutually exclusive entry easy: set a semaphore N with initial value 1 (that is, one available resource); placing each critical section between P(N) and V(N) is enough to make the two processes enter mutually exclusively.

Insert image description here

If two or more processes need mutually exclusive access to a resource, a semaphore with initial value 1 can be set, and a P operation and a V operation on that semaphore are placed before and after the code in which these processes access the resource; this guarantees mutually exclusive access to the resource.
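A minimal runnable sketch using POSIX semaphores, where sem_wait plays the role of P and sem_post the role of V; the shared counter and iteration count are illustrative.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t N;                  /* mutual exclusion semaphore, initial value 1 */
static long shared_counter;      /* the shared resource                         */

static void *proc(void *arg) {
    for (int i = 0; i < 100000; i++) {
        sem_wait(&N);            /* P(N): enter the critical section */
        shared_counter++;        /* critical section                 */
        sem_post(&N);            /* V(N): leave the critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t p1, p2;
    sem_init(&N, 0, 1);          /* initial value 1: at most one process inside */
    pthread_create(&p1, NULL, proc, NULL);
    pthread_create(&p2, NULL, proc, NULL);
    pthread_join(p1, NULL);
    pthread_join(p2, NULL);
    printf("shared_counter = %ld (expected 200000)\n", shared_counter);
    return 0;
}
```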

2.3.4 Classic synchronization problems

producer-consumer problem

The producer-consumer problem is a well-known process synchronization problem. It describes a group of producers supplying products to a group of consumers through a shared bounded buffer pool: producers put products into the buffer and consumers take products out. The problem is an abstraction of many cooperating processes; for example, during input the input process is the producer and the computing process is the consumer, while during output the computing process is the producer and the printing process is the consumer.

To solve this problem, two synchronization semaphores are needed: empty, the number of empty buffers, with initial value equal to the buffer size n; and full, the number of full buffers (i.e. the number of products), with initial value 0. In addition, a mutex semaphore with initial value 1 is needed to guarantee that producers and consumers access the buffer pool mutually exclusively.

Insert image description here

The order of P(full)/P(empty) and P(mutex) cannot be reversed. The P operation must be performed on the resource semaphore first and then the P operation on the mutex semaphore, otherwise a deadlock will occur.

When several semaphores are involved, the order of the P operations usually cannot be reversed: the P operation on the resource semaphore must come before the P operation on the mutex semaphore. This ensures that a process holding the access right to the buffer actually has a resource to use; otherwise a "dead wait" can occur in which a process holds the access right but no resource is available.

Whenever there are multiple processes of the same type, a mutually exclusive semaphore is required.
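A minimal runnable sketch with one producer and one consumer, POSIX semaphores empty (= n), full (= 0) and mutex (= 1), and the P-operation ordering described above; the buffer size and item count are illustrative.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define N      4                 /* buffer pool size                     */
#define ITEMS  8                 /* items to produce/consume in the demo */

static int buffer[N], in, out;
static sem_t empty, full, mutex;  /* empty = N, full = 0, mutex = 1       */

static void *producer(void *arg) {
    for (int i = 1; i <= ITEMS; i++) {
        sem_wait(&empty);         /* P(empty): wait for a free slot       */
        sem_wait(&mutex);         /* P(mutex): lock the buffer pool       */
        buffer[in] = i;
        in = (in + 1) % N;
        printf("produced %d\n", i);
        sem_post(&mutex);         /* V(mutex)                             */
        sem_post(&full);          /* V(full): one more product available  */
    }
    return NULL;
}

static void *consumer(void *arg) {
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full);          /* P(full): wait for a product          */
        sem_wait(&mutex);         /* P(mutex): lock the buffer pool       */
        int item = buffer[out];
        out = (out + 1) % N;
        printf("consumed %d\n", item);
        sem_post(&mutex);         /* V(mutex)                             */
        sem_post(&empty);         /* V(empty): one more free slot         */
    }
    return NULL;
}

int main(void) {
    pthread_t p, c;
    sem_init(&empty, 0, N);
    sem_init(&full, 0, 0);
    sem_init(&mutex, 0, 1);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```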

reader-writer problem

In the reader-writer problem there is a data area shared by many processes; it may be a file or an area of main memory. Some processes (readers) only read the data area, and some processes (writers) only write to it. In addition, the following conditions must be met:

  • Any number of readers can read this file at the same time.
  • Only one writer can write to the file at a time (writers must be mutually exclusive).
  • If a writer is operating, any reader process is prohibited from reading the file and any other writer process is prohibited from writing the file.
  • Reader-priority algorithm

    When a reader arrives wanting to read, it can start reading immediately if other readers are already reading, without waiting. Since writers cannot write as long as any reader is reading, while newly arriving readers may start at once, a continuous stream of readers keeps the writer waiting: the writer can only start writing after all readers have finished and left. This is reader priority.

    To solve this problem, the following are needed: an integer variable readcount recording the number of readers, with initial value 0 (while it is greater than 0 a reader is present and the writer may not write); a mutex semaphore rmutex with initial value 1, ensuring that the reader processes access readcount mutually exclusively; and a mutex semaphore mutex with initial value 1, controlling the writers' mutually exclusive access to the data area. The algorithm is as follows (a runnable sketch of this reader-priority version appears after this list of algorithms):

Insert image description here

Insert image description here

  • Fair (arrival-order) algorithm

    Processes are served exactly in their order of arrival: when a reader arrives wanting to read, if a writer is already waiting to write or is writing, readers that arrive later must wait until that earlier writer has finished writing before they can start reading.

    To achieve this, compared with the reader-priority algorithm one more semaphore wmutex is needed, with initial value 1, indicating whether a writer is writing or waiting; if there is such a writer, new readers are barred from entering. The algorithm is as follows:

Insert image description here

Insert image description here

  • Writer-priority algorithm

    Some books also call the fair (arrival-order) algorithm writer-priority, but it is not writer priority in the true sense; it merely serves read and write requests in arrival order. True writer priority means that when writers and readers are waiting at the same time, a newly arriving writer may jump the queue ahead of the waiting readers: as long as any writer is in the waiting queue, it will be woken before the readers, no matter when it arrived. Achieving this requires additional semaphores.

    To achieve this, an additional semaphore readable is introduced to ensure that an arriving writer can enter the critical section ahead of readers: a writer only needs to wait for the previous writer to finish writing before entering, regardless of whether readers arrived before or after it. An integer writecount is also added to count the writers. Compared with the previous algorithm, the role of wmutex changes: it is now used to give writers mutually exclusive access to writecount. The algorithm is as follows:

Insert image description here
Insert image description here

This method uses the readable semaphore to let writers jump the queue. When the first writer arrives, it acquires the readable semaphore and keeps holding it, so subsequent reader processes block because they cannot acquire readable. Subsequent writers do not need to acquire readable and therefore queue behind the earlier writer, which achieves the queue-jumping effect. Only after all writers have finished writing and the last writer releases readable can readers continue reading; if a new writer then arrives, it occupies readable again, blocks subsequent readers, and the cycle repeats. This algorithm implements true writer priority: a newly arriving writer can even operate on the data area before readers that arrived earlier.
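A minimal runnable sketch of the reader-priority variant referenced above, with readcount protected by rmutex and mutex giving writers exclusive access to the data area; the thread counts and print statements are illustrative.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static int   readcount;          /* number of readers currently reading  */
static sem_t rmutex;             /* protects readcount, initial value 1  */
static sem_t mutex;              /* writers' exclusive access, initial 1 */

static void *reader(void *arg) {
    sem_wait(&rmutex);
    if (++readcount == 1)        /* first reader locks out the writers   */
        sem_wait(&mutex);
    sem_post(&rmutex);

    printf("reader %ld reading\n", (long)arg);   /* read the shared data */

    sem_wait(&rmutex);
    if (--readcount == 0)        /* last reader lets the writers back in */
        sem_post(&mutex);
    sem_post(&rmutex);
    return NULL;
}

static void *writer(void *arg) {
    sem_wait(&mutex);            /* exclusive access to the data area    */
    printf("writer writing\n");
    sem_post(&mutex);
    return NULL;
}

int main(void) {
    pthread_t r1, r2, w;
    sem_init(&rmutex, 0, 1);
    sem_init(&mutex, 0, 1);
    pthread_create(&r1, NULL, reader, (void *)1L);
    pthread_create(&w,  NULL, writer, NULL);
    pthread_create(&r2, NULL, reader, (void *)2L);
    pthread_join(r1, NULL); pthread_join(w, NULL); pthread_join(r2, NULL);
    return 0;
}
```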

Philosophers' dining problem

Five philosophers sit around a round table on which there are five chopsticks, one between each pair of neighbours. A philosopher alternates between thinking and eating: to eat, he must pick up the chopsticks on his left and right; when he goes back to thinking, he puts both chopsticks back in their places. The dining philosophers problem can be regarded as a typical problem of handling critical resources among concurrently executing processes.

Chopsticks are critical resources and cannot be used by two philosophers at the same time, so a semaphore array is used to represent chopsticks.

Insert image description here

This solution has a problem: it can lead to deadlock. If all 5 philosophers become hungry at the same time and each picks up the chopstick on his left, all 5 chopsticks are taken; when each then tries to pick up the chopstick on his right, all of them block and wait forever.

For this deadlock problem, the following solutions can be used:

  • Only a maximum of 4 philosophers are allowed to eat at the same time.
  • A philosopher can pick up chopsticks only if the chopsticks on his left and right sides are available at the same time.
  • Number the philosophers and ask the odd-numbered philosopher to take the left chopstick first, and the even-numbered philosopher to take the right chopstick first.

Here is the solution to the last method: it is stipulated that the odd-numbered philosopher takes the left chopstick first, and then the right chopstick; the even-numbered philosopher does the opposite. The algorithm is as follows:

Insert image description here

Insert image description here
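A minimal runnable sketch of the third remedy, using one binary semaphore per chopstick; philosophers are indexed 0 to 4, and odd-numbered philosophers take the left chopstick first while even-numbered ones take the right chopstick first, so the circular wait cannot form. The single eat-once structure is illustrative.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define N 5
static sem_t chopstick[N];       /* one binary semaphore per chopstick */

static void *philosopher(void *arg) {
    int i = (int)(long)arg;
    int left = i, right = (i + 1) % N;

    /* think ... */
    if (i % 2 == 1) {            /* odd: take the left chopstick first  */
        sem_wait(&chopstick[left]);
        sem_wait(&chopstick[right]);
    } else {                     /* even: take the right chopstick first */
        sem_wait(&chopstick[right]);
        sem_wait(&chopstick[left]);
    }
    printf("philosopher %d is eating\n", i);
    sem_post(&chopstick[left]);
    sem_post(&chopstick[right]);
    return NULL;
}

int main(void) {
    pthread_t t[N];
    for (int i = 0; i < N; i++) sem_init(&chopstick[i], 0, 1);
    for (int i = 0; i < N; i++) pthread_create(&t[i], NULL, philosopher, (void *)(long)i);
    for (int i = 0; i < N; i++) pthread_join(t[i], NULL);
    return 0;
}
```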

barber problem

A barber shop has one barber, one barber chair and a number of stools on which customers can wait (assume there are n stools). If there are no customers, the barber sleeps in the barber chair. An arriving customer must first wake up the barber; if the barber is already cutting another customer's hair, the new customer waits on a stool if one is free, and leaves if there is no free stool. Design a program for the barber and for each customer that describes their activities.

There are two ways of approaching this problem: one is to treat the barber chair and the waiting stools as two different resources; the other is to treat the barber chair and the stools as a single unified "chair" resource. The first idea needs somewhat more code but is easier to come up with; the second needs less code but is harder to understand. (A sketch of the classic solution follows the figures below.)

Insert image description here

  • Insert image description here

    Insert image description here

  • Insert image description here
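A minimal runnable sketch of the classic sleeping-barber solution, close in spirit to the first idea: the stools are tracked by the counter waiting (protected by mutex), while the semaphores customers and barber_ready synchronize the barber and the customers. The stool and customer counts are illustrative.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define STOOLS    3              /* n waiting stools (illustrative value)   */
#define CUSTOMERS 5

static int   waiting;            /* customers currently sitting on stools   */
static sem_t customers;          /* customers waiting for service, initial 0 */
static sem_t barber_ready;       /* barber ready to cut hair, initial 0      */
static sem_t mutex;              /* protects 'waiting', initial 1            */

static void *barber(void *arg) {
    for (;;) {
        sem_wait(&customers);    /* sleep until a customer arrives           */
        sem_wait(&mutex);
        waiting--;               /* take one customer off a stool            */
        sem_post(&mutex);
        sem_post(&barber_ready); /* invite the customer to the chair         */
        printf("barber: cutting hair\n");
    }
    return NULL;
}

static void *customer(void *arg) {
    sem_wait(&mutex);
    if (waiting < STOOLS) {      /* a stool is free: sit down and wait       */
        waiting++;
        sem_post(&mutex);
        sem_post(&customers);    /* wake the barber if he is asleep          */
        sem_wait(&barber_ready); /* wait until the barber is ready           */
        printf("customer %ld: getting a haircut\n", (long)arg);
    } else {
        sem_post(&mutex);        /* shop full: leave                         */
        printf("customer %ld: leaves, no free stool\n", (long)arg);
    }
    return NULL;
}

int main(void) {
    pthread_t b, c[CUSTOMERS];
    sem_init(&customers, 0, 0);
    sem_init(&barber_ready, 0, 0);
    sem_init(&mutex, 0, 1);
    pthread_create(&b, NULL, barber, NULL);
    for (long i = 0; i < CUSTOMERS; i++)
        pthread_create(&c[i], NULL, customer, (void *)i);
    for (int i = 0; i < CUSTOMERS; i++)
        pthread_join(c[i], NULL);
    return 0;                    /* process exit also ends the barber's endless loop */
}
```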

Analysis of steps to solve the problem of semaphore mechanism

  • Relationship analysis: first determine what synchronization relationships exist in the problem. Each pair of synchronization relationships usually requires one resource semaphore, whose initial value is set to the corresponding number of resources in the problem. Each sentence of the problem statement generally implies one synchronization relationship or one type of resource. A synchronization relationship may exist not only between two different roles (such as producer and consumer) but possibly also within the same role; this is only a possibility to check for (in fact there is no synchronization relationship between producers).

  • Determine the critical resources: the code that accesses a critical resource is called a critical section. Since a critical resource may be accessed by only one process at a time, a mutex semaphore with initial value 1 must be used when accessing it; mutex is generally used as the name of this semaphore. The general way to write the critical section is:

    Insert image description here

    It should be noted that if access to the critical resource is further constrained by other synchronization or mutual exclusion relationships, the P and V operations of the resource semaphore involved in that constraint are generally written outside the general critical-section code above. Assuming that resource semaphore is N, the modified critical-section structure is:

    Insert image description here

  • Organize the solution: determine the concrete code of each role's process and the semaphores it uses, completing the answer to the semaphore-mechanism problem. When answering, a P operation (wait) can be thought of as decreasing the count of that resource by 1, and a V operation (signal) as increasing it by 1.

When solving synchronization mutual exclusion problems, should we add loop statements to express concurrency?

Whether to add a loop statement depends on the type of process involved. For example, in the producer-consumer problem, producers and consumers produce and consume continuously, so the production and consumption code must execute in a loop; a loop statement (usually while) is needed to keep the code running, and if the problem requires stopping under some condition, a break can be added at the appropriate place inside the loop. The processes of some problems do not need to loop: for example, the customer process in the barber problem usually leaves after the haircut is finished, so it only needs to execute once and then end, and no loop statement is needed (a customer does not keep wanting haircuts forever).

2.3.5 Monitors

The semaphore mechanism can achieve synchronization and mutual exclusion between processes, but because the semaphore operations are scattered throughout the program, their correctness is hard to analyse, and improper use may lead to deadlock. In response to these problems, Dijkstra proposed in 1971 to set up a "secretary" for each shared resource to manage access to it: all visitors must go through the "secretary", and the "secretary" allows only one visitor (process) to access the shared resource at a time. This makes it easier for the system to manage shared resources and ensures mutually exclusive access and synchronization between processes. In 1973, Brinch Hansen and Hoare developed the "secretary" concept into the monitor concept.

A monitor defines a data structure and a set of operations that concurrent processes can perform on it; these operations can synchronize processes and change the data inside the monitor. From this definition, a monitor consists of the declaration of the shared data structures local to the monitor, a set of procedures that operate on those data structures, and statements that set initial values for the local data. The monitor gathers the critical sections that would otherwise be scattered across processes and provides mutually exclusive access to the shared variables, thereby protecting them.

Basic characteristics of monitors:

  • Data local to the monitor can only be accessed by procedures local to the monitor.
  • A process can enter the monitor to access shared data only by calling a procedure within the monitor.
  • Only one process at a time is allowed to execute a procedure inside the monitor; that is, processes enter the monitor mutually exclusively by calling its internal procedures. Other processes that want to enter the monitor must wait, blocked in an entry queue.

The monitor definition also includes the following facilities to support synchronization:

  • A number of condition variables that are localized to the monitor and can only be accessed from within the monitor, used to distinguish between different reasons for waiting.
  • Two procedures, wait and signal, that operate on condition variables. wait blocks the calling process in the queue associated with the condition variable and frees the monitor, allowing other processes to enter it. signal wakes up a process blocked on the condition variable; if there are several such processes, one of them is chosen to be woken, and if no process is blocked on the condition variable, nothing is done. The monitor's signal procedure must be called after the wait procedure has been called. (A pthread-based sketch of a monitor follows this list.)
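A minimal sketch of the monitor idea, using a pthread mutex for the "only one process inside the monitor at a time" rule and a pthread condition variable for wait/signal; the bounded counter it protects is purely illustrative (languages with built-in monitor support, such as Java's synchronized methods with wait/notify, generate this plumbing automatically).

```c
#include <pthread.h>
#include <stdio.h>

typedef struct {
    pthread_mutex_t lock;        /* mutual exclusion at the monitor boundary */
    pthread_cond_t  not_zero;    /* condition variable: "counter > 0"        */
    int counter;                 /* shared data local to the monitor         */
} counter_monitor;

void monitor_init(counter_monitor *m) {
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->not_zero, NULL);
    m->counter = 0;
}

void monitor_increment(counter_monitor *m) {     /* a procedure of the monitor */
    pthread_mutex_lock(&m->lock);
    m->counter++;
    pthread_cond_signal(&m->not_zero);           /* signal: wake one waiter    */
    pthread_mutex_unlock(&m->lock);
}

void monitor_decrement(counter_monitor *m) {     /* blocks while counter is 0  */
    pthread_mutex_lock(&m->lock);
    while (m->counter == 0)
        pthread_cond_wait(&m->not_zero, &m->lock);  /* wait: release the monitor */
    m->counter--;
    pthread_mutex_unlock(&m->lock);
}

int main(void) {
    counter_monitor m;
    monitor_init(&m);
    monitor_increment(&m);
    monitor_decrement(&m);
    puts("monitor demo done");
    return 0;
}
```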

2.4 Deadlock

2.4.1 The concept of deadlock

In a multiprogramming system, although the concurrent execution of multiple processes improves the utilization of system resources and increases the system's processing capabilities, the concurrent execution of multiple processes also brings new problems - deadlock.

When multiple processes are permanently blocked due to competition for system resources or communication with each other, these processes will be unable to move forward without external force. Each of these processes waits indefinitely for a resource that is owned by another process in the group and that it can never obtain. This phenomenon is called a deadlock.

  • There are at least two processes involved in the deadlock.
  • Each process participating in the deadlock is waiting for a resource.
  • At least two of the processes involved in the deadlock occupy resources.
  • Deadlocked processes are a subset of the current set of processes in the system.

2.4.2 Causes and necessary conditions for deadlock

Resource classification

The operating system is a resource management program that is responsible for allocating different types of resources to processes. The types of resources managed by modern operating systems are very rich, and they can be classified from different perspectives. For example, resources can be divided into deprivable resources and non-deprivable resources.

  • Deprivable resources: even though the occupying process still needs the resource, another process may forcibly take the resource away from it and use it.
  • Non-deprivable resources: the resource can only be released voluntarily by the occupying process when it no longer needs it; other processes may not forcibly take it away while it is in use.

Whether a resource is a deprivable resource depends entirely on the nature of the resource itself .

Causes of deadlock

Deadlock occurs due to resource competition. If there is only one process running in the system and all resources are exclusive to this process, there will be no deadlock. When there are multiple processes executing concurrently in the system, if the resources in the system are not enough to meet the needs of all processes at the same time, it will cause processes to compete for resources, which may lead to deadlock.

Although resource competition may lead to deadlock, resource competition does not mean deadlock. Deadlock will only occur if the order of requesting and releasing resources is inappropriate during the running of the process (that is, when the process is advanced in an inappropriate order).

Deadlock occurs due to insufficient system resources and improper process advancement sequence .

Insufficient system resources are the root cause of deadlock, since the very purpose of an operating system is to let concurrent processes share system resources. An improper order of process advancement is the other important cause: when the system's resources are only just enough for the processes, an unfortunate advancement order easily leaves the processes holding the resources that each other needs, leading to deadlock.

Necessary conditions for deadlock to occur

  • Mutual exclusion condition. A process requires exclusive control of the resources allocated to it; that is, a given resource is occupied by only one process at a time.
  • Non-deprivation condition. A resource obtained by a process cannot be forcibly taken away by other processes before it has been used up; it can only be released by the process that obtained it.
  • Request and hold condition. A process requests part of its resources at a time, and while waiting for new resources to be allocated it keeps holding the resources already allocated to it. This is also called the partial allocation condition.
  • Circular wait condition. There is a circular chain of processes waiting for resources, in which the resources already obtained by each process are simultaneously being requested by the next process in the chain.

To produce a deadlock, these four conditions are indispensable, so you can avoid deadlock by destroying one or several of these conditions.

2.4.3 Basic methods of dealing with deadlocks

  • Ostrich algorithm: Turn a blind eye to deadlocks like an ostrich, that is, ignore deadlocks.
  • Prevent deadlocks. Prevent the occurrence of deadlock by setting certain restrictions to destroy one or more of the four necessary conditions for deadlock.
  • Avoid deadlocks. In the process of dynamic allocation of resources, some method is used to prevent the system from entering an unsafe state, thereby avoiding the occurrence of deadlock.
  • Detect and remove deadlocks. The occurrence of deadlock is detected in time through the detection mechanism of the system, and then some measures are taken to relieve the deadlock.

Preventing deadlock means breaking one of the necessary conditions for deadlock through the resource management and scheduling policy so that deadlock can never arise. For example, with a preemptive scheduling method, a high-priority process can always obtain the resources it needs and run to completion, so the system will not deadlock.

Avoiding deadlock means predicting, during dynamic resource allocation, whether the system would enter an unsafe state; if an allocation could lead to deadlock, it is not performed. The banker's algorithm discussed later is a method for avoiding deadlock.

Detecting and removing deadlock is a comparatively passive approach: the deadlock is dealt with only after it has been detected, for example by depriving a deadlocked process of resources or otherwise forcing processes to release resources, or by terminating deadlocked processes to break the deadlock.

2.4.4 Prevention of deadlock

To prevent deadlock from occurring, you only need to destroy one of the four necessary conditions for deadlock.

Mutual exclusion condition

To break the mutual exclusion condition, multiple processes would have to be allowed to access the resource at the same time, but this is limited by the inherent nature of the resource: some resources simply cannot be shared and can only be used mutually exclusively. For example, a printer cannot let several processes take turns printing data during one job; it can only be used exclusively. From this point of view, deadlock cannot be prevented by destroying the mutual exclusion condition.

Non-deprivation condition

To break the non-deprivation condition, the following policy can be adopted: if a process that already holds some resources makes a new request that cannot be satisfied immediately, it must release all the resources it holds and request them again later when it needs them. In this way resources already obtained by a process can effectively be taken away during execution, which destroys the non-deprivation condition. This strategy is complex to implement: releasing acquired resources may invalidate work already done, and repeatedly requesting and releasing resources increases system overhead and lowers throughput. It is usually not used where deprivation is costly; for example, it is not used for printers, and a printing process is never deprived of the printer in this way to resolve a deadlock.

Request and hold conditions

To break the request-and-hold condition, the pre-static allocation method can be used. It requires a process to request all the resources it will need at once, before it runs, and the process is not put into operation until all of its resource requests are satisfied. Once running, it owns these resources throughout and makes no further requests, so the system cannot deadlock. The method is simple and safe but lowers resource utilization, because all resources a job (or process) will need must be known and acquired in advance, even resources used only late in the run or, in some executions, not used at all; as a result system resources cannot be fully utilized.

Taking a printer as an example, a job may only need to print its results at the very end, yet the printer must be assigned to it before the job starts. For almost the whole run the printer sits idle while other processes waiting for it are delayed, so those processes are "starved".

Circular wait condition

To break the circular wait condition, the ordered resource allocation method can be used. All resources in the system are numbered by type (for example, printer = 1, tape drive = 2), and every process must request resources strictly in increasing order of these numbers; resources of the same type must be requested together in a single request. In other words, once a process has requested resource Ri, any later request may only be for resources numbered after Ri (i being the resource number), never for resources numbered before it. With this restriction in place, a cycle of processes waiting for one another's resources can no longer form.

This method has drawbacks: once the resources have been numbered, the numbering is hard to change, which limits the addition of new types of equipment; different jobs use resources in different orders, and even if the numbering reflects the most common patterns, some jobs will not match it, so resources are wasted; and requesting resources strictly in numbered order also makes programming more complex.

2.4.5 Avoiding deadlock

Several strategies used in deadlock prevention methods generally impose strong restrictions. Although they are relatively simple to implement, they seriously damage system performance. In methods of avoiding deadlock, the constraints imposed are weaker and it is possible to obtain better system performance.In this method, the state of the system is divided into a safe state and an unsafe state. As long as the system can always be in a safe state, deadlock can be avoided.

Safe state and unsafe state

In the method of avoiding deadlock, processes are allowed to apply for resources dynamically, and the system first calculates the security of resource allocation before allocating resources. If this allocation will not cause the system to enter an unsafe state, the resource will be allocated to the process, otherwise the process must wait .

If at some moment the system can allocate the resources required by each process in some order, up to each process's maximum demand, so that every process can run to completion, then the system state at that moment is called a safe state and that order is called a safe sequence. If no such sequence exists at that moment, the state is called an unsafe state. Note that the safe sequence at a given moment need not be unique; several safe sequences may exist at the same time.

Not every unsafe state is a deadlock state, but once the system enters an unsafe state it may go on to enter a deadlock state; conversely, as long as the system stays in safe states, it avoids entering a deadlock state.

Two points to note:

  • An unsafe state does not mean that a deadlock has already occurred; it is a state in which a deadlock may occur.
  • A system in an unsafe state does not necessarily end up in deadlock. Deadlock states are a proper subset of unsafe states.

banker's algorithm

A representative deadlock-avoiding algorithm is the banker's algorithm given by Dijkstra. In order to implement the banker's algorithm, several data structures must be set up in the system.

Assume that there are n processes (P1, P2, ..., Pn) and m types of resources (R1, R2, ..., Rm) in the system. The data structures used in the banker's algorithm are as follows:

  • Available resource vector Available. An array of m elements, where Available[i] is the number of currently free resources of type i; its initial value is the number of resources of that type configured in the system, and it changes dynamically as resources of that type are allocated and recovered.
  • Maximum demand matrix Max. An n×m matrix defining the maximum demand of each process for each of the m resource types; Max[i][j] is the maximum demand of the i-th process for the j-th resource type.
  • Allocation matrix Allocation. An n×m matrix defining the number of resources of each type currently allocated to each process; Allocation[i][j] is the number of resources of type j currently held by the i-th process.
  • Need matrix Need. An n×m matrix defining how many resources of each type each process still needs (note: "still needs", not "needs in total", so this matrix also changes over time); Need[i][j] is the number of resources of type j that the i-th process still needs. The vector Need_i, the i-th row of Need, is the demand vector of process i.

Need[i][j] = Max[i][j] - Allocation[i][j]


Description of Banker's Algorithm :

Insert image description here

Insert image description here

The security algorithm is described as follows :

Insert image description here

Insert image description here
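A minimal sketch of the safety-check part, assuming a small hard-coded example with 3 processes and 2 resource types; the request-handling half of the banker's algorithm (tentatively granting a request and then running this check) is omitted for brevity.

```c
#include <stdbool.h>
#include <stdio.h>

#define NPROC 3                  /* n processes (small illustrative example) */
#define NRES  2                  /* m resource types                         */

/* Returns true if the state is safe, i.e. a safe sequence exists. */
bool is_safe(int available[NRES],
             int allocation[NPROC][NRES],
             int need[NPROC][NRES]) {
    int  work[NRES];
    bool finish[NPROC] = {false};

    for (int j = 0; j < NRES; j++) work[j] = available[j];

    for (int done = 0; done < NPROC; ) {
        bool progressed = false;
        for (int i = 0; i < NPROC; i++) {
            if (finish[i]) continue;
            bool ok = true;
            for (int j = 0; j < NRES; j++)          /* Need[i] <= Work ?        */
                if (need[i][j] > work[j]) { ok = false; break; }
            if (ok) {                               /* P_i can run to completion */
                for (int j = 0; j < NRES; j++)
                    work[j] += allocation[i][j];    /* it returns its resources  */
                finish[i] = true;
                done++;
                progressed = true;
            }
        }
        if (!progressed) return false;              /* no remaining process fits */
    }
    return true;
}

int main(void) {
    int available[NRES] = {3, 3};
    int allocation[NPROC][NRES] = {{0, 1}, {2, 0}, {3, 0}};
    int need[NPROC][NRES]       = {{2, 2}, {1, 2}, {5, 0}};
    printf("state is %s\n", is_safe(available, allocation, need) ? "safe" : "unsafe");
    return 0;
}
```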

2.4.6 Deadlock detection and release

Deadlock detection

  • Resource allocation graph

    A system resource allocation graph can be defined as a pair SRAG = (V, E), where V is a set of vertices and E is a set of directed edges. The vertex set has two parts: P = {P1, P2, ..., Pn}, the set of all processes in the system, each P representing a process; and R = {r1, r2, ..., rm}, the set of all resource classes in the system, each r representing one class of resource.

    Each edge in the directed edge set E is an ordered pair representing either a request or an allocation: <Pi, rj> means process Pi has requested a resource of class rj, and <rj, Pi> means a resource of class rj has been allocated to Pi.

    In an SRAG, circles represent processes and boxes represent resource classes. Each class may contain several resource units, each of which can be drawn as a dot inside its box. A request edge is a directed edge from a process to a resource class, meaning the process has requested a resource of that class but has not yet obtained it. An allocation edge is a directed edge from a resource class to a process, meaning a resource of that class has been allocated to the process. A request edge points only to the box representing the class r, meaning no specific unit is named in the request. When process Pi requests a resource of class rj, a request edge is added to the resource allocation graph; if the request can be satisfied, the request edge is immediately converted into an allocation edge, and when the process later releases the resource, the allocation edge is deleted.

    Insert image description here

    Insert image description here

  • Deadlock theorem

    The method of simplifying the resource allocation graph can be used to detect whether the system state S is a deadlock state.

    • In the resource allocation graph, find a process node Pi that is neither blocked nor isolated, that is, a process all of whose resource requests can be satisfied by the currently free resources. Such a process can obtain all the resources it needs, run to completion, and then release all the resources it holds; this is equivalent to erasing all of Pi's request edges and allocation edges, making it an isolated node.
    • After process Pi releases its resources, processes that were blocked waiting for those resources may be awakened; a previously blocked process may thus become unblocked, and its allocation and request edges are erased in the same way as in the first step.
    • After repeating the first two steps, if all edges in the graph can be erased so that every process becomes an isolated node, the graph is said to be completely reducible; otherwise it is not completely reducible.

    It can be shown that different orders of simplification lead to the same final graph. The system state S is a deadlock state if and only if the resource allocation graph of state S is not completely reducible; this result is called the deadlock theorem.

Deadlock detection algorithm

The basic idea of the deadlock detection algorithm is as follows: obtain the vector available(t) of the numbers of each type of available resource at some time t; then, among the processes {P1, P2, ..., Pn} in the system, find those whose requests for every resource type do not exceed the corresponding number of available resources. Such processes can obtain all the resources they need and run to completion, and on completion they release all the resources they hold, increasing the number of available resources. Add these processes to the sequence of processes that can run to completion, and repeat the examination on the remaining processes. Any processes that cannot be added to this sequence are deadlocked.

Insert image description here

Insert image description here

Deadlock release

Once a deadlock is detected in the system, the deadlocked process should be freed from the deadlock state, that is, the deadlock is lifted.

There are three commonly used methods to relieve deadlock:

  • Deprive resources. Preempt enough resources from other processes and give them to the deadlocked processes to break the deadlock.
  • Undo the process. Undo some processes until enough resources are allocated to other processes to relieve the deadlock.
  • The process rolls back. Let one or more processes roll back enough to avoid deadlock. When the process rolls back, it voluntarily releases resources rather than being deprived of them. The system is required to keep the historical information of the process and set a restore point.

2.4.7 Deadlock and starvation

Even if the system does not deadlock, some processes may wait for a very long time. When the waiting time significantly affects a process's progress and response, the process is said to be starving; when starvation reaches the point where completing the process's task no longer has any practical meaning, the process is said to have starved to death.

The difference between deadlock and starvation:

  • In terms of process state, deadlocked processes are all in the waiting (blocked) state, whereas busy-waiting processes are in the running or ready state, not the waiting state, yet they may still starve to death.
  • A deadlocked process is waiting for resources that will never be released, while a starving process is waiting for resources that will be released but are never allocated to it; this shows up as a waiting time with no upper bound (whether the process waits in a queue or busy-waits).
  • Deadlock must occur due to cyclic waiting, but starvation does not. This also shows that the resource allocation graph can detect whether a deadlock exists, but it cannot detect whether a process is starved to death.
  • A deadlock must involve multiple processes, while there may be only one starved or starved process.

Starvation and starving to death are related to the resource allocation policy, so they can be prevented by making the policy fair and ensuring that no process is overlooked, as the multi-level feedback queue scheduling algorithm does, for example.


Origin blog.csdn.net/pipihan21/article/details/129808475