OS 14: Buffer and Disk Scheduling Algorithms

Table of contents

1. Buffer management

(1) Single buffer and double buffer

1.1 - Single buffer

1.2 - Double buffer

(2) Ring buffer/multi-buffer

(3) Buffer Pool

3.1 - Buffer pool composition

3.2 - How the buffer pool works

2. Disk storage performance and scheduling

(1) Data organization and format of the disk

(2) Disk formatting and disk partitioning

(3) Disk scheduling algorithm

3.1 - First Come First Served (FCFS)

3.2 - Shortest Seek Time First (SSTF)

3.3 - Disk scan (SCAN) algorithm

3.4 - Cyclic Scanning (C-SCAN) Algorithm

3.5 - NStep-SCAN scheduling algorithm

3.6 - F-SCAN Scheduling Algorithm


1. Buffer management

        A buffer is a storage area, which can consist of dedicated hardware registers or of main memory.

        Buffers composed of registers are small and expensive, so they are generally used only where very high speed is required, such as the associative memory used in memory management or the data buffers used in device controllers. In most cases, main memory is used as the buffer.

(1) Single buffer and double buffer

        If no buffer is set between the producer and the consumer, the two constrain each other in time. For example, the producer may have finished producing data while the consumer is not yet ready to receive it, so the producer cannot deliver the data; it must pause and wait until the consumer is ready. If a buffer is set between the producer and the consumer, the producer can output data into the buffer without waiting for the consumer to be ready. //for example, a message queue
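
        To make the decoupling concrete, below is a minimal sketch in Python (not from the original text) that uses the standard library's thread-safe queue.Queue as the buffer between a producer and a slower consumer; the buffer size of 4 and the item count of 8 are arbitrary illustration values.

```python
# A minimal sketch of how a buffer decouples producer and consumer in time,
# using Python's thread-safe queue.Queue as the buffer.
import queue
import threading
import time

buf = queue.Queue(maxsize=4)          # the buffer between producer and consumer

def producer():
    for i in range(8):
        buf.put(i)                    # blocks only when the buffer is full
        print(f"produced {i}")

def consumer():
    for _ in range(8):
        time.sleep(0.01)              # consumer is slower; producer need not wait
        print(f"consumed {buf.get()}")

t1, t2 = threading.Thread(target=producer), threading.Thread(target=consumer)
t1.start(); t2.start(); t1.join(); t2.join()
```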

1.1 - Single buffer

        When a user process issues an I/O request, the operating system allocates a buffer for it in main memory.

        Since the buffer is a shared resource, the producer and the consumer must be mutually exclusive when using it. If the consumer has not yet taken the data out of the buffer, then even if the producer has produced new data, it cannot be sent to the buffer, and the producer must wait.

1.2 - Double buffer

        To speed up input and output and improve device utilization, the double buffer mechanism, also known as buffer swapping (Buffer Swapping), is introduced. During input, data is first sent to the first buffer; when that buffer is full, input switches to the second buffer while the first is being emptied. // On the surface, it is nothing more than extra buffer capacity
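
        As a rough illustration of buffer swapping, the sketch below simulates the two roles sequentially in Python: one buffer fills while the other drains, and the roles are exchanged when the filling buffer is full. The buffer size of 4 and the sample data are assumptions for illustration; a real implementation would run the two sides concurrently.

```python
# A minimal sketch of buffer swapping: the "device" fills one buffer while the
# "consumer" drains the other, then the roles are exchanged.
incoming = list(range(10))            # data arriving from the device
buffers = [[], []]                    # the two buffers
fill, drain = 0, 1                    # indices of the filling/draining buffer

while incoming or buffers[drain]:
    while incoming and len(buffers[fill]) < 4:   # fill until full (size 4)
        buffers[fill].append(incoming.pop(0))
    for item in buffers[drain]:                  # drain the other buffer
        print("consumed", item)
    buffers[drain].clear()
    fill, drain = drain, fill                    # swap roles
```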

        Application of double buffering in communication between two machines

        If the communication between two machines uses only a single buffer, then at any moment data can be transmitted in only one direction. For example, data may be sent from A to B or from B to A, but never in both directions at the same time. //half-duplex communication

        To realize two-way data transmission, two buffers must be set up in each machine: one used as a sending buffer and the other as a receiving buffer.

(2) Ring buffer/multi-buffer

        A ring buffer is composed of multiple buffers of the same size. The buffers can be divided into three types: empty buffers R, buffers G filled with data, and the buffer C currently in use. //Multiple buffers

        Accordingly, the ring buffer also maintains multiple pointers: the pointer Nextg indicating the next buffer G with data, the pointer Nexti indicating the next available empty buffer R, and the pointer Current indicating the buffer C in use.

// Circular use of space, such as circular queues, etc.
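
        A minimal Python sketch of such a ring buffer follows; the pointer names nexti and nextg mirror Nexti and Nextg above, while the buffer count of 4 and the helper names put/get are illustrative assumptions.

```python
# A minimal ring buffer with the pointers described above: nexti points to the
# next empty buffer R to fill, nextg to the next full buffer G to take, and a
# count distinguishes the all-full state from the all-empty state.
N = 4
ring = [None] * N                     # N equal-sized buffers
nexti = nextg = count = 0             # producer pointer, consumer pointer, fill count

def put(item):                        # fill the next empty buffer R
    global nexti, count
    assert count < N, "no empty buffer"
    ring[nexti] = item
    nexti = (nexti + 1) % N
    count += 1

def get():                            # take from the next full buffer G
    global nextg, count
    assert count > 0, "no full buffer"
    item, ring[nextg] = ring[nextg], None
    nextg = (nextg + 1) % N
    count -= 1
    return item

for x in "abc":
    put(x)
print(get(), get())                   # -> a b
```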

(3) Buffer Pool

        When the system is large, it has many ring buffers, which not only consume a large amount of memory but also have low utilization. To improve buffer utilization, a common buffer pool usable for both input and output can be adopted, with multiple buffers in the pool shared among processes. //Unified management of buffers

        The difference between a buffer pool and a buffer is that a buffer is merely a linked list of memory blocks, whereas a buffer pool also contains management data structures and a set of operation functions for managing multiple buffers. //Like the idea of a monitor, managing a set of semaphores in a unified way

3.1 - Buffer pool composition

        The buffer pool manages multiple buffers, and each buffer consists of a buffer header and a buffer body .

  • Buffer header: used to identify and manage the buffer; it generally includes the buffer number, device number, data block number on the device, a synchronization semaphore, and queue link pointers.
  • Buffer body: used to store data.

        For convenience of management, buffers of the same type in the buffer pool are generally linked into a queue, which yields the following three queues:

  • Empty buffer queue emq. The queue of empty buffers. Its head pointer F(emq) and tail pointer L(emq) point to the head and tail buffers of the queue, respectively.
  • Input queue inq. The queue of buffers filled with input data. Its head pointer F(inq) and tail pointer L(inq) point to the head and tail buffers of the queue, respectively.
  • Output queue outq. The queue of buffers filled with output data. Its head pointer F(outq) and tail pointer L(outq) point to the head and tail buffers of the queue, respectively.

        In addition to the above three queues, there are four kinds of working buffers: a working buffer for collecting input data (hin), a working buffer for extracting input data (sin), a working buffer for collecting output data (hout), and a working buffer for extracting output data (sout).

3.2 - How the buffer pool works

  1. Collect input. Take an empty buffer from the head of the empty buffer queue emq and use it as the collecting-input working buffer hin. When hin is full of input data, hang it at the end of the input queue inq. // empty queue -> input queue
  2. Extract input. Take a buffer from the head of the input queue inq as the extracting-input working buffer sin. After the data has been fetched from sin, hang it at the end of the empty buffer queue emq. //input queue -> empty queue
  3. Collect output. Take an empty buffer from the head of the empty buffer queue emq as the collecting-output working buffer hout. When hout is full of output data, hang it at the end of the output queue outq. // empty queue -> output queue
  4. Extract output. Take a buffer full of output data from the head of the output queue outq as the extracting-output working buffer sout. After the data has been extracted from sout, hang it at the end of the empty buffer queue emq. // output queue -> empty queue

// The work of the buffer pool is the operation and conversion process of the three queues.
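
        The following Python sketch models the four operations above as movements between the three queues, with each buffer represented as a plain dict; the pool size and sample data are illustrative assumptions.

```python
# A minimal sketch of the four buffer-pool operations as movements between
# the three queues (emq, inq, outq).
from collections import deque

emq = deque({"no": i, "data": None} for i in range(4))   # empty buffer queue
inq, outq = deque(), deque()                             # input/output queues

def collect_input(data):              # emq head -> fill as hin -> tail of inq
    hin = emq.popleft()
    hin["data"] = data
    inq.append(hin)

def extract_input():                  # inq head -> drain as sin -> tail of emq
    sin = inq.popleft()
    data, sin["data"] = sin["data"], None
    emq.append(sin)
    return data

def collect_output(data):             # emq head -> fill as hout -> tail of outq
    hout = emq.popleft()
    hout["data"] = data
    outq.append(hout)

def extract_output():                 # outq head -> drain as sout -> tail of emq
    sout = outq.popleft()
    data, sout["data"] = sout["data"], None
    emq.append(sout)
    return data

collect_input("keystroke")
print(extract_input())                # -> keystroke
collect_output("page")
print(extract_output())               # -> page
```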

2. Disk storage performance and scheduling

        The speed of disk I/O and the reliability of the disk system will directly affect the performance of the system.

        There are several ways to improve disk system performance. First, selecting a good disk scheduling algorithm reduces the disk's seek time; second, raising the speed of disk I/O improves the speed of file access; third, adopting redundancy technology improves the reliability of the disk system and helps build a highly reliable file system.

(1) Data organization and format of the disk

        A disk device can include one or more physical platters; each platter has one or two storage surfaces (Surface), each surface holds several tracks (Track), and necessary gaps (Gap) are left between tracks. //Physical disk -> storage surface -> track -> sector (data block)

        For simplicity of processing, the same number of bits is stored on every track. The recording density, i.e. the number of bits stored per inch, is therefore obviously higher on inner tracks than on outer tracks.

        Each track is logically divided into several sectors (Sectors): a floppy disk track has roughly 8 to 32 sectors, while a hard disk track can have several hundred. A sector is called a disk block (or data block), and a certain gap (Gap) is left between adjacent sectors.

        A physical record is stored in a sector, so the number of physical record blocks a disk can hold is determined by its numbers of sectors, tracks, and disk surfaces. For example, a disk with a nominal capacity of 10 GB might have 8 double-sided platters, a total of 16 storage surfaces (disk surfaces), with 16383 tracks (also called cylinders) per surface and 63 sectors per track. // 10 GB / 16 ≈ 640 MB per surface
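
        Multiplying out the geometry quoted above gives roughly 8.46 GB rather than the nominal 10 GB, a gap that is common because nominal capacities and CHS geometry rarely match exactly:

```python
# A worked check of the capacity figures quoted above (16 surfaces,
# 16383 tracks per surface, 63 sectors per track, 512 bytes per sector).
surfaces, tracks, sectors, bytes_per_sector = 16, 16383, 63, 512
capacity = surfaces * tracks * sectors * bytes_per_sector
print(capacity / 10**9)               # ~8.46 GB (decimal), the classic CHS limit
print(capacity / surfaces / 2**20)    # ~504 MiB per surface
```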

(2) Disk formatting and disk partitioning

        In order to store data on a disk, the disk must first be low-level formatted . // Disk capacity division

        Consider formatting one track. Each track (Track) contains 30 fixed-size sectors (Sectors); each sector holds 600 bytes, of which 512 bytes store data and the rest store control information.

        Each sector consists of two fields:

  1. Identifier field (ID Field): a SYNCH byte with a specific bit pattern serves as the field delimiter; the track number (Track #), head number (Head #), and sector number (Sector #) together identify the sector; a CRC field is used for error checking.
  2. Data Field: stores the 512 bytes of data.

        It is worth emphasizing that, to simplify and ease identification by the magnetic head, gaps (Gap) of one to several bytes are placed between different tracks (Track), between the different sectors (Sector) of each track, and between the different fields (Field) of each sector.
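
        A quick calculation on the track example above shows how much of the formatted track remains usable for data:

```python
# A quick check of the formatting overhead in the track example above:
# 30 sectors of 600 bytes each, of which 512 bytes carry user data.
sectors, sector_bytes, data_bytes = 30, 600, 512
raw = sectors * sector_bytes          # 18000 bytes per track, formatted
usable = sectors * data_bytes         # 15360 bytes available for data
print(usable, usable / raw)           # -> 15360  ~0.853 (about 85% efficiency)
```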

        Disk Partition: after the disk is formatted, it generally still needs to be partitioned. Logically, each partition is an independent logical disk; the starting sector and size of each partition are recorded in a partition table. // Disk partitioning divides the physical disk into multiple logical disks

(3) Disk scheduling algorithm

        The purpose of a disk scheduling algorithm is to reduce file access time, i.e. to minimize the average time each process takes to access the disk.

        Since the time to access the disk is mainly the seek time, the goal of disk scheduling is to minimize the average seek time of the disk .

3.1 - First Come First Served (FCFS)

        The simplest disk scheduling algorithm. It schedules according to the order in which processes request access to the disk .

        The advantage of this algorithm is that it is fair and simple: each process's requests are handled in turn, and no process's request goes unsatisfied for a long time.

        However, this algorithm does not optimize the seek, so the average seek time may be longer .
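
        A minimal FCFS sketch in Python follows; the request list and the starting track of 100 are illustrative assumptions (the same example is reused for the algorithms below so the totals can be compared):

```python
# A minimal FCFS sketch: service requests strictly in arrival order and
# total up the head movement.
def fcfs(requests, head):
    movement = 0
    for track in requests:            # no reordering at all
        movement += abs(track - head)
        head = track
    return movement

print(fcfs([55, 58, 39, 18, 90, 160, 150, 38, 184], head=100))  # -> 498
```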

3.2 - Shortest Seek Time First (SSTF)

        This algorithm selects the request whose track is closest to the track where the head is currently located, so that each individual seek time is the shortest; however, it cannot guarantee the shortest average seek time. // There may be starvation
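
        The sketch below applies SSTF to the same illustrative request list; note the total head movement drops from 498 (FCFS) to 248:

```python
# A minimal SSTF sketch: always pick the pending request closest to the
# current head position.
def sstf(requests, head):
    pending, movement = list(requests), 0
    while pending:
        track = min(pending, key=lambda t: abs(t - head))  # nearest track wins
        movement += abs(track - head)
        head = track
        pending.remove(track)
    return movement

print(sstf([55, 58, 39, 18, 90, 160, 150, 38, 184], head=100))  # -> 248
```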

3.3 - Disk scan (SCAN) algorithm

        In essence, the SSTF algorithm is a priority-based scheduling algorithm, so requests with low priority may suffer "starvation" (Starvation).

        The scanning (SCAN) algorithm not only considers the distance between the track to be accessed and the current track, but also the current moving direction of the magnetic head. // distance + moving direction

        The head moves from the inside outward, servicing requests along the way, and after reaching the outermost track it turns around and moves from the outside inward, thus sweeping across the whole disk. Because the head's movement pattern resembles the operation of an elevator, this is often called the elevator scheduling algorithm. //The problem is that every sweep runs to the outermost end, even when that is unnecessary
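
        A sketch of SCAN on the same illustrative example follows; like many textbook presentations, it reverses at the furthest pending request rather than at the physical edge of the disk (strictly speaking, the LOOK variant):

```python
# A minimal SCAN (elevator) sketch: service all requests in the current
# direction of head movement, then reverse. The head starts at track 100
# moving toward larger track numbers (both illustrative assumptions).
def scan(requests, head, direction=+1):
    up = sorted(t for t in requests if t >= head)
    down = sorted((t for t in requests if t < head), reverse=True)
    order = up + down if direction > 0 else down + up
    movement, pos = 0, head
    for track in order:
        movement += abs(track - pos)
        pos = track
    return movement, order

print(scan([55, 58, 39, 18, 90, 160, 150, 38, 184], head=100))  # movement 250
```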

3.4 - Cyclic Scanning (C-SCAN) Algorithm

        The SCAN algorithm achieves good seek performance and prevents "starvation". However, if a request arrives for a track the head has just swept past, the process must wait until the head finishes moving outward and then sweeps back inward across all the other tracks to be accessed before its own request is handled, so the request can be greatly delayed. // the process's request is deferred

        To reduce this delay, the C-SCAN algorithm stipulates that the head services requests while moving in one direction only, for example only from the inside outward. When the head has moved to and serviced the outermost requested track, it immediately returns to the innermost track to be accessed; the smallest track number thus follows the largest, forming a cycle, and the scan repeats circularly. //The head returns to the innermost requested track rather than to a fixed starting point
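
        A C-SCAN sketch on the same example; the return jump from the outermost to the innermost requested track is counted in the head movement here, and no requests are serviced during it:

```python
# A minimal C-SCAN sketch: service requests in one direction only; after the
# outermost request, jump back and continue from the innermost one.
def cscan(requests, head):
    up = sorted(t for t in requests if t >= head)     # serviced on this pass
    wrapped = sorted(t for t in requests if t < head) # serviced after the jump
    movement, pos = 0, head
    for track in up + wrapped:
        movement += abs(track - pos)
        pos = track
    return movement

print(cscan([55, 58, 39, 18, 90, 160, 150, 38, 184], head=100))  # -> 322
```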

3.5 - NStep-SCAN scheduling algorithm

        Under the SSTF, SCAN, and C-SCAN scheduling algorithms, the disk arm may linger in one place. For example, one or several processes may access a certain track at a high rate, repeatedly issuing I/O requests to it and thereby monopolizing the entire disk device. This phenomenon is called "arm stickiness" (Armstickiness) and is prone to occur on high-density disks.

        The N-step SCAN algorithm divides the disk request queue into several sub-queues of length N. Disk scheduling processes these sub-queues one after another in FCFS order, and each sub-queue itself is processed with the SCAN algorithm; only after one queue has been fully processed does the next queue get processed. If a new disk I/O request arrives while a sub-queue is being processed, the new request is placed in a later queue, which avoids arm stickiness. //Process the requests in one queue first, then those in the next; each queue has a fixed capacity

        When the value of N is very large, the performance of the N-step scanning method will be close to that of the SCAN algorithm; when N = 1, the N-step SCAN algorithm will degenerate into the FCFS algorithm.
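
        The sketch below applies N-step SCAN to the same example with N = 4 (an arbitrary choice); for simplicity each sub-queue is swept upward first, and newly arriving requests are not modeled beyond the batch boundaries:

```python
# A minimal N-step SCAN sketch: the request stream is split into sub-queues
# of length N, the sub-queues are served FCFS, and each sub-queue is served
# internally with a SCAN pass (here: up first, then down).
def nstep_scan(requests, head, n=4):
    movement = 0
    for i in range(0, len(requests), n):   # FCFS across the sub-queues
        batch = requests[i:i + n]
        up = sorted(t for t in batch if t >= head)
        down = sorted((t for t in batch if t < head), reverse=True)
        for track in up + down:            # SCAN within the sub-queue
            movement += abs(track - head)
            head = track
    return movement

print(nstep_scan([55, 58, 39, 18, 90, 160, 150, 38, 184], head=100, n=4))
```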

3.6 - F-SCAN Scheduling Algorithm

        The F-SCAN algorithm is essentially a simplification of N-step SCAN: F-SCAN divides the disk request queue into only two sub-queues. One is the queue formed by all processes currently requesting disk I/O, which disk scheduling processes with the SCAN algorithm. The other holds all processes whose disk I/O requests arrive during the scan; these new requests are all deferred until the next scan. //current request queue + queue of requests arriving during the scan
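
        Finally, an F-SCAN sketch; the split of the example requests into a "current" batch and a batch assumed to arrive during the scan is an illustrative assumption:

```python
# A minimal F-SCAN sketch: two sub-queues. The current queue is served with
# one SCAN pass while every request arriving meanwhile is deferred to the
# waiting queue, which becomes the current queue on the next pass.
def scan_pass(batch, head):
    up = sorted(t for t in batch if t >= head)
    down = sorted((t for t in batch if t < head), reverse=True)
    movement = 0
    for track in up + down:
        movement += abs(track - head)
        head = track
    return movement, head

current = [55, 58, 39, 18]        # requests pending when the pass starts
waiting = [90, 160, 150, 38]      # requests that arrive during the pass
head, total = 100, 0
while current:
    moved, head = scan_pass(current, head)
    total += moved
    current, waiting = waiting, []    # deferred requests form the next pass
print(total)
```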
