Chapter 8 Disk Storage Management (Notes on Tang Xiaodan's Computer Operating System)

Chapter 8 Disk Storage Management

Main tasks:

Use storage space effectively, improve disk I/O speed, and improve the reliability of the disk system.

8.1 Organization of external storage

This section covers how disk space is allocated to a file. The smallest unit of allocation is the disk block (sector), so the question is how a file's disk blocks are organized:

  • Contiguous organization
  • Link organization
  • Index organization

8.1.1 Contiguous organization

Allocate a set of contiguous disk blocks to each file.

In the directory: store each file's name, starting block number, and length.

Advantages: sequential access is easy and access is fast.

Disadvantages:

  • Requires a contiguous run of free space
  • The file length must be known in advance
  • Not suitable for files that grow dynamically
  • Records cannot be inserted or deleted flexibly

8.1.2 Link organization

The disk blocks allocated to each file may be discrete (non-contiguous).

  • Link methods

    ① Implicit link

    In the directory: store each file's name, starting disk block number, and ending disk block number.

    Implicit: each disk block holds a pointer to the file's next block.

    ② Explicit link

    In the directory: store each file's name and starting disk block number.

    Explicit: a file allocation table (FAT) has one entry per physical block, and each entry records the number of the next block linked to that block.
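The explicit link method can be sketched in a few lines. This is a minimal illustration, not real FAT on-disk code: the in-memory list `fat`, the `EOF` and `FREE` markers, and the function name `file_blocks` are all assumptions made for the example.

```python
# Hypothetical in-memory FAT: index = physical block number,
# value = next block of the same file. EOF and FREE are made-up markers.
EOF = -1    # last block of a file
FREE = -2   # block not allocated to any file


def file_blocks(fat, start_block):
    """Follow the chain from the directory's start block and list every block."""
    blocks = []
    seen = set()
    block = start_block
    while block != EOF:
        if block == FREE or block in seen:
            raise ValueError("corrupt FAT chain")
        seen.add(block)
        blocks.append(block)
        block = fat[block]
    return blocks


# A file whose directory entry says "start block 2", stored in blocks 2 -> 5 -> 3.
fat = [FREE, FREE, 5, EOF, FREE, 3]
print(file_blocks(fat, 2))
```

Because the whole chain lives in the table, the blocks of a file can be found without reading the data blocks themselves, which is the main gain over implicit linking.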

8.1.3 FAT Technology

FAT file systems organize files using the explicit link method.

  • FAT versions

    Clusters are the basic unit of allocation and reclamation; a cluster is a group of contiguous disk blocks.

    Maximum supported capacity = cluster size × 2^(FAT entry width in bits) × number of partitions
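As a quick check of the capacity formula, the sketch below evaluates it for two configurations. The specific numbers (512-byte clusters with 4 partitions for FAT12, 32 KB clusters for FAT16) are illustrative assumptions, and `max_capacity_bytes` is a hypothetical helper name.

```python
def max_capacity_bytes(cluster_size, fat_bits, partitions=1):
    """Capacity = cluster size * 2^(FAT entry width) * number of partitions."""
    return cluster_size * (2 ** fat_bits) * partitions


MB = 1024 * 1024
GB = 1024 * MB

# FAT12, one 512-byte block per cluster, 4 partitions (assumed values)
print(max_capacity_bytes(512, 12, 4) // MB, "MB")

# FAT16, 32 KB clusters, a single partition (assumed values)
print(max_capacity_bytes(32 * 1024, 16) // GB, "GB")
```

The second call shows why larger disks push the cluster size up: with the entry width fixed, the cluster size is the only factor left to grow.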

Benefits of clustering

Clusters let FAT adapt to ever-increasing disk capacity and reduce the number of FAT entries, so the FAT occupies less storage space and is cheaper to access.

From FAT12 to FAT16 to FAT32

A FAT12 table is limited to 4096 entries. As disk capacity grows, the cluster size must grow with it, which increases fragmentation within clusters.

A FAT16 table has only 65536 entries, so as disk capacity grows the cluster size must again increase. To reduce fragmentation within clusters, the FAT needs more entries, which in turn requires wider entries. This is why FAT16 evolved into FAT32.

FAT32 is the final product of the FAT family of file systems. As described in the textbook, each FAT32 cluster is fixed at 4 KB and contains 8 disk blocks, and FAT32 can manage up to 2 TB of disk space.

Drawbacks of FAT32: it is not backward compatible, it runs slower than FAT16, and it has a minimum volume size, because a FAT32 volume must contain at least 65537 clusters.

8.1.4 NTFS file organization

NTFS (New Technology File System) is a file system developed specifically for Windows NT; it is also used by Windows 2000/XP and later versions of Windows.

  • Disk organization

    NTFS uses clusters as the basic unit of disk space allocation and reclamation.

    A file occupies one or more clusters, and a cluster belongs to only one file.

    As a result, allocating disk space to a file does not require knowing the disk block size; it is enough to choose a cluster size appropriate to the disk capacity. This makes NTFS independent of the disk's physical block size.

  • File organization

    NTFS takes the volume as its unit. All file information, directory information, and information about available unallocated space in a volume is recorded, in the form of file records, in the Master File Table (MFT).

    The MFT is the center of the NTFS volume structure. Logically, every file in the volume is a record that occupies one row of the MFT, including a row for the MFT itself. Each row has a fixed size of 1 KB and is called the metadata of the corresponding file, also known as the file control word.

Features: 64-bit disk addresses, plus functions such as data consistency checking.

8.1.5 Index Organization

The link organization method solves the problems of contiguous organization (the need for contiguous free space and for a file length known in advance), but it introduces new problems:

  • Efficient direct access is not supported.
  • The FAT occupies a large amount of memory. Because a file's disk block numbers are scattered throughout the FAT, finding all of them requires loading the entire FAT into memory.

In fact, only the block numbers of an opened file need to be brought into memory; there is no need to load the entire FAT. This is the idea behind index organization:

    Allocate an index table to each file.

    The index table is stored in a single disk block, called the index block.

    In the directory: store each file's name and the number of its index block.

Advantages: supports direct access and greatly speeds up lookups for large files.

Disadvantages: every file needs its own index block, so when there are many small files, many index blocks are allocated and each one is poorly utilized.

Single-level index organization


Figure 1 Single-level index organization

Multi-level index organization

When allocating disk space for a large file, once one index block has been filled with disk block numbers, the OS must allocate another index block to record the block numbers allocated afterward, and so on, with the index blocks linked in order by chain pointers. When a file is so large that it has many index blocks, following this chain is inefficient, so a further level of index is built over the index blocks themselves. This gives a two-level index, and the same idea extends to more levels.


Figure 2 Secondary Index Allocation
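A two-level lookup can be sketched as follows. The `disk` dictionary stands in for index blocks read from disk, and the block numbers, the 1 KB block size, and the 4-byte entry size are assumptions chosen for illustration.

```python
def lookup_two_level(disk, top_index_block, logical_block, entries_per_block):
    """Map a file's logical block number to a physical block number."""
    outer, inner = divmod(logical_block, entries_per_block)
    second_level_block = disk[top_index_block][outer]   # first-level index entry
    return disk[second_level_block][inner]              # second-level index entry


def max_file_size(block_size, entry_size, levels):
    """Largest file addressable with `levels` levels of index blocks."""
    entries = block_size // entry_size
    return (entries ** levels) * block_size


# Toy disk with 2 entries per index block: block 10 is the top-level index,
# blocks 11 and 12 are second-level index blocks.
disk = {10: [11, 12], 11: [100, 101], 12: [102, 103]}
print(lookup_two_level(disk, 10, 3, entries_per_block=2))

# 1 KB blocks and 4-byte block numbers (assumed): 256 entries per index block.
print(max_file_size(1024, 4, levels=2) // (1024 * 1024), "MB")
```

Each extra level multiplies the addressable size by the number of entries per index block, at the cost of one more disk access per lookup.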

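Incremental indexing method

A hybrid scheme that combines direct addressing with single-level and multi-level indexes, so that small, medium, and large files are all handled efficiently.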
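One way to picture the hybrid scheme is a function that decides which part of the index a logical block falls under. The layout below (10 direct addresses, 256 entries per index block, one single-indirect and one double-indirect pointer) is an assumption for illustration only; the notes do not specify these numbers, and `locate` is a hypothetical name.

```python
def locate(logical_block, direct=10, per_index=256):
    """Return which addressing path serves a logical block, plus the indices used."""
    if logical_block < direct:
        return ("direct", logical_block)
    logical_block -= direct
    if logical_block < per_index:
        return ("single", logical_block)
    logical_block -= per_index
    if logical_block < per_index ** 2:
        outer, inner = divmod(logical_block, per_index)
        return ("double", outer, inner)
    raise ValueError("block number beyond the range of this layout")


print(locate(5))      # small file: served by a direct address
print(locate(10))     # first block that needs the single-indirect index
print(locate(523))    # falls in the double-indirect range
```

Small files never touch an index block, while large files pay for extra index lookups only on their later blocks.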

8.2 Management of file storage space

Managing file storage space mainly means tracking which disk blocks are in use and which are free, and providing allocation and reclamation operations.

8.2.1 Free table method and free linked list method

Free table method (contiguous allocation)

Each table entry records the first disk block number and the number of blocks in one run of contiguous free disk blocks.

  • Allocation and reclamation: each request is satisfied with one contiguous free area, chosen by a policy such as first fit

Advantages: fast allocation and fewer disk I/O operations

Disadvantage: the allocated space must be contiguous
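The sketch below shows contiguous allocation from such a table, assuming a first-fit policy and a table kept as a list of `(first_block, count)` pairs. The function names and the merge-on-release behavior are choices made for this example.

```python
def allocate(free_table, count):
    """First fit: carve `count` blocks out of the first run that is large enough."""
    for i, (start, length) in enumerate(free_table):
        if length >= count:
            if length == count:
                del free_table[i]
            else:
                free_table[i] = (start + count, length - count)
            return start
    return None  # no single run is large enough


def release(free_table, start, count):
    """Return a run to the table and merge it with adjacent free runs."""
    free_table.append((start, count))
    free_table.sort()
    merged = [free_table[0]]
    for s, n in free_table[1:]:
        last_start, last_len = merged[-1]
        if last_start + last_len == s:
            merged[-1] = (last_start, last_len + n)
        else:
            merged.append((s, n))
    free_table[:] = merged


table = [(2, 4), (9, 3), (15, 5)]
first = allocate(table, 3)   # taken from the run starting at block 2
release(table, first, 3)     # merges back into a single run of 4 blocks
```

Note that a request can fail even when enough blocks are free in total, which is exactly the contiguity drawback listed above.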

Free linked list method (discrete allocation)

All free disk blocks are linked into a free disk block chain.

  • Allocation and reclamation: disk blocks are allocated discretely, taken one at a time from the chain

Advantages: the allocation and reclamation process is very simple

Disadvantages: allocating space for one file may require many repeated operations, so allocation and reclamation are inefficient

8.2.2 Bitmap method

Each disk block is represented by one bit, set to 0 or 1 to indicate whether the block is free or allocated; the bits for all disk blocks together form the bitmap.

  • Allocation and reclamation: to allocate, scan the bitmap for a bit marked free, convert its position into a disk block number, and mark the bit as allocated. To reclaim, convert the block number back into a bit position and mark it as free.
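A minimal sketch of bitmap allocation follows. It assumes the convention 0 = free and 1 = allocated, a bitmap laid out as rows of `cols` bits, and zero-based block numbers; the class name `BlockBitmap` is hypothetical.

```python
class BlockBitmap:
    def __init__(self, rows, cols):
        self.cols = cols
        # 0 = free, 1 = allocated (assumed convention)
        self.bits = [[0] * cols for _ in range(rows)]

    def allocate(self):
        """Find the first free bit, mark it allocated, return its block number."""
        for i, row in enumerate(self.bits):
            for j, bit in enumerate(row):
                if bit == 0:
                    row[j] = 1
                    return i * self.cols + j   # (row, column) -> block number
        return None  # disk full

    def free(self, block):
        """Convert a block number back to (row, column) and clear the bit."""
        i, j = divmod(block, self.cols)
        self.bits[i][j] = 0


bitmap = BlockBitmap(rows=2, cols=4)
a = bitmap.allocate()
b = bitmap.allocate()
bitmap.free(a)
c = bitmap.allocate()   # reuses the block that was just freed
```

The two conversions, position to block number on allocation and block number to position on reclamation, are the only arithmetic the method needs.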

8.2.3 Group chaining method

Used by UNIX; it combines the free table method and the free linked list method.

8.3 Ways to increase I/O speed

Disk caching, read-ahead, delayed write, optimized physical block placement, virtual disks, and redundant arrays of inexpensive disks (RAID)

8.3.1 Disk cache

Set aside a buffer area in memory for disk blocks; the buffer holds copies of some disk blocks.

  1. How is cached data delivered to the requesting process?

    ① Data delivery: copy the data into the requesting process's memory workspace

    ② Pointer delivery: give the requesting process a pointer to the cached data

  2. Which replacement strategy should be used?

    Factors to consider when choosing the replacement algorithm: access frequency, predictability, and data consistency

  3. How is modified data written back from the cache to disk?

    Modified blocks are written back to disk periodically
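The three questions above can be tied together in a small sketch. It assumes an LRU replacement policy (the notes do not name one), uses a plain dictionary `disk` in place of a real device, and leaves the periodic timer out: `flush` is what such a timer would call.

```python
from collections import OrderedDict


class DiskCache:
    def __init__(self, disk, capacity):
        self.disk = disk                  # dict: block number -> data (stand-in for the disk)
        self.capacity = capacity
        self.blocks = OrderedDict()       # block number -> (data, dirty), least recent first

    def read(self, block):
        if block not in self.blocks:
            self._install(block, self.disk[block], dirty=False)
        self.blocks.move_to_end(block)    # mark as most recently used
        return self.blocks[block][0]

    def write(self, block, data):
        self._install(block, data, dirty=True)   # delayed write: disk not touched yet

    def _install(self, block, data, dirty):
        if block not in self.blocks and len(self.blocks) >= self.capacity:
            victim, (vdata, vdirty) = self.blocks.popitem(last=False)
            if vdirty:                    # write back a modified block on eviction
                self.disk[victim] = vdata
        self.blocks[block] = (data, dirty)
        self.blocks.move_to_end(block)

    def flush(self):
        """Write every modified block back to disk (run periodically)."""
        for block, (data, dirty) in list(self.blocks.items()):
            if dirty:
                self.disk[block] = data
                self.blocks[block] = (data, False)
```

Here `read` performs data delivery by returning the cached value; a pointer-delivery design would instead hand out a reference to the cache slot.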

8.3.2 Other ways to increase disk I/O speed

  • Read-ahead: suited to sequentially accessed files; the data of the next disk block is read into memory in advance
  • Delayed write: data is not written to disk immediately but is held in the buffer for a while, reducing disk accesses and head movement
  • Optimized physical block placement: a file's disk blocks should be allocated close together to reduce head movement
  • Virtual disk: a region of memory is used to emulate a disk

8.3.3 Redundant Array of Inexpensive Disks (RAID)

  • RAID0 (parallel interleaved access)

    The data of each disk block is divided into several sub-blocks, and each sub-block is stored at the same position on a different disk, so the sub-blocks can be transferred in parallel.

  • RAID1 (disk mirroring)

  • RAID2 (parallel transfer)

  • RAID5 (independent transfer)

  • Advantages: parallel transfer improves I/O speed; redundancy gives high reliability (except in RAID0); and building the array from small, inexpensive disks keeps the cost low
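The parallel interleaving idea behind RAID0 can be sketched as splitting a block into equal sub-blocks, one per disk. This is only an illustration of the data layout: the disks are plain Python lists, nothing actually runs in parallel, and the function names are hypothetical.

```python
def stripe(block, disk_count):
    """Split one block into `disk_count` equal sub-blocks, one per disk."""
    if len(block) % disk_count != 0:
        raise ValueError("block length must be a multiple of the disk count")
    size = len(block) // disk_count
    return [block[i * size:(i + 1) * size] for i in range(disk_count)]


def write_block(disks, position, block):
    """Store each sub-block at the same position on a different disk."""
    for disk, sub_block in zip(disks, stripe(block, len(disks))):
        disk[position] = sub_block


def read_block(disks, position):
    """Reassemble the block from the sub-blocks at `position` on every disk."""
    return b"".join(disk[position] for disk in disks)


disks = [[None] * 4 for _ in range(4)]   # 4 disks with 4 positions each
write_block(disks, 1, b"ABCDEFGH")
print(read_block(disks, 1))
```

With real hardware each disk transfers only its own sub-block, so the transfers can proceed at the same time; with no redundancy, losing any one disk loses the block.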

8.4 Techniques to Improve Disk Reliability

8.4.1 First-level fault tolerance (SFT-I)

Low-level disk fault tolerance

Purpose:

Prevent data loss caused by disk surface defects

Measures:

Duplicate directory and duplicate file allocation table (backup copies)

Hot-fix redirection and read-after-write verification

8.4.2 Second-level fault tolerance (SFT-II)

Mid-level disk fault tolerance

Purpose:

Prevent system failures caused by faults in the disk drive or the disk controller

Measures:

Disk mirroring: attach a second, identical disk drive to the same disk controller as a backup

Disk duplexing: attach the two disk drives to two separate disk controllers, so that a controller failure can also be tolerated

8.4.3 Fault-tolerant function based on cluster technology

Cluster: a unified computer system composed of multiple hosts

Measures:

Dual-machine hot standby mode: one host does the work while the other stands by as a backup

Dual-machine mutual backup mode: both machines work at the same time, each on its own tasks. Each server has two disks, one for its own use and one for receiving backup data from the other server

Shared disk mode: the two hosts share a single disk, which is divided into two volumes, one for each host

8.4.4 Backup system

Protects data against damage from natural causes. Backup media include:

  • Tape
  • Hard disk
  • Optical disc drive
    • Read-only optical drives: CD-ROM and DVD-ROM
    • Readable and writable optical drives (recorders): CD-RW, COMBO recorder, DVD recorder

8.5 Data Consistency Control

Data consistency: copies of the same data stored in different files must be identical at all times

8.5.1 Transactions

A transaction is a program unit that accesses and modifies various data items. It can be viewed as a series of read and write operations.

  • Operations

Commit operation: all of the transaction's read and write operations on the files involved have completed

Abort operation: some operation of the transaction on one of the files has failed, so the transaction terminates prematurely

  • Property (atomicity)

A transaction modifies a batch of data either completely or not at all

  • Supporting structure: the transaction record

A transaction record is a data structure that records all information about the modifications made to data items while the transaction runs

  • Transaction recovery algorithm

① undo: the transaction record contains the transaction's start record but no commit record, so all data it modified is restored to the old values

② redo: the transaction record contains both the start record and the commit record, so all data it modified is set to the new values
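The undo and redo rules can be sketched against a simple log. The record shapes used here, `("start", T)`, `("write", T, item, old, new)`, and `("commit", T)`, are an assumed format for illustration, as is the function name `recover`.

```python
def recover(log, data):
    """Apply undo to uncommitted transactions and redo to committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # undo: walk the log backwards, restoring old values written by
    # transactions that started but never committed
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old, _ = rec
            data[item] = old

    # redo: walk the log forwards, reapplying new values written by
    # transactions that committed
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _, new = rec
            data[item] = new
    return data


log = [
    ("start", "T1"), ("write", "T1", "A", 100, 50), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "B", 200, 250),   # T2 never committed
]
# State found on disk after the crash: T1's update was lost, T2's reached disk.
print(recover(log, {"A": 100, "B": 250}))
```

Both operations are idempotent in this sketch, so running recovery again after a second crash leaves the data in the same state.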

8.5.2 Checkpoints

The transaction record table keeps growing as time goes on, so recovery after a failure would take longer and longer.

Checkpoints periodically clean up the transaction record table.

  • After a failure, it is not necessary to process every transaction recorded in the table, only the transactions recorded after the last checkpoint

8.5.3 Concurrency Control

Transactions must take effect in a sequential order: only after one transaction has finished with a data item may another transaction start to operate on it. Concurrency control is the mechanism that guarantees this ordering.

Concurrency control:

  • Mutex (exclusive lock): an object can be operated on only after its mutex has been obtained (simple but not efficient)
  • Shared lock: a shared file may be written by only one transaction at a time but read by many, which is why the shared lock is introduced
  • Difference: a mutex allows only one transaction to read or write the object, while a shared lock allows multiple transactions to read it but none to write it
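The difference between the two lock types comes down to a compatibility rule, sketched below. The `LockManager` class is a hypothetical, single-threaded model: `acquire` returns `False` where a real system would block the transaction until the lock is released.

```python
class LockManager:
    def __init__(self):
        self.holders = {}   # object -> (mode, set of transaction ids)

    def acquire(self, txn, obj, mode):
        """Return True if the lock is granted, False if the caller must wait."""
        if mode not in ("shared", "exclusive"):
            raise ValueError("mode must be 'shared' or 'exclusive'")
        current = self.holders.get(obj)
        if current is None:
            self.holders[obj] = (mode, {txn})
            return True
        held_mode, txns = current
        if mode == "shared" and held_mode == "shared":
            txns.add(txn)   # any number of readers may share the object
            return True
        return False        # every other combination conflicts

    def release(self, txn, obj):
        held = self.holders.get(obj)
        if held is None:
            return
        _, txns = held
        txns.discard(txn)
        if not txns:
            del self.holders[obj]


locks = LockManager()
locks.acquire("T1", "file", "shared")      # granted
locks.acquire("T2", "file", "shared")      # granted: readers can coexist
locks.acquire("T3", "file", "exclusive")   # refused while readers hold the lock
```

Using only exclusive locks would refuse the second reader as well, which is the inefficiency the shared lock removes.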

Semaphore mechanism

8.5.4 Data Consistency Issues for Duplicate Data

  1. Duplicate file consistency

    In a UNIX file directory, each directory entry contains an ASCII file name and an index node number, which points to the file's index node.
    When a file has duplicate copies, a directory entry consists of the file name and several index node numbers, one for the index node of each copy.

  2. Link count consistency

    In a UNIX file directory, each directory entry contains an index node number that points to the file's index node. For a shared file, the same index node number appears in several directory entries, and the link count stored in the index node should equal the number of those entries. A mismatch indicates a data inconsistency.
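A link count check can be sketched by counting how many directory entries refer to each index node and comparing that with the count stored in the index node. The data shapes (a list of `(name, inode)` pairs and a dictionary of stored counts) and the function name are assumptions for the example.

```python
from collections import Counter


def check_link_counts(directory_entries, inode_link_counts):
    """Return {inode: (stored count, observed count)} for every mismatch."""
    observed = Counter(inode for _, inode in directory_entries)
    problems = {}
    for inode in set(observed) | set(inode_link_counts):
        stored = inode_link_counts.get(inode, 0)
        seen = observed.get(inode, 0)
        if stored != seen:
            problems[inode] = (stored, seen)
    return problems


entries = [("report.txt", 7), ("shared/report.txt", 7), ("notes.txt", 9)]
stored_counts = {7: 2, 9: 2}   # index node 9 claims two links but has one entry
print(check_link_counts(entries, stored_counts))
```

A stored count that is too high means the file's blocks would never be reclaimed; one that is too low means they could be reclaimed while a directory entry still points at them.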


  • Reference: Computer Operating System (Fourth Edition) (Tang Xiaodan)


Origin blog.csdn.net/woschengxuyuan/article/details/128133558