Operating system study notes: file management

File management

1. The concept of files

Files are defined bottom-up:

  1. Data item. Data items are the lowest level of data organization in the file system. There are two kinds:

    Basic data items. A value that describes one attribute of an object. It is the smallest logical unit of data that can be named, that is, atomic data.

    Composite data items. A group made up of several basic data items.

  2. Record. A record is a set of related data items that describes the properties of an object in some respect.

  3. File. A file is a set of related information defined by its creator. Logically, files are either structured or unstructured. A structured file consists of a group of similar records and is also called a record file; an unstructured file is treated as a character stream and is also called a stream file.

1.1. File attributes

All file information is stored in the directory structure, which is kept on external storage. File information is brought into memory when needed. A directory entry generally contains the file name and a unique identifier, and the identifier in turn locates the file's other attributes.

1.2. Basic operations of files

A file is an abstract data type. The operating system provides system calls to create, write, read, reposition within, delete, and truncate files.

  1. Create a file. Two steps: (1) find space for the file in the file system; (2) create an entry for the new file in the directory.
  2. Write a file. Issue a system call. The system maintains a write pointer that marks where the next write goes.
  3. Read a file. Issue a system call. The system maintains a read pointer that marks where the next read starts.
  4. Reposition within a file (file seek). Set the current file position to a given value; no actual I/O is needed.
  5. Delete a file. Find the file's entry in the directory, then reclaim the storage space the file occupies.
  6. Truncate a file. Keep the file's attributes but discard its contents and release the space they used.
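
The six operations above can be tried out from Python's os module, which wraps the underlying system calls fairly directly. This is only a sketch: the file name demo.txt and the temporary directory are made up for illustration, and on POSIX-style systems a single position per open file serves both reads and writes rather than two separate pointers.

```python
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")  # hypothetical file name

# Create: the OS finds space and adds an entry to the directory.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)

# Write: advances the position kept for this open file.
os.write(fd, b"hello, file system")

# Reposition (seek): move the position back to the start; no data is moved.
os.lseek(fd, 0, os.SEEK_SET)

# Read: starts at the current position and advances it.
first = os.read(fd, 5)

# Truncate: keep the file and its attributes, cut the contents.
# ftruncate accepts any length; 0 would empty the file completely.
os.ftruncate(fd, 5)
size_after_truncate = os.fstat(fd).st_size

os.close(fd)

# Delete: remove the directory entry so the space can be reclaimed.
os.remove(path)
exists_after_delete = os.path.exists(path)
os.rmdir(workdir)
```

After the run, first holds the first five bytes written, the file size after truncation is 5, and the path no longer exists.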

1.3. File opening and closing

The first time a file is used, the open system call copies the file's attributes (including its physical location on external storage) from external storage into an entry of the open file table in memory, and returns the number of that entry (also called its index) to the user. The operating system maintains this table, the open file table, which holds information about all open files. When the user later requests a file operation, the file is specified by its index in the table, so the directory search is skipped.

Most operating systems require a file to be opened explicitly before it is used. The open operation searches the directory by file name and copies the directory entry into the open file table. If the open request is permitted, the process opens the file, and open returns a pointer to the entry in the open file table. All subsequent I/O operations are usually done through this pointer rather than through the file name.

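Each open file has the following information associated with it:

  • File pointer. The system tracks the last read/write position as the current file position pointer. The pointer is private to each process that has the file open, so it must be stored separately from the on-disk file attributes.
  • File open count. The number of times the file is currently open; the table entry can be reclaimed only after the last close.
  • Disk location of the file. Kept in memory so that it need not be read from disk for every operation.
  • Access rights. The access mode requested by each process when it opened the file.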
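
A minimal sketch of how this bookkeeping might fit together is shown below. It assumes a two-level arrangement (one system-wide table plus one table per process), which is a common design but is not spelled out in the notes above; every class, field, and file name here is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SystemEntry:
    """System-wide entry: attributes copied from the directory entry."""
    name: str
    disk_location: int   # hypothetical starting block number
    open_count: int = 0


@dataclass
class ProcessEntry:
    """Per-process entry: private file position and access mode."""
    system_entry: SystemEntry
    mode: str
    position: int = 0


class OpenFileTables:
    def __init__(self, directory):
        self.directory = directory   # toy directory: name -> starting block
        self.system_table = {}       # name -> SystemEntry
        self.process_tables = {}     # pid -> list of ProcessEntry or None

    def open(self, pid, name, mode="r"):
        # The directory is searched only here, once per open.
        if name not in self.system_table:
            self.system_table[name] = SystemEntry(name, self.directory[name])
        entry = self.system_table[name]
        entry.open_count += 1
        table = self.process_tables.setdefault(pid, [])
        table.append(ProcessEntry(entry, mode))
        return len(table) - 1        # index handed back to the process

    def close(self, pid, index):
        proc_entry = self.process_tables[pid][index]
        self.process_tables[pid][index] = None
        entry = proc_entry.system_entry
        entry.open_count -= 1
        if entry.open_count == 0:    # last close frees the system-wide entry
            del self.system_table[entry.name]


tables = OpenFileTables({"notes.txt": 120})
fd_a = tables.open(pid=1, name="notes.txt")
fd_b = tables.open(pid=2, name="notes.txt")
tables.process_tables[1][fd_a].position = 40   # process 1 has read 40 bytes
count_while_shared = tables.system_table["notes.txt"].open_count
position_b = tables.process_tables[2][fd_b].position
tables.close(1, fd_a)
tables.close(2, fd_b)
```

While both processes hold the file open the count is 2, and moving process 1's position leaves process 2's position at 0, which is why the pointer cannot live with the on-disk attributes.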

2. The logical structure of the file

According to the logical structure, files can be divided into unstructured files and structured files:

  • Unstructured files (stream files)

    An unstructured file simply accumulates data in order. It is an ordered collection of related information items, measured in bytes, with no internal record structure. Because there is no structure to exploit, finding something in an unstructured file requires an exhaustive search.

  • Structured files (record files)

    Structured files can be divided into the following types according to how the records are organized:

    • Sequential file. The records are arranged one after another. They are usually fixed-length and can be stored contiguously or as a linked list.
    • Indexed file.
    • Indexed sequential file.
    • Direct file or hash file.
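
For a sequential file with fixed-length records stored contiguously, the position of record i can be computed directly as i times the record length. The sketch below illustrates this with an in-memory byte stream; the record layout (a 4-byte id followed by an 8-byte name) is an assumption made up for the example.

```python
import io
import struct

# Hypothetical record layout: 4-byte unsigned id, 8-byte name, little-endian.
RECORD = struct.Struct("<I8s")


def write_records(stream, records):
    """Append records one after another, as in a sequential file."""
    for rec_id, name in records:
        stream.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_record(stream, i):
    """Fixed-length records: record i starts at byte i * record size."""
    stream.seek(i * RECORD.size)
    rec_id, raw = RECORD.unpack(stream.read(RECORD.size))
    return rec_id, raw.rstrip(b"\x00").decode("ascii")


f = io.BytesIO()
write_records(f, [(1, "alice"), (2, "bob"), (3, "carol")])
third = read_record(f, 2)
```

With variable-length records or linked storage this shortcut is not available, and reaching record i means reading past the records before it.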

3. Directory structure

3.1. File control block

Just as process management relies on the process control block, directory management relies on a data structure that the operating system calls the file control block.

  • File control block

    The file control block (FCB) is a data structure that stores the information needed to control a file, which makes "access by name" possible. An ordered collection of FCBs is called a file directory, and each FCB is one file directory entry. To create a new file, the system allocates an FCB and stores it in the file directory as a directory entry.

    An FCB mainly contains the following information:

    • Basic information, such as the file name, the physical location of the file, its logical structure, and its physical structure.
    • Access control information, such as file access rights.
    • Usage information, such as the creation time of the file.
  • Index node

    Searching a directory uses only the file name. Only when a matching directory entry is found (the name searched for equals the name in the entry) does the file's physical address need to be read from it. In other words, the rest of the file's description is not used during the search and need not be brought into memory. Some systems therefore separate the file name from the file description: the description forms its own data structure called an index node, or i-node. Each directory entry then consists only of a file name and a pointer to the file's i-node.
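
The split between names and index nodes can be sketched as follows. The i-node fields and numbers are illustrative only and do not follow any particular file system's layout.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    """File description kept apart from the name (fields are illustrative)."""
    size: int
    permissions: int
    blocks: list


# i-node table: i-node number -> Inode
inode_table = {
    7: Inode(size=2048, permissions=0o644, blocks=[120, 121]),
    9: Inode(size=512, permissions=0o600, blocks=[300]),
}

# A directory is only a list of (file name, i-node number) pairs.
directory = [("notes.txt", 7), ("todo.txt", 9)]


def lookup(directory, name):
    """Compare names only; no i-node is touched until a name matches."""
    for entry_name, inode_number in directory:
        if entry_name == name:
            return inode_table[inode_number]
    raise FileNotFoundError(name)


found = lookup(directory, "todo.txt")
```

The benefit shows up in block counts. With illustrative sizes of 64-byte FCBs and 1 KB disk blocks, one block holds 16 FCBs; if a directory entry shrinks to 16 bytes (name plus i-node number), one block holds 64 entries, so a name search reads about a quarter as many blocks.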

3.2. Directory structure

  • Single-level directory structure

    The whole file system has a single directory table, and each file occupies one entry. Searching is slow, two files cannot share a name, and file sharing is inconvenient.

  • Two-level directory structure

    The file directory is split into a master file directory and per-user file directories. The structure is inflexible, however, and does not let a user group files into categories.

  • Multi-level directory structure (tree directory structure). Similar to the directory tree in Linux. A file is named by an absolute path (starting from the root) or a relative path (starting from the current directory).

  • Acyclic-graph directory structure. Extends the tree so that a file or subdirectory can appear in more than one directory, which is what makes sharing possible.
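
Resolving an absolute or relative path in a tree directory comes down to walking the tree one component at a time. The toy sketch below uses nested dictionaries as directories; the tree contents and the resolve helper are invented for illustration.

```python
# Toy directory tree: nested dicts are directories, strings are file contents.
root = {
    "home": {
        "alice": {"notes.txt": "file data"},
    },
    "etc": {"hosts": "127.0.0.1 localhost"},
}


def resolve(path, cwd_parts=()):
    """Walk the tree one component at a time and return the node reached."""
    if path.startswith("/"):
        parts = []                     # absolute path: start from the root
    else:
        parts = list(cwd_parts)        # relative path: start from the cwd
    for comp in path.split("/"):
        if comp in ("", "."):
            continue
        if comp == "..":
            if parts:
                parts.pop()            # go up one level, never above root
            continue
        parts.append(comp)
    node = root
    for comp in parts:
        node = node[comp]              # KeyError if a component is missing
    return node


absolute = resolve("/home/alice/notes.txt")
relative = resolve("notes.txt", cwd_parts=("home", "alice"))
up_and_over = resolve("../../etc/hosts", cwd_parts=("home", "alice"))
```

Both the absolute and the relative form reach the same file, and each component costs one directory lookup, which is why a current directory shortens searches for nearby files.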

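3.3. File sharing

  • Hard link. Several directory entries point to the same index node, which keeps a count of the links to it.
  • Soft link (symbolic link). A separate file whose content is the path of the target file.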
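
The difference between the two kinds of link can be observed with os.link and os.symlink. This sketch assumes a POSIX-like system and a file system that supports both link types; the file names are made up.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original.txt")
hard = os.path.join(workdir, "hard.txt")
soft = os.path.join(workdir, "soft.txt")

with open(original, "w") as f:
    f.write("shared data")

os.link(original, hard)      # hard link: second directory entry, same i-node
os.symlink(original, soft)   # soft link: small file that stores a path

same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
link_count = os.stat(original).st_nlink

os.remove(original)          # drops one name; the data still has another

with open(hard) as f:
    hard_still_readable = f.read()

# The symlink itself still exists, but the path it stores no longer resolves.
soft_is_dangling = os.path.islink(soft) and not os.path.exists(soft)

os.remove(hard)
os.remove(soft)
os.rmdir(workdir)
```

Removing the original name leaves the hard link fully usable because the data is freed only when the link count reaches zero, whereas the soft link is left dangling.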

4. Implementation of the file system

4.1. The hierarchy of the file system

[Figure: the hierarchy of the file system]

4.2. Directory implementation

A directory is implemented so that its entries can be searched. Two common ways:

  • Linear list
  • Hash table
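
The trade-off between the two can be sketched in a few lines. Python's dict stands in for an on-disk hash table here and hides collision handling, so this shows only the difference in lookup cost; the entries are invented.

```python
entries = [("a.txt", 3), ("b.txt", 8), ("c.txt", 5)]   # (name, first block)


def linear_lookup(entries, name):
    """Linear list: compare names one by one, O(n) per lookup."""
    comparisons = 0
    for entry_name, block in entries:
        comparisons += 1
        if entry_name == name:
            return block, comparisons
    return None, comparisons


# Hash table: the name is hashed to locate the entry directly.
hashed = dict(entries)

linear_result = linear_lookup(entries, "c.txt")
hash_result = hashed.get("c.txt")
```

The linear list is simple to implement but needs three comparisons to reach the last entry here; the hash table finds it in one step on average, at the cost of handling collisions and sizing the table.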

4.3. File implementation

4.3.1. File allocation methods (managing the allocated blocks on disk)

| Method | How it works | Directory entry contents | Advantages | Disadvantages |
| --- | --- | --- | --- | --- |
| Contiguous (sequential) allocation | Each file must occupy a run of contiguous disk blocks | Starting block number, file length | Fast sequential access; supports random access | Causes fragmentation; files are hard to extend |
| Implicit linking | Every disk block of the file except the last holds a pointer to the next block (a singly linked list) | Starting block number, ending block number | Avoids fragmentation; high utilization of external storage; files are easy to extend | Sequential access only |
| Explicit linking | A file allocation table (FAT) records the chain of disk blocks explicitly; the FAT stays resident in memory after boot | Starting block number | Random access is also possible by looking up the FAT in memory | The FAT takes up some memory |
| Indexed allocation | An index table is built for the file's data blocks; for large files, a linked scheme, a multi-level index, or a mixed index can be used | Linked scheme: block number of the first index block; multi-level or mixed index: block number of the top-level index block | Supports random access; files are easy to extend | The index table takes up storage space; the index block must be read before the data; with the linked scheme, finding an index block may take several disk reads |
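
To make explicit linking concrete, here is a minimal sketch of following a file's block chain through an in-memory FAT. The table contents and the end-of-chain marker are invented for the example and do not match any real FAT variant.

```python
END = -1   # hypothetical end-of-chain marker

# fat[b] is the block that follows block b in its file.
fat = {4: 9, 9: 2, 2: 7, 7: END}


def block_chain(fat, start):
    """Blocks of a file in logical order, starting from the directory entry."""
    chain = []
    block = start
    while block != END:
        chain.append(block)
        block = fat[block]
    return chain


def physical_block(fat, start, logical_index):
    """Find the i-th block by following i links in the in-memory table."""
    block = start
    for _ in range(logical_index):
        block = fat[block]
    return block


chain = block_chain(fat, 4)
third_block = physical_block(fat, 4, 2)
```

Reaching logical block i still follows i links, but every hop is a memory lookup rather than a disk read, which is what separates explicit linking from implicit linking.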

4.3.2. Management of file storage space (managing free disk blocks)

  • Free table method
  • Free linked list method
  • Bitmap method
  • Grouped linking method
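
Of these, the bitmap method is the easiest to sketch: one bit per disk block records whether the block is free. The convention below (0 means free, 1 means allocated) and the first-fit scan are assumptions; real systems differ on both.

```python
class Bitmap:
    """One bit per disk block: 0 = free, 1 = allocated (convention assumed)."""

    def __init__(self, n_blocks):
        self.bits = [0] * n_blocks

    def allocate(self):
        # Scan for the first free block and mark it as used.
        for block, bit in enumerate(self.bits):
            if bit == 0:
                self.bits[block] = 1
                return block
        raise OSError("no free blocks")

    def free(self, block):
        self.bits[block] = 0


bm = Bitmap(8)
a = bm.allocate()
b = bm.allocate()
bm.free(a)
c = bm.allocate()
```

Block 0 is handed out first, then block 1; after block 0 is freed, the next allocation reuses it.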

5. Disk organization and management

5.1. The structure of the disk

Data on a disk surface is stored in a set of concentric circles called tracks. Each track is divided into hundreds of sectors, each with a fixed capacity (typically 512 bytes); a sector is also called a disk block. The tracks at the same relative position on all surfaces form a cylinder. The sector is the smallest unit of disk addressing, and a disk address is written as "cylinder number, surface (head) number, sector number (or block number)".
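
A three-part disk address can be flattened into a single linear block number and back. The sketch below assumes a made-up geometry and numbers all three fields from 0, which keeps the arithmetic simple; traditional sector numbering starts at 1 instead.

```python
HEADS = 4                # surfaces per cylinder (illustrative geometry)
SECTORS_PER_TRACK = 16   # sectors on each track (illustrative geometry)


def to_block_number(cylinder, head, sector):
    """Linear block number for a (cylinder, head, sector) address, 0-based."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + sector


def to_address(block):
    """Inverse mapping: split a linear block number into its three parts."""
    cylinder, rest = divmod(block, HEADS * SECTORS_PER_TRACK)
    head, sector = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector


block = to_block_number(2, 1, 5)
address = to_address(block)
```

Consecutive block numbers fill one track, then the next surface of the same cylinder, and only then move to the next cylinder, so sequential blocks need as little arm movement as possible.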

5.2. Disk scheduling algorithm

  • First-come, first-served (FCFS) algorithm
  • Shortest seek time first (SSTF) algorithm
  • Scan algorithm (SCAN)
  • Circular scan algorithm (C-SCAN), with the improved LOOK and C-LOOK variants
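
The algorithms can be compared by total head movement on one request queue. The sketch below implements FCFS, SSTF, and the LOOK and C-LOOK variants (which reverse or wrap at the last pending request instead of travelling to the edge of the disk as SCAN and C-SCAN do). The queue and starting track are a common textbook example, the head is assumed to be moving toward higher track numbers, and the C-LOOK return jump is counted as ordinary movement.

```python
def fcfs(start, requests):
    """Serve requests in arrival order."""
    return list(requests)


def sstf(start, requests):
    """Always serve the pending request closest to the current head."""
    pending = list(requests)
    order = []
    head = start
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def look(start, requests):
    """Sweep upward to the highest request, then reverse direction."""
    up = sorted(t for t in requests if t >= start)
    down = sorted((t for t in requests if t < start), reverse=True)
    return up + down


def c_look(start, requests):
    """Sweep upward, jump back to the lowest request, sweep upward again."""
    up = sorted(t for t in requests if t >= start)
    wrapped = sorted(t for t in requests if t < start)
    return up + wrapped


def total_seek(start, order):
    """Sum of track distances travelled while serving requests in order."""
    distance = 0
    head = start
    for track in order:
        distance += abs(track - head)
        head = track
    return distance


requests = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53
results = {
    "FCFS": total_seek(start, fcfs(start, requests)),
    "SSTF": total_seek(start, sstf(start, requests)),
    "LOOK": total_seek(start, look(start, requests)),
    "C-LOOK": total_seek(start, c_look(start, requests)),
}
```

On this queue FCFS moves the head 640 tracks, SSTF 236, LOOK 299, and C-LOOK 322. SSTF wins here but can starve far-away requests, which is the usual argument for the scanning family.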

Origin blog.csdn.net/qq_36879493/article/details/107998659