File System Implementation Concepts

A file system resides permanently on secondary storage (disk), which holds a large amount of data.
A disk can be rewritten in place: a block can be read, modified, and then written back to the same location.
The block is the smallest unit of transfer between secondary storage and main memory.
A disk is a direct-access device: any block can be reached without reading the blocks before it.
The file system itself is designed in layers. Starting from the bottom:
(1) I/O control layer: device drivers (which translate high-level commands into instructions for the hardware controller) plus interrupt handlers.
(2) Basic file system: issues generic commands to the appropriate device driver to read and write physical blocks on the disk.
(3) File organization module: translates logical block addresses into physical block addresses, which it passes to the basic file system.
(4) Logical file system: manages metadata, that is, the file system structure apart from the file contents, such as file names and file attributes.
Layered design also appears in computer networks (for example, the TCP/IP protocol stack). Its advantage is that each layer has a clearly separated responsibility.
Linux file systems: ext3, ext4.
Windows file systems: FAT, NTFS.
A file system on disk includes:
(1) Boot control block: usually the first block of a volume. It holds the information needed to boot an operating system from that volume; if the volume contains no operating system, it is empty.
(2) Volume control block: volume details such as the number of blocks, the number of free blocks, and the block size.
(3) Directory: maps each file name to its FCB (file control block; in UNIX, the inode, or index node).
The FCB was covered in another post. It holds the file's attributes but not the file name.
The FCB is called an inode in UNIX file systems; in NTFS the equivalent information is kept in the Master File Table.
Some information is kept in memory, loaded when the file system is mounted: cached directory information, the system-wide open-file table, and the per-process open-file table.
By the time a file system is mounted, all of its FCBs may already have been allocated (some file systems create them all when the file system is created) and stored in a pool of free FCBs.
UNIX treats a directory as a file, while Windows provides separate system calls for directories and files.
For example, to create and open a new file, the system allocates an FCB, records the file name in the directory, adds entries to the system-wide open-file table and the per-process open-file table, and returns a handle.
This handle is called a file descriptor in UNIX and a file handle in Windows.
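
The relationship between the directory, the two open-file tables, and the returned handle can be sketched roughly as follows. This is a simplified model, not how any real kernel is written: the class names Kernel, Process, FCB, and SystemEntry are made up for illustration, closing files is omitted, and descriptors start at 0.

```python
from dataclasses import dataclass, field


@dataclass
class FCB:
    """File control block: attributes only, no file name."""
    size: int = 0
    blocks: list = field(default_factory=list)


@dataclass
class SystemEntry:
    """One row of the system-wide open-file table."""
    fcb: FCB
    open_count: int = 0


class Process:
    def __init__(self):
        self.fd_table = []       # per-process open-file table; the index is the descriptor


class Kernel:
    def __init__(self):
        self.directory = {}      # file name -> FCB
        self.system_table = {}   # file name -> SystemEntry, shared by all processes

    def open(self, process, name, create=False):
        if name not in self.directory:
            if not create:
                raise FileNotFoundError(name)
            self.directory[name] = FCB()              # allocate a new FCB
        if name not in self.system_table:
            self.system_table[name] = SystemEntry(self.directory[name])
        entry = self.system_table[name]
        entry.open_count += 1
        process.fd_table.append(entry)                # per-process entry points at the shared one
        return len(process.fd_table) - 1              # the handle returned to the caller
```

Two processes that open the same file get their own descriptors, but both per-process entries refer to the same system-wide entry.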

A disk can have multiple partitions. A partition that contains no file system is called raw.
If several operating systems and file systems are installed on the disk, the boot loader in the boot area understands them well enough to locate and start one of them.
The root partition contains the operating system kernel and system files, and is mounted at boot time.
Other file systems can be mounted automatically at boot or manually later. When mounting a file system, the operating system checks that it is valid and, if so, records the file system type in its mount table.

To allow seamless movement among different file system types, the virtual file system (VFS) layer was introduced.
(1) VFS separates the generic file system interface from any specific implementation.
(2) VFS uniquely identifies a file across a network. Its vnode plays a role similar to that of an inode, but an inode is unique only within one file system, whereas a vnode carries an identifier that is unique network-wide, so it can also represent remote (for example, NFS) files.
Therefore VFS can tell local files from remote ones through the vnode, and can further distinguish local files by file system type, so the correct file-system-specific operation is called.
The Linux VFS defines four main object types:
(1) inode object: an individual file.
(2) file object: an open file.
(3) superblock object: an entire file system.
(4) dentry object: an individual directory entry.
Each object type has a set of operations defined on it.
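
The separation of interface from implementation can be illustrated with a small sketch. It is only an analogy in Python, not the Linux VFS API: FileOps, LocalFS, RemoteFS, and VFS are hypothetical names, and the "remote" file system does no real network I/O.

```python
from abc import ABC, abstractmethod


class FileOps(ABC):
    """Hypothetical operations table that every file system type must provide."""

    @abstractmethod
    def read(self, path: str) -> str: ...


class LocalFS(FileOps):
    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]


class RemoteFS(FileOps):
    """Stands in for a network file system; no real network I/O happens here."""

    def __init__(self, server, files):
        self.server = server
        self.files = files

    def read(self, path):
        return f"[{self.server}] " + self.files[path]


class VFS:
    def __init__(self):
        self.mounts = {}   # mount point -> file system implementation

    def mount(self, point, fs):
        self.mounts[point] = fs

    def read(self, path):
        # Pick the longest mount point that prefixes the path, then dispatch.
        point = max((p for p in self.mounts if path.startswith(p)), key=len)
        return self.mounts[point].read(path[len(point):])
```

The caller only ever uses VFS.read; which implementation runs depends on where the path is mounted.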

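A directory can be implemented in several ways.
(1) Linear list.
A linear list stores file names together with pointers to their FCBs. Lookup requires a linear search.
A software cache can hold the most recently used directory information.
(2) Hash table.
Given a file name, a hash function quickly locates the matching directory entry.
Drawback: the table size, and the hash function that depends on it, are fixed in advance and cannot adapt as the directory grows.
Solutions: dynamic hashing (such as extendible hashing or linear hashing), or overflow buckets.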
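
As a rough illustration of the hash table approach with overflow handling, the sketch below uses a fixed number of buckets and lets colliding names spill into a per-bucket list. The class name HashDirectory and the byte-sum hash are assumptions chosen to keep the example small and deterministic; a real file system would use a better hash.

```python
class HashDirectory:
    """Fixed-size hash table; collisions go to a per-bucket overflow list."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, name):
        # Simple deterministic hash so results do not vary between runs.
        return sum(name.encode("utf-8")) % len(self.buckets)

    def add(self, name, fcb_number):
        bucket = self.buckets[self._index(name)]
        for i, (existing, _) in enumerate(bucket):
            if existing == name:
                bucket[i] = (name, fcb_number)   # replace an existing entry
                return
        bucket.append((name, fcb_number))

    def lookup(self, name):
        # Only one bucket is searched, not the whole directory.
        for existing, fcb_number in self.buckets[self._index(name)]:
            if existing == name:
                return fcb_number
        return None
```

Names that hash to the same bucket (for example "ab" and "ba" with this hash) are both kept and are told apart by comparing the stored names.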

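File space allocation methods:
I. Contiguous allocation.
Each file occupies a contiguous run of blocks. A free hole is chosen with a strategy such as first fit.
A directory entry consists of (file name, start block, length).
Advantage: direct access.
Drawbacks:
(1) External fragmentation.
(2) The file size must be known when the file is created.
Solutions:
(1) For external fragmentation, compact the disk: copy the entire file system to tape (or another disk), clear the original disk, and copy the files back into contiguous space.
(2) When the file size cannot be determined in advance, either move the file to a larger hole when it grows, which is time-consuming, or use extents. With extents, a directory entry consists of (file name, start block, block count, pointer to the first block of the next extent).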
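
A minimal sketch of contiguous allocation, assuming free space is tracked as a list of (start, size) holes: first_fit takes the first hole that is large enough, and physical_block shows why direct access is cheap (start plus offset). Both function names are made up for this example.

```python
def first_fit(holes, length):
    """holes: list of (start, size) free runs. Returns the start block or None.

    The chosen hole is shrunk or removed in place.
    """
    for i, (start, size) in enumerate(holes):
        if size >= length:
            if size == length:
                holes.pop(i)
            else:
                holes[i] = (start + length, size - length)
            return start
    return None


def physical_block(entry, logical_block):
    """entry: (name, start, length) directory entry. Direct access is start + offset."""
    _, start, length = entry
    if not 0 <= logical_block < length:
        raise IndexError("logical block outside file")
    return start + logical_block
```

The leftover pieces that first_fit leaves behind in the hole list are exactly the external fragmentation described above.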
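II. Linked allocation.
A directory entry consists of (file name, pointer to the first block, pointer to the last block).
Each file is a linked list of disk blocks.
Advantages: any free block can be used to satisfy a request, and there is no external fragmentation.
Drawbacks:
(1) Only sequential access is efficient; random access is not.
(2) The pointers take up space.
(3) Internal fragmentation.
(4) Reliability: because the blocks are connected by pointers, losing or damaging a single pointer makes the rest of the file unreachable.
Solutions:
To reduce pointer overhead, blocks are grouped into clusters (one cluster consists of several blocks), so fewer pointers are needed. This makes internal fragmentation worse, however.
For reliability, we can use a FAT (file allocation table). A directory entry consists of (file name, start block). The FAT has one entry per disk block, holding the number of the next block in the file, so a block is first looked up in the FAT and then read from its actual location. Because the chain can be followed inside the table, random access also improves.
Unless the FAT is cached in memory, it can cause long seek times, since the disk head must move to the FAT and then to the data block for every access.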
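
Following a file's chain through the FAT can be sketched as below. The table is modeled as a dict from block number to next block number, EOF is a hypothetical end-of-chain marker, and the chain is assumed to be well formed (no cycles).

```python
EOF = -1  # hypothetical end-of-chain marker


def file_blocks(fat, start_block):
    """Follow the chain in the FAT and return the file's blocks in order."""
    blocks = []
    block = start_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, start_block, n):
    """Random access: walk n links inside the table, without touching data blocks."""
    block = start_block
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("logical block outside file")
    return block
```

For a file whose directory entry gives start block 217 and whose FAT entries are 217 -> 618 -> 339 -> EOF, file_blocks returns the blocks in that order, and nth_block finds the third block by reading only table entries.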
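III. Indexed allocation.
All of a file's block pointers are gathered into an index block, which supports direct access.
A single index block may not be large enough to hold all the pointers of a large file, so several schemes exist:
(1) Linked scheme: link several index blocks together.
(2) Multilevel index.
(3) Combined scheme. For example, a UNIX inode holds 15 block pointers. The first 12 point to direct blocks. The other three point to indirect blocks: the first to a single indirect block, the second to a double indirect block, and the third to a triple indirect block.
Data in a small file can therefore be reached directly through the direct pointers.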
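
To make the combined scheme concrete, the sketch below works out which pointer level a given logical block falls under. The figure of 1024 pointers per index block is an assumption (4 KB blocks with 4-byte pointers); the function name locate is made up for this example.

```python
DIRECT = 12                 # direct pointers in the inode
PTRS_PER_BLOCK = 1024       # assumed: 4 KB blocks holding 4-byte block pointers


def locate(logical_block):
    """Return (level, indexes): level 0 is direct, 1 to 3 are single/double/triple indirect.

    indexes lists the slot to follow in each index block, outermost first.
    """
    if logical_block < 0:
        raise ValueError("negative block number")
    if logical_block < DIRECT:
        return 0, [logical_block]
    n = logical_block - DIRECT
    for level in (1, 2, 3):
        span = PTRS_PER_BLOCK ** level      # data blocks reachable through this pointer
        if n < span:
            indexes = []
            for _ in range(level):
                indexes.append(n % PTRS_PER_BLOCK)
                n //= PTRS_PER_BLOCK
            return level, indexes[::-1]
        n -= span
    raise ValueError("file too large for triple indirection")
```

Under these assumptions, logical blocks 0 to 11 need no index block at all, block 12 is the first one reached through the single indirect block, and block 12 + 1024 is the first one that needs the double indirect block.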

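To keep track of free space, the system maintains a free-space list. Several implementations are described below.
(1) Bit vector.
Each allocated block is marked 0 and each free block is marked 1, so simple bit operations find the first free block or a run of consecutive free blocks.
Drawback: the vector takes up a lot of space, and it is efficient only if it is kept in memory.
(2) Linked list.
The free blocks are linked together, and the address of the first free block is cached in memory.
Drawback: inefficient, because traversing the list requires reading every block on it.
(3) Grouping.
The addresses of n free blocks are stored in the first free block.
(4) Counting.
Each entry in the free-space table records (start address, number of contiguous free blocks).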
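
The bit vector approach can be sketched with a single Python integer as the bitmap, using the convention above (1 = free, 0 = allocated). The class name BitVector is hypothetical; a real implementation would scan machine words rather than one arbitrarily large integer.

```python
class BitVector:
    """Free-space bitmap: bit i is 1 when block i is free, 0 when allocated."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = (1 << num_blocks) - 1     # every block starts free

    def first_free(self):
        if self.bits == 0:
            return None
        # x & -x isolates the lowest set bit; its position is the block number.
        return (self.bits & -self.bits).bit_length() - 1

    def allocate(self):
        block = self.first_free()
        if block is not None:
            self.bits &= ~(1 << block)        # mark as allocated
        return block

    def free(self, block):
        self.bits |= 1 << block               # mark as free again
```

Finding the first free block is one bit operation on the whole vector, which is the "simple bit operation" advantage mentioned for this method.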



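Note: I am currently studying operating systems and found that this post is its original author's summary of parts of the book Operating System Concepts. Reading it alongside the book is more effective.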
