[Linux] Disk and file system

Table of contents

1. The physical structure of the disk

2. Disk logical abstraction

3. File system

1、Super Block

2、Group Descriptor Table

3、inode Table

4、Data Blocks

5、inode Bitmap

6、Block Bitmap

4. The file system under Linux

1. Inode and file name

2. Add, delete, check and modify files

2.1. View file content

2.2. Delete files

2.3. Create files

2.4. Supplementary content

5. Soft and hard links

1. Soft link

2. Hard link


1. The physical structure of the disk

The physical structure of the disk is shown in the figure:

 The specific physical storage structure is as follows:

 

The basic unit of storage on a disk is the sector. A sector is 512 bytes or 4 KB in size; here we assume 512 bytes for the time being. On an ordinary disk, all sectors are 512 bytes. All sectors at the same radius on a surface form a ring called a track.

To read specific data, we first select the platter surface according to the head number, then select the track, and finally locate the sector according to the sector number. Locating a sector through a head, a cylinder (track), and a sector is called CHS addressing.

An ordinary file consists of attributes plus content. Both are essentially data, and together they occupy one or more sectors. Since CHS can locate any single sector, it can locate any number of sectors, so from the hardware perspective any file can be read or written.

2. Disk logical abstraction

We already know that the OS can access any sector if it knows the CHS address. However, the OS is software and the disk is hardware. To keep changes in the hardware from forcing changes in the OS, the two need to be decoupled, so the OS does not use CHS addresses internally.

To reduce the number of IO operations, the basic unit of IO between the OS and peripherals is 4 KB (adjustable). Even to modify a single byte, the OS must load the whole 4 KB containing that byte into memory, modify it, and then write it back to the disk. This is why the disk is called a block device. The OS therefore needs a new set of addresses for block-level access.

 Think of a disk track as a contiguous spatial structure:

The sectors then form one continuous array, and a single array index is enough to locate a sector. Since the OS performs IO in units of 4 KB, an OS-level block consists of 8 sectors, and the OS does not need to care about sectors at all. Computers conventionally address data as start address + offset. In the same way, to access a data block the OS only needs the start address of the block plus the fixed block size of 4 KB, much as a typed variable is accessed through its start address and the size of its type.

Therefore, the address of a block is essentially an index N into the array, and any block can be located through its index. This addressing method is called LBA (Logical Block Address).

An LBA address can be converted into a CHS address through simple arithmetic. Suppose LBA = 6500, one surface of the disk holds 5000 sectors, and one track holds 1000 sectors. Then 6500 / 5000 = 1 remainder 1500, so the sector is on the second surface (index 1); 1500 / 1000 = 1 remainder 500, so it is on the second track of that surface (index 1), at sector offset 500 within the track.

From this point on, managing the disk is abstracted into managing one large array.
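The conversion can be sketched in a few lines of Python. This is only an illustration of the simplified model used in this article, where whole surfaces are laid out one after another in the array; real disks use a different geometry, and the function and parameter names here are made up for the example. For the values above it returns the zero-based indices (1, 1, 500).

```python
def lba_to_chs(lba, sectors_per_surface, sectors_per_track):
    """Convert a linear sector index into (surface, track, sector) indices.

    All three results are zero-based and follow the simplified layout from
    the article: surface after surface, track after track within a surface.
    """
    surface, offset_in_surface = divmod(lba, sectors_per_surface)
    track, sector = divmod(offset_in_surface, sectors_per_track)
    return surface, track, sector


def chs_to_lba(surface, track, sector, sectors_per_surface, sectors_per_track):
    """Inverse mapping: rebuild the linear index from the three indices."""
    return surface * sectors_per_surface + track * sectors_per_track + sector


if __name__ == "__main__":
    print(lba_to_chs(6500, 5000, 1000))
```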

3. File system

Since the disk is large, it is divided into partitions for easier management, and each partition is further divided into groups. The specific structure is as follows:

When the disk is partitioned, a Boot Block is set at the very beginning. This area mainly stores boot-related content, such as the partition table and the address of the OS image. Generally speaking, it is located at head 0, track 0, sector 1 of the disk. When the user turns on the computer, the boot code loads the disk driver, reads the partition table of the disk, then reads the address of the OS from the start of the specific partition and loads the OS. Only at this point is the OS actually running.

After the Boot Block come the Block groups into which each partition is divided. Each Block group has 6 areas, described below.

1、Super Block

The Super Block stores the attribute information of the whole file system, including the type of the file system and the overall status of its groups. The recorded information mainly includes: the total number of blocks and inodes, the number of unused blocks and inodes, the size of a block and of an inode, the time of the last mount, the time of the last write, the time of the last disk check, and other information about the file system.
A copy of the super block may exist in each Block group. The data stored in every copy is exactly the same and is updated together. The purpose is protection: if the only super block were damaged, the entire partition could no longer be used, so backups are kept.

2、Group Descriptor Table

The GDT (Group Descriptor Table) holds the group descriptor, which saves attribute information such as detailed statistics for the group. For example, it records where each area of the group starts and ends, how much of the group is in use, and so on.

3、inode Table

Generally speaking, the collection of all attributes of a file is called an inode, and its size is typically 128 bytes. Each file has one inode. A group contains a large number of files and therefore a large number of inodes, so the group needs a special area to store them. This area is called the inode Table.

Inside the group, each inode has its own inode number, and the inode number is itself one of the attributes of the corresponding file. Linux looks up a file by its inode number.

An inode corresponds to one file, and the inode records the mapping between the file and the data blocks that hold its content.

4、Data Blocks

The content of a file changes over time and is saved in data blocks. Saving the content of one file requires n data blocks, and saving multiple files requires correspondingly more. The area where these data blocks are located is Data Blocks. The default size of a data block is 4 KB.

When Linux looks for a file, it first finds the inode of the file. The inode structure contains an array such as int blocks[NUM], which records the addresses of the data blocks that store the content of the file. More than 95% of a group consists of Data Blocks.

When the operating system opens a file, it loads only the inode of the file at first. The inode contains the mapping to the data blocks of the file content, so whichever part of the content is accessed, only that part is loaded into memory according to the mapping.

5、inode Bitmap

 The inode Bitmap is a bitmap structure, and each bit indicates whether an inode is free or not.

6、Block Bitmap

The Block Bitmap is a bitmap structure that records which data blocks in Data Blocks are occupied and which are free.
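Both bitmaps work the same way, so one small Python sketch covers them. It is only a toy model for illustration, not the on-disk format of any real file system; the class and method names are invented here.

```python
class Bitmap:
    """A fixed-size bitmap: bit i is 1 when item i is in use, 0 when free."""

    def __init__(self, size):
        self.size = size
        self.bits = bytearray((size + 7) // 8)  # 8 items per byte

    def is_set(self, i):
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def set(self, i):
        self.bits[i // 8] |= 1 << (i % 8)

    def clear(self, i):
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def allocate(self):
        """Find the first free bit, mark it used, and return its index."""
        for i in range(self.size):
            if not self.is_set(i):
                self.set(i)
                return i
        raise RuntimeError("no free entries left")


if __name__ == "__main__":
    inode_bitmap = Bitmap(16)
    first = inode_bitmap.allocate()
    second = inode_bitmap.allocate()
    inode_bitmap.clear(first)  # freeing an inode is just clearing one bit
    print(first, second, inode_bitmap.allocate())
```

Scanning for the first zero bit is the simplest allocation policy; a real file system would use something smarter, but the bookkeeping idea is the same.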

4. The file system under Linux

Use ls with the -i option to display the inode number of a file.
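The same number can also be read from a program. The following Python sketch uses os.stat, whose st_ino field holds the inode number; the helper name inode_of and the file name are just examples.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number of path, as reported by the file system."""
    return os.stat(path).st_ino


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "myfile.txt")
        open(path, "w").close()  # create an empty file
        print(inode_of(path), path)
```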

1. Inode and file name

Linux identifies a file only by its inode number. The inode of a file does not contain the file name; the file name exists only for the user.

Every file must be inside a directory. The directory itself is also a file, with its own inode and its own data blocks. The data blocks of a directory store the mapping between the names of the files in the directory and their inode numbers. Within a directory, a file name and its inode number act as keys for each other.

An inode number is unique only within one partition and cannot be used across partitions. The group of the current partition that a file belongs to can be determined from its inode number.
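This name-to-inode mapping can be observed from user space. The sketch below uses os.scandir, which returns directory entries with a name and an inode() method; the helper name directory_entries is made up for the example.

```python
import os
import tempfile


def directory_entries(path):
    """Return {file name: inode number} for the entries of a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name in ("a.txt", "b.txt"):
            open(os.path.join(d, name), "w").close()
        print(directory_entries(d))
```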
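As a rough sketch of how a group can be derived from an inode number: in ext2-style layouts, inode numbers start at 1 and every group holds the same number of inodes, so integer division is enough. The value 8192 for inodes_per_group is only an assumed example; the real value is stored in the super block.

```python
def locate_inode(inode_number, inodes_per_group):
    """Return (group index, index inside that group's inode table).

    Assumes ext2-style numbering: inode numbers start at 1 and each group
    holds exactly inodes_per_group inodes.
    """
    if inode_number < 1:
        raise ValueError("inode numbers start at 1")
    return divmod(inode_number - 1, inodes_per_group)


if __name__ == "__main__":
    print(locate_inode(8193, 8192))
```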

2. Add, delete, check and modify files

2.1. View file content

When a user accesses the content of a target file, the access always happens in a specific directory. The specific process is as follows:

  1. First find the inode number of the target file in the current directory, according to the file name.
  2. A directory is also a file and belongs to a partition. In this partition, the group is found through the inode number of the target file, and the inode of the target file is found in the inode Table area of that group.
  3. Through the mapping between the inode of the target file and its Data blocks, the data blocks of the file are found, loaded into memory, and finally displayed on the monitor.

2.2. Delete files

When the user deletes a target file, the specific process is as follows:

  1. In the current directory, find the inode number of the target file according to the file name.
  2. Find the inode of the target file according to the inode number, follow its mapping to the Data blocks, and set the corresponding bits in the block bitmap to 0.
  3. Set the corresponding bit in the inode bitmap to 0 according to the inode number.

So deleting a file only requires modifying the bitmaps; the data blocks themselves are not cleared.
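The deletion steps can be sketched with a toy model in Python, where plain lists and dictionaries stand in for the on-disk structures. All names are invented for illustration, and removing the name from the directory is an extra step added in this sketch. The point to observe is that data_blocks is untouched after the call.

```python
def delete_file(directory, inode_table, inode_bitmap, block_bitmap, name):
    """Delete name from a toy file system by flipping bitmap bits only."""
    ino = directory.pop(name)            # step 1: file name -> inode number
    for blk in inode_table[ino]["blocks"]:
        block_bitmap[blk] = 0            # step 2: release the data blocks
    inode_bitmap[ino] = 0                # step 3: release the inode
    return ino


if __name__ == "__main__":
    data_blocks = [b"", b"hello", b"world", b""]
    block_bitmap = [0, 1, 1, 0]
    inode_bitmap = [0, 1]
    inode_table = {1: {"blocks": [1, 2]}}
    directory = {"myfile.txt": 1}

    delete_file(directory, inode_table, inode_bitmap, block_bitmap, "myfile.txt")
    # The bitmaps are all zero again, yet the old bytes are still in place.
    print(block_bitmap, inode_bitmap, data_blocks)
```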

2.3. Create files

When a user creates a target file, it must be created in a directory. The specific process is as follows:

  1. The OS scans the inode bitmap of the group where the directory is located, finds a free position, sets it to 1, and obtains the inode number.
  2. The default attributes of a newly created file are filled into the corresponding inode.
  3. A new mapping between the file name and the inode number is added to the Data blocks of the current directory file.
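The three creation steps can be sketched with the same kind of toy model. The structures and the default attributes (size, links, blocks) are assumptions made for this example, not the real inode layout.

```python
def create_file(directory, inode_table, inode_bitmap, name):
    """Create an empty file in a toy file system and return its inode number."""
    if name in directory:
        raise FileExistsError(name)
    # Step 1: find a free inode (list.index raises ValueError if none is free).
    ino = inode_bitmap.index(0)
    inode_bitmap[ino] = 1
    # Step 2: fill in default attributes; an empty file owns no data blocks.
    inode_table[ino] = {"size": 0, "links": 1, "blocks": []}
    # Step 3: record the file name -> inode number mapping in the directory.
    directory[name] = ino
    return ino


if __name__ == "__main__":
    inode_bitmap = [1, 1, 0, 0]
    inode_table = {}
    directory = {}
    print(create_file(directory, inode_table, inode_bitmap, "new.txt"))
    print(inode_bitmap, directory)
```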

2.4. Supplementary content

Everything described above, including partitioning, grouping, and filling in system attributes, is done by the OS. After partitioning, the partition must be formatted before it can work normally: the OS writes the management information of the file system to the partition and divides it into areas. If the areas were already laid out before, formatting clears the bitmap structures and resets the attribute fields to their initial state.

The file system establishes the mapping between an inode and its Data blocks through an array. Since a file may need a very large number of data blocks, the array uses direct indexing, secondary indexing, and tertiary indexing to complete the mapping. This is not key content here, so it is mentioned only for awareness and not explained further.

In a file system, the data blocks may be used up while inodes remain, or the inodes may be used up while data blocks remain. For example, creating a single file and continuously writing data into it consumes Data blocks, while repeatedly creating empty files consumes inodes. There is currently no way to avoid this problem.
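To give a feeling for why several levels of indexing are needed, the sketch below counts how many data blocks one inode could address. The numbers are assumptions in the style of classic ext2 (12 direct entries, then one single-, one double-, and one triple-indirect entry, with 4-byte block addresses); they are not taken from this article.

```python
def max_file_blocks(block_size=4096, pointer_size=4, direct=12):
    """Count the data blocks reachable from one inode.

    Assumed layout: `direct` direct entries plus one single-, one double-,
    and one triple-indirect entry. An index block holds
    block_size // pointer_size block addresses.
    """
    per_block = block_size // pointer_size
    return direct + per_block + per_block ** 2 + per_block ** 3


if __name__ == "__main__":
    blocks = max_file_blocks()
    print(blocks, "blocks, about", blocks * 4096 // 2 ** 40, "TiB")
```

With direct entries alone, 12 blocks of 4 KB cover only 48 KB, which is why the indirect levels exist.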
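Both resources can be checked separately. The sketch below reads them with os.statvfs (available on Unix-like systems), which reports block counts and inode counts independently; the helper name usage is made up, and some file systems report 0 for the inode fields.

```python
import os


def usage(path="/"):
    """Return block and inode counters for the file system containing path."""
    st = os.statvfs(path)
    return {
        "blocks_total": st.f_blocks,
        "blocks_free": st.f_bfree,
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
    }


if __name__ == "__main__":
    for key, value in usage("/").items():
        print(key, value)
```

On the command line, df -h and df -i show the same two views.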

5. Soft and hard links

1. Soft link

The command to create a soft link:

ln -s [target file] [soft link name]

The specific operation is as follows:

Here my-soft is created as a soft link to myfile.txt. my-soft is a link file.

The inode number of my-soft is different from that of myfile.txt, which shows that a soft link is an independent file. It has its own inode number, so it must also have its own inode attributes and content. The content of a soft link is the path of the file it points to, which lets users find the target file quickly.

Soft links are useful when the path of a target file is very deep. Writing a very long path every time we access the file is inefficient, so we can create a soft link in the working directory to reach the target file conveniently. This is similar to shortcuts in Windows systems.
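The same observations can be reproduced from Python. In this sketch, os.symlink creates the link, os.lstat inspects the link itself, and os.readlink returns the stored path; the helper name and the file names are only examples.

```python
import os
import tempfile


def make_soft_link(target, link_name):
    """Create a soft link and report what the file system stores for it."""
    os.symlink(target, link_name)
    return {
        "target_inode": os.stat(target).st_ino,
        "link_inode": os.lstat(link_name).st_ino,   # lstat: the link itself
        "link_content": os.readlink(link_name),     # the stored path
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "myfile.txt")
        open(target, "w").close()
        print(make_soft_link(target, os.path.join(d, "my-soft")))
```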

2. Hard link

The command to create a hard link:

ln [target file] [hard link name]

 The specific operation is as follows:

Here my-hard is created as a hard link to myfile.txt. my-hard is an ordinary file.

The inode number of my-hard is the same as that of myfile.txt, which means that the hard link and the original file are the same file. Creating a hard link only adds a mapping between the new file name and the existing inode number; it only modifies the content of the current directory.

The hard link count of both my-hard and myfile.txt is now 2. This means that there are two ways to find the file, corresponding to two file names. The hard link count is essentially a reference count.

Now we use the unlink command to delete the hard link:

 At this time, the number of hard links of the file becomes 1 again.
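The whole sequence can be reproduced with a short Python sketch, assuming a file system that supports hard links. os.link creates the link, st_nlink is the hard link count, and os.unlink removes one name; the helper name hard_link_demo is invented for the example.

```python
import os
import tempfile


def hard_link_demo(directory):
    """Create a hard link, then remove it, recording what changes."""
    original = os.path.join(directory, "myfile.txt")
    link = os.path.join(directory, "my-hard")
    with open(original, "w") as f:
        f.write("hello\n")

    os.link(original, link)                       # same as: ln myfile.txt my-hard
    same_inode = os.stat(original).st_ino == os.stat(link).st_ino
    count_linked = os.stat(original).st_nlink     # count while both names exist

    os.unlink(link)                               # same as: unlink my-hard
    count_after = os.stat(original).st_nlink
    return same_inode, count_linked, count_after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(hard_link_demo(d))
```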

Next, we create another directory file and observe the number of hard links:

You can see that the default hard link count of a directory is 2. This is because a directory inherently has two hard links: one is its own name, and the other is the "." entry inside the directory. If the directory contains one subdirectory, its hard link count becomes 3: its own name, the "." entry inside the directory, and the ".." entry inside the subdirectory:


  Now look at the number of hard links in the root directory:

The hard link count is 19. The contents of the root directory are as follows:

Generally speaking, the hard link count of a directory minus 2 is the number of subdirectories directly under it.
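The rule can be checked with a small Python sketch. On ext-family file systems the two values are expected to be 2 and 3, but this is a convention and not a guarantee: some file systems report directory link counts differently, so treat the output as an observation. The helper name is made up.

```python
import os
import tempfile


def directory_link_counts(base):
    """Return the link count of a new directory before and after adding
    one subdirectory to it."""
    parent = os.path.join(base, "dir")
    os.mkdir(parent)
    before = os.stat(parent).st_nlink
    os.mkdir(os.path.join(parent, "sub"))   # adds a ".." entry pointing back
    after = os.stat(parent).st_nlink
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(directory_link_counts(d))
```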


Note that users cannot create hard links to a directory themselves!

The system reports that hard links cannot be created for directories. If hard links to directories were allowed, they could easily create loops in the path structure, so the OS does not give this permission to the user.
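The refusal can also be observed from a program. In this sketch, os.link on a directory is expected to raise an OSError (on Linux it is typically a PermissionError); the helper simply reports which exception type was raised, and its name is made up.

```python
import os
import tempfile


def try_hard_link_directory(base):
    """Try to hard-link a directory; return the exception type name,
    or None if the call unexpectedly succeeds."""
    target = os.path.join(base, "dir")
    os.mkdir(target)
    try:
        os.link(target, os.path.join(base, "dir-hard"))
    except OSError as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(try_hard_link_directory(d))
```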


That’s all for the relevant content about the file system. I hope you will support me a lot. If there is something wrong, please correct me. Thank you!


Origin blog.csdn.net/weixin_74078718/article/details/130249187