[Linux] Basic IO -- disk file system

Foreword

In the previous Linux article we covered opening, reading, and writing files. All of those operations apply to files that are already open.
But there are also files that have not been opened: disk files. This post focuses on disk files and how they are managed.


1. Disk file

Let's think first:

  1. How to properly store disk files?
  2. How to locate a file?
  3. How to read and write to disk files?

1. The physical structure of the disk

The disk is the only mechanical device in the computer, and it is also a peripheral.

To make the later abstractions easier to follow, we first look at the physical structure of the disk.

  1. Platters
    Platters are stacked together, forming disk surfaces layer by layer.
    A platter resembles an optical disc. As a simplification, picture its surface as covered with tiny magnets, one after another. Writing data to the disk flips a magnet N -> S; deleting data flips it S -> N.
  2. Heads
    Heads read the data, and there are several of them working together: each surface has its own head. Because the heads are also arranged layer by layer and fixed to one another, they swing in unison.
  3. Platters and heads do not touch. If they did, the high-speed rotation could generate enough heat to destroy the magnetism on the platter surface and lose data. The gap between them is nevertheless very small.

2. Physical reading and writing

Next, let's look at how data is physically read from and written to the disk.

First, the layout of a platter:

  1. Wedge
    A platter is circular. Dividing it evenly by angle yields fan-shaped wedges.
  2. Sector
    From the outside in, at different radii, each wedge is divided into blocks; these blocks are sectors. The sector is the basic unit of disk storage. A sector holds 512 bytes or 4 KB of data; on a typical disk every sector is 512 bytes.
  3. Track
    The ring at a given radius is a track; a track consists of the sectors that share that radius.

Next, let's see how reads and writes work on this physical structure.

Reads and writes depend on the heads: a motor drives the spindle and the actuator arm, and the heads read and write on the platter surfaces.
Every surface, every track, and every sector has its own number, and these numbers tell the drive where to move the head.
First, determine which surface holds the data, that is, which head does the reading -> head number (surface number).
Second, locate the track -> radius.
Finally, determine the sector on that track -> sector number.
This physical way of locating data is called CHS addressing,
named after the first letters of cylinder (track), head, and sector.

3. Logical abstraction of the disk

At the physical level, CHS addressing pins down the sector that holds the data we want. But the disk is a peripheral, while the operating system runs inside the machine. Does the operating system also read data with CHS addressing?
The answer is no. If the operating system, which is software, relied on CHS addressing too, CHS would be the only channel between the two. As mentioned above, a typical disk has 512-byte sectors, but some have 4 KB sectors. Different disks lay out data differently, so CHS addressing cannot be uniform across them. Disks may differ, but the operating system should treat every disk the same way. The operating system therefore does not read data through CHS directly; it uses its own logical scheme.

When the operating system manages a real-world structure, it follows the rule "describe first, then organize", modeling the object as an abstract data type. The disk is handled the same way.

Think of the magnetic tape we have all used, which also stores data magnetically. Inside the cassette the tape is wound in a circle, but pulled out bit by bit it unrolls into a linear structure.
In the operating system, the tracks of a disk are likewise abstracted into an array whose elements are sectors.

In this simple abstraction of disk storage, an array index stands for a sector number.
A sector is usually 512 bytes, but the operating system performs IO in units of 4 KB, so it reads 8 sectors at a time. The operating system does not actually need the concept of a sector: it simply calls the 4 KB it reads in one go a block.
Computers conventionally access data as start address + offset,
so all we need is the start address of a data block (the array index of its first sector) plus 4 KB (the block size); the operating system treats the block as a type.
A block address is therefore, in essence, just an array index.
This access method is called LBA (Logical Block Address).
The operating system hands the disk the block address of the data it wants:
OS -> N (array index) -> LBA -> logical block address
The disk, however, locates data by CHS addressing,
so LBA and CHS addresses have to be converted into each other.
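
To make the conversion concrete, here is a minimal sketch in Python. It assumes the array index addresses one 512-byte sector and uses the conventional conversion formula; the geometry constants (16 heads per cylinder, 63 sectors per track) and all function names are made up for illustration, since a real drive reports its own geometry.

```python
from typing import NamedTuple

SECTOR_SIZE = 512                               # bytes per sector, as in the text
BLOCK_SIZE = 4096                               # bytes per OS block (4 KB)
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE   # 8 sectors per block

# Hypothetical geometry, chosen only for illustration.
HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63


class CHS(NamedTuple):
    cylinder: int  # track number, counted from 0
    head: int      # surface number, counted from 0
    sector: int    # sector on the track, counted from 1 by convention


def block_to_sectors(block: int) -> range:
    """Sector indexes covered by one 4 KB block."""
    first = block * SECTORS_PER_BLOCK
    return range(first, first + SECTORS_PER_BLOCK)


def lba_to_chs(lba: int) -> CHS:
    """Convert a linear sector index into (cylinder, head, sector)."""
    cylinder, rest = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector = divmod(rest, SECTORS_PER_TRACK)
    return CHS(cylinder, head, sector + 1)


def chs_to_lba(chs: CHS) -> int:
    """Inverse of lba_to_chs."""
    return ((chs.cylinder * HEADS_PER_CYLINDER + chs.head) * SECTORS_PER_TRACK
            + (chs.sector - 1))


if __name__ == "__main__":
    for lba in block_to_sectors(2):
        print(lba, lba_to_chs(lba))
```

The operating system only ever works with the linear index; the translation to a head, a track, and a sector stays on the disk side of the boundary.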

2. File system

Our earlier discussion of the file system was incomplete: it only covered open files. This time we look at how the file system manages files on disk.

A laptop's storage may appear as 2 drives or 4 drives; mine shows 4, namely C, D, E, and F. That is only how the Windows GUI presents it. The hard disk is not physically split into 4 pieces; it is one whole disk, divided into 4 regions (partitions) by agreed boundaries.
We know the operating system reads in units of 4 KB, while a partition is usually more than 100 GB, so each partition is subdivided further into groups.
Next we study this grouping in detail.
Let's first look at the overall structure of a partition.

Boot Block: 1 KB in size, as stipulated by the PC standard. It stores the disk partition information and boot information, and no file system may modify it.
Block group: described in detail below.

1. Block group

Next, let's look at the structure of a block group in detail.
File = attributes + content
Linux stores attributes and content separately: attributes go in the inode table, content goes in the data blocks.

All attributes of a file are stored in an inode (128 bytes): one file, one inode. Attributes include, but are not limited to, creation time, file size, file type, and permissions.

File content varies in size, so it is stored in data blocks, and a file's content may occupy one block or many. A block is 4 KB, so even a file with only 1 byte of content is still allocated a whole 4 KB block.
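
The rounding up to whole blocks is just a ceiling division. A minimal sketch of the arithmetic, assuming the 4 KB block size used in this post (the function names are my own):

```python
BLOCK_SIZE = 4096  # bytes per data block, as stated in the text


def blocks_needed(size_in_bytes: int) -> int:
    """Number of whole 4 KB blocks needed to hold the content."""
    return -(-size_in_bytes // BLOCK_SIZE)  # ceiling division


def wasted_bytes(size_in_bytes: int) -> int:
    """Allocated space that the content does not actually fill."""
    return blocks_needed(size_in_bytes) * BLOCK_SIZE - size_in_bytes


if __name__ == "__main__":
    for size in (1, 4096, 4097, 10_000):
        print(size, blocks_needed(size), wasted_bytes(size))
```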

  1. Super Block: stores the attributes of the file system as a whole, such as the file system type (Ext on Linux) and the overall grouping (how many groups the partition has, where it starts, where it ends, how much is loaded into memory, ...).

Super Blocks are updated in unison. Every block group contains a Super Block: the copy in group 0 is the primary one and the copies in the other groups are backups, so that a small amount of damage does not make the whole file system unusable. Because the Super Block describes the file system as a whole, it has to be backed up.

  2. Group Descriptor Table: describes what is stored where within the group, such as the starting position and length of the Block Bitmap.
  3. inode table: stores the inodes of all files in the group. Every inode has its own inode number, which is itself one of the file's attributes.
  4. Data blocks: store the content of all files in the group.

An inode corresponds to one file, and a file's inode is mapped to the file's data blocks: the inode stores the indexes (possibly several) of the data blocks that hold the file's content.

  5. Block Bitmap & inode Bitmap: each bit records whether the corresponding data block or inode is free or in use.
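
A bitmap of this kind is easy to sketch. The class below is a toy model, not the real on-disk layout: the class name and its methods are invented for illustration, and the first-free-slot search is the simplest possible allocation policy.

```python
class Bitmap:
    """One bit per data block or inode: 1 = in use, 0 = free."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.bits = bytearray((size + 7) // 8)  # 8 slots per byte

    def is_used(self, index: int) -> bool:
        return bool(self.bits[index // 8] >> (index % 8) & 1)

    def set_used(self, index: int) -> None:
        self.bits[index // 8] |= 1 << (index % 8)

    def set_free(self, index: int) -> None:
        self.bits[index // 8] &= ~(1 << (index % 8)) & 0xFF

    def allocate(self) -> int:
        """Mark the first free slot as used and return its index."""
        for index in range(self.size):
            if not self.is_used(index):
                self.set_used(index)
                return index
        raise OSError("no free slot left")


if __name__ == "__main__":
    block_bitmap = Bitmap(16)
    first = block_bitmap.allocate()
    second = block_bitmap.allocate()
    block_bitmap.set_free(first)
    print(first, second, block_bitmap.allocate())
```

Freeing a slot only clears one bit, which is why the same slot is handed out again on the next allocation.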

2. File name and inode number

Linux identifies a file only by its inode number; the file name is not one of the attributes stored in the inode.
File names exist for users.

A directory is also a file and has its own inode, but its content differs from that of a regular file.
Every file lives inside some directory, and the directory's data blocks store the mapping between file names and inode numbers. Within a directory, the file name and the inode number act as keys for each other (the inode can be found from the file name, and the file name from the inode).
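
We can observe this mapping from user space. The sketch below uses Python's standard os.scandir, whose entries expose the inode number recorded in the directory, and compares it with what os.stat reports for the file itself. The helper name list_directory and the temporary demo directory are my own scaffolding.

```python
import os
import tempfile


def list_directory(path: str) -> dict:
    """Return the file name -> inode number mapping read from a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


# Create a throwaway directory with one file and inspect it.
with tempfile.TemporaryDirectory() as demo_dir:
    file_path = os.path.join(demo_dir, "log.txt")
    with open(file_path, "w") as f:
        f.write("hello")
    mapping = list_directory(demo_dir)
    inode_from_stat = os.stat(file_path).st_ino

print(mapping, inode_from_stat)
```

On the command line, ls -i shows the same name and inode pairs.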

3. Reading the file

When we access a file, we do so from within a specific directory: cat log.txt

  1. In the current directory, look up the file name to get the file's inode number.
  2. A directory is itself a file and must belong to some partition. Using the inode number, find the block group within that partition, then find the file's inode within the group.
  3. Through the inode, find the file's data blocks, load them into the OS, and display the content.
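
Step 2 is plain arithmetic in ext2-style file systems, where inode numbers start at 1 and every group holds the same number of inodes. A minimal sketch, assuming a made-up value of 8192 inodes per group (the real value is recorded in the Super Block) and the 128-byte inode from this post:

```python
INODES_PER_GROUP = 8192  # hypothetical; the real value comes from the Super Block
INODE_SIZE = 128         # bytes per inode, as stated in the text


def locate_inode(inode_number: int) -> tuple:
    """Return (block group, index in that group's inode table, byte offset in the table)."""
    if inode_number < 1:
        raise ValueError("inode numbers start at 1")
    group, index = divmod(inode_number - 1, INODES_PER_GROUP)
    return group, index, index * INODE_SIZE


if __name__ == "__main__":
    for number in (1, 8192, 8193, 20000):
        print(number, locate_inode(number))
```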

4. Deletion of files

To delete a file, we first have to find it, and the lookup is the same as for reading. The deletion itself does not wipe the data in the data blocks; it only sets the file's bits in the Block Bitmap and inode Bitmap to 0. The next time a file is created or written at that location, the old data is simply overwritten.
If we delete a file by mistake on Linux, we can in principle look up the deleted file's inode number in the logs, use it to find the original inode, and set the corresponding bits in the inode Bitmap and Block Bitmap back to 1, provided the blocks have not been reused yet. New files will then not overwrite that location, and the file is restored.
On Windows, a deleted file can be found in the Recycle Bin. The Recycle Bin is similar to the logs, but moving a file to the Recycle Bin really corresponds to mv on Linux, a cut and paste into the bin. Emptying the Recycle Bin corresponds to rm on Linux.
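
The idea that deleting only flips bitmap bits can be played through with a toy model. Everything below (the ToyGroup class, its tiny tables, the undelete helper) is invented for illustration and ignores many details of a real file system.

```python
BLOCK_SIZE = 4096


class ToyGroup:
    """A toy block group: two bitmaps, an inode table, and data blocks."""

    def __init__(self, inode_count: int = 8, block_count: int = 8) -> None:
        self.inode_bitmap = [0] * inode_count
        self.block_bitmap = [0] * block_count
        self.inode_table = [None] * inode_count  # entry: list of block indexes
        self.data_blocks = [b""] * block_count

    def create(self, content: bytes) -> int:
        ino = self.inode_bitmap.index(0)         # first free inode
        self.inode_bitmap[ino] = 1
        blocks = []
        for start in range(0, len(content), BLOCK_SIZE):
            blk = self.block_bitmap.index(0)     # first free data block
            self.block_bitmap[blk] = 1
            self.data_blocks[blk] = content[start:start + BLOCK_SIZE]
            blocks.append(blk)
        self.inode_table[ino] = blocks
        return ino

    def read(self, ino: int) -> bytes:
        return b"".join(self.data_blocks[blk] for blk in self.inode_table[ino])

    def delete(self, ino: int) -> None:
        # Only the bitmaps change; the inode entry and the data stay as they are.
        for blk in self.inode_table[ino]:
            self.block_bitmap[blk] = 0
        self.inode_bitmap[ino] = 0

    def undelete(self, ino: int) -> None:
        # Works only while nothing has reused the inode or its blocks.
        self.inode_bitmap[ino] = 1
        for blk in self.inode_table[ino]:
            self.block_bitmap[blk] = 1


group = ToyGroup()
ino = group.create(b"hello disk")
group.delete(ino)
leftover = group.data_blocks[group.inode_table[ino][0]]  # data is still there
group.undelete(ino)
restored = group.read(ino)
print(leftover, restored)
```

If a new file is created between delete and undelete, it takes over the freed inode and blocks, and the old content is gone.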

5. Inode mapping relationship

An inode is really a struct. As noted above, an inode is 128 bytes, and it stores the addresses of its data blocks in an array. But 128 bytes can hold only a handful of addresses. Does that mean a file behind an inode cannot be large? In practice files reach hundreds of MB or several GB. So how are inodes mapped to data blocks?

The inode does store data block addresses in an array, but the blocks that array points to do not necessarily hold the file's content.

  1. Direct index
    The inode entry points directly at a data block that holds content; blocks 1, 2, 3, 4 in the figure contain the file's content directly.
  2. Second-level index
    The data block that the inode entry points to holds not file content but the indexes of other data blocks, so the file's content is spread over many data blocks.
    This lets a single file grow much larger.
  3. Third-level index
    Same as the second-level index, except that the data block the inode points to holds the indexes of second-level index blocks, which in turn hold the indexes of data blocks.
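
A quick calculation shows how much each level buys. The numbers below are assumptions layered on top of this post's 4 KB block: each block index takes 4 bytes, and the inode has 12 direct entries as in the ext2 layout. The function name is my own.

```python
BLOCK_SIZE = 4096     # bytes per data block, as in the text
POINTER_SIZE = 4      # assumption: each block index takes 4 bytes
DIRECT_SLOTS = 12     # assumption: ext2-style inode with 12 direct entries

POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # 1024 indexes fit in one block


def reachable_bytes(index_blocks_on_path: int) -> int:
    """Bytes reachable through one inode entry that passes through
    the given number of index blocks before reaching file content."""
    return POINTERS_PER_BLOCK ** index_blocks_on_path * BLOCK_SIZE


direct_bytes = DIRECT_SLOTS * reachable_bytes(0)  # direct index
second_level_bytes = reachable_bytes(1)           # second-level index
third_level_bytes = reachable_bytes(2)            # third-level index

print(direct_bytes, second_level_bytes, third_level_bytes)
```

Under these assumptions the direct entries cover 48 KiB, a single second-level entry covers 4 MiB, and a single third-level entry covers 4 GiB, which is how a 128-byte inode can describe a very large file.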

6. Supplementary knowledge

  1. An inode number directly determines where the inode lies within its partition; inode numbers cannot cross partitions.
  2. Partitioning, grouping, and the internal layout of each block group are all set up and managed for us by the operating system. They are already in place by the time the operating system is installed. Formatting is really the operating system writing the file system's management information into a partition.
  3. The data blocks can be full while inodes remain, and the inodes can be used up while data blocks remain.
    If you keep creating empty files, the inodes can run out while plenty of data blocks are still unused.
    If you create only a few files but keep writing to them, the data blocks can fill up while inodes are still unused.
  4. A directory is also a file. As mentioned above, the file name is not a file attribute and is not stored in the inode struct; it is stored in the directory. But a directory is a file too, so where does the mapping between file names and inodes live?
    In the eyes of the operating system, directories and regular files are both files, and a directory's attributes are stored in an inode struct as well. A directory's data (its content) is the mapping between the names of the files under it and their inodes. We will learn more in the next post, on soft and hard links.
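
For point 3, both counters can be checked on a live system. This sketch uses Python's os.statvfs, which is available on Unix; the helper name usage and the dictionary keys are my own, and some file systems report 0 for the inode counts.

```python
import os


def usage(path: str = ".") -> dict:
    """Block and inode counters of the file system that contains path."""
    st = os.statvfs(path)
    return {
        "total_blocks": st.f_blocks,
        "free_blocks": st.f_bfree,
        "total_inodes": st.f_files,
        "free_inodes": st.f_ffree,
    }


report = usage(".")
print(report)
```

The df command shows the block side of this, and df -i shows the inode side.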

Conclusion

That roughly wraps up our study of the disk file system. Together with the blogger's earlier posts, [Linux] file operation and [Linux] redirection, this completes the picture of the file system: how the operating system opens and closes files, how files are read and written, and how redirection is implemented at the level of file operations.
If you found this article helpful, please consider giving it a like to support the blogger; it really means a lot to me.

Origin blog.csdn.net/m0_72563041/article/details/129781206