Hard disk partition and Linux file systems

1. The physical structure of the hard disk

 

A hard disk is physically made up of:

  1. Platter (disc)
    1. Tracks
    2. Sectors
  2. Actuator arm
  3. Magnetic head
  4. Spindle

 

Track:

When the platter rotates and a head stays fixed in one position, the head traces a circular path on the platter surface. This circular path is called a track. The platter is divided into many concentric tracks centered on the spindle. Tracks cannot be seen with the naked eye because they are simply regions of the surface magnetized in a particular pattern. Data on the hard disk is stored along these tracks, which are numbered from the outside in, starting at 0.

 

Cylinder:

A hard disk can consist of several platters. The tracks with the same number on all platter surfaces together form a cylinder, as shown in the first figure.

Magnetic head:

If a hard disk has N platters, it has 2N surfaces (each platter has two sides) and therefore 2N heads, one head per surface.

Sector:

On early hard disks, each track was divided into equal arcs by lines running radially outward from the center of the platter. These arcs are the sectors of the hard disk. Each sector generally holds 512 bytes. This may seem confusing: the outer tracks are clearly longer than the inner ones, so how can every sector hold 512 bytes? The answer is that early hard disks used a lower storage density on the outer tracks than on the inner ones, so a long outer sector still stored only 512 bytes. As a result, if we know the number of cylinders (that is, the number of tracks per surface), the number of heads, and the number of sectors per track, we can calculate the capacity of the disk: disk size = cylinders * heads * sectors * 512 bytes.

Because the low density of the outer tracks wasted space, later hard disks use the same storage density on inner and outer tracks. Every track is divided into arc segments of 512 bytes each, so the number of sectors per track differs: outer tracks hold more sectors than inner tracks.
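The capacity formula for fixed-geometry disks can be checked with a short calculation. The sketch below is illustrative only: the geometry of 1024 cylinders, 16 heads, and 63 sectors per track is an assumed example, not a value taken from any particular drive, and the function name `chs_capacity` is hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the text


def chs_capacity(cylinders: int, heads: int, sectors_per_track: int) -> int:
    """Return the disk capacity in bytes for a fixed CHS geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE


# Assumed example geometry (hypothetical, for illustration only).
capacity = chs_capacity(cylinders=1024, heads=16, sectors_per_track=63)
print(f"{capacity} bytes = {capacity / 1024**2:.0f} MiB")
```

The formula only applies when every track has the same number of sectors; for disks with uniform density, the drive reports its capacity as a sector count instead.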

 

Disk addressing:

The first step in reading data from a hard disk is addressing, that is, locating the physical position of the data: the cylinder, the head, and the sector where it resides. There are two addressing schemes: CHS (cylinder/head/sector) and LBA (logical block addressing).
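The relationship between the two schemes can be sketched as follows. This uses the conventional CHS-to-LBA mapping, in which sectors are numbered from 1 and cylinders and heads from 0. The geometry values (16 heads, 63 sectors per track) and the function names are assumptions chosen for the example.

```python
def chs_to_lba(c: int, h: int, s: int, heads: int, sectors_per_track: int) -> int:
    """Map a CHS address to a linear block address (sector numbers start at 1)."""
    return (c * heads + h) * sectors_per_track + (s - 1)


def lba_to_chs(lba: int, heads: int, sectors_per_track: int) -> tuple[int, int, int]:
    """Inverse mapping: recover (cylinder, head, sector) from an LBA."""
    c, rem = divmod(lba, heads * sectors_per_track)
    h, s0 = divmod(rem, sectors_per_track)
    return c, h, s0 + 1


HEADS, SPT = 16, 63  # assumed geometry, for illustration only

print(chs_to_lba(0, 0, 1, HEADS, SPT))   # first sector of the disk
print(lba_to_chs(1008, HEADS, SPT))      # first sector of cylinder 1
```

With LBA, software simply counts sectors from 0 and leaves the physical placement to the drive.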

 

2. Partition

Before a hard disk can be used, it must first be partitioned, and each partition must then be formatted with a file system (NTFS, FAT, ext2 / ext3 / ext4).

The smallest unit of a partition is the cylinder: a partition runs from a starting cylinder number to an ending cylinder number.

 

 

As shown in the figure, cylinders 1 to 200 can form one partition, cylinders 201 to 500 another, cylinders 501 to 1000 another, and so on. Cylinder 0 does not belong to any partition. Why?

As described above, sectors are numbered sequentially from the outer edge (cylinder 0) inward, and at first glance no sector is special. However, the first sector of cylinder 0 (logical sector 0, CHS address 0/0/1) is the most important sector on the disk, because it records key information about the entire hard disk. This first sector (512 bytes) contains two major parts:

 (1) MBR (Master Boot Record): the master boot program, which starts the boot process of the operating system. It occupies only 446 bytes.

 (2) DPT (Disk Partition Table): the table that records how the hard disk is partitioned, for example that drive C spans cylinders 1 to 200 and drive D spans cylinders 201 to 500. The partition table is only 64 bytes long, so partitioning is actually very simple: repartitioning just modifies the records in this table. Because each record occupies 16 bytes, the 64-byte table can describe at most four partitions. To allow more partitions, the extended partition was introduced: one of the four records can describe an extended partition, which is then subdivided into logical partitions. The records for the logical partitions are stored in the first sector of the extended partition and are chained like a linked list, so any number of logical partitions can be created. Note that a partition table can hold one to four primary partitions but at most one extended partition.

   For example, a disk might have a primary partition P1 covering cylinders 1 to 200 and an extended partition P2 covering cylinders 201 to 1400. The first sector at the start of the extended partition records the logical partitions carved out of it.
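The layout of the first sector can be made concrete by decoding one. The sketch below is an illustration under stated assumptions: it builds a synthetic 512-byte sector in memory rather than reading a real disk, it uses the conventional 16-byte record layout (status, CHS start, type, CHS end, starting LBA, sector count), and the helper names `make_entry` and `parse_partition_table` are hypothetical. The final two bytes of the sector hold the 0x55AA boot signature, which is why 446 + 64 bytes leave 2 bytes over.

```python
import struct

BOOT_CODE_SIZE = 446      # master boot program
ENTRY_SIZE = 16           # one partition record
ENTRY_COUNT = 4           # 4 * 16 = 64-byte partition table
# status, CHS start (3 bytes), type, CHS end (3 bytes), start LBA, sector count
ENTRY_FORMAT = "<B3sB3sII"


def make_entry(ptype: int, start_lba: int, sectors: int, bootable: bool = False) -> bytes:
    """Build one 16-byte partition record (CHS fields left as zero)."""
    status = 0x80 if bootable else 0x00
    return struct.pack(ENTRY_FORMAT, status, b"\0" * 3, ptype, b"\0" * 3, start_lba, sectors)


def parse_partition_table(sector: bytes) -> list[dict]:
    """Decode the four partition records; skip empty slots (type 0)."""
    partitions = []
    for i in range(ENTRY_COUNT):
        offset = BOOT_CODE_SIZE + i * ENTRY_SIZE
        status, _, ptype, _, start, count = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype != 0:
            partitions.append({"slot": i + 1, "bootable": status == 0x80,
                               "type": ptype, "start_lba": start, "sectors": count})
    return partitions


# Synthetic sector: one Linux primary partition (0x83) and one extended partition (0x05).
table = make_entry(0x83, 2048, 409600, bootable=True) + make_entry(0x05, 411648, 2457600)
table = table.ljust(ENTRY_SIZE * ENTRY_COUNT, b"\0")
sector = bytes(BOOT_CODE_SIZE) + table + b"\x55\xaa"  # last 2 bytes: boot signature

for p in parse_partition_table(sector):
    print(p)
```

Only two of the four slots are filled here, so the parser reports two partitions and skips the empty records.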

 

3. Understand the Linux inode / block / superblock

After a disk is partitioned, each partition must be formatted before the operating system can use it as a file system. Why is formatting necessary?

Each operating system stores different file attributes and permissions. To hold the data these files require, the partition must be formatted into a file system format (filesystem) that the operating system can use. Different operating systems support different file systems.

Traditionally, one partition could be formatted as only one file system, so a file system and a partition were effectively the same thing. With newer technologies such as LVM and software disk arrays (software RAID), a single partition can be split into several file systems (LVM), and several partitions can be combined into one file system (LVM, RAID). For this reason, we no longer say that formatting applies to a partition. Instead, a file system is usually defined as a unit of data that can be mounted, rather than as a partition.

A file system typically stores a file's attributes and its contents in different places: attributes and permissions go into the inode, and the actual file contents go into data blocks. In addition, a superblock records overall information about the entire file system, including the total number of inodes and blocks and how many are used and free.

 

Every inode and every block has a number. The roles of these three kinds of data can be summarized as follows:

  • superblock: records overall information about the file system, including the total number of inodes and blocks, the numbers used and free, and the file system format and related information;
  • inode: records the attributes of a file. Each file occupies one inode, which also records the numbers of the blocks that hold the file's data;
  • block: holds the actual contents of a file. If the file is too large for one block, it occupies several blocks.

Because every inode and block is numbered, and each file occupies one inode that lists the block numbers of its data, finding a file's inode tells us where the file's data is stored, and the data can then be read. This approach is efficient: the disk can read all of the data in a short time, so read and write performance is good.

 

 

The figure above illustrates inodes and blocks. The file system is formatted into inodes and blocks. Suppose the attributes and permissions of a file are stored in inode number 4, and that inode records that the file's data is stored in the four blocks numbered 2, 7, 13, and 15. The operating system can then schedule the disk reads accordingly and read the contents of all four blocks in one pass. This access method is called indexed allocation.
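Indexed allocation can be modeled with a toy example. The sketch below is not how a real file system is implemented; it uses in-memory dictionaries to stand in for the disk and the inode table, reuses the inode number 4 and block numbers 2, 7, 13, 15 from the text, and the block contents are made up for illustration.

```python
# Toy "disk": block number -> block contents (values are made up for illustration).
blocks = {2: b"Hel", 7: b"lo ", 13: b"wor", 15: b"ld"}

# Toy inode table: inode number -> metadata plus the full list of data blocks.
inodes = {4: {"mode": 0o644, "size": 11, "blocks": [2, 7, 13, 15]}}


def read_file(inode_number: int) -> bytes:
    """Read a file by looking up every block number in its inode first."""
    inode = inodes[inode_number]
    # All block numbers are known up front, so the reads can be planned together.
    data = b"".join(blocks[n] for n in inode["blocks"])
    return data[: inode["size"]]


print(read_file(4))
```

The key property is that one lookup of the inode yields the complete block list before any data block is touched.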

Is there another common file system to compare with? Yes: the USB flash drives (flash memory) we use every day generally use the FAT file system. FAT has no inodes, so there is no way to learn all of a file's block numbers at the outset. Instead, each block records the number of the next block, and reading works roughly as shown below:

 

 

In the figure above, we assume that the file's data was written to blocks 1 -> 7 -> 4 -> 15 in that order. The file system cannot learn all four block numbers at once; it has to read the blocks one after another, because only after reading one block does it know where the next one is. If the blocks of a file are scattered too widely, the head cannot read all of the data in a single rotation of the disk, so the disk has to turn several more times before the whole file has been read.
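The chained read can be sketched as a toy model. It follows the simplified picture in the text, where each block stores the number of the next one (in real FAT the links are kept in a separate allocation table, but the one-step-at-a-time traversal is the same). The block contents and the `END_OF_CHAIN` marker are assumptions made for the example.

```python
END_OF_CHAIN = -1  # hypothetical end marker for this toy model

# Toy "disk": block number -> (contents, number of the next block).
blocks = {
    1: (b"Hel", 7),
    7: (b"lo ", 4),
    4: (b"wor", 15),
    15: (b"ld", END_OF_CHAIN),
}


def read_chain(first_block: int) -> tuple[bytes, list[int]]:
    """Follow the chain block by block; each read reveals only the next block."""
    data, visited = b"", []
    current = first_block
    while current != END_OF_CHAIN:
        contents, next_block = blocks[current]
        data += contents
        visited.append(current)
        current = next_block
    return data, visited


print(read_chain(1))
```

Unlike an inode lookup, the reader here cannot know block 15 is part of the file until it has already visited blocks 1, 7, and 4.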

You have probably heard of "defragmentation". Defragmentation is needed when the blocks of a file are written too far apart, which degrades read performance. Defragmenting gathers the blocks that belong to the same file together so that the data is easier to read. FAT file systems therefore need to be defragmented regularly. Does Ext2 need defragmentation too?

Because Ext2 is an indexed file system, it basically does not need frequent defragmentation. However, if a file system has been in use for a long time, with files frequently deleted, edited, and added, file data can still become too scattered, and defragmentation may then be worth considering.

An inode records a file's attributes and permissions, and a block holds the file's actual contents. The inodes and blocks of a file system are laid out when it is formatted. Unless the file system is reformatted (or its size is changed with a command such as resize2fs), the number of inodes and blocks stays fixed. But consider a file system of several hundred GB: putting all inodes and all blocks together would be unwise, because the sheer number of them would be hard to manage. For this reason, a file system is divided into multiple block groups when it is formatted, and each block group has its own independent inode / block / superblock system.

It is a bit like an army: a battalion is divided into several companies, each company has its own liaison system, and each one ultimately reports accurate information up to battalion headquarters. Dividing things into groups makes them easier to manage. Overall, the layout of the file system looks roughly like this:

 

 

In the overall layout, the file system begins with a boot sector, in which a boot manager can be installed. This is an important design: it lets different boot managers be installed at the front of individual file systems instead of overwriting the single MBR of the whole disk, which makes a multi-boot environment possible.

Each block group contains six main parts, described below:

data block

Data blocks hold the contents of files. In Ext2, the supported block sizes are 1K, 2K, and 4K. The block size is fixed at format time, and every block is numbered so that inodes can record it. Note that the block size determines both the maximum file system capacity and the maximum size of a single file that the file system can support.

 

 

 

The basic limitations of blocks are as follows:

  1. In principle, the size and number of blocks cannot be changed after formatting (short of reformatting);
  2. Each block can hold data from at most one file;
  3. If a file is larger than a block, it occupies multiple blocks;
  4. If a file is smaller than a block, the remaining space in that block cannot be used by anything else (disk space is wasted).

As the fourth point shows, because each block can hold data from only one file, many very small files on a file system formatted with the largest block size, 4K, can waste a considerable amount of capacity. For example, a 50-byte file stored in a 4K block leaves 4046 bytes of that block unusable.
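The waste can be estimated with a short calculation. The workload used here (10,000 files of 50 bytes each) is an assumed example chosen for illustration; only the block sizes come from the text, and the function name `wasted_bytes` is hypothetical.

```python
import math


def wasted_bytes(file_sizes: list[int], block_size: int) -> int:
    """Total unused space when each file occupies whole blocks."""
    wasted = 0
    for size in file_sizes:
        blocks_used = math.ceil(size / block_size)
        wasted += blocks_used * block_size - size
    return wasted


# Assumed workload: 10,000 small files of 50 bytes each (hypothetical numbers).
files = [50] * 10_000

for block_size in (1024, 2048, 4096):
    waste = wasted_bytes(files, block_size)
    print(f"block size {block_size}: {waste / 1024**2:.1f} MiB wasted")
```

In this assumed workload the files themselves total under 0.5 MiB, so with 4K blocks the wasted space is far larger than the data stored.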

 

 

When does this situation arise?

BBS websites are one example. If a BBS stores each post as a plain text file, and many posts contain nothing more than a short line such as "see title", a great many small files are generated.

Since large blocks can waste a lot of disk capacity, should we simply set the block size to 1K? Not necessarily. With small blocks, a large file occupies more blocks and its inode has to record more block numbers, which can degrade the file system's read and write performance.

 

What is an inode

Files are stored on the hard disk, whose smallest unit of storage is the sector. Each sector stores 512 bytes (0.5 KB).

When the operating system reads the hard disk, it does not read one sector at a time, which would be too inefficient. Instead it reads several consecutive sectors at once, that is, one "block" at a time. A block, made up of several sectors, is the smallest unit of file access. The most common block size is 4 KB, that is, eight consecutive sectors form one block.

File data is stored in blocks, so there must also be a place to store a file's metadata, such as its creator, creation date, and size. The area that stores this metadata is called the inode, short for "index node".

Every file has a corresponding inode, which contains information about that file.

 

inode content

An inode contains a file's metadata, specifically:

  1. The size of the file in bytes
  2. The user ID of the file's owner
  3. The group ID of the file
  4. The file's read, write, and execute permissions
  5. The file's timestamps, of which there are three: ctime is the time the inode last changed, mtime is the time the file contents last changed, and atime is the time the file was last accessed
  6. The link count, that is, the number of file names that point to this inode
  7. The location of the file's data blocks

 

The stat command can be used to view a file's inode information.
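The same fields can also be read programmatically. As a minimal sketch, Python's `os.stat` returns the inode metadata listed above; the temporary file is created only for the demonstration, and the actual numbers printed depend on the system.

```python
import os
import stat
import tempfile
import time

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello inode\n")
    path = f.name

info = os.stat(path)
print("inode number:", info.st_ino)
print("size (bytes):", info.st_size)
print("owner uid / gid:", info.st_uid, info.st_gid)
print("permissions:", stat.filemode(info.st_mode))
print("link count:", info.st_nlink)
print("mtime:", time.ctime(info.st_mtime))
print("ctime:", time.ctime(info.st_ctime))
print("atime:", time.ctime(info.st_atime))

os.remove(path)
```

Note that nothing in the result contains the file name; the name was only used to locate the inode.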

 

 

In short, all information about a file except its name is stored in the inode. Why the file name is not included is explained in detail below.

 

inode size

Inodes also consume disk space. When a hard disk is formatted, the operating system automatically divides it into two areas: a data area, which stores file data, and an inode area (inode table), which stores the inodes.

Each inode is typically 128 or 256 bytes. The total number of inodes is fixed at format time, usually one inode per 1 KB or per 2 KB of space. For a 1 GB hard disk with 128-byte inodes and one inode per 1 KB, the inode table is 128 MB, about 12.5% of the entire disk.
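The arithmetic behind the inode table size can be reproduced as below. The sketch assumes binary units (1 GB = 1024 MB, 1 KB = 1024 bytes), under which the table takes exactly one eighth of the disk; the function name `inode_table_size` is hypothetical.

```python
def inode_table_size(disk_bytes: int, inode_size: int, bytes_per_inode: int) -> tuple[int, int, float]:
    """Return (inode count, table size in bytes, fraction of the disk used)."""
    inode_count = disk_bytes // bytes_per_inode
    table_bytes = inode_count * inode_size
    return inode_count, table_bytes, table_bytes / disk_bytes


GIB, MIB, KIB = 1024**3, 1024**2, 1024

count, table, fraction = inode_table_size(1 * GIB, inode_size=128, bytes_per_inode=1 * KIB)
print(f"{count} inodes, {table // MIB} MiB, {fraction:.1%} of the disk")
```

Doubling `bytes_per_inode` to 2 KB halves both the inode count and the table size, which is the trade-off chosen at format time.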

To check the total number of inodes on each partition and how many are in use, use the df command with the -i option (df -i).
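The inode totals can also be queried from a program. The sketch below uses Python's `os.statvfs`, which is available on Unix-like systems only; the mount point "/" is an assumed example, and some file systems report 0 for these fields.

```python
import os

# "/" is used as an example mount point; any mounted path works.
vfs = os.statvfs("/")

total_inodes = vfs.f_files   # total number of inodes in the file system
free_inodes = vfs.f_ffree    # number of free inodes
used_inodes = total_inodes - free_inodes

print("total inodes:", total_inodes)
print("used inodes: ", used_inodes)
print("free inodes: ", free_inodes)
```

A file system can run out of inodes while it still has free blocks, which is why this count is worth checking separately from free space.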

 

 

 

To check the size of each inode, you can use the dumpe2fs command (for example, run dumpe2fs -h on the partition's device and look for the "Inode size" line).

 

 

 

Note:

  dumpe2fs supports only ext2 / ext3 / ext4 file systems.

  For XFS, use xfs_info instead.

 

 

 

inode number

Every inode has a number, and the operating system uses inode numbers to tell files apart.

It is worth repeating that Unix / Linux systems do not use file names internally; they use inode numbers to identify files. To the system, a file name is just an alias or nickname that makes an inode number easier to recognize.

On the surface, a user opens a file by its name. Internally, the system does this in three steps:

  1. The system finds the inode number that corresponds to the file name;
  2. It uses the inode number to obtain the inode information;
  3. It uses the inode information to find the blocks where the file's data is stored, and reads the data.

 

The ls -i command shows the inode number that corresponds to a file name.

 

 

 

Directory files

In Unix / Linux systems, a directory is also a file. Opening a directory actually means opening the directory file.

The structure of a directory file is very simple: it is a list of directory entries (dirent). Each directory entry consists of two parts: the name of a file and the inode number that corresponds to that name.

The ls command lists only the names of the files in a directory.

The ls -i command lists the full contents of the directory file, that is, file names and inode numbers.
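A directory's name-to-inode entries can also be listed from a program. As a minimal sketch, Python's `os.scandir` yields one entry per name, and each entry exposes its inode number; the temporary directory and the file names `a.txt` and `b.txt` are created only for the demonstration.

```python
import os
import tempfile

# Build a small throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as directory:
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(directory, name), "w") as f:
            f.write(name)

    # Each directory entry pairs a file name with an inode number.
    entries = {entry.name: entry.inode() for entry in os.scandir(directory)}

    for name, inode_number in sorted(entries.items()):
        print(inode_number, name)
```

The mapping built here is essentially what a directory file stores: names on one side, inode numbers on the other, and nothing else about the files.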

 

 

 

To view detailed information about a file, the system must use the inode number to access the inode and read the information from it. The ls -l command lists this detailed information.

With this background, directory permissions become easier to understand. The read (r) and write (w) permissions on a directory apply to the directory file itself, that is, they determine what different users may do with the directory file. For example, suppose the permissions of the tmp directory are drwxr-xr-x. The leading d indicates that it is a directory. The first group of three characters, rwx, gives the permissions of the file's owner: read, write, and execute. The second group, r-x, gives the permissions of users in the file's group: read and execute. The third group, r-x, gives the permissions of all other users: read and execute. A process running as a given user can operate on the directory file only with the permissions that user has on it.

Because a directory file contains only file names and inode numbers, a user with read permission alone can obtain only the file names and no other information. The other information is stored in the inodes, and reading the inodes of files in a directory requires execute permission (x) on the directory.

 

Hard links

In general, file names and inode numbers have a one-to-one relationship: each inode number corresponds to one file name. However, Unix / Linux systems allow multiple file names to point to the same inode number.

This means that the same content can be accessed through different file names, and a change to the content is visible through every name. Deleting one file name, however, does not affect access through the other names. This arrangement is called a "hard link".

The ln command creates a hard link:

ln source_file target_file

 

 

After this command runs, the source file and the target file have the same inode number and point to the same inode. The inode contains a field called the link count, which records the total number of file names pointing to the inode. Creating the hard link increases it by 1.

Conversely, deleting a file name decreases the inode's link count by 1. When the count reaches 0, no file name points to the inode any longer, and the system reclaims the inode number and its data blocks.
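The link count behavior can be observed directly. The sketch below uses Python's `os.link`, which creates a hard link in the same way as ln; the file names are hypothetical and everything happens inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    source = os.path.join(directory, "source.txt")
    target = os.path.join(directory, "target.txt")

    with open(source, "w") as f:
        f.write("shared content")

    links_before = os.stat(source).st_nlink
    os.link(source, target)  # same effect as: ln source.txt target.txt

    same_inode = os.stat(source).st_ino == os.stat(target).st_ino
    links_after = os.stat(source).st_nlink

    os.remove(source)  # removing one name leaves the other usable
    with open(target) as f:
        content_via_target = f.read()
    links_final = os.stat(target).st_nlink

print(links_before, links_after, links_final, same_inode, content_via_target)
```

The link count goes from 1 to 2 when the second name is created and back to 1 when the first name is removed, while the content stays reachable throughout.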

 

Soft links

Besides hard links, there is another special case.

Files A and B have different inode numbers, but the content of file A is the path to file B. When file A is read, the system automatically redirects the access to file B, so whichever file is opened, it is file B that is ultimately read. In this case, file A is called a "soft link" or "symbolic link" to file B.

This means that file A depends on the existence of file B. If file B is deleted, opening file A produces the error "No such file or directory". This is the biggest difference between soft links and hard links: file A points to the name of file B rather than to its inode number, so the link count of file B's inode does not change.
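The difference from a hard link can be demonstrated with a short experiment. The sketch below uses Python's `os.symlink`, the programmatic counterpart of ln -s, on a Unix-like system; the names `A.txt` and `B.txt` mirror files A and B in the text and exist only in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    file_b = os.path.join(directory, "B.txt")
    file_a = os.path.join(directory, "A.txt")  # will become a soft link to B

    with open(file_b, "w") as f:
        f.write("content of B")

    os.symlink(file_b, file_a)  # same effect as: ln -s B.txt A.txt

    different_inodes = os.lstat(file_a).st_ino != os.stat(file_b).st_ino
    link_text = os.readlink(file_a)          # A's own content: the path to B
    b_link_count = os.stat(file_b).st_nlink  # unchanged by the soft link
    with open(file_a) as f:
        read_through_a = f.read()

    os.remove(file_b)  # A now dangles
    try:
        open(file_a).close()
        dangling_error = None
    except FileNotFoundError as exc:
        dangling_error = exc.strerror

print(different_inodes, b_link_count, read_through_a, dangling_error)
```

`os.lstat` inspects the link itself, while `os.stat` follows it to B, which is why the two calls return different inode numbers.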

 

 

 

The special role of the inode

Because inode numbers are separate from file names, Unix / Linux systems exhibit some distinctive behavior.

(1) Sometimes a file name contains special characters and the file cannot be deleted in the normal way. In that case, deleting the file by its inode number achieves the same result.

(2) Moving or renaming a file changes only the file name and does not affect the inode number.

(3) Once a file has been opened, the system identifies it by inode number and no longer considers the file name. Generally speaking, therefore, the system cannot determine the file name from the inode number.

The third point makes software updates easy: software can be updated without being shut down and without a reboot. The system identifies a running file by its inode number, not its file name. During an update, the new version is written under the same file name but with a new inode, so the running file is unaffected. The next time the software runs, the file name automatically points to the new version, and the inode of the old version is reclaimed.
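The update behavior can be simulated on a Unix-like system. In the sketch below, an open file handle stands in for a running program, and `os.replace` swaps a new file into place under the old name; the file names `app.bin` and `app.bin.new` are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "app.bin")      # hypothetical file name
    new_path = os.path.join(directory, "app.bin.new")

    with open(path, "w") as f:
        f.write("version 1")
    old_inode = os.stat(path).st_ino

    running = open(path)  # stands in for a program that is already running

    # Write the new version under a temporary name, then swap it into place.
    with open(new_path, "w") as f:
        f.write("version 2")
    os.replace(new_path, path)
    new_inode = os.stat(path).st_ino

    seen_by_running = running.read()   # still the old inode's data
    running.close()                    # old inode can now be reclaimed
    with open(path) as f:
        seen_by_new_open = f.read()

print(old_inode != new_inode, seen_by_running, seen_by_new_open)
```

The already-open handle keeps reading the old inode, while any new open of the same name reaches the new one.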

 

To sum up:

  1. One inode corresponds to one file, and a file occupies one or more blocks depending on its size;
  2. More precisely, one file corresponds to exactly one inode. A hard link does not create a new file; it only writes a new name-to-inode entry in a directory;
  3. When a file is deleted, its inode and blocks are only marked as available. The file contents in the blocks are not erased; they are overwritten only when a new file needs those blocks.

 

References: articles from the web and "Bird Brother's Linux Private Kitchen"

 

Reference links:

https://www.cnblogs.com/doll-net/p/6090298.html
https://blog.csdn.net/Ohmyberry/article/details/80427492

Origin www.cnblogs.com/hukey/p/11693712.html