Linux root file system and initrd

1 Root file system
In simple terms, the root file system is the first file system mounted in the system.
Filesystem Handling
Like every traditional Unix system, Linux makes use of a system's root filesystem: it is the filesystem that is directly mounted by the kernel during the booting phase and that holds the system initialization scripts and the most essential system programs.
Other filesystems can be mounted either by the initialization scripts or directly by the users on directories of already mounted filesystems. Being a tree of directories, every filesystem has its own root directory. The directory on which a filesystem is mounted is called the mount point. A mounted filesystem is a child of the mounted filesystem to which the mount point directory belongs. For instance, the /proc virtual filesystem is a child of the system's root filesystem (and the system's root filesystem is the parent of /proc). The root directory of a mounted filesystem hides the content of the mount point directory of the parent filesystem, as well as the whole subtree of the parent filesystem below the mount point.
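The hiding rule above can be illustrated with a small Python sketch. It is not how the kernel resolves paths; it only models the idea that the deepest mount point covering a path decides which filesystem serves it. The function name `owning_mount` and the mount list are made up for the illustration.

```python
import posixpath

def owning_mount(path, mount_points):
    """Return the mount point whose filesystem serves `path`.

    The deepest mount point that covers the path wins, which is why a
    mounted filesystem hides whatever the parent holds below that point.
    """
    path = posixpath.normpath(path)
    best = None
    for mp in mount_points:
        mp = posixpath.normpath(mp)
        # "/" covers everything; otherwise match the directory or a descendant.
        covers = mp == "/" or path == mp or path.startswith(mp + "/")
        if covers and (best is None or len(mp) > len(best)):
            best = mp
    return best

# Hypothetical mount table: the root filesystem plus two children.
mounts = ["/", "/proc", "/mnt/loop"]
print(owning_mount("/proc/cpuinfo", mounts))  # served by the /proc filesystem
print(owning_mount("/etc/fstab", mounts))     # served by the root filesystem
```

Anything the root filesystem itself stores under /proc is unreachable while /proc is mounted, because every such path resolves to the child filesystem.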
In short, I think the root file system is a directory structure like any other file system. So what is the difference between the root file system and an ordinary one? The root file system contains the directories and key files that Linux needs in order to start. For example, the init program and its related files must be present when Linux starts; when Linux mounts partitions, it looks in /etc/fstab to find out what to mount; and the root file system also holds the bin directories for many applications. Any file system that includes the files necessary for Linux startup can serve as the root file system.

2 Mount root file system
When Linux starts, after a series of initializations, it must mount the root file system in preparation for finally running the init process.
There are several ways to mount the root file system:
(1) The file system already exists on a hard disk (or a similar device), and the kernel mounts it directly according to the command-line parameter (root=/dev/xxx). This raises a question: how does the kernel find the device named /dev/xxx when the root file system, and therefore /dev, is not available yet? It turns out that the kernel obtains the major and minor device numbers by parsing the device name directly, and with those numbers it can reach the corresponding device driver. This is why init/main.c contains a long root_dev_names list, through which the device number can be looked up from the device name.
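The name-to-number lookup in (1) can be sketched in Python as follows. This is a simplified illustration, not the kernel's code: the table is a small subset I chose for the example, using the classic 16-bit device number layout (major in the high byte, minor in the low byte), and the helper name `name_to_dev` is made up.

```python
import re

# Illustrative subset of base device numbers, encoded as (major << 8) | minor.
# This is not the kernel's full table.
ROOT_DEV_BASES = {
    "ram": 0x0100,  # major 1
    "fd":  0x0200,  # major 2
    "hda": 0x0300,  # major 3, minor 0
    "hdb": 0x0340,  # major 3, minor 64
    "sda": 0x0800,  # major 8, minor 0
    "sdb": 0x0810,  # major 8, minor 16
}

def name_to_dev(root_arg):
    """Translate a root= value such as /dev/hda1 into (major, minor)."""
    name = root_arg[len("/dev/"):] if root_arg.startswith("/dev/") else root_arg
    match = re.fullmatch(r"([a-z]+)(\d*)", name)
    if match is None or match.group(1) not in ROOT_DEV_BASES:
        raise ValueError(f"unknown root device: {root_arg}")
    base = ROOT_DEV_BASES[match.group(1)]
    # A trailing number (partition or unit) is added to the base device number.
    number = base + int(match.group(2) or 0)
    return number >> 8, number & 0xFF

print(name_to_dev("/dev/hda1"))
print(name_to_dev("/dev/ram0"))
```

No file under /dev is opened at any point: the name is only a key into a table compiled into the kernel, which is why the lookup works before any file system is mounted.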
(2) Mount the root file system from a slower device such as a floppy drive. If the kernel supports ramdisk and determines, while loading the root file system, that it must be mounted from a floppy disk (fdx), it automatically copies the file system image to a ramdisk, generally the device ram0, and then mounts the root file system on ram0. Judging from the source code, if the kernel was compiled without ramdisk support and the boot parameter is root=/dev/fd0, the system mounts the root file system directly on the floppy disk. This should be feasible in theory, only slower (I haven't tried it, so I don't know whether it really behaves that way).
(3) Use initrd to mount the root file system at startup. At first I confused ramdisk with initrd. In fact, a ramdisk is just a block device implemented in RAM, whereas initrd is a mechanism used during the startup process. Before loading Linux, the bootloader can load a relatively small root file system image to a specified location in memory (let's call this memory initrd) and then pass the kernel the starting address and size of initrd as parameters (these parameters can also be compiled into the kernel). The kernel can then temporarily use initrd to mount the root file system during the startup phase.
The original purpose of initrd is to divide kernel startup into two stages: keep only the minimal, most basic startup code in the kernel, and put the support for various hardware devices into the initrd in the form of modules, so that the required modules can be loaded during startup from the root file system mounted from initrd. One advantage is that different hardware can be supported flexibly by changing the content of initrd while the kernel stays unchanged. In the final stage of booting, the root file system may be remounted on another device, or it may not be remounted at all (as is the case with many embedded systems).
The specific implementation process of initrd is as follows: the bootloader loads the root file system image to the specified location in memory and passes the relevant parameters to the kernel. The kernel copies the content of initrd to the ramdisk device ram0, releases the memory that initrd occupied, and mounts the root file system on ram0. This process shows that the kernel needs support for both ramdisk and initrd.
2. An implementation method for the root file system of an embedded system. For systems in which both the kernel and the root file system are stored in flash, the Linux initrd mechanism can generally be used. The process is as described above. The one additional point is to pass root=/dev/ram0 in the boot parameters, so that the root file system mounted from initrd is never switched, because the actual root device is then ram0. In addition, the initrd start-address parameter is a virtual address, which must correspond to the physical address used in the bootloader.
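As a rough illustration of the root=/dev/ram0 rule, the following Python sketch parses a kernel command line and decides whether the initrd would remain the root file system. The real kernel compares device numbers rather than strings; the function names and the two example command lines are assumptions made for this sketch.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameters."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Flags without "=" (such as "rw") are stored as True.
        params[key] = value if sep else True
    return params

def initrd_stays_root(cmdline):
    """True when root= names ram0, so no switch to another device follows."""
    return parse_cmdline(cmdline).get("root") == "/dev/ram0"

# Hypothetical command lines for an embedded board and a desktop PC.
embedded = "console=ttyS0 root=/dev/ram0 rw"
desktop = "ro root=/dev/hda1"
print(initrd_stays_root(embedded), initrd_stays_root(desktop))
```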
3 initrd
3.1
initrd = initial ramdisk, a file system that exists in memory at startup.
The original purpose of initrd is to divide kernel startup into two stages: keep only the minimal, most basic startup code in the kernel, and put the support for various hardware devices into the initrd in the form of modules, so that the required modules can be loaded during startup from the root file system mounted from initrd. One advantage is that different hardware can be supported flexibly by changing the content of initrd while the kernel stays unchanged. In the final stage of booting, the root file system can be remounted on another device.
Do you have to use initrd for Linux startup?
No. If you compile all the required functions into the kernel (not as modules), a single kernel file is enough. What initrd offers is a smaller boot kernel and more flexibility. If your kernel supports a certain file system (such as ext3 or UFS) only as a module, and the driver modules needed during startup (such as jbd) are stored on that same file system, the kernel cannot read the file system to get them, so it can only load these modules from the initrd virtual file system.
Some people will ask: since the kernel cannot read the file system at this point, how does the kernel file itself get loaded into memory? The answer is simple: GRUB is file-system aware and can recognize the common file systems.
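The module situation described above implies a load order: a module's dependencies must be loaded before the module itself. A minimal Python sketch of that ordering follows. The dependency table is hypothetical apart from the ext3 and jbd pairing mentioned in the text, and the sketch assumes the dependencies contain no cycles.

```python
def load_order(module, deps, ordered=None):
    """Return modules in the order they must be loaded, dependencies first.

    Assumes the dependency graph has no cycles.
    """
    if ordered is None:
        ordered = []
    if module in ordered:
        return ordered
    for dep in deps.get(module, ()):
        load_order(dep, deps, ordered)
    ordered.append(module)
    return ordered

# ext3 needs jbd, as in the example from the text.
MODULE_DEPS = {"ext3": ["jbd"]}
print(load_order("ext3", MODULE_DEPS))
```

Every module in the resulting list has to be present inside the initrd, because none of them can be read from the root file system they are needed to mount.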
How is the initrd file generated?
Use the mkinitrd command, which is actually a Bash script:
# file `which mkinitrd`
/sbin/mkinitrd: Bourne-Again shell script text executable
The script first creates an empty 8M file, creates a file system on it, and copies the corresponding files into it.
What does the initrd of a default Red Hat Fedora Core 2 installation look like (the details depend on the system hardware)?
# file initrd-2.6.5-1.358.img
initrd-2.6.5-1.358.img: gzip compressed data, from Unix, max compression
# mv initrd-2.6.5-1.358.img initrd-2.6.5-1.358.gz
# gzip -d initrd-2.6.5-1.358.gz
# ll
-rw-r--r-- 1 root root 8192000 Jan 14 11:32 initrd-2.6.5-1.358
# mkdir /mnt/loop
# mount -o loop initrd-2.6.5-1.358 /mnt/loop
(… modify the file system here as needed …)
# umount /mnt/loop
# cd /boot
# gzip -9 initrd-2.6.5-1.358
# mv initrd-2.6.5-1.358.gz initrd-2.6.5-1.358.img
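The `file` and `gzip -d` steps in the session above can be mimicked in Python. This is a sketch only: it recognizes gzip data by its two-byte magic number and decompresses it in memory, whereas a real initrd would be read from a file under /boot. The function name `unpack_initrd` and the zero-filled stand-in image are assumptions for the example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream

def unpack_initrd(image_bytes):
    """Return the raw file system image, decompressing it if it is gzip data."""
    if image_bytes[:2] == GZIP_MAGIC:
        return gzip.decompress(image_bytes)
    return image_bytes

# Stand-in for a real image: 8192000 zero bytes, the size shown by ll above.
raw_image = bytes(8192000)
packed = gzip.compress(raw_image, compresslevel=9)  # same level as gzip -9
print(len(packed), "->", len(unpack_initrd(packed)))
```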
3.2
The standard answer is: initrd is a temporary root file system that Linux uses during system boot to support a two-stage boot process.
Put more plainly, initrd is a virtual RAM disk holding a root file system. It contains the root directory '/' and the other directories Linux needs to start, such as bin, dev, proc, sbin, and sys, with the necessary executable commands placed in the bin directory.
On a PC or server, the Linux kernel uses this initrd to mount the real root file system and then frees the initrd from memory. In this case initrd is only a transitional stage. It is also possible to keep the initrd and use it directly as the root file system. That happens when there is no hard disk, mostly in ultra-lightweight diskless embedded systems. In fact, most embedded systems now have their own disks, so most of them also use initrd only as a transition.
The initrd boot process
The second-stage boot program, commonly GRUB, copies the kernel into memory, where it is decompressed. The kernel then takes over the CPU and starts executing, and calls the init() function. Note that this init() function is not the later init process! The kernel then calls initrd_load() to load the initrd root file system into memory. initrd_load() calls some other functions to allocate space for the RAM disk, compute the CRC, and so on. The RAM disk is then decompressed and loaded into memory. At this point there is an image of the initrd in memory.
Then the kernel calls mount_root() to set up the real file system on its partition, calls sys_mount() to mount the real root file system, and then chdirs into it.
Finally, the init() function calls run_init_process(), which uses execve to start the init process, and the system enters the init stage.

4 The standard process of Linux system startup
System startup refers to the entire process from the power-on of the computer to the display of the user's login prompt. We will discuss the whole process and some of the content involved here. The process can be mainly divided into two stages: loading the kernel and preparing the running environment, which we discuss separately. The discussion in this section is only based on the i386 hardware architecture, but most of the content is common.

Figure 1 Overview of the boot process
Loading the kernel (loading the kernel into the memory and passing control to it)
Between power-on and the moment the Boot Loader starts working, hardware matters far more than software, so we will not cover that part here. Readers who do care about it need not worry: we will discuss it in the next issue.
This stage is the Boot Loader's main battlefield. It has to load the executable kernel image, together with the extra data the kernel needs to boot, from the storage medium into memory. This is not a simple task, because besides loading from the hard disk, it may also need to load from a boot server on the network or from external media. The variety of file system types poses a further major challenge for loading.
The Boot Loader may also need to change the CPU's operating privilege level before the kernel can be put into operation.
In addition, the Boot Loader performs some other functions, such as obtaining system information from the BIOS or extracting information from the command-line parameters given at startup. Some Boot Loaders also act as a boot selection tool, making it convenient for users to choose between different operating systems.
Boot Loader's responsibilities:
- Determine what to load, which can require the user to make choices
- Load the kernel and related data it may need, such as initrd or other parameters
- Prepare the operating environment for the kernel, for example, let the CPU enter the privileged mode
- Put the kernel into running
The historical changes of Boot Loader:
Early Linux supported only two kinds of Boot Loader: the floppy disk boot sector and Shoelace. Shoelace is a file-system-dependent Boot Loader inherited from Minix, and it supports only the Minix file system. At the time Linux used only Minix as its file system, so this was not a problem. However, the Minix file system cannot store creation, modification, and access times, limits file names to 14 bytes, and so on. As Linux developed, these shortcomings relative to traditional Unix file systems became less and less tolerable, and Minix was no longer suitable as the main Linux file system.
To support other file system implementations, Linux introduced the VFS (Virtual File System). The move quickly met with an enthusiastic response, and a slew of new file system implementations emerged. One of the Minix file system variants, the extended file system Xiafs (named after its author), broke the Minix file name length limit, raising it to 30 characters in one stroke. The competition between file systems was so intense at the time that it was hard to see who would win, or even whether there would be an eventual "winner".
Although much was uncertain, one thing was clear: whichever file system won out in the end, nobody could boot from a hard disk whose root file system was anything other than Minix, because Shoelace supported only the Minix file system. This is how LILO came into being. Supporting multiple file systems in the Boot Loader was too difficult to implement and maintain (the mainstream file systems supported by the kernel at the time already included Minix, the extended file system ext, and Xiafs, some people were porting BSD's FFS, and there was no end in sight), and the Boot Loader should not be a stumbling block for people experimenting with new file systems, so LILO adopted a file-system-independent design.
This design has stood the test of time and has proven very successful. Even today, LILO can boot from hard disks formatted with most of the file systems the kernel supports. However, because ext2 went so long without major change and became the de facto standard, file-system-aware Boot Loaders have gradually become popular.
Although ext2 has met most people's daily needs, file system designers are still developing new file systems featuring a journaling mechanism, and have made considerable progress. Considering the possibility that multiple file system implementations coexist at the same time, the demand for a file system-independent Boot Loader may become strong again.
Initializing the Basic Operating Environment
Once the kernel starts running, it initializes its internal data structures, detects the hardware, and activates the appropriate drivers to prepare the operating environment for application software. This period includes an important operation: the operating environment of application software must have a file system, so the kernel must first mount the root file system. Since our purpose is to introduce the basic process, the hardware initialization details will not be discussed here; they will be covered in detail in the next issue of the magazine.
After hardware initialization is complete, the kernel sets out to create the first process, the initial process. This process is really a continuation of the whole hardware power-on initialization, but since its logic is complete we "normalize" it and treat it as if it had been created like any other process. We also call this initial process the "hardware process". It occupies the first slot in the process descriptor table, so it can be referred to as task[0] or INIT_TASK. This process then creates a new process to execute the init() function. That new process is in fact the first truly useful process in the system, and it is responsible for the next stage of initialization. The initial process (INIT_TASK) itself goes on to execute the idle loop: once kernel initialization is complete, its only task is to consume idle CPU time when no other process is runnable (so the initial process is also called the idle process).
The next stage of initialization is a little easier than the previous one, because a real process now takes over the work, whereas the previous stage was done by hand by the "hardware process". In this stage, the new process created by INIT_TASK initializes the buses and the network, starts the various kernel background threads, then initializes the peripherals and sets the file format. After that it makes the final preparations for entering the system: it initializes the file system, mounts the root file system, opens the /dev/console device, redirects stdin, stdout, and stderr to the console, then searches the file system for an init program and uses the execve() system call to load and execute it. From this point on the system runs in user mode.
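The search for an init program mentioned above can be sketched in Python. The candidate list reflects the fallback order I recall from kernels of this era (/sbin/init, /etc/init, /bin/init, then /bin/sh), so treat it as an assumption rather than a quotation of the source. The helper names and the temporary directory standing in for a root file system are made up for the illustration.

```python
import os
import tempfile

# Fallback order assumed for this sketch; the first executable found wins.
INIT_CANDIDATES = ("/sbin/init", "/etc/init", "/bin/init", "/bin/sh")

def find_init(root):
    """Return the first candidate that exists and is executable under root."""
    for candidate in INIT_CANDIDATES:
        full = os.path.join(root, candidate.lstrip("/"))
        if os.path.isfile(full) and os.access(full, os.X_OK):
            return candidate
    return None

def make_executable(root, path):
    """Create a tiny executable file at `path` inside the fake root."""
    full = os.path.join(root, path.lstrip("/"))
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(full, 0o755)

with tempfile.TemporaryDirectory() as demo_root:
    make_executable(demo_root, "/bin/sh")
    print(find_init(demo_root))   # only /bin/sh exists so far
    make_executable(demo_root, "/sbin/init")
    print(find_init(demo_root))   # /sbin/init now takes priority
```

If no candidate is found, the sketch returns None; a real kernel has nothing left to run at that point and stops booting.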
Mounting the root filesystem
In order to mount a filesystem, the kernel needs to: 1. know on which storage medium the root filesystem is located; and 2. have a driver to access that medium.
Most commonly, the root filesystem is an ext2 filesystem located on an IDE hard drive. The operation required in this case is simple: just pass the device number as a parameter to the kernel, since IDE device drivers are usually compiled into the kernel.
The problem becomes harder if the kernel does not have drivers for the relevant media. This is not uncommon; consider the "generic" kernel used by Linux installation disks. If the kernel included device drivers for all supported hardware, it would be enormous, and some drivers would interfere with other devices while probing the hardware.
This problem can be solved by the initrd mechanism, which allows a RAM filesystem to be used before the actual root filesystem is mounted. In addition to the above two reasons, the introduction of initrd can also solve the problem of dynamic synthesis of the kernel. (See Reference 1 for details.)
However, we should note that initrd is not always present in the startup process. It can be thought of as a plug-in that is added to the startup process to solve the problems above, as shown in Figure 1. A Linux system does not have to use it in order to start.
Why introduce initrd?
The kernel image must be loaded during the Linux startup process. During this process, some factors must be considered:
First, the kernel image cannot be too large. Due to various hardware and compatibility constraints, the Linux kernel image cannot be too large, but this is not easy to do. The core part of the Linux kernel itself is not small; and the drivers that will be used must also be added.
Second, as many hardware devices as possible must be supported. One important job during the boot process is mounting the root filesystem: since all further data and application software live on it, the kernel must be able to access it. For the average user, who uses an ext2 partition on an IDE hard drive as the root filesystem, there should be no problem, because the drivers for both the IDE hard disk and the ext2 file system will certainly be included in the kernel image itself. However, there are special cases. For example, if we want to release an installation CD for a Linux system, the CD-ROM driver is not necessarily included in the kernel. (Some people may wonder: the kernel image on the CD has already been read in, so why can't the kernel access the CD? Note that the kernel image is read by the Boot Loader, and the kernel does not have the Boot Loader's capabilities.) Without a CD-ROM driver, how can we install the software packages on the CD onto the user's computer? Precompile the driver into the kernel? That sounds good, but if we had other installation media besides CDs, all those drivers would make the kernel image huge.
Moreover, there is a more serious problem: the various drivers are likely to conflict. In the past, when ISA devices dominated the market, such conflicts were almost inevitable.
The solution at that time was for the distributor to provide several pre-compiled kernels supporting different devices, put each kernel on a floppy disk, and hand them to the user along with the distribution package. The user chose the floppy disk with the appropriate kernel to boot from. Alternatively, users were given tools for making boot disks and made their own boot disks before installation. Of course, neither approach is satisfactory.
The only hope is to use a modular mechanism. When the kernel starts, the corresponding module is called to load the driver, and then access the root file system. Whether it is to further analyze the device through the kernel or obtain the configuration information directly from the user, the method of configuring first and then loading the module can effectively avoid the occurrence of conflicts.
Besides needing to load the corresponding modules before mounting the root file system at installation time, an already installed system may also need to load some modules before mounting the root file system. This is mainly a matter of configuration: the kernel usually has to be configured differently for different computers.
Ideally, users would configure the build according to their actual situation, recompile the kernel, and complete this work step by step. But few users like this tedious and error-prone work. Moreover, building the kernel requires the corresponding tools, which most users otherwise have no need for.
A monolithic kernel could be compiled directly during installation, but this does not solve the problem very well: first, all the compilation tools are still needed; second, there is a high probability that the task fails because of errors during compilation. Therefore we still have to use the module mechanism. It is very reliable: an error only means that the corresponding module is not loaded, and does not make the whole task fail. Loading a module, as mentioned earlier, also requires getting hold of the module before the root filesystem is mounted.
Based on the above reasons, Linux introduced the initrd mechanism.
What initrd does
initrd allows the system to load a RAM disk at boot time. This RAM disk can be treated as a root filesystem on which programs can run. (This has two meanings: first, the programs reside on it; second, the file system environment in which they run is also on it.) Afterwards, a new root file system can be mounted from another device; the previous root file system (initrd) is moved to a directory and eventually unmounted.
Why use a RAM disk? First, a RAM disk makes it easy to accommodate future changes; second, it keeps the work of the Boot Loader as simple as possible. At boot, in addition to the kernel image, the Boot Loader reads all the related information into memory as a single file, and during startup the kernel treats that file as one contiguous block of memory, that is, as a RAM disk. This is why the mechanism is called the "initial RAM disk", abbreviated initrd.
The initrd is mainly used to divide the system startup into two stages: the kernel of the initial startup only needs to keep the minimal set of drivers, and after that, when the additional modules must be loaded at the startup, they are loaded from the initrd.
Operations performed by initrd
When initrd is used, the typical system startup process becomes:
1) Boot Loader reads the kernel image and initrd file
2) The kernel converts the initrd file into a "normal" RAM disk and releases the memory occupied by the initrd file.
3) initrd is used as the root file system and is mounted read-write.
4) /linuxrc is executed (it can be any executable, including scripts; it executes as uid 0 and can do basically everything an init program can do)
5) linuxrc mounts the "real" root filesystem
6) linuxrc places the root filesystem under the root directory via the pivot_root system call.
7) The common startup process (such as calling /sbin/init) starts to execute.
8) Unmount the initrd file system.
Note that this is a typical flow. In fact, the initrd mechanism can be used in two ways: either as an ordinary root file system, in which case the fifth and sixth steps can be skipped and /sbin/init can be executed directly (our test system uses this method); or as a staging environment through which the kernel can continue to mount the "real" root filesystem.
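The two usage modes can be summarized with a small Python model of the numbered steps. It only selects which steps run in each mode; nothing is actually booted. Skipping steps 5 and 6 follows the text above. Dropping step 8 as well when the initrd is the final root is my own assumption, since a root file system that stays in use is never unmounted.

```python
STEPS = {
    1: "boot loader reads the kernel image and the initrd file",
    2: "kernel turns the initrd file into a normal RAM disk and frees its memory",
    3: "initrd is mounted read-write as the root file system",
    4: "/linuxrc is executed",
    5: "linuxrc mounts the real root file system",
    6: "linuxrc switches to it with pivot_root",
    7: "the usual startup (for example /sbin/init) begins",
    8: "the initrd file system is unmounted",
}

def boot_steps(initrd_is_final_root):
    """Return the step numbers that run in the chosen mode."""
    # Steps 5 and 6 are skipped per the text; step 8 is an assumption.
    skipped = {5, 6, 8} if initrd_is_final_root else set()
    return [number for number in sorted(STEPS) if number not in skipped]

for final_root in (False, True):
    print("initrd as final root:" if final_root else "initrd as staging root:")
    for number in boot_steps(final_root):
        print(f"  {number}) {STEPS[number]}")
```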


Origin http://10.200.1.11:23101/article/api/json?id=327074482&siteId=291194637