Linux file system initialization process (4) --- load initrd (middle)

First, the purpose

    The previous article described the CPIO-format initrd file in detail. This article analyzes how the initrd file is loaded and parsed, from the perspective of the source code.

    The initrd file and the Linux kernel are normally stored on disk. During system startup, the bootloader loads the kernel and the initrd from disk into designated memory regions. The kernel then reads and parses the initrd file and creates new directories, regular files, symbolic links, and special files in the VFS, which at this point contains only the root directory of rootfs. In this way, the VFS grows from the root directory "/" into a large, leafy tree.

 

Second, the function call process

 

    The loading of the initrd is implemented in init/initramfs.c. To make the process easier to follow, Figure 1 shows the call relationships among the key functions. Note that because the rootfs_initcall() macro registers the populate_rootfs() function in the initcallrootfs section, populate_rootfs() is called implicitly when do_initcalls() runs.

                                figure 1

Three, initcall introduction

 

    Linux defines a group of special initcall sections that store function pointers; during Linux initialization, do_initcalls() is called to execute the functions in these sections in turn. For details about these sections, refer to the vmlinux.lds.S linker script.

    Kernel code can use the following set of macros to register function pointers in the initcall sections. The sections are divided into 8 levels, initcall0 through initcall7: initcall0 has the highest priority, initcall7 has the lowest, and higher-priority sections are executed first. The priority of the initcallrootfs section lies between levels 5 and 6.

#define __define_initcall(fn, id) \
    static initcall_t __initcall_##fn##id __used \
    __attribute__((__section__(".initcall" #id ".init"))) = fn

#define early_initcall(fn)          __define_initcall(fn, early)

#define pure_initcall(fn)           __define_initcall(fn, 0)

#define core_initcall(fn)           __define_initcall(fn, 1)
#define core_initcall_sync(fn)      __define_initcall(fn, 1s)
#define postcore_initcall(fn)       __define_initcall(fn, 2)
#define postcore_initcall_sync(fn)  __define_initcall(fn, 2s)
#define arch_initcall(fn)           __define_initcall(fn, 3)
#define arch_initcall_sync(fn)      __define_initcall(fn, 3s)
#define subsys_initcall(fn)         __define_initcall(fn, 4)
#define subsys_initcall_sync(fn)    __define_initcall(fn, 4s)
#define fs_initcall(fn)             __define_initcall(fn, 5)
#define fs_initcall_sync(fn)        __define_initcall(fn, 5s)
#define rootfs_initcall(fn)         __define_initcall(fn, rootfs)
#define device_initcall(fn)         __define_initcall(fn, 6)
#define device_initcall_sync(fn)    __define_initcall(fn, 6s)
#define late_initcall(fn)           __define_initcall(fn, 7)
#define late_initcall_sync(fn)      __define_initcall(fn, 7s)
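To see what __define_initcall() produces, the following small user-space sketch rebuilds the two names that the macro generates with the # and ## operators. INITCALL_SECTION, INITCALL_SYMBOL, STR and the functions are hypothetical helpers introduced only for this illustration; they are not kernel macros.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers (not part of the kernel): they expose, as strings,
 * the section name and the variable name built by __define_initcall(). */
#define INITCALL_SECTION(id)     ".initcall" #id ".init"
#define STR_(x)                  #x
#define STR(x)                   STR_(x)
#define INITCALL_SYMBOL(fn, id)  STR(__initcall_##fn##id)

/* Section that rootfs_initcall(fn) places its function pointer in. */
const char *rootfs_section(void)
{
    return INITCALL_SECTION(rootfs);
}

/* Name of the pointer variable created by rootfs_initcall(populate_rootfs). */
const char *rootfs_symbol(void)
{
    return INITCALL_SYMBOL(populate_rootfs, rootfs);
}

/* Sanity check of both expansions. */
void check_expansions(void)
{
    assert(strcmp(rootfs_section(), ".initcallrootfs.init") == 0);
    assert(strcmp(rootfs_symbol(), "__initcall_populate_rootfsrootfs") == 0);
}
```

The id argument is used twice: once stringified into the section name, and once pasted onto the variable name, so the same function can be registered at different levels without a name clash.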

 

  Using the initcall macro of the appropriate priority, kernel code can easily register a function pointer, which is stored in the corresponding initcall section; finally, do_initcalls() executes the functions in the sections in priority order. The code is implemented as follows:

 

/* Start address of each initcall level; __initcall_end marks the end of level 7. */
static initcall_t *initcall_levels[] __initdata = {
    __initcall0_start,
    __initcall1_start,
    __initcall2_start,
    __initcall3_start,
    __initcall4_start,
    __initcall5_start,
    __initcall6_start,
    __initcall7_start,
    __initcall_end,
};

int __init_or_module do_one_initcall(initcall_t fn)
{
    int ret;
    ...
    ret = fn();    /* call the registered function */
    ...
    return ret;
}

static void __init do_initcall_level(int level)
{
    initcall_t *fn;
    ...
    /* Run every function pointer stored between this level and the next. */
    for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++)
        do_one_initcall(*fn);
}

static void __init do_initcalls(void)
{
    int level;

    for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++)
        do_initcall_level(level);
}
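The level-by-level iteration can be tried out in user space. The sketch below is only a model: an ordinary array stands in for the initcall sections laid out by the linker, and pointers into that array stand in for the __initcallN_start symbols. The init functions, the order log and record() are hypothetical names, and the model assumes that the rootfs entry sits between the level 5 and level 6 entries, as described above.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*initcall_t)(void);

/* Execution log, so the call order can be inspected afterwards. */
static int order[8];
static size_t n_run;

static int record(int id)
{
    assert(n_run < ARRAY_SIZE(order));
    order[n_run++] = id;
    return 0;
}

/* Hypothetical init functions; the recorded number encodes the level. */
static int core_init(void)   { return record(1); }
static int fs_init(void)     { return record(5); }
static int rootfs_init(void) { return record(56); } /* between 5 and 6 */
static int device_init(void) { return record(6); }
static int late_init(void)   { return record(7); }

/* Stand-in for the initcall sections, already sorted by level. */
static initcall_t initcalls[] = {
    core_init, fs_init, rootfs_init, device_init, late_init,
};

/* Stand-in for the __initcallN_start symbols: where each level begins. */
static initcall_t *initcall_levels[] = {
    &initcalls[0],  /* level 0 (empty) */
    &initcalls[0],  /* level 1 */
    &initcalls[1],  /* level 2 (empty) */
    &initcalls[1],  /* level 3 (empty) */
    &initcalls[1],  /* level 4 (empty) */
    &initcalls[1],  /* level 5, followed by the rootfs entry */
    &initcalls[3],  /* level 6 */
    &initcalls[4],  /* level 7 */
    &initcalls[5],  /* end marker, one past the last entry */
};

static void do_initcall_level(size_t level)
{
    initcall_t *fn;

    for (fn = initcall_levels[level]; fn < initcall_levels[level + 1]; fn++)
        (*fn)();
}

void do_initcalls(void)
{
    size_t level;

    for (level = 0; level + 1 < ARRAY_SIZE(initcall_levels); level++)
        do_initcall_level(level);
}
```

In this model there is no separate array slot for rootfs: its entry simply lies in the range that ends at the start of level 6, so it runs after the level 5 function and before the level 6 function.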

 

Back to the topic of loading the initrd: at the end of init/initramfs.c, the populate_rootfs() function is registered with the rootfs_initcall macro. From the analysis above, we know that this is the entry point for loading the initrd file. Let's analyze what this function does.

rootfs_initcall(populate_rootfs);

Fourth, load the initrd file

 

    During system startup, the bootloader loads the initrd into the memory region that starts at address initrd_start and ends at address initrd_end.

    populate_rootfs() calls unpack_to_rootfs() to read and parse the initrd file from memory. From the CPIO format, we know that the initrd file consists of many segments, each made up of a file header, a file name, and a file body, so the parser can process the initrd file as a state machine.
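    As an illustration of the segment layout, here is a minimal user-space sketch that decodes one header and computes where the next segment starts. It assumes the "newc" CPIO layout (the magic "070701" followed by 13 fields of 8 hexadecimal digits, with the name and the body each padded to a 4-byte boundary). The struct cpio_hdr type and the function names are chosen for this example and are not the kernel's own definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CPIO_HDR_LEN 110  /* 6-byte magic + 13 fields of 8 hex digits */

/* Hypothetical struct holding only the fields this sketch needs. */
struct cpio_hdr {
    unsigned long mode;      /* file type and permissions */
    unsigned long nlink;     /* number of links */
    unsigned long body_len;  /* size of the file body */
    unsigned long name_len;  /* size of the file name, including the NUL */
};

/* Convert one 8-digit hexadecimal field to a number. */
static unsigned long hex8(const char *s)
{
    char buf[9];

    memcpy(buf, s, 8);
    buf[8] = '\0';
    return strtoul(buf, NULL, 16);
}

/* Decode a newc header; returns 0 on success, -1 on a bad magic. */
int parse_header(const char *s, struct cpio_hdr *h)
{
    assert(s != NULL && h != NULL);
    if (memcmp(s, "070701", 6) != 0)
        return -1;
    s += 6;
    h->mode     = hex8(s + 1 * 8);   /* field 1: mode */
    h->nlink    = hex8(s + 4 * 8);   /* field 4: nlink */
    h->body_len = hex8(s + 6 * 8);   /* field 6: file size */
    h->name_len = hex8(s + 11 * 8);  /* field 11: name size */
    return 0;
}

/* Offset of the next header, measured from the start of this one. */
unsigned long next_header(const struct cpio_hdr *h)
{
    unsigned long off = CPIO_HDR_LEN + h->name_len;

    off = (off + 3) & ~3UL;   /* name is padded to a 4-byte boundary */
    off += h->body_len;
    return (off + 3) & ~3UL;  /* body is padded to a 4-byte boundary */
}
```

    For a regular file named "motd" (name size 5) with a 6-byte body, the next header starts 124 bytes after the current one: 110 + 5 is rounded up to 116, and 116 + 6 is rounded up to 124.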

    The parser defines the following 8 states: Start (initial state), Collect (accumulate data in a buffer, for example the name and body of a symbolic link), GotHeader (parse the file header), SkipIt (skip the rest of the current segment), GotName (obtain the file name and create the new file), CopyFile (write the file contents), GotSymlink (create a new symbolic link), and Reset (termination state).

 

static __initdata int (*actions[])(void) = {
    [Start]     = do_start,
    [Collect]   = do_collect,
    [GotHeader] = do_header,
    [SkipIt]    = do_skip,
    [GotName]   = do_name,
    [CopyFile]  = do_copy,
    [GotSymlink]    = do_symlink,
    [Reset]     = do_reset,
};
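The table-driven style can be demonstrated with a much smaller state machine. The sketch below is not the kernel parser: it handles a toy format invented for this example, in which each record is one length byte followed by that many payload bytes, and a zero length byte ends the stream. It borrows the state names Start, CopyFile and Reset and assumes the convention that an action returns 0 to keep running and nonzero to stop; out, records and the other globals are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum state { Start, CopyFile, Reset };

static enum state state;             /* current state, initially Start */
static const unsigned char *victim;  /* next unread input byte */
static size_t byte_count;            /* unread bytes left in this chunk */
static size_t body_len;              /* payload bytes still expected */
static char out[64];                 /* collected payload */
static size_t out_len;
static int records;                  /* completed records */

static void eat(size_t n)
{
    victim += n;
    byte_count -= n;
}

/* Read the length byte of the next record. */
static int do_start(void)
{
    if (byte_count == 0)
        return 1;                    /* chunk exhausted: wait for more input */
    body_len = *victim;
    eat(1);
    state = body_len ? CopyFile : Reset;
    return 0;
}

/* Copy as much of the payload as the current chunk contains. */
static int do_copy(void)
{
    size_t n = body_len < byte_count ? body_len : byte_count;

    assert(out_len + n <= sizeof(out));
    memcpy(out + out_len, victim, n);
    out_len += n;
    eat(n);
    body_len -= n;
    if (body_len)
        return 1;                    /* record continues in the next chunk */
    records++;
    state = Start;
    return 0;
}

/* Terminator seen: stop the loop. */
static int do_reset(void)
{
    return 1;
}

static int (*actions[])(void) = {
    [Start]    = do_start,
    [CopyFile] = do_copy,
    [Reset]    = do_reset,
};

/* Feed one chunk of input; returns the number of bytes consumed. */
size_t write_buffer(const unsigned char *buf, size_t len)
{
    victim = buf;
    byte_count = len;
    while (!actions[state]())
        ;
    return len - byte_count;
}
```

Because the state is kept in a global variable between calls, a record that is split across two input chunks is resumed in the CopyFile state when the next chunk arrives. Adding a state only requires a new enum value, a new action function, and one more table entry.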

 

 

    To make the parsing of the initrd file easier to visualize, Figure 2 below shows the state transition diagram.

    It can be seen from the figure that entries are divided into symbolic links and non-symbolic links. A symbolic link is a special file: its body is the path name of the link target, so the parser collects the name and the body together before creating the link. Hard links, that is, several path names that share one inode, also need special handling: only the first entry for an inode needs to store the real data, and later entries are linked to the path name of the first one. The path name of the first entry therefore has to be cached. The cache is a hash table, which is why hash table operations appear frequently in the code that creates files.

    The detailed analysis process of the initrd file is as follows:

    1. S0 (Start): Initial state; initialize some global variables;

    2. S1 (Collect): Gather data into a buffer, for example the file name and file body of a symbolic link;

    3. S2 (GotHeader): Obtain the file header information according to the definition of the CPIO format;

    4. S3 (SkipIt): Skip the rest of the current CPIO segment and continue with the next segment;

    5. S4 (GotName): Get the file name and create a new file in the VFS;

    6. S5 (CopyFile): Write the file contents to the newly created file;

    7. S6 (GotSymlink): Create a new symbolic link;

    8. S7 (Reset): Finish the current CPIO segment, after which processing continues with the next segment.

 

    It can also be seen from the figure that because directories and special files have no file contents, they skip the S5 state and go directly to the S3 state.

                                                                     figure 2

Five, summary

 

    As the analysis above shows, the kernel parses the initrd file and uses sys_mkdir(), sys_open(), sys_mknod(), sys_symlink() and other system calls to create new directories, regular files, special files, and symbolic links. At this point, the VFS has grown from only the root directory "/" into a large tree with rich content.
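    The same kinds of objects can be created from user space with the POSIX counterparts of these calls. The sketch below is not kernel code: populate() and the file names are hypothetical, it creates the entries under a directory supplied by the caller instead of under "/", and it leaves out mknod() because creating device nodes normally requires root privileges.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "root/name" in buf; returns 0 on success, -1 if it does not fit. */
static int join(char *buf, size_t size, const char *root, const char *name)
{
    int n = snprintf(buf, size, "%s/%s", root, name);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Create a directory, a regular file and a symbolic link below root:
 *   root/etc           directory
 *   root/etc/motd      regular file containing "hello\n"
 *   root/motd          symbolic link to etc/motd
 * Returns 0 on success, -1 on the first error.
 */
int populate(const char *root)
{
    char path[512];
    int fd;

    assert(root != NULL);

    if (join(path, sizeof(path), root, "etc") < 0 || mkdir(path, 0755) < 0)
        return -1;

    if (join(path, sizeof(path), root, "etc/motd") < 0)
        return -1;
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, "hello\n", 6) != 6) {
        close(fd);
        return -1;
    }
    close(fd);

    if (join(path, sizeof(path), root, "motd") < 0)
        return -1;
    return symlink("etc/motd", path) < 0 ? -1 : 0;
}
```

    The order matters in the same way as in the archive: a directory has to exist before the entries inside it can be created.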

 

Origin blog.csdn.net/daocaokafei/article/details/114873107