Analysis of the basic Linux startup process

Linux kernel download URL:

https://www.kernel.org/

At the time of writing, the latest Linux kernel is version 5.15. Commonly used kernel source versions include 4.9, 4.14, and 4.19. The compressed source package for version 4.14 is a little over 90 MB and expands to more than 700 MB after decompression, with a total of 61350 files. With this many files, the tree is slow to browse in Source Insight or VS Code, so it is easier to view the source online.

Browse the Linux kernel source online:

https://elixir.bootlin.com/linux/latest/source

Browse the Android source code online:

http://androidxref.com/

Android is built on top of the Linux kernel, and its source tree is many times larger than the kernel's. Browsing the Android source with a local tool is therefore even harder, and an online code browser is the more practical choice.

A bootloader, such as the commonly used U-Boot, runs before the Linux system starts. This article does not analyze the U-Boot startup and only shows a flowchart of it:

[Figure: U-Boot startup flowchart]

This article mainly explains how the system is initialized by start_kernel, the kernel startup function that runs after control passes from the bootloader to Linux.

The file linux4.14/arch/arm/kernel/head.S contains the final stage of assembly-level initialization. It then jumps to the start_kernel function in main.c, where the Linux startup initialization is done. start_kernel calls nearly 100 functions to complete the initialization of the Linux system. The calls are as follows (the order and details vary between kernel versions):

linux4.14/init/main.c, start_kernel function.

asmlinkage __visible void __init start_kernel(void)
{
 char *command_line;
 char *after_dashes;

 set_task_stack_end_magic(&init_task);
 smp_setup_processor_id();
 debug_objects_early_init();

 cgroup_init_early();

 local_irq_disable();
 early_boot_irqs_disabled = true;
 /*
  * Interrupts are still disabled. Do necessary setups, then
  * enable them.
  */
 boot_cpu_init();
 page_address_init();
 pr_notice("%s", linux_banner);
 setup_arch(&command_line);
 /*
  * Set up the the initial canary and entropy after arch
  * and after adding latent and command line entropy.
  */
 add_latent_entropy();
 add_device_randomness(command_line, strlen(command_line));
 boot_init_stack_canary();
 mm_init_cpumask(&init_mm);
 setup_command_line(command_line);
 setup_nr_cpu_ids();
 setup_per_cpu_areas();
 smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
 boot_cpu_hotplug_init();

 build_all_zonelists(NULL);
 page_alloc_init();

 pr_notice("Kernel command line: %s\n", boot_command_line);
 /* parameters may set static keys */
 jump_label_init();
 parse_early_param();
 after_dashes = parse_args("Booting kernel",
      static_command_line, __start___param,
      __stop___param - __start___param,
      -1, -1, NULL, &unknown_bootoption);
 if (!IS_ERR_OR_NULL(after_dashes))
  parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
      NULL, set_init_arg);
 /*
  * These use large bootmem allocations and must precede
  * kmem_cache_init()
  */
 setup_log_buf(0);
 pidhash_init();
 vfs_caches_init_early();
 sort_main_extable();
 trap_init();
 mm_init();

 ftrace_init();

 /* trace_printk can be enabled here */
 early_trace_init();
 /*
  * Set up the scheduler prior starting any interrupts (such as the
  * timer interrupt). Full topology setup happens at smp_init()
  * time - but meanwhile we still have a functioning scheduler.
  */
 sched_init();
 /*
  * Disable preemption - early bootup scheduling is extremely
  * fragile until we cpu_idle() for the first time.
  */
 preempt_disable();
 if (WARN(!irqs_disabled(),
   "Interrupts were enabled *very* early, fixing it\n"))
  local_irq_disable();
 radix_tree_init();
 /*
  * Allow workqueue creation and work item queueing/cancelling
  * early.  Work item execution depends on kthreads and starts after
  * workqueue_init().
  */
 workqueue_init_early();

 rcu_init();

 /* Trace events are available after this */
 trace_init();

 context_tracking_init();
 /* init some links before init_ISA_irqs() */
 early_irq_init();
 init_IRQ();
 tick_init();
 rcu_init_nohz();
 init_timers();
 hrtimers_init();
 softirq_init();
 timekeeping_init();
 time_init();
 sched_clock_postinit();
 printk_safe_init();
 perf_event_init();
 profile_init();
 call_function_init();
 WARN(!irqs_disabled(), "Interrupts were enabled early\n");
 early_boot_irqs_disabled = false;
 local_irq_enable();

 kmem_cache_init_late();
 /*
  * HACK ALERT! This is early. We're enabling the console before
  * we've done PCI setups etc, and console_init() must be aware of
  * this. But we do want output early, in case something goes wrong.
  */
 console_init();
 if (panic_later)
  panic("Too many boot %s vars at `%s'", panic_later,
        panic_param);

 lockdep_info();
 /*
  * Need to run this when irqs are enabled, because it wants
  * to self-test [hard/soft]-irqs on/off lock inversion bugs
  * too:
  */
 locking_selftest();
 /*
  * This needs to be called before any devices perform DMA
  * operations that might use the SWIOTLB bounce buffers. It will
  * mark the bounce buffers as decrypted so that their usage will
  * not cause "plain-text" data to be decrypted when accessed.
  */
 mem_encrypt_init();

#ifdef CONFIG_BLK_DEV_INITRD
 if (initrd_start && !initrd_below_start_ok &&
     page_to_pfn(virt_to_page((void *)initrd_start)) < min_low_pfn) {
  pr_crit("initrd overwritten (0x%08lx < 0x%08lx) - disabling it.\n",
      page_to_pfn(virt_to_page((void *)initrd_start)),
      min_low_pfn);
  initrd_start = 0;
 }
#endif
 kmemleak_init();
 debug_objects_mem_init();
 setup_per_cpu_pageset();
 numa_policy_init();
 if (late_time_init)
  late_time_init();
 calibrate_delay();
 pidmap_init();
 anon_vma_init();
 acpi_early_init();
#ifdef CONFIG_X86
 if (efi_enabled(EFI_RUNTIME_SERVICES))
  efi_enter_virtual_mode();
#endif
 thread_stack_cache_init();
 cred_init();
 fork_init();
 proc_caches_init();
 buffer_init();
 key_init();
 security_init();
 dbg_late_init();
 vfs_caches_init();
 pagecache_init();
 signals_init();
 proc_root_init();
 nsfs_init();
 cpuset_init();
 cgroup_init();
 taskstats_init_early();
 delayacct_init();

 check_bugs();

 acpi_subsystem_init();
 arch_post_acpi_subsys_init();
 sfi_init_late();

 if (efi_enabled(EFI_RUNTIME_SERVICES)) {
  efi_free_boot_services();
 }
 /* Do the rest non-__init'ed, we're now alive */
 rest_init();

 prevent_tail_call_optimization();
}

Among them, seven functions are more important, namely:

setup_arch(&command_line);

mm_init();

sched_init();

init_IRQ();

console_init();

vfs_caches_init();

rest_init();

1. setup_arch(&command_line)

This is the architecture-specific initialization function. Among other things, it processes the parameters passed in by the bootloader (U-Boot), such as the kernel command line. Each architecture initializes differently, so each architecture provides its own setup_arch function.

linux4.14/arch/arm/kernel/setup.c

[Figure: setup_arch() source listing]

2. mm_init

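This is the memory-management initialization function. It brings up the kernel's memory allocators, for example through mem_init(), kmem_cache_init(), and vmalloc_init().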

linux4.14/init/main.c

[Figure: mm_init() source listing]

3. sched_init

This function initializes the kernel's process scheduler. The Linux kernel implements several scheduling classes, and ordinary processes use CFS, the Completely Fair Scheduler. As a general-purpose operating system, Linux has to balance many requirements, so it cannot decide how long a process runs purely from a fixed priority or a round-robin time slice. As a multi-user system, it must also treat users fairly: a user's processes should not be starved of CPU time just because that user lacks elevated privileges. CFS therefore aims to give each runnable process a fair share of CPU time.

linux4.14/kernel/sched/core.c

[Figure: sched_init() source listing]

4. init_IRQ

This is the interrupt initialization function. It initializes the platform's interrupt controller so that drivers can later register interrupt handlers.

linux4.14/arch/arm/kernel/irq.c

[Figure: init_IRQ() source listing]

5. console_init

Until console_init runs, messages written with printk do not appear on the terminal (unless an early console such as earlycon is enabled). They are stored in the kernel log buffer instead. Once console_init has run and a console is registered, the buffered messages are printed, and from then on printk output is visible on the terminal.

In Linux, a tty is a terminal. The loop between the two symbols __con_initcall_start and __con_initcall_end calls every console initcall function registered between them.

linux4.14/drivers/tty/tty_io.c (later kernels define console_init in kernel/printk/printk.c)

[Figure: console_init() source listing]

6. vfs_caches_init

This function initializes the virtual file system (VFS). File systems such as sysfs and the root file system (rootfs) are mounted at this step. proc is also a kernel virtual file system, used to expose kernel data structures, but it is not set up here.

The VFS hides the differences between the underlying file systems and storage hardware behind a unified interface, which makes the system easier to port and use. Applications can be moved to other platforms without changing the application code.

linux4.14/fs/dcache.c

[Figure: vfs_caches_init() source listing]

The mounting is done mainly in the mnt_init() function:

linux4.14/fs/namespace.c

[Figure: mnt_init() source listing]

7. rest_init

rest_init can be regarded as the last function called by start_kernel. It creates the two most important kernel threads, kernel_init and kthreadd. kernel_init later moves from kernel space to user space and becomes the user-space init process, with PID 1. kthreadd, with PID 2, is a kernel thread dedicated to servicing requests to create other kernel threads. It maintains a linked list of pending requests, and whenever a request is queued on that list, kthreadd creates the corresponding kernel thread.

At this point the init process, the most important process in user space, exists, and all later user-space processes descend from it through fork. On Android, the init process starts the zygote process, which is the parent of all Android application processes.

linux4.14/init/main.c

[Figure: init/main.c source listing]

[Figure: init/main.c source listing]

In the figure above, kernel_init is created on line 400 and kthreadd on line 412. Both are kernel threads. Line 426 signals kernel_init that kthreadd has been created. In other words, although kernel_init is created first, it waits for this notification, so it only continues once kthreadd is in place.

You can refer to the following articles to understand the rest of the functions:

https://www.cnblogs.com/andyfly/p/9410441.html

https://www.cnblogs.com/lifexy/p/7366782.html

https://www.cnblogs.com/yanzs/p/13910344.html#radix_tree:init

Origin blog.csdn.net/weixin_41114301/article/details/132221142