The command line parsing mechanism of the Linux kernel

Linux kernel cmdline parsing

Kernel version number: 4.9.229

Recently I ran into a console-related bug at work, so I took the time to learn how the kernel parses its command line. Using kernel 4.9 as an example, this article summarizes what I learned about the cmdline parsing mechanism.

The cmdline is usually produced jointly by the bootloader and the device tree (dts). As printed in the boot log, it typically looks like this:

Kernel command line: console=ttymxc0,115200 root=/dev/mmcblk1p2 rootwait rw

The kernel sets aside a dedicated section in its init data, namely the .init.setup section.

arch/arm/kernel/vmlinux.lds.S
==>
.init.data : {
		INIT_DATA
		INIT_SETUP(16)
		INIT_CALLS
		CON_INITCALL
		SECURITY_INITCALL
		INIT_RAM_FS
}

include/asm-generic/vmlinux.lds.h
==>
#define INIT_SETUP(initsetup_align)					\
		. = ALIGN(initsetup_align);				\
		VMLINUX_SYMBOL(__setup_start) = .;			\
		*(.init.setup)						\
		VMLINUX_SYMBOL(__setup_end) = .;

__setup_start and __setup_end mark the start and end of the .init.setup section. The .init.setup section stores the table that maps general kernel parameters to their handler functions.

include/linux/init.h defines the structure obs_kernel_param, which pairs a parameter with its handler function. Instances of it are stored in the .init.setup section.

struct obs_kernel_param {
	const char *str;
	int (*setup_func)(char *);
	int early;
};

#define __setup_param(str, unique_id, fn, early)			\
	static const char __setup_str_##unique_id[] __initconst		\
		__aligned(1) = str; 					\
	static struct obs_kernel_param __setup_##unique_id		\
		__used __section(.init.setup)				\
		__attribute__((aligned((sizeof(long)))))		\
		= { __setup_str_##unique_id, fn, early }

#define __setup(str, fn)						\
	__setup_param(str, fn, fn, 0)

#define early_param(str, fn)						\
	__setup_param(str, fn, fn, 1)
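
The same registration trick can be tried in userspace. The following is only a minimal sketch, assuming GCC or Clang on an ELF target linked with GNU ld or lld: for a section whose name is a valid C identifier, these linkers define __start_<name> and __stop_<name> symbols automatically, which play the role of __setup_start and __setup_end. The names demo_param, DEMO_SETUP and the section name demo_setup are hypothetical and do not exist in the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for struct obs_kernel_param. */
struct demo_param {
	const char *str;
	int (*setup_func)(char *);
	int early;
};

/*
 * Hypothetical DEMO_SETUP macro modelled on __setup_param. The explicit
 * alignment mirrors the kernel macro and keeps the entries packed like
 * array elements inside the section.
 */
#define DEMO_SETUP(name, fn, is_early)					\
	static struct demo_param demo_setup_##fn			\
		__attribute__((used, section("demo_setup"),		\
			       aligned(sizeof(long))))			\
		= { name, fn, is_early }

/* Defined by the linker because "demo_setup" is a valid C identifier. */
extern struct demo_param __start_demo_setup[];
extern struct demo_param __stop_demo_setup[];

static int demo_console_setup(char *str) { (void)str; return 1; }
static int demo_earlycon_setup(char *str) { (void)str; return 0; }

DEMO_SETUP("console=", demo_console_setup, 0);
DEMO_SETUP("earlycon", demo_earlycon_setup, 1);

/* Walk the section the way the kernel walks __setup_start..__setup_end. */
static const struct demo_param *demo_find(const char *str)
{
	const struct demo_param *p;

	for (p = __start_demo_setup; p < __stop_demo_setup; p++)
		if (strcmp(p->str, str) == 0)
			return p;
	return NULL;
}
```

Each DEMO_SETUP line adds one entry to the table without any central list having to be edited, which is the point of the section-based design: a driver or subsystem can register its parameter right next to its handler.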

We focus on the console, which is defined in kernel/printk/printk.c

static int __init console_setup(char *str)
{
	...
}
__setup("console=", console_setup);

Substituting __setup("console=", console_setup); into the macros above and expanding gives roughly the following:

static struct obs_kernel_param __setup_console_setup
__used __section(.init.setup) __attribute__((aligned((sizeof(long))))) = {
    .str = "console=",
    .setup_func = console_setup,
    .early = 0
};

__setup_console_setup is placed in the .init.setup section at link time. While the kernel boots, each parameter name on the cmdline is compared with the str field of the obs_kernel_param entries in the .init.setup section.

If they match, console_setup is called to parse the parameter; its argument is the value given to console= on the cmdline.
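
To make the handler's input concrete, here is a small userspace sketch of what a handler might do with the value string it receives, for example "ttymxc0,115200". It is not the kernel's console_setup; the type demo_console and the function demo_console_parse are hypothetical, and the splitting rule (device name up to the first digit, then an index, then options after the comma) is a simplifying assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct demo_console {
	char name[16];
	int index;
	char options[32];
};

static int demo_is_digit(char c)
{
	return c >= '0' && c <= '9';
}

/*
 * Split a value such as "ttymxc0,115200" into name "ttymxc", index 0
 * and options "115200". Returns 0 on success, -1 on malformed input.
 */
static int demo_console_parse(const char *val, struct demo_console *out)
{
	const char *comma = strchr(val, ',');
	size_t dev_len = comma ? (size_t)(comma - val) : strlen(val);
	size_t name_len = 0;
	size_t i;

	/* The device name is everything before the first digit. */
	while (name_len < dev_len && !demo_is_digit(val[name_len]))
		name_len++;
	if (name_len == 0 || name_len >= sizeof(out->name))
		return -1;
	memcpy(out->name, val, name_len);
	out->name[name_len] = '\0';

	/* The remaining characters before the comma form the index. */
	out->index = 0;
	for (i = name_len; i < dev_len; i++) {
		if (!demo_is_digit(val[i]))
			return -1;
		out->index = out->index * 10 + (val[i] - '0');
	}

	/* Everything after the comma is kept as an opaque option string. */
	snprintf(out->options, sizeof(out->options), "%s",
		 comma ? comma + 1 : "");
	return 0;
}
```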


Next, let's see step by step how cmdline parsing starts when start_kernel runs. The key calls are as follows:

asmlinkage __visible void __init start_kernel(void)
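{
	...
	/*
	 * Parse bootargs from the dtb and store it in boot_command_line;
	 * this also parses the early params.
	 */
	setup_arch(&command_line);
	...
	setup_command_line(command_line); /* simply back up and copy boot_command_line */
	...
	/*
	 * Parse the early params. setup_arch has already done this once,
	 * so the call does not repeat the work and returns immediately.
	 */
	parse_early_param();
	/*
	 * Parse the ordinary (non-early) cmdline parameters.
	 */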
	after_dashes = parse_args("Booting kernel",
				  static_command_line, __start___param,
				  __stop___param - __start___param,
				  -1, -1, NULL, &unknown_bootoption);
	if (!IS_ERR_OR_NULL(after_dashes))
		parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
			   NULL, set_init_arg);
	...
}

Let's take a look at these 4 key functions in turn.

setup_arch

This function is architecture-specific; each architecture provides its own setup_arch. This article uses arm as an example.

setup_arch parses the boot tags (here the dtb) to obtain the cmdline and copies it to boot_command_line. It also initializes memory and the page tables.

The key functions are as follows:

void __init setup_arch(char **cmdline_p)
{
	...
	setup_processor();
	/* Find the chosen node in the dtb, parse bootargs and store it in boot_command_line */
	mdesc = setup_machine_fdt(__atags_pointer);
	...
	strlcpy(cmd_line, boot_command_line, COMMAND_LINE_SIZE);
	*cmdline_p = cmd_line;
	...
	/* Parse the early params in the cmdline, reading bootargs from boot_command_line */
	parse_early_param();
	...
	early_paging_init(mdesc);
	...
	paging_init(mdesc);
    ...
}

The call chain of the setup_machine_fdt function is as follows:

setup_machine_fdt
	early_init_dt_scan_nodes
		of_scan_flat_dt(early_init_dt_scan_chosen, boot_command_line);

The chain ends in early_init_dt_scan_chosen, whose job is to find the chosen node in the device tree and parse its bootargs property.

Next, parse_early_param is called to parse the early params in the cmdline; it reads the bootargs from boot_command_line.

void __init parse_early_param(void)
{
    static int done __initdata;
    static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;

    /*
     * Note the done flag: this function may be called several times
     * during one boot, but its body runs only once. done is set to 1
     * at the end, so later calls return immediately.
     */
    if (done)
        return;

    strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
    /* Parsing modifies tmp_cmdline in place, which is why it works on a copy */
    parse_early_options(tmp_cmdline);
    done = 1;
}

==>
void __init parse_early_options(char *cmdline)
{
	parse_args("early options", cmdline, NULL, 0, 0, 0, NULL,
		   do_early_param);
}

The implementation of parse_args is as follows:

/* Args looks like "foo=bar,bar2 baz=fuz wiz". */
char *parse_args(const char *doing,
		 char *args,
		 const struct kernel_param *params,
		 unsigned num,
		 s16 min_level,
		 s16 max_level,
		 void *arg,
		 int (*unknown)(char *param, char *val,
				const char *doing, void *arg))
{
	char *param, *val, *err = NULL;

	/* Chew leading spaces */
	args = skip_spaces(args);

	if (*args)
		pr_debug("doing %s, parsing ARGS: '%s'\n", doing, args);

	while (*args) {
		int ret;
		int irq_was_disabled;

		args = next_arg(args, &param, &val);
		/* Stop at -- */
		if (!val && strcmp(param, "--") == 0)
			return err ?: args;
		irq_was_disabled = irqs_disabled();
		ret = parse_one(param, val, doing, params, num,
				min_level, max_level, arg, unknown);
		if (irq_was_disabled && !irqs_disabled())
			pr_warn("%s: option '%s' enabled irq's!\n",
				doing, param);

		switch (ret) {
		case 0:
			continue;
		case -ENOENT:
			pr_err("%s: Unknown parameter `%s'\n", doing, param);
			break;
		case -ENOSPC:
			pr_err("%s: `%s' too large for parameter `%s'\n",
			       doing, val ?: "", param);
			break;
		default:
			pr_err("%s: `%s' invalid for parameter `%s'\n",
			       doing, val ?: "", param);
			break;
		}

		err = ERR_PTR(ret);
	}

	return err;
}

parse_args walks the cmdline string and calls next_arg repeatedly; each call splits off one space-separated parameter and returns it as a (param, val) pair. For example, console=ttymxc0,115200 yields param=console and val=ttymxc0,115200.
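
The splitting step can be sketched in userspace as follows. This is a simplified stand-in for next_arg, not the kernel function: the name demo_next_arg is hypothetical, and quoted values, which the kernel handles, are deliberately left out. Like the kernel code, it works destructively by writing '\0' bytes into the buffer.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Split the first token off args in place. Sets *param to the name and
 * *val to the value (NULL when the token has no '='), and returns a
 * pointer to the start of the next token.
 */
static char *demo_next_arg(char *args, char **param, char **val)
{
	char *p = args;

	*param = args;
	*val = NULL;

	while (*p && !isspace((unsigned char)*p)) {
		if (*p == '=' && !*val) {
			*p = '\0';		/* terminate the name */
			*val = p + 1;
		}
		p++;
	}
	if (*p)
		*p++ = '\0';			/* terminate the token */
	while (isspace((unsigned char)*p))	/* skip to the next token */
		p++;
	return p;
}
```

Because the '=' is overwritten, the caller must pass a writable copy of the command line, and the original "name=value" text is gone after the call unless it is repaired later.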

parse_one is then called to process each key-value pair.

static int parse_one(char *param,
		     char *val,
		     const char *doing,
		     const struct kernel_param *params,
		     unsigned num_params,
		     s16 min_level,
		     s16 max_level,
		     void *arg,
		     int (*handle_unknown)(char *param, char *val,
				     const char *doing, void *arg))
{
	unsigned int i;
	int err;

	/* Find parameter */
	for (i = 0; i < num_params; i++) {
		if (parameq(param, params[i].name)) {
			if (params[i].level < min_level
			    || params[i].level > max_level)
				return 0;
			/* No one handled NULL, so do it here. */
			if (!val &&
			    !(params[i].ops->flags & KERNEL_PARAM_OPS_FL_NOARG))
				return -EINVAL;
			pr_debug("handling %s with %p\n", param,
				params[i].ops->set);
			kernel_param_lock(params[i].mod);
			param_check_unsafe(&params[i]);
			err = params[i].ops->set(val, &params[i]);
			kernel_param_unlock(params[i].mod);
			return err;
		}
	}

	if (handle_unknown) {
		pr_debug("doing %s: %s='%s'\n", doing, param, val);
		return handle_unknown(param, val, doing, arg);
	}

	pr_debug("Unknown argument '%s'\n", param);
	return -ENOENT;
}

Because parse_early_options passes num_params=0, the lookup loop in parse_one is skipped and control goes straight to the handle_unknown callback at the end, which here is do_early_param, passed in from parse_early_options.

static int __init do_early_param(char *param, char *val,
				 const char *unused, void *arg)
{
	const struct obs_kernel_param *p;

	for (p = __setup_start; p < __setup_end; p++) {
		if ((p->early && parameq(param, p->str)) || /* is early set to 1? */
		    (strcmp(param, "console") == 0 &&
		     strcmp(p->str, "earlycon") == 0)
		) {
			if (p->setup_func(val) != 0)
				pr_warn("Malformed early option '%s'\n", param);
		}
	}
	/* We accept everything at this stage. */
	return 0;
}

do_early_param searches the area from __setup_start to __setup_end, which is the __section(.init.setup) area described above. This area is an array of obs_kernel_param structures, and the function iterates over its entries.

If an entry's early field is 1 and its str matches the parameter name, or if the cmdline contains a console parameter and the entry's str is earlycon, the entry's setup function is called to parse the parameter.

In other words, do_early_param handles the cmdline parameters of features that must be configured as early as possible in the kernel (such as earlyprintk and earlycon).

Entries whose early value is 0 are skipped here; they are handled later, when parse_args is called again.
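
The split between the early pass and the later pass can be sketched like this. It is a simplified userspace model, not kernel code: a plain array stands in for the .init.setup section, all demo_ names are hypothetical, strcmp and strncmp replace parameq and parameqn, and the special console/earlycon case is omitted.

```c
#include <stddef.h>
#include <string.h>

struct demo_param {
	const char *str;
	int (*setup_func)(char *);
	int early;
};

static int demo_calls[2];	/* how often each handler ran */

static int demo_earlycon(char *val) { (void)val; demo_calls[0]++; return 0; }
static int demo_root(char *val) { (void)val; demo_calls[1]++; return 1; }

/* A plain array stands in for the .init.setup section. */
static const struct demo_param demo_table[] = {
	{ "earlycon", demo_earlycon, 1 },	/* like early_param() */
	{ "root=", demo_root, 0 },		/* like __setup() */
};
#define DEMO_TABLE_LEN (sizeof(demo_table) / sizeof(demo_table[0]))

/* Early pass: exact name match, only entries with early == 1. */
static void demo_early_pass(char *param, char *val)
{
	size_t i;

	for (i = 0; i < DEMO_TABLE_LEN; i++)
		if (demo_table[i].early &&
		    strcmp(param, demo_table[i].str) == 0)
			demo_table[i].setup_func(val);
}

/* Late pass: prefix match on the whole token, only early == 0 entries. */
static int demo_late_pass(char *line)
{
	size_t i;

	for (i = 0; i < DEMO_TABLE_LEN; i++) {
		size_t n = strlen(demo_table[i].str);

		if (strncmp(line, demo_table[i].str, n) != 0)
			continue;
		if (demo_table[i].early)
			continue;	/* already handled in the early pass */
		if (demo_table[i].setup_func(line + n))
			return 1;
	}
	return 0;
}
```

Both passes see every parameter; the early flag alone decides in which pass a handler runs, so no handler runs twice.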

setup_command_line

setup_command_line makes two copies of the cmdline and stores them in saved_command_line and static_command_line.

static void __init setup_command_line(char *command_line)
{
	saved_command_line =
		memblock_virt_alloc(strlen(boot_command_line) + 1, 0);
	initcall_command_line =
		memblock_virt_alloc(strlen(boot_command_line) + 1, 0);
	static_command_line = memblock_virt_alloc(strlen(command_line) + 1, 0);
	strcpy(saved_command_line, boot_command_line);
	strcpy(static_command_line, command_line);
}

parse_early_param

parse_early_param makes a copy of boot_command_line and calls parse_args through parse_early_options.

Note: parse_early_param is called twice in total during start_kernel; this is the second time.

/* Arch code calls this early on, or if not, just before other parsing. */
void __init parse_early_param(void)
{
	static int done __initdata;
	static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;

	if (done)
		return;

	/* All fall through to do_early_param. */
	strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
	parse_early_options(tmp_cmdline);
	done = 1;
}

As mentioned above, the done flag has been set to 1, so it will return directly here.
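
The run-once guard and the scratch copy can be tried in isolation. This is a minimal userspace sketch with hypothetical demo_ names; the destructive parser is a stand-in that only overwrites separators, which is enough to show why the original buffer must not be parsed directly.

```c
#include <string.h>

#define DEMO_CMDLINE_SIZE 256

static char demo_boot_command_line[DEMO_CMDLINE_SIZE] =
	"console=ttymxc0,115200 rootwait";
static int demo_parse_runs;	/* counts real parsing passes */

/* Destructive stand-in for parse_early_options(): cuts tokens in place. */
static void demo_parse_destructive(char *cmdline)
{
	char *p;

	for (p = cmdline; *p; p++)
		if (*p == ' ' || *p == '=')
			*p = '\0';
	demo_parse_runs++;
}

static void demo_parse_early_param(void)
{
	static int done;
	static char tmp[DEMO_CMDLINE_SIZE];

	if (done)
		return;

	/* Parse a scratch copy so the original command line stays intact. */
	strncpy(tmp, demo_boot_command_line, sizeof(tmp) - 1);
	tmp[sizeof(tmp) - 1] = '\0';
	demo_parse_destructive(tmp);
	done = 1;
}
```

Calling demo_parse_early_param a second time does nothing, and demo_boot_command_line still holds the full text afterwards.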

parse_args

Moving on: once parse_early_param returns, parse_args is executed.

Note that this is the second time parse_args runs during start_kernel.

In this second run, the params argument of parse_args is no longer NULL; it points to the __param section.

after_dashes = parse_args("Booting kernel",
				  static_command_line, __start___param,
				  __stop___param - __start___param,
				  -1, -1, NULL, &unknown_bootoption);

parse_args will still traverse the cmdline, split the cmdline into param and val key-value pairs, and call parse_one for each pair of parameters. This time parse_one is processed as follows:

  • First, it traverses all kernel_param entries in the __param section and compares each name with param. On a match, it calls the set method of that entry's kernel_param_ops to set the parameter value. This mainly serves the command line parameters of drivers (module parameters).
  • If the parameter that parse_args hands to parse_one is a general kernel parameter, such as console=ttyS0,115200, the traversal of the __param section finds no matching kernel_param. parse_one then falls through to handle_unknown at the end, which here is unknown_bootoption, passed in by parse_args.
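
The lookup-then-fallback behaviour of parse_one can be sketched as follows. This is a reduced userspace model with hypothetical names: demo_kernel_param keeps only a name, a set function and a storage pointer, and the parameter demo_mod.log_level is invented for the example. The name comparison treats '-' and '_' as equal, which is what the kernel's parameq does.

```c
#include <stddef.h>
#include <stdlib.h>

/* '-' and '_' are interchangeable in parameter names. */
static char demo_dash2underscore(char c)
{
	return c == '-' ? '_' : c;
}

static int demo_parameq(const char *a, const char *b)
{
	while (demo_dash2underscore(*a) == demo_dash2underscore(*b)) {
		if (*a == '\0')
			return 1;
		a++;
		b++;
	}
	return 0;
}

/* Hypothetical, reduced version of struct kernel_param. */
struct demo_kernel_param {
	const char *name;
	int (*set)(const char *val, int *storage);
	int *storage;
};

static int demo_set_int(const char *val, int *storage)
{
	char *end;
	long v;

	if (!val)
		return -1;
	v = strtol(val, &end, 10);
	if (end == val || *end != '\0')
		return -1;
	*storage = (int)v;
	return 0;
}

static int demo_log_level;
static const struct demo_kernel_param demo_params[] = {
	{ "demo_mod.log_level", demo_set_int, &demo_log_level },
};

static int demo_unknown_count;
static int demo_unknown(char *param, char *val)
{
	(void)param;
	(void)val;
	demo_unknown_count++;	/* stand-in for unknown_bootoption */
	return 0;
}

static int demo_parse_one(char *param, char *val,
			  const struct demo_kernel_param *params, size_t num,
			  int (*handle_unknown)(char *param, char *val))
{
	size_t i;

	for (i = 0; i < num; i++)
		if (demo_parameq(param, params[i].name))
			return params[i].set(val, params[i].storage);

	if (handle_unknown)
		return handle_unknown(param, val);
	return -1;	/* the kernel returns -ENOENT here */
}
```

With num set to 0 the loop never runs and every parameter goes to the callback, which is the situation in the early pass.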

unknown_bootoption and the obsolete_checksetup function it calls are as follows:

static int __init unknown_bootoption(char *param, char *val,
				     const char *unused, void *arg)
{
	repair_env_string(param, val, unused, NULL);

	/* Handle obsolete-style parameters */
	if (obsolete_checksetup(param)) /* this is what finally parses the early=0 params */
		return 0;

	/* Unused module parameter. */
	if (strchr(param, '.') && (!val || strchr(param, '.') < val))
		return 0;

	if (panic_later)
		return 0;

	if (val) {
		/* Environment option */
		unsigned int i;
		for (i = 0; envp_init[i]; i++) {
			if (i == MAX_INIT_ENVS) {
				panic_later = "env";
				panic_param = param;
			}
			if (!strncmp(param, envp_init[i], val - param))
				break;
		}
		envp_init[i] = param;
	} else {
		/* Command line option */
		unsigned int i;
		for (i = 0; argv_init[i]; i++) {
			if (i == MAX_INIT_ARGS) {
				panic_later = "init";
				panic_param = param;
			}
		}
		argv_init[i] = param;
	}
	return 0;
}

static int __init obsolete_checksetup(char *line)
{
    const struct obs_kernel_param *p;
    int had_early_param = 0;

    p = __setup_start;
    do {
        int n = strlen(p->str);
        if (parameqn(line, p->str, n)) {
            if (p->early) {
                /* early=1: skip it and keep iterating */
                /* Already done in parse_early_param?
                 * (Needs exact match on param part).
                 * Keep iterating, as we can have early
                 * params and __setups of same names 8( */
                if (line[n] == '\0' || line[n] == '=')
                    had_early_param = 1;
            } else if (!p->setup_func) {
                /* no setup_func: warn and stop */
                pr_warn("Parameter %s is obsolete, ignored\n",
                    p->str);
                return 1;
            } else if (p->setup_func(line + n))  /* call setup_func on each matching entry */
                return 1;
        }
        p++;
    } while (p < __setup_end);
    return had_early_param;
}

  1. First, repair_env_string reassembles param and val into the form param=val.
  2. obsolete_checksetup traverses all obs_kernel_param entries in the .init.setup section; if an entry's str matches param, it calls that entry's setup_func to configure the parameter value.
  3. Taken together, parse_one matches each cmdline parameter passed in by parse_args against the __param section and then the .init.setup section; if a name or str matches, the corresponding set or setup function is called to parse or set the parameter value.
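
Steps 1 and 2 can be illustrated with a short userspace sketch. All demo_ names are hypothetical. The repair function only covers the unquoted case, where the value starts exactly one byte after the name's terminator, and strncmp stands in for parameqn.

```c
#include <stddef.h>
#include <string.h>

/* Undo the tokenizer's split: "console\0tty..." -> "console=tty...". */
static void demo_repair_env_string(char *param, char *val)
{
	if (val && val == param + strlen(param) + 1)
		val[-1] = '=';
}

static const char *demo_seen_value;	/* last value handed to the handler */

static int demo_console_setup(char *str)
{
	demo_seen_value = str;
	return 1;	/* non-zero means "handled" */
}

/* Prefix-match line against "console=" and pass the remainder on. */
static int demo_checksetup(char *line)
{
	static const char key[] = "console=";
	size_t n = strlen(key);

	if (strncmp(line, key, n) != 0)
		return 0;
	return demo_console_setup(line + n);
}
```

Because the registered string "console=" includes the '=', the repaired token can be matched by prefix and the handler receives line + n, which points directly at the value.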

With that, parse_args in start_kernel returns and the kernel cmdline has been parsed.

Summary

  • When the kernel is compiled and linked, the __param and .init.setup sections store the parameters the kernel needs together with the mapping table to their handler functions;

  • When the kernel starts, do_early_param processes the parameters the kernel needs early (such as earlyprintk and earlycon);

  • parse_args then walks each cmdline parameter and matches it against the __param and .init.setup sections. On a match, the corresponding handler is called to parse and set the parameter value.

Points to note

  • parse_early_param is called twice:
    • The first time, in setup_arch, it parses the early params (early=1)
    • The second time it returns immediately, because the done flag has already been set to 1
  • parse_args is also executed twice:
    • The first parse_args runs during the first call of parse_early_param and parses the early params
    • The second parse_args is called directly in start_kernel and parses the params with early=0

Origin blog.csdn.net/fly_wt/article/details/125132038