An analysis of the Linux single-user mode patch

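In an earlier article (https://blog.csdn.net/cui841923894/article/details/81568351) I mentioned that the Linux 4.1 kernel supports a single-user mode. In this mode UID and GID are both 0 and user permissions are no longer distinguished (everything runs with root-like privileges). It is intended for small systems such as embedded systems.
Let's look at how this patch implements a single-user kernel.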

Kernel patch analysis

Patch URL: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2813893f8b197a14f1e1ddb04d99bce46817c84a

1. Commit message

kernel: conditionally support non-root users, groups and capabilities
There are a lot of embedded systems that run most or all of their
functionality in init, running as root:root.  For these systems,
supporting multiple users is not necessary.
Many embedded systems do all of their work as root:root. On such systems multi-user support is not really needed (it is dead weight).

This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
non-root users, non-root groups, and capabilities optional.  It is enabled
under CONFIG_EXPERT menu.
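This patch adds a new kernel option, CONFIG_MULTIUSER, which makes support for non-root users, non-root groups and capabilities optional. The option appears under the CONFIG_EXPERT menu.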

When this symbol is not defined, UID and GID are zero in any possible case
and processes always have all capabilities.
When CONFIG_MULTIUSER is disabled (multi-user support turned off), UID and GID are always 0 and every process holds all capabilities.

The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
getgroups, setfsuid, setfsgid, capget, capset.
In addition, the system calls setuid, setregid, setgid, setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, setfsuid, setfsgid, capget and capset are no longer compiled (or supported).

Also, groups.c is compiled out completely.
The file groups.c is no longer compiled either.

In kernel/capability.c, capable function was moved in order to avoid
adding two ifdef blocks.
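In kernel/capability.c, the capable function was moved so that a single #ifdef block is enough instead of two (the #ifdef decides whether the normal implementation is built; when it is not, inline stubs that simply return true are used).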

This change saves about 25 KB on a defconfig build.  The most minimal
kernels have total text sizes in the high hundreds of kB rather than
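low MB.  (The 25k goes down a bit with allnoconfig, but not that much.)
This change saves about 25 KB of kernel text on a defconfig build (the kernel's default config). That matters because the most minimal kernels have a total text size in the high hundreds of kB rather than a few MB. With allnoconfig the saving is slightly below 25 KB, but not by much.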

The kernel was booted in Qemu.  All the common functionalities work.
Adding users/groups is not possible, failing with -ENOSYS.
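The kernel was booted in a virtual machine (QEMU) for verification. The common functionality works. Every operation that involves adding users/groups fails and returns -ENOSYS.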

Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

[[email protected]: coding-style fixes]
Signed-off-by: Iulia Manda <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Tested-by: Paul E. McKenney <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

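2. What the patch changes
The patch touches many lines, and many of the changes serve the same purpose, so only the key parts are covered here.

a. Certain features and architectures gain a dependency on the MULTIUSER config option: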

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index a5ced5c..de2726a 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -328,6 +328,7 @@ config COMPAT
    select COMPAT_BINFMT_ELF if BINFMT_ELF
    select ARCH_WANT_OLD_COMPAT_IPC
    select COMPAT_OLD_SIGACTION
+   depends on MULTIUSER

diff --git a/drivers/staging/lustre/lustre/Kconfig b/drivers/staging/lustre/lustre/Kconfig
index 6725467..62c7bba 100644
--- a/drivers/staging/lustre/lustre/Kconfig
+++ b/drivers/staging/lustre/lustre/Kconfig
@@ -10,6 +10,7 @@ config LUSTRE_FS
    select CRYPTO_SHA1
    select CRYPTO_SHA256
    select CRYPTO_SHA512
+   depends on MULTIUSER
…

b. Function variants are selected with #ifdef CONFIG_MULTIUSER

diff --git a/include/linux/capability.h b/include/linux/capability.h
index aa93e5e..af9f0b9 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -205,6 +205,7 @@ static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a,
               cap_intersect(permitted, __cap_nfsd_set));
 }

+#ifdef CONFIG_MULTIUSER // multi-user support enabled: declare the real functions
 extern bool has_capability(struct task_struct *t, int cap);
 extern bool has_ns_capability(struct task_struct *t,
                  struct user_namespace *ns, int cap);
@@ -213,6 +214,34 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
                      struct user_namespace *ns, int cap);
 extern bool capable(int cap);
 extern bool ns_capable(struct user_namespace *ns, int cap);
+#else // single-user build (CONFIG_MULTIUSER disabled): every capability check simply returns true
+static inline bool has_capability(struct task_struct *t, int cap)
+{
+   return true;
+}
…
+static inline bool ns_capable(struct user_namespace *ns, int cap)
+{
+   return true;
+}
+#endif /* CONFIG_MULTIUSER */

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 2fb2ca2..8b6c083 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -62,9 +62,27 @@ do {                         \
        groups_free(group_info);        \
 } while (0)

-extern struct group_info *groups_alloc(int);
 extern struct group_info init_groups;
+#ifdef CONFIG_MULTIUSER // multi-user build: real declarations; the single-user build below replaces groups_free, in_group_p and in_egroup_p with stubs
+extern struct group_info *groups_alloc(int);
 extern void groups_free(struct group_info *);
+
+extern int in_group_p(kgid_t);
+extern int in_egroup_p(kgid_t);
+#else
+static inline void groups_free(struct group_info *group_info)
+{
+}
+
+static inline int in_group_p(kgid_t grp)
+{
+        return 1;
+}
+static inline int in_egroup_p(kgid_t grp)
+{
+        return 1;
+}
+#endif

diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h
index 2d1f9b6..0ee05da 100644
--- a/include/linux/uidgid.h
+++ b/include/linux/uidgid.h
@@ -29,6 +29,7 @@ typedef struct {
 #define KUIDT_INIT(value) (kuid_t){ value }
 #define KGIDT_INIT(value) (kgid_t){ value }

+#ifdef CONFIG_MULTIUSER  // real __kuid_val and __kgid_val; the single-user stubs below always return 0
 static inline uid_t __kuid_val(kuid_t uid)
 {
    return uid.val;
@@ -38,6 +39,17 @@ static inline gid_t __kgid_val(kgid_t gid)
 {
    return gid.val;
 }
+#else
+static inline uid_t __kuid_val(kuid_t uid)
+{
+   return 0;
+}
+
+static inline gid_t __kgid_val(kgid_t gid)
+{
+   return 0;
+}
+#endif
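
The stub pattern used in these headers can be tried out in user space. The following is a minimal sketch, not kernel code: DEMO_MULTIUSER, demo_kuid_t, demo_capable, demo_kuid_val and demo_may_mount are hypothetical names made up for illustration, and the capability bitmask is a toy stand-in for the kernel's credential handling. Compiled without -DDEMO_MULTIUSER it takes the single-user branch.

```c
#include <assert.h>
#include <stdbool.h>

/* DEMO_MULTIUSER is a hypothetical stand-in for CONFIG_MULTIUSER.
 * Build with -DDEMO_MULTIUSER to select the "real" branch. */

typedef struct { unsigned int val; } demo_kuid_t;  /* stand-in for kuid_t */

#ifdef DEMO_MULTIUSER
/* Multi-user build: answers depend on actual state. */
static unsigned int demo_effective_caps;           /* toy capability bitmask */

static inline bool demo_capable(int cap)
{
    assert(cap >= 0 && cap < 32);
    return (demo_effective_caps >> cap) & 1u;
}

static inline unsigned int demo_kuid_val(demo_kuid_t uid)
{
    return uid.val;
}
#else
/* Single-user build: every check succeeds and every UID reads as 0. */
static inline bool demo_capable(int cap)
{
    (void)cap;
    return true;
}

static inline unsigned int demo_kuid_val(demo_kuid_t uid)
{
    (void)uid;
    return 0;
}
#endif

/* Callers are written once and never mention the config symbol. */
static inline bool demo_may_mount(void)
{
    return demo_capable(21);  /* 21 is the value of CAP_SYS_ADMIN */
}
```

Because the stubs are static inline and return constants, the compiler can drop the permission checks at the call sites entirely, which is presumably where part of the size saving comes from.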

c. init/Kconfig adds the MULTIUSER option, so it shows up in make menuconfig (the prompt is conditional on EXPERT, so it is only visible when EXPERT is enabled)

…
+config MULTIUSER
+   bool "Multiple users, groups and capabilities support" if EXPERT
+   default y
+   help
+     This option enables support for non-root users, groups and
+     capabilities.
+
+     If you say N here, all processes will run with UID 0, GID 0, and all
+     possible capabilities.  Saying N here also compiles out support for
+     system calls related to UIDs, GIDs, and capabilities, such as setuid,
+     setgid, and capset.
+
+     If unsure, say Y here.
+

d. kernel/Makefile adds MULTIUSER support

diff --git a/kernel/Makefile b/kernel/Makefile
index 1408b33..0f8f8b0 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -9,7 +9,9 @@ obj-y     = fork.o exec_domain.o panic.o \
        extable.o params.o \
        kthread.o sys_ni.o nsproxy.o \
        notifier.o ksysfs.o cred.o reboot.o \
-       async.o range.o groups.o smpboot.o
+       async.o range.o smpboot.o
+
+obj-$(CONFIG_MULTIUSER) += groups.o // groups.c is built only when CONFIG_MULTIUSER is selected

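e. In capability.c, #ifdef CONFIG_MULTIUSER is added at line 35 and #endif /* CONFIG_MULTIUSER */ after line 386, so the functions between those two points are only defined and implemented when CONFIG_MULTIUSER is selected. capable() is moved down next to ns_capable() so that it falls inside the same block.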

diff --git a/kernel/capability.c b/kernel/capability.c
index 989f5bf..45432b5 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -35,6 +35,7 @@ static int __init file_caps_disable(char *str)
 }
 __setup("no_file_caps", file_caps_disable);

+#ifdef CONFIG_MULTIUSER
 /*
  * More recent versions of libcap are available from:
  *
@@ -386,6 +387,24 @@ bool ns_capable(struct user_namespace *ns, int cap)
 }
 EXPORT_SYMBOL(ns_capable);

+
+/**
+ * capable - Determine if the current task has a superior capability in effect
+ * @cap: The capability to be tested for
+ *
+ * Return true if the current task has the given superior capability currently
+ * available for use, false if not.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool capable(int cap)
+{
+   return ns_capable(&init_user_ns, cap);
+}
+EXPORT_SYMBOL(capable);
+#endif /* CONFIG_MULTIUSER */

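f. The affected system calls are added to sys_ni.c.
A note on what sys_ni.c is for: when a system call is retired or compiled out, its service routine is pointed at sys_ni_syscall, which returns -ENOSYS. The "ni" in sys_ni_syscall stands for "not implemented".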

diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 5adcb0a..7995ef5 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -159,6 +159,20 @@ cond_syscall(sys_uselib);
 cond_syscall(sys_fadvise64);
 cond_syscall(sys_fadvise64_64);
 cond_syscall(sys_madvise);
+cond_syscall(sys_setuid);
+cond_syscall(sys_setregid);
+cond_syscall(sys_setgid);
+cond_syscall(sys_setreuid);
+cond_syscall(sys_setresuid);
+cond_syscall(sys_getresuid);
+cond_syscall(sys_setresgid);
+cond_syscall(sys_getresgid);
+cond_syscall(sys_setgroups);
+cond_syscall(sys_getgroups);
+cond_syscall(sys_setfsuid);
+cond_syscall(sys_setfsgid);
+cond_syscall(sys_capget);
+cond_syscall(sys_capset);
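
The fallback behind cond_syscall relies on weak symbols: the name resolves to the "not implemented" handler unless a real definition is linked in. Here is a minimal user-space sketch of that idea, assuming GCC or Clang on an ELF target. DEMO_COND_SYSCALL, demo_ni_syscall and the demo_sys_* names are hypothetical, and the kernel's own macro is written differently (with assembler directives rather than a function attribute).

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the kernel's sys_ni_syscall(): always "not implemented". */
long demo_ni_syscall(void)
{
    return -ENOSYS;
}

/* Hypothetical analogue of cond_syscall(): declare `name` as a weak alias
 * of demo_ni_syscall. A strong definition of `name` in another object file
 * would take precedence at link time. */
#define DEMO_COND_SYSCALL(name) \
    long name(void) __attribute__((weak, alias("demo_ni_syscall")))

/* No strong definitions exist in this sketch, so both fall back. */
DEMO_COND_SYSCALL(demo_sys_setgroups);
DEMO_COND_SYSCALL(demo_sys_capset);

/* Small self-check that the fallback is what callers actually reach. */
static inline void demo_check_fallback(void)
{
    assert(demo_sys_setgroups() == -ENOSYS);
    assert(demo_sys_capset() == -ENOSYS);
}
```

With CONFIG_MULTIUSER disabled, the files that would provide the strong definitions are not built, so the syscall table entries end up pointing at the fallback.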

In short, the patch implements the following:

The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
getgroups, setfsuid, setfsgid, capget, capset.

Also, groups.c is compiled out completely.

In kernel/capability.c, capable function was moved in order to avoid
adding two ifdef blocks.

Results

1. Build a bzImage from the v4.18 kernel

#git branch
* (HEAD detached at v4.18)

#cp arch/x86/configs/x86_64_defconfig ./.config
#make menuconfig (disable MULTIUSER; the option is only visible with EXPERT enabled)
#make bzImage -j8

After the build, the kernel image is at arch/x86/boot/bzImage

2. Boot with QEMU

/ # adduser cuibixuan  (why can a user still be added here?)
adduser: /home/cuibixuan: No such file or directory
passwd: unknown uid 0
/ # su cuibixuan
su: can't set groups: Function not implemented

As shown, the groups-related operation now fails with "Function not implemented", which means the sys_setgroups entry added to kernel/sys_ni.c has taken effect (+cond_syscall(sys_setgroups);).
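
The same check can be made from a small C program instead of through su. This is a minimal sketch assuming a Linux system with glibc or musl. demo_getresuid_implemented is a hypothetical helper name, and getresuid is used because it is on the commit's list of compiled-out calls and has no side effects.

```c
#define _GNU_SOURCE          /* exposes getresuid() in <unistd.h> */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the kernel implements getresuid, 0 if it reports ENOSYS
 * (what a kernel built without CONFIG_MULTIUSER is expected to do), and
 * -1 on any other error. */
int demo_getresuid_implemented(void)
{
    uid_t ruid, euid, suid;

    if (getresuid(&ruid, &euid, &suid) == 0) {
        /* Sanity check: the real UID must agree with getuid(). */
        assert(ruid == getuid());
        return 1;
    }
    return errno == ENOSYS ? 0 : -1;
}
```

On an ordinary distribution kernel the helper should return 1. On the bzImage built above it would be expected to return 0, matching the "Function not implemented" message from su.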

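Follow-up

In my view, for Linux to support single-user systems it is not enough to simply drop the uid/gid, group and capability related functions. For example, how should a filesystem that already has several users configured before boot (/etc/passwd and /etc/group) be handled? And what about certain (security-related) system calls that are recommended to run with an individual user's privileges? The discussion at https://lwn.net/Articles/631853/ also raises questions about multiple processes and scheduling: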

Come to think of it, I look forward to the next tinification patch
that removes support for multiple processes, scheduling, and makes the
only running process always have pid 1.

Or, from the discussion about threads:

The problem is then that the single userspace task can prevent
necessary kernel threads from running.


Reposted from blog.csdn.net/cui841923894/article/details/82261386