[Security] Understanding the Essence of Linux Capabilities in 10 Minutes

The core formula

(Note: quoted from capabilities(7); see man 7 capabilities)

           P'(ambient)     = (file is privileged) ? 0 : P(ambient)
           P'(permitted)   = (P(inheritable) & F(inheritable)) |
                             (F(permitted) & cap_bset) | P'(ambient)
           P'(effective)   = F(effective) ? P'(permitted) : P'(ambient)
           P'(inheritable) = P(inheritable)    [i.e., unchanged]

       where:
           P         denotes the value of a thread capability set before the
                     execve(2)
           P'        denotes the value of a thread capability set after the
                     execve(2)
           F         denotes a file capability set
           cap_bset  is the value of the capability bounding set (described
                     below).
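
The formula maps directly onto bitmask arithmetic. The following is a minimal Python sketch of it, not kernel code: the names ThreadCaps, FileCaps, execve_transform, and FULL are made up for illustration, and the 41-bit width of FULL assumes a recent kernel.

```python
from typing import NamedTuple

# Bit number from <linux/capability.h>.
CAP_NET_BIND_SERVICE = 10
# Assumption: 41 capabilities (bits 0..40), as on recent kernels.
FULL = (1 << 41) - 1


class ThreadCaps(NamedTuple):
    permitted: int
    inheritable: int
    effective: int
    ambient: int


class FileCaps(NamedTuple):
    permitted: int
    inheritable: int
    effective: bool          # a single bit on the file, not a set
    privileged: bool = True  # file has capabilities or is set-UID/set-GID


def execve_transform(p: ThreadCaps, f: FileCaps, cap_bset: int) -> ThreadCaps:
    """Apply the capabilities(7) execve formula to thread sets p."""
    new_ambient = 0 if f.privileged else p.ambient
    new_permitted = ((p.inheritable & f.inheritable)
                     | (f.permitted & cap_bset)
                     | new_ambient)
    new_effective = new_permitted if f.effective else new_ambient
    # The inheritable set is carried over unchanged.
    return ThreadCaps(new_permitted, p.inheritable, new_effective, new_ambient)


# An ordinary user with empty sets executes a file that carries
# CAP_NET_BIND_SERVICE in F(permitted) and has the effective bit set.
before = ThreadCaps(0, 0, 0, 0)
helper = FileCaps(permitted=1 << CAP_NET_BIND_SERVICE,
                  inheritable=0, effective=True)
after = execve_transform(before, helper, cap_bset=FULL)
print(f"permitted={after.permitted:#x} effective={after.effective:#x}")
```

Passing cap_bset=0 to the same call yields empty sets, which shows how the bounding set masks F(permitted).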

Linux kernel behavior

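1. If the process is still root after exec, the kernel treats every bit of the file capability sets as 1 when it computes the new process capabilities: with a real or effective UID of 0, F(inheritable) and F(permitted) count as all ones, and with an effective UID of 0, F(effective) counts as set.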

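2. The inheritable set is limited by bnd (the bounding set): a capability can be added to i only if it is in bnd. Dropping a capability from bnd later does not remove it from i.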

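3. When the UIDs switch to an ordinary user (the real, effective, and saved UIDs all become nonzero after at least one of them was 0), p and e are cleared, as is the ambient set; b and i are unchanged.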

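4. When only the euid switches to an ordinary user, e is cleared and the other sets are unchanged. The process can still switch back to root, which is why p is not cleared.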
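
Rules 1, 3, and 4 can be sketched as two small functions. This is a simplified model for illustration only: root_file_caps and after_uid_change are hypothetical names, the ambient set is left out, and the last branch (copying p back into e when the euid returns to 0) is taken from capabilities(7) rather than from the list above.

```python
# Assumption: 41 capabilities (bits 0..40), as on recent kernels.
FULL = (1 << 41) - 1


def root_file_caps(ruid, euid, f_permitted, f_inheritable, f_effective):
    """Rule 1: the file sets as the kernel sees them when root runs exec."""
    if ruid == 0 or euid == 0:
        f_permitted = f_inheritable = FULL
    if euid == 0:
        f_effective = True
    return f_permitted, f_inheritable, f_effective


def after_uid_change(old_ids, new_ids, caps):
    """Rules 3 and 4. ids are (ruid, euid, suid); caps has keys p, e, i, b."""
    caps = dict(caps)  # do not modify the caller's dict
    if 0 in old_ids and 0 not in new_ids:
        # Every UID is now nonzero: p and e are cleared, b and i are kept.
        caps["p"] = caps["e"] = 0
    elif old_ids[1] == 0 and new_ids[1] != 0:
        # Only the euid left root: e is cleared, p survives.
        caps["e"] = 0
    elif old_ids[1] != 0 and new_ids[1] == 0:
        # euid is root again: p is copied back into e (per capabilities(7)).
        caps["e"] = caps["p"]
    return caps


root = {"p": FULL, "e": FULL, "i": 0, "b": FULL}
temporary = after_uid_change((0, 0, 0), (0, 1000, 0), root)
restored = after_uid_change((0, 1000, 0), (0, 0, 0), temporary)
permanent = after_uid_change((0, 0, 0), (1000, 1000, 1000), root)
print(temporary["e"] == 0, restored["e"] == FULL, permanent["p"] == 0)
```

The temporary case keeps p intact, which is exactly what makes the switch back to root meaningful.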

Corollaries from the formula

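1. If root clears all three of its own sets (p, e, i) while bnd is nonzero, then after exec p and e again hold every capability in bnd, because the kernel treats the F sets as all ones for root.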

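2. bnd caps what root can hold after exec. This is what docker relies on: it sets bnd for the container's process so that root inside the container cannot gain capabilities by starting a new process.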

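3. The effect of i shows up for ordinary users. In an ordinary user's exec, P(inheritable) & F(inheritable) and F(permitted) & bnd are the only file-based sources of permitted capabilities, so i and bnd together bound this process and the ordinary-user processes it later executes.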

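4. i affects the p (and e) of the new process. Even if the current process has no p/e left, the process after exec can still have them: ambient capabilities aside, the new p/e are computed from i, F, and bnd, not from the current p/e.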

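5. For an ordinary user, if the file's e bit is 0, P'(effective) falls back to P'(ambient), which is empty unless ambient capabilities were raised explicitly. So an ordinary user has no capabilities by default and, ambient capabilities aside, must obtain them through a file whose e bit is set. File capabilities are therefore designed for ordinary users; with default securebits, root's exec ignores whatever capabilities the file carries.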
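
The corollaries above can be checked on a live system by reading the five masks the kernel exposes in /proc/<pid>/status (CapInh, CapPrm, CapEff, CapBnd, CapAmb). The following is a small sketch for doing so; parse_cap_sets is a hypothetical helper, and SAMPLE is an illustrative excerpt for an ordinary-user process, not captured output.

```python
import os

CAP_FIELDS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def parse_cap_sets(status_text):
    """Return {field name: bitmask} for the Cap* lines of a status file."""
    sets = {}
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name in CAP_FIELDS:
            sets[name] = int(value.strip(), 16)  # masks are printed in hex
    return sets


# Illustrative excerpt: an ordinary user holds nothing, but bnd is full.
SAMPLE = """Name:\tcat
CapInh:\t0000000000000000
CapPrm:\t0000000000000000
CapEff:\t0000000000000000
CapBnd:\t000001ffffffffff
CapAmb:\t0000000000000000
"""

print(parse_cap_sets(SAMPLE))

# On Linux, inspect the current process as well.
if os.path.exists("/proc/self/status"):
    with open("/proc/self/status") as fh:
        for field, mask in parse_cap_sets(fh.read()).items():
            print(f"{field}: {mask:#018x}")
```

Running it once as root and once as an ordinary user, before and after an exec, makes the difference between p/e and bnd visible.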

The essence of docker run --cap-add / --cap-drop

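1. docker modifies bnd and sets i equal to bnd, so from then on root is bounded by bnd and ordinary users are bounded by i and bnd. (Docker releases since 20.10.14 start containers with an empty i instead.)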

2. It drops every other capability and keeps only the 14 defaults, which is what constrains root.

3. By default, a process started with docker exec has only the container's capabilities and cannot exceed them.
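
The effect of the two flags on the container's bnd can be modeled with plain set arithmetic. This is a simplified sketch under stated assumptions: container_bounding_set is a hypothetical function, DEFAULT_CAPS is the default list of 14 as documented by Docker, and the order (drop first, then add, with ALL handled only for drops) is an assumption rather than Docker's actual code.

```python
# Docker's 14 default capabilities, written without the CAP_ prefix.
DEFAULT_CAPS = frozenset({
    "CHOWN", "DAC_OVERRIDE", "FSETID", "FOWNER", "MKNOD", "NET_RAW",
    "SETGID", "SETUID", "SETFCAP", "SETPCAP", "NET_BIND_SERVICE",
    "SYS_CHROOT", "KILL", "AUDIT_WRITE",
})


def _normalize(name):
    """Accept both NET_RAW and CAP_NET_RAW, in any letter case."""
    name = name.upper()
    return name[4:] if name.startswith("CAP_") else name


def container_bounding_set(cap_add=(), cap_drop=()):
    """Model of the bnd a container starts with (simplified)."""
    add = {_normalize(c) for c in cap_add}
    drop = {_normalize(c) for c in cap_drop}
    # Dropping ALL starts from an empty set instead of the defaults.
    caps = set() if "ALL" in drop else set(DEFAULT_CAPS)
    caps -= drop
    # Adding ALL is not modeled in this sketch.
    caps |= add - {"ALL"}
    return caps


# Models: docker run --cap-drop ALL --cap-add NET_BIND_SERVICE ...
print(sorted(container_bounding_set(cap_add=["NET_BIND_SERVICE"],
                                    cap_drop=["ALL"])))
```

Because root's exec result is masked by bnd, the returned set is also the most that root inside the container can hold after any exec.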

Other notes

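1. To give up a capability, remove it from both e and p. If it stays in p, the process can raise it in e again.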

2. i takes effect on exec, not on fork. A child created by fork inherits the parent's capability sets unchanged.

3. With capset, a process can modify only its own capabilities.

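4. bnd is modified through prctl (PR_CAPBSET_DROP). Capabilities can only be removed from it, and doing so requires CAP_SETPCAP.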
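
Note 1 follows from the rules capset(2) enforces. The following sketch models those rules with bitmasks so the difference between dropping only e and dropping both e and p can be seen. It is a simulation, not a binding to the real system call: the capset function below is a stand-in that takes and returns plain dicts.

```python
# Bit numbers from <linux/capability.h>.
CAP_SETPCAP = 8
CAP_NET_RAW = 13
# Assumption: 41 capabilities (bits 0..40), as on recent kernels.
FULL = (1 << 41) - 1


def capset(old, new, bounding):
    """Validate a requested change of the p/e/i sets, as capset(2) would."""
    if new["p"] & ~old["p"]:
        raise PermissionError("permitted can only shrink")
    if new["e"] & ~new["p"]:
        raise PermissionError("effective must be a subset of permitted")
    added_i = new["i"] & ~old["i"]
    if added_i & ~bounding:
        raise PermissionError("inheritable additions must be in bnd")
    has_setpcap = bool(old["e"] & (1 << CAP_SETPCAP))
    if not has_setpcap and (added_i & ~old["p"]):
        raise PermissionError("without CAP_SETPCAP, additions must be in p")
    return dict(new)


NET_RAW = 1 << CAP_NET_RAW
start = {"p": NET_RAW, "e": NET_RAW, "i": 0}

# Dropping only e is reversible: the capability is still in p.
only_e = capset(start, {"p": NET_RAW, "e": 0, "i": 0}, FULL)
raised_again = capset(only_e, start, FULL)

# Dropping e and p together is final.
both = capset(start, {"p": 0, "e": 0, "i": 0}, FULL)
try:
    capset(both, start, FULL)
    regained = True
except PermissionError:
    regained = False

print(raised_again["e"] == NET_RAW, regained)
```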
