Why is granting the SYS_ADMIN privilege to a Docker container "bad"?

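I have run into a problem with our security team: the engineering team wants to mount filesystems inside Docker, but doing so requires setting the "--cap-add SYS_ADMIN" flag, and the security team does not allow that flag.
On the internet I found many articles saying that "--cap-add SYS_ADMIN" should be used with caution at Docker run time, because "SYS_ADMIN alone grants quite a big range of capabilities, and would probably present more attack surface."
However, I could not find any article that states clearly what these capabilities are and what "attack surface" they present.
What exactly does the SYS_ADMIN flag grant?
What actual security risks come with setting this flag?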
1 Answer


It's basically root access to the host. From the capabilities(7) man page:

CAP_SYS_ADMIN
          Note: this capability is overloaded; see Notes to kernel
          developers, below.

          * Perform a range of system administration operations
            including: quotactl(2), mount(2), umount(2), pivot_root(2),
            setdomainname(2);
          * perform privileged syslog(2) operations (since Linux 2.6.37,
            CAP_SYSLOG should be used to permit such operations);
          * perform VM86_REQUEST_IRQ vm86(2) command;
          * perform IPC_SET and IPC_RMID operations on arbitrary System
            V IPC objects;
          * override RLIMIT_NPROC resource limit;
          * perform operations on trusted and security Extended
            Attributes (see xattr(7));
          * use lookup_dcookie(2);
          * use ioprio_set(2) to assign IOPRIO_CLASS_RT and (before
            Linux 2.6.25) IOPRIO_CLASS_IDLE I/O scheduling classes;
          * forge PID when passing socket credentials via UNIX domain
            sockets;
          * exceed /proc/sys/fs/file-max, the system-wide limit on the
            number of open files, in system calls that open files (e.g.,
            accept(2), execve(2), open(2), pipe(2));
          * employ CLONE_* flags that create new namespaces with
            clone(2) and unshare(2) (but, since Linux 3.8, creating user
            namespaces does not require any capability);
          * call perf_event_open(2);
          * access privileged perf event information;
          * call setns(2) (requires CAP_SYS_ADMIN in the target
            namespace);
          * call fanotify_init(2);
          * call bpf(2);
          * perform privileged KEYCTL_CHOWN and KEYCTL_SETPERM keyctl(2)
            operations;
          * perform madvise(2) MADV_HWPOISON operation;
          * employ the TIOCSTI ioctl(2) to insert characters into the
            input queue of a terminal other than the caller's
            controlling terminal;
          * employ the obsolete nfsservctl(2) system call;
          * employ the obsolete bdflush(2) system call;
          * perform various privileged block-device ioctl(2) operations;
          * perform various privileged filesystem ioctl(2) operations;
          * perform privileged ioctl(2) operations on the /dev/random
            device (see random(4));
          * install a seccomp(2) filter without first having to set the
            no_new_privs thread attribute;
          * modify allow/deny rules for device control groups;
          * employ the ptrace(2) PTRACE_SECCOMP_GET_FILTER operation to
            dump tracee's seccomp filters;
          * employ the ptrace(2) PTRACE_SETOPTIONS operation to suspend
            the tracee's seccomp protections (i.e., the
            PTRACE_O_SUSPEND_SECCOMP flag);
          * perform administrative operations on many device drivers.
          * Modify autogroup nice values by writing to
            /proc/[pid]/autogroup (see sched(7)).
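
To check whether a given process actually holds this capability, one option is to decode the effective capability mask that Linux exposes as the CapEff line in /proc/<pid>/status. The following is a minimal sketch, not part of the original answer. The helper names (parse_cap_eff, has_capability, process_has_sys_admin) are hypothetical, and it assumes a Linux system with procfs mounted. Bit 21 is the index of CAP_SYS_ADMIN in <linux/capability.h>.

```python
# Sketch: test whether a process holds CAP_SYS_ADMIN by decoding CapEff.
CAP_SYS_ADMIN = 21  # bit index of CAP_SYS_ADMIN in <linux/capability.h>


def parse_cap_eff(status_text):
    """Extract the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)  # the mask is printed as hex
    raise ValueError("no CapEff line found")


def has_capability(mask, bit):
    """Return True if the given capability bit is set in the mask."""
    return bool((mask >> bit) & 1)


def process_has_sys_admin(status_path="/proc/self/status"):
    """Check the calling process (Linux only; reads procfs)."""
    with open(status_path) as f:
        return has_capability(parse_cap_eff(f.read()), CAP_SYS_ADMIN)


if __name__ == "__main__":
    print("CAP_SYS_ADMIN effective:", process_has_sys_admin())
```

Assuming the process runs as root inside the container, running this in containers started with and without "--cap-add SYS_ADMIN" should flip the result.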

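Does "host" mean the system that runs the container, or are these privileges confined to the system inside the container? If a running Docker container with CAP_SYS_ADMIN somehow gains access to the system running it, can you give an example of how this privilege could be applied? - goulashsoup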
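@goulashsoup The list above shows the system calls that are exposed. Not all of them are isolated by namespaces. No matter how many containers you run, there is only one kernel handling these syscalls for the host and for the containers on it. From an attacker's point of view, mount, bpf, clone, setns, seccomp, and ptrace look particularly promising. - BMitch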
But if I run a Linux Docker container on Windows, the container cannot simply use the host kernel, because Windows has different system calls from Linux. Or is that a misconception? - goulashsoup
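You can't run Linux containers directly on a Windows host. Instead, you run a Linux VM on Windows and then run the containers inside that VM. So the question becomes how much the security of the VM matters, especially since it has been given access to the Windows filesystem. - BMitch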
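
One way to see the shared-kernel point from the comments above is to print the kernel identity both inside a container and on the machine hosting it. This is only a sketch, and kernel_identity is a hypothetical helper name. On a Linux host both runs should report the same kernel release. On Windows, the run inside the container should report a Linux kernel, namely that of the VM.

```python
import platform


def kernel_identity():
    """Return (system, release, version) of the kernel serving this process."""
    return platform.system(), platform.release(), platform.version()


if __name__ == "__main__":
    system, release, version = kernel_identity()
    print(f"{system} {release} ({version})")
```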
