1. [Proposal for inlining container security fields](https://github.com/kubernetes/kubernetes/pull/12823)
## Use Cases
1. As a cluster operator, I want to support securing pods from one another using SELinux when
SELinux integration is enabled in the cluster
2. As a user, I want volume sharing to work correctly amongst containers in pods
#### SELinux context: pod- or container- level?
Currently, SELinux context is specifiable only at the container level. This is an inconvenient
factoring for sharing volumes and other SELinux-secured resources between containers because there
is no way in SELinux to share resources between processes with different MCS labels except to
remove MCS labels from the shared resource. This is a big security risk: _any container_ in the
system can work with a resource which has the same SELinux context as it and no MCS labels. Since
we are also not interested in isolating containers in a pod from one another, the SELinux context
should be shared by all containers in a pod to facilitate isolation from the containers in other
pods and sharing resources amongst all the containers of a pod.
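
The reasoning above can be illustrated with a deliberately simplified model of MCS checks. This is a sketch, not how the kernel or Kubernetes evaluates access: it ignores type enforcement entirely, assumes levels written as `s0:c1,c2` with a plain comma-separated category list, and the function names `categories` and `mcsAllows` are hypothetical.

```go
package main

import "strings"

// categories extracts the MCS category set from a level such as "s0:c1,c2".
// Assumption: categories are a plain comma-separated list; ranges such as
// "c0.c1023" are not handled in this sketch.
func categories(level string) map[string]bool {
	set := map[string]bool{}
	parts := strings.SplitN(level, ":", 2)
	if len(parts) < 2 || parts[1] == "" {
		return set // no MCS categories on this level
	}
	for _, c := range strings.Split(parts[1], ",") {
		set[strings.TrimSpace(c)] = true
	}
	return set
}

// mcsAllows reports whether a process running at procLevel may use a resource
// labeled fileLevel, under the simplified rule that the process's categories
// must include every category on the resource.
func mcsAllows(procLevel, fileLevel string) bool {
	proc := categories(procLevel)
	for c := range categories(fileLevel) {
		if !proc[c] {
			return false
		}
	}
	return true
}
```

Under this model, two containers labeled `s0:c1,c2` and `s0:c3,c4` cannot both use a volume labeled `s0:c1,c2`, while a volume labeled plain `s0` is usable by both and by every other container of the same type. Giving all containers in a pod the same level avoids having to choose between the two.
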
#### Volumes
Kubernetes volumes can be divided into two broad categories:
1. Unshared storage:
    1. Volumes created by the kubelet on the host directory: empty directory, git repo, secret,
       downward API. All volumes in this category delegate to `EmptyDir` for their underlying
       storage.
    2. Volumes based on network block devices: AWS EBS, iSCSI, RBD, etc., *when used exclusively
       by a single pod*.
2. Shared storage:
    1. `hostPath` is shared storage because it is necessarily used by a container and the host.
    2. Network file systems such as NFS, Glusterfs, Cephfs, etc.
    3. Block device based volumes in `ReadOnlyMany` or `ReadWriteMany` modes are shared because
       they may be used simultaneously by multiple pods.
For unshared storage, SELinux handling for most volumes can be generalized into running a `chcon`
operation on the volume directory after running the volume plugin's `Setup` function. For these
volumes, the Kubelet can perform the `chcon` operation and keep SELinux concerns out of the volume
plugin code. Some volume plugins may need to use the SELinux context during a mount operation in
certain cases. To account for this, our design must have a way for volume plugins to state that
a particular volume should or should not receive generic label management.
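
A minimal sketch of that flow follows. The `Volume` interface, the `SupportsGenericRelabel` method, the `Relabeler` type, and `staticVolume` are all hypothetical names introduced for illustration; they are not the kubelet's actual types. The only external dependency is the standard `chcon -R <context> <dir>` command.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Volume is a hypothetical, pared-down view of a volume produced by a plugin.
type Volume interface {
	// SetUp prepares the volume and returns the directory it was set up in.
	SetUp() (dir string, err error)
	// SupportsGenericRelabel reports whether the kubelet should relabel the
	// directory itself after SetUp, or leave labeling to the plugin.
	SupportsGenericRelabel() bool
}

// Relabeler applies an SELinux context to a directory tree.
type Relabeler func(dir, context string) error

// chconRelabel shells out to `chcon -R <context> <dir>`.
func chconRelabel(dir, context string) error {
	out, err := exec.Command("chcon", "-R", context, dir).CombinedOutput()
	if err != nil {
		return fmt.Errorf("chcon %s: %v: %s", dir, err, out)
	}
	return nil
}

// setUpAndLabel runs the plugin's SetUp and then, only if the plugin opted in
// and a context is configured, applies the generic relabel step.
func setUpAndLabel(v Volume, context string, relabel Relabeler) error {
	dir, err := v.SetUp()
	if err != nil {
		return fmt.Errorf("volume setup failed: %w", err)
	}
	if context == "" || !v.SupportsGenericRelabel() {
		return nil // nothing to do, or the plugin handles labels itself
	}
	return relabel(dir, context)
}

// staticVolume is a trivial Volume used to exercise the sketch.
type staticVolume struct {
	dir            string
	genericRelabel bool
}

func (s staticVolume) SetUp() (string, error)       { return s.dir, nil }
func (s staticVolume) SupportsGenericRelabel() bool { return s.genericRelabel }
```

Passing the relabel step in as a function keeps SELinux concerns out of the plugin and makes the flow testable without an SELinux-enabled host; a real caller would pass `chconRelabel`.
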
For shared storage, the picture is murkier. Labels for existing shared storage will be managed
outside Kubernetes and administrators will have to set the SELinux context of pods correctly.
The problem of solving SELinux label management for new shared storage is outside the scope of
this proposal.
## Analysis
The system needs to be able to:
1. Model correctly which volumes require SELinux label management
2. Relabel volumes with the correct SELinux context when required
### Modeling whether a volume requires label management
#### Unshared storage: volumes derived from `EmptyDir`
Empty dir and volumes derived from it are created by the system, so Kubernetes must always ensure
that the ownership and SELinux context (when relevant) are set correctly for the volume to be
usable.
#### Unshared storage: network block devices
Volume plugins based on network block devices such as AWS EBS and RBD can be treated the same way
as local volumes. Since inodes are written to these block devices in the same way as `EmptyDir`
volumes, permissions and ownership can be managed on the client side by the Kubelet when used
exclusively by one pod. When the volumes are used outside of a persistent volume, or with the
`ReadWriteOnce` mode, they are effectively unshared storage.
When used by multiple pods, there are many additional use-cases to analyze before we can be
confident that we can support SELinux label management robustly with these file systems. The right
design is one that makes it easy to experiment and develop support for ownership management with
volume plugins to enable developers and cluster operators to continue exploring these issues.
#### Shared storage: hostPath
The `hostPath` volume should only be used by effective-root users, and the permissions of paths
exposed into containers via hostPath volumes should always be managed by the cluster operator. If
the Kubelet managed the SELinux labels for `hostPath` volumes, a user who could create a `hostPath`
volume could effect changes in the state of arbitrary paths within the host's filesystem. This
would be a severe security risk, so we will consider hostPath a corner case for which the kubelet
should never perform label management.
#### Shared storage: network
Ownership management of shared storage is a complex topic. SELinux labels for existing shared
storage will be managed externally from Kubernetes. For this case, our API should make it simple to
express whether a particular volume should have these concerns managed by Kubernetes.
We will not attempt to address the concerns of new shared storage in this proposal.
When a network block device is used as a persistent volume in `ReadWriteMany` or `ReadOnlyMany`
modes, it is shared storage, and thus outside the scope of this proposal.
#### API requirements
From the above, we know that label management must be applied:
1. To some volume types always
2. To some volume types never
3. To some volume types *sometimes*
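
One way to picture these three cases is a small tri-state policy. The sketch below is illustrative only: the `LabelPolicy` type, the volume kind strings, and the choice to treat unknown kinds as "never" are assumptions made for the example, not part of the proposed API.

```go
package main

// LabelPolicy is a hypothetical tri-state describing when the kubelet
// manages SELinux labels for a kind of volume.
type LabelPolicy int

const (
	LabelAlways    LabelPolicy = iota // created by the system, always managed
	LabelNever                        // managed outside Kubernetes
	LabelSometimes                    // managed only when the volume is unshared
)

// policyFor maps a volume kind to its policy, following the analysis above.
// The kind strings are illustrative. Unknown kinds default to LabelNever,
// which is a conservative assumption of this sketch.
func policyFor(kind string) LabelPolicy {
	switch kind {
	case "emptyDir", "gitRepo", "secret", "downwardAPI":
		return LabelAlways
	case "awsElasticBlockStore", "iscsi", "rbd":
		return LabelSometimes
	case "hostPath", "nfs", "glusterfs", "cephfs":
		return LabelNever
	default:
		return LabelNever
	}
}

// needsLabelManagement decides whether the kubelet should relabel a volume.
// shared is true when the volume may be used by several pods at once, for
// example a block device in ReadOnlyMany or ReadWriteMany mode.
func needsLabelManagement(kind string, shared bool) bool {
	switch policyFor(kind) {
	case LabelAlways:
		return true
	case LabelSometimes:
		return !shared
	default:
		return false
	}
}
```

For example, `needsLabelManagement("rbd", false)` is true, while the same device in a shared access mode, or any `hostPath` volume, is left alone.
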
Volumes should be relabeled with the correct SELinux context. Docker has this capability today; it