Secrets proposal

2015-02-17 20:18:38 -05:00 · 2015-02-17 20:18:38 -05:00 · ea18e6698d
parent 3043ae9144
commit ea18e6698d
1 changed files with 547 additions and 0 deletions
--- a/docs/design/secrets.md
+++ b/docs/design/secrets.md
@ -0,0 +1,547 @@
+# Secret Distribution
+
+## Abstract
+
+A proposal for the distribution of secrets (passwords, keys, etc) to the Kubelet and to
+containers inside Kubernetes using a custom volume type.
+
+## Motivation
+
+Secrets are needed in containers to access internal resources like the Kubernetes master or
+external resources such as git repositories, databases, etc.  Users may also want behaviors in the
+kubelet that depend on secret data (credentials for image pull from a docker registry) associated
+with pods.
+
+Goals of this design:
+
+1.  Describe a secret resource
+2.  Define the various challenges attendant to managing secrets on the node
+3.  Define a mechanism for consuming secrets in containers without modification
+
+## Constraints and Assumptions
+
+*  This design does not prescribe a method for storing secrets; storage of secrets should be
+   pluggable to accomodate different use-cases
+*  Encryption of secret data and node security are orthogonal concerns
+*  It is assumed that node and master are secure and that compromising their security could also
+   compromise secrets:
+   *  If a node is compromised, the only secrets that could potentially be exposed should be the
+      secrets belonging to containers scheduled onto it
+   *  If the master is compromised, all secrets in the cluster may be exposed
+*  Secret rotation is an orthogonal concern, but it should be facilitated by this proposal
+
+## Use Cases
+
+1.  As a user, I want to store secret artifacts for my applications and consume them securely in
+    containers, so that I can keep the configuration for my applications separate from the images
+    that use them:
+    1.  As a cluster operator, I want to allow a pod to access the Kubernetes master using a custom
+        `.kubeconfig` file, so that I can securely reach the master
+    2.  As a cluster operator, I want to allow a pod to access a Docker registry using credentials
+        from a `.dockercfg` file, so that containers can push images
+    3.  As a cluster operator, I want to allow a pod to access a git repository using SSH keys,
+        so that I can push and fetch to and from the repository
+2.  As a user, I want to allow containers to consume supplemental information about services such
+    as username and password which should be kept secret, so that I can share secrets about a
+    service amongst the containers in my application securely
+3.  As a user, I want to associate a pod with a `ServiceAccount` that consumes a secret and have
+    the kubelet implement some reserved behaviors based on the types of secrets the service account
+    consumes:
+    1.  Use credentials for a docker registry to pull the pod's docker image
+    2.  Present kubernetes auth token to the pod or transparently decorate traffic between the pod
+        and master service
+4.  As a user, I want to be able to indicate that a secret expires and for that secret's value to
+    be rotated once it expires, so that the system can help me follow good practices
+
+### Use-Case: Configuration artifacts
+
+Many configuration files contain secrets intermixed with other configuration information.  For
+example, a user's application may contain a properties file than contains database credentials,
+SaaS API tokens, etc.  Users should be able to consume configuration artifacts in their containers
+and be able to control the path on the container's filesystems where the artifact will be
+presented.
+
+### Use-Case: Metadata about services
+
+Most pieces of information about how to use a service are secrets.  For example, a service that
+provides a MySQL database needs to provide the username, password, and database name to consumers
+so that they can authenticate and use the correct database. Containers in pods consuming the MySQL
+service would also consume the secrets associated with the MySQL service.
+
+### Use-Case: Secrets associated with service accounts
+
+[Service Accounts](https://github.com/GoogleCloudPlatform/kubernetes/pull/2297) are proposed as a
+mechanism to decouple capabilities and security contexts from individual human users.  A
+`ServiceAccount` contains references to some number of secrets.  A `Pod` can specify that it is
+associated with a `ServiceAccount`.  Secrets should have a `Type` field to allow the Kubelet and
+other system components to take action based on the secret's type.
+
+#### Example: service account consumes auth token secret
+
+As an example, the service account proposal discusses service accounts consuming secrets which
+contain kubernetes auth tokens.  When a Kubelet starts a pod associated with a service account
+which consumes this type of secret, the Kubelet may take a number of actions:
+
+1.  Expose the secret in a `.kubernetes_auth` file in a well-known location in the container's
+    file system
+2.  Configure that node's `kube-proxy` to decorate HTTP requests from that pod to the 
+    `kubernetes-master` service with the auth token, e. g. by adding a header to the request
+    (see the [LOAS Daemon](https://github.com/GoogleCloudPlatform/kubernetes/issues/2209) proposal)
+
+#### Example: service account consumes docker registry credentials
+
+Another example use case is where a pod is associated with a secret containing docker registry
+credentials.  The Kubelet could use these credentials for the docker pull to retrieve the image.
+
+### Use-Case: Secret expiry and rotation
+
+Rotation is considered a good practice for many types of secret data.  It should be possible to
+express that a secret has an expiry date; this would make it possible to implement a system
+component that could regenerate expired secrets.  As an example, consider a component that rotates
+expired secrets.  The rotator could periodically regenerate the values for expired secrets of
+common types and update their expiry dates.
+
+## Deferral: Consuming secrets as environment variables
+
+Some images will expect to receive configuration items as environment variables instead of files.
+We should consider what the best way to allow this is; there are a few different options:
+
+1.  Force the user to adapt files into environment variables.  Users can store secrets that need to
+    be presented as environment variables in a format that is easy to consume from a shell:
+
+        $ cat /etc/secrets/my-secret.txt
+        export MY_SECRET_ENV=MY_SECRET_VALUE
+
+    The user could `source` the file at `/etc/secrets/my-secret` prior to executing the command for
+    the image either inline in the command or in an init script, 
+
+2.  Give secrets an attribute that allows users to express the intent that the platform should
+    generate the above syntax in the file used to present a secret.  The user could consume these
+    files in the same manner as the above option.
+
+3.  Give secrets attributes that allow the user to express that the secret should be presented to
+    the container as an environment variable.  The container's environment would contain the
+    desired values and the software in the container could use them without accomodation the
+    command or setup script.
+
+For our initial work, we will treat all secrets as files to narrow the problem space.  There will
+be a future proposal that handles exposing secrets as environment variables.
+
+## Flow analysis of secret data with respect to the API server
+
+There are two fundamentally different use-cases for access to secrets:
+
+1.  CRUD operations on secrets by their owners
+2.  Read-only access to the secrets needed for a particular node by the kubelet
+
+### Use-Case: CRUD operations by owners
+
+In use cases for CRUD operations, the user experience for secrets should be no different than for
+other API resources.
+
+#### Data store backing the REST API
+
+The data store backing the REST API should be pluggable because different cluster operators will
+have different preferences for the central store of secret data.  Some possibilities for storage:
+
+1.  An etcd collection alongside the storage for other API resources
+2.  A collocated [HSM](http://en.wikipedia.org/wiki/Hardware_security_module)
+3.  An external datastore such as an external etcd, RDBMS, etc.
+
+#### Size limit for secrets
+
+There should be a size limit for secrets in order to:
+
+1.  Prevent DOS attacks against the API server
+2.  Allow kubelet implementations that prevent secret data from touching the node's filesystem
+
+The size limit should satisfy the following conditions:
+
+1.  Large enough to store common artifact types (encryption keypairs, certificates, small
+    configuration files)
+2.  Small enough to avoid large impact on node resource consumption (storage, RAM for tmpfs, etc)
+
+To begin discussion, we propose an initial value for this size limit of **1MB**.
+
+#### Other limitations on secrets
+
+Defining a policy for limitations on how a secret may be referenced by another API resource and how
+constraints should be applied throughout the cluster is tricky due to the number of variables
+involved:
+
+1.  Should there be a maximum number of secrets a pod can reference via a volume?
+2.  Should there be a maximum number of secrets a service account can reference?
+3.  Should there be a total maximum number of secrets a pod can reference via its own spec and its
+    associated service account?
+4.  Should there be a total size limit on the amount of secret data consumed by a pod?
+5.  How will cluster operators want to be able to configure these limits?
+6.  How will these limits impact API server validations?
+7.  How will these limits affect scheduling?
+
+For now, we will not implement validations around these limits.  Cluster operators will decide how
+much node storage is allocated to secrets. It will be the operator's responsibility to ensure that
+the allocated storage is sufficient for the workload scheduled onto a node.
+
+### Use-Case: Kubelet read of secrets for node
+
+The use-case where the kubelet reads secrets has several additional requirements:
+
+1.  Kubelets should only be able to receive secret data which is required by pods scheduled onto
+    the kubelet's node
+2.  Kubelets should have read-only access to secret data
+3.  Secret data should not be transmitted over the wire insecurely
+4.  Kubelets must ensure pods do not have access to each other's secrets
+
+#### Read of secret data by the Kubelet
+
+The Kubelet should only be allowed to read secrets which are consumed by pods scheduled onto that
+Kubelet's node and their associated service accounts.  Authorization of the Kubelet to read this
+data would be delegated to an authorization plugin and associated policy rule.
+
+#### Secret data on the node: data at rest
+
+Consideration must be given to whether secret data should be allowed to be at rest on the node:
+
+1.  If secret data is not allowed to be at rest, the size of secret data becomes another draw on
+    the node's RAM - should it affect scheduling?
+2.  If secret data is allowed to be at rest, should it be encrypted?
+    1.  If so, how should be this be done?
+    2.  If not, what threats exist?  What types of secret are appropriate to store this way?
+
+For the sake of limiting complexity, we propose that initially secret data should not be allowed
+to be at rest on a node; secret data should be stored on a node-level tmpfs filesystem.  This
+filesystem can be subdivided into directories for use by the kubelet and by the volume plugin.
+
+#### Secret data on the node: resource consumption
+
+The Kubelet will be responsible for creating the per-node tmpfs file system for secret storage.
+It is hard to make a prescriptive declaration about how much storage is appropriate to reserve for
+secrets because different installations will vary widely in available resources, desired pod to
+node density, overcommit policy, and other operation dimensions.  That being the case, we propose
+for simplicity that the amount of secret storage be controlled by a new parameter to the kubelet
+with a default value of **64MB**.  It is the cluster operator's responsibility to handle choosing
+the right storage size for their installation and configuring their Kubelets correctly.
+
+Configuring each Kubelet is not the ideal story for operator experience; it is more intuitive that
+the cluster-wide storage size be readable from a central configuration store like the one proposed
+in [#1553](https://github.com/GoogleCloudPlatform/kubernetes/issues/1553).  When such a store
+exists, the Kubelet could be modified to read this configuration item from the store.
+
+When the Kubelet is modified to advertise node resources (as proposed in
+[#4441](https://github.com/GoogleCloudPlatform/kubernetes/issues/4441)), the capacity calculation
+for available memory should factor in the potential size of the node-level tmpfs in order to avoid
+memory overcommit on the node.
+
+#### Secret data on the node: isolation
+
+Every pod will have a [security context](https://github.com/GoogleCloudPlatform/kubernetes/pull/3910).
+Secret data on the node should be isolated according to the security context of the container.  The
+Kubelet volume plugin API will be changed so that a volume plugin receives the security context of
+a volume along with the volume spec.  This will allow volume plugins to implement setting the
+security context of volumes they manage.
+
+## Community work:
+
+Several proposals / upstream patches are notable as background for this proposal:
+
+1.  [Docker vault proposal](https://github.com/docker/docker/issues/10310)
+2.  [Specification for image/container standardization based on volumes](https://github.com/docker/docker/issues/9277)
+3.  [Kubernetes service account proposal](https://github.com/GoogleCloudPlatform/kubernetes/pull/2297)
+4.  [Secrets proposal for docker (1)](https://github.com/docker/docker/pull/6075)
+5.  [Secrets proposal for docker (2)](https://github.com/docker/docker/pull/6697)
+
+## Proposed Design
+
+We propose a new `Secret` resource which is mounted into containers with a new volume type. Secret
+volumes will be handled by a volume plugin that does the actual work of fetching the secret and
+storing it. Secrets contain multiple pieces of data that are presented as different files within
+the secret volume (example: SSH key pair).
+
+In order to remove the burden from the end user in specifying every file that a secret consists of,
+it should be possible to mount all files provided by a secret with a single ```VolumeMount``` entry
+in the container specification.
+
+### Secret API Resource
+
+A new resource for secrets will be added to the API:
+
+```go
+type Secret struct {
+    TypeMeta
+    ObjectMeta
+
+    // Keys in this map are the paths relative to the volume
+    // presented to a container for this secret data.
+    Data map[string][]byte
+    Type SecretType
+}
+
+type SecretType string
+
+const (
+    SecretTypeOpaque              SecretType = "opaque"           // Opaque (arbitrary data; default)
+    SecretTypeKubernetesAuthToken SecretType = "kubernetes-auth"  // Kubernetes auth token
+    SecretTypeDockerRegistryAuth  SecretType = "docker-reg-auth"  // Docker registry auth
+    // FUTURE: other type values
+)
+
+const MaxSecretSize = 1 * 1024 * 1024
+```
+
+A Secret can declare a type in order to provide type information to system components that work
+with secrets.  The default type is `opaque`, which represents arbitrary user-owned data.
+
+Secrets are validated against `MaxSecretSize`.
+
+A new REST API and registry interface will be added to accompany the `Secret` resource.  The
+default implementation of the registry will store `Secret` information in etcd.  Future registry
+implementations could store the `TypeMeta` and `ObjectMeta` fields in etcd and store the secret
+data in another data store entirely, or store the whole object in another data store.
+
+#### Other validations related to secrets
+
+Initially there will be no validations for the number of secrets a pod references, or the number of
+secrets that can be associated with a service account.  These may be added in the future as the
+finer points of secrets and resource allocation are fleshed out.
+
+### Secret Volume Source
+
+A new `SecretSource` type of volume source will be added to the ```VolumeSource``` struct in the
+API:
+
+```go
+type VolumeSource struct {
+    // Other fields omitted
+
+    // SecretSource represents a secret that should be presented in a volume
+    SecretSource *SecretSource `json:"secret"`
+}
+
+type SecretSource struct {
+    Target ObjectReference
+}
+```
+
+Secret volume sources are validated to ensure that the specified object reference actually points
+to an object of type `Secret`.
+
+### Secret Volume Plugin
+
+A new Kubelet volume plugin will be added to handle volumes with a secret source.  This plugin will
+require access to the API server to retrieve secret data and therefore the volume `Host` interface
+will have to change to expose a client interface:
+
+```go
+type Host interface {
+    // Other methods omitted
+
+    // GetKubeClient returns a client interface
+    GetKubeClient() client.Interface
+}
+```
+
+The secret volume plugin will be responsible for:
+
+1.  Returning a `volume.Builder` implementation from `NewBuilder` that:
+    1.  Retrieves the secret data for the volume from the API server
+    2.  Places the secret data onto the container's filesystem
+    3.  Sets the correct security attributes for the volume based on the pod's `SecurityContext`
+2.  Returning a `volume.Cleaner` implementation from `NewClear` that cleans the volume from the
+    container's filesystem
+
+### Kubelet: Node-level secret storage
+
+The Kubelet must be modified to accept a new parameter for the secret storage size and to create
+a tmpfs file system of that size to store secret data.  Rough accounting of specific changes:
+
+1.  The Kubelet should have a new field added called `secretStorageSize`; units are megabytes
+2.  `NewMainKubelet` should accept a value for secret storage size
+3.  The Kubelet server should have a new flag added for secret storage size
+4.  The Kubelet's `setupDataDirs` method should be changed to create the secret storage
+
+### Kubelet: New behaviors for secrets associated with service accounts
+
+For use-cases where the Kubelet's behavior is affected by the secrets associated with a pod's
+`ServiceAccount`, the Kubelet will need to be changed.  For example, if secrets of type
+`docker-reg-auth` affect how the pod's images are pulled, the Kubelet will need to be changed
+to accomodate this.  Subsequent proposals can address this on a type-by-type basis.
+
+## Examples
+
+For clarity, let's examine some detailed examples of some common use-cases in terms of the
+suggested changes.  All of these examples are assumed to be created in a namespace called
+`example`.
+
+### Use-Case: Pod with ssh keys
+
+To create a pod that uses an ssh key stored as a secret, we first need to create a secret:
+
+```json
+{
+  "apiVersion": "v1beta2",
+  "kind": "Secret",
+  "id": "ssh-key-secret",
+  "data": {
+    "id_rsa.pub": "dmFsdWUtMQ0K",
+    "id_rsa": "dmFsdWUtMg0KDQo="
+  }
+}
+```
+
+**Note:** The values of secret data are encoded as base64-encoded strings.
+
+Now we can create a pod which references the secret with the ssh key and consumes it in a volume:
+
+```json
+{
+  "id": "secret-test-pod",
+  "kind": "Pod",
+  "apiVersion":"v1beta2",
+  "labels": {
+    "name": "secret-test"
+  },
+  "desiredState": {
+    "manifest": {
+      "version": "v1beta1",
+      "id": "secret-test-pod",
+      "containers": [{
+        "name": "ssh-test-container",
+        "image": "mySshImage",
+        "volumeMounts": [{
+          "name": "secret-volume",
+          "mountPath": "/etc/secret-volume",
+          "readOnly": true
+        }]
+      }],
+      "volumes": [{
+        "name": "secret-volume",
+        "source": {
+          "secret": {
+            "target": {
+              "kind": "Secret",
+              "namespace": "example",
+              "name": "ssh-key-secret"
+            }
+          }
+        }
+      }]
+    }
+  }
+}
+```
+
+When the container's command runs, the pieces of the key will be available in:
+
+    /etc/secret-volume/id_rsa.pub
+    /etc/secret-volume/id_rsa
+
+The container is then free to use the secret data to establish an ssh connection.
+
+### Use-Case: Pods with pod / test credentials
+
+Let's compare examples where a pod consumes a secret containing prod credentials and another pod
+consumes a secret with test environment credentials.
+
+The secrets:
+
+```json
+[{
+  "apiVersion": "v1beta2",
+  "kind": "Secret",
+  "id": "prod-db-secret",
+  "data": {
+    "username": "dmFsdWUtMQ0K",
+    "password": "dmFsdWUtMg0KDQo="
+  }
+},
+{
+  "apiVersion": "v1beta2",
+  "kind": "Secret",
+  "id": "test-db-secret",
+  "data": {
+    "username": "dmFsdWUtMQ0K",
+    "password": "dmFsdWUtMg0KDQo="
+  }
+}]
+```
+
+The pods:
+
+```json
+[{
+  "id": "prod-db-client-pod",
+  "kind": "Pod",
+  "apiVersion":"v1beta2",
+  "labels": {
+    "name": "prod-db-client"
+  },
+  "desiredState": {
+    "manifest": {
+      "version": "v1beta1",
+      "id": "prod-db-pod",
+      "containers": [{
+        "name": "db-client-container",
+        "image": "myClientImage",
+        "volumeMounts": [{
+          "name": "secret-volume",
+          "mountPath": "/etc/secret-volume",
+          "readOnly": true
+        }]
+      }],
+      "volumes": [{
+        "name": "secret-volume",
+        "source": {
+          "secret": {
+            "target": {
+              "kind": "Secret",
+              "namespace": "example",
+              "name": "prod-db-secret"
+            }
+          }
+        }
+      }]
+    }
+  }
+},
+{
+  "id": "test-db-client-pod",
+  "kind": "Pod",
+  "apiVersion":"v1beta2",
+  "labels": {
+    "name": "test-db-client"
+  },
+  "desiredState": {
+    "manifest": {
+      "version": "v1beta1",
+      "id": "test-db-pod",
+      "containers": [{
+        "name": "db-client-container",
+        "image": "myClientImage",
+        "volumeMounts": [{
+          "name": "secret-volume",
+          "mountPath": "/etc/secret-volume",
+          "readOnly": true
+        }]
+      }],
+      "volumes": [{
+        "name": "secret-volume",
+        "source": {
+          "secret": {
+            "target": {
+              "kind": "Secret",
+              "namespace": "example",
+              "name": "test-db-secret"
+            }
+          }
+        }
+      }]
+    }
+  }
+}]
+```
+
+The specs for the two pods differ only in the value of the object referred to by the secret volume
+source.  Both containers will have the following files present on their filesystems:
+
+    /etc/secret-volume/username
+    /etc/secret-volume/password