Connect production guide draft 1

pull/4275/head
Paul Banks 2018-06-13 16:19:44 +01:00 committed by Jack Pearkes
parent b757b5cc48
commit ac0c5c2bfa
1 changed files with 82 additions and 36 deletions

View File

@ -14,17 +14,17 @@ designed to work with minimal configuration out of the box, but completing the
security model](/docs/internals/security.html) are prerequisites for production security model](/docs/internals/security.html) are prerequisites for production
deployments. deployments.
This guide aims to walk step-by-step through a cluster setup that meets all of This guide aims to walk through the steps required to ensure the security
those security-related goals. guarantees hold.
We assume a cluster is already running with an appropriate number of servers and We assume a cluster is already running with an appropriate number of servers and
clients. To follow along with this guide in a dev environment you can follow our clients. To follow along with this guide in a dev environment you can follow our
[getting started guide](/intro/getting-started/install.html). For an actual [getting started guide](/intro/getting-started/install.html). For a production
production cluster we expect other reference material like the cluster we expect other reference material like the
[deployment](/docs/guides/deployment.html) and [deployment](/docs/guides/deployment.html) and
[performance](/docs/guides/performance.html) guides have been followed. [performance](/docs/guides/performance.html) guides have been followed.
The steps we need to take to get to a secure connect cluster are: The steps we need to get to a secure Connect cluster are:
1. [Configure ACLs](#configure-acls) 1. [Configure ACLs](#configure-acls)
1. [Configure Agent Transport Encryption](#configure-agent-transport-encryption) 1. [Configure Agent Transport Encryption](#configure-agent-transport-encryption)
@ -51,11 +51,12 @@ A secure ACL setup must meet these criteria:
1. **[ACL default 1. **[ACL default
policy](https://private-docs.consul.io/docs/agent/options.html#acl_default_policy) policy](https://private-docs.consul.io/docs/agent/options.html#acl_default_policy)
must be `deny`.** It is technically sufficient to keep default `allow` but must be `deny`.** It is technically sufficient to keep the default policy of
add an explicit ACL denying anonymous `service:write`. Note however that in `allow` but add an explicit ACL denying anonymous `service:write`. Note
this case the Connect intention graph will also default to `allow` and however that in this case the Connect intention graph will also default to
explicit `deny` intentions will be needed to restrict service access. It is `allow` and explicit `deny` intentions will be needed to restrict service
assumed for the remainder of this guide that ACL policy defaults to `deny`. access. It is assumed for the remainder of this guide that ACL policy
defaults to `deny`.
2. **Each service must have a distinct ACL token** that is restricted to 2. **Each service must have a distinct ACL token** that is restricted to
`service:write` only for the named service. Current Consul ACLs only support `service:write` only for the named service. Current Consul ACLs only support
prefix matching but in a near-future release we will allow exact name prefix matching but in a near-future release we will allow exact name
@ -66,27 +67,30 @@ A secure ACL setup must meet these criteria:
### Fine Grained Enforcement ### Fine Grained Enforcement
Connect intentions manage access based only on service identity so it is Connect intentions manage access based only on service identity so it is
sufficient for ACL tokens to only be unique per service and shared between sufficient for ACL tokens to only be unique per _service_ and shared between
instances. instances.
It is much better though if ACL tokens are unique per service _instance_ though. It is much better though if ACL tokens are unique per service _instance_ because
The reason for this is to limit the blast radius of a compromise. it limit the blast radius of a compromise.
A future release of Connect will support revoking specific certificates that A future release of Connect will support revoking specific certificates that
have been issued. For example if a single node in a datacenter has been have been issued. For example if a single node in a datacenter has been
compromised, it will be possible to find all certificates issued to the agent on compromised, it will be possible to find all certificates issued to the agent on
that node and revoke them blocking access to the intruder without taking that node and revoke them. This will block all access to the intruder without
unaffected instances of the service(s) on that node offline too. taking unaffected instances of the service(s) on that node offline too.
While this will work with service-unique tokens, there is nothing stopping an While this will work with service-unique tokens, there is nothing stopping an
attacker from obtaining certificates while spoofing the agent ID of another attacker from obtaining certificates while spoofing the agent ID or other
agent - these certificates will not appear to have been issued to the identifier these certificates will not appear to have been issued to the
compromised agent and so will not be revoked. If every service instance has a compromised agent and so will not be revoked.
unique token however, it will be possible to revoke all certificates that were
requested under that token which denies access to any certificate the attacker
could generate.
In practice managing per-instance tokens requires automated ACL provisioning, If every service instance has a unique token however, it will be possible to
revoke all certificates that were requested under that token. Assuming the
attacker can only access the tokens present on the compromised host, this
guarantees that any certificate they might have access to or requested directly
will be revoked.
In practice, managing per-instance tokens requires automated ACL provisioning,
for example using [HashiCorp's for example using [HashiCorp's
Vault](https://www.vaultproject.io/docs/secrets/consul/index.html). Vault](https://www.vaultproject.io/docs/secrets/consul/index.html).
@ -99,6 +103,10 @@ between the server and client agents or between client agent and application.
Follow the [encryption documentation](/docs/agent/encryption.html) to ensure Follow the [encryption documentation](/docs/agent/encryption.html) to ensure
both gossip encryption and RPC TLS are configured securely. both gossip encryption and RPC TLS are configured securely.
For now client and server TLS certificates are still managed by manual
configuration. In the future we plan to automate more of that with the same
mechanisms connect offers to user applications.
## Bootstrap Certificate Authority ## Bootstrap Certificate Authority
Consul Connect comes with a built in Certificate Authority (CA) that will Consul Connect comes with a built in Certificate Authority (CA) that will
@ -112,8 +120,6 @@ connect {
} }
``` ```
Note that server agents running in `-dev` mode have this enabled by default.
This config change requires a restart which you can perform one server at a time This config change requires a restart which you can perform one server at a time
to maintain availability in an existing cluster. to maintain availability in an existing cluster.
@ -131,23 +137,63 @@ integrated. We will expand the external CA systems that are supported in the
future and will allow seamless online migration to a different CA or future and will allow seamless online migration to a different CA or
bootstrapping with an external CA. bootstrapping with an external CA.
For production workloads we recommend using Vault as the CA such that the root For production workloads we recommend using Vault or another external CA once
key is not stored within Consul state at all. available such that the root key is not stored within Consul state at all.
TODO: link to vault config docs?
## Setup Host Firewall ## Setup Host Firewall
If using [managed proxies]() Consul will by default assign them ports from [a In order to enable inbound connections to connect proxies, you may need to
configurable range]() the default range is 20000 - 20255. If this feature is configure host or network firewalls to allow incoming connections to proxy
used, the agent assumes all ports in that range are both free to use (no other ports.
processes listening on them) and are exposed in the firewall to accept
connections from other service hosts.
TODO: could show example iptables rule but it seems kinda limited and obvious In addition to Consul agent's [communication
ports](https://private-docs.consul.io/docs/agent/options.html#ports) any
[managed proxies](/docs/connect/proxies.html#managed-proxies) will need to have
ports open to accept incoming connections.
Consul will by default assign them ports from [a configurable
range](https://private-docs.consul.io/docs/agent/options.html#ports) the default
range is 20000 - 20255. If this feature is used, the agent assumes all ports in
that range are both free to use (no other processes listening on them) and are
exposed in the firewall to accept connections from other service hosts.
Alternatively, managed proxies can have their public ports specified as part of
the [proxy configuration](#TODO) in the service registration. It is possible to use
this exclusively and prevent automated port selection by [configuring
`proxy_min_port` and
`proxy_max_port`](https://private-docs.consul.io/docs/agent/options.html#ports)
to both be `0`, forcing any managed proxies to have an explicit port configured.
It then becomes the same problem as opening ports necessary for any other
application and might be managed by configuration management or a scheduler.
## Configure Service Instances ## Configure Service Instances
TODO: With [necessary ACL tokens](#configure-acls) in place, all service registrations
- provide ACL token to API client/on disk need to have an appropriate ACL token present.
- optionally configure manged proxy
- notes about binding app only to localhost
For on-disk configuration the `token` parameter of the service definition must
be set.
For registration via the API [the token is passed in the request
header](https://private-docs.consul.io/api/index.html#acls) or by using the [Go
client configuration](https://godoc.org/github.com/hashicorp/consul/api#Config).
Note that by default API registration will not allow managed proxies to be
configured since it potentially opens a remote execution vulnerability if the
agent API endpoints are publicly accessible. This can be [configured
per-agent](https://private-docs.consul.io/docs/agent/options.html#connect_proxy).
For examples of service definitions with managed or unmanaged proxies see
[proxies documentation](/docs/connect/proxies.html#managed-proxies).
To avoid the overhead of a proxy, applications may [natively
integrate](/docs/connect/native.html) with connect.
### Protect Application Listener
If using any kind of proxy for connect, the application must ensure no untrusted
connections can be made to it's unprotected listening port. This is typically
done by binding to `localhost` and only allowing loopback traffic, but may also
be achieved using firewall rules or network namespacing.