mirror of https://github.com/hashicorp/consul
Connect production guide draft 1
parent
b757b5cc48
commit
ac0c5c2bfa
|
@ -14,17 +14,17 @@ designed to work with minimal configuration out of the box, but completing the
|
||||||
security model](/docs/internals/security.html) are prerequisites for production
|
security model](/docs/internals/security.html) are prerequisites for production
|
||||||
deployments.
|
deployments.
|
||||||
|
|
||||||
This guide aims to walk step-by-step through a cluster setup that meets all of
|
This guide aims to walk through the steps required to ensure the security
|
||||||
those security-related goals.
|
guarantees hold.
|
||||||
|
|
||||||
We assume a cluster is already running with an appropriate number of servers and
|
We assume a cluster is already running with an appropriate number of servers and
|
||||||
clients. To follow along with this guide in a dev environment you can follow our
|
clients. To follow along with this guide in a dev environment you can follow our
|
||||||
[getting started guide](/intro/getting-started/install.html). For an actual
|
[getting started guide](/intro/getting-started/install.html). For a production
|
||||||
production cluster we expect other reference material like the
|
cluster we expect other reference material like the
|
||||||
[deployment](/docs/guides/deployment.html) and
|
[deployment](/docs/guides/deployment.html) and
|
||||||
[performance](/docs/guides/performance.html) guides have been followed.
|
[performance](/docs/guides/performance.html) guides have been followed.
|
||||||
|
|
||||||
The steps we need to take to get to a secure connect cluster are:
|
The steps we need to get to a secure Connect cluster are:
|
||||||
|
|
||||||
1. [Configure ACLs](#configure-acls)
|
1. [Configure ACLs](#configure-acls)
|
||||||
1. [Configure Agent Transport Encryption](#configure-agent-transport-encryption)
|
1. [Configure Agent Transport Encryption](#configure-agent-transport-encryption)
|
||||||
|
@ -51,11 +51,12 @@ A secure ACL setup must meet these criteria:
|
||||||
|
|
||||||
1. **[ACL default
|
1. **[ACL default
|
||||||
policy](https://private-docs.consul.io/docs/agent/options.html#acl_default_policy)
|
policy](https://private-docs.consul.io/docs/agent/options.html#acl_default_policy)
|
||||||
must be `deny`.** It is technically sufficient to keep default `allow` but
|
must be `deny`.** It is technically sufficient to keep the default policy of
|
||||||
add an explicit ACL denying anonymous `service:write`. Note however that in
|
`allow` but add an explicit ACL denying anonymous `service:write`. Note
|
||||||
this case the Connect intention graph will also default to `allow` and
|
however that in this case the Connect intention graph will also default to
|
||||||
explicit `deny` intentions will be needed to restrict service access. It is
|
`allow` and explicit `deny` intentions will be needed to restrict service
|
||||||
assumed for the remainder of this guide that ACL policy defaults to `deny`.
|
access. It is assumed for the remainder of this guide that ACL policy
|
||||||
|
defaults to `deny`.
|
||||||
2. **Each service must have a distinct ACL token** that is restricted to
|
2. **Each service must have a distinct ACL token** that is restricted to
|
||||||
`service:write` only for the named service. Current Consul ACLs only support
|
`service:write` only for the named service. Current Consul ACLs only support
|
||||||
prefix matching but in a near-future release we will allow exact name
|
prefix matching but in a near-future release we will allow exact name
|
||||||
|
@ -66,27 +67,30 @@ A secure ACL setup must meet these criteria:
|
||||||
### Fine Grained Enforcement
|
### Fine Grained Enforcement
|
||||||
|
|
||||||
Connect intentions manage access based only on service identity so it is
|
Connect intentions manage access based only on service identity so it is
|
||||||
sufficient for ACL tokens to only be unique per service and shared between
|
sufficient for ACL tokens to only be unique per _service_ and shared between
|
||||||
instances.
|
instances.
|
||||||
|
|
||||||
It is much better though if ACL tokens are unique per service _instance_ though.
|
It is much better though if ACL tokens are unique per service _instance_ because
|
||||||
The reason for this is to limit the blast radius of a compromise.
|
it limits the blast radius of a compromise.
|
||||||
|
|
||||||
A future release of Connect will support revoking specific certificates that
|
A future release of Connect will support revoking specific certificates that
|
||||||
have been issued. For example if a single node in a datacenter has been
|
have been issued. For example if a single node in a datacenter has been
|
||||||
compromised, it will be possible to find all certificates issued to the agent on
|
compromised, it will be possible to find all certificates issued to the agent on
|
||||||
that node and revoke them blocking access to the intruder without taking
|
that node and revoke them. This will block all access to the intruder without
|
||||||
unaffected instances of the service(s) on that node offline too.
|
taking unaffected instances of the service(s) on that node offline too.
|
||||||
|
|
||||||
While this will work with service-unique tokens, there is nothing stopping an
|
While this will work with service-unique tokens, there is nothing stopping an
|
||||||
attacker from obtaining certificates while spoofing the agent ID of another
|
attacker from obtaining certificates while spoofing the agent ID or other
|
||||||
agent - these certificates will not appear to have been issued to the
|
identifier – these certificates will not appear to have been issued to the
|
||||||
compromised agent and so will not be revoked. If every service instance has a
|
compromised agent and so will not be revoked.
|
||||||
unique token however, it will be possible to revoke all certificates that were
|
|
||||||
requested under that token which denies access to any certificate the attacker
|
|
||||||
could generate.
|
|
||||||
|
|
||||||
In practice managing per-instance tokens requires automated ACL provisioning,
|
If every service instance has a unique token however, it will be possible to
|
||||||
|
revoke all certificates that were requested under that token. Assuming the
|
||||||
|
attacker can only access the tokens present on the compromised host, this
|
||||||
|
guarantees that any certificate they might have access to or requested directly
|
||||||
|
will be revoked.
|
||||||
|
|
||||||
|
In practice, managing per-instance tokens requires automated ACL provisioning,
|
||||||
for example using [HashiCorp's
|
for example using [HashiCorp's
|
||||||
Vault](https://www.vaultproject.io/docs/secrets/consul/index.html).
|
Vault](https://www.vaultproject.io/docs/secrets/consul/index.html).
|
||||||
|
|
||||||
|
@ -99,6 +103,10 @@ between the server and client agents or between client agent and application.
|
||||||
Follow the [encryption documentation](/docs/agent/encryption.html) to ensure
|
Follow the [encryption documentation](/docs/agent/encryption.html) to ensure
|
||||||
both gossip encryption and RPC TLS are configured securely.
|
both gossip encryption and RPC TLS are configured securely.
|
||||||
|
|
||||||
|
For now client and server TLS certificates are still managed by manual
|
||||||
|
configuration. In the future we plan to automate more of that with the same
|
||||||
|
mechanisms connect offers to user applications.
|
||||||
|
|
||||||
## Bootstrap Certificate Authority
|
## Bootstrap Certificate Authority
|
||||||
|
|
||||||
Consul Connect comes with a built in Certificate Authority (CA) that will
|
Consul Connect comes with a built in Certificate Authority (CA) that will
|
||||||
|
@ -112,8 +120,6 @@ connect {
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Note that server agents running in `-dev` mode have this enabled by default.
|
|
||||||
|
|
||||||
This config change requires a restart which you can perform one server at a time
|
This config change requires a restart which you can perform one server at a time
|
||||||
to maintain availability in an existing cluster.
|
to maintain availability in an existing cluster.
|
||||||
|
|
||||||
|
@ -131,23 +137,63 @@ integrated. We will expand the external CA systems that are supported in the
|
||||||
future and will allow seamless online migration to a different CA or
|
future and will allow seamless online migration to a different CA or
|
||||||
bootstrapping with an external CA.
|
bootstrapping with an external CA.
|
||||||
|
|
||||||
For production workloads we recommend using Vault as the CA such that the root
|
For production workloads we recommend using Vault or another external CA once
|
||||||
key is not stored within Consul state at all.
|
available such that the root key is not stored within Consul state at all.
|
||||||
|
|
||||||
|
TODO: link to vault config docs?
|
||||||
|
|
||||||
## Setup Host Firewall
|
## Setup Host Firewall
|
||||||
|
|
||||||
If using [managed proxies]() Consul will by default assign them ports from [a
|
In order to enable inbound connections to connect proxies, you may need to
|
||||||
configurable range]() the default range is 20000 - 20255. If this feature is
|
configure host or network firewalls to allow incoming connections to proxy
|
||||||
used, the agent assumes all ports in that range are both free to use (no other
|
ports.
|
||||||
processes listening on them) and are exposed in the firewall to accept
|
|
||||||
connections from other service hosts.
|
|
||||||
|
|
||||||
TODO: could show example iptables rule but it seems kinda limited and obvious
|
In addition to Consul agent's [communication
|
||||||
|
ports](https://private-docs.consul.io/docs/agent/options.html#ports) any
|
||||||
|
[managed proxies](/docs/connect/proxies.html#managed-proxies) will need to have
|
||||||
|
ports open to accept incoming connections.
|
||||||
|
|
||||||
|
Consul will by default assign them ports from [a configurable
|
||||||
|
range](https://private-docs.consul.io/docs/agent/options.html#ports); the default
|
||||||
|
range is 20000 - 20255. If this feature is used, the agent assumes all ports in
|
||||||
|
that range are both free to use (no other processes listening on them) and are
|
||||||
|
exposed in the firewall to accept connections from other service hosts.
|
||||||
|
|
||||||
|
Alternatively, managed proxies can have their public ports specified as part of
|
||||||
|
the [proxy configuration](#TODO) in the service registration. It is possible to use
|
||||||
|
this exclusively and prevent automated port selection by [configuring
|
||||||
|
`proxy_min_port` and
|
||||||
|
`proxy_max_port`](https://private-docs.consul.io/docs/agent/options.html#ports)
|
||||||
|
to both be `0`, forcing any managed proxies to have an explicit port configured.
|
||||||
|
|
||||||
|
It then becomes the same problem as opening ports necessary for any other
|
||||||
|
application and might be managed by configuration management or a scheduler.
|
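As a rough illustration of the "free to use" assumption above, the following sketch tries to bind each port in the default range and reports the ones that are already taken. The function name `ports_in_use` and the constants are hypothetical helpers, not part of Consul, and the check only covers local availability; it says nothing about whether the firewall exposes the ports.

```python
import socket

# Default managed-proxy port range quoted in the guide (inclusive).
PROXY_MIN_PORT = 20000
PROXY_MAX_PORT = 20255


def ports_in_use(min_port, max_port, host="0.0.0.0"):
    """Return the ports in [min_port, max_port] that cannot be bound on host."""
    busy = []
    for port in range(min_port, max_port + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                # A failed bind means another process (or lingering
                # connection state) already holds the port.
                sock.bind((host, port))
            except OSError:
                busy.append(port)
    return busy


if __name__ == "__main__":
    busy = ports_in_use(PROXY_MIN_PORT, PROXY_MAX_PORT)
    print(f"{len(busy)} port(s) already in use: {busy}")
```

A check like this could run from configuration management before the agent starts, so that a conflict is caught before a proxy is assigned a port it cannot listen on.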
||||||
|
|
||||||
## Configure Service Instances
|
## Configure Service Instances
|
||||||
|
|
||||||
TODO:
|
With [necessary ACL tokens](#configure-acls) in place, all service registrations
|
||||||
- provide ACL token to API client/on disk
|
need to have an appropriate ACL token present.
|
||||||
- optionally configure manged proxy
|
|
||||||
- notes about binding app only to localhost
|
|
||||||
|
|
||||||
|
For on-disk configuration the `token` parameter of the service definition must
|
||||||
|
be set.
|
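A minimal sketch of generating such an on-disk definition, assuming the JSON service definition format with a top-level `service` object. The helper `service_definition`, the service name `web`, the port and the token value are all placeholders for illustration.

```python
import json


def service_definition(name, port, token):
    """Build an on-disk service definition that carries its own ACL token."""
    return {"service": {"name": name, "port": port, "token": token}}


if __name__ == "__main__":
    # "web", 8080 and the token string are placeholders; in practice the
    # token would come from an automated provisioning system.
    definition = service_definition("web", 8080, "<acl-token-for-web>")
    print(json.dumps(definition, indent=2))
```

Because the token ends up in a file on disk, the file permissions should restrict read access to the user the agent runs as.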
||||||
|
|
||||||
|
For registration via the API [the token is passed in the request
|
||||||
|
header](https://private-docs.consul.io/api/index.html#acls) or by using the [Go
|
||||||
|
client configuration](https://godoc.org/github.com/hashicorp/consul/api#Config).
|
||||||
|
Note that by default API registration will not allow managed proxies to be
|
||||||
|
configured since it potentially opens a remote execution vulnerability if the
|
||||||
|
agent API endpoints are publicly accessible. This can be [configured
|
||||||
|
per-agent](https://private-docs.consul.io/docs/agent/options.html#connect_proxy).
|
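For registration over HTTP, the sketch below builds (but does not send) a request to the agent's service registration endpoint with the token in the `X-Consul-Token` header. The helper name `build_register_request`, the agent address and the service values are assumptions for illustration only.

```python
import json
import urllib.request


def build_register_request(agent_addr, name, port, token):
    """Build a PUT request that registers a service using the given ACL token."""
    body = json.dumps({"Name": name, "Port": port}).encode("utf-8")
    return urllib.request.Request(
        url=f"{agent_addr}/v1/agent/service/register",
        data=body,
        headers={
            # The ACL token travels in a request header, not in the body.
            "X-Consul-Token": token,
            "Content-Type": "application/json",
        },
        method="PUT",
    )


if __name__ == "__main__":
    # Address, service name, port and token are placeholders.
    req = build_register_request(
        "http://127.0.0.1:8500", "web", 8080, "<acl-token-for-web>"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```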
||||||
|
|
||||||
|
For examples of service definitions with managed or unmanaged proxies see
|
||||||
|
[proxies documentation](/docs/connect/proxies.html#managed-proxies).
|
||||||
|
|
||||||
|
To avoid the overhead of a proxy, applications may [natively
|
||||||
|
integrate](/docs/connect/native.html) with connect.
|
||||||
|
|
||||||
|
### Protect Application Listener
|
||||||
|
|
||||||
|
If using any kind of proxy for connect, the application must ensure no untrusted
|
||||||
|
connections can be made to its unprotected listening port. This is typically
|
||||||
|
done by binding to `localhost` and only allowing loopback traffic, but may also
|
||||||
|
be achieved using firewall rules or network namespacing.
|
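A minimal sketch of the loopback-only binding described above, using a plain TCP socket. The helper name `listen_on_loopback` is hypothetical; a real application would apply the same bind address in whatever server framework it uses.

```python
import socket


def listen_on_loopback(port=0):
    """Open a TCP listener reachable only from the local host."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Binding to 127.0.0.1 instead of 0.0.0.0 means only local processes,
    # such as the proxy on the same host, can connect.
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = listen_on_loopback()
    print("listening on", listener.getsockname())
    listener.close()
```

Passing port `0` lets the operating system pick a free port, which is convenient for a demonstration; a real service would use its configured port.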