mirror of https://github.com/hashicorp/consul
Merge pull request #7823 from hashicorp/docs-wanfed-mesh
Redo PR #7430 for new website (docs for WAN federation over mesh gateways)
pull/7876/head
commit
0a77ea2bfc
@@ -155,6 +155,7 @@ export default [
  content: ['envoy', 'built-in', 'integrate'],
},
'mesh_gateway',
'wan-federation-via-mesh-gateways',
{
  category: 'registration',
  content: ['service-registration', 'sidecar-service'],

@@ -0,0 +1,185 @@
---
layout: docs
page_title: Connect - WAN Federation via Mesh Gateways
sidebar_title: WAN Federation via Mesh Gateways <sup> Beta </sup>
description: |-
  WAN federation via mesh gateways allows for Consul servers in different datacenters to be federated exclusively through mesh gateways.
---

# WAN Federation via Mesh Gateways <sup>Beta</sup>

-> **1.8.0+:** This feature is available in Consul versions 1.8.0 and higher

~> This topic requires familiarity with [mesh gateways](/docs/connect/mesh_gateway).

WAN federation via mesh gateways allows for Consul servers in different datacenters
to be federated exclusively through mesh gateways.

When setting up a
[multi-datacenter](https://learn.hashicorp.com/consul/security-networking/datacenters)
Consul cluster, operators must ensure that all Consul servers in every
datacenter can directly reach one another over their WAN-advertised network
addresses.

If you are using Kubernetes, refer to our [Kubernetes Multi Cluster](/docs/k8s/installation/multi-cluster) documentation.

Meeting this connectivity requirement means that operators setting up the virtual
machines or containers hosting the servers must take additional steps to ensure the
necessary routing and firewall rules are in place to allow the servers to speak to
each other over the WAN.

Sometimes this prerequisite is difficult or undesirable to meet:

* **Difficult:** The datacenters may exist in multiple Kubernetes clusters that
  unfortunately have overlapping pod IP subnets, or may exist in different
  cloud provider VPCs that have overlapping subnets.

* **Undesirable:** Network security teams may not approve of granting so many
  firewall rules. When using platform autoscaling, keeping rules up to date becomes untenable.

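To make the "overlapping subnets" case concrete, the following is a minimal sketch that flags datacenters whose address ranges collide. The function name `overlapping_pairs` and the example CIDR ranges are illustrative assumptions, not values taken from Consul or from any real deployment.

```python
import ipaddress


def overlapping_pairs(subnets):
    """Return every pair of named subnets whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    names = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical pod subnets for three clusters; dc1 and dc2 collide.
example = {"dc1": "10.0.0.0/16", "dc2": "10.0.128.0/17", "dc3": "10.1.0.0/16"}
print(overlapping_pairs(example))
```

When such a check reports any pair, direct server-to-server routing between those datacenters is ambiguous, which is the situation this feature is meant to work around.
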
Operators looking to simplify their WAN deployment and minimize the exposed
security surface area can elect to join these datacenters together using [mesh
gateways](/docs/connect/mesh_gateway).

## Architecture

There are two main kinds of communication that occur over the WAN link spanning
the gulf between disparate Consul datacenters:

* **WAN gossip:** Consul uses the serf and memberlist libraries to gossip
  failure-detection information about Consul servers in each datacenter.
  By default this operates point to point between servers over `8302/udp` with
  a fallback to `8302/tcp` (which logs a warning indicating the network is
  misconfigured).

* **Cross-datacenter RPCs:** Consul servers expose a special multiplexed port
  over `8300/tcp`. Several distinct kinds of messages can be received on this
  port, such as RPC requests forwarded from servers in other datacenters.

In this network topology, individual Consul client agents on a LAN in one
datacenter never need to directly dial servers in other datacenters. This
means you could introduce a set of firewall rules prohibiting `10.0.0.0/24`
from sending any traffic at all to `10.1.2.0/24` for security isolation.

You may already have configured [mesh
gateways](https://learn.hashicorp.com/consul/developer-mesh/connect-gateways)
to allow services in the service mesh to freely connect between datacenters
regardless of the lateral connectivity of the nodes hosting the Consul client
agents.

By activating WAN federation via mesh gateways, the servers
can similarly use the existing mesh gateways to reach each other without
themselves being directly reachable.

## Configuration

### TLS

All Consul servers in all datacenters should have TLS configured with certificates containing
these SAN fields:

    server.<this_datacenter>.<domain>             (normal)
    <node_name>.server.<this_datacenter>.<domain> (needed for wan federation)

This can be achieved using any number of tools, including
`consul tls cert create` with the `-node` flag.

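As a quick way to see which names a given server certificate needs, the sketch below expands the two SAN patterns above. The helper name `server_sans` and the example node and datacenter names are hypothetical; the default `domain` of `consul` matches Consul's default DNS domain.

```python
def server_sans(node_name, datacenter, domain="consul"):
    """Expand the two SAN patterns required on a server certificate."""
    return [
        f"server.{datacenter}.{domain}",              # normal server SAN
        f"{node_name}.server.{datacenter}.{domain}",  # needed for WAN federation
    ]


# Hypothetical node "server-1" in datacenter "dc2".
for san in server_sans("server-1", "dc2"):
    print(san)
```

The output can be compared against the SAN list of an issued certificate to confirm the node-specific name is present.
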
### Mesh Gateways

At least one mesh gateway must be configured to opt in to exposing
the servers. When using the `consul connect envoy` CLI,
this is done with the `-expose-servers` flag. The flag simply registers
the mesh gateway into the catalog with the additional piece of service metadata
`{"consul-wan-federation":"1"}`. If you are registering the mesh gateways
into the catalog out of band, you may simply add this to your existing
registration payload.

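For out-of-band registration, a payload carrying the metadata might be assembled along these lines. This is a sketch under assumptions: the field names `Name`, `Kind`, `Port`, and `Meta` follow the agent service registration API, while the service name and port are placeholders and a real registration would usually carry more fields (such as addresses and proxy settings).

```python
import json


def mesh_gateway_registration(name="mesh-gateway", port=8443, expose_servers=True):
    """Build a minimal service registration body for a mesh gateway."""
    payload = {"Name": name, "Kind": "mesh-gateway", "Port": port}
    if expose_servers:
        # The metadata the text above says marks the gateway as exposing servers.
        payload["Meta"] = {"consul-wan-federation": "1"}
    return payload


print(json.dumps(mesh_gateway_registration(), indent=2))
```

If your existing payload already has a `Meta` map, only the single `consul-wan-federation` key needs to be added to it.
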
!> Before activating the feature on an existing cluster, you should ensure that
each datacenter has at least one registered mesh gateway prepared to expose the
servers; otherwise the WAN will become only partly connected.

### Consul Server Options

A few additional pieces of configuration are necessary beyond those
required for standing up a
[multi-datacenter](https://learn.hashicorp.com/consul/security-networking/datacenters)
Consul cluster.

Consul servers in the _primary_ datacenter should add this snippet to the
configuration file:

```hcl
connect {
  enabled = true
  enable_mesh_gateway_wan_federation = true
}
```

Consul servers in all _secondary_ datacenters should add this snippet to the
configuration file:

```hcl
primary_gateways = [ "<primary-mesh-gateway-ip>:<primary-mesh-gateway-port>", ... ]
connect {
  enabled = true
  enable_mesh_gateway_wan_federation = true
}
```

Any references to [`start_join_wan`](/docs/agent/options#start_join_wan) or [`retry_join_wan`](/docs/agent/options#retry_join_wan) should be omitted.

-> The `primary_gateways` configuration can also use `go-discover` syntax just
like `retry_join_wan`.

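Because Consul also accepts JSON configuration files, the same settings can be generated and sanity-checked programmatically. The sketch below is illustrative only: the helper names are hypothetical, and the gateway address is a documentation-range placeholder rather than a real endpoint.

```python
import json


def federation_config(primary_gateways=None):
    """Return the server settings described above as a JSON-ready dict.

    Pass primary_gateways only for servers in secondary datacenters.
    """
    config = {
        "connect": {
            "enabled": True,
            "enable_mesh_gateway_wan_federation": True,
        }
    }
    if primary_gateways:
        config["primary_gateways"] = list(primary_gateways)
    return config


def leftover_wan_join_keys(config):
    """List WAN join options that should be omitted when using this feature."""
    return [k for k in ("start_join_wan", "retry_join_wan") if k in config]


secondary = federation_config(["203.0.113.10:8443"])  # placeholder address
print(json.dumps(secondary, indent=2))
print(leftover_wan_join_keys(secondary))
```
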
### Bootstrapping

For ease of debugging (such as avoiding a flurry of misleading error messages)
when intending to activate WAN federation via mesh gateways, it is best to
follow this general procedure:

### New secondary

1. Upgrade to the desired version of the Consul binary for all servers,
   clients, and CLI.
2. Start all Consul servers and clients on the new version in the primary
   datacenter.
3. Ensure the primary datacenter has at least one running, registered mesh gateway with
   the service metadata `{"consul-wan-federation":"1"}` set.
4. Ensure you are _prepared_ to launch corresponding mesh gateways in all
   secondaries. When ACLs are enabled, actually registering these requires
   upstream connectivity to the primary datacenter to authorize catalog
   registration.
5. Ensure all servers in the primary datacenter have the updated configuration,
   then restart them.
6. Ensure all servers in the secondary datacenter have the updated configuration.
7. Start all Consul servers and clients on the new version in the secondary
   datacenter.
8. When ACLs are enabled, shortly afterwards it should become possible to
   resolve ACL tokens from the secondary, at which time it should be possible
   to launch the mesh gateways in the secondary datacenter.

### Existing secondary

1. Upgrade to the desired version of the Consul binary for all servers,
   clients, and CLI.
2. Restart all Consul servers and clients on the new version.
3. Ensure each datacenter has at least one running, registered mesh gateway with the
   service metadata `{"consul-wan-federation":"1"}` set.
4. Ensure all servers in the primary datacenter have the updated configuration,
   then restart them.
5. Ensure all servers in the secondary datacenter have the updated configuration,
   then restart them.

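The "at least one registered mesh gateway per datacenter" precondition in the steps above can be checked before restarting any servers. The following sketch assumes you have already fetched catalog service entries for each datacenter and that they carry `ServiceKind` and `ServiceMeta` fields as in the catalog API; the function name and sample data are hypothetical.

```python
FEDERATION_META = ("consul-wan-federation", "1")


def datacenters_missing_gateway(catalog_by_dc):
    """Return datacenters with no mesh gateway marked for WAN federation.

    catalog_by_dc maps a datacenter name to a list of catalog service entries.
    """
    key, value = FEDERATION_META
    missing = []
    for dc, entries in sorted(catalog_by_dc.items()):
        has_gateway = any(
            entry.get("ServiceKind") == "mesh-gateway"
            and (entry.get("ServiceMeta") or {}).get(key) == value
            for entry in entries
        )
        if not has_gateway:
            missing.append(dc)
    return missing


# Hypothetical catalog snapshot: dc2's gateway lacks the metadata.
snapshot = {
    "dc1": [{"ServiceKind": "mesh-gateway", "ServiceMeta": {"consul-wan-federation": "1"}}],
    "dc2": [{"ServiceKind": "mesh-gateway", "ServiceMeta": {}}],
}
print(datacenters_missing_gateway(snapshot))
```

An empty result suggests it is safe to proceed to the configuration and restart steps.
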
### Verification

From any two datacenters joined together, double-check that the following give the
expected results:

* Check that `consul members -wan` lists all servers in all datacenters with
  their _local_ IP addresses and that each is listed as `alive`.

* Ensure that any API request that activates datacenter request forwarding, such as
  [`/v1/catalog/services?dc=<OTHER_DATACENTER_NAME>`](/api/catalog.html#dc-1),
  succeeds.
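
Both checks can also be scripted against the HTTP API. The sketch below only builds the forwarding request URL and inspects an already-parsed WAN member list; it assumes the member list comes from `/v1/agent/members?wan=1` and that a `Status` value of `1` means alive (the serf status code). The helper names, agent address, and sample members are illustrative.

```python
from urllib.parse import urlencode


def catalog_services_url(agent_addr, other_dc):
    """Build a catalog request that must be forwarded to another datacenter."""
    return f"{agent_addr}/v1/catalog/services?{urlencode({'dc': other_dc})}"


def not_alive(members):
    """Return names of WAN members whose status is not alive.

    Assumes each member is a dict with "Name" and an integer "Status",
    where 1 means alive.
    """
    return [m["Name"] for m in members if m.get("Status") != 1]


print(catalog_services_url("http://127.0.0.1:8500", "dc2"))

# Hypothetical WAN member list with one failed server.
members = [
    {"Name": "server-1.dc1", "Status": 1},
    {"Name": "server-1.dc2", "Status": 4},
]
print(not_alive(members))
```
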