Prior to this PR, servers / agents would panic and crash if an ingress
or api gateway were configured to use a discovery chain that both:
1. Referenced a peered service
2. Had a mesh gateway mode of local
This could occur, because code for handling upstream watches was shared
between both connect-proxy and the gateways. As a short-term fix, this
PR ensures that the maps are always initialized for these gateway services.
This PR also wraps the proxycfg execution and service
registration calls with recover statements to ensure that future issues
like this do not put the server into an unrecoverable state.
Prior to this commit, secondary datacenters could not be initialized
as peering acceptors if ACLs were enabled. This is due to the fact that
internal server-to-server API calls would fail because the management
token was not generated. This PR makes it so that both primary and
secondary datacenters generate their own management token whenever
a leader is elected in their respective clusters.
Fix configuration merging for implicit tproxy upstreams.
Change the merging logic so that the wildcard upstream has correct proxy-defaults
and service-defaults values combined into it. It did not previously merge all fields,
and the wildcard upstream did not exist unless service-defaults existed (it ignored
proxy-defaults, essentially).
Change the way we fetch upstream configuration in the xDS layer so that it falls back
to the wildcard when no matching upstream is found. This is what allows implicit peer
upstreams to have the correct "merged" config.
Change proxycfg to always watch local mesh gateway endpoints whenever a peer upstream
is found. This simplifies the logic so that we do not have to inspect the "merged"
configuration on peer upstreams to extract the mesh gateway mode.
Previously, we'd begin a session with the xDS concurrency limiter
regardless of whether the proxy was registered in the catalog or in
the server's local agent state.
This caused problems for users who run `consul connect envoy` directly
against a server rather than a client agent, as the server's locally
registered proxies wouldn't be included in the limiter's capacity.
Now, the `ConfigSource` is responsible for beginning the session and we
only do so for services in the catalog.
Fixes: https://github.com/hashicorp/consul/issues/15753
Co-authored-by: Dan Upton <daniel@floppy.co>
* backport of commit 7d0cf566ca
* backport of commit 024c8a84a6
* removing unused reference to pboperator
Co-authored-by: John Murret <john.murret@hashicorp.com>
It turns out that by default the dev mode vault server will attempt to interact with the
filesystem to store the provided root token. If multiple vault instances are running
they'll all awkwardly share the filesystem and if timing results in one server stopping
while another one is starting then the starting one will error with:
Error initializing Dev mode: rename /home/circleci/.vault-token.tmp /home/circleci/.vault-token: no such file or directory
This change uses `-dev-no-store-token` to bypass that source of flakes. Also the
stdout/stderr from the vault process is included if the test fails.
The introduction of more `t.Parallel` use in https://github.com/hashicorp/consul/pull/15669
increased the likelihood of this failure, but any of the tests with multiple vaults in use
(or running multiple package tests in parallel that all use vault) were eventually going
to flake on this.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
All of the current integration tests where Vault is the Connect CA now use non-root tokens for the test. This helps us detect privilege changes in the vault model so we can keep our guides up to date.
One larger change was that the RenewIntermediate function got refactored slightly so it could be used from a test, rather than the large duplicated function we were testing in a test which seemed error prone.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Consul used to rely on implicit issuer selection when calling Vault endpoints to issue new CSRs. Vault 1.11+ changed that behavior, which caused Consul to check the wrong (previous) issuer when renewing its Intermediate CA. This patch allows Consul to explicitly set a default issuer when it detects that the response from Vault is 1.11+.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>