Registering gRPC balancers is thread-unsafe because they are stored in a
global map variable that is accessed without holding a lock. Therefore,
it's expected that balancers are registered _once_ at the beginning of
your program (e.g. in a package `init` function) and certainly not after
you've started dialing connections, etc.
> NOTE: this function must only be called during initialization time
> (i.e. in an init() function), and is not thread-safe.
While this is fine for us in production, it's challenging for tests that
spin up multiple agents in-memory. We currently register a balancer per
agent, which holds agent-specific state that cannot safely be shared.
This commit introduces our own registry that _is_ thread-safe, and
implements the Builder interface such that we can call gRPC's `Register`
method once, on start-up. It uses the same pattern as our resolver
registry where we use the dial target's host (aka "authority"), which is
unique per-agent, to determine which builder to use.
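To make the pattern concrete, here is a minimal sketch of such a registry,
assuming a recent grpc-go where `balancer.BuildOptions` carries the dial
target; the package, type, and balancer names below are hypothetical
stand-ins, not the actual implementation.

```go
package balancerregistry

import (
	"fmt"
	"sync"

	"google.golang.org/grpc/balancer"
)

// gRPCBalancerName is a hypothetical name; the real registry would use
// whatever name is advertised in the agent's default service config.
const gRPCBalancerName = "agent-balancer"

// Registry implements balancer.Builder. It is handed to gRPC's Register
// exactly once at start-up, and thereafter dispatches Build calls to the
// per-agent builder keyed by the dial target's authority.
type Registry struct {
	mu       sync.RWMutex
	builders map[string]balancer.Builder // authority -> per-agent builder
}

func NewRegistry() *Registry {
	return &Registry{builders: make(map[string]balancer.Builder)}
}

var global = NewRegistry()

func init() {
	// gRPC's Register mutates a global map without locking, so it must only
	// be called during initialization; everything after this point goes
	// through the thread-safe Registry instead.
	balancer.Register(global)
}

// Register installs the builder for one in-memory agent under its unique
// authority. Unlike balancer.Register, this is safe to call concurrently
// and after connections have started dialing.
func (r *Registry) Register(authority string, b balancer.Builder) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.builders[authority] = b
}

// Deregister removes an agent's builder, e.g. when a test agent shuts down.
func (r *Registry) Deregister(authority string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.builders, authority)
}

// Name implements balancer.Builder.
func (r *Registry) Name() string { return gRPCBalancerName }

// Build implements balancer.Builder by delegating to the builder registered
// for the dial target's host. Note: the exact field holding the authority
// (opts.Target.URL.Host here) depends on the grpc-go version in use.
func (r *Registry) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
	authority := opts.Target.URL.Host
	r.mu.RLock()
	b, ok := r.builders[authority]
	r.mu.RUnlock()
	if !ok {
		panic(fmt.Sprintf("no balancer builder registered for authority %q", authority))
	}
	return b.Build(cc, opts)
}
```

Each in-memory agent would then call `Register` with its own authority before
dialing (and `Deregister` on shutdown), while gRPC's own `Register` happens
exactly once in `init`.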
* update hcp-sdk-go
* add version, datacenter and acl info
* fewer changes
* go mod tidy and lint
* less code
* remove duplicated dep
* fmt
* trigger ci
* Persist HCP management token from server config
We want to move away from injecting an initial management token into
Consul clusters linked to HCP. The reasoning is that by using a separate
class of token, we gain the flexibility to let HCP's token co-exist with
the user's management token.
Down the line we can also more easily adjust the permissions attached to
HCP's token to limit its scope.
With these changes, the cloud management token is like the initial
management token in that it has the same global management policy and,
if created, it effectively bootstraps the ACL system.
* Update SDK and mock HCP server
The HCP management token will now be sent in a special field rather than
as Consul's "initial management" token configuration.
This commit also updates the mock HCP server to more accurately reflect
the behavior of the CCM backend.
* Refactor HCP bootstrapping logic and add tests
We want to allow users to link Consul clusters that already exist to
HCP. Existing clusters need care when bootstrapped by HCP, since we do
not want to do things like change ACL/TLS settings for a running
cluster.
Additional changes:
* Deconstruct MaybeBootstrap so that it can be tested. The HCP Go SDK
requires HTTPS to fetch a token from the Auth URL, even if the backend
server is mocked. By pulling the hcp.Client creation out we can modify
its TLS configuration in tests while keeping the secure behavior in
production code (see the sketch after this list).
* Add light validation for data received/loaded.
* Sanitize initial_management token from received config, since HCP will
only ever use the CloudConfig.ManagementToken.
* Add changelog entry
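As a rough sketch of the MaybeBootstrap deconstruction described above (all
names here, such as `fetchBootstrapConfig` and `hcpClientFn`, are
hypothetical stand-ins rather than the real code), the idea is that the
client constructor and its TLS configuration are injected so tests can point
it at a mocked backend:

```go
package hcpbootstrap

import (
	"context"
	"crypto/tls"
)

// BootstrapConfig is a stand-in for whatever configuration HCP returns
// (certificates, the ACL management token, and so on).
type BootstrapConfig struct {
	ManagementToken string
}

// hcpClient is a narrow, hypothetical interface over the HCP Go SDK client.
type hcpClient interface {
	FetchBootstrapConfig(ctx context.Context) (*BootstrapConfig, error)
}

// hcpClientFn constructs the client. Production code passes a constructor
// using the SDK's secure defaults; tests pass one whose TLS config trusts
// the mock server's certificate.
type hcpClientFn func(tlsCfg *tls.Config) (hcpClient, error)

// fetchBootstrapConfig is the testable core pulled out of MaybeBootstrap.
func fetchBootstrapConfig(ctx context.Context, newClient hcpClientFn, tlsCfg *tls.Config) (*BootstrapConfig, error) {
	client, err := newClient(tlsCfg)
	if err != nil {
		return nil, err
	}
	cfg, err := client.FetchBootstrapConfig(ctx)
	if err != nil {
		return nil, err
	}
	// Light validation of the received data would happen here before the
	// config is persisted and applied.
	return cfg, nil
}
```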
---------
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
The grpc resolver implementation is fed from changes to the
router.Router. Within the router there is a map of various areas storing
the addressing information for servers in those areas. All map entries
are of the WAN variety except a single special entry for the LAN.
Addressing information in the LAN "area" are local addresses intended
for use when making a client-to-server or server-to-server request.
The client agent correctly updates this LAN area when receiving lan serf
events, so by extension the grpc resolver works fine in that scenario.
The server agent initially populates only a single entry in the LAN area
(for itself) on startup, and then never mutates that area map again; for
normal RPCs a different structure is used for LAN routing. Additionally,
when selecting a server to contact in the local datacenter, the resolver
will randomly select addresses from either the LAN- or WAN-addressed
entries in the map.
Unfortunately this means that the grpc resolver stack as it exists on
server agents is either broken or only accidentally functions by having
servers dial each other over the WAN-accessible address. If the operator
disables the serf WAN port completely, this incidental functioning would
likely break.
This PR enforces that local requests for servers (both for stale reads
and leader-forwarded requests) exclusively use the LAN "area" information
and also fixes it so that servers keep that area up to date in the
router.
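A rough illustration of the rule this enforces, using hypothetical stand-in
types (`areaInfo`, `serverAddr`) rather than the real `router.Router`
internals:

```go
package resolversketch

type serverAddr struct {
	Name string
	Addr string
}

// areaInfo mirrors the idea of the router's per-area map entries: one
// special LAN entry plus WAN-flavored entries for every area.
type areaInfo struct {
	isLAN   bool
	servers []serverAddr
}

// lanServers returns only the addressing information from the LAN area,
// which is fed from LAN serf events (and, after this PR, is also kept up
// to date on server agents). Local requests never fall back to the
// WAN-addressed entries.
func lanServers(areas map[string]areaInfo) []serverAddr {
	for _, area := range areas {
		if area.isLAN {
			return area.servers
		}
	}
	return nil
}
```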
A test for the grpc resolver logic was added, as well as a higher level
full-stack test to ensure the externally perceived bug does not return.
This commit swaps the partition field to the local partition for
discovery chains targeting peers. Prior to this change, peer upstreams
would always use a value of default regardless of which partition they
exist in. This caused several issues in xds / proxycfg because of id
mismatches.
Some prior fixes were made to deal with one-off id mismatches that this
PR also cleans up, since they are no longer needed.
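Roughly, the swap looks like the following sketch; `discoTarget` and
`normalizeTargetPartition` are hypothetical stand-ins for the discovery
chain structures, not the actual code:

```go
package chainsketch

type discoTarget struct {
	Service   string
	Peer      string // non-empty when the target is a peered (imported) service
	Partition string
}

// normalizeTargetPartition pins peer targets to the local partition (where
// the dialing proxy lives) instead of "default", so the IDs generated in
// proxycfg and xDS line up.
func normalizeTargetPartition(t *discoTarget, localPartition string) {
	if t.Peer != "" {
		t.Partition = localPartition
	}
}
```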
* Backport squash structures for config entries
* Add changelog
* Rename changelog file for current PR
* change changelog file name
* Add enterprise only tag to changelog
* Fix changelog
* backport of commit 9ea73b3b8d
* backport of commit d3cffdeb4d
* backport of commit 0848aac017
* backport of commit 90b5e39d2d
* Refactor and fix flaky tests
* Fix bad merge
* add file that was never backported
* Fix bad merge again
* fix duplicate method
* remove extra import
* backport a slew of testing library code
* backport changes coinciding with library update
* backport changes coinciding with library update
---------
Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>
* backport of commit 537734d2ec
* backport of commit 523d313671
* backport of commit 8a113841d4
* backport of commit 368f8a51e9
---------
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
* backport of commit 892d389d9b
* backport of commit 8a2468d6b5
* backport of commit f56894fdc1
* backport of commit ced73fc2ce
---------
Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>
* backport of commit e14b4301fa
* backport of commit 525501337d
* backport of commit b1b2abc14a
* backport of commit ecaeff26aa
---------
Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>