* backport of commit dc685df58e
* backport of commit 3e27e57c48
* backport of commit b38fc6da37
* Add BoundReferences to ComputedTrafficPermissions (#20593)
(cherry picked from commit ab3c6cf1e5)
---------
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
* backport of commit e484c3c7dc
* backport of commit 76afe081a5
* backport of commit cb93adba79
* backport of commit a23ea51c82
---------
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
* no-op commit due to failed cherry-picking
* [1.18.x] mesh: use ComputedImplicitDestinations resource in the sidecar controller (#20553)
Wire the ComputedImplicitDestinations resource into the sidecar controller, replacing the inline version already present.
Also:
- Rewrite the controller to use the controller cache
- Rewrite it to no longer depend on ServiceEndpoints
- Remove the fetcher and (local) cache abstraction
---------
Co-authored-by: temp <temp@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
mesh: add ComputedImplicitDestinations resource for future use (#20547)
Creates a new controller to create ComputedImplicitDestinations resources by
composing ComputedRoutes, Services, and ComputedTrafficPermissions to
infer all ParentRef services that could possibly send some portion of traffic to a
Service that has at least one accessible Workload Identity. A followup PR will
rewire the sidecar controller to make use of this new resource.
As this is a performance optimization, rather than a security feature the following
aspects of traffic permissions have been ignored:
- DENY rules
- port rules (all ports are allowed)
Also:
- Add some v2 TestController machinery to help test complex dependency mappers.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
* V1 Compat Exported Services Controller Optimizations (#20517)
V1 compat exported services controller optimizations
* Don't start the v2 exported services controller in v1 mode.
* Use the controller cache.
* Trigger the V1 Compat exported-services Controller when V1 Config Entries are Updated (#20456)
* Trigger the v1 compat exported-services controller when the v1 config entry is modified.
* Hook up exported-services config entries to the event publisher.
* Add tests to the v2 exported services shim.
* Use the local materializer trigger updates on the v1 compat exported services controller when exported-services config entries are modified.
* stop sleeping when context is cancelled
---------
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* backport of commit 392b8d7573
* backport of commit b4716599ae
* backport of commit a03cb97cb0
* backport of commit 73b277cdef
* backport of commit e53b9794c8
---------
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
* backport of commit 2069bd134a
* backport of commit c0446fd670
* backport of commit 5227cc2bf1
---------
Co-authored-by: Nick Ethier <nethier@hashicorp.com>
catalog: improve the bound workload identity encoding on services (#20458)
The endpoints controller currently encodes the list of unique workload identities
referenced by all workload matched by a Service into a special data-bearing
status condition on that Service. This allows a downstream controller to avoid an
expensive watch on the ServiceEndpoints type just to get this data.
The current encoding does not lend itself well to machine parsing, which is what
the field is meant for, so this PR simplifies the encoding from:
"blah blah: " + strings.Join(ids, ",") + "."
to
strings.Join(ids, ",")
It also provides an exported utility function to easily extract this data.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
v2: ensure the controller caches are fully populated before first use (#20421)
The new controller caches are initialized before the DependencyMappers or the
Reconciler run, but importantly they are not populated. The expectation is that
when the WatchList call is made to the resource service it will send an initial
snapshot of all resources matching a single type, and then perpetually send
UPSERT/DELETE events afterward. This initial snapshot will cycle through the
caching layer and will catch it up to reflect the stored data.
Critically the dependency mappers and reconcilers will race against the restoration
of the caches on server startup or leader election. During this time it is possible a
mapper or reconciler will use the cache to lookup a specific relationship and
not find it. That very same reconciler may choose to then recompute some
persisted resource and in effect rewind it to a prior computed state.
Change
- Since we are updating the behavior of the WatchList RPC, it was aligned to
match that of pbsubscribe and pbpeerstream using a protobuf oneof instead of the enum+fields option.
- The WatchList rpc now has 3 alternating response events: Upsert, Delete,
EndOfSnapshot. When set the initial batch of "snapshot" Upserts sent on a new
watch, those operations will be followed by an EndOfSnapshot event before beginning
the never-ending sequence of Upsert/Delete events.
- Within the Controller startup code we will launch N+1 goroutines to execute WatchList
queries for the watched types. The UPSERTs will be applied to the nascent cache
only (no mappers will execute).
- Upon witnessing the END operation, those goroutines will terminate.
- When all cache priming routines complete, then the normal set of N+1 long lived
watch routines will launch to officially witness all events in the system using the
primed cached.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
* Add validation of MeshGateway name + listeners
* Adds test for ValidateMeshGateway
* Fixes data fetcher test for gatewayproxy
---------
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
[NET-6429] Program ProxyStateTemplate to route cross-partition traffic to the correct destination mesh gateway
* Program mesh port to route wildcarded gateway SNI to the appropriate remote partition's mesh gateway
* Update target + route ports in service endpoint refs when building PST
* Use proper name of local datacenter when constructing SNI for gateway target
* Use destination identities for TLS when routing L4 traffic through the mesh gateway
* Use new constants, move comment to correct location
* Use new constants for port names
* Update test assertions
* Undo debug logging change
* Use a full EndpointRef on ComputedRoutes targets instead of just the ID
Today, the `ComputedRoutes` targets have the appropriate ID set for their `ServiceEndpoints` reference; however, the `MeshPort` and `RoutePort` are assumed to be that of the target when adding the endpoints reference in the sidecar's `ProxyStateTemplate`.
This is problematic when the target lives behind a `MeshGateway` and the `Mesh/RoutePort` used in the sidecar's `ProxyStateTemplate` should be that of the `MeshGateway` instead of the target.
Instead of assuming the `MeshPort` and `RoutePort` when building the `ProxyStateTemplate` for the sidecar, let's just add the full `EndpointRef` -- including the ID and the ports -- when hydrating the computed destinations.
* Make sure the UID from the existing ServiceEndpoints makes it onto ComputedRoutes
* Update test assertions
* Undo confusing whitespace change
* Remove one-line function wrapper
* Use plural name for endpoints ref
* Add constants for gateway name, kind and port names
* Add Stop method to telemetry provider
Stop the main loop of the provider and set the config
to disabled.
* Add interface for telemetry provider
Added for easier testing. Also renamed Run to Start, which better
fits with Stop.
* Add Stop method to HCP manager
* Add manager interface, rename implementation
Add interface for easier testing, rename existing Manager to HCPManager.
* Stop HCP manager in link Finalizer
* Attempt to cleanup if resource has been deleted
The link should be cleaned up by the finalizer, but there's an edge
case in a multi-server setup where the link is fully deleted on one
server before the other server reconciles. This will cover the case
where the reconcile happens after the resource is deleted.
* Add a delete mananagement token function
Passes a function to the HCP manager that deletes the management token
that was initially created by the manager.
* Delete token as part of stopping the manager
* Lock around disabling config, remove descriptions
* Check for ACL write permissions on write
Link eventually will be creating a token, so require acl:write.
* Convert Run to Start, only allow to start once
* Always initialize HCP components at startup
* Support for updating config and client
* Pass HCP manager to controller
* Start HCP manager in link resource
Start as part of link creation rather than always starting. Update
the HCP manager with values from the link before starting as well.
* Fix metrics sink leaked goroutine
* Remove the hardcoded disabled hostname prefix
The HCP metrics sink will always be enabled, so the length of sinks will
always be greater than zero. This also means that we will also always
default to prefixing metrics with the hostname, which is what our
documentation states is the expected behavior anyway.
* Add changelog
* Check and set running status in one method
* Check for primary datacenter, add back test
* Clarify merge reasoning, fix timing issue in test
* Add comment about controller placement
* Expand on breaking change, fix typo in changelog
[OG Author: michael.zalimeni@hashicorp.com, rebase needed a separate PR]
* v2: support virtual port in Service port references
In addition to Service target port references, allow users to specify a
port by stringified virtual port value. This is useful in environments
such as Kubernetes where typical configuration is written in terms of
Service virtual ports rather than workload (pod) target port names.
Retaining the option of referencing target ports by name supports VMs,
Nomad, and other use cases where virtual ports are not used by default.
To support both uses cases at once, we will strictly interpret port
references based on whether the value is numeric. See updated
`ServicePort` docs for more details.
* v2: update service ref docs for virtual port support
Update proto and generated .go files with docs reflecting virtual port
reference support.
* v2: add virtual port references to L7 topo test
Add coverage for mixed virtual and target port references to existing
test.
* update failover policy controller tests to work with computed failover policy and assert error conditions against FailoverPolicy and ComputedFailoverPolicy resources
* accumulate services; don't overwrite them in enterprise
---------
Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
* If a workload does not implement a port, it should not be included in the list of endpoints for the Envoy cluster for that port.
* Adds tenancy tests for xds controller and xdsv2 resource generation, and adds all those files.
* The original change in this PR was for filtering the list of endpoints by the port being routed to (bullet 1). Since I made changes to sidecarproxycontroller golden files, I realized some of the golden files were unused because of the tenancy changes, so when I deleted those, that broke xds controller tests which weren't correctly using tenancy. So when I fixed that, then the xdsv2 tests broke, so I added tenancy support there too. So now, from sidecarproxy controller -> xds controller -> xdsv2 we now have tenancy support and all the golden files are lined up.
* API Gateway proto
* fix lint issue
* new line
* run make proto format
* checkpoint
* stub
* Update internal/mesh/internal/controllers/apigateways/controller.go
* Change logging of registered v2 resource endpoints to add /api prefix
Previous:
agent.http: Registered resource endpoint: endpoint=/demo/v1/executive
New:
agent.http: Registered resource endpoint: endpoint=/api/demo/v1/executive
This reduces confusion when attempting to call the APIs after looking at
the logs.
Add missing import
Add explicit enum case for deny action
Remove extra comments
Add build tags to ent and ce tests
Add copyright headers for the ce files
Fix case statements for ce validator
Remove ce tests with Deny traffic permissions
Fix more integration tests
Split more ce and ent tests, add back ent deny tests for traffic permissions controller
temp rename before rebase
Readd ent deny tests for traffic permissions controller
* panic when passing an incorrect type to the data fetcher
* Add assertions for sidecarproxy datafetcher as well
* rename assertion function
* Add in comments to ensure devs know about potential panics for using
invalid types
* fix method call
* Move config-dependent methods to separate package
In order to reuse the fetching and file creation part of the
bootstrap package, move the code that would cause cyclical
dependencies to a different package.
* Export needed bootstrap methods and variables
Also add back validating persisted config and update tests.
* Add support to check for just management token
Add a new method that fetches the bootstrap configuration only if
there isn't a valid management token file instead of checking for
all the hcp-config files.
* Pass data dir as a dependency to link controller
The link controller needs to check the data directory for
the hcp-config files.
* Fetch bootstrap config for token in controller
Load the management token when reconciling a link resource, which will
fetch the agent boostrap configuration if the token is not already
persisted locally. Skip this step if the cluster is in read-only mode.
* Validate resource ID format in link creation
* Handle unauthorized and forbidden errors
Check for 401 and 403s when making GNM requests, exit bootstrap fetch
loop and return specific failure statuses for link.
* Move test function to a testing file
* Log load and status write errors
* Exported services api implemented
* Tests added, refactored code
* Adding server tests
* changelog added
* Proto gen added
* Adding codegen changes
* changing url, response object
* Fixing lint error by having namespace and partition directly
* Tests changes
* refactoring tests
* Simplified uniqueness logic for exported services, sorted the response in order of service name
* Fix lint errors, refactored code
* Use black hole cluster for default router when no matches
* Update test assertions
* Use null route cluster instead of black hole cluster concept
* Update test assertions
* Add Initializer to the controller
The Initializer adds support for running any required initialization
steps when the controller is first started.
* Implement HCP Link initializer
The link initializer will create a Link resource if the
cloud configuration has been set.
* Simplify retry logic and testing
* Remove internal retry, replace with logging logic
Some edge case error testing had to be removed because it was no longer possible to force errors when going through the cache layer as opposed to the resource service itself.