mesh: add ComputedImplicitDestinations resource for future use (#20547)
Creates a new controller to create ComputedImplicitDestinations resources by
composing ComputedRoutes, Services, and ComputedTrafficPermissions to
infer all ParentRef services that could possibly send some portion of traffic to a
Service that has at least one accessible Workload Identity. A followup PR will
rewire the sidecar controller to make use of this new resource.
As this is a performance optimization, rather than a security feature the following
aspects of traffic permissions have been ignored:
- DENY rules
- port rules (all ports are allowed)
Also:
- Add some v2 TestController machinery to help test complex dependency mappers.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
* Change logging of registered v2 resource endpoints to add /api prefix
Previous:
agent.http: Registered resource endpoint: endpoint=/demo/v1/executive
New:
agent.http: Registered resource endpoint: endpoint=/api/demo/v1/executive
This reduces confusion when attempting to call the APIs after looking at
the logs.
* Add cache resource decoding helpers
* Implement a common package for workload selection facilities. This includes:
* Controller cache Index
* ACL hooks
* Dependency Mapper to go from workload to list of resources which select it
* Dependency Mapper to go from a resource which selects workloads to all the workloads it selects.
* Update the endpoints controller to use the cache instead of custom mappers.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
* Implement In-Process gRPC for use by controller caching/indexing
This replaces the pipe base listener implementation we were previously using. The new style CAN avoid cloning resources which our controller caching/indexing is taking advantage of to not duplicate resource objects in memory.
To maintain safety for controllers and for them to be able to modify data they get back from the cache and the resource service, the client they are presented in their runtime will be wrapped with an autogenerated client which clones request and response messages as they pass through the client.
Another sizable change in this PR is to consolidate how server specific gRPC services get registered and managed. Before this was in a bunch of different methods and it was difficult to track down how gRPC services were registered. Now its all in one place.
* Fix race in tests
* Ensure the resource service is registered to the multiplexed handler for forwarding from client agents
* Expose peer streaming on the internal handler
* Add a make target to run lint-consul-retry on all the modules
* Cleanup sdk/testutil/retry
* Fix a bunch of retry.Run* usage to not use the outer testing.T
* Fix some more recent retry lint issues and pin to v1.4.0 of lint-consul-retry
* Fix codegen copywrite lint issues
* Don’t perform cleanup after each retry attempt by default.
* Use the common testutil.TestingTB interface in test-integ/tenancy
* Fix retry tests
* Update otel access logging extension test to perform requests within the retry block
* [NET-6438] Add tenancy to xDS Tests
* [NET-6438] Add tenancy to xDS Tests
- Fixing imports
* [NET-6438] Add tenancy to xDS Tests
- Added cleanup post test run
* [NET-6356] Add tenancy to xDS Tests
- Added cleanup post test run
* [NET-6438] Add tenancy to xDS Tests
- using t.Cleanup instead of defer delete
* [NET-6438] Add tenancy to xDS Tests
- rebased
* [NET-6438] Add tenancy to xDS Tests
- rebased
* node health controller tenancy
* some prog
* some fixes
* revert
* pr comment resolved
* removed name
* Add namespace and tenancy in sidecar proxy controller test
* revert node health controller
* clean up data
* fix local
* copy from ENT
* removed dup code
* removed tenancy
* add test tenancies
As the V2 architecture hinges on eventual consistency and controllers reconciling the existing state in response to writes, there are potential issues we could run into regarding ordering and timing of operations. We want to be able to guarantee that given a set of resources the system will always eventually get to the desired correct state. The order of resource writes and delays in performing those writes should not alter the final outcome of reaching the desired state.
To that end, this commit introduces arbitrary randomized delays before performing resources writes into the `resourcetest.Client`. Its `PublishResources` method was already randomizing the order of resource writes. By default, no delay is added to normal writes and deletes but tests can opt-in via either passing hard coded options when creating the `resourcetest.Client` or using the `resourcetest.ConfigureTestCLIFlags` function to allow processing of CLI parameters.
In addition to allowing configurability of the request delay min and max, the client also has a configurable random number generator seed. When Using the CLI parameter helpers, a test log will be written noting the currently used settings. If the test fails then you can reproduce the same delays and order randomizations by providing the seed during the previous test failure.
Add some generic type hook wrappers to first decode the data
There seems to be a pattern for Validation, Mutation and Write Authorization hooks where they first need to decode the Any data before doing the domain specific work.
This PR introduces 3 new functions to generate wrappers around the other hooks to pre-decode the data into a DecodedResource and pass that in instead of the original pbresource.Resource.
This PR also updates the various catalog data types to use the new hook generators.
This change adds ACL hooks to the remaining catalog and mesh resources, excluding any computed ones. Those will for now continue using the default operator:x permissions.
It refactors a lot of the common testing functions so that they can be re-used between resources.
There are also some types that we don't yet support (e.g. virtual IPs) that this change adds ACL hooks to for future-proofing.
This implements the Filter field on pbcatalog.WorkloadSelector to be
a post-fetch in-memory filter using the https://github.com/hashicorp/go-bexpr
expression language to filter resources based on their envelope metadata fields.
All existing usages of WorkloadSelector should be able to make use of the filter.
* Introduce a new type `ComputedProxyConfiguration` and add a controller for it. This is needed for two reasons. The first one is that external integrations like kubernetes may need to read the fully computed and sorted proxy configuration per workload. The second reasons is that it makes sidecar-proxy controller logic quite a bit simpler as it no longer needs to do this.
* Generalize workload selection mapper and fix a bug where it would delete IDs from the tree if only one is left after a removal is done.
xRoute resource types contain a slice of parentRefs to services that they
manipulate traffic for. All xRoutes that have a parentRef to given Service
will be merged together to generate a ComputedRoutes resource
name-aligned with that Service.
This means that a write of an xRoute with 2 parent ref pointers will cause
at most 2 reconciles for ComputedRoutes.
If that xRoute's list of parentRefs were ever to be reduced, or otherwise
lose an item, that subsequent map event will only emit events for the current
set of refs. The removed ref will not cause the generated ComputedRoutes
related to that service to be re-reconciled to omit the influence of that xRoute.
To combat this, we will store on the ComputedRoutes resource a
BoundResources []*pbresource.Reference field with references to all
resources that were used to influence the generated output.
When the routes controller reconciles, it will use a bimapper to index this
influence, and the dependency mappers for the xRoutes will look
themselves up in that index to discover additional (former) ComputedRoutes
that need to be notified as well.
The ACLs.Read hook for a resource only allows for the identity of a
resource to be passed in for use in authz consideration. For some
resources we wish to allow for the current stored value to dictate how
to enforce the ACLs (such as reading a list of applicable services from
the payload and allowing service:read on any of them to control reading the enclosing resource).
This change update the interface to usually accept a *pbresource.ID,
but if the hook decides it needs more data it returns a sentinel error
and the resource service knows to defer the authz check until after
fetching the data from storage.
Ensure that configuring a FailoverPolicy for a service that is reachable via a xRoute or a direct upstream causes an envoy aggregate cluster to be created for the original cluster name, but with separate clusters for each one of the possible destinations.