Commit Graph

576 Commits (b9ad0dfa41e339d1982a31a44f1231abda1b998f)

Author SHA1 Message Date
Nitya Dhanushkodi 9975b8bd73
[NET-5455] Allow disabling request and idle timeouts with negative values in service router and service resolver (#19992)
* add coverage for testing these timeouts
2023-12-19 15:36:07 -08:00
Derek Menteer dfab5ade50
Fix ClusterLoadAssignment timeouts dropping endpoints. (#19871)
When a large number of upstreams are configured on a single envoy
proxy, there was a chance that it would timeout when waiting for
ClusterLoadAssignments. While this doesn't always immediately cause
issues, consul-dataplane instances appear to consistently drop
endpoints from their configurations after an xDS connection is
re-established (the server dies, random disconnect, etc).

This commit adds an `xds_fetch_timeout_ms` config to service registrations
so that users can set the value higher for large instances that have
many upstreams. The timeout can be disabled by setting a value of `0`.

This configuration was introduced to reduce the risk of causing a
breaking change for users if there is ever a scenario where endpoints
would never be received. Rather than just always blocking indefinitely
or for a significantly longer period of time, this config will affect
only the service instance associated with it.
2023-12-11 09:25:11 -06:00
Derek Menteer 0ac958f27b
Fix xDS missing endpoint race condition. (#19866)
This fixes the following race condition:
- Send update endpoints
- Send update cluster
- Recv ACK endpoints
- Recv ACK cluster

Prior to this fix, it would have resulted in the endpoints NOT existing in
Envoy. This occurred because the cluster update implicitly clears the endpoints
in Envoy, but we would never re-send the endpoint data to compensate for the
loss, because we would incorrectly ACK the invalid old endpoint hash. Since the
endpoint's hash did not actually change, they would not be resent.

The fix for this is to effectively clear out the invalid pending ACKs for child
resources whenever the parent changes. This ensures that we do not store the
child's hash as accepted when the race occurs.

An escape-hatch environment variable `XDS_PROTOCOL_LEGACY_CHILD_RESEND` was
added so that users can revert back to the old legacy behavior in the event
that this produces unknown side-effects. Visit the following thread for some
extra context on why certainty around these race conditions is difficult:
https://github.com/envoyproxy/envoy/issues/13009

This bug report and fix was mostly implemented by @ksmiley with some minor
tweaks.

Co-authored-by: Keith Smiley <ksmiley@salesforce.com>
2023-12-08 11:37:12 -06:00
Thomas Eckert 8125a32a4e
Add CE version of Gateway Upstream Disambiguation (#19860)
* Add CE version of gateway-upstream-disambiguation

* Use NamespaceOrDefault and PartitionOrDefault

* Add Changelog entry

* Remove the unneeded reassignment

* Use c.ID()
2023-12-07 17:56:14 -05:00
Dhia Ayachi d93f7f730d
parse config protocol on write to optimize disco-chain compilation (#19829)
* parse config protocol on write to optimize disco-chain compilation

* add changelog
2023-12-07 13:46:46 -05:00
John Murret 780e91688d
Migrate remaining individual resource tests for service mesh to TestAllResourcesFromSnapshot (#19583)
* migrate expose checks and paths  tests to resources_test.go

* fix failing expose paths tests

* fix the way endpoint resources get created to make expose tests pass.

* remove endpoint resources that are already inlined on local_app clusters

* renaiming and comments

* migrate remaining service mesh tests to resources_test.go

* cleanup

* update proxystateconverter to skip ading alpn to clusters and listener filterto match v1 behavior
2023-11-09 20:08:37 +00:00
John Murret f5bf256425
Migrate individual resource tests for API Gateway to TestAllResourcesFromSnapshot (#19584)
migrate individual api gateway tests to resources_test.go
2023-11-09 17:01:54 +00:00
John Murret a94fa4c3ed
Migrate individual resource tests for Mesh Gateway to TestAllResourcesFromSnapshot (#19502)
migrate mesh-gateway tests to resources_test.go
2023-11-09 16:39:16 +00:00
John Murret 4aa95f3d1f
Migrate individual resource tests for Ingress Gateway to TestAllResourcesFromSnapshot (#19506)
migrate ingress-gateway tests to resources_test.go
2023-11-09 16:08:07 +00:00
John Murret 2553d6e8b9
Migrate individual resource tests for Terminating Gateway to TestAllResourcesFromSnapshot (#19505)
migrate terminating-gateway tests to resources_test.go
2023-11-09 08:38:33 -07:00
John Murret 7de0b45ba4
Fix xds v2 from creating envoy endpoint resources when already inlined in the cluster (#19580)
* migrate expose checks and paths  tests to resources_test.go

* fix failing expose paths tests

* fix the way endpoint resources get created to make expose tests pass.

* wip

* remove endpoint resources that are already inlined on local_app clusters

* renaiming and comments
2023-11-08 22:18:51 +00:00
John Murret 5aff19f9bc
Migrate individual resource tests for JWT Provider to TestAllResourcesFromSnapshot (#19511)
migrate jwt provider tests to resources_test.go
2023-11-08 14:34:40 -07:00
John Murret 903ff7fccb
Migrate individual resource tests for custom configuration to TestAllResourcesFromSnapshot (#19512)
* Configure TestAllResourcesFromSnapshot to run V2 tests

* migrate custom configuration tests to resources_test.go
2023-11-08 10:34:23 -07:00
John Murret 09f73d1abf
Migrate individual resource tests for expose paths and checks to TestAllResourcesFromSnapshot (#19513)
* migrate expose checks and paths  tests to resources_test.go

* fix failing expose paths tests
2023-11-08 14:24:27 +00:00
John Murret 7bc2581c81
Migrate individual resource tests for Discovery Chains to TestAllResourcesFromSnapshot (#19508)
migrate disco chain tests to resources_test.go
2023-11-08 01:34:42 +00:00
John Murret f115cdb1d5
NET-6385 - Static routes that are inlined in listener filters are also created as a resource. (#19459)
* cover all protocols in local_app golden tests

* fix xds tests

* updating latest

* fix broken test

* add sorting of routers to TestBuildLocalApp to get rid of the flaking

* cover all protocols in local_app golden tests

* cover all protocols in local_app golden tests

* cover all protocols in local_app golden tests

* process envoy resource by walking the map.  use a map rather than array for envoy resource to prevent duplication.

* cleanup.  doc strings.

* update to latest

* fix broken test

* update tests after adding sorting of routers in local_app builder tests

* do not make endpoints for local_app

* fix catalog destinations only by creating clusters for any cluster not already created by walking the graph.

* Configure TestAllResourcesFromSnapshot to run V2 tests

* wip

* fix processing of failover groups

* add endpoints and clusters for any clusters that were not created from walking the listener -> path

* fix xds v2 golden files for clusters to include failover group clusters
2023-11-07 08:00:08 -07:00
John Murret 74daaa5043
XDS V1 should not make runs for TCP Disco Chains. (#19496)
* XDS V1 should not make runs for TCP Disco Chains.

* update TestEnvoyExtenderWithSnapshot
2023-11-03 14:53:17 -06:00
John Murret f0cf8f2f40
NET-6294 - v1 Agentless proxycfg datasource errors after v2 changes (#19365) 2023-10-27 14:06:38 -06:00
Michael Zalimeni a7803bd829
[NET-6305] xds: Ensure v2 route match and protocol are populated for gRPC (#19343)
* xds: Ensure v2 route match is populated for gRPC

Similar to HTTP, ensure that route match config (which is required by
Envoy) is populated when default values are used.

Because the default matches generated for gRPC contain a single empty
`GRPCRouteMatch`, and that proto does not directly support prefix-based
config, an interpretation of the empty struct is needed to generate the
same output that the `HTTPRouteMatch` is explicitly configured to
provide in internal/mesh/internal/controllers/routes/generate.go.

* xds: Ensure protocol set for gRPC resources

Add explicit protocol in `ProxyStateTemplate` builders and validate it
is always set on clusters. This ensures that HTTP filters and
`http2_protocol_options` are populated in all the necessary places for
gRPC traffic and prevents future unintended omissions of non-TCP
protocols.

Co-authored-by: John Murret <john.murret@hashicorp.com>

---------

Co-authored-by: John Murret <john.murret@hashicorp.com>
2023-10-25 17:43:58 +00:00
Andrew Stucki e414cbee4a
Use strict DNS for mesh gateways with hostnames (#19268)
* Use strict DNS for mesh gateways with hostnames

* Add changelog
2023-10-24 15:04:14 -04:00
Michael Zalimeni 5e517c5980
[NET-6221] Ensure LB policy set for locality-aware routing (CE) (#19283)
Ensure LB policy set for locality-aware routing (CE)

`overprovisioningFactor` should be overridden with the expected value
(100,000) when there are multiple endpoint groups. Update code and
tests to enforce this.

This is an Enterprise feature. This commit represents the CE portions of
the change; tests are added in the corresponding `consul-enterprise`
change.
2023-10-19 10:13:27 -04:00
John Maguire b78465b491
[NET-5810] CE changes for multiple virtual hosts (#19246)
CE changes for multiple virtual hosts
2023-10-17 15:08:04 +00:00
Thomas Eckert 76c60fdfac
Golden File Tests for TermGW w/ Cluster Peering (#19096)
Add intention to create golden file for terminating gateway peered trust bundle
2023-10-13 11:56:58 -04:00
Nitya Dhanushkodi 95d9b2c7e4
[NET-4931] xdsv2, sidecarproxycontroller, l4 trafficpermissions: support L7 (#19185)
* xdsv2: support l7 by adding xfcc policy/headers, tweaking routes, and make a bunch of listeners l7 tests pass

* sidecarproxycontroller: add l7 local app support 

* trafficpermissions: make l4 traffic permissions work on l7 workloads

* rename route name field for consistency with l4 cluster name field

* resolve conflicts and rebase

* fix: ensure route name is used in l7 destination route name as well. previously it was only in the route names themselves, now the route name and l7 destination route name line up
2023-10-12 23:45:45 +00:00
John Maguire 7a323c492b
[NET-5457] Golden Files for Multiple Virtual Hosts (#19131)
* Add new golden file tests

* Update with latest deterministic code
2023-10-11 18:11:29 +00:00
John Maguire 8bebfc147d
[NET-5457] Fix CE code for jwt multiple virtual hosts bug (#19123)
* Fix CE code for jwt multiple virtual hosts bug

* Fix struct definition

* fix bug with always appending route to jwt config

* Update comment to be correct

* Update comment
2023-10-10 16:25:36 -04:00
Chris S. Kim 92ce814693
Remove old build tags (#19128) 2023-10-10 10:58:06 -04:00
Thomas Eckert 342306c312
Allow connections through Terminating Gateways from peered clusters NET-3463 (#18959)
* Add InboundPeerTrustBundle maps to Terminating Gateway

* Add notify and cancelation of watch for inbound peer trust bundles

* Pass peer trust bundles to the RBAC creation function

* Regenerate Golden Files

* add changelog, also adds another spot that needed peeredTrustBundles

* Add basic test for terminating gateway with peer trust bundle

* Add intention to cluster peered golden test

* rerun codegen

* update changelog

* really update the changelog

---------

Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com>
2023-10-05 21:54:23 +00:00
Eric Haberkorn f2b7b4591a
Fix Traffic Permissions Default Deny (#19028)
Whenver a traffic permission exists for a given workload identity, turn on default deny.

Previously, this was only working at the port level.
2023-10-04 09:58:28 -04:00
sarahalsmiller 9addd9ed7c
[NET-5788] Fix needed for JWTAuth in Consul Enterprise (#19038)
change needed for fix in consul-enterprise
2023-10-03 09:48:50 -05:00
Nitya Dhanushkodi 9a48266712
remove log (#19029) 2023-09-29 16:11:50 -07:00
Eric Haberkorn 7ce6ebaeb3
Handle Traffic Permissions With Empty Sources Properly (#19024)
Fix issues with empty sources

* Validate that each permission on traffic permissions resources has at least one source.
* Don't construct RBAC policies when there aren't any principals. This resulted in Envoy rejecting xDS updates with a validation error.

```
error=
  | rpc error: code = Internal desc = Error adding/updating listener(s) public_listener: Proto constraint validation failed (RBACValidationError.Rules: embedded message failed validation | caused by RBACValidationError.Policies[consul-intentions-layer4-1]: embedded message failed validation | caused by PolicyValidationError.Principals: value must contain at least 1 item(s)): rules {
```
2023-09-28 15:11:59 -04:00
Iryna Shustava e6b724d062
catalog,mesh,auth: Move resource types to the proto-public module (#18935) 2023-09-22 15:50:56 -06:00
Iryna Shustava d88888ee8b
catalog,mesh,auth: Bump versions to v2beta1 (#18930) 2023-09-22 10:51:15 -06:00
Nitya Dhanushkodi 0a11499588
net-5689 fix disabling panic threshold logic (#18958) 2023-09-21 15:52:30 -07:00
Nitya Dhanushkodi 3a2e62053a
v2: various fixes to make K8s tproxy multiport acceptance tests and manual explicit upstreams (single port) tests pass (#18874)
Adding coauthors who mobbed/paired at various points throughout last week.
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: Iryna Shustava <iryna@hashicorp.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>
Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: Michael Wilkerson <mwilkerson@hashicorp.com>
2023-09-20 00:02:01 +00:00
Blake Covarrubias 019c62e1ba
xds: Use downstream protocol when connecting to local app (#18573)
Configure Envoy to use the same HTTP protocol version used by the
downstream caller when forwarding requests to a local application that
is configured with the protocol set to either `http2` or `grpc`.

This allows upstream applications that support both HTTP/1.1 and
HTTP/2 on a single port to receive requests using either protocol. This
is beneficial when the application primarily communicates using HTTP/2,
but also needs to support HTTP/1.1, such as to respond to Kubernetes
HTTP readiness/liveness probes.

Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>
2023-09-19 14:32:28 -07:00
Eric Haberkorn 21fdbbabbc
Wire up traffic permissions (#18812)
Wire up traffic permissions
2023-09-15 12:31:22 -04:00
R.B. Boyer 66e1cdf40c
mesh: Wire ComputedRoutes into the ProxyStateTemplate via the sidecar controller (#18752)
Reworks the sidecar controller to accept ComputedRoutes as an input and use it to generate appropriate ProxyStateTemplate resources containing L4/L7 mesh configuration.
2023-09-14 17:19:04 -05:00
Eric Haberkorn 12be06f8e5
Add V2 TCP traffic permissions (#18771)
Add support for TCP traffic permissions
2023-09-13 09:03:42 -04:00
Chris Thain 4724a4e169
Add Envoy golden test for OTEL access logging extension (#18760) 2023-09-12 09:58:53 -07:00
John Murret 62062fd4fd
NET-5132 - Configure multiport routing for connect proxies in TProxy mode (#18606)
* mesh-controller: handle L4 protocols for a proxy without upstreams

* sidecar-controller: Support explicit destinations for L4 protocols and single ports.

* This controller generates and saves ProxyStateTemplate for sidecar proxies.
* It currently supports single-port L4 ports only.
* It keeps a cache of all destinations to make it easier to compute and retrieve destinations.
* It will update the status of the pbmesh.Upstreams resource if anything is invalid.

* endpoints-controller: add workload identity to the service endpoints resource

* small fixes

* review comments

* Address PR comments

* sidecar-proxy controller: Add support for transparent proxy

This currently does not support inferring destinations from intentions.

* PR review comments

* mesh-controller: handle L4 protocols for a proxy without upstreams

* sidecar-controller: Support explicit destinations for L4 protocols and single ports.

* This controller generates and saves ProxyStateTemplate for sidecar proxies.
* It currently supports single-port L4 ports only.
* It keeps a cache of all destinations to make it easier to compute and retrieve destinations.
* It will update the status of the pbmesh.Upstreams resource if anything is invalid.

* endpoints-controller: add workload identity to the service endpoints resource

* small fixes

* review comments

* Make sure endpoint refs route to mesh port instead of an app port

* Address PR comments

* fixing copyright

* tidy imports

* sidecar-proxy controller: Add support for transparent proxy

This currently does not support inferring destinations from intentions.

* tidy imports

* add copyright headers

* Prefix sidecar proxy test files with source and destination.

* Update controller_test.go

* NET-5132 - Configure multiport routing for connect proxies in TProxy mode

* formatting golden files

* reverting golden files and adding changes in manually.  build implicit destinations still has some issues.

* fixing files that were incorrectly repeating the outbound listener

* PR comments

* extract AlpnProtocol naming convention to getAlpnProtocolFromPortName(portName)

* removing address level filtering.

* adding license to resources_test.go

---------

Co-authored-by: Iryna Shustava <iryna@hashicorp.com>
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2023-09-12 01:17:56 +00:00
R.B. Boyer a69e901660
xds: update golden tests to be deterministic (#18707) 2023-09-11 11:40:19 -05:00
John Maguire 2c244b6f42
[APIGW] NET-5017 JWT Cleanup/Status Conditions (#18700)
* Fixes issues in setting status

* Update golden files for changes to xds generation to not use deprecated
methods

* Fixed default for validation of JWT for route
2023-09-07 19:03:09 +00:00
Iryna Shustava 4eb2197e82
dataplane: Allow getting bootstrap parameters when using V2 APIs (#18504)
This PR enables the GetEnvoyBootstrapParams endpoint to construct envoy bootstrap parameters from v2 catalog and mesh resources.

   * Make bootstrap request and response parameters less specific to services so that we can re-use them for workloads or service instances.
   * Remove ServiceKind from bootstrap params response. This value was unused previously and is not needed for V2.
   * Make access logs generation generic so that we can generate them using v1 or v2 resources.
2023-09-06 16:46:25 -06:00
Derek Menteer a698142325
Add extra logging for mesh health endpoints. (#18647) 2023-09-01 12:29:09 -05:00
John Maguire 9876923e23
Add the plumbing for APIGW JWT work (#18609)
* Add the plumbing for APIGW JWT work

* Remove unneeded import

* Add deep equal function for HTTPMatch

* Added plumbing for status conditions

* Remove unneeded comment

* Fix comments

* Add calls in xds listener for apigateway to setup listener jwt auth
2023-08-31 12:23:59 -04:00
Ashwin Venkatesh 797e42dc24
Watch the ProxyTracker from xDS controller (#18611) 2023-08-29 14:39:29 -07:00
John Murret 0e606504bc
NET-4944 - wire up controllers with proxy tracker (#18603)
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2023-08-29 09:15:34 -06:00
John Murret 051f250edb
NET-5338 - NET-5338 - Run a v2 mode xds server (#18579)
* NET-5338 - NET-5338 - Run a v2 mode xds server

* fix linting
2023-08-24 16:44:14 -06:00