Fix xDS missing endpoint race condition.
This fixes the following race condition:
- Send update endpoints
- Send update cluster
- Recv ACK endpoints
- Recv ACK cluster
Prior to this fix, this sequence resulted in the endpoints NOT existing in
Envoy. This occurred because the cluster update implicitly clears the endpoints
in Envoy, but we would never re-send the endpoint data to compensate for the
loss, because we would incorrectly ACK the invalid old endpoint hash. Since the
endpoints' hash did not actually change, the endpoints would not be resent.
The fix for this is to effectively clear out the invalid pending ACKs for child
resources whenever the parent changes. This ensures that we do not store the
child's hash as accepted when the race occurs.
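The race and the fix can be illustrated with a toy model. This is a minimal
sketch, not the actual Consul implementation: the `tracker` type, its methods,
the `childOf` map, and the `legacy` flag (standing in for the pre-fix behavior)
are all hypothetical names, and it assumes every pending child hash is dropped
whenever any parent of that type is sent.

```go
package main

import "fmt"

// Hypothetical resource kinds; real xDS type URLs are longer.
const (
	clusterType  = "cluster"
	endpointType = "endpoint"
)

// childOf maps a parent resource type to the child type that Envoy
// implicitly drops when the parent is updated.
var childOf = map[string]string{clusterType: endpointType}

// tracker is a toy model of per-stream xDS bookkeeping.
type tracker struct {
	pending  map[string]map[string]string // type -> name -> hash sent, awaiting ACK
	accepted map[string]map[string]string // type -> name -> hash Envoy has ACKed
	legacy   bool                         // true reproduces the pre-fix behavior
}

func newTracker(legacy bool) *tracker {
	return &tracker{
		pending:  map[string]map[string]string{},
		accepted: map[string]map[string]string{},
		legacy:   legacy,
	}
}

// send records an outgoing update. Unless legacy is set, sending a parent
// also discards pending ACKs for its child type, since those hashes no
// longer describe what Envoy holds.
func (t *tracker) send(typ, name, hash string) {
	if t.pending[typ] == nil {
		t.pending[typ] = map[string]string{}
	}
	t.pending[typ][name] = hash
	if child, ok := childOf[typ]; ok && !t.legacy {
		delete(t.pending, child)
	}
}

// ack promotes every pending hash of the given type to accepted.
func (t *tracker) ack(typ string) {
	for name, hash := range t.pending[typ] {
		if t.accepted[typ] == nil {
			t.accepted[typ] = map[string]string{}
		}
		t.accepted[typ][name] = hash
	}
	delete(t.pending, typ)
}

// needsResend reports whether the given hash has not been accepted yet.
func (t *tracker) needsResend(typ, name, hash string) bool {
	return t.accepted[typ][name] != hash
}

// race replays the sequence from this commit message and reports whether
// the endpoints would be sent again afterwards.
func race(legacy bool) bool {
	t := newTracker(legacy)
	t.send(endpointType, "web", "h1") // Send update endpoints
	t.send(clusterType, "web", "c2")  // Send update cluster
	t.ack(endpointType)               // Recv ACK endpoints
	t.ack(clusterType)                // Recv ACK cluster
	return t.needsResend(endpointType, "web", "h1")
}

func main() {
	fmt.Println("legacy resends endpoints:", race(true))
	fmt.Println("fixed resends endpoints: ", race(false))
}
```

In the legacy mode the stale endpoint hash is stored as accepted, so the
endpoints are never resent. With the pending child ACKs cleared, the late
endpoint ACK records nothing and the endpoints are sent again.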
An escape-hatch environment variable `XDS_PROTOCOL_LEGACY_CHILD_RESEND` was
added so that users can revert to the legacy behavior in the event that this
change produces unknown side-effects.
This bug report and fix were mostly implemented by @ksmiley with some minor
tweaks.
Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>
Co-authored-by: Keith Smiley <ksmiley@salesforce.com>
backport of commit af6045cdf1
Co-authored-by: Ronald Ekambi <ronekambi@gmail.com>
Co-authored-by: Ronald <roncodingenthusiast@users.noreply.github.com>
* backport of commit 06507fe053
* backport of commit 14e160573d
* backport of commit 088ec70f90
---------
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
* [NET-5688] APIGateway UI Topology Fixes (#19657)
* Update catalog and ui endpoints to show APIGateway in gateway service
topology view
* Added initial implementation for service view
* updated ui
* Fix topology view for gateways
* Adding tests for gw controller
* remove unused args
* Undo formatting changes
* Fix call sites for upstream/downstream gw changes
* Add config entry tests
* Fix function calls again
* Move from ServiceKey to ServiceName, cleanup from PR review
* Add additional check for length of services in bound apigateway for
IsSame comparison
* fix formatting for proto
* gofmt
* Add DeepCopy for retrieved BoundAPIGateway
* gofmt
* gofmt
* Rename function to be more consistent
* Remove BUSL license
* Fix import
Fix a panic in the CLI when deleting an ACL policy with an unknown name (#19679)
* Fix a panic in the CLI when deleting an ACL policy with an unknown name
* add changelog
docs: Fix Kubernetes CRD example configs (#18878)
Fixes configuration examples for several Consul Kubernetes CRDs. The
CRDs were missing required fields such as `apiVersion`, `metadata`,
and `spec`.
Co-authored-by: Tu Nguyen <im2nguyen@gmail.com>
Bump google.golang.org/grpc to 1.56.3
This resolves [CVE-2023-44487](https://nvd.nist.gov/vuln/detail/CVE-2023-44487).
Also includes various fixes from later release versions required for
tests and linters to pass. See 77f44fa878 for the majority of these
changes.
Co-authored-by: Chris Thain <chris.m.thain@gmail.com>
Backport of Add grpc keepalive configuration into release/1.15.x (#19339)
Add grpc keepalive configuration. (#19339)
Prior to the introduction of this configuration, gRPC keepalive messages were
sent only after 2 hours of inactivity on the stream. This caused problems when
the server-side xDS connection balancing was unaware that Envoy instances had
been uncleanly killed or force-closed, since the connections would only be
cleaned up after ~5 minutes of TCP timeouts. Setting this config to a 30 second
interval with a 20 second timeout ensures that a dead xDS connection is closed
within at most 50 seconds.
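The 50 second bound is the keepalive interval plus the keepalive timeout. The
arithmetic can be sketched as follows, assuming a ping is sent after one full
idle interval and the connection is closed if no reply arrives within the
timeout. The `keepaliveConfig` type and both functions are hypothetical and are
not the actual Consul or gRPC configuration structs.

```go
package main

import (
	"fmt"
	"time"
)

// keepaliveConfig is a hypothetical stand-in for the two settings
// described in this commit message.
type keepaliveConfig struct {
	Interval time.Duration // idle time before a keepalive ping is sent
	Timeout  time.Duration // wait for the ping reply before closing
}

// proposedKeepalive returns the values named in the commit message.
func proposedKeepalive() keepaliveConfig {
	return keepaliveConfig{Interval: 30 * time.Second, Timeout: 20 * time.Second}
}

// worstCaseDetection returns the upper bound on how long a dead peer can go
// unnoticed: one full idle interval, then one full timeout.
func worstCaseDetection(c keepaliveConfig) time.Duration {
	return c.Interval + c.Timeout
}

func main() {
	fmt.Println(worstCaseDetection(proposedKeepalive())) // prints "50s"
}
```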
Backport of build(docker): always publish full and minor version tags for dev images into release/1.17.x (#19282)
backport of commit c6bb4a5341
Co-authored-by: DanStough <dan.stough@hashicorp.com>
Allow connections through Terminating Gateways from peered clusters NET-3463 (#18959)
* Add InboundPeerTrustBundle maps to Terminating Gateway
* Add notify and cancelation of watch for inbound peer trust bundles
* Pass peer trust bundles to the RBAC creation function
* Regenerate Golden Files
* add changelog, also adds another spot that needed peeredTrustBundles
* Add basic test for terminating gateway with peer trust bundle
* Add intention to cluster peered golden test
* rerun codegen
* update changelog
* really update the changelog
---------
Co-authored-by: Thomas Eckert <teckert@hashicorp.com>
Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com>