Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
Derek Menteer 0ac8ae6c3b
Fix xDS deadlock due to syncLoop termination. (#20867)
* Fix xDS deadlock due to syncLoop termination.

This fixes an issue where agentless xDS streams can deadlock permanently until
a server is restarted. When this issue occurs, no new proxies are able to
successfully connect to the server.

Effectively, the trigger for this deadlock stems from the following return
statement:
https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L199-L202

When this happens, the entire `syncLoop()` terminates and stops consuming from
the following channel:
https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L182-L192

As a result, the `ConfigSource.cleanup()` function never receives a response
and holds a mutex indefinitely:
https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L241-L247

Because this mutex is shared, it effectively deadlocks the server's ability to
process new xDS streams.
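To make the failure mode concrete, here is a minimal, self-contained sketch of
the mechanism described above. It is not the Consul code: the `source` type,
its `shutdownCh` field, and the `demo` helper are hypothetical names chosen for
illustration, and the early `return` stands in for the linked return statement.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// source is a hypothetical stand-in for the config source (illustrative only).
type source struct {
	mu         sync.Mutex
	shutdownCh chan chan struct{} // request/response ("RPC") over channels
}

// syncLoop returns early on failure, after which nothing reads shutdownCh.
func (s *source) syncLoop(fail bool) {
	if fail {
		return // early exit: shutdownCh is never serviced again
	}
	doneCh := <-s.shutdownCh
	close(doneCh) // acknowledge the shutdown request
}

// cleanup holds the shared mutex while waiting for the loop to acknowledge.
func (s *source) cleanup() {
	s.mu.Lock()
	defer s.mu.Unlock()
	doneCh := make(chan struct{})
	s.shutdownCh <- doneCh // blocks forever if syncLoop has already returned
	<-doneCh
}

// lockAcquiredWithin reports whether the mutex can be taken before d elapses.
func lockAcquiredWithin(s *source, d time.Duration) bool {
	got := make(chan struct{})
	go func() {
		s.mu.Lock()
		s.mu.Unlock()
		close(got)
	}()
	select {
	case <-got:
		return true
	case <-time.After(d):
		return false
	}
}

// demo runs cleanup against a healthy or failed loop and reports whether the
// mutex is free afterwards.
func demo(fail bool) bool {
	s := &source{shutdownCh: make(chan chan struct{})}
	go s.syncLoop(fail)
	go s.cleanup()
	time.Sleep(50 * time.Millisecond) // give both goroutines time to run
	return lockAcquiredWithin(s, 100*time.Millisecond)
}

func main() {
	fmt.Println("healthy loop, mutex free:", demo(false))
	fmt.Println("failed loop, mutex free:", demo(true))
}
```

In the failed case the goroutine running `cleanup()` never returns, so every
other caller that needs the mutex waits behind it.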

----

The fix to this issue involves removing the `chan chan struct{}` used like an
RPC-over-channels pattern and replacing it with two distinct channels:

+ `stopSyncLoopCh` - indicates that the `syncLoop()` should terminate soon.
+ `syncLoopDoneCh` - indicates that the `syncLoop()` has terminated.

Splitting these two concepts out and deferring a `close(syncLoopDoneCh)` in the
`syncLoop()` function ensures that the deadlock above should no longer occur.
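The two-channel approach can be sketched roughly as follows. This is an
illustrative reduction, not the actual patch: only the channel names come from
the description above, while the `syncer` type, the `events` channel, and the
helper functions are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// syncer is a hypothetical holder for the two channels described above.
type syncer struct {
	stopSyncLoopCh chan struct{} // closed to ask the loop to terminate
	syncLoopDoneCh chan struct{} // closed by the loop once it has terminated
	stopOnce       sync.Once     // guards against closing stopSyncLoopCh twice
}

func newSyncer() *syncer {
	return &syncer{
		stopSyncLoopCh: make(chan struct{}),
		syncLoopDoneCh: make(chan struct{}),
	}
}

// syncLoop runs until it is asked to stop or an event carries an error.
func (s *syncer) syncLoop(events <-chan error) {
	defer close(s.syncLoopDoneCh) // runs on every exit path, including errors
	for {
		select {
		case <-s.stopSyncLoopCh:
			return
		case err := <-events:
			if err != nil {
				return // irrecoverable error: exit without being asked
			}
		}
	}
}

// stop requests termination and waits for the loop to finish. Neither step
// needs the loop to still be receiving, so it cannot block on a dead loop.
func (s *syncer) stop() {
	s.stopOnce.Do(func() { close(s.stopSyncLoopCh) })
	<-s.syncLoopDoneCh
}

// stopsWithin reports whether stop returns before d elapses.
func stopsWithin(s *syncer, d time.Duration) bool {
	returned := make(chan struct{})
	go func() {
		s.stop()
		close(returned)
	}()
	select {
	case <-returned:
		return true
	case <-time.After(d):
		return false
	}
}

// runDemo starts a loop, optionally kills it with an error, then stops it.
func runDemo(fail bool) bool {
	s := newSyncer()
	events := make(chan error, 1)
	go s.syncLoop(events)
	if fail {
		events <- errors.New("irrecoverable")
		<-s.syncLoopDoneCh // the loop has already exited on its own
	}
	return stopsWithin(s, time.Second)
}

func main() {
	fmt.Println("stop returns after a clean run:", runDemo(false))
	fmt.Println("stop returns after an early exit:", runDemo(true))
}
```

Because closing a channel never blocks and a closed channel can always be
received from, `stop()` returns in both cases.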

We also now evict xDS connections of all proxies for the corresponding
`syncLoop()` whenever it encounters an irrecoverable error. This is done by
hoisting the new `syncLoopDoneCh` upwards so that it's visible to the xDS delta
processing. Prior to this fix, the behavior was to simply orphan them so they
would never receive catalog-registration or service-defaults updates.
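As a rough illustration of the eviction behavior, a per-proxy stream handler
can select on the done channel alongside its normal updates. The
`serveStream` function, its parameters, and `errSourceTerminated` are
hypothetical names for this sketch and do not reflect the actual xDS delta
code.

```go
package main

import (
	"errors"
	"fmt"
)

// errSourceTerminated is a hypothetical error used to end the stream.
var errSourceTerminated = errors.New("config source terminated")

// serveStream forwards updates to a proxy until the sync loop's done channel
// is closed, at which point it returns so the connection can be dropped.
func serveStream(updates <-chan string, syncLoopDoneCh <-chan struct{}, send func(string)) error {
	for {
		select {
		case <-syncLoopDoneCh:
			// The loop feeding this stream is gone; evict the proxy rather
			// than leaving it connected without further updates.
			return errSourceTerminated
		case u := <-updates:
			send(u)
		}
	}
}

// runStream delivers one update, then simulates the sync loop terminating.
func runStream() ([]string, error) {
	updates := make(chan string) // unbuffered, so each send is a handoff
	done := make(chan struct{})
	var sent []string
	errCh := make(chan error, 1)
	go func() {
		errCh <- serveStream(updates, done, func(u string) {
			sent = append(sent, u)
		})
	}()
	updates <- "v1" // returns once the handler has received the update
	close(done)     // the sync loop has terminated
	err := <-errCh  // wait for the handler to exit before reading sent
	return sent, err
}

func main() {
	sent, err := runStream()
	fmt.Println(sent, err)
}
```

A proxy evicted this way can reconnect and be served by a fresh sync loop.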

* Add changelog.
2024-03-15 13:57:11 -05:00
.changelog Fix xDS deadlock due to syncLoop termination. (#20867) 2024-03-15 13:57:11 -05:00
.github deployer: add a bunch of test coverage and fix a few panics (#20694) 2024-02-22 13:31:50 -06:00
.release security: fix syntax for release scan config (#20279) 2024-01-19 17:08:54 +00:00
acl Add default intention policy (#20544) 2024-02-08 20:25:42 +00:00
agent Fix xDS deadlock due to syncLoop termination. (#20867) 2024-03-15 13:57:11 -05:00
api NET-6821 Disable Terminating Gateway Auto Host Header Rewrite (#20802) 2024-03-12 15:37:20 -05:00
bench Gets benchmarks running again and does a rough pass for 0.7.1. 2016-11-29 13:02:26 -08:00
build-support [NET-8368] security: bump Go version to 1.21.8 (#20812) 2024-03-14 09:46:15 -04:00
command Add `consul snapshot decode` command (#20824) 2024-03-14 12:59:06 -04:00
connect Retry lint fixes (#19151) 2023-12-06 12:11:32 -05:00
contributing Move contributing to docs 2021-08-30 16:17:09 -04:00
docs docs: developer docs for resource finalizers (#20631) 2024-02-15 16:41:00 +00:00
envoyextensions security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
grafana NET-6862: adding disk-io and disk usage metrics to k8s grafana dashboard (#20169) 2024-01-16 22:14:17 +05:30
grpcmocks/proto-public v2tenancy: add optional LicenseFeature to type Registration struct (#20673) 2024-02-20 14:42:31 -06:00
internal Fix xDS deadlock due to syncLoop termination. (#20867) 2024-03-15 13:57:11 -05:00
ipaddr [COMPLIANCE] License changes (#18443) 2023-08-11 09:12:13 -04:00
lib [CC-7434] Skip collecting data directory metrics in dev mode (#20521) 2024-02-07 16:59:06 -06:00
logging Trigger the V1 Compat exported-services Controller when V1 Config Entries are Updated (#20456) 2024-02-02 15:30:04 -05:00
proto security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
proto-public security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
sdk Fix SDK iptables.Config marshalling (#20451) 2024-02-02 12:25:00 -06:00
sentinel Remove old build tags (#19128) 2023-10-10 10:58:06 -04:00
service_os Remove old build tags (#19128) 2023-10-10 10:58:06 -04:00
snapshot [COMPLIANCE] License changes (#18443) 2023-08-11 09:12:13 -04:00
test security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
test-integ security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
testing/deployer security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
testrpc feat(v2): add consul service and workloads to catalog (#20077) 2024-01-03 15:14:42 -05:00
tlsutil [Cloud][CC-6925] Updates to pushing server state (#19682) 2023-12-04 10:25:18 -05:00
tools/internal-grpc-proxy [COMPLIANCE] License changes (#18443) 2023-08-11 09:12:13 -04:00
troubleshoot security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
types [COMPLIANCE] License changes (#18443) 2023-08-11 09:12:13 -04:00
ui Revert link existing but better 🪦 (#20830) 2024-03-13 13:59:00 -07:00
version Rev VERSION for 1.19.0-dev (#20437) 2024-02-01 13:08:53 -06:00
website K8s v1 Multiport documentation indentation updates (#20858) 2024-03-14 22:11:47 +00:00
.copywrite.hcl [DO NOT MERGE UNTIL EOY] update year in LICENSE and copywrite files (#19780) 2024-01-02 08:41:12 -08:00
.dockerignore Update the scripting 2018-06-14 21:42:47 -04:00
.gitignore [CE] Misc cleanup for V2 DNS (#20640) 2024-02-14 12:40:38 -05:00
.go-version [NET-8368] security: bump Go version to 1.21.8 (#20812) 2024-03-14 09:46:15 -04:00
.golangci.yml [NET-4968] Upgrade Go to 1.21 (#20062) 2024-01-12 09:57:38 -05:00
.grpcmocks.yaml In-Memory gRPC (#19942) 2024-01-12 11:54:07 -05:00
.pre-commit-config.yaml unhack: add pre-commit guidelines (#19617) 2023-11-15 10:57:40 -06:00
CHANGELOG.md Update main changelog for 1.18.0 (#20744) 2024-02-28 00:54:33 +00:00
Dockerfile Xw/net 5724 grpc client delete (#20309) 2024-01-24 15:17:54 -08:00
Dockerfile-windows Envoy Integration Test Windows (#18007) 2023-07-21 20:26:00 +05:30
LICENSE [DO NOT MERGE UNTIL EOY] update year in LICENSE and copywrite files (#19780) 2024-01-02 08:41:12 -08:00
Makefile Update mog version to be compatible with go 1.22 (#20692) 2024-03-04 18:24:22 +00:00
README.md Update README.md (#19413) 2023-10-31 08:45:47 -07:00
buf.work.yaml [COMPLIANCE] License changes (#18443) 2023-08-11 09:12:13 -04:00
go.mod security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
go.sum security: upgrade google.golang.org/protobuf to 1.33.0 (#20801) 2024-03-06 23:04:42 +00:00
main.go [COMPLIANCE] License changes (#18443) 2023-08-11 09:12:13 -04:00
scan.hcl [NET-6969] security: Re-enable Go Module + secrets security scans for release branches (#19978) 2023-12-21 15:11:05 +00:00

README.md

Consul

License: BUSL-1.1

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.

Consul provides several key features:

  • Multi-Datacenter - Consul is built to be datacenter aware, and can support any number of regions without complex configuration.

  • Service Mesh - Consul Service Mesh enables secure service-to-service communication with automatic TLS encryption and identity-based authorization. Applications can use sidecar proxies in a service mesh configuration to establish TLS connections for inbound and outbound connections with Transparent Proxy.

  • API Gateway - Consul API Gateway manages access to services within Consul Service Mesh, allowing users to define traffic and authorization policies for services deployed within the mesh.

  • Service Discovery - Consul makes it simple for services to register themselves and to discover other services via a DNS or HTTP interface. External services such as SaaS providers can be registered as well.

  • Health Checking - Health Checking enables Consul to quickly alert operators about any issues in a cluster. The integration with service discovery prevents routing traffic to unhealthy hosts and enables service level circuit breakers.

  • Dynamic App Configuration - Consul provides an HTTP API for storing indexed objects, such as configuration parameters and application metadata.
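
As a small illustration of the HTTP interface behind Dynamic App Configuration, the sketch below reads one key from the key/value endpoint (`/v1/kv/<key>`), whose response is a JSON array with base64-encoded values. So that it runs without a Consul agent, it talks to a local stub server; `newStubServer`, `fetchKV`, the `kvEntry` type, and the `app/max_conns` key are all hypothetical names for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// kvEntry holds the two response fields this sketch uses.
type kvEntry struct {
	Key   string
	Value []byte // encoding/json converts between []byte and base64 text
}

// fetchKV reads a single key from a Consul-style KV endpoint at baseURL.
func fetchKV(baseURL, key string) (string, error) {
	resp, err := http.Get(baseURL + "/v1/kv/" + key)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	var entries []kvEntry
	if err := json.NewDecoder(resp.Body).Decode(&entries); err != nil {
		return "", err
	}
	if len(entries) == 0 {
		return "", fmt.Errorf("key %q not found", key)
	}
	return string(entries[0].Value), nil
}

// newStubServer is a stand-in for a Consul agent (assumption for this demo).
// It answers every KV read with the value "100".
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := strings.TrimPrefix(r.URL.Path, "/v1/kv/")
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode([]kvEntry{{Key: key, Value: []byte("100")}})
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()
	value, err := fetchKV(srv.URL, "app/max_conns")
	fmt.Println(value, err)
}
```

Against a real agent, `baseURL` would be the agent's HTTP address, and an ACL token may be required depending on the cluster's configuration.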

Consul runs on Linux, macOS, FreeBSD, Solaris, and Windows and includes an optional browser-based UI. A commercial version called Consul Enterprise is also available.

Please note: We take Consul's security and our users' trust very seriously. If you believe you have found a security issue in Consul, please disclose it responsibly by contacting us at security@hashicorp.com.

Quick Start

A few quick start guides are available on the Consul website:

Documentation

Full, comprehensive documentation is available on the Consul website: https://consul.io/docs

Contributing

Thank you for your interest in contributing! Please refer to CONTRIBUTING.md for guidance. For contributions specifically to the browser-based UI, please refer to the UI's README.md for guidance.