Commit Graph

547 Commits (cc2950874026f226047703ebcd8a64706b07099f)

Author SHA1 Message Date
hc-github-team-consul-core e2ec1f9718
backport of commit 49f7423ab8 (#16335)
Co-authored-by: cskh <hui.kang@hashicorp.com>
2023-02-21 10:33:43 -05:00
hc-github-team-consul-core e076fbb8f5
Backport of Apply agent partition to load services and agent api into release/1.14.x (#16041)
* backport of commit a42e86ffd8

* backport of commit 4ad0f7aff4

Co-authored-by: cskh <hui.kang@hashicorp.com>
2023-01-23 16:41:26 +00:00
hc-github-team-consul-core 3e3ab25f9b
xds: don't attempt to load-balance sessions for local proxies (#15789) (#16004)
Previously, we'd begin a session with the xDS concurrency limiter
regardless of whether the proxy was registered in the catalog or in
the server's local agent state.

This caused problems for users who run `consul connect envoy` directly
against a server rather than a client agent, as the server's locally
registered proxies wouldn't be included in the limiter's capacity.

Now, the `ConfigSource` is responsible for beginning the session and we
only do so for services in the catalog.

Fixes: https://github.com/hashicorp/consul/issues/15753

Co-authored-by: Dan Upton <daniel@floppy.co>
2023-01-19 10:34:01 +00:00
hc-github-team-consul-core 7c934a4bf9
Backport of agent: Fix assignment of error when auto-reloading cert and key file changes. into release/1.14.x (#15772)
* backport of commit 7d0cf566ca

* backport of commit 024c8a84a6

* removing unused reference to pboperator

Co-authored-by: John Murret <john.murret@hashicorp.com>
2022-12-15 21:26:43 +00:00
hc-github-team-consul-core ceb102f352
Backport of Prevent serving TLS via ports.grpc into release/1.14.x (#15342)
This pull request was automerged via backport-assistant
2022-11-11 15:29:50 -05:00
hc-github-team-consul-core 1d5ae30946
backport of commit c7aee51b3d (#15201)
This pull request was automerged via backport-assistant
2022-10-31 10:56:53 -04:00
Chris S. Kim 29a297d3e9
Refactor client RPC timeouts (#14965)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created.

Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.
2022-10-18 15:05:09 -04:00
Chris S. Kim 3d2dffff16
Merge pull request #13388 from deblasis/feature/health-checks_windows_service
Feature: Health checks windows service
2022-10-17 09:26:19 -04:00
Dan Upton 328e3ff563
proxycfg: rate-limit delivery of config snapshots (#14960)
Adds a user-configurable rate limiter to proxycfg snapshot delivery,
with a default limit of 250 updates per second.

This addresses a problem observed in our load testing of Consul
Dataplane where updating a "global" resource such as a wildcard
intention or the proxy-defaults config entry could starve the Raft or
Memberlist goroutines of CPU time, causing general cluster instability.
2022-10-14 15:52:00 +01:00
Freddy ee4cdc4985
Merge pull request #14935 from hashicorp/fix/alias-leak 2022-10-13 16:31:15 -06:00
Riddhi Shah 345191a0df
Service http checks data source for agentless proxies (#14924)
Adds another datasource for proxycfg.HTTPChecks, for use on server agents. Typically these checks are performed by local client agents and there is no equivalent of this in agentless (where servers configure consul-dataplane proxies).
Hence, the data source is mostly a no-op on servers but in the case where the service is present within the local state, it delegates to the cache data source.
2022-10-12 07:49:56 -07:00
Paul Glass d17af23641
gRPC server metrics (#14922)
* Move stats.go from grpc-internal to grpc-middleware
* Update grpc server metrics with server type label
* Add stats test to grpc-external
* Remove global metrics instance from grpc server tests
2022-10-11 17:00:32 -05:00
freddygv f4cc4577ca Fix alias check leak
Preivously when alias check was removed it would not be stopped nor
cleaned up from the associated aliasChecks map.

This means that any time an alias check was deregistered we would
leak a goroutine for CheckAlias.run() because the stopCh would never
be closed.

This issue mostly affects service mesh deployments on platforms where
the client agent is mostly static but proxy services come and go
regularly, since by default sidecars are registered with an alias check.
2022-10-10 16:42:29 -06:00
DanStough 77ab28c5c7 feat: xDS updates for peerings control plane through mesh gw 2022-10-07 08:46:42 -06:00
Chris Chapman 46bea72212
Making suggested changes 2022-09-30 14:51:12 -07:00
Chris Chapman 81e267171b
Bind a dns mux handler to gRPC proxy 2022-09-29 21:44:45 -07:00
Nick Ethier 1c1b0994b8
add HCP integration component (#14723)
* add HCP integration

* lint: use non-deprecated logging interface
2022-09-26 14:58:15 -04:00
Alessandro De Blasis fc0dd92dcf fix(agent): uninitialized map panic error 2022-09-21 09:25:54 +01:00
freddygv 02d3ce1039 Add server certificate manager
This certificate manager will request a leaf certificate for server
agents and then keep them up to date.
2022-09-16 17:57:10 -06:00
Daniel Graña 8c98172f53
[BUGFIX] Do not use interval as timeout (#14619)
Do not use interval as timeout
2022-09-15 12:39:48 -04:00
skpratt 19f79aa9a6
PR #14057 follow up fix: service id parsing from sidecar id (#14541)
* fix service id parsing from sidecar id

* simplify suffix trimming
2022-09-09 09:47:10 -05:00
Dan Upton 1c2c975b0b
xDS Load Balancing (#14397)
Prior to #13244, connect proxies and gateways could only be configured by an
xDS session served by the local client agent.

In an upcoming release, it will be possible to deploy a Consul service mesh
without client agents. In this model, xDS sessions will be handled by the
servers themselves, which necessitates load-balancing to prevent a single
server from receiving a disproportionate amount of load and becoming
overwhelmed.

This introduces a simple form of load-balancing where Consul will attempt to
achieve an even spread of load (xDS sessions) between all healthy servers.
It does so by implementing a concurrent session limiter (limiter.SessionLimiter)
and adjusting the limit according to autopilot state and proxy service
registrations in the catalog.

If a server is already over capacity (i.e. the session limit is lowered),
Consul will begin draining sessions to rebalance the load. This will result
in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's
responsibility to observe this response and reconnect to a different server.

Users of the gRPC client connection brokered by the
consul-server-connection-manager library will get this for free.

The rate at which Consul will drain sessions to rebalance load is scaled
dynamically based on the number of proxies in the catalog.
2022-09-09 15:02:01 +01:00
Derek Menteer f7c884f0af Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports 2022-09-08 14:53:08 -05:00
Derek Menteer bfe7c5e8af Remove rebuilding grpc server. 2022-09-08 13:45:44 -05:00
Derek Menteer 80d31458e5 Various cleanups. 2022-09-08 10:51:50 -05:00
Chris S. Kim 03df6c3ac6
Reuse http.DefaultTransport in UIMetricsProxy (#14521)
http.Transport keeps a pool of connections and should be reused when possible. We instantiate a new http.DefaultTransport for every metrics request, making large numbers of concurrent requests inefficiently spin up new connections instead of reusing open ones.
2022-09-08 11:02:05 -04:00
skpratt 3bf1edfb3f
move port and default check logic to locked step (#14057) 2022-09-06 19:35:31 -05:00
Daniel Upton 8c46e48e0d proxycfg-glue: server-local implementation of IntentionUpstreamsDestination
This is the OSS portion of enterprise PR 2463.

Generalises the serverIntentionUpstreams type to support matching on a
service or destination.
2022-09-06 23:27:25 +01:00
Daniel Upton f8dba7e9ac proxycfg-glue: server-local implementation of InternalServiceDump
This is the OSS portion of enterprise PR 2489.

This PR introduces a server-local implementation of the
proxycfg.InternalServiceDump interface that sources data from a blocking query
against the server's state store.

For simplicity, it only implements the subset of the Internal.ServiceDump RPC
handler actually used by proxycfg - as such the result type has been changed
to IndexedCheckServiceNodes to avoid confusion.
2022-09-06 23:27:25 +01:00
Daniel Upton a31738f76f proxycfg-glue: server-local implementation of ResolvedServiceConfig
This is the OSS portion of enterprise PR 2460.

Introduces a server-local implementation of the proxycfg.ResolvedServiceConfig
interface that sources data from a blocking query against the server's state
store.

It moves the service config resolution logic into the agent/configentry package
so that it can be used in both the RPC handler and data source.

I've also done a little re-arranging and adding comments to call out data
sources for which there is to be no server-local equivalent.
2022-09-06 23:27:25 +01:00
Derek Menteer f64771c707 Address PR comments. 2022-09-01 16:54:24 -05:00
Derek Menteer 0ceec9017b Expose `grpc_tls` via serf for cluster peering. 2022-08-29 13:43:49 -05:00
Derek Menteer 1255a8a20d Add separate grpc_tls port.
To ease the transition for users, the original gRPC
port can still operate in a deprecated mode as either
plain-text or TLS mode. This behavior should be removed
in a future release whenever we no longer support this.

The resulting behavior from this commit is:
  `ports.grpc > 0 && ports.grpc_tls > 0` spawns both plain-text and tls ports.
  `ports.grpc > 0 && grpc.tls == undefined` spawns a single plain-text port.
  `ports.grpc > 0 && grpc.tls != undefined` spawns a single tls port (backwards compat mode).
2022-08-29 13:43:43 -05:00
Alessandro De Blasis 26cc56bc68 fix(agent): removed redundant code in docker check as well 2022-08-29 18:15:59 +01:00
Alessandro De Blasis c0d647d11e fix(agent): removed redundant check on prev. running check 2022-08-29 17:53:39 +01:00
Alessandro De Blasis 5dee555888 Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
2022-08-15 08:26:55 +01:00
Alessandro De Blasis ab611eabc3 Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
2022-08-15 08:09:56 +01:00
Chris S. Kim e3046120b3 Close active listeners on error
If startListeners successfully created listeners for some of its input addresses but eventually failed, the function would return an error and existing listeners would not be cleaned up.
2022-08-09 12:22:39 -04:00
alex a45bb1f06b
block PeerName register requests (#13887)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-29 14:36:22 -07:00
Luke Kysow a1e6d69454
peering: add config to enable/disable peering (#13867)
* peering: add config to enable/disable peering

Add config:

```
peering {
  enabled = true
}
```

Defaults to true. When disabled:
1. All peering RPC endpoints will return an error
2. Leader won't start its peering establishment goroutines
3. Leader won't start its peering deletion goroutines
2022-07-22 15:20:21 -07:00
Daniel Upton a8df87f574 proxycfg-glue: server-local implementation of `ExportedPeeredServices`
This is the OSS portion of enterprise PR 2377.

Adds a server-local implementation of the proxycfg.ExportedPeeredServices
interface that sources data from a blocking query against the server's
state store.
2022-07-22 15:23:23 +01:00
Daniel Upton 3655802fdc proxycfg-glue: server-local implementation of `PeeredUpstreams`
This is the OSS portion of enterprise PR 2352.

It adds a server-local implementation of the proxycfg.PeeredUpstreams interface
based on a blocking query against the server's state store.

It also fixes an omission in the Virtual IP freeing logic where we were never
updating the max index (and therefore blocking queries against
VirtualIPsForAllImportedServices would not return on service deletion).
2022-07-21 13:51:59 +01:00
R.B. Boyer bb4d4040fb
server: ensure peer replication can successfully use TLS over external gRPC (#13733)
Ensure that the peer stream replication rpc can successfully be used with TLS activated.

Also:

- If key material is configured for the gRPC port but HTTPS is not
  enabled now TLS will still be activated for the gRPC port.

- peerstream replication stream opened by the establishing-side will now
  ignore grpc.WithBlock so that TLS errors will bubble up instead of
  being awkwardly delayed or suppressed
2022-07-15 13:15:50 -05:00
Dan Stough 49f3dadb8f feat: connect proxy xDS for destinations
Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>
2022-07-14 15:27:02 -04:00
Daniel Upton 3d74efa8ad proxycfg-glue: server-local implementation of `FederationStateListMeshGateways`
This is the OSS portion of enterprise PR 2265.

This PR provides a server-local implementation of the
proxycfg.FederationStateListMeshGateways interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton ccc672013e proxycfg-glue: server-local implementation of `GatewayServices`
This is the OSS portion of enterprise PR 2259.

This PR provides a server-local implementation of the proxycfg.GatewayServices
interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton 15a319dbfe proxycfg-glue: server-local implementation of `TrustBundle` and `TrustBundleList`
This is the OSS portion of enterprise PR 2250.

This PR provides server-local implementations of the proxycfg.TrustBundle and
proxycfg.TrustBundleList interfaces, based on local blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton 673d02d30f proxycfg-glue: server-local implementation of the `Health` interface
This is the OSS portion of enterprise PR 2249.

This PR introduces an implementation of the proxycfg.Health interface based on a
local materialized view of the health events.

It reuses the view and request machinery from agent/rpcclient/health, which made
it super straightforward.
2022-07-14 18:22:12 +01:00
Daniel Upton 3c533ceea8 proxycfg-glue: server-local implementation of `ServiceList`
This is the OSS portion of enterprise PR 2242.

This PR introduces a server-local implementation of the proxycfg.ServiceList
interface, backed by streaming events and a local materializer.
2022-07-14 18:22:12 +01:00
Daniel Upton fbf88d3b19 proxycfg-glue: server-local compiled discovery chain data source
This is the OSS portion of enterprise PR 2236.

Adds a local blocking query-based implementation of the proxycfg.CompiledDiscoveryChain interface.
2022-07-14 18:22:12 +01:00