4552 Commits (560d410c6d52a23609ddf04d983d61c79a701c29)

Author SHA1 Message Date
Chris S. Kim de73171202 Handle wrapped errors in isFailedPreconditionErr 2022-08-11 11:16:02 -04:00
Daniel Kimsey 3c4fa9b468 Add support for filtering the 'List Services' API
1. Create a bexpr filter for performing the filtering
2. Change the state store functions to return the raw (not aggregated)
   list of ServiceNodes.
3. Move the aggregate service tags by name logic out of the state store
   functions into a new function called from the RPC endpoint
4. Perform the filtering in the endpoint before aggregation.
2022-08-10 16:52:32 -05:00
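The ordering described in the commit above (filter the raw list first, aggregate tags by service name afterwards) can be sketched roughly as follows. The `ServiceNode` type, `filterNodes`, and `aggregateTags` are hypothetical stand-ins, and a plain Go predicate replaces the bexpr filter; this is not the actual Consul code.

```go
package main

import (
	"fmt"
	"sort"
)

// ServiceNode is a hypothetical, trimmed-down stand-in for a raw catalog entry.
type ServiceNode struct {
	Node        string
	ServiceName string
	ServiceTags []string
}

// filterNodes keeps the raw entries accepted by keep. The commit uses a bexpr
// filter for this step; a plain predicate is assumed here for illustration.
func filterNodes(nodes []ServiceNode, keep func(ServiceNode) bool) []ServiceNode {
	var out []ServiceNode
	for _, n := range nodes {
		if keep(n) {
			out = append(out, n)
		}
	}
	return out
}

// aggregateTags collapses raw entries into service name -> sorted unique tags.
func aggregateTags(nodes []ServiceNode) map[string][]string {
	seen := make(map[string]map[string]struct{})
	for _, n := range nodes {
		if seen[n.ServiceName] == nil {
			seen[n.ServiceName] = make(map[string]struct{})
		}
		for _, t := range n.ServiceTags {
			seen[n.ServiceName][t] = struct{}{}
		}
	}
	out := make(map[string][]string, len(seen))
	for name, tags := range seen {
		list := make([]string, 0, len(tags))
		for t := range tags {
			list = append(list, t)
		}
		sort.Strings(list)
		out[name] = list
	}
	return out
}

func main() {
	raw := []ServiceNode{
		{Node: "n1", ServiceName: "web", ServiceTags: []string{"v1"}},
		{Node: "n2", ServiceName: "web", ServiceTags: []string{"v2"}},
		{Node: "n2", ServiceName: "db", ServiceTags: []string{"primary"}},
	}
	// Filter on the raw entries, then aggregate what is left.
	onN2 := filterNodes(raw, func(n ServiceNode) bool { return n.Node == "n2" })
	fmt.Println(aggregateTags(onN2))
}
```

Filtering before aggregation matters because per-node fields such as `Node` no longer exist once entries have been collapsed by service name.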
cskh 11e7a0d547
fix: shadowed err in retryJoin() (#14112)
- err value will be used later to surface the error message
  if r.join() returns any err.
2022-08-10 10:53:57 -04:00
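The kind of bug named in the commit above can be illustrated with a minimal sketch. The `join` function is a hypothetical stand-in for `r.join()`, and neither retry function is the real `retryJoin()`.

```go
package main

import (
	"errors"
	"fmt"
)

// join is a hypothetical stand-in for r.join(); it always fails here.
func join() error {
	return errors.New("no servers reachable")
}

// retryShadowed uses := inside the loop, which declares a new err scoped to
// the if statement. The outer err returned at the end is never assigned.
func retryShadowed(attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err := join(); err == nil { // shadows the outer err
			return nil
		}
	}
	return err // always nil, so the failure message is lost
}

// retryFixed assigns to the outer err with =, so the last failure survives.
func retryFixed(attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = join(); err == nil {
			return nil
		}
	}
	return err
}

func main() {
	fmt.Println("shadowed:", retryShadowed(3))
	fmt.Println("fixed:   ", retryFixed(3))
}
```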
skpratt 79c23a7cd2
Merge pull request #14056 from hashicorp/proxy-register-port-race
Refactor sidecar_service method to separate port assignment
2022-08-10 09:46:29 -05:00
skpratt aa77559819 Merge branch 'main' into proxy-register-port-race 2022-08-10 08:40:45 -05:00
Chris S. Kim e3046120b3 Close active listeners on error
If startListeners successfully created listeners for some of its input addresses but eventually failed, the function would return an error and existing listeners would not be cleaned up.
2022-08-09 12:22:39 -04:00
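A minimal sketch of the cleanup described in the commit above, assuming plain TCP listeners. This `startListeners` is written for illustration only and does not reproduce Consul's function of the same name.

```go
package main

import (
	"fmt"
	"net"
)

// startListeners opens one TCP listener per address. If any address fails,
// the listeners opened so far are closed before the error is returned.
func startListeners(addrs []string) ([]net.Listener, error) {
	var lns []net.Listener
	for _, addr := range addrs {
		ln, err := net.Listen("tcp", addr)
		if err != nil {
			for _, open := range lns {
				open.Close() // release listeners created earlier in this call
			}
			return nil, fmt.Errorf("listen on %q: %w", addr, err)
		}
		lns = append(lns, ln)
	}
	return lns, nil
}

func main() {
	// The second address is deliberately invalid, so the call fails after the
	// first listener may already have been created.
	_, err := startListeners([]string{"127.0.0.1:0", "not-an-address"})
	fmt.Println("error:", err)
}
```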
Chris S. Kim 6311c651de Add retry in TestAgentConnectCALeafCert_good 2022-08-09 11:20:37 -04:00
Kyle Havlovitz 6938b8c755
Merge pull request #13958 from hashicorp/gateway-wildcard-fix
Fix wildcard picking up services it shouldn't for ingress/terminating gateways
2022-08-08 12:54:40 -07:00
Kyle Havlovitz fe1fcea34f Add some extra handling for destination deletes 2022-08-08 11:38:13 -07:00
freddygv d421e18172 Update snapshot test 2022-08-08 09:17:15 -06:00
freddygv 1031ffc3c7 Re-validate existing secrets at state store
Previously establishment and pending secrets were only checked at the
RPC layer. However, given that these are Check-and-Set transactions we
should ensure that the given secrets are still valid when persisting a
secret exchange or promotion.

Otherwise it would be possible for concurrent requests to overwrite each
other.
2022-08-08 09:06:07 -06:00
freddygv 0ea4bfae94 Test fixes 2022-08-08 08:31:47 -06:00
freddygv c04515a844 Use proto message for each secrets write op
Previously there was a field indicating the operation that triggered a
secrets write. Now there is a message for each operation and it contains
the secret ID being persisted.
2022-08-08 01:41:00 -06:00
Kyle Havlovitz 6580566c3b Update ingress/terminating wildcard logic and handle destinations 2022-08-05 07:56:10 -07:00
freddygv 8067890787 Inherit active secret when exchanging 2022-08-03 17:32:53 -05:00
freddygv 60d6e28c97 Pass explicit signal with op for secrets write
Previously the updates to the peering secrets UUID table relied on
inferring what action triggered the update based on a reconciliation
against the existing secrets.

Instead we now explicitly require the operation to be given so that the
inference isn't necessary. This makes the UUID table logic easier to
reason about and fixes some related bugs.

There is also an update so that the peering secrets get handled on
snapshots/restores.
2022-08-03 17:25:12 -05:00
freddygv 9ca687bc7c Avoid deleting peering secret UUIDs at dialers
Dialers do not keep track of peering secret UUIDs, so they should not
attempt to clean up data from that table when their peering is deleted.

We also now keep peer server addresses when marking peerings for
deletion. Peer server addresses are used by the ShouldDial() helper
when determining whether the peering is for a dialer or an acceptor.
We need to keep this data so that peering secrets can be cleaned up
accordingly.
2022-08-03 16:34:57 -05:00
skpratt 58eed6b049
Merge pull request #13906 from skpratt/validate-port-agent-split
Separate port and socket path validation for local agent
2022-08-02 16:58:41 -05:00
Dhia Ayachi 7154367892
add token to the request when creating a cacheIntentions query (#14005) 2022-08-02 14:27:34 -04:00
Kyle Havlovitz 499211f907 Fix wildcard picking up services it shouldn't for ingress/terminating gateways 2022-08-02 09:41:31 -07:00
Daniel Upton 6452118c15 proxycfg-sources: fix hot loop when service not found in catalog
Fixes a bug where a service getting deleted from the catalog would cause
the ConfigSource to spin in a hot loop attempting to look up the service.

This is because we were returning a nil WatchSet which would always
unblock the select.

Kudos to @freddygv for discovering this!
2022-08-02 15:42:29 +01:00
Freddy 42996411cc
Various peering fixes (#13979)
* Avoid logging StreamSecretID
* Wrap additional errors in stream handler
* Fix flakiness in leader test and rename servers for clarity. There was
  a race condition where the peering was being deleted in the test
  before the stream was active. Now the test waits for the stream to be
  connected on both sides before deleting the associated peering.
* Run flaky test serially
2022-08-01 15:06:18 -06:00
DanStough 169ff71132 fix: ipv4 destination dns resolution 2022-08-01 16:45:57 -04:00
Luke Kysow 988e1fd35d
peering: default to false (#13963)
* defaulting to false because peering will be released as beta
* Ignore peering disabled error in bundles cachetype

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
2022-08-01 15:22:36 -04:00
Freddy dacf703d20
Merge branch 'main' into fix-kv_entries-metric 2022-08-01 13:19:27 -06:00
Freddy 72b6d69652
Merge pull request #13499 from maxb/delete-unused-metric
Delete definition of metric `consul.acl.blocked.node.deregistration`
2022-08-01 12:31:05 -06:00
Dhia Ayachi 6fd65a4a45
Tgtwy egress HTTP support (#13953)
* add golden files

* add support to http in tgateway egress destination

* fix slice sorting to include both address and port when using server_names

* fix listener loop for http destination

* fix routes to generate a route per port and a virtualhost per port-address combination

* sort virtual hosts list to have a stable order

* extract redundant serviceNode
2022-08-01 14:12:43 -04:00
Matt Keeler f74d0cef7a
Implement/Utilize secrets for Peering Replication Stream (#13977) 2022-08-01 10:33:18 -04:00
alex a45bb1f06b
block PeerName register requests (#13887)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-29 14:36:22 -07:00
Luke Kysow 95096e2c03
peering: retry establishing connection more quickly on certain errors (#13938)
When we receive a FailedPrecondition error, retry that more quickly
because we expect it will resolve shortly. This is particularly
important in the context of Consul servers behind a load balancer
because when establishing a connection we have to retry until we
randomly land on a leader node.

The default retry backoff goes 2s, 4s, 8s, etc., which can result in
very long delays quite quickly. Instead, this backoff retries in 8ms
five times, then goes exponentially from there: 16ms, 32ms, ... up to a
max of 8192ms.
2022-07-29 13:04:32 -07:00
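The fast-retry schedule from the commit above can be sketched as a small function. The constant and function names are hypothetical, and the cap is assumed to be 8192ms (8ms doubled ten times); the real implementation may differ.

```go
package main

import (
	"fmt"
	"time"
)

const (
	fastRetry    = 8 * time.Millisecond    // constant delay for the first attempts
	maxFastTries = 5                       // number of attempts that use fastRetry
	maxBackoff   = 8192 * time.Millisecond // assumed cap: 8ms << 10
)

// retryDelay returns the wait before the next attempt, given how many
// attempts have failed so far (counting from 1).
func retryDelay(failedAttempts uint) time.Duration {
	if failedAttempts <= maxFastTries {
		return fastRetry
	}
	shift := failedAttempts - maxFastTries
	if shift >= 10 { // 8ms << 10 reaches the cap; also avoids overflow
		return maxBackoff
	}
	return fastRetry << shift
}

func main() {
	for attempt := uint(1); attempt <= 16; attempt++ {
		fmt.Printf("attempt %2d: wait %v\n", attempt, retryDelay(attempt))
	}
}
```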
Sarah Pratt 10a4999a87 Separate port and socket path requirement in case of local agent assignment 2022-07-29 13:28:21 -05:00
alex 92c615c35f
Merge pull request #13952 from hashicorp/sync-more-acl
sync more acl enforcement
2022-07-28 12:31:02 -07:00
Dhia Ayachi 256694b603
inject gateway addons to destination clusters (#13951) 2022-07-28 15:17:35 -04:00
acpana eae4e71492
sync more acl enforcement
sync w ent at 32756f7

Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-28 12:01:52 -07:00
alex 41f3343eac
Merge pull request #13929 from hashicorp/fix-validation
[sync] fix empty partitions matching
2022-07-28 10:14:49 -07:00
Sarah Pratt a3ef6f016e refactor sidecar_service method into parts 2022-07-28 09:07:13 -05:00
Ashwin Venkatesh eef9edaed9
Add peer counts to emitted metrics. (#13930) 2022-07-27 18:34:04 -04:00
Luke Kysow 465a9801e1
Merge pull request #13924 from hashicorp/lkysow/util-metric-peering
peering: don't track imported services/nodes in usage
2022-07-27 14:49:55 -07:00
acpana 6033584349
use EqualPartitions
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-27 14:48:30 -07:00
acpana 0351ca5136
better fix
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-27 14:28:08 -07:00
acpana 8b2ef80336
sync w ent
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-27 11:41:39 -07:00
Chris S. Kim 0999e05a7d Reduce arm64 flakes for TestConnectCA_ConfigurationSet_ChangeKeyConfig_Primary
There were 16 combinations of tests but 4 of them were duplicates since the default key type and bits were "ec" and 256. That entry was commented out to reduce the subtest count to 12.

testrpc.WaitForLeader was failing on arm64 environments; the cause is unknown but it might be due to the environment being flooded with parallel tests making RPC calls. The RPC polling+retry was replaced with a simpler check for leadership based on raft.
2022-07-27 13:54:34 -04:00
Chris S. Kim 8ead1caf53 Retry checks for virtual IP metadata 2022-07-27 13:54:34 -04:00
Chris S. Kim 62ed0250c3 Sort slice of ServiceNames deterministically 2022-07-27 13:54:34 -04:00
Sarah Pratt f520f6dd0f Separate port and socket path requirement in case of local agent assignment 2022-07-27 12:30:52 -05:00
Luke Kysow 740d54e730 peering: don't track imported services/nodes in usage
Services/nodes that are imported from other peers are stored in
state. We don't want to count those as part of our own cluster's usage.
2022-07-27 09:08:51 -07:00
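One way to picture the change in the commit above: entries imported from a peer carry a peer name, and only entries without one count toward local usage. The `ServiceNode` type and `localServiceCount` below are hypothetical and only sketch that rule.

```go
package main

import "fmt"

// ServiceNode is a hypothetical catalog entry. An empty PeerName is assumed
// to mean the entry belongs to the local cluster.
type ServiceNode struct {
	Name     string
	PeerName string
}

// localServiceCount counts only the entries that were not imported from a peer.
func localServiceCount(nodes []ServiceNode) int {
	count := 0
	for _, n := range nodes {
		if n.PeerName == "" {
			count++
		}
	}
	return count
}

func main() {
	nodes := []ServiceNode{
		{Name: "web"},
		{Name: "db"},
		{Name: "billing", PeerName: "cluster-02"}, // imported, not counted
	}
	fmt.Println("local services:", localServiceCount(nodes))
}
```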
cskh 4e292b7b72
chore: clarify the error message: service.service must not be empty (#13907)
- when registering a service using the catalog endpoint, the key for the
  service name should actually be "service". Adding this information to
  the error message will help users quickly fix the request.
2022-07-27 10:16:46 -04:00
cskh 59e81a728e
chore: removed unused method AddService (#13905)
- This AddService is not used anywhere.
  AddServiceWithChecks is used in place of AddService
- Test code is updated
2022-07-26 16:54:53 -04:00
Luke Kysow 021b00e321 Remove duplicate comment 2022-07-26 10:19:49 -07:00
alex 437a28d18a
peering: prevent peering in same partition (#13851)
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-07-25 18:00:48 -07:00
Nitya Dhanushkodi 27bd895ac8
peering: remove validation that forces peering token server addresses to be an IP, allow hostname based addresses (#13874) 2022-07-25 16:33:47 -07:00
Luke Kysow 8c5b70d227
Rename receive to recv in tracker (#13896)
Because it's shorter
2022-07-25 16:08:03 -07:00
Luke Kysow 3530d3782d
peering: read endpoints can now return failing status (#13849)
Track streams that have been disconnected due to an error and
set their statuses to failing.
2022-07-25 14:27:53 -07:00
Kyle Havlovitz 93de25f87c
Merge pull request #13872 from hashicorp/remove-upstream-log
Remove extra logging from ingress upstream watch shutdown
2022-07-25 12:55:30 -07:00
Chris S. Kim 73a84f256f
Preserve PeeringState on upsert (#13666)
Fixes a bug where, if the generate token endpoint is called twice, the second call upserts the zero-value (undefined) of PeeringState.
2022-07-25 14:37:56 -04:00
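The fix in the commit above amounts to keeping the stored state when the incoming record carries the zero value. A minimal sketch follows; the `Peering` and `store` types and the state constants are hypothetical and stand in for the real state store.

```go
package main

import "fmt"

// PeeringState is a hypothetical enum whose zero value means "undefined".
type PeeringState int

const (
	StateUndefined PeeringState = iota
	StatePending
	StateActive
)

// Peering is a hypothetical, trimmed-down peering record.
type Peering struct {
	Name  string
	State PeeringState
}

// store is a hypothetical in-memory stand-in for the state store.
type store map[string]Peering

// upsert writes p, but keeps the existing state when p.State is the zero
// value, so a repeated write cannot reset a known state to undefined.
func (s store) upsert(p Peering) {
	if existing, ok := s[p.Name]; ok && p.State == StateUndefined {
		p.State = existing.State
	}
	s[p.Name] = p
}

func main() {
	s := store{}
	s.upsert(Peering{Name: "cluster-02", State: StatePending})
	s.upsert(Peering{Name: "cluster-02"}) // second call carries no state
	fmt.Println(s["cluster-02"].State == StatePending)
}
```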
Chris S. Kim 8ed49ea4d0
Update envoy metrics label extraction for peered clusters and listeners (#13818)
Now that peered upstreams can generate envoy resources (#13758), we need a way to disambiguate local from peered resources in our metrics. The key difference is that datacenter and partition will be replaced with peer, since in the context of peered resources partition is ambiguous (could refer to the partition in a remote cluster or one that exists locally). The partition and datacenter of the proxy will always be that of the source service.

Regexes were updated to make emitting datacenter and partition labels mutually exclusive with peer labels.

Listener filter names were updated to better match the existing regex.

Cluster names assigned to peered upstreams were updated to be synthesized from local peer name (it previously used the externally provided primary SNI, which contained the peer name from the other side of the peering). Integration tests were updated to assert for the new peer labels.
2022-07-25 13:49:00 -04:00
DanStough 2da8949d78 feat: convert destination address to slice 2022-07-25 12:31:58 -04:00
Freddy f03cca7576
[OSS] Add ACL enforcement to peering endpoints (#13878) 2022-07-25 10:04:10 -06:00
Matt Keeler 58e4d8235b
Enable/Disable Peering Support in the UI (#13816)
We enable/disable based on the config flag.
2022-07-25 11:50:11 -04:00
freddygv b544ce6485 Add ACL enforcement to peering endpoints 2022-07-25 09:34:29 -06:00
Kyle Havlovitz 016f963e7e Remove excess debug log from ingress upstream shutdown 2022-07-22 17:29:38 -07:00
alex 279d458e6e
peering: use ShouldDial to validate peer role (#13823)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-22 15:56:25 -07:00
Luke Kysow a1e6d69454
peering: add config to enable/disable peering (#13867)
* peering: add config to enable/disable peering

Add config:

```
peering {
  enabled = true
}
```

Defaults to true. When disabled:
1. All peering RPC endpoints will return an error
2. Leader won't start its peering establishment goroutines
3. Leader won't start its peering deletion goroutines
2022-07-22 15:20:21 -07:00
Kyle Havlovitz 0786517b56
Merge pull request #13847 from hashicorp/gateway-goroutine-leak
Fix goroutine leaks in proxycfg when using ingress gateway
2022-07-22 14:43:22 -07:00
Freddy f99df57840
[OSS] Add new peering ACL rule (#13848)
This commit adds a new ACL rule named "peering" to authorize
actions taken against peering-related endpoints.

The "peering" rule has several key properties:
- It is scoped to a partition, and MUST be defined in the default
  namespace.

- Its access level must be "read", "write", or "deny".

- Granting an access level will apply to all peerings. This ACL rule
  cannot be used to selectively grant access to some peerings but not
  others.

- If the peering rule is not specified, we fall back to the "operator"
  rule and then the default ACL rule.
2022-07-22 14:42:23 -06:00
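The fallback order in the last bullet of the commit above ("peering" rule, then "operator" rule, then the default ACL rule) can be sketched as below. The `Access` and `Policy` types are hypothetical and far simpler than Consul's ACL authorizer.

```go
package main

import "fmt"

// Access is a hypothetical access level; Unset means the rule is not specified.
type Access int

const (
	Unset Access = iota
	Deny
	Read
	Write
)

// Policy is a hypothetical, flattened view of the rules relevant to peering.
type Policy struct {
	Peering  Access
	Operator Access
	Default  Access
}

// peeringAccess resolves the effective access for peering endpoints:
// the peering rule wins, then the operator rule, then the default rule.
func (p Policy) peeringAccess() Access {
	if p.Peering != Unset {
		return p.Peering
	}
	if p.Operator != Unset {
		return p.Operator
	}
	return p.Default
}

func main() {
	fmt.Println(Policy{Peering: Read, Operator: Write, Default: Deny}.peeringAccess() == Read)
	fmt.Println(Policy{Operator: Write, Default: Deny}.peeringAccess() == Write)
	fmt.Println(Policy{Default: Deny}.peeringAccess() == Deny)
}
```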
alex 927cee692b
peering: emit exported services count metric (#13811)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-22 12:05:08 -07:00
Daniel Upton a8df87f574 proxycfg-glue: server-local implementation of `ExportedPeeredServices`
This is the OSS portion of enterprise PR 2377.

Adds a server-local implementation of the proxycfg.ExportedPeeredServices
interface that sources data from a blocking query against the server's
state store.
2022-07-22 15:23:23 +01:00
Eric Haberkorn 501089292e
Add Cluster Peering Failover Support to Prepared Queries (#13835)
Add peering failover support to prepared queries
2022-07-22 09:14:43 -04:00
Nitya Dhanushkodi f47319b7c6
update generate token endpoint to take external addresses (#13844)
Update generate token endpoint (rpc, http, and api module)

If ServerExternalAddresses are set, they override any addresses obtained from the "consul" service; they are used in the token instead and dialed by the dialer. This allows, for example, setting up a load balancer in front of the Consul servers.
2022-07-21 14:56:11 -07:00
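The override rule in the commit above reduces to a simple preference. In this sketch `tokenAddresses` is a hypothetical helper and the addresses are made-up examples.

```go
package main

import "fmt"

// tokenAddresses picks the server addresses to embed in a peering token:
// explicitly supplied external addresses win over discovered ones.
func tokenAddresses(external, discovered []string) []string {
	if len(external) > 0 {
		return external
	}
	return discovered
}

func main() {
	discovered := []string{"10.0.0.1:8502", "10.0.0.2:8502"} // example values
	fmt.Println(tokenAddresses(nil, discovered))
	fmt.Println(tokenAddresses([]string{"lb.example.com:8502"}, discovered))
}
```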
acpana 12b773ab02
Rename peering internal to ~
sync ENT to 5679392c81

Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-21 10:51:05 -07:00
Luke Kysow 0c87be0845
peering: Add heartbeating to peering streams (#13806)
* Add heartbeating to peering streams
2022-07-21 10:03:27 -07:00
Daniel Upton 3655802fdc proxycfg-glue: server-local implementation of `PeeredUpstreams`
This is the OSS portion of enterprise PR 2352.

It adds a server-local implementation of the proxycfg.PeeredUpstreams interface
based on a blocking query against the server's state store.

It also fixes an omission in the Virtual IP freeing logic where we were never
updating the max index (and therefore blocking queries against
VirtualIPsForAllImportedServices would not return on service deletion).
2022-07-21 13:51:59 +01:00
Luke Kysow c411e6b326
Add send mutex to protect against concurrent sends (#13805) 2022-07-20 15:48:18 -07:00
Kyle Havlovitz 0be7d923dc Cancel upstream watches when the discovery chain has been removed 2022-07-20 14:26:52 -07:00
Kyle Havlovitz 31318d7049 Fix duplicate Notify calls for discovery chains in ingress gateways 2022-07-20 14:25:20 -07:00
Evan Culver 4116537b83
connect: Add support for Envoy 1.23, remove 1.19 (#13807) 2022-07-19 14:51:04 -07:00
Paul Glass 77afe0e76e
Extract AWS auth implementation out of Consul (#13760) 2022-07-19 16:26:44 -05:00
Chris S. Kim 495936300e
Make envoy resources for inferred peered upstreams (#13758)
Peered upstreams have a separate loop in xds from discovery chain upstreams. This PR adds similar but slightly modified code to add filters for peered upstream listeners, clusters, and endpoints in the case of transparent proxy.
2022-07-19 14:56:28 -04:00
alex de5a991d8c
peering: refactor reconcile, cleanup (#13795)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-19 11:43:29 -07:00
Luke Kysow e8d965e56f
peerstream: set keepalive enforcement to 15s (#13796)
The client is set to send keepalive pings every 30s. The server
keepalive enforcement must be set to a number less than that,
otherwise it will disconnect clients for sending pings too often.
MinTime governs the minimum amount of time between pings.
2022-07-18 16:12:03 -07:00
alex a9ae2ff4fa
peering: track exported services (#13784)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-18 10:20:04 -07:00
R.B. Boyer cd513aeead
peerstream: require a resource subscription to receive updates of that type (#13767)
This mimics xDS's discovery protocol where you must request a resource
explicitly for the exporting side to send those events to you.

As part of this I aligned the overall ResourceURL with the TypeURL that
gets embedded into the encoded protobuf Any construct. The
CheckServiceNodes is now wrapped in a better-named "ExportedService"
struct.
2022-07-15 15:03:40 -05:00
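The subscription requirement in the commit above can be pictured with a small sketch in which the exporting side drops updates for resource types the peer has not requested. The `peerStream` type and the resource URL strings are hypothetical.

```go
package main

import "fmt"

// peerStream is a hypothetical stand-in for one side of a peering stream.
type peerStream struct {
	subscribed map[string]bool // resource URLs the peer has asked for
	sent       []string        // payloads actually delivered
}

// subscribe records an explicit request for a resource type.
func (s *peerStream) subscribe(resourceURL string) {
	if s.subscribed == nil {
		s.subscribed = make(map[string]bool)
	}
	s.subscribed[resourceURL] = true
}

// publish delivers the payload only if the resource type was requested.
func (s *peerStream) publish(resourceURL, payload string) bool {
	if !s.subscribed[resourceURL] {
		return false
	}
	s.sent = append(s.sent, payload)
	return true
}

func main() {
	const exported = "example/ExportedService" // made-up resource URL
	s := &peerStream{}
	fmt.Println("before subscribe:", s.publish(exported, "web"))
	s.subscribe(exported)
	fmt.Println("after subscribe: ", s.publish(exported, "web"))
}
```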
R.B. Boyer c737301093
peerstream: fix test assertions (#13780) 2022-07-15 14:43:24 -05:00
Luke Kysow 46381b1a7f
Add docs for peerStreamServer vs peeringServer. (#13781) 2022-07-15 12:23:05 -07:00
Luke Kysow ca3d7c964c
peerstream: dialer should reconnect when stream closes (#13745)
* peerstream: dialer should reconnect when stream closes

If the stream is closed unexpectedly (i.e. when we haven't received
a terminated message), the dialer should attempt to re-establish the
stream.

Previously, the `HandleStream` would return `nil` when the stream
was closed. The caller then assumed the stream was terminated on purpose
and so didn't reconnect when instead it was stopped unexpectedly and
the dialer should have attempted to reconnect.
2022-07-15 11:58:33 -07:00
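The behaviour change in the commit above can be sketched as follows: a stream that ends without a terminated message yields an error, so the caller can tell it should reconnect. The `message` type, `handleStream`, and the two fake receivers are hypothetical and are not Consul's `HandleStream`.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// message is a hypothetical stream message; Terminated marks a deliberate shutdown.
type message struct{ Terminated bool }

var errUnexpectedClose = errors.New("stream closed without a terminated message")

// handleStream reads until the stream ends. It returns nil only for a
// deliberate termination; an unexpected close is reported as an error.
func handleStream(recv func() (message, error)) error {
	for {
		msg, err := recv()
		switch {
		case errors.Is(err, io.EOF):
			return errUnexpectedClose
		case err != nil:
			return err
		case msg.Terminated:
			return nil
		}
	}
}

// closedEarly simulates a stream that ends without warning.
func closedEarly() (message, error) { return message{}, io.EOF }

// terminated simulates a peer that sends a terminated message.
func terminated() (message, error) { return message{Terminated: true}, nil }

func main() {
	fmt.Println("terminated:  ", handleStream(terminated))
	fmt.Println("closed early:", handleStream(closedEarly))
}
```

A dialer built on this would treat a non-nil result as a signal to re-establish the stream and a nil result as a signal to stop.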
R.B. Boyer bb4d4040fb
server: ensure peer replication can successfully use TLS over external gRPC (#13733)
Ensure that the peer stream replication rpc can successfully be used with TLS activated.

Also:

- If key material is configured for the gRPC port but HTTPS is not
  enabled, TLS will still be activated for the gRPC port.

- peerstream replication stream opened by the establishing-side will now
  ignore grpc.WithBlock so that TLS errors will bubble up instead of
  being awkwardly delayed or suppressed
2022-07-15 13:15:50 -05:00
alex adb5ffa1a6
peering: track imported services (#13718) 2022-07-15 10:20:43 -07:00
Matt Keeler 257f88d4df
Use Node Name for peering healthSnapshot instead of ID (#13773)
A Node ID is not a required field in Consul’s data model, so we cannot reliably expect all uses to have it. The node name, however, is required and must be unique, so it is an equally good key for the internal healthSnapshot node tracking.
2022-07-15 10:51:38 -04:00
Matt Keeler 05b5e7e2ca
Enable partition support for peering establishment (#13772)
Prior to this the dialing side of the peering would only ever work within the default partition. This commit allows properly parsing the partition field out of the API struct request body, query param and header.
2022-07-15 10:07:07 -04:00
Dan Stough 49f3dadb8f feat: connect proxy xDS for destinations
Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>
2022-07-14 15:27:02 -04:00
Daniel Upton 3d74efa8ad proxycfg-glue: server-local implementation of `FederationStateListMeshGateways`
This is the OSS portion of enterprise PR 2265.

This PR provides a server-local implementation of the
proxycfg.FederationStateListMeshGateways interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton ccc672013e proxycfg-glue: server-local implementation of `GatewayServices`
This is the OSS portion of enterprise PR 2259.

This PR provides a server-local implementation of the proxycfg.GatewayServices
interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton 15a319dbfe proxycfg-glue: server-local implementation of `TrustBundle` and `TrustBundleList`
This is the OSS portion of enterprise PR 2250.

This PR provides server-local implementations of the proxycfg.TrustBundle and
proxycfg.TrustBundleList interfaces, based on local blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton 673d02d30f proxycfg-glue: server-local implementation of the `Health` interface
This is the OSS portion of enterprise PR 2249.

This PR introduces an implementation of the proxycfg.Health interface based on a
local materialized view of the health events.

It reuses the view and request machinery from agent/rpcclient/health, which made
it super straightforward.
2022-07-14 18:22:12 +01:00
Daniel Upton 3c533ceea8 proxycfg-glue: server-local implementation of `ServiceList`
This is the OSS portion of enterprise PR 2242.

This PR introduces a server-local implementation of the proxycfg.ServiceList
interface, backed by streaming events and a local materializer.
2022-07-14 18:22:12 +01:00
Daniel Upton fbf88d3b19 proxycfg-glue: server-local compiled discovery chain data source
This is the OSS portion of enterprise PR 2236.

Adds a local blocking query-based implementation of the proxycfg.CompiledDiscoveryChain interface.
2022-07-14 18:22:12 +01:00
Chris S. Kim f56810132f Check if an upstream is implicit from either intentions or peered services 2022-07-13 16:53:20 -04:00
Chris S. Kim 02cff2394d Use new maps for proxycfg peered data 2022-07-13 16:05:10 -04:00
Chris S. Kim 7f32cba735 Add new watch.Map type to refactor proxycfg 2022-07-13 16:05:10 -04:00
Chris S. Kim b4ffa9ae0c Scrub VirtualIPs before exporting 2022-07-13 16:05:10 -04:00