consul

Commit Graph

Author	SHA1	Message	Date
Dan Stough	626249fbf5	[OSS] fix: wait and try longer to peer through mesh gw (#15328)	2022-11-10 13:54:00 -05:00
Kyle Schochenmaier	bf0f61a878	removes ioutil usage everywhere which was deprecated in go1.16 (#15297) * update go version to 1.18 for api and sdk, go mod tidy * removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward. Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-11-10 10:26:01 -06:00
Derek Menteer	b64972d486	Bring back parameter ServerExternalAddresses in GenerateToken endpoint (#15267) Re-add ServerExternalAddresses parameter in GenerateToken endpoint This reverts commit `5e156772f6` and adds extra functionality to support newer peering behaviors.	2022-11-08 14:55:18 -06:00
Chris S. Kim	985a4ee1b1	Update hcp-scada-provider to fix diamond dependency problem with go-msgpack (#15185)	2022-11-07 11:34:30 -05:00
Derek Menteer	f4cb2f82bf	Backport various fixes from ENT. (#15254) * Regenerate golden files. * Backport from ENT: "Avoid race" Original commit: 5006c8c858b0e332be95271ef9ba35122453315b Original author: freddygv * Backport from ENT: "chore: fix flake peerstream test" Original commit: b74097e7135eca48cc289798c5739f9ef72c0cc8 Original author: DanStough	2022-11-03 16:34:57 -05:00
Derek Menteer	693c8a4706	Allow peering endpoints to bypass verify_incoming.	2022-10-31 09:56:30 -05:00
R.B. Boyer	300860412c	chore: update golangci-lint to v1.50.1 (#15022)	2022-10-24 11:48:02 -05:00
freddygv	d65e60de86	Return forbidden on permission denied This commit updates the establish endpoint to bubble up a 403 status code to callers when the establishment secret from the token is invalid. This is a signal that a new peering token must be generated.	2022-10-20 17:11:49 -06:00
Nitya Dhanushkodi	5e156772f6	Remove ability to specify external addresses in GenerateToken endpoint (#14930) * Reverts "update generate token endpoint to take external addresses (#13844)" This reverts commit `f47319b7c6`.	2022-10-19 09:31:36 -07:00
freddygv	e69bc727ec	Update peering establishment to maybe use gateways When peering through mesh gateways we expect outbound dials to peer servers to flow through the local mesh gateway addresses. Now when establishing a peering we get a list of dial addresses as a ring buffer that includes local mesh gateway addresses if the local DC is configured to peer through mesh gateways. The ring buffer includes the mesh gateway addresses first, but also includes the remote server addresses as a fallback. This fallback is present because it's possible that direct egress from the servers may be allowed. If not allowed then the leader will cycle back to a mesh gateway address through the ring. When attempting to dial the remote servers we retry up to a fixed timeout. If using mesh gateways we also have an initial wait in order to allow for the mesh gateways to configure themselves. Note that if we encounter a permission denied error we do not retry since that error indicates that the secret in the peering token is invalid.	2022-10-13 14:57:55 -06:00
Derek Menteer	1e394da400	Disallow peering to the same cluster.	2022-10-13 14:11:02 -05:00
Derek Menteer	caa1396255	Add remote peer partition and datacenter info.	2022-10-13 10:37:41 -05:00
Paul Glass	d17af23641	gRPC server metrics (#14922) * Move stats.go from grpc-internal to grpc-middleware * Update grpc server metrics with server type label * Add stats test to grpc-external * Remove global metrics instance from grpc server tests	2022-10-11 17:00:32 -05:00
Freddy	4abad02abd	Merge pull request #14796 from hashicorp/peering/use-connect-ca	2022-10-07 10:37:37 -06:00
freddygv	3034df6a5c	Require Connect and TLS to generate peering tokens By requiring Connect and a gRPC TLS listener we can automatically configure TLS for all peering control-plane traffic.	2022-10-07 09:06:29 -06:00
freddygv	fac3ddc857	Use internal server certificate for peering TLS A previous commit introduced an internally-managed server certificate to use for peering-related purposes. Now the peering token has been updated to match that behavior: - The server name matches the structure of the server cert - The CA PEMs correspond to the Connect CA Note that if Connect is disabled, and by extension the Connect CA, we fall back to the previous behavior of returning the manually configured certs and local server SNI. Several tests were updated to use the gRPC TLS port since they enable Connect by default. This means that the peering token will embed the Connect CA, and the dialer will expect a TLS listener.	2022-10-07 09:05:32 -06:00
DanStough	77ab28c5c7	feat: xDS updates for peerings control plane through mesh gw	2022-10-07 08:46:42 -06:00
Eric Haberkorn	1b565444be	Rename `PeerName` to `Peer` on prepared queries and exported services (#14854)	2022-10-04 14:46:15 -04:00
Eric Haberkorn	80e51ff907	Add exported services event to cluster peering replication. (#14797)	2022-09-29 15:37:19 -04:00
malizz	84b0f408fa	Support Stale Queries for Trust Bundle Lookups (#14724) * initial commit * add tags, add conversations * add test for query options utility functions * update previous tests * fix test * don't error out on empty context * add changelog * update decode config	2022-09-28 09:56:59 -07:00
DanStough	2a2debee64	feat(peering): validate server name conflicts on establish	2022-09-14 11:37:30 -04:00
Dan Upton	1c2c975b0b	xDS Load Balancing (#14397) Prior to #13244, connect proxies and gateways could only be configured by an xDS session served by the local client agent. In an upcoming release, it will be possible to deploy a Consul service mesh without client agents. In this model, xDS sessions will be handled by the servers themselves, which necessitates load-balancing to prevent a single server from receiving a disproportionate amount of load and becoming overwhelmed. This introduces a simple form of load-balancing where Consul will attempt to achieve an even spread of load (xDS sessions) between all healthy servers. It does so by implementing a concurrent session limiter (limiter.SessionLimiter) and adjusting the limit according to autopilot state and proxy service registrations in the catalog. If a server is already over capacity (i.e. the session limit is lowered), Consul will begin draining sessions to rebalance the load. This will result in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's responsibility to observe this response and reconnect to a different server. Users of the gRPC client connection brokered by the consul-server-connection-manager library will get this for free. The rate at which Consul will drain sessions to rebalance load is scaled dynamically based on the number of proxies in the catalog.	2022-09-09 15:02:01 +01:00
freddygv	650e48624d	Allow terminated peerings to be deleted Peerings are terminated when a peer decides to delete the peering from their end. Deleting a peering sends a termination message to the peer and triggers them to mark the peering as terminated but does NOT delete the peering itself. This is to prevent peerings from disappearing from both sides just because one side deleted them. Previously the Delete endpoint was skipping the deletion if the peering was not marked as active. However, terminated peerings are also inactive. This PR makes some updates so that peerings marked as terminated can be deleted by users.	2022-08-26 10:52:47 -06:00
Luke Kysow	988e1fd35d	peering: default to false (#13963) * defaulting to false because peering will be released as beta * Ignore peering disabled error in bundles cachetype Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: freddygv <freddy@hashicorp.com> Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2022-08-01 15:22:36 -04:00
Matt Keeler	f74d0cef7a	Implement/Utilize secrets for Peering Replication Stream (#13977)	2022-08-01 10:33:18 -04:00
alex	437a28d18a	peering: prevent peering in same partition (#13851) Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-07-25 18:00:48 -07:00
freddygv	b544ce6485	Add ACL enforcement to peering endpoints	2022-07-25 09:34:29 -06:00
alex	279d458e6e	peering: use ShouldDial to validate peer role (#13823) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-22 15:56:25 -07:00
Luke Kysow	a1e6d69454	peering: add config to enable/disable peering (#13867) * peering: add config to enable/disable peering Add config: `peering { enabled = true }` Defaults to true. When disabled: 1. All peering RPC endpoints will return an error 2. Leader won't start its peering establishment goroutines 3. Leader won't start its peering deletion goroutines	2022-07-22 15:20:21 -07:00
Nitya Dhanushkodi	f47319b7c6	update generate token endpoint to take external addresses (#13844) Update generate token endpoint (rpc, http, and api module) If ServerExternalAddresses are set, it will override any addresses gotten from the "consul" service, and be used in the token instead, and dialed by the dialer. This allows for setting up a load balancer for example, in front of the consul servers.	2022-07-21 14:56:11 -07:00
R.B. Boyer	bb4d4040fb	server: ensure peer replication can successfully use TLS over external gRPC (#13733) Ensure that the peer stream replication rpc can successfully be used with TLS activated. Also: - If key material is configured for the gRPC port but HTTPS is not enabled now TLS will still be activated for the gRPC port. - peerstream replication stream opened by the establishing-side will now ignore grpc.WithBlock so that TLS errors will bubble up instead of being awkwardly delayed or suppressed	2022-07-15 13:15:50 -05:00
Dan Upton	b9e525d689	grpc: rename public/private directories to external/internal (#13721) Previously, public referred to gRPC services that are both exposed on the dedicated gRPC port and have their definitions in the proto-public directory (so were considered usable by 3rd parties). Whereas private referred to services on the multiplexed server port that are only usable by agents and other servers. Now, we're splitting these definitions, such that external/internal refers to the port and public/private refers to whether they can be used by 3rd parties. This is necessary because the peering replication API needs to be exposed on the dedicated port, but is not (yet) suitable for use by 3rd parties.	2022-07-13 16:33:48 +01:00
R.B. Boyer	af04851637	peering: move peer replication to the external gRPC port (#13698) Peer replication is intended to be between separate Consul installs and effectively should be considered "external". This PR moves the peer stream replication bidirectional RPC endpoint to the external gRPC server and ensures that things continue to function.	2022-07-08 12:01:13 -05:00
Chris S. Kim	f07132dacc	Revise possible states for a peering. (#13661) These changes are primarily for Consul's UI, where we want to be more specific about the state a peering is in. - The "initial" state was renamed to pending, and no longer applies to peerings being established from a peering token. - Upon request to establish a peering from a peering token, peerings will be set as "establishing". This will help distinguish between the two roles: the cluster that generates the peering token and the cluster that establishes the peering. - When marked for deletion, peering state will be set to "deleting". This way the UI determines the deletion via the state rather than the "DeletedAt" field. Co-authored-by: freddygv <freddy@hashicorp.com>	2022-07-04 10:47:58 -04:00
Daniel Upton	653b8c4f9d	proxycfg: server-local config entry data sources This is the OSS portion of enterprise PR 2056. This commit provides server-local implementations of the proxycfg.ConfigEntry and proxycfg.ConfigEntryList interfaces, that source data from streaming events. It makes use of the LocalMaterializer type introduced for peering replication, adding the necessary support for authorization. It also adds support for "wildcard" subscriptions (within a topic) to the event publisher, as this is needed to fetch service-resolvers for all services when configuring mesh gateways. Currently, events will be emitted for just the ingress-gateway, service-resolver, and mesh config entry types, as these are the only entries required by proxycfg — the events will be emitted on topics named IngressGateway, ServiceResolver, and MeshConfig topics respectively. Though these events will only be consumed "locally" for now, they can also be consumed via the gRPC endpoint (confirmed using grpcurl) so using them from client agents should be a case of swapping the LocalMaterializer for an RPCMaterializer.	2022-07-04 10:48:36 +01:00
alex	cd9ca4290a	peering: add imported/exported counts to peering (#13644) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-06-29 14:07:30 -07:00
alex	beb8b03e8a	peering: reconcile/ hint active state for list (#13619) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-29 09:43:50 -07:00
R.B. Boyer	0fa828db76	peering: replicate all SpiffeID values necessary for the importing side to do SAN validation (#13612) When traversing an exported peered service, the discovery chain evaluation at the other side may re-route the request to a variety of endpoints. Furthermore we intend to terminate mTLS at the mesh gateway for arriving peered traffic that is http-like (L7), so the caller needs to know the mesh gateway's SpiffeID in that case as well. The following new SpiffeID values will be shipped back in the peerstream replication: - tcp: all possible SpiffeIDs resulting from the service-resolver component of the exported discovery chain - http-like: the SpiffeID of the mesh gateway	2022-06-27 14:37:18 -05:00
R.B. Boyer	e8ea3d7c3b	state: peering ID assignment cannot happen inside of the state store (#13525) Move peering ID assignment outside of the FSM, so that the ID is written to the raft log and the same ID is used by all voters, and after restarts.	2022-06-21 13:04:08 -05:00
freddygv	f3843809da	Avoid deleting peerings marked as terminated. When our peer deletes the peering it is locally marked as terminated. This termination should kick off deleting all imported data, but should not delete the peering object itself. Keeping peerings marked as terminated acts as a signal that the action took place.	2022-06-14 15:37:09 -06:00
freddygv	6453375ab2	Add leader routine to clean up peerings Once a peering is marked for deletion a new leader routine will now clean up all imported resources and then the peering itself. A lot of the logic was grabbed from the namespace/partitions deferred deletions but with a handful of simplifications: - The rate limiting is not configurable. - Deleting imported nodes/services/checks is done by deleting nodes with the Txn API. The services and checks are deleted as a side-effect. - There is no "round rate limiter" like with namespaces and partitions. This is because peerings are purely local, and deleting a peering in the datacenter does not depend on deleting data from other DCs like with WAN-federated namespaces. All rate limiting is handled by the Raft rate limiter.	2022-06-14 15:36:50 -06:00
freddygv	cc921a9c78	Update peering state and RPC for deferred deletion When deleting a peering we do not want to delete the peering and all imported data in a single operation, since deleting a large amount of data at once could overload Consul. Instead we defer deletion of peerings so that: 1. When a peering deletion request is received via gRPC the peering is marked for deletion by setting the DeletedAt field. 2. A leader routine will monitor for peerings that are marked for deletion and kick off a throttled deletion of all imported resources before deleting the peering itself. This commit mostly addresses point #1 by modifying the peering service to mark peerings for deletion. Another key change is to add a PeeringListDeleted state store function which can return all peerings marked for deletion. This function is what will be watched by the deferred deletion leader routine.	2022-06-13 12:10:32 -06:00
R.B. Boyer	7001e1151c	peering: rename initiate to establish in the context of the APIs (#13419)	2022-06-10 11:10:46 -05:00
freddygv	647c57a416	Add agent cache-type for TrustBundleListByService There are a handful of changes in this commit: * When querying trust bundles for a service we need to be able to specify the namespace of the service. * The endpoint needs to track the index because the cache watches use it. * Extracted bulk of the endpoint's logic to a state store function so that index tracking could be tested more easily. * Removed check for service existence, deferring that sort of work to ACL authz * Added the cache type	2022-06-01 17:05:10 -06:00
freddygv	8b58fa8afe	Update assumptions around exported-service config Given that the exported-services config entry can use wildcards, the precedence for wildcards is handled as with intentions. The most exact match is the match that applies for any given service. We do not take the union of all that apply. Another update that was made was to reflect that only one exported-services config entry applies to any given service in a partition. This is a pre-existing constraint that gets enforced by the Normalize() method on that config entry type.	2022-06-01 17:03:51 -06:00
Freddy	9427700270	[OSS] Add grpc endpoint to fetch a specific trust bundle (#13292) Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2022-05-31 09:54:40 -06:00
Chris S. Kim	6d3bea7129	Add support for streaming CA roots to peers (#13260) Sender watches for changes to CA roots and sends them through the replication stream. Receiver saves CA roots to tablePeeringTrustBundle	2022-05-26 15:24:09 -04:00
R.B. Boyer	1a8834e1c8	peering: replicate expected SNI, SPIFFE, and service protocol to peers (#13218) The importing peer will need to know what SNI and SPIFFE name corresponds to each exported service. Additionally it will need to know at a high level the protocol in use (L4/L7) to generate the appropriate connection pool and local metrics. For replicated connect synthetic entities we edit the `Connect{}` part of a `NodeService` to have a new section: { "PeerMeta": { "SNI": [ "web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul" ], "SpiffeID": [ "spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web" ], "Protocol": "tcp" } } This data is then replicated and saved as-is at the importing side. Both SNI and SpiffeID are slices for now until I can be sure we don't need them for how mesh gateways will ultimately work.	2022-05-25 12:37:44 -05:00
R.B. Boyer	3e4a522882	peering: replicate discovery chains information to importing peers Treat each exported service as a "discovery chain" and replicate one synthetic CheckServiceNode for each chain and remote mesh gateway. The health will be a flattened generated check of the checks for that mesh gateway node.	2022-05-19 14:21:44 -05:00
Freddy	b38be4c0ed	Patches to peering initiation for POC demo (#13076) Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2022-05-13 13:01:00 -06:00

55 Commits (3df68751f5da251f32f4cada0648ac0598de476a)