consul

Commit Graph

Author	SHA1	Message	Date
Gabriel Santos	e53af28bd7	Middleware: `RequestRecorder` reports calls below 1ms as decimal value (#12905 ) * Typos * Test failing * Convert values <1ms to decimal * Fix test * Update docs and test error msg * Applied suggested changes to test case * Changelog file and suggested changes * Update .changelog/12905.txt Co-authored-by: Chris S. Kim <kisunji92@gmail.com> * suggested change - start duration with microseconds instead of nanoseconds * fix error * suggested change - floats Co-authored-by: alex <8968914+acpana@users.noreply.github.com> Co-authored-by: Chris S. Kim <kisunji92@gmail.com>	2022-09-15 13:04:37 -04:00
Daniel Graña	8c98172f53	[BUGFIX] Do not use interval as timeout (#14619 ) Do not use interval as timeout	2022-09-15 12:39:48 -04:00
Evan Culver	d0416f593c	connect: Bump latest Envoy to 1.23.1 in test matrix (#14573 )	2022-09-14 13:20:16 -07:00
DanStough	485e1b5d4e	fix(peering): generate token metrics only for leader	2022-09-14 11:37:30 -04:00
DanStough	2a2debee64	feat(peering): validate server name conflicts on establish	2022-09-14 11:37:30 -04:00
Kyle Havlovitz	60cee76746	Merge pull request #14516 from hashicorp/ca-ttl-fixes Fix inconsistent TTL behavior in CA providers	2022-09-13 16:07:36 -07:00
Kyle Havlovitz	d67bccd210	Update intermediate pki mount/role when reconfiguring Vault provider	2022-09-13 15:42:26 -07:00
Kyle Havlovitz	f46955101a	connect/ca: Clarify behavior around IntermediateCertTTL in CA config	2022-09-13 15:42:26 -07:00
DanStough	0150e88200	feat: add PeerThroughMeshGateways to mesh config	2022-09-13 17:19:54 -04:00
Derek Menteer	0aa13733a0	Add CSR check for number of URIs. (#14579 ) Add CSR check for number of URIs.	2022-09-13 14:21:47 -05:00
Derek Menteer	db83ff4fa6	Add input validation for auto-config JWT authorization checks.	2022-09-13 11:16:36 -05:00
cskh	f22685b969	Config-entry: Support proxy config in service-defaults (#14395 ) * Config-entry: Support proxy config in service-defaults * Update website/content/docs/connect/config-entries/service-defaults.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2022-09-12 10:41:58 -04:00
Eric Haberkorn	aa8268e50c	Implement Cluster Peering Redirects (#14445 ) implement cluster peering redirects	2022-09-09 13:58:28 -04:00
skpratt	b761589340	add non-double-prefixed metrics (#14193 )	2022-09-09 12:13:43 -05:00
skpratt	19f79aa9a6	PR #14057 follow up fix: service id parsing from sidecar id (#14541 ) * fix service id parsing from sidecar id * simplify suffix trimming	2022-09-09 09:47:10 -05:00
Dan Upton	1c2c975b0b	xDS Load Balancing (#14397 ) Prior to #13244, connect proxies and gateways could only be configured by an xDS session served by the local client agent. In an upcoming release, it will be possible to deploy a Consul service mesh without client agents. In this model, xDS sessions will be handled by the servers themselves, which necessitates load-balancing to prevent a single server from receiving a disproportionate amount of load and becoming overwhelmed. This introduces a simple form of load-balancing where Consul will attempt to achieve an even spread of load (xDS sessions) between all healthy servers. It does so by implementing a concurrent session limiter (limiter.SessionLimiter) and adjusting the limit according to autopilot state and proxy service registrations in the catalog. If a server is already over capacity (i.e. the session limit is lowered), Consul will begin draining sessions to rebalance the load. This will result in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's responsibility to observe this response and reconnect to a different server. Users of the gRPC client connection brokered by the consul-server-connection-manager library will get this for free. The rate at which Consul will drain sessions to rebalance load is scaled dynamically based on the number of proxies in the catalog.	2022-09-09 15:02:01 +01:00
Derek Menteer	f7c884f0af	Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports	2022-09-08 14:53:08 -05:00
Derek Menteer	bfe7c5e8af	Remove rebuilding grpc server.	2022-09-08 13:45:44 -05:00
Derek Menteer	80d31458e5	Various cleanups.	2022-09-08 10:51:50 -05:00
Chris S. Kim	03df6c3ac6	Reuse http.DefaultTransport in UIMetricsProxy (#14521 ) http.Transport keeps a pool of connections and should be reused when possible. We instantiate a new http.DefaultTransport for every metrics request, making large numbers of concurrent requests inefficiently spin up new connections instead of reusing open ones.	2022-09-08 11:02:05 -04:00
Chris S. Kim	1c4a6eef4f	Merge pull request #14285 from hashicorp/NET-638-push-server-address-updates-to-the-peer peering: Subscribe to server address changes and push updates to peers	2022-09-07 09:30:45 -04:00
skpratt	3bf1edfb3f	move port and default check logic to locked step (#14057 )	2022-09-06 19:35:31 -05:00
Freddy	f4dfd42e0a	Add SpiffeID for Consul server agents (#14485 ) Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com> By adding a SpiffeID for server agents, servers can now request a leaf certificate from the Connect CA. This new Spiffe ID has a key property: servers are identified by their datacenter name and trust domain. All servers that share these attributes will share a ServerURI. The aim is to use these certificates to verify the server name of ANY server in a Consul datacenter.	2022-09-06 17:58:13 -06:00
Daniel Upton	8c46e48e0d	proxycfg-glue: server-local implementation of IntentionUpstreamsDestination This is the OSS portion of enterprise PR 2463. Generalises the serverIntentionUpstreams type to support matching on a service or destination.	2022-09-06 23:27:25 +01:00
Daniel Upton	f8dba7e9ac	proxycfg-glue: server-local implementation of InternalServiceDump This is the OSS portion of enterprise PR 2489. This PR introduces a server-local implementation of the proxycfg.InternalServiceDump interface that sources data from a blocking query against the server's state store. For simplicity, it only implements the subset of the Internal.ServiceDump RPC handler actually used by proxycfg - as such the result type has been changed to IndexedCheckServiceNodes to avoid confusion.	2022-09-06 23:27:25 +01:00
Daniel Upton	a31738f76f	proxycfg-glue: server-local implementation of ResolvedServiceConfig This is the OSS portion of enterprise PR 2460. Introduces a server-local implementation of the proxycfg.ResolvedServiceConfig interface that sources data from a blocking query against the server's state store. It moves the service config resolution logic into the agent/configentry package so that it can be used in both the RPC handler and data source. I've also done a little re-arranging and adding comments to call out data sources for which there is to be no server-local equivalent.	2022-09-06 23:27:25 +01:00
Derek Menteer	bf769daae4	Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports	2022-09-06 10:51:04 -05:00
Derek Menteer	02ae66bda8	Add kv txn get-not-exists operation.	2022-09-06 10:28:59 -05:00
Chris S. Kim	953808e899	PR feedback on terminated state checking	2022-09-06 10:28:20 -04:00
Chris S. Kim	ddb9375cb6	Add testcase for parsing grpc_port	2022-09-06 10:17:44 -04:00
Kyle Havlovitz	d97ccccdd5	Merge pull request #14429 from hashicorp/ca-prune-intermediates Prune old expired intermediate certs when appending a new one	2022-09-02 15:34:33 -07:00
cskh	0f7d4efac3	fix(txn api): missing proxy config in registering proxy service (#14471 ) * fix(txn api): missing proxy config in registering proxy service	2022-09-02 14:28:05 -04:00
Chris S. Kim	ec36755cc0	Properly assert for ServerAddresses replication request	2022-09-02 11:44:54 -04:00
Chris S. Kim	d1d9dbff8e	Fix terminate not returning early	2022-09-02 11:44:38 -04:00
Derek Menteer	f64771c707	Address PR comments.	2022-09-01 16:54:24 -05:00
Kyle Havlovitz	0c2fb7252d	Prune intermediates before appending new one	2022-09-01 14:24:30 -07:00
Luke Kysow	81d7cc41dc	Use proxy address for default check (#14433 ) When a sidecar proxy is registered, a check is automatically added. Previously, the address this check used was the underlying service's address instead of the proxy's address, even though the check is testing if the proxy is up. This worked in most cases because the proxy ran on the same IP as the underlying service but it's not guaranteed and so the proper default address should be the proxy's address.	2022-09-01 14:03:35 -07:00
malizz	f1054dada9	fix TestProxyConfigEntry (#14435 )	2022-09-01 11:37:47 -07:00
malizz	b3ac8f48ca	Add additional parameters to envoy passive health check config (#14238 ) * draft commit * add changelog, update test * remove extra param * fix test * update type to account for nil value * add test for custom passive health check * update comments and tests * update description in docs * fix missing commas	2022-09-01 09:59:11 -07:00
Chris S. Kim	f2b147e575	Add Internal.ServiceDump support for querying by PeerName	2022-09-01 10:32:59 -04:00
Chris S. Kim	e62f830fa8	Merge pull request #13998 from jorgemarey/f-new-tracing-envoy Add new envoy tracing configuration	2022-09-01 08:57:23 -04:00
Derek Menteer	cf7f24a6ec	Change serf-tag references to field references.	2022-08-31 16:38:42 -05:00
malizz	a80e0bcd00	validate args before deleting proxy defaults (#14290 ) * validate args before deleting proxy defaults * add changelog * validate name when normalizing proxy defaults * add test for proxyConfigEntry * add comments	2022-08-31 13:03:38 -07:00
Kyle Havlovitz	113454645d	Prune old expired intermediate certs when appending a new one	2022-08-31 11:41:58 -07:00
Alessandro De Blasis	60c7c831c6	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service	2022-08-30 18:49:20 +01:00
Eric Haberkorn	3726a0ab7a	Finish up cluster peering failover (#14396 )	2022-08-30 11:46:34 -04:00
Chris S. Kim	560d410c6d	Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer # Conflicts: # agent/grpc-external/services/peerstream/stream_test.go	2022-08-30 11:09:25 -04:00
Jorge Marey	3f3bb8831e	Fix typos. Add test. Add documentation	2022-08-30 16:59:02 +02:00
Jorge Marey	ed7b34128f	Add new tracing configuration	2022-08-30 16:59:02 +02:00
Freddy	97d1db759f	Merge pull request #13496 from maxb/fix-kv_entries-metric	2022-08-29 15:35:11 -06:00
Freddy	829a2a8722	Merge pull request #14364 from hashicorp/peering/term-delete	2022-08-29 15:33:18 -06:00
Max Bowsher	decc9231ee	Merge branch 'main' into fix-kv_entries-metric	2022-08-29 22:22:10 +01:00
Chris S. Kim	5010fa5c03	Merge pull request #14371 from hashicorp/kisunji/peering-metrics-update Adjust metrics reporting for peering tracker	2022-08-29 17:16:19 -04:00
Chris S. Kim	74ddf040dd	Add heartbeat timeout grace period when accounting for peering health	2022-08-29 16:32:26 -04:00
Derek Menteer	0ceec9017b	Expose `grpc_tls` via serf for cluster peering.	2022-08-29 13:43:49 -05:00
Derek Menteer	1255a8a20d	Add separate grpc_tls port. To ease the transition for users, the original gRPC port can still operate in a deprecated mode as either plain-text or TLS mode. This behavior should be removed in a future release whenever we no longer support this. The resulting behavior from this commit is: `ports.grpc > 0 && ports.grpc_tls > 0` spawns both plain-text and tls ports. `ports.grpc > 0 && grpc.tls == undefined` spawns a single plain-text port. `ports.grpc > 0 && grpc.tls != undefined` spawns a single tls port (backwards compat mode).	2022-08-29 13:43:43 -05:00
freddygv	310608fb19	Add validation to prevent switching dialing mode This prevents unexpected changes to the output of ShouldDial, which should never change unless a peering is deleted and recreated.	2022-08-29 12:31:13 -06:00
Eric Haberkorn	72f90754ae	Update max_ejection_percent on outlier detection for peered clusters to 100% (#14373 ) We can't trust health checks on peered services when service resolvers, splitters and routers are used.	2022-08-29 13:46:41 -04:00
Alessandro De Blasis	26cc56bc68	fix(agent): removed redundant code in docker check as well	2022-08-29 18:15:59 +01:00
Alessandro De Blasis	c0d647d11e	fix(agent): removed redundant check on prev. running check	2022-08-29 17:53:39 +01:00
Chris S. Kim	def529edd3	Rename test	2022-08-29 10:34:50 -04:00
Chris S. Kim	93271f649c	Fix test	2022-08-29 10:20:30 -04:00
Eric Haberkorn	1099665473	Update the structs and discovery chain for service resolver redirects to cluster peers. (#14366 )	2022-08-29 09:51:32 -04:00
Alessandro De Blasis	f3437eaf05	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2022-08-28 18:09:31 +01:00
Alessandro De Blasis	f634e36811	fix(OSServiceCheck): fixes following code-review	2022-08-28 17:56:30 +01:00
Chris S. Kim	4d97e2f936	Adjust metrics reporting for peering tracker	2022-08-26 17:34:17 -04:00
freddygv	650e48624d	Allow terminated peerings to be deleted Peerings are terminated when a peer decides to delete the peering from their end. Deleting a peering sends a termination message to the peer and triggers them to mark the peering as terminated but does NOT delete the peering itself. This is to prevent peerings from disappearing from both sides just because one side deleted them. Previously the Delete endpoint was skipping the deletion if the peering was not marked as active. However, terminated peerings are also inactive. This PR makes some updates so that peerings marked as terminated can be deleted by users.	2022-08-26 10:52:47 -06:00
Chris S. Kim	937a8ec742	Fix casing	2022-08-26 11:56:26 -04:00
Chris S. Kim	87962b9713	Merge branch 'main' into catalog-service-list-filter	2022-08-26 11:16:06 -04:00
Chris S. Kim	e2fe8b8d65	Fix tests for enterprise	2022-08-26 11:14:02 -04:00
Chris S. Kim	1c43a1a7b4	Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer # Conflicts: # agent/grpc-external/services/peerstream/stream_test.go	2022-08-26 10:43:56 -04:00
Chris S. Kim	6ddcc04613	Replace ring buffer with async version (#14314 ) We need to watch for changes to peerings and update the server addresses which get served by the ring buffer. Also, if there is an active connection for a peer, we are getting up-to-date server addresses from the replication stream and can safely ignore the token's addresses which may be stale.	2022-08-26 10:27:13 -04:00
alex	30ff2e9a35	peering: add peer health metric (#14004 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-08-25 16:32:59 -07:00
Chris S. Kim	181063cd23	Exit loop when context is cancelled	2022-08-25 11:48:25 -04:00
cskh	41aea65214	Fix: the inboundconnection limit filter should be placed in front of http co… (#14325 ) * fix: the inboundconnection limit should be placed in front of http connection manager Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2022-08-24 14:13:10 -04:00
Chris S. Kim	8c94d1a80c	Update test comment	2022-08-24 13:50:24 -04:00
Chris S. Kim	5f2959329f	Add check for zero-length server addresses	2022-08-24 13:30:52 -04:00
skpratt	919da33331	no-op: refactor usagemetrics tests for clarity and DRY cases (#14313 )	2022-08-24 12:00:09 -05:00
Pablo Ruiz García	1f293e5244	Added new auto_encrypt.grpc_server_tls config option to control AutoTLS enabling of GRPC Server's TLS usage Fix for #14253 Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2022-08-24 12:31:38 -04:00
Dan Upton	3b993f2da7	dataplane: update envoy bootstrap params for consul-dataplane (#14017 ) Contains 2 changes to the GetEnvoyBootstrapParams response to support consul-dataplane. Exposing node_name and node_id: consul-dataplane will support providing either the node_id or node_name in its configuration. Unfortunately, supporting both in the xDS meta adds a fair amount of complexity (partly because most tables are currently indexed on node_name) so for now we're going to return them both from the bootstrap params endpoint, allowing consul-dataplane to exchange a node_id for a node_name (which it will supply in the xDS meta). Properly setting service for gateways: To avoid the need to special case gateways in consul-dataplane, service will now either be the destination service name for connect proxies, or the gateway service name. This means it can be used as-is in Envoy configuration (i.e. as a cluster name or in metric tags).	2022-08-24 12:03:15 +01:00
Daniel Upton	13c04a13af	proxycfg: terminate stream on irrecoverable errors This is the OSS portion of enterprise PR 2339. It improves our handling of "irrecoverable" errors in proxycfg data sources. The canonical example of this is what happens when the ACL token presented by Envoy is deleted/revoked. Previously, the stream would get "stuck" until the xDS server re-checked the token (after 5 minutes) and terminated the stream. Materializers would also sit burning resources retrying something that could never succeed. Now, it is possible for data sources to mark errors as "terminal" which causes the xDS stream to be closed immediately. Similarly, the submatview.Store will evict materializers when it observes they have encountered such an error.	2022-08-23 20:17:49 +01:00
Chris S. Kim	81e965479b	PR feedback to specify Node name in test mock	2022-08-23 11:51:04 -04:00
Eric Haberkorn	58901ad7df	Cluster peering failover disco chain changes (#14296 )	2022-08-23 09:13:43 -04:00
Chris S. Kim	cdc8b0634d	Fix flakes	2022-08-22 14:45:31 -04:00
Chris S. Kim	03e92826aa	Increase heartbeat rate to reduce test flakes	2022-08-22 14:24:05 -04:00
Chris S. Kim	06ba9775ee	Remove check for ResponseNonce	2022-08-22 13:55:01 -04:00
Chris S. Kim	547fb9570e	Add missing mock assertions	2022-08-22 13:55:01 -04:00
Chris S. Kim	adff2eef16	Fix data race newMockSnapshotHandler has an assertion on t.Cleanup which gets called before the event publisher is cancelled. This commit reorders the context.WithCancel so it properly gets cancelled before the assertion is made.	2022-08-22 13:55:01 -04:00
cskh	060531a29a	Fix: add missing ent meta for test (#14289 )	2022-08-22 13:51:04 -04:00
Chris S. Kim	4e40e1d222	Handle server addresses update as client	2022-08-22 13:42:12 -04:00
Chris S. Kim	584d3409c4	Send server addresses on update from server	2022-08-22 13:41:44 -04:00
Chris S. Kim	c9d8ad3939	Add new subscription for server addresses	2022-08-22 13:40:25 -04:00
Chris S. Kim	028b87d51f	Cleanup unused logger	2022-08-22 13:40:23 -04:00
Chris S. Kim	df951bd601	Expose external gRPC port in autopilot The grpc_port was added to a NodeService's meta in `ea58f235f5`	2022-08-22 10:07:00 -04:00
cskh	527ebd068a	fix: missing MaxInboundConnections field in service-defaults config entry (#14072 ) * fix: missing max_inbound_connections field in merge config	2022-08-19 14:11:21 -04:00
cskh	e84e4b8868	Fix: upgrade pkg imdario/mergo to prevent merge config panic (#14237) * upgrade imdario/mergo to prevent merge config panic * test: service definition takes precedence over service-defaults in merged results	2022-08-17 21:14:04 -04:00
James Hartig	f92883bbce	Use the maximum jitter when calculating the timeout The timeout should include the maximum possible jitter since the server randomly adds jitter to its timeout. If the server's timeout is less than the client's timeout then the client will return an i/o deadline reached error. Before: ``` time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644' rpc error making call: i/o deadline reached real 10m11.469s user 0m0.018s sys 0m0.023s ``` After: ``` time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644' [...] real 10m35.835s user 0m0.021s sys 0m0.021s ```	2022-08-17 10:24:09 -04:00
Eric Haberkorn	1a73b0ca20	Add `Targets` field to service resolver failovers. (#14162 ) This field will be used for cluster peering failover.	2022-08-15 09:20:25 -04:00
Alessandro De Blasis	5dee555888	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2022-08-15 08:26:55 +01:00
Alessandro De Blasis	ab611eabc3	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2022-08-15 08:09:56 +01:00
cskh	d46b515b64	fix: missing segment and partition (#14194 )	2022-08-12 15:21:39 -04:00
Eric Haberkorn	ebd5513d4b	Refactor failover code to use Envoy's aggregate clusters (#14178 )	2022-08-12 14:30:46 -04:00
cskh	81931e52c3	feat(telemetry): add labels to serf and memberlist metrics (#14161 ) * feat(telemetry): add labels to serf and memberlist metrics * changelog * doc update Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-08-11 22:09:56 -04:00
Chris S. Kim	4c928cb2f7	Handle breaking change for ServiceVirtualIP restore (#14149) Consul 1.13.0 changed ServiceVirtualIP to use PeeredServiceName instead of ServiceName, which was a breaking change for those using service mesh who wanted to restore their snapshot after upgrading to 1.13.0. This commit handles existing data with the older ServiceName and converts it during restore so that there are no issues when restoring from older snapshots.	2022-08-11 14:47:10 -04:00
Chris S. Kim	3926009405	Add test to verify forwarding	2022-08-11 11:16:02 -04:00
Chris S. Kim	1ef22360c3	Register peerStreamServer internally to enable RPC forwarding	2022-08-11 11:16:02 -04:00
Chris S. Kim	de73171202	Handle wrapped errors in isFailedPreconditionErr	2022-08-11 11:16:02 -04:00
Daniel Kimsey	3c4fa9b468	Add support for filtering the 'List Services' API 1. Create a bexpr filter for performing the filtering 2. Change the state store functions to return the raw (not aggregated) list of ServiceNodes. 3. Move the aggregate service tags by name logic out of the state store functions into a new function called from the RPC endpoint 4. Perform the filtering in the endpoint before aggregation.	2022-08-10 16:52:32 -05:00
cskh	11e7a0d547	fix: shadowed err in retryJoin() (#14112 ) - err value will be used later to surface the error message if r.join() returns any err.	2022-08-10 10:53:57 -04:00
skpratt	79c23a7cd2	Merge pull request #14056 from hashicorp/proxy-register-port-race Refactor sidecar_service method to separate port assignment	2022-08-10 09:46:29 -05:00
skpratt	aa77559819	Merge branch 'main' into proxy-register-port-race	2022-08-10 08:40:45 -05:00
Chris S. Kim	e3046120b3	Close active listeners on error If startListeners successfully created listeners for some of its input addresses but eventually failed, the function would return an error and existing listeners would not be cleaned up.	2022-08-09 12:22:39 -04:00
Chris S. Kim	6311c651de	Add retry in TestAgentConnectCALeafCert_good	2022-08-09 11:20:37 -04:00
Kyle Havlovitz	6938b8c755	Merge pull request #13958 from hashicorp/gateway-wildcard-fix Fix wildcard picking up services it shouldn't for ingress/terminating gateways	2022-08-08 12:54:40 -07:00
Kyle Havlovitz	fe1fcea34f	Add some extra handling for destination deletes	2022-08-08 11:38:13 -07:00
freddygv	d421e18172	Update snapshot test	2022-08-08 09:17:15 -06:00
freddygv	1031ffc3c7	Re-validate existing secrets at state store Previously establishment and pending secrets were only checked at the RPC layer. However, given that these are Check-and-Set transactions we should ensure that the given secrets are still valid when persisting a secret exchange or promotion. Otherwise it would be possible for concurrent requests to overwrite each other.	2022-08-08 09:06:07 -06:00
freddygv	0ea4bfae94	Test fixes	2022-08-08 08:31:47 -06:00
freddygv	c04515a844	Use proto message for each secrets write op Previously there was a field indicating the operation that triggered a secrets write. Now there is a message for each operation and it contains the secret ID being persisted.	2022-08-08 01:41:00 -06:00
Kyle Havlovitz	6580566c3b	Update ingress/terminating wildcard logic and handle destinations	2022-08-05 07:56:10 -07:00
freddygv	8067890787	Inherit active secret when exchanging	2022-08-03 17:32:53 -05:00
freddygv	60d6e28c97	Pass explicit signal with op for secrets write Previously the updates to the peering secrets UUID table relied on inferring what action triggered the update based on a reconciliation against the existing secrets. Instead we now explicitly require the operation to be given so that the inference isn't necessary. This makes the UUID table logic easier to reason about and fixes some related bugs. There is also an update so that the peering secrets get handled on snapshots/restores.	2022-08-03 17:25:12 -05:00
freddygv	9ca687bc7c	Avoid deleting peering secret UUIDs at dialers Dialers do not keep track of peering secret UUIDs, so they should not attempt to clean up data from that table when their peering is deleted. We also now keep peer server addresses when marking peerings for deletion. Peer server addresses are used by the ShouldDial() helper when determining whether the peering is for a dialer or an acceptor. We need to keep this data so that peering secrets can be cleaned up accordingly.	2022-08-03 16:34:57 -05:00
skpratt	58eed6b049	Merge pull request #13906 from skpratt/validate-port-agent-split Separate port and socket path validation for local agent	2022-08-02 16:58:41 -05:00
Dhia Ayachi	7154367892	add token to the request when creating a cacheIntentions query (#14005 )	2022-08-02 14:27:34 -04:00
Kyle Havlovitz	499211f907	Fix wildcard picking up services it shouldn't for ingress/terminating gateways	2022-08-02 09:41:31 -07:00
Daniel Upton	6452118c15	proxycfg-sources: fix hot loop when service not found in catalog Fixes a bug where a service getting deleted from the catalog would cause the ConfigSource to spin in a hot loop attempting to look up the service. This is because we were returning a nil WatchSet which would always unblock the select. Kudos to @freddygv for discovering this!	2022-08-02 15:42:29 +01:00
Freddy	42996411cc	Various peering fixes (#13979 ) * Avoid logging StreamSecretID * Wrap additional errors in stream handler * Fix flakiness in leader test and rename servers for clarity. There was a race condition where the peering was being deleted in the test before the stream was active. Now the test waits for the stream to be connected on both sides before deleting the associated peering. * Run flaky test serially	2022-08-01 15:06:18 -06:00
DanStough	169ff71132	fix: ipv4 destination dns resolution	2022-08-01 16:45:57 -04:00
Luke Kysow	988e1fd35d	peering: default to false (#13963 ) * defaulting to false because peering will be released as beta * Ignore peering disabled error in bundles cachetype Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: freddygv <freddy@hashicorp.com> Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2022-08-01 15:22:36 -04:00
Freddy	dacf703d20	Merge branch 'main' into fix-kv_entries-metric	2022-08-01 13:19:27 -06:00
Freddy	72b6d69652	Merge pull request #13499 from maxb/delete-unused-metric Delete definition of metric `consul.acl.blocked.node.deregistration`	2022-08-01 12:31:05 -06:00
Dhia Ayachi	6fd65a4a45	Tgtwy egress HTTP support (#13953 ) * add golden files * add support to http in tgateway egress destination * fix slice sorting to include both address and port when using server_names * fix listener loop for http destination * fix routes to generate a route per port and a virtualhost per port-address combination * sort virtual hosts list to have a stable order * extract redundant serviceNode	2022-08-01 14:12:43 -04:00
Matt Keeler	f74d0cef7a	Implement/Utilize secrets for Peering Replication Stream (#13977 )	2022-08-01 10:33:18 -04:00
alex	a45bb1f06b	block PeerName register requests (#13887 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-29 14:36:22 -07:00
Luke Kysow	95096e2c03	peering: retry establishing connection more quickly on certain errors (#13938 ) When we receive a FailedPrecondition error, retry that more quickly because we expect it will resolve shortly. This is particularly important in the context of Consul servers behind a load balancer because when establishing a connection we have to retry until we randomly land on a leader node. The default retry backoff goes from 2s, 4s, 8s, etc. which can result in very long delays quite quickly. Instead, this backoff retries in 8ms five times, then goes exponentially from there: 16ms, 32ms, ... up to a max of 8152ms.	2022-07-29 13:04:32 -07:00
Sarah Pratt	10a4999a87	Separate port and socket path requirement in case of local agent assignment	2022-07-29 13:28:21 -05:00
alex	92c615c35f	Merge pull request #13952 from hashicorp/sync-more-acl sync more acl enforcement	2022-07-28 12:31:02 -07:00
Dhia Ayachi	256694b603	inject gateway addons to destination clusters (#13951 )	2022-07-28 15:17:35 -04:00
acpana	eae4e71492	sync more acl enforcement sync w ent at 32756f7 Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-28 12:01:52 -07:00
alex	41f3343eac	Merge pull request #13929 from hashicorp/fix-validation [sync] fix empty partitions matching	2022-07-28 10:14:49 -07:00
Sarah Pratt	a3ef6f016e	refactor sidecar_service method into parts	2022-07-28 09:07:13 -05:00
Ashwin Venkatesh	eef9edaed9	Add peer counts to emitted metrics. (#13930 )	2022-07-27 18:34:04 -04:00
Luke Kysow	465a9801e1	Merge pull request #13924 from hashicorp/lkysow/util-metric-peering peering: don't track imported services/nodes in usage	2022-07-27 14:49:55 -07:00
acpana	6033584349	use EqualPartitions Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-27 14:48:30 -07:00
acpana	0351ca5136	better fix Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-27 14:28:08 -07:00
acpana	8b2ef80336	sync w ent Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-27 11:41:39 -07:00
Chris S. Kim	0999e05a7d	Reduce arm64 flakes for TestConnectCA_ConfigurationSet_ChangeKeyConfig_Primary There were 16 combinations of tests but 4 of them were duplicates since the default key type and bits were "ec" and 256. That entry was commented out to reduce the subtest count to 12. testrpc.WaitForLeader was failing on arm64 environments; the cause is unknown but it might be due to the environment being flooded with parallel tests making RPC calls. The RPC polling+retry was replaced with a simpler check for leadership based on raft.	2022-07-27 13:54:34 -04:00
Chris S. Kim	8ead1caf53	Retry checks for virtual IP metadata	2022-07-27 13:54:34 -04:00
Chris S. Kim	62ed0250c3	Sort slice of ServiceNames deterministically	2022-07-27 13:54:34 -04:00
Sarah Pratt	f520f6dd0f	Separate port and socket path requirement in case of local agent assignment	2022-07-27 12:30:52 -05:00
Luke Kysow	740d54e730	peering: don't track imported services/nodes in usage Services/nodes that are imported from other peers are stored in state. We don't want to count those as part of our own cluster's usage.	2022-07-27 09:08:51 -07:00
cskh	4e292b7b72	chore: clarify the error message: service.service must not be empty (#13907) - when registering a service using the catalog endpoint, the key for the service name should actually be "service". Adding this information to the error message will help users quickly fix the request.	2022-07-27 10:16:46 -04:00
cskh	59e81a728e	chore: removed unused method AddService (#13905) - AddService is not used anywhere. AddServiceWithChecks is used in place of AddService - Test code is updated	2022-07-26 16:54:53 -04:00
Luke Kysow	021b00e321	Remove duplicate comment	2022-07-26 10:19:49 -07:00
alex	437a28d18a	peering: prevent peering in same partition (#13851 ) Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-07-25 18:00:48 -07:00
Nitya Dhanushkodi	27bd895ac8	peering: remove validation that forces peering token server addresses to be an IP, allow hostname based addresses (#13874 )	2022-07-25 16:33:47 -07:00
Luke Kysow	8c5b70d227	Rename receive to recv in tracker (#13896 ) Because it's shorter	2022-07-25 16:08:03 -07:00
Luke Kysow	3530d3782d	peering: read endpoints can now return failing status (#13849 ) Track streams that have been disconnected due to an error and set their statuses to failing.	2022-07-25 14:27:53 -07:00
Kyle Havlovitz	93de25f87c	Merge pull request #13872 from hashicorp/remove-upstream-log Remove extra logging from ingress upstream watch shutdown	2022-07-25 12:55:30 -07:00
Chris S. Kim	73a84f256f	Preserve PeeringState on upsert (#13666 ) Fixes a bug where if the generate token is called twice, the second call upserts the zero-value (undefined) of PeeringState.	2022-07-25 14:37:56 -04:00
Chris S. Kim	8ed49ea4d0	Update envoy metrics label extraction for peered clusters and listeners (#13818 ) Now that peered upstreams can generate envoy resources (#13758), we need a way to disambiguate local from peered resources in our metrics. The key difference is that datacenter and partition will be replaced with peer, since in the context of peered resources partition is ambiguous (could refer to the partition in a remote cluster or one that exists locally). The partition and datacenter of the proxy will always be that of the source service. Regexes were updated to make emitting datacenter and partition labels mutually exclusive with peer labels. Listener filter names were updated to better match the existing regex. Cluster names assigned to peered upstreams were updated to be synthesized from local peer name (it previously used the externally provided primary SNI, which contained the peer name from the other side of the peering). Integration tests were updated to assert for the new peer labels.	2022-07-25 13:49:00 -04:00
DanStough	2da8949d78	feat: convert destination address to slice	2022-07-25 12:31:58 -04:00
Freddy	f03cca7576	[OSS] Add ACL enforcement to peering endpoints (#13878 )	2022-07-25 10:04:10 -06:00
Matt Keeler	58e4d8235b	Enable/Disable Peering Support in the UI (#13816 ) We enable/disable based on the config flag.	2022-07-25 11:50:11 -04:00
freddygv	b544ce6485	Add ACL enforcement to peering endpoints	2022-07-25 09:34:29 -06:00
Kyle Havlovitz	016f963e7e	Remove excess debug log from ingress upstream shutdown	2022-07-22 17:29:38 -07:00
alex	279d458e6e	peering: use ShouldDial to validate peer role (#13823 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-22 15:56:25 -07:00
Luke Kysow	a1e6d69454	peering: add config to enable/disable peering (#13867 ) * peering: add config to enable/disable peering Add config: ``` peering { enabled = true } ``` Defaults to true. When disabled: 1. All peering RPC endpoints will return an error 2. Leader won't start its peering establishment goroutines 3. Leader won't start its peering deletion goroutines	2022-07-22 15:20:21 -07:00
Kyle Havlovitz	0786517b56	Merge pull request #13847 from hashicorp/gateway-goroutine-leak Fix goroutine leaks in proxycfg when using ingress gateway	2022-07-22 14:43:22 -07:00
Freddy	f99df57840	[OSS] Add new peering ACL rule (#13848 ) This commit adds a new ACL rule named "peering" to authorize actions taken against peering-related endpoints. The "peering" rule has several key properties: - It is scoped to a partition, and MUST be defined in the default namespace. - Its access level must be "read", "write", or "deny". - Granting an access level will apply to all peerings. This ACL rule cannot be used to selectively grant access to some peerings but not others. - If the peering rule is not specified, we fall back to the "operator" rule and then the default ACL rule.	2022-07-22 14:42:23 -06:00
alex	927cee692b	peering: emit exported services count metric (#13811 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-22 12:05:08 -07:00
Daniel Upton	a8df87f574	proxycfg-glue: server-local implementation of `ExportedPeeredServices` This is the OSS portion of enterprise PR 2377. Adds a server-local implementation of the proxycfg.ExportedPeeredServices interface that sources data from a blocking query against the server's state store.	2022-07-22 15:23:23 +01:00
Eric Haberkorn	501089292e	Add Cluster Peering Failover Support to Prepared Queries (#13835 ) Add peering failover support to prepared queries	2022-07-22 09:14:43 -04:00
Nitya Dhanushkodi	f47319b7c6	update generate token endpoint to take external addresses (#13844 ) Update generate token endpoint (rpc, http, and api module) If ServerExternalAddresses are set, they will override any addresses obtained from the "consul" service, and be used in the token instead, and dialed by the dialer. This allows for setting up a load balancer, for example, in front of the consul servers.	2022-07-21 14:56:11 -07:00
acpana	12b773ab02	Rename peering internal to ~ sync ENT to 5679392c81 Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-21 10:51:05 -07:00
Luke Kysow	0c87be0845	peering: Add heartbeating to peering streams (#13806 ) * Add heartbeating to peering streams	2022-07-21 10:03:27 -07:00
Daniel Upton	3655802fdc	proxycfg-glue: server-local implementation of `PeeredUpstreams` This is the OSS portion of enterprise PR 2352. It adds a server-local implementation of the proxycfg.PeeredUpstreams interface based on a blocking query against the server's state store. It also fixes an omission in the Virtual IP freeing logic where we were never updating the max index (and therefore blocking queries against VirtualIPsForAllImportedServices would not return on service deletion).	2022-07-21 13:51:59 +01:00
Luke Kysow	c411e6b326	Add send mutex to protect against concurrent sends (#13805 )	2022-07-20 15:48:18 -07:00
Kyle Havlovitz	0be7d923dc	Cancel upstream watches when the discovery chain has been removed	2022-07-20 14:26:52 -07:00
Kyle Havlovitz	31318d7049	Fix duplicate Notify calls for discovery chains in ingress gateways	2022-07-20 14:25:20 -07:00
Evan Culver	4116537b83	connect: Add support for Envoy 1.23, remove 1.19 (#13807 )	2022-07-19 14:51:04 -07:00
Paul Glass	77afe0e76e	Extract AWS auth implementation out of Consul (#13760 )	2022-07-19 16:26:44 -05:00
Chris S. Kim	495936300e	Make envoy resources for inferred peered upstreams (#13758 ) Peered upstreams have a separate loop in xds from discovery chain upstreams. This PR adds similar but slightly modified code to add filters for peered upstream listeners, clusters, and endpoints in the case of transparent proxy.	2022-07-19 14:56:28 -04:00
alex	de5a991d8c	peering: refactor reconcile, cleanup (#13795 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-19 11:43:29 -07:00
Luke Kysow	e8d965e56f	peerstream: set keepalive enforcement to 15s (#13796 ) The client is set to send keepalive pings every 30s. The server keepalive enforcement must be set to a number less than that, otherwise it will disconnect clients for sending pings too often. MinTime governs the minimum amount of time between pings.	2022-07-18 16:12:03 -07:00
alex	a9ae2ff4fa	peering: track exported services (#13784 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-07-18 10:20:04 -07:00
R.B. Boyer	cd513aeead	peerstream: require a resource subscription to receive updates of that type (#13767 ) This mimics xDS's discovery protocol where you must request a resource explicitly for the exporting side to send those events to you. As part of this I aligned the overall ResourceURL with the TypeURL that gets embedded into the encoded protobuf Any construct. The CheckServiceNodes is now wrapped in a better named "ExportedService" struct.	2022-07-15 15:03:40 -05:00
R.B. Boyer	c737301093	peerstream: fix test assertions (#13780 )	2022-07-15 14:43:24 -05:00
Luke Kysow	46381b1a7f	Add docs for peerStreamServer vs peeringServer. (#13781 )	2022-07-15 12:23:05 -07:00
Luke Kysow	ca3d7c964c	peerstream: dialer should reconnect when stream closes (#13745 ) * peerstream: dialer should reconnect when stream closes If the stream is closed unexpectedly (i.e. when we haven't received a terminated message), the dialer should attempt to re-establish the stream. Previously, the `HandleStream` would return `nil` when the stream was closed. The caller then assumed the stream was terminated on purpose and so didn't reconnect when instead it was stopped unexpectedly and the dialer should have attempted to reconnect.	2022-07-15 11:58:33 -07:00
R.B. Boyer	bb4d4040fb	server: ensure peer replication can successfully use TLS over external gRPC (#13733 ) Ensure that the peer stream replication rpc can successfully be used with TLS activated. Also: - If key material is configured for the gRPC port but HTTPS is not enabled now TLS will still be activated for the gRPC port. - peerstream replication stream opened by the establishing-side will now ignore grpc.WithBlock so that TLS errors will bubble up instead of being awkwardly delayed or suppressed	2022-07-15 13:15:50 -05:00
alex	adb5ffa1a6	peering: track imported services (#13718 )	2022-07-15 10:20:43 -07:00
Matt Keeler	257f88d4df	Use Node Name for peering healthSnapshot instead of ID (#13773 ) A Node ID is not a required field with Consul’s data model. Therefore we cannot reliably expect all uses to have it. However the node name is required and must be unique, so it's equally good as a key for the internal healthSnapshot node tracking.	2022-07-15 10:51:38 -04:00
Matt Keeler	05b5e7e2ca	Enable partition support for peering establishment (#13772 ) Prior to this the dialing side of the peering would only ever work within the default partition. This commit allows properly parsing the partition field out of the API struct request body, query param and header.	2022-07-15 10:07:07 -04:00
Dan Stough	49f3dadb8f	feat: connect proxy xDS for destinations Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>	2022-07-14 15:27:02 -04:00
Daniel Upton	3d74efa8ad	proxycfg-glue: server-local implementation of `FederationStateListMeshGateways` This is the OSS portion of enterprise PR 2265. This PR provides a server-local implementation of the proxycfg.FederationStateListMeshGateways interface based on blocking queries.	2022-07-14 18:22:12 +01:00
Daniel Upton	ccc672013e	proxycfg-glue: server-local implementation of `GatewayServices` This is the OSS portion of enterprise PR 2259. This PR provides a server-local implementation of the proxycfg.GatewayServices interface based on blocking queries.	2022-07-14 18:22:12 +01:00
Daniel Upton	15a319dbfe	proxycfg-glue: server-local implementation of `TrustBundle` and `TrustBundleList` This is the OSS portion of enterprise PR 2250. This PR provides server-local implementations of the proxycfg.TrustBundle and proxycfg.TrustBundleList interfaces, based on local blocking queries.	2022-07-14 18:22:12 +01:00
Daniel Upton	673d02d30f	proxycfg-glue: server-local implementation of the `Health` interface This is the OSS portion of enterprise PR 2249. This PR introduces an implementation of the proxycfg.Health interface based on a local materialized view of the health events. It reuses the view and request machinery from agent/rpcclient/health, which made it super straightforward.	2022-07-14 18:22:12 +01:00
Daniel Upton	3c533ceea8	proxycfg-glue: server-local implementation of `ServiceList` This is the OSS portion of enterprise PR 2242. This PR introduces a server-local implementation of the proxycfg.ServiceList interface, backed by streaming events and a local materializer.	2022-07-14 18:22:12 +01:00
Daniel Upton	fbf88d3b19	proxycfg-glue: server-local compiled discovery chain data source This is the OSS portion of enterprise PR 2236. Adds a local blocking query-based implementation of the proxycfg.CompiledDiscoveryChain interface.	2022-07-14 18:22:12 +01:00
Chris S. Kim	f56810132f	Check if an upstream is implicit from either intentions or peered services	2022-07-13 16:53:20 -04:00
Chris S. Kim	02cff2394d	Use new maps for proxycfg peered data	2022-07-13 16:05:10 -04:00
Chris S. Kim	7f32cba735	Add new watch.Map type to refactor proxycfg	2022-07-13 16:05:10 -04:00
Chris S. Kim	b4ffa9ae0c	Scrub VirtualIPs before exporting	2022-07-13 16:05:10 -04:00
Kyle Havlovitz	9097e2b0f0	Merge pull request #13699 from hashicorp/tgate-http2-upstream Respect http2 protocol for upstreams of terminating gateways	2022-07-13 09:41:15 -07:00
Dan Upton	b9e525d689	grpc: rename public/private directories to external/internal (#13721 ) Previously, public referred to gRPC services that are both exposed on the dedicated gRPC port and have their definitions in the proto-public directory (so were considered usable by 3rd parties). Whereas private referred to services on the multiplexed server port that are only usable by agents and other servers. Now, we're splitting these definitions, such that external/internal refers to the port and public/private refers to whether they can be used by 3rd parties. This is necessary because the peering replication API needs to be exposed on the dedicated port, but is not (yet) suitable for use by 3rd parties.	2022-07-13 16:33:48 +01:00
R.B. Boyer	30fffd0c90	peerstream: some cosmetic refactors to make this easier to follow (#13732 ) - Use some protobuf construction helper methods for brevity. - Rename a local variable to avoid later shadowing. - Rename the Nonce field to be more like xDS's naming. - Be more explicit about which PeerID fields are empty.	2022-07-13 10:00:35 -05:00
Kyle Havlovitz	7d0c692374	Use protocol from resolved config entry, not gateway service	2022-07-12 16:23:40 -07:00
Kyle Havlovitz	7162e3bde2	Enable http2 options for grpc protocol	2022-07-12 14:38:44 -07:00
R.B. Boyer	c5c216008d	peering: always send the mesh gateway SpiffeID even for tcp services (#13728 ) If someone were to switch a peer-exported service from L4 to L7 there would be a brief SAN validation hiccup as traffic shifted to the mesh gateway for termination. This PR sends the mesh gateway SpiffeID down all the time so the clients always expect a switch.	2022-07-12 11:38:13 -05:00
R.B. Boyer	f0e6e4e697	state: prohibit changing an exported tcp discovery chain in a way that would break SAN validation (#13727 ) For L4/tcp exported services the mesh gateways will not be terminating TLS. A caller in one peer will be directly establishing TLS connections to the ultimate exported service in the other peer. The caller will be doing SAN validation using the replicated SpiffeID values shipped from the exporting side. There is a class of discovery chain edits that could be done on the exporting side that would cause the introduction of a new SpiffeID value. In between the time of the config entry update on the exporting side and the importing side getting updated peer stream data, requests to the exported service would fail due to SAN validation errors. This is unacceptable so instead prohibit the exporting peer from making changes that would break peering in this way.	2022-07-12 11:17:33 -05:00
R.B. Boyer	2317f37b4d	state: prohibit exported discovery chains to have cross-datacenter or cross-partition references (#13726 ) Because peerings are pairwise between two tuples of (datacenter, partition), having any exported reference via a discovery chain that crosses out of the peered datacenter or partition will ultimately not be able to work for various reasons. The biggest one is that there is no way in the ultimate destination to configure an intention that can allow an external SpiffeID to access a service. This PR ensures that a user simply cannot do this, so they won't run into weird situations like this.	2022-07-12 11:03:41 -05:00
Chris S. Kim	a6634db4a5	Return error if ServerAddresses is empty (#13714 )	2022-07-12 11:09:00 -04:00
Kyle Havlovitz	439eccdd80	Respect http2 protocol for upstreams of terminating gateways	2022-07-08 14:30:45 -07:00
R.B. Boyer	af04851637	peering: move peer replication to the external gRPC port (#13698 ) Peer replication is intended to be between separate Consul installs and effectively should be considered "external". This PR moves the peer stream replication bidirectional RPC endpoint to the external gRPC server and ensures that things continue to function.	2022-07-08 12:01:13 -05:00
R.B. Boyer	ea58f235f5	server: broadcast the public grpc port using lan serf and update the consul service in the catalog with the same data (#13687 ) Currently servers exchange information about their WAN serf port and RPC port with serf tags, so that they all learn of each other's addressing information. We intend to make larger use of the new public-facing gRPC port exposed on all of the servers, so this PR addresses that by passing around the gRPC port via serf tags and then ensuring the generated consul service in the catalog has metadata about that new port as well for ease of non-serf-based lookup.	2022-07-07 13:55:41 -05:00
Freddy	3542138e4d	Parse peer name for virtual IP DNS queries (#13602 ) This commit updates the DNS query locality parsing so that the virtual IP for an imported service can be queried. Note that: - Support for parsing a peer in other service discovery queries was not added. - Querying another datacenter for a virtual IP is not supported. This was technically allowed in 1.11 but is being rolled back for 1.13 because it is not a use-case we intended to support. Virtual IPs in different datacenters are going to collide because they are allocated sequentially.	2022-07-06 10:30:04 -06:00
R.B. Boyer	2a945facec	test: update mockery use to put mocks into test files (#13656 ) --testonly doesn't do anything anymore so switch to --filename instead	2022-07-05 16:57:15 -05:00
Chris S. Kim	f07132dacc	Revise possible states for a peering. (#13661 ) These changes are primarily for Consul's UI, where we want to be more specific about the state a peering is in. - The "initial" state was renamed to pending, and no longer applies to peerings being established from a peering token. - Upon request to establish a peering from a peering token, peerings will be set as "establishing". This will help distinguish between the two roles: the cluster that generates the peering token and the cluster that establishes the peering. - When marked for deletion, peering state will be set to "deleting". This way the UI determines the deletion via the state rather than the "DeletedAt" field. Co-authored-by: freddygv <freddy@hashicorp.com>	2022-07-04 10:47:58 -04:00
Daniel Upton	45886848b4	proxycfg: server-local intention upstreams data source This is the OSS portion of enterprise PR 2157. It builds on the local blocking query work in #13438 to implement the proxycfg.IntentionUpstreams interface using server-local data. Also moves the ACL filtering logic from agent/consul into the acl/filter package so that it can be reused here.	2022-07-04 10:48:36 +01:00
Daniel Upton	37ccbd2826	proxycfg: server-local intentions data source This is the OSS portion of enterprise PR 2141. This commit provides a server-local implementation of the `proxycfg.Intentions` interface that sources data from streaming events. It adds events for the `service-intentions` config entry type, and then consumes event streams (via materialized views) for the service's explicit intentions and any applicable wildcard intentions, merging them into a single list of intentions. An alternative approach I considered was to consume _all_ intention events (via `SubjectWildcard`) and filter out the irrelevant ones. This would admittedly remove some complexity in the `agent/proxycfg-glue` package but at the expense of considerable overhead from waking potentially many thousands of connect proxies every time any intention is updated.	2022-07-04 10:48:36 +01:00
Daniel Upton	653b8c4f9d	proxycfg: server-local config entry data sources This is the OSS portion of enterprise PR 2056. This commit provides server-local implementations of the proxycfg.ConfigEntry and proxycfg.ConfigEntryList interfaces, that source data from streaming events. It makes use of the LocalMaterializer type introduced for peering replication, adding the necessary support for authorization. It also adds support for "wildcard" subscriptions (within a topic) to the event publisher, as this is needed to fetch service-resolvers for all services when configuring mesh gateways. Currently, events will be emitted for just the ingress-gateway, service-resolver, and mesh config entry types, as these are the only entries required by proxycfg — the events will be emitted on topics named IngressGateway, ServiceResolver, and MeshConfig topics respectively. Though these events will only be consumed "locally" for now, they can also be consumed via the gRPC endpoint (confirmed using grpcurl) so using them from client agents should be a case of swapping the LocalMaterializer for an RPCMaterializer.	2022-07-04 10:48:36 +01:00
alex	cd9ca4290a	peering: add imported/exported counts to peering (#13644 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-06-29 14:07:30 -07:00
Chris S. Kim	b186731a2e	Fix ENT drift in files (#13647 )	2022-06-29 16:53:22 -04:00
Chris S. Kim	d8b7940e40	Add internal endpoint to fetch peered upstream candidates from VirtualIP table (#13642 ) For initial cluster peering TProxy support we consider all imported services of a partition to be potential upstreams. We leverage the VirtualIP table because it stores plain service names (e.g. "api", not "api-sidecar-proxy").	2022-06-29 16:34:58 -04:00
Eric Haberkorn	653cb42944	Fix spelling mistake in serverless patcher (#13607 ) passhthrough -> passthrough	2022-06-29 15:21:21 -04:00
alex	07bc22e405	no 1.9 style metrics (#13532 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-29 09:46:37 -07:00
alex	beb8b03e8a	peering: reconcile/ hint active state for list (#13619 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-29 09:43:50 -07:00
R.B. Boyer	31b95c747b	xds: modify rbac rules to use the XFCC header for peered L7 enforcement (#13629 ) When the protocol is http-like, and an intention has a peered source then the normal RBAC mTLS SAN field check is replaced with a joint combo of: mTLS SAN field must be the service's local mesh gateway leaf cert AND the first XFCC header (from the MGW) must have a URI field that matches the original intention source Also: - Update the regex program limit to be much higher than the teeny defaults, since the RBAC regex constructions are more complicated now. - Fix a few stray panics in xds generation.	2022-06-29 10:29:54 -05:00
R.B. Boyer	de0f9ac519	xds: have mesh gateways forward peered SpiffeIDs using the XFCC header (#13625 )	2022-06-28 15:32:42 -05:00
R.B. Boyer	1a9c86ea8f	xds: mesh gateways now correctly load up peer-exported discovery chains using L7 protocols (#13624 ) A mesh gateway will now configure the filter chains for L7 exported services using the correct discovery chain information.	2022-06-28 14:52:25 -05:00
R.B. Boyer	0fa828db76	peering: replicate all SpiffeID values necessary for the importing side to do SAN validation (#13612 ) When traversing an exported peered service, the discovery chain evaluation at the other side may re-route the request to a variety of endpoints. Furthermore we intend to terminate mTLS at the mesh gateway for arriving peered traffic that is http-like (L7), so the caller needs to know the mesh gateway's SpiffeID in that case as well. The following new SpiffeID values will be shipped back in the peerstream replication: - tcp: all possible SpiffeIDs resulting from the service-resolver component of the exported discovery chain - http-like: the SpiffeID of the mesh gateway	2022-06-27 14:37:18 -05:00
Max Bowsher	ef4b9e541f	Merge branch 'main' into fix-kv_entries-metric	2022-06-27 18:57:03 +01:00
alex	53f0cf5835	peering, internal: support UIServices, UINodes, UINodeInfo (#13577 )	2022-06-24 15:17:35 -07:00
Chris S. Kim	2e4cb6f77d	Add new index for PeeredServiceName and ServiceVirtualIP (#13582 ) For TProxy we will be leveraging the VirtualIP table, which needs to become peer-aware	2022-06-24 14:38:39 -04:00
alex	20ecf0febd	Merge pull request #13570 from hashicorp/acpance/peering-oss-intentions oss: peering, http: get peer service intentions (#2098)	2022-06-23 08:15:59 -07:00
Will Jordan	34ecbc1d71	Add per-node max indexes (#12399 ) Adds fine-grained node.[node] entries to the index table, allowing blocking queries to return fine-grained indexes that prevent them from returning immediately when unrelated nodes/services are updated. Co-authored-by: kisunji <ckim@hashicorp.com>	2022-06-23 11:13:25 -04:00
Chris S. Kim	ba89a7d9b0	Make memdb indexers generic (#13558 ) We have many indexer functions in Consul which take interface{} and type assert before building the index. We can use generics to get rid of the initial plumbing and pass around functions with better defined signatures. This has two benefits: 1) Less verbosity; 2) Developers can parse the argument types to memdb schemas without having to introspect the function for the type assertion.	2022-06-23 11:07:19 -04:00
Matt Keeler	7a4d13b0b2	Port over the index 0 -> 1 code that lived in the old rpc setQueryMeta function. (#13561 )	2022-06-23 09:34:47 -04:00
acpana	99c2e11328	oss: peering, http: get peer service intentions (#2098 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-22 16:25:09 -07:00
R.B. Boyer	e8ea3d7c3b	state: peering ID assignment cannot happen inside of the state store (#13525 ) Move peering ID assignment outside of the FSM, so that the ID is written to the raft log and the same ID is used by all voters, and after restarts.	2022-06-21 13:04:08 -05:00
Matt Keeler	cb01702cd2	Add server local blocking queries and watches (#13438 ) Co-authored-by: Dan Upton <daniel@floppy.co>	2022-06-21 13:36:49 -04:00
Chris S. Kim	fb5eb20563	Pass trust domain to RBAC to validate and fix use of wrong peer trust bundles (#13508 )	2022-06-20 22:47:14 -04:00
Max Bowsher	7b97b8abd2	Delete definition of metric `consul.acl.blocked.node.registration` Although the metric is defined, there is no code which ever sets its value - the code in question is genuinely asymmetric - there are 3 types of object for which registration can be tracked, but only 2 for which deregistration can be tracked.	2022-06-19 17:38:04 +01:00
Max Bowsher	7c19c701e1	Fix incorrect name and doc for kv_entries metric The name of the metric as registered with the metrics library to provide the help string, was incorrect compared with the actual code that sets the metric value - bring them into sync. Also, the help message was incorrect. Rather than copy the help message from telemetry.mdx, which was correct, but felt a bit unnatural in the way it was worded, update both of them to a new wording.	2022-06-19 11:58:23 +01:00
Dan Upton	e00e3a0bc3	Move ACLResolveResult into acl/resolver package (#13467 ) Having this type live in the agent/consul package makes it difficult to put anything that relies on token resolution (e.g. the new gRPC services) in separate packages without introducing import cycles. For example, if package foo imports agent/consul for the ACLResolveResult type it means that agent/consul cannot import foo to register its service. We've previously worked around this by wrapping the ACLResolver to "downgrade" its return type to an acl.Authorizer - aside from the added complexity, this also loses the resolved identity information. In the future, we may want to move the whole ACLResolver into the acl/resolver package. For now, putting the result type there at least, fixes the immediate import cycle issues.	2022-06-17 10:24:43 +01:00
DanStough	4b402e3119	feat: tgtwy xDS generation for destinations Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>	2022-06-16 16:17:49 -04:00
alex	bd4ddb3720	peering: block Intention.Apply ops (#13451 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-16 12:07:28 -07:00
alex	b3e99784a6	peering, state: account for peer intentions (#13443 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-16 10:27:31 -07:00
R.B. Boyer	da8cea58c9	xds: begin refactor to always pass test snapshots through all xDS types (#13461 )	2022-06-15 14:58:28 -05:00
R.B. Boyer	201d1458c3	xds: mesh gateways now have their own leaf certificate when involved in a peering (#13460 ) This is only configured in xDS when a service with an L7 protocol is exported. They also load any relevant trust bundles for the peered services to eventually use for L7 SPIFFE validation during mTLS termination.	2022-06-15 14:36:18 -05:00
Riddhi Shah	411edc876b	[OSS] Support merge-central-config option in node services list API (#13450 ) Adds the merge-central-config query param option to the /catalog/node-services/:node-name API, to get a service definition in the response that is merged with central defaults (proxy-defaults/service-defaults). Updated the consul connect envoy command to use this option when retrieving the proxy service details so as to render the bootstrap configuration correctly.	2022-06-15 08:30:31 -07:00
Evan Culver	7f8c650d61	connect: Use Envoy 1.22.2 instead of 1.22.1 (#13444 )	2022-06-14 15:29:41 -07:00
freddygv	f3843809da	Avoid deleting peerings marked as terminated. When our peer deletes the peering it is locally marked as terminated. This termination should kick off deleting all imported data, but should not delete the peering object itself. Keeping peerings marked as terminated acts as a signal that the action took place.	2022-06-14 15:37:09 -06:00
freddygv	6453375ab2	Add leader routine to clean up peerings Once a peering is marked for deletion a new leader routine will now clean up all imported resources and then the peering itself. A lot of the logic was grabbed from the namespace/partitions deferred deletions but with a handful of simplifications: - The rate limiting is not configurable. - Deleting imported nodes/services/checks is done by deleting nodes with the Txn API. The services and checks are deleted as a side-effect. - There is no "round rate limiter" like with namespaces and partitions. This is because peerings are purely local, and deleting a peering in the datacenter does not depend on deleting data from other DCs like with WAN-federated namespaces. All rate limiting is handled by the Raft rate limiter.	2022-06-14 15:36:50 -06:00
Evan Culver	ba6136eb42	connect: Update Envoy support matrix to latest patch releases (#13431 )	2022-06-14 13:19:09 -07:00
alex	a0a49ce2a6	peering: intentions list test (#13435 )	2022-06-14 10:59:53 -07:00
freddygv	6c8ab1bbac	Fixup stream tear-down steps. 1. Fix a bug where the peering leader routine would not track all active peerings in the "stored" reconciliation map. This could lead to tearing down streams where the token was generated, since the ConnectedStreams() method used for reconciliation returns all streams and not just the ones initiated by this leader routine. 2. Fix a race where stream contexts were being canceled before termination messages were being processed by a peer. Previously the leader routine would tear down streams by canceling their context right after the termination message was sent. This context cancelation could be propagated to the server side faster than the termination message. Now there is a change where the dialing peer uses CloseSend() to signal when no more messages will be sent. Eventually the server peer will read an EOF after receiving and processing the preceding termination message. Using CloseSend() is actually not enough to address the issue mentioned, since it doesn't wait for the server peer to finish processing messages. Because of this now the dialing peer also reads from the stream until an error signals that there are no more messages. Receiving an EOF from our peer indicates that they processed the termination message and have no additional work to do. Given that the stream is being closed, all the messages received by Recv are discarded. We only check for errors to avoid importing new data.	2022-06-13 12:10:42 -06:00
freddygv	cc921a9c78	Update peering state and RPC for deferred deletion When deleting a peering we do not want to delete the peering and all imported data in a single operation, since deleting a large amount of data at once could overload Consul. Instead we defer deletion of peerings so that: 1. When a peering deletion request is received via gRPC the peering is marked for deletion by setting the DeletedAt field. 2. A leader routine will monitor for peerings that are marked for deletion and kick off a throttled deletion of all imported resources before deleting the peering itself. This commit mostly addresses point #1 by modifying the peering service to mark peerings for deletion. Another key change is to add a PeeringListDeleted state store function which can return all peerings marked for deletion. This function is what will be watched by the deferred deletion leader routine.	2022-06-13 12:10:32 -06:00
Freddy	71b254522e	Clean up imported nodes/services/checks as needed (#13367 ) Previously, imported data would never be deleted. As nodes/services/checks were registered and deregistered, resources deleted from the exporting cluster would accumulate in the imported cluster. This commit makes updates to replication so that whenever an update is received for a service name we reconcile what was present in the catalog against what was received. This handleUpdateService method can handle both updates and deletions.	2022-06-13 11:52:28 -06:00
Mark Anderson	edbf19f4e8	Merge pull request #13357 from hashicorp/ma/add-build-date-oss Add build date (oss)	2022-06-13 08:43:20 -07:00
Chris S. Kim	a02e9abcc1	Update RBAC to handle imported services (#13404 ) When converting from Consul intentions to xds RBAC rules, services imported from other peers must encode additional data like partition (from the remote cluster) and trust domain. This PR updates the PeeringTrustBundle to hold the sending side's local partition as ExportedPartition. It also updates RBAC code to encode SpiffeIDs of imported services with the ExportedPartition and TrustDomain.	2022-06-10 17:15:22 -04:00
R.B. Boyer	f557509e58	xds: allow for peered upstreams to use tagged addresses that are hostnames (#13422 ) Mesh gateways can use hostnames in their tagged addresses (#7999). This is useful if you were to expose a mesh gateway using a cloud networking load balancer appliance that gives you a DNS name but no reliable static IPs. Envoy cannot accept hostnames via EDS and those must be configured using CDS. There was already logic when configuring gateways in other locations in the code, but given the illusions in play for peering the downstream of a peered service wasn't aware that it should be doing that. Also: - ensuring that we always try to use wan-like addresses to cross peer boundaries.	2022-06-10 16:11:40 -05:00
Kyle Havlovitz	7f62571419	Add dns node lookup support in partitions	2022-06-10 11:23:51 -07:00
R.B. Boyer	7001e1151c	peering: rename initiate to establish in the context of the APIs (#13419 )	2022-06-10 11:10:46 -05:00
Mark Anderson	dd22ceccd1	Change default dates Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-06-09 17:07:41 -07:00
Mark Anderson	f65093f1c6	Fixup some more tests Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-06-09 17:04:05 -07:00
Mark Anderson	19c87be3a6	Add build date to self endpoint Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-06-09 17:04:05 -07:00
Mark Anderson	ec060e5e37	Build date in config file Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-06-09 17:04:05 -07:00
R.B. Boyer	bba3eb8cdd	peering: mesh gateways are required for cross-peer service mesh communication (#13410 ) Require use of mesh gateways in order for service mesh data plane traffic to flow between peers. This also adds plumbing for envoy integration tests involving peers, and one starter peering test.	2022-06-09 11:05:18 -05:00
Alessandro De Blasis	06304bfb0d	lint: conversion	2022-06-09 16:17:20 +01:00
Alessandro De Blasis	28f19e4627	tests: removed redundant probe test	2022-06-09 15:49:45 +01:00
Alessandro De Blasis	af083cc5ba	tests: added syscall mocking and tests for Check_OSService	2022-06-09 15:48:34 +01:00
kisunji	196a1c468a	Add missing index for read	2022-06-08 13:53:31 -04:00
kisunji	d026d84880	Add IntentionMatch tests for source peers	2022-06-08 13:53:31 -04:00
kisunji	bb0b42da12	Update ServiceIntentionSourceIndex to handle peer	2022-06-08 13:53:31 -04:00
Chris S. Kim	bb832e2bba	Add SourcePeer fields to relevant Intentions types (#13390 )	2022-06-08 13:24:10 -04:00
R.B. Boyer	7423886136	peering: allow protobuf requests to populate the default partition or namespace (#13398 )	2022-06-08 11:55:18 -05:00
Dhia Ayachi	ec0d267a35	Fix intentions wildcard dest (#13397 ) * when enterprise meta are wildcard assume it's a service intention * fix partition and namespace * move kind outside the loops * get the kind check outside the loop and add a comment Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>	2022-06-08 10:38:55 -04:00
R.B. Boyer	edb2e55335	peering: avoid a race between peering establishment and termination (#13389 )	2022-06-07 16:29:09 -05:00
Dhia Ayachi	7393374fc0	Egress gtw/intention rpc endpoint (#13354 ) * update gateway-services table with endpoints * fix failing test * remove unneeded config in test * rename "endpoint" to "destination" * more endpoint renaming to destination in tests * update isDestination based on service-defaults config entry creation * use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist) * set unknown state to empty to avoid modifying a lot of tests * fix logic to set the kind correctly on CRUD * fix failing tests * add missing tests and fix service delete * fix failing test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * fix a bug with kind and add relevant test * fix compile error * fix failing tests * add kind to clone * fix failing tests * fix failing tests in catalog endpoint * fix service dump test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * remove duplicate tests * first draft of destinations intention in connect proxy * remove ServiceDestinationList * fix failing tests * fix agent/consul failing tests * change to filter intentions in the state store instead of adding a field. * fix failing tests * fix comment * fix comments * store service kind destination and add relevant tests * changes based on review * filter on destinations when querying source match * change state store API to get an IntentionTarget parameter * add intentions tests * add destination upstream endpoint * fix failing test * fix failing test and a bug with wildcard intentions * fix failing test * Apply suggestions from code review Co-authored-by: alex <8968914+acpana@users.noreply.github.com> * add missing test and clarify doc * fix style * gofmt intention.go * fix merge introduced issue Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: alex <8968914+acpana@users.noreply.github.com> Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>	2022-06-07 15:55:02 -04:00
Dhia Ayachi	5ec3274ae5	Egress gtw/connect destination intentions (#13341 ) * update gateway-services table with endpoints * fix failing test * remove unneeded config in test * rename "endpoint" to "destination" * more endpoint renaming to destination in tests * update isDestination based on service-defaults config entry creation * use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist) * set unknown state to empty to avoid modifying a lot of tests * fix logic to set the kind correctly on CRUD * fix failing tests * add missing tests and fix service delete * fix failing test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * fix a bug with kind and add relevant test * fix compile error * fix failing tests * add kind to clone * fix failing tests * fix failing tests in catalog endpoint * fix service dump test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * remove duplicate tests * first draft of destinations intention in connect proxy * remove ServiceDestinationList * fix failing tests * fix agent/consul failing tests * change to filter intentions in the state store instead of adding a field. * fix failing tests * fix comment * fix comments * store service kind destination and add relevant tests * changes based on review * filter on destinations when querying source match * Apply suggestions from code review Co-authored-by: alex <8968914+acpana@users.noreply.github.com> * fix style * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * rename destinationType to targetType. Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: alex <8968914+acpana@users.noreply.github.com> Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>	2022-06-07 15:03:59 -04:00
Alessandro De Blasis	b59c19bb06	feat: windows service health checks	2022-06-07 18:27:14 +01:00
R.B. Boyer	ab758b7b32	peering: allow mesh gateways to proxy L4 peered traffic (#13339 ) Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied. Note: this does not change the callers to use these mesh gateways.	2022-06-06 14:20:41 -05:00
Fulvio	d457d8b6ce	UDP check for service stanza #12221 (#12722 ) * UDP check for service stanza #12221 * add pass status on timeout condition * delete useless files * Update check_test.go improve comment in test * fix test * fix requested changes and update TestRuntimeConfig_Sanitize.golden * add freeport to TestCheckUDPCritical * improve comment for CheckUDP struct * fix requested changes * fix requested changes * fix requested changes * add UDP to proto * add UDP to proto and add a changelog * add requested test on agent_endpoint_test.go * add test for given endpoints * fix failing tests * add documentation for udp healthcheck * regenerate proto using buf * Update website/content/api-docs/agent/check.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> * Update website/content/api-docs/agent/check.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> * Update website/content/docs/discovery/checks.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> * Update website/content/docs/ecs/configuration-reference.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> * Update website/content/docs/ecs/configuration-reference.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> * add debug echo * add debug circle-ci * add debug circle-ci bash * use echo instead of status_stage * remove debug and status from devtools script and use echo instead * Update website/content/api-docs/agent/check.mdx Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * fix test * replace status_stage with status * replace functions with echo Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>	2022-06-06 15:13:19 -04:00
alex	bbbc50815a	peering: send leader addr (#13342 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-06-06 10:00:38 -07:00
Dan Upton	b168424398	xds: remove HTTPCheckFetcher dependency (#13366 ) This is the OSS portion of enterprise PR 1994 Rather than directly interrogating the agent-local state for HTTP checks using the `HTTPCheckFetcher` interface, we now rely on the config snapshot containing the checks. This reduces the number of changes required to support server xDS sessions. It's not clear why the fetching approach was introduced in `931d167ebb`.	2022-06-06 15:15:33 +01:00
R.B. Boyer	019aeaa57d	peering: update how cross-peer upstreams are represented in proxycfg and rendered in xds (#13362 ) This removes unnecessary, vestigial remnants of discovery chains.	2022-06-03 16:42:50 -05:00
cskh	74158a8aa2	Add isLeader metric to track if a server is a leader (#13304 ) CTIA-21: add is_leader metric to track if a server is a leader Co-authored-by: alex <8968914+acpana@users.noreply.github.com>	2022-06-03 13:07:37 -04:00
Freddy	32f125cc0f	Merge pull request #13340 from hashicorp/peering/public-listener	2022-06-02 15:15:29 -06:00
Chris S. Kim	73af9e9737	Fix KVSGet method to handle QueryOptions properly (#13344 )	2022-06-02 12:26:18 -04:00
Freddy	a09c776645	Update public listener with SPIFFE Validator Envoy's SPIFFE certificate validation extension allows for us to validate against different root certificates depending on the trust domain of the dialing proxy. If there are any trust bundles from peers in the config snapshot then we use the SPIFFE validator as the validation context, rather than the usual TrustedCA. The injected validation config includes the local root certificates as well.	2022-06-01 17:06:33 -06:00
freddygv	647c57a416	Add agent cache-type for TrustBundleListByService There are a handful of changes in this commit: * When querying trust bundles for a service we need to be able to specify the namespace of the service. * The endpoint needs to track the index because the cache watches use it. * Extracted bulk of the endpoint's logic to a state store function so that index tracking could be tested more easily. * Removed check for service existence, deferring that sort of work to ACL authz * Added the cache type	2022-06-01 17:05:10 -06:00
freddygv	8b58fa8afe	Update assumptions around exported-service config Given that the exported-services config entry can use wildcards, the precedence for wildcards is handled as with intentions. The most exact match is the match that applies for any given service. We do not take the union of all that apply. Another update that was made was to reflect that only one exported-services config entry applies to any given service in a partition. This is a pre-existing constraint that gets enforced by the Normalize() method on that config entry type.	2022-06-01 17:03:51 -06:00
Freddy	74ca6406ea	Configure upstream TLS context with peer root certs (#13321 ) For mTLS to work between two proxies in peered clusters with different root CAs, proxies need to configure their outbound listener to use different root certificates for validation. Up until peering was introduced proxies would only ever use one set of root certificates to validate all mesh traffic, both inbound and outbound. Now an upstream proxy may have a leaf certificate signed by a CA that's different from the dialing proxy's. This PR makes changes to proxycfg and xds so that the upstream TLS validation uses different root certificates depending on which cluster is being dialed.	2022-06-01 15:53:52 -06:00
R.B. Boyer	8e530701ce	test: regenerate golden files (#13336 ) make envoy-regen go test ./agent/config -update	2022-06-01 15:17:03 -05:00
Chris S. Kim	fcdd031911	Revert getPathSuffixUnescaped (#13256 )	2022-06-01 13:17:14 -04:00
Dan Upton	adeabed126	proxycfg: replace direct agent cache usage with interfaces (#13320 ) This is the OSS portion of enterprise PRs 1904, 1905, 1906, 1907, 1949, and 1971. It replaces the proxycfg manager's direct dependency on the agent cache with interfaces that will be implemented differently when serving xDS sessions from a Consul server.	2022-06-01 16:18:06 +01:00
Chris S. Kim	67860bd248	Reimplement fs.FileInfo interface (#13315 ) Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-06-01 11:09:51 -04:00
Dhia Ayachi	1b779240ae	update gateway-services table with endpoints (#13217 ) * update gateway-services table with endpoints * fix failing test * remove unneeded config in test * rename "endpoint" to "destination" * more endpoint renaming to destination in tests * update isDestination based on service-defaults config entry creation * use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist) * set unknown state to empty to avoid modifying a lot of tests * fix logic to set the kind correctly on CRUD * fix failing tests * add missing tests and fix service delete * fix failing test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * fix a bug with kind and add relevant test * fix compile error * fix failing tests * add kind to clone * fix failing tests * fix failing tests in catalog endpoint * fix service dump test * Apply suggestions from code review Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * remove duplicate tests * rename consts and fix kind when no destination is defined in the service-defaults. * rename Kind to ServiceKind and change switch to use .(type) Co-authored-by: Dan Stough <dan.stough@hashicorp.com>	2022-05-31 16:20:12 -04:00
Chris S. Kim	f0a9b30174	Update repo to use go:embed (#10996 ) Replace bindata packages with stdlib go:embed. Modernize some uiserver code with newer interfaces introduced in go 1.16 (mainly working with fs.File instead of http.File). Remove steps that are no longer used from our build files. Add Github Action to detect differences in agent/uiserver/dist and verify that the files are correct (by compiling UI assets and comparing contents).	2022-05-31 15:33:56 -04:00
Riddhi Shah	1a901953e2	[OSS] Fix merge central config tests (#13309 ) Setting the right enterprise meta to fix the merge central config tests. Re-added the tests that were failing on the OSS to ENT merge.	2022-05-31 12:04:19 -07:00
freddygv	364758ef2f	Use embedded SpiffeID for peered upstreams	2022-05-31 09:55:37 -06:00
freddygv	c8edec0ab6	Remove intermediate representation of SPIFFE IDs xDS only ever uses the string representation, so we can avoid passing around connect.SpiffeIDService objects around.	2022-05-31 09:55:37 -06:00
freddygv	870e7c72d7	Return SPIFFE ID for connect proxies in PeerMeta Proxies dialing exporting services need to know the SPIFFE ID of services dialed so that the upstream's SANs can be validated. This commit attaches the SPIFFE ID to all connect proxies exported over the peering stream so that they are available to importing clusters. The data in the SPIFFE ID cannot be re-constructed in peer clusters because the partition of exported services is overwritten on imports.	2022-05-31 09:55:37 -06:00
Freddy	9427700270	[OSS] Add grpc endpoint to fetch a specific trust bundle (#13292 ) Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2022-05-31 09:54:40 -06:00
Matt Keeler	3795769729	Fix a flaky test (#13282 ) At the end of this test we were trying to ensure that updating a service in the local state causes it to re-register the service with the config manager. The config manager in the same method will also call RegisteredProxies to determine if any need to be removed. This portion of the test is not attempting to verify that behavior. Because the test is only blocked waiting for the Register event before it can end and assert all the mock expectations were met, we may not see the call to RegisteredProxies. This is especially apparent when tests are run with the race detector. As we don’t actually care if that method is executed before the end of the test we can simply transition from expecting it to be called exactly once to a 0 or 1 times assertion.	2022-05-27 13:25:08 -04:00
Dan Upton	2427e38839	Enable servers to configure arbitrary proxies from the catalog (#13244 ) OSS port of enterprise PR 1822 Includes the necessary changes to the `proxycfg` and `xds` packages to enable Consul servers to configure arbitrary proxies using catalog data. Broadly, `proxycfg.Manager` now has public methods for registering, deregistering, and listing registered proxies — the existing local agent state-sync behavior has been moved into a separate component that makes use of these methods. When an xDS session is started for a proxy service in the catalog, a goroutine will be spawned to watch the service in the server's state store and re-register it with the `proxycfg.Manager` whenever it is updated (and clean it up when the client goes away).	2022-05-27 12:38:52 +01:00
alex	fd7a403e11	monitor leadership in peering service (#13257 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2022-05-26 17:55:16 -07:00
Riddhi Shah	b6a4271c02	Temporarily disable validation of merge central config response (#13266 ) Temporarily disabling the validation of merge central config response since it is breaking OSS to ENT merging. A follow-up PR will patch the fixes.	2022-05-26 13:49:40 -07:00
Chris S. Kim	6d3bea7129	Add support for streaming CA roots to peers (#13260 ) Sender watches for changes to CA roots and sends them through the replication stream. Receiver saves CA roots to tablePeeringTrustBundle	2022-05-26 15:24:09 -04:00
Riddhi Shah	c78ee7d48f	Remove tests failing on ent (#13255 ) Will follow up with the fixed version of these tests that passes in ent.	2022-05-26 10:17:59 -07:00
John Cowen	09c5bac102	Export top-level HCP Enabled go-template variable for UI (#13165 ) * Update ui template data to export HCPEnabled at the top level	2022-05-26 17:23:56 +01:00
DanStough	2e2c71d2f2	fix: multiple grpc/http2 services for ingress listeners	2022-05-26 10:43:58 -04:00
Riddhi Shah	d8d8c8603e	Add support for merge-central-config query param (#13001 ) Adds a new query param merge-central-config for use with the below endpoints: /catalog/service/:service /catalog/connect/:service /health/service/:service /health/connect/:service If set on the request, the response will include a fully resolved service definition which is merged with the proxy-defaults/global and service-defaults/:service config entries (on-demand style). This is useful to view the full service definition for a mesh service (connect-proxy kind or gateway kind) which might not be merged before being written into the catalog (example: in case of services in the agentless model).	2022-05-25 13:20:17 -07:00
R.B. Boyer	31526139fd	remove a source of test panics (#13227 )	2022-05-25 14:33:00 -05:00
R.B. Boyer	a85b8a4705	api: ensure peering API endpoints do not use protobufs (#13204 ) I noticed that the JSON api endpoints for peerings json encodes protobufs directly, rather than converting them into their `api` package equivalents before marshal/unmarshaling them. I updated this and used `mog` to do the annoying part in the middle. Other changes: - the status enum was converted into the friendlier string form of the enum for readability with tools like `curl` - some of the `api` library functions were slightly modified to match other similar endpoints in UX (cc: @ndhanushkodi ) - peeringRead returns `nil` if not found - partitions are NOT inferred from the agent's partition (matching 1.11-style logic)	2022-05-25 13:43:35 -05:00
R.B. Boyer	1a8834e1c8	peering: replicate expected SNI, SPIFFE, and service protocol to peers (#13218 ) The importing peer will need to know what SNI and SPIFFE name corresponds to each exported service. Additionally it will need to know at a high level the protocol in use (L4/L7) to generate the appropriate connection pool and local metrics. For replicated connect synthetic entities we edit the `Connect{}` part of a `NodeService` to have a new section: { "PeerMeta": { "SNI": [ "web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul" ], "SpiffeID": [ "spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web" ], "Protocol": "tcp" } } This data is then replicated and saved as-is at the importing side. Both SNI and SpiffeID are slices for now until I can be sure we don't need them for how mesh gateways will ultimately work.	2022-05-25 12:37:44 -05:00
R.B. Boyer	be631ebdce	peering: disable requirement for mesh gateways initially (#13213 )	2022-05-25 10:13:23 -05:00
Kyle Havlovitz	0ed9ff8ef7	Merge pull request #13143 from hashicorp/envoy-connection-limit Add connection limit setting to service defaults	2022-05-25 07:48:50 -07:00
Kyle Havlovitz	f2fbe8aec9	Fix proto lint errors after version bump	2022-05-24 18:44:54 -07:00
Kyle Havlovitz	dbed8ae10b	Specify go_package explicitly	2022-05-24 10:22:53 -07:00
cskh	8712a088b1	fix: non-leader agents return 404 on Get Intention exact api (#13179 ) * fix: non-leader agents return 404 on Get Intention exact api - rpc call method appends extra error message, so change == to "Strings.Contains" Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-05-24 13:21:15 -04:00
Kyle Havlovitz	4bc6c23357	Add connection limit setting to service defaults	2022-05-24 10:13:38 -07:00
DanStough	817449041d	chore(test): Update bats version	2022-05-24 11:56:08 -04:00
DanStough	147fd96d97	feat: add endpoint struct to ServiceConfigEntry	2022-05-24 11:56:08 -04:00
alex	876f3bb971	peering: expose IsLeader, hang up on dialer if follower (#13164 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-05-23 11:30:58 -07:00
Matt Keeler	26f4ea3f01	Migrate from `protoc` to `buf` (#12841 ) * Install `buf` instead of `protoc` * Created `buf.yaml` and `buf.gen.yaml` files in the two proto directories to control how `buf` generates/lints proto code. * Invoke `buf` instead of `protoc` * Added a `proto-format` make target. * Committed the reformatted proto files. * Added a `proto-lint` make target. * Integrated proto linting with CI * Fixed tons of proto linter warnings. * Got rid of deprecated builtin protoc-gen-go grpc plugin usage. Moved to direct usage of protoc-gen-go-grpc. * Unified all proto directories / go packages around using pb prefixes but ensuring all proto packages do not have the prefix.	2022-05-23 10:37:52 -04:00
cskh	c986940fda	Upgrade golangci-lint for go v1.18 (#13176 )	2022-05-23 10:26:45 -04:00
R.B. Boyer	21bb0eef4a	test: fix flaky test TestEventBufferFuzz (#13175 )	2022-05-23 09:22:30 -05:00
Matt Keeler	d0fdf22f83	Fix tests broken in #13173 (#13178 ) I changed the error type returned in a situation but didn’t update the tests to expect that error.	2022-05-23 10:00:06 -04:00
Matt Keeler	3c1e17cbd5	Fix flaky tests in the agent/grpc/public/services/serverdiscovery package (#13173 ) Occasionally we had seen the TestWatchServers_ACLToken_PermissionDenied be flagged as flaky in circleci. This change should fix that. Why it fixes it is complicated. The test was failing with a panic when a mocked ACL Resolver was being called more times than expected. I struggled for a while to determine how that could be. This test should call authorize once and only once and the error returned should cause the stream to be terminated and the error returned to the gRPC client. Another oddity was no amount of running this test locally seemed to be able to reproduce the issue. I ran the test hundreds of thousands of times and it always passed. It turns out that there is nothing wrong with the test. It just so happens that the panic from unexpected invocation of a mocked call happened during the test but was caused by a previous test (specifically the TestWatchServers_StreamLifecycle test). The stream from the previous test remained open after all the test Cleanup functions were run and it just so happened that when the EventPublisher eventually picked up that the context was cancelled during cleanup, it force closes all subscriptions which causes some loops to be re-entered and the streams to be reauthorized. It's that looping in response to forced subscription closures that causes the mock to eventually panic. All the components, publisher, server, client all operate based on contexts. We cancel all those contexts but there is no synchronous way to know when they are stopped. We could have implemented a synchronous stop but in the context of an actual running Consul, context cancellation + async stopping is perfectly fine. What we (Dan and I) eventually thought was that the behavior of grpc streams such as this when a server was shutting down wasn’t super helpful. What we would want is for a client to be able to distinguish between subscription closed because something may have changed requiring re-authentication and subscription closed because the server is shutting down. That way we can send back appropriate error messages to detail that the server is shutting down and not confuse users with potentially needing to resubscribe. So that's what this PR does. We have introduced a shutting down state to our event subscriptions and the various streaming gRPC services that rely on the event publisher will all just behave correctly and actually stop the stream (not attempt transparent reauthorization) if this particular error is the one we get from the stream. Additionally the error that gets transmitted back through gRPC when this does occur indicates to the consumer that the server is going away. That is more helpful so that a client can then attempt to reconnect to another server.	2022-05-23 08:59:13 -04:00
R.B. Boyer	bbcb1fa805	agent: allow for service discovery queries involving peer name to use streaming (#13168 )	2022-05-20 15:27:01 -05:00
Dan Upton	d7f8a8e4ef	proxycfg: remove dependency on `cache.UpdateEvent` (#13144 ) OSS portion of enterprise PR 1857. This removes (most) references to the `cache.UpdateEvent` type in the `proxycfg` package. As we're going to be replacing direct usage of the agent cache with interfaces that can be satisfied by alternative server-local datasources, it doesn't make sense to depend on this type everywhere anymore (particularly on the `state.ch` channel). We also plan to extract `proxycfg` out of Consul into a shared library in the future, which would require removing this dependency. Aside from a fairly rote find-and-replace, the main change is that the `cache.Cache` and `health.Client` types now accept a callback function parameter, rather than a `chan<- cache.UpdateEvents`. This allows us to do the type conversion without running another goroutine.	2022-05-20 15:47:40 +01:00
R.B. Boyer	2e72f44fda	peering: accept replication stream of discovery chain information at the importing side (#13151 )	2022-05-19 16:37:52 -05:00
R.B. Boyer	c27e186334	test: TestServer_RPC_MetricsIntercept should use a concurrency-safe metrics store (#13157 )	2022-05-19 15:39:28 -05:00
cskh	364d4f5efe	Retry on bad dogstatsd connection (#13091 ) - Introduce a new telemetry configurable parameter retry_failed_connection. User can set the value to true to let consul agent continue its start process on failed connection to datadog server. When set to false, agent will stop on failed start. The default behavior is true. Co-authored-by: Dan Upton <daniel@floppy.co> Co-authored-by: Evan Culver <eculver@users.noreply.github.com>	2022-05-19 16:03:46 -04:00
R.B. Boyer	3e4a522882	peering: replicate discovery chains information to importing peers Treat each exported service as a "discovery chain" and replicate one synthetic CheckServiceNode for each chain and remote mesh gateway. The health will be a flattened generated check of the checks for that mesh gateway node.	2022-05-19 14:21:44 -05:00
R.B. Boyer	5a03536040	prefactor some functions out of the monolithic file	2022-05-19 14:21:29 -05:00
R.B. Boyer	1e31dc891a	test: fix incorrect use of t instead of r in retry test (#13146 )	2022-05-19 14:00:07 -05:00
Dan Upton	a76f63a695	config: prevent top-level `verify_incoming` enabling mTLS on gRPC port (#13118 ) Fixes #13088 This is a backwards-compatibility bug introduced in 1.12.	2022-05-18 16:15:57 +01:00
Freddy	b38be4c0ed	Patches to peering initiation for POC demo (#13076 ) Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2022-05-13 13:01:00 -06:00
Dhia Ayachi	a0455774c0	When a host header is defined override `req.Host` in the metrics ui (#13071 ) * When a host header is defined override the req.Host in the metrics ui endpoint. * add changelog	2022-05-13 14:05:22 -04:00
Freddy	e874b860c0	Actually block when syncing subscriptions (#13066 ) By changing to use WatchCtx we will actually block for changes to the peering list. WatchCh creates a goroutine to collect errors from WatchCtx and returns immediately. The existing behavior wouldn't result in a tight loop because of the rate limiting in the surrounding function, but it would still lead to more work than is necessary.	2022-05-12 17:36:14 -06:00
Evan Culver	0fa5e7be5a	peering: add TrustBundleListByService endpoint (#13048 )	2022-05-12 15:58:22 -07:00
Freddy	4e215dc411	[OSS] Add upsert handling for receiving CheckServiceNode (#13061 )	2022-05-12 15:04:44 -06:00
Matt Keeler	b788691fa6	Watch the singular service resolver instead of the list + filtering to 1 (#13012 ) * Watch the singular service resolver instead of the list + filtering to 1 * Rename the ConfigEntries cache type to ConfigEntryList	2022-05-12 16:34:17 -04:00
R.B. Boyer	93b164aac3	structs: add convenience methods to sort slices of ServiceName values (#13038 )	2022-05-12 10:08:50 -05:00
R.B. Boyer	cc15a11f9c	test: ensure this package uses freeport for port allocation (#13036 )	2022-05-11 14:20:50 -05:00
R.B. Boyer	901fd4dd68	remove remaining shim runStep functions (#13015 ) Wraps up the refactor from #13013	2022-05-10 16:24:45 -05:00
R.B. Boyer	0d6d16ddfb	add general runstep test helper instead of copying it all over the place (#13013 )	2022-05-10 15:25:51 -05:00
Jared Kirschner	f4e1ade46a	Merge pull request #12463 from hashicorp/docs/consistency-mode-improvements Improve consistency mode docs	2022-05-09 23:04:00 -04:00
Jared Kirschner	05a648f530	docs: clarify consistency mode operation Changes include: - Add diagrams of the operation of different consistency modes - Note that only stale reads benefit from horizontal scaling - Increase scannability with headings - Document consistency mode defaults and how to override for DNS and HTTP API interfaces - Document X-Consul-Effective-Consistency response header	2022-05-09 16:39:48 -07:00
FFMMM	b8ce8e36fb	add err msg on PeeringRead not found (#12986 ) Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2022-05-09 15:22:42 -07:00
FFMMM	37a1e33834	expose meta tags for peering (#12964 )	2022-05-09 13:47:37 -07:00
Mark Anderson	4364e440db	Add oss test Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-09 10:07:19 -07:00
Mark Anderson	346b68a441	Fix up enterprise version tag. Changes to how the version string was handled created a small regression with the release of consul 1.12.0 enterprise. Many tools use the Config:Version field reported by the agent/self resource to determine whether Consul is an enterprise or OSS instance, expecting something like 1.12.0+ent for enterprise and simply 1.12.0 for OSS. This was accidentally broken during the run-up to 1.12.x. This work fixes the value returned by both the self endpoint in ["Config"]["Version"] and the metrics consul.version field. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-09 10:07:19 -07:00
Evan Culver	9c8606e138	peering: add store.PeeringsForService implementation (#12957 )	2022-05-06 12:35:31 -07:00
Eric Haberkorn	e7b9d025a4	Merge pull request #12956 from hashicorp/suport-lambda-connect-proxy Support Invoking Lambdas from Sidecar Proxies	2022-05-06 08:17:38 -04:00
Eric	21c3134575	Support making requests to lambda from connect proxies.	2022-05-05 17:42:30 -04:00
FFMMM	745bd15b15	api: add PeeringList, polish (#12934 )	2022-05-05 14:15:42 -07:00
Riddhi Shah	0c855fab98	Validate port on mesh service registration (#12881 ) Add validation to ensure connect native services have a port or socketpath specified on catalog registration. This was the only missing piece to ensure all mesh services are validated for a port (or socketpath) specification on catalog registration.	2022-05-05 09:13:30 -07:00
Mark Anderson	c6ff4ba7d8	Support vault namespaces in connect CA (#12904 ) * Support vault namespaces in connect CA Follow on to some missed items from #12655 From an internal ticket "Support standard "Vault namespace in the path" semantics for Connect Vault CA Provider" Vault allows the namespace to be specified as a prefix in the path of a PKI definition, but our usage of the Vault API includes calls that don't support a namespaced key. In particular the sys.* family of calls simply appends the key, instead of prefixing the namespace in front of the path. Unfortunately it is difficult to reliably parse a path with a namespace; only vault knows what namespaces are present, and the '/' separator can be inside a key name, as well as separating path elements. This is in use in the wild; for example 'dc1/intermediate-key' is a relatively common naming schema. Instead we add two new fields: RootPKINamespace and IntermediatePKINamespace, which are the absolute namespace paths 'prefixed' in front of the respective PKI Paths. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-04 19:41:55 -07:00
Chris S. Kim	abc472f2a3	Default discovery chain when upstream targets a DestinationPeer (#12942 )	2022-05-04 16:25:25 -04:00
Mark Anderson	2fcac5224e	Merge pull request #12878 from hashicorp/ma/x-forwarded-client-cert Support x-forwarded-client-cert	2022-05-04 11:05:44 -07:00
Dan Upton	a668c36930	acl: gRPC login and logout endpoints (#12935 ) Introduces two new public gRPC endpoints (`Login` and `Logout`) and includes refactoring of the equivalent net/rpc endpoints to enable the majority of logic to be reused (i.e. by extracting the `Binder` and `TokenWriter` types). This contains the OSS portions of the following enterprise commits: - 75fcdbfcfa6af21d7128cb2544829ead0b1df603 - bce14b714151af74a7f0110843d640204082630a - cc508b70fbf58eda144d9af3d71bd0f483985893	2022-05-04 17:38:45 +01:00
Mark Anderson	97f19a6ec1	Fix tests for APPEND_FORWARD change Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-04 08:50:59 -07:00
Mark Anderson	863bc16530	Change to use APPEND_FORWARD for terminating gateway Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-04 08:50:59 -07:00
Mark Anderson	6430af1c0e	Update mesh config tests Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-04 08:50:59 -07:00
Mark Anderson	05dc5a26b7	Docs and changelog edits Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-04 08:50:59 -07:00
Mark Anderson	fee6c7a7b6	Fixup missed config entry Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-05-04 08:50:59 -07:00
Mark Anderson	28b4b3a85d	Add x-forwarded-client-cert headers Description Add x-fowarded-client-cert information on trusted incoming connections. Envoy provides support forwarding and annotating the x-forwarded-client-cert header via the forward_client_cert_details set_current_client_cert_details filter fields. It would be helpful for consul to support this directly in its config. The escape hatches are a bit cumbersome for this purpose. This has been implemented on incoming connections to envoy. Outgoing (from the local service through the sidecar) will not have a certificate, and so are left alone. A service on an incoming connection will now get headers something like this: ``` X-Forwarded-Client-Cert:[By=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/counting;Hash=61ad5cbdfcb50f5a3ec0ca60923d61613c149a9d4495010a64175c05a0268ab2;Cert="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Chain="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVl
STIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Subject="";URI=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/dashboard] ``` Closes #12852	2022-05-04 08:50:58 -07:00
Kyle Havlovitz	0696ed24c8	Merge pull request #12885 from hashicorp/acl-err-cache Store and return RPC error in ACL cache entries	2022-05-03 10:44:22 -07:00
Kyle Havlovitz	76d62a14f5	Return ACLRemoteError from cache and test it correctly	2022-05-03 10:05:26 -07:00
FFMMM	3b3f001580	[sync oss] api: add peering api module (#12911 )	2022-05-02 11:49:05 -07:00
Chris S. Kim	9791bad136	peering: Make Upstream peer-aware (#12900 ) Adds DestinationPeer field to Upstream. Adds Peer field to UpstreamID and its string conversion functions.	2022-04-29 18:12:51 -04:00
Chris S. Kim	0d66301ea7	Cleanup peering files that used error types that were removed (#12892 )	2022-04-29 14:02:26 -04:00
Mathew Estafanous	474385d153	Unify various status errors into one HTTP error type. (#12594 ) Replaces specific error types for HTTP Status codes with a generic HTTPError type. Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-04-29 13:42:49 -04:00
Kyle Havlovitz	0d8b187ea1	Store and return rpc error in acl cache entries	2022-04-28 09:08:55 -07:00
R.B. Boyer	11213ae180	health: ensure /v1/health/service/:service endpoint returns the most recent results when a filter is used with streaming (#12640 ) The primary bug here is in the streaming subsystem that makes the overall v1/health/service/:service request behave incorrectly when servicing a blocking request with a filter provided. There is a secondary non-streaming bug being fixed here that is much less obvious related to when to update the `reply` variable in a `blockingQuery` evaluation. It is unlikely that it is triggerable in practical environments and I could not actually get the bug to manifest, but I fixed it anyway while investigating the original issue. Simple reproduction (streaming): 1. Register a service with a tag. curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \ --header 'Content-Type: application/json' \ --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "a" ], "EnableTagOverride": true }' 2. Do an initial filter query that matches on the tag. curl -sLi --get 'http://localhost:8500/v1/health/service/test' --data-urlencode 'filter=a in Service.Tags' 3. Note you get one result. Use the `X-Consul-Index` header to establish a blocking query in another terminal, this should not return yet. curl -sLi --get 'http://localhost:8500/v1/health/service/test?index=$INDEX' --data-urlencode 'filter=a in Service.Tags' 4. Re-register that service with a different tag. curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \ --header 'Content-Type: application/json' \ --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "b" ], "EnableTagOverride": true }' 5. Your blocking query from (3) should return with a header `X-Consul-Query-Backend: streaming` and empty results if it works correctly `[]`. Attempts to reproduce with non-streaming failed (where you add `&near=_agent` to the read queries and ensure `X-Consul-Query-Backend: blocking-query` shows up in the results).	2022-04-27 10:39:45 -05:00
R.B. Boyer	1a491886fa	structs: ensure exported-services PeerName field can be addressed as peer_name (#12862 )	2022-04-27 10:27:21 -05:00
Dhia Ayachi	b83a790927	update raft to v1.3.8 (#12844 ) * update raft to v1.3.7 * add changelog * fix compilation error * fix HeartbeatTimeout * fix ElectionTimeout to reload only if value is valid * fix default values for `ElectionTimeout` and `HeartbeatTimeout` * fix test defaults * bump raft to v1.3.8	2022-04-25 10:19:26 -04:00
R.B. Boyer	f507f62f3c	peering: initial sync (#12842 ) - Add endpoints related to peering: read, list, generate token, initiate peering - Update node/service/check table indexing to account for peers - Foundational changes for pushing service updates to a peer - Plumb peer name through Health.ServiceNodes path see: ENT-1765, ENT-1280, ENT-1283, ENT-1283, ENT-1756, ENT-1739, ENT-1750, ENT-1679, ENT-1709, ENT-1704, ENT-1690, ENT-1689, ENT-1702, ENT-1701, ENT-1683, ENT-1663, ENT-1650, ENT-1678, ENT-1628, ENT-1658, ENT-1640, ENT-1637, ENT-1597, ENT-1634, ENT-1613, ENT-1616, ENT-1617, ENT-1591, ENT-1588, ENT-1596, ENT-1572, ENT-1555 Co-authored-by: R.B. Boyer <rb@hashicorp.com> Co-authored-by: freddygv <freddy@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Evan Culver <eculver@hashicorp.com> Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>	2022-04-21 17:34:40 -05:00
Will Jordan	c48120d005	Add timeout to Client RPC calls (#11500 ) Adds a timeout (deadline) to client RPC calls, so that streams will no longer hang indefinitely in unstable network conditions. Co-authored-by: kisunji <ckim@hashicorp.com>	2022-04-21 16:21:35 -04:00
Matt Keeler	7ce2b48cb7	Implement the ServerDiscovery.WatchServers gRPC endpoint (#12819 ) * Implement the ServerDiscovery.WatchServers gRPC endpoint * Fix the ConnectCA.Sign gRPC endpoints metadata forwarding. * Unify public gRPC endpoints around the public.TraceID function for request_id logging	2022-04-21 12:56:18 -04:00
Blake Covarrubias	c786c49282	acl: Clarify node/service identities must be lowercase (#12807 ) Modify the ACL error message for invalid node/service identity names to clearly state that only lowercase alphanumeric characters are supported.	2022-04-21 09:29:16 -07:00
R.B. Boyer	4274e67b47	chore: upgrade mockery to v2 and regenerate (#12836 )	2022-04-21 09:48:21 -05:00
R.B. Boyer	f3ce353a87	ca: fix a bug that caused a non blocking leaf cert query after a blocking leaf cert query to block (#12820 ) Fixes #12048 Fixes #12319 Regression introduced in #11693 Local reproduction steps: 1. `consul agent -dev` 2. `curl -sLiv 'localhost:8500/v1/agent/connect/ca/leaf/web'` 3. make note of the `X-Consul-Index` header returned 4. `curl -sLi 'localhost:8500/v1/agent/connect/ca/leaf/web?index=<VALUE_FROM_STEP_3>'` 5. Kill the above curl when it hangs with Ctrl-C 6. Repeat (2) and it should not hang.	2022-04-20 12:21:47 -05:00
Riddhi Shah	a1eb774407	[OSS] gRPC call to get envoy bootstrap params (#12825 ) Adds a new gRPC endpoint to get envoy bootstrap params. The new consul-dataplane service will use this endpoint to generate an envoy bootstrap configuration.	2022-04-19 17:24:21 -07:00
Matt Keeler	cdad79bfc7	Add event generation for autopilot state updates (#12626 ) Whenever autopilot updates its state it notifies Consul. That notification will then trigger Consul to extract out the ready server information. If the ready servers have changed, then an event will be published to notify any subscribers of the full set of ready servers. All these ready server event things are contained within an autopilotevents package instead of the consul package to make importing them into the grpc related packages possible	2022-04-19 13:03:03 -04:00
Evan Culver	000d0621b4	connect: Add Envoy 1.22 to integration tests, remove Envoy 1.18 (#12805 ) Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2022-04-18 09:36:07 -07:00
DanStough	95250e7915	Update go version to 1.18.1	2022-04-18 11:41:10 -04:00
Kyle Havlovitz	e162db7ad0	Add an internal env var for managed cluster config in the ui (#12796 )	2022-04-15 09:55:52 -07:00
John Murret	a1117261df	set vault namespaces on vault client prior to logging in with the vault auth method	2022-04-14 12:18:06 -06:00
Evan Culver	881e17fae1	connect: Add Envoy 1.21.1 to support matrix, remove 1.17.4 (#12777 )	2022-04-14 10:44:42 -07:00
Dan Upton	325c1c0dd7	ConnectCA.Sign gRPC Endpoint (#12787 ) Introduces a gRPC endpoint for signing Connect leaf certificates. It's also the first of the public gRPC endpoints to perform leader-forwarding, so establishes the pattern of forwarding over the multiplexed internal RPC port.	2022-04-14 14:26:14 +01:00
Kyle Havlovitz	3e88f579fc	Fix namespace default field names in expanded token output	2022-04-13 16:46:39 -07:00
Paul Glass	99f373dde4	acl: Adjust region handling in AWS IAM auth method (#12774 ) * acl: Adjust region handling in AWS IAM auth method	2022-04-13 14:31:37 -05:00
Eric Haberkorn	8d966edfbb	Merge pull request #12773 from hashicorp/fix-lambda-intentions-and-routing Implement Routing and Intentions for Lambdas	2022-04-13 13:01:15 -04:00
Eric	b01bb41553	Implement routing and intentions for AWS Lambdas	2022-04-13 11:45:25 -04:00
Karl Cardenas	43b548d4c1	Merge pull request #12562 from hashicorp/docs/blake-agent-config docs: Agent configuration hierarchy reorganization	2022-04-12 12:33:42 -07:00
FFMMM	a46bbe892d	add more labels to RequestRecorder (#12727 ) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-04-12 10:50:25 -07:00
Matt Keeler	8bad5105b7	Move to using a shared EventPublisher (#12673 ) Previously we had 1 EventPublisher per state.Store. When a state store was closed/abandoned such as during a consul snapshot restore, this had the behavior of force closing subscriptions for that topic and evicting event snapshots from the cache. The intention of this commit is to keep all that behavior. To that end, the shared EventPublisher now supports the ability to refresh a topic. That will perform the force close + eviction. The FSM upon abandoning the previous state.Store will call RefreshTopic for all the topics with events generated by the state.Store.	2022-04-12 09:47:42 -04:00
Blake Covarrubias	891e086cde	Remove .html extensions from docs URLs	2022-04-11 17:38:49 -07:00
Natalie Smith	0a310188f8	docs: fix yet more references to agent/options	2022-04-11 17:38:49 -07:00
R.B. Boyer	9fb8616bac	fix broken test (#12741 )	2022-04-11 10:56:57 -05:00
Jared Kirschner	1d817f683a	Merge pull request #12725 from hashicorp/clarify-service-deregister-after-critical-message improve error msg for deregister critical service	2022-04-07 18:01:54 -04:00
R.B. Boyer	25ba9c147a	xds: ensure that all connect timeout configs can apply equally to tproxy direct dial connections (#12711 ) Just like standard upstreams, the order of applicability, in descending precedence, is: 1. caller's `service-defaults` upstream override for destination 2. caller's `service-defaults` upstream defaults 3. destination's `service-resolver` ConnectTimeout 4. system default of 5s Co-authored-by: mrspanishviking <kcardenas@hashicorp.com>	2022-04-07 16:58:21 -05:00
Jared Kirschner	c4534bc53d	improve error msg for deregister critical service If a service is automatically deregistered because it has had a critical health check for longer than deregister_critical_service_after, the error message will now include: - mention of the deregister_critical_service_after option - the value of deregister_critical_service_after for that check	2022-04-07 14:50:02 -07:00
Kyle Havlovitz	9780b672da	Merge pull request #12685 from hashicorp/http-check-redirect-option Add a field to disable following redirects on http checks	2022-04-07 11:29:27 -07:00
Matt Keeler	a553982506	Enable running autopilot state updates on all servers (#12617 ) * Fixes a lint warning about t.Errorf not supporting %w * Enable running autopilot on all servers On the non-leader servers, all they do is update the state; they do not attempt any modifications. * Fix the RPC conn limiting tests Technically they were relying on racy behavior before. Now they should be reliable.	2022-04-07 10:48:48 -04:00
FFMMM	5245251bbf	[rpc/middleware][consul] plumb intercept off, add server level happy test (#12692 )	2022-04-06 14:33:05 -07:00
FFMMM	7ed356b338	lower log to trace (#12708 )	2022-04-06 11:37:08 -07:00
Kyle Havlovitz	3b44343276	Add a field to disable following redirects on http checks	2022-04-05 16:12:18 -07:00
Mark Anderson	98a2e282be	Fixup acl.EnterpriseMeta Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-04-05 15:11:49 -07:00
Mark Anderson	05eded4f1d	Manual Structs fixup Change things by hand that I couldn't figure out how to automate Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-04-05 14:51:10 -07:00
Mark Anderson	897ba08cfd	add new entmeta stuff. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-04-05 14:49:31 -07:00
R.B. Boyer	d06183ba7f	syncing changes back from enterprise (#12701 )	2022-04-05 15:46:56 -05:00
Riddhi Shah	41ef1671fa	Merge pull request #12695 from hashicorp/feature-negotiation-grpc-api-oss [OSS] Supported dataplane features gRPC endpoint	2022-04-05 11:26:33 -07:00
Dan Upton	7be40406fa	ca: move ConnectCA.Sign authorization logic to CAManager (#12609 ) OSS sync of enterprise changes at 8d6fd125	2022-04-05 13:16:20 -05:00
Kyle Havlovitz	6cf22a5cef	Merge pull request #12672 from hashicorp/tgate-san-validation Respect SNI with terminating gateways and log a warning if it isn't set alongside TLS	2022-04-05 11:15:59 -07:00
Riddhi Shah	ec1ae5eca1	[OSS] Supported dataplane features gRPC endpoint Adds a new gRPC service and endpoint to return the list of supported consul dataplane features. The Consul Dataplane will use this API to customize its interaction with that particular server.	2022-04-05 07:38:58 -07:00
Dan Upton	a70e1886c9	WatchRoots gRPC endpoint (#12678 ) Adds a new gRPC streaming endpoint (WatchRoots) that dataplane clients will use to fetch the current list of active Connect CA roots and receive new lists whenever the roots are rotated.	2022-04-05 15:26:14 +01:00
Dhia Ayachi	83720e5737	add a rate limiter to config auto-reload (#12490 ) * add config watcher to the config package * add logging to watcher * add test and refactor to add WatcherEvent. * add all API calls and fix a bug with recreated files * add tests for watcher * remove the unnecessary use of context * Add debug log and a test for file rename * use inode to detect if the file is recreated/replaced and only listen to create events. * tidy ups (#1535) * tidy ups * Add tests for inode reconcile * fix linux vs windows syscall * fix linux vs windows syscall * fix windows compile error * increase timeout * use ctime ID * remove remove/creation test as it's a use case that fail in linux * fix linux/windows to use Ino/CreationTime * fix the watcher to only overwrite current file id * fix linter error * fix remove/create test * set reconcile loop to 200 Milliseconds * fix watcher to not trigger event on remove, add more tests * on a remove event try to add the file back to the watcher and trigger the handler if success * fix race condition * fix flaky test * fix race conditions * set level to info * fix when file is removed and get an event for it after * fix to trigger handler when we get a remove but re-add fail * fix error message * add tests for directory watch and fixes * detect if a file is a symlink and return an error on Add * rename Watcher to FileWatcher and remove symlink deref * add fsnotify@v1.5.1 * fix go mod * do not reset timer on errors, rename OS specific files * rename New func * events trigger on write and rename * add missing test * fix flaking tests * fix flaky test * check reconcile when removed * delete invalid file * fix test to create files with different mod time. * back date file instead of sleeping * add watching file in agent command. 
* fix watcher call to use new API * add configuration and stop watcher when server stop * add certs as watched files * move FileWatcher to the agent start instead of the command code * stop watcher before replacing it * save watched files in agent * add add and remove interfaces to the file watcher * fix remove to not return an error * use `Add` and `Remove` to update certs files * fix tests * close events channel on the file watcher even when the context is done * extract `NotAutoReloadableRuntimeConfig` is a separate struct * fix linter errors * add Ca configs and outgoing verify to the not auto reloadable config * add some logs and fix to use background context * add tests to auto-config reload * remove stale test * add tests to changes to config files * add check to see if old cert files still trigger updates * rename `NotAutoReloadableRuntimeConfig` to `StaticRuntimeConfig` * fix to re add both key and cert file. Add test to cover this case. * review suggestion Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * add check to static runtime config changes * fix test * add changelog file * fix review comments * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * update flag description Co-authored-by: FFMMM <FFMMM@users.noreply.github.com> * fix compilation error * add static runtime config support * fix test * fix review comments * fix log test * Update .changelog/12329.txt Co-authored-by: Dan Upton <daniel@floppy.co> * transfer tests to runtime_test.go * fix filewatcher Replace to not deadlock. * avoid having lingering locks Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * split ReloadConfig func * fix warning message Co-authored-by: R.B. 
Boyer <4903+rboyer@users.noreply.github.com> * convert `FileWatcher` into an interface * fix compilation errors * fix tests * extract func for adding and removing files * add a coalesceTimer with a very small timer * extract coaelsce Timer and add a shim for testing * add tests to coalesceTimer fix to send remaining events * set `coalesceTimer` to 1 Second * support symlink, fix a nil deref. * fix compile error * fix compile error * refactor file watcher rate limiting to be a Watcher implementation * fix linter issue * fix runtime config * fix runtime test * fix flaky tests * fix compile error * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * fix agent New to return an error if File watcher New return an error * quit timer loop if ctx is canceled * Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> Co-authored-by: FFMMM <FFMMM@users.noreply.github.com> Co-authored-by: Daniel Upton <daniel@floppy.co> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-04-04 11:31:39 -04:00
Eric Haberkorn	61af7947f9	Merge pull request #12681 from hashicorp/lambda-patching-tweaks Tweak the Lambda Envoy configuration generated by the serverless patcher	2022-04-01 19:59:30 -04:00
FFMMM	973d2d0f9a	mark disable_compat_1.9 to deprecate in 1.13, change default to true (#12675 ) Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-04-01 10:35:56 -07:00
R.B. Boyer	cb82949ac6	xds: errors from the xds serverless plugin are fatal (#12682 )	2022-04-01 10:30:26 -05:00
Eric	5682f3ce1f	Tweak the Lambda Envoy configuration generated by the serverless patcher - Move from `strip_matching_host_port` to `strip_any_host_port` - Remove `auto_host_rewrite` since it conflicts with `strip_any_host_port`	2022-04-01 11:13:44 -04:00
Eric Haberkorn	26cfbc70b0	Merge pull request #12676 from hashicorp/implement-lambda-patching Implement Lambda Patching in the Serverless Plugin	2022-04-01 09:58:56 -04:00
Mark Anderson	018edc222e	Avoid using sys/mounts to enable namespaces (#12655 ) * Avoid doing list of /sys/mounts From an internal ticket "Support standard "Vault namespace in the path" semantics for Connect Vault CA Provider" Vault allows the namespace to be specified as a prefix in the path of a PKI definition, but this doesn't currently work for `IntermediatePKIPath` specifications, because we attempt to list all of the paths to check if ours is already defined. This doesn't really work in a namespaced world. This changes the IntermediatePKIPath code to follow the same pattern as the root key, where we directly get the key rather than listing. This code is difficult to write automated tests for because it relies on features of Vault Enterprise, which isn't currently part of our test framework, so it was tested manually. Signed-off-by: Mark Anderson <manderson@hashicorp.com> * add changelog Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-31 23:35:38 -07:00
Kyle Havlovitz	1a3b885027	Use the GatewayService SNI field for upstream SAN validation	2022-03-31 13:54:25 -07:00
Eric	e0a15690ae	Implement Lambda Patching in the Serverless Plugin	2022-03-31 16:45:32 -04:00
Kyle Havlovitz	059bd0a92e	Merge pull request #12670 from hashicorp/token-read-expanded oss: Add expanded token read flag and endpoint option	2022-03-31 12:24:11 -07:00
Kyle Havlovitz	f8efe9a208	Log a warning when a terminating gateway service has TLS but not SNI configured	2022-03-31 12:18:40 -07:00
Dhia Ayachi	16b19dd82d	auto-reload configuration when config files change (#12329 ) * add config watcher to the config package * add logging to watcher * add test and refactor to add WatcherEvent. * add all API calls and fix a bug with recreated files * add tests for watcher * remove the unnecessary use of context * Add debug log and a test for file rename * use inode to detect if the file is recreated/replaced and only listen to create events. * tidy ups (#1535) * tidy ups * Add tests for inode reconcile * fix linux vs windows syscall * fix linux vs windows syscall * fix windows compile error * increase timeout * use ctime ID * remove remove/creation test as it's a use case that fail in linux * fix linux/windows to use Ino/CreationTime * fix the watcher to only overwrite current file id * fix linter error * fix remove/create test * set reconcile loop to 200 Milliseconds * fix watcher to not trigger event on remove, add more tests * on a remove event try to add the file back to the watcher and trigger the handler if success * fix race condition * fix flaky test * fix race conditions * set level to info * fix when file is removed and get an event for it after * fix to trigger handler when we get a remove but re-add fail * fix error message * add tests for directory watch and fixes * detect if a file is a symlink and return an error on Add * rename Watcher to FileWatcher and remove symlink deref * add fsnotify@v1.5.1 * fix go mod * do not reset timer on errors, rename OS specific files * rename New func * events trigger on write and rename * add missing test * fix flaking tests * fix flaky test * check reconcile when removed * delete invalid file * fix test to create files with different mod time. * back date file instead of sleeping * add watching file in agent command. 
* fix watcher call to use new API * add configuration and stop watcher when server stop * add certs as watched files * move FileWatcher to the agent start instead of the command code * stop watcher before replacing it * save watched files in agent * add add and remove interfaces to the file watcher * fix remove to not return an error * use `Add` and `Remove` to update certs files * fix tests * close events channel on the file watcher even when the context is done * extract `NotAutoReloadableRuntimeConfig` is a separate struct * fix linter errors * add Ca configs and outgoing verify to the not auto reloadable config * add some logs and fix to use background context * add tests to auto-config reload * remove stale test * add tests to changes to config files * add check to see if old cert files still trigger updates * rename `NotAutoReloadableRuntimeConfig` to `StaticRuntimeConfig` * fix to re add both key and cert file. Add test to cover this case. * review suggestion Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * add check to static runtime config changes * fix test * add changelog file * fix review comments * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * update flag description Co-authored-by: FFMMM <FFMMM@users.noreply.github.com> * fix compilation error * add static runtime config support * fix test * fix review comments * fix log test * Update .changelog/12329.txt Co-authored-by: Dan Upton <daniel@floppy.co> * transfer tests to runtime_test.go * fix filewatcher Replace to not deadlock. * avoid having lingering locks Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * split ReloadConfig func * fix warning message Co-authored-by: R.B. 
Boyer <4903+rboyer@users.noreply.github.com> * convert `FileWatcher` into an interface * fix compilation errors * fix tests * extract func for adding and removing files Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> Co-authored-by: FFMMM <FFMMM@users.noreply.github.com> Co-authored-by: Daniel Upton <daniel@floppy.co>	2022-03-31 15:11:49 -04:00
Kyle Havlovitz	b21b4346b4	Add expanded token read flag and endpoint option	2022-03-31 10:49:49 -07:00
FFMMM	1adfd7b94c	polish rpc.service.call metric behavior (#12624 )	2022-03-31 10:49:37 -07:00
Paul Glass	706c844423	Add IAM Auth Method (#12583 ) This adds an aws-iam auth method type which supports authenticating to Consul using AWS IAM identities. Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-03-31 10:18:48 -05:00
Eric Haberkorn	458b1838db	Merge pull request #12659 from hashicorp/bump-go-control-plane Bump Go Control Plane	2022-03-30 15:07:47 -04:00
R.B. Boyer	e79ce8ab03	xds: adding control of the mesh-wide min/max TLS versions and cipher suites from the mesh config entry (#12601 ) - `tls.incoming`: applies to the inbound mTLS targeting the public listener on `connect-proxy` and `terminating-gateway` envoy instances - `tls.outgoing`: applies to the outbound mTLS dialing upstreams from `connect-proxy` and `ingress-gateway` envoy instances Fixes #11966	2022-03-30 13:43:59 -05:00
R.B. Boyer	c98f2acd75	similar bump	2022-03-30 13:28:00 -05:00
R.B. Boyer	33fcc83d00	fail on error and use ptypes.MarshalAny for now instead of anypb.New	2022-03-30 13:27:49 -05:00
Eric	e4b4f175ed	Bump go-control-plane * `go get cloud.google.com/go@v0.59.0` * `go get github.com/envoyproxy/go-control-plane@v0.9.9` * `make envoy-library` * Bump protoc to 3.15.8	2022-03-30 13:11:27 -04:00
R.B. Boyer	ac5bea862a	server: ensure that service-defaults meta is incorporated into the discovery chain response (#12511 ) Also add a new "Default" field to the discovery chain response to clients	2022-03-30 10:04:18 -05:00
FFMMM	bbab030beb	introduce EmptyReadRequest for status_endpoint (#12653 ) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2022-03-29 18:05:45 -07:00
Fulvio	b94bbf7f43	remove DualStack field from check TCP #12629 (#12630 )	2022-03-29 14:56:01 -04:00
Eric	5f050614e5	remove the rest of gogo	2022-03-28 17:34:41 -04:00
Eric	02d8a11ff0	remove gogo from acl protobufs	2022-03-28 16:20:56 -04:00
Connor	922619dfc3	Fix leaked Vault LifetimeRenewers (#12607 ) * Fix leaked Vault LifetimeRenewers When the Vault CA Provider is reconfigured we do not stop the LifetimeRenewers which can cause them to leak until the Consul process recycles. On Configure execute stopWatcher if it exists and is not nil before starting a new renewal * Add jitter before restarting the LifetimeWatcher If we fail to log in to Vault or our token is no longer valid we can overwhelm a Vault instance with many requests very quickly by restarting the LifetimeWatcher. Before restarting the LifetimeWatcher provide a backoff time of 1 second or less. * Use a retry.Waiter instead of RandomStagger * changelog * gofmt'd * Swap out bool for atomic.Uint32 in test * Provide some extra clarification in comment and changelog	2022-03-28 09:58:16 -05:00
Eric	5cab213e81	assorted changes required to remove gogo	2022-03-25 09:55:36 -04:00
FFMMM	c39854de78	fix bad oss sync, use gauges not counters (#12611 )	2022-03-24 14:41:30 -07:00
Kyle Havlovitz	3b736d6a0c	Merge pull request #12596 from hashicorp/overview-endpoint oss: Add overview UI internal endpoint	2022-03-24 14:27:54 -07:00
Mike Morris	f8a2ae2606	agent: convert listener config to TLS types (#12522 ) * tlsutil: initial implementation of types/TLSVersion tlsutil: add test for parsing deprecated agent TLS version strings tlsutil: return TLSVersionInvalid with error tlsutil: start moving tlsutil cipher suite lookups over to types/tls tlsutil: rename tlsLookup to ParseTLSVersion, add cipherSuiteLookup agent: attempt to use types in runtime config agent: implement b.tlsVersion validation in config builder agent: fix tlsVersion nil check in builder tlsutil: update to renamed ParseTLSVersion and goTLSVersions tlsutil: fixup TestConfigurator_CommonTLSConfigTLSMinVersion tlsutil: disable invalid config parsing tests tlsutil: update tests auto_config: lookup old config strings from base.TLSMinVersion auto_config: update endpoint tests to use TLS types agent: update runtime_test to use TLS types agent: update TestRuntimeCinfig_Sanitize.golden agent: update config runtime tests to expect TLS types * website: update Consul agent tls_min_version values * agent: fixup TLS parsing and compilation errors * test: fixup lint issues in agent/config_runtime_test and tlsutil/config_test * tlsutil: add CHACHA20_POLY1305 cipher suites to goTLSCipherSuites * test: revert autoconfig tls min version fixtures to old format * types: add TLSVersions public function * agent: add warning for deprecated TLS version strings * agent: move agent config specific logic from tlsutil.ParseTLSVersion into agent config builder * tlsutil(BREAKING): change default TLS min version to TLS 1.2 * agent: move ParseCiphers logic from tlsutil into agent config builder * tlsutil: remove unused CipherString function * agent: fixup import for types package * Revert "tlsutil: remove unused CipherString function" This reverts commit `6ca7f6f58d`. 
* agent: fixup config builder and runtime tests * tlsutil: fixup one remaining ListenerConfig -> ProtocolConfig * test: move TLS cipher suites parsing test from tlsutil into agent config builder tests * agent: remove parseCiphers helper from auto_config_endpoint_test * test: remove unused imports from tlsutil * agent: remove resolved FIXME comment * tlsutil: remove TODO and FIXME in cipher suite validation * agent: prevent setting inherited cipher suite config when TLS 1.3 is specified * changelog: add entry for converting agent config to TLS types * agent: remove FIXME in runtime test, this is covered in builder tests with invalid tls9 value now * tlsutil: remove config tests for values checked at agent config builder boundary * tlsutil: remove tls version check from loadProtocolConfig * tlsutil: remove tests and TODOs for logic checked in TestBuilder_tlsVersion and TestBuilder_tlsCipherSuites * website: update search link for supported Consul agent cipher suites * website: apply review suggestions for tls_min_version description * website: attempt to clean up markdown list formatting for tls_min_version * website: moar linebreaks to fix tls_min_version formatting * Revert "website: moar linebreaks to fix tls_min_version formatting" This reverts commit `3858592742`. * autoconfig: translate old values for TLSMinVersion * agent: rename var for translated value of deprecated TLS version value * Update agent/config/deprecated.go Co-authored-by: Dan Upton <daniel@floppy.co> * agent: fix lint issue * agent: fixup deprecated config test assertions for updated warning Co-authored-by: Dan Upton <daniel@floppy.co>	2022-03-24 15:32:25 -04:00
Kyle Havlovitz	a559de63dd	Sort by partition/ns/servicename instead of the reverse	2022-03-24 12:16:05 -07:00
FFMMM	26717b470a	[metrics][rpc]: add basic prefix filter test for new rpc metric (#12598 ) Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-03-23 13:29:12 -07:00
Kyle Havlovitz	0d9c99b227	Clean up ent meta id usage in overview summary	2022-03-23 12:47:12 -07:00
Eric	776f5843d0	remove gogo from pbservice	2022-03-23 12:18:01 -04:00
Mark Anderson	5590da2732	Fixup dropped SecretID usage Looks like something got munged at some point. Not sure how it slipped in, but my best guess is that because TestTxn_Apply_ACLDeny is marked flaky we didn't block merge because it failed. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-22 21:20:03 -07:00
Kyle Havlovitz	e530fbfb33	oss: Add overview UI internal endpoint	2022-03-22 17:05:09 -07:00
Dhia Ayachi	72a997242b	split `pbcommon` to `pbcommon` and `pbcommongogo` (#12587 ) * mogify needed pbcommon structs * mogify needed pbconnect structs * fix compilation errors and make config_translate_test pass * add missing file * remove redundant oss func declaration * fix EnterpriseMeta to copy the right data for enterprise * rename pbcommon package to pbcommongogo * regenerate proto and mog files * add missing mog files * add pbcommon package * pbcommon no mog * fix enterprise meta code generation * fix enterprise meta code generation (pbcommongogo) * fix mog generation for gogo * use `protoc-go-inject-tag` to inject tags * rename proto package * pbcommon no mog * use `protoc-go-inject-tag` to inject tags * add non gogo proto to make file * fix proto get	2022-03-22 16:30:00 -04:00
Dan Upton	f8e2e3c710	streaming: emit events when Connect CA Roots change (#12590 ) OSS sync of enterprise changes at 614f786d	2022-03-22 19:13:59 +00:00
FFMMM	a7e5ee005a	factor out recording func, add unit tests (#12585 ) Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-03-22 09:31:54 -07:00
Dan Upton	7298967070	Restructure gRPC server setup (#12586 ) OSS sync of enterprise changes at 0b44395e	2022-03-22 12:40:24 +00:00
FFMMM	e5ebc47a94	pre register new rpc metric, rename metric (#12582 )	2022-03-21 17:26:32 -07:00
Mark Anderson	fa63aed1fa	Add source of authority annotations to the PermissionDeniedError output. (#12567 ) This extends the acl.AllowAuthorizer with source of authority information. The next step is to unify the AllowAuthorizer and ACLResolveResult structures; that will be done in a separate PR. Part of #12481 Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-18 10:32:25 -07:00
Dan Upton	b36d4e16b6	Support per-listener TLS configuration ⚙️ (#12504 ) Introduces the capability to configure TLS differently for Consul's listeners/ports (i.e. HTTPS, gRPC, and the internal multiplexed RPC port) which is useful in scenarios where you may want the HTTPS or gRPC interfaces to present a certificate signed by a well-known/public CA, rather than the certificate used for internal communication which must have a SAN in the form `server.<dc>.consul`.	2022-03-18 10:46:58 +00:00
Evan Culver	e3e481022e	lib: add validation package + DNS label validation (#12535 ) Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-03-17 18:31:28 -07:00
FFMMM	db27ea3484	[sync oss] add net/rpc interceptor implementation (#12573 ) * sync ent changes from 866dcb0667 Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * update oss go.mod Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-03-17 16:02:26 -07:00
Jared Kirschner	6c84083307	Merge pull request #11821 from hashicorp/error-if-get-request-has-body http: error if GET request has non-empty body	2022-03-16 18:34:27 -04:00
Jared Kirschner	c73267f318	http: WARN if GET request has non-empty body Give the user a hint that they might be doing something wrong if their GET request has a non-empty body, which can easily happen using curl's --data-urlencode if specifying request type via "--request GET" rather than "--get". See https://github.com/hashicorp/consul/issues/11471.	2022-03-16 14:19:50 -07:00
Eric	eea8300187	Remove the stdduration gogo extension	2022-03-16 12:12:29 -04:00
mrspanishviking	7180c99960	Revert "[Docs] Agent configuration hierarchy "	2022-03-15 16:13:58 -07:00
trujillo-adam	4151dc097a	fixing merge conflicts part 3	2022-03-15 15:25:03 -07:00
Eric Haberkorn	e92dd9dc9a	Merge pull request #12556 from hashicorp/wire-up-serverless-patcher Create and wire up the serverless patcher	2022-03-15 14:05:40 -04:00
Eric Haberkorn	fc3c0f312c	Merge pull request #12557 from hashicorp/remove-healthcheck-gogo-stdduration Remove Gogo Stdduration From the Healthcheck Protobufs	2022-03-15 13:20:49 -04:00
Eric	4e6b34725d	Remove gogo stdduration from the healthcheck protobufs	2022-03-15 10:51:40 -04:00
Eric	cf3e517d0e	Create and wire up the serverless patcher	2022-03-15 10:12:57 -04:00
trujillo-adam	76d55ac2b4	merging new hierarchy for agent configuration	2022-03-14 15:44:41 -07:00
Mark Anderson	676ea58bc4	Refactor config checks oss (#12550 ) Currently the config_entry.go subsystem delegates authorization decisions via the ConfigEntry interface CanRead and CanWrite code. Unfortunately this returns a true/false value and loses the details of the source. This is not helpful, especially since the config subsystem can be more complex to understand, since it covers so many domains. This refactors CanRead/CanWrite to return a structured error message (PermissionDenied or the like) with more details about the reason for denial. Part of #12241 Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-11 13:45:51 -08:00
Eric Haberkorn	d59364fa7f	Merge pull request #12536 from hashicorp/add-serverless-config Add the `connect.enable_serverless_plugin` configuration option	2022-03-11 09:39:36 -05:00
Eric Haberkorn	44609c0ca5	Merge pull request #12539 from hashicorp/make-xds-lib Make the xdscommon package	2022-03-11 09:21:10 -05:00
Eric	3302b2eec2	Add the `connect.enable_serverless_plugin` configuration option.	2022-03-11 09:16:00 -05:00
Mark Anderson	aaefe15613	Bulk acl message fixup oss (#12470 ) * First pass for helper for bulk changes Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Convert ACLRead and ACLWrite to new form Signed-off-by: Mark Anderson <manderson@hashicorp.com> * AgentRead and AgentWrite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fix EventWrite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * KeyRead, KeyWrite, KeyList Signed-off-by: Mark Anderson <manderson@hashicorp.com> * KeyRing Signed-off-by: Mark Anderson <manderson@hashicorp.com> * NodeRead NodeWrite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * OperatorRead and OperatorWrite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * PreparedQuery Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Intention partial Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fix ServiceRead, Write, etc Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Error check ServiceRead? Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fix SessionRead/Write Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fixup snapshot ACL Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Error fixups for txn Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Add changelog Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fixup review comments Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-10 18:48:27 -08:00
Eric	f5c9fa6fa6	Make an xdscommon package that will be shared between Consul and Envoy plugins	2022-03-08 14:57:23 -05:00
Eric Haberkorn	abfcde1bc6	Merge pull request #12529 from hashicorp/add-meta-to-service-config-response Add `Meta` to `ServiceConfigResponse`	2022-03-07 16:35:21 -05:00
Eric Haberkorn	9d0ec2eec2	Code review changes	2022-03-07 14:39:33 -05:00
R.B. Boyer	2a56e0055b	proxycfg: change how various proxycfg test helpers for making ConfigSnapshot copies work to be more correct and less error prone (#12531 ) Prior to this PR for the envoy xDS golden tests in the agent/xds package we were hand-creating a proxycfg.ConfigSnapshot structure in the proper format for input to the xDS generator. Over time this intermediate structure has gotten trickier to build correctly for the various tests. This PR proposes to switch to using the existing mechanism for turning a structs.NodeService and a sequence of cache.UpdateEvent copies into a proxycfg.ConfigSnapshot, as that is less error prone to construct and aligns more with how the data arrives. NOTE: almost all of this is in test-related code. I tried super hard to craft correct event inputs to get the golden files to be the same, or similar enough after construction to feel ok that I recreated the spirit of the original test cases.	2022-03-07 11:47:14 -06:00
Eric	f7cc7ff5cd	Add `Meta` to `ServiceConfigResponse`	2022-03-07 10:05:18 -05:00
R.B. Boyer	8307e40f2b	reduce flakiness/raciness of errNotFound and errNotChanged blocking query tests (#12518 ) Improves tests from #12362 These tests try to set up the following concurrent scenario: 1. (goroutine 1) execute read RPC with index=0 2. (goroutine 1) get response from (1) @ index=10 3. (goroutine 1) execute read RPC with index=10 and block 4. (goroutine 2) WHILE (3) is blocking, start slamming the system with stray writes that will cause the WatchSet to wake up 5. (goroutine 2) after doing all writes, shut down the reader above 6. (goroutine 1) stops reading, double checks that it only ever woke up once (from 1)	2022-03-04 11:20:01 -06:00
R.B. Boyer	9268715697	server: fix spurious blocking query suppression for discovery chains (#12512 ) Minor fix for behavior in #12362 IsDefault sometimes returns true even if there was a proxy-defaults or service-defaults config entry that was consulted. This PR fixes that.	2022-03-03 16:54:41 -06:00
Daniel Nephin	5ba994a73f	Merge pull request #12298 from jorgemarey/b-persistnewrootandconfig Avoid raft change when no config is provided on persistNewRootAndConfig	2022-03-03 11:03:50 -05:00
Daniel Nephin	161206e24d	ca: make sure the test fails without the fix Also change the path used for the secondary so that both primary and secondary do not overwrite each other.	2022-03-02 18:22:49 -05:00
R.B. Boyer	58e053c336	raft: upgrade to v1.3.6 (#12496 ) Add additional protections on the Consul side to prevent NonVoters from bootstrapping raft. This should un-flake TestServer_Expect_NonVoters	2022-03-02 17:00:02 -06:00
Daniel Nephin	73c91ed80f	Merge pull request #12467 from hashicorp/dnephin/ci-vault-test-safer ca: require that tests that use Vault are named correctly	2022-03-01 12:54:02 -05:00
R.B. Boyer	6666832077	test: parallelize more of TestLeader_ReapOrLeftMember_IgnoreSelf (#12468 ) before: $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf ok github.com/hashicorp/consul/agent/consul 21.147s after: $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf ok github.com/hashicorp/consul/agent/consul 5.402s	2022-03-01 10:30:06 -06:00
Jorge Marey	f429c1a5d9	Fix vault test with suggested changes	2022-03-01 10:20:00 +01:00
Jorge Marey	1a0baf4024	Add test case to verify #12298	2022-03-01 09:25:52 +01:00
Jorge Marey	4375dd2409	Avoid raft change when no config is provided on CAmanager - This avoids a change to the raft store when no roots or config are provided to persistNewRootAndConfig	2022-03-01 09:25:52 +01:00
Daniel Nephin	d669226784	ca: fix a test This test does not use Vault, so does not need ca.SkipIfVaultNotPresent	2022-02-28 16:26:18 -05:00
Daniel Nephin	1f00ede559	ca: require that tests that use Vault are named correctly Previously we were using two different criteria to decide where to run a test. The main `go-test` job would skip Vault tests based on the presence of the `vault` binary, but the `test-connect-ca-providers` job would run tests based on the name. This led to a scenario where a test may never run in CI. To fix this problem I added a name check to the function we use to skip the test. This should ensure that any test that requires vault is named correctly to be run as part of the `test-connect-ca-providers` job. At the same time I relaxed the regex we use. I verified this runs the same tests using `go test --list Vault`. I made this change because a bunch of tests in `agent/connect/ca` used `Vault` in the name, without the underscores. Instead of changing a bunch of test names, this seemed easier. With this approach, the worst case is that we run a few extra tests in the `test-connect-ca-providers` job, which doesn't seem like a problem.	2022-02-28 16:13:53 -05:00
R.B. Boyer	7b0548dd8d	server: suppress spurious blocking query returns where multiple config entries are involved (#12362 ) Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect: - `DiscoveryChain.Get` - `ConfigEntry.ResolveServiceConfig` - `Intentions.Match` All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism. ### No Config Entries Found In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110. ### No Change Since Last Wakeup Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110. This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report. 
### Extra Endpoints Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated: - `ConfigEntry.List` - `Internal.IntentionUpstreams` (tproxy)	2022-02-25 15:46:34 -06:00
Chris S. Kim	25f4a425d1	Merge pull request #12442 from danieleva/12422-keyring Allows keyring operations on client agents	2022-02-25 16:28:56 -05:00
Evan Culver	522676ed8d	connect: Update supported Envoy versions to include 1.19.3 and 1.18.6	2022-02-24 16:59:33 -08:00
Evan Culver	b95f010ac0	connect: Upgrade Envoy 1.20 to 1.20.2 (#12443 )	2022-02-24 16:19:39 -08:00
R.B. Boyer	ca112f8721	fix flaky test panic (#12446 )	2022-02-24 17:35:46 -06:00
R.B. Boyer	957146401e	catalog: compare node names case insensitively in more places (#12444 ) Many places in consul already treated node names case insensitively. The state store indexes already do it, but there are a few places that did a direct byte comparison which have now been corrected. One place of particular consideration is ensureCheckIfNodeMatches which is executed during snapshot restore (among other places). If a node check used a slightly different casing than the casing of the node during register then the snapshot restore here would deterministically fail. This has been fixed. Primary approach: git grep -i "node.[!=]=.node" -- ':!_test.go' ':!docs' git grep -i '\[[^]]member[^]]\] git grep -i '\[[^]]$member\\|name\\|node$[^]]\]' -- ':!_test.go' ':!website' ':!ui' ':!agent/proxycfg/testing.go:' ':!*.md'	2022-02-24 16:54:47 -06:00
Daniele Vazzola	e76ca318dc	Allows keyring operations on client agents	2022-02-24 17:24:57 +00:00
R.B. Boyer	64271289ec	server: partly fix config entry replication issue that prevents replication in some circumstances (#12307 ) There are some cross-config-entry relationships that are enforced during "graph validation" at persistence time that are required to be maintained. This means that config entries may form a digraph at times. Config entry replication proceeds in a particular sorted order by kind and name. Occasionally there are some fixups to these digraphs that end up replicating in the wrong order and replicating the leaves (ingress-gateway) before the roots (service-defaults) leading to replication halting due to a graph validation error related to things like mismatched service protocol requirements. This PR changes replication to give each computed change (upsert/delete) a fair shot at being applied before deciding to terminate that round of replication in error. In the case where we've simply tried to do the operations in the wrong order at least ONE of the outstanding requests will complete in the right order, leading the subsequent round to have fewer operations to do, with a smaller likelihood of graph validation errors. This does not address all scenarios, but for scenarios where the edits are being applied in the wrong order this should avoid replication halting. Fixes #9319 The scenario that is NOT ADDRESSED by this PR is as follows: 1. create: service-defaults: name=new-web, protocol=http 2. create: service-defaults: name=old-web, protocol=http 3. create: service-resolver: name=old-web, redirect-to=new-web 4. delete: service-resolver: name=old-web 5. update: service-defaults: name=old-web, protocol=grpc 6. update: service-defaults: name=new-web, protocol=grpc 7. create: service-resolver: name=old-web, redirect-to=new-web If you shut down dc2 just before (4) and turn it back on after (7) replication is impossible as there is no single edit you can make to make forward progress.	2022-02-23 17:27:48 -06:00
Chris S. Kim	ea47f066d7	Merge pull request #12430 from hashicorp/ci/main-assetfs-build auto-updated agent/uiserver/bindata_assetfs.go from commit `73b6687c5`	2022-02-23 18:19:30 -05:00
Daniel Nephin	771df290d7	Merge pull request #11910 from hashicorp/dnephin/ca-provider-interface-for-ica-in-primary ca: add support for an external trusted CA	2022-02-22 13:14:52 -05:00
R.B. Boyer	8b987a4d59	configentry: make a new package to hold shared config entry structs that aren't used for RPC or the FSM (#12384 ) First two candidates are ConfigEntryKindName and DiscoveryChainConfigEntries.	2022-02-22 10:36:36 -06:00
Dhia Ayachi	cd9d8d44a5	file watcher to be used for configuration auto-reload feature (#12301 ) * add config watcher to the config package * add logging to watcher * add test and refactor to add WatcherEvent. * add all API calls and fix a bug with recreated files * add tests for watcher * remove the unnecessary use of context * Add debug log and a test for file rename * use inode to detect if the file is recreated/replaced and only listen to create events. * tidy ups (#1535) * tidy ups * Add tests for inode reconcile * fix linux vs windows syscall * fix linux vs windows syscall * fix windows compile error * increase timeout * use ctime ID * remove remove/creation test as it's a use case that fail in linux * fix linux/windows to use Ino/CreationTime * fix the watcher to only overwrite current file id * fix linter error * fix remove/create test * set reconcile loop to 200 Milliseconds * fix watcher to not trigger event on remove, add more tests * on a remove event try to add the file back to the watcher and trigger the handler if success * fix race condition * fix flaky test * fix race conditions * set level to info * fix when file is removed and get an event for it after * fix to trigger handler when we get a remove but re-add fail * fix error message * add tests for directory watch and fixes * detect if a file is a symlink and return an error on Add * rename Watcher to FileWatcher and remove symlink deref * add fsnotify@v1.5.1 * fix go mod * fix flaky test * Apply suggestions from code review Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> * fix a possible stack overflow * do not reset timer on errors, rename OS specific files * start the watcher when creating it * fix data race in tests * rename New func * do not call handler when a remove event happen * events trigger on write and rename * fix watcher tests * make handler async * remove recursive call * do not produce events for sub directories * trim "/" at the end of a directory when adding * add missing 
test * fix logging * add todo * fix failing test * fix flaking tests * fix flaky test * add logs * fix log text * increase timeout * reconcile when remove * check reconcile when removed * fix reconcile move test * fix logging * delete invalid file * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * fix review comments * fix is watched to properly catch a remove * change test timeout * fix test and rename id * fix test to create files with different mod time. * fix deadlock when stopping watcher * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * fix a deadlock when calling stop while emitting event is blocked * make sure to close the event channel after the event loop is done * add go doc * back date file instead of sleeping * Apply suggestions from code review Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * check error Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-02-21 11:36:52 -05:00
hc-github-team-consul-core	ad14a2bffd	auto-updated agent/uiserver/bindata_assetfs.go from commit `73b6687c5`	2022-02-21 12:27:52 +00:00
Evan Culver	602e08ada7	checks: populate interval and timeout when registering services (#11138 )	2022-02-18 12:05:33 -08:00
Kyle Havlovitz	362753cad7	Merge pull request #12385 from hashicorp/tproxy-http-upstream-fix xds: respect chain protocol on default discovery chain	2022-02-18 10:08:59 -08:00
Daniel Nephin	dc484ee09e	rpc: set response to nil when not found Otherwise when the query times out we might incorrectly send a value for the reply, when we should send an empty reply. Also document errNotFound and how to handle the result in that case.	2022-02-18 12:26:06 -05:00
Daniel Nephin	6021105dfc	ca: test that original certs from secondary still verify There's a chance this could flake if the secondary hasn't received the update yet, but running this test many times doesn't show any flakes yet.	2022-02-17 18:45:16 -05:00
Daniel Nephin	6b679aa9d4	Update TODOs to reference an issue with more details And remove a no longer needed TODO	2022-02-17 18:21:30 -05:00
Daniel Nephin	1853a32df6	ca: add test cases for rotating external trusted CA	2022-02-17 18:21:30 -05:00
Daniel Nephin	5e8ea2a039	ca: add a test for secondary with external CA	2022-02-17 18:21:30 -05:00
Daniel Nephin	42ec34d101	ca: examine the full chain in newCARoot make TestNewCARoot much more strict compare the full result instead of only a few fields. add a test case with 2 and 3 certificates in the pem	2022-02-17 18:21:30 -05:00
Daniel Nephin	71f3ae04e2	ca: small docs improvements	2022-02-17 18:21:30 -05:00
Daniel Nephin	86994812ed	ca: cleanup validateSetIntermediate	2022-02-17 18:21:30 -05:00
Daniel Nephin	c1c1580bf8	ca: only return the leaf cert from Sign in vault provider The interface is documented as 'Sign will only return the leaf', and the other providers only return the leaf. It seems like this was added during the initial implementation, so is likely just something we missed. It doesn't break anything, but it does cause confusing cert chains in the API response which could break something in the future.	2022-02-17 18:21:30 -05:00
Daniel Nephin	85ecbaf109	Merge pull request #12110 from hashicorp/dnephin/blocking-queries-not-found rpc: make blocking queries for non-existent items more efficient	2022-02-17 18:09:39 -05:00
Ashwin Venkatesh	6e6cd928a2	Parse datacenter from request (#12370 ) * Parse datacenter from request - Parse the value of the datacenter from the create/delete requests for AuthMethods and BindingRules so that they can be created in and deleted from the datacenters specified in the request.	2022-02-17 16:41:27 -05:00
Kyle Havlovitz	3fe358b831	xds: respect chain protocol on default discovery chain	2022-02-17 11:47:20 -08:00
Florian Apolloner	f01f00fc84	Support for connect native services in topology view. (#12098 )	2022-02-16 16:51:54 -05:00
Chris S. Kim	154b781bc8	Move IndexEntryName helpers to common files (#12365 )	2022-02-16 12:56:38 -05:00
Daniel Nephin	8a6e75ac81	rpc: add errNotFound to all Get queries Any query that returns a list of items is not part of this commit.	2022-02-15 18:24:34 -05:00
Daniel Nephin	4b33bdf396	Make blockingQuery efficient with 'not found' results. By using the query results as state. Blocking queries are efficient when the query matches some results, because the ModifyIndex of those results, returned as queryMeta.Index, will never change unless the items themselves change. Blocking queries for non-existent items are not efficient because the queryMeta.Index can (and often does) change when other entities are written. This commit reduces the churn of these queries by using a different comparison for "has changed". Instead of using the modified index, we use the existence of the results. If the previous result was "not found" and the new result is still "not found", we know we can ignore the modified index and continue to block. This is done by setting the minQueryIndex to the returned queryMeta.Index, which prevents the query from returning before a state change is observed.	2022-02-15 18:24:33 -05:00
Daniel Nephin	897b953f66	Add a test for blocking query on non-existent entry This test shows how blocking queries are not efficient when the query returns no results. The test fails with 100+ calls instead of the expected 2. This test is still a bit flaky because it depends on the timing of the writes. It can sometimes return 3 calls. A future commit should fix this and make blocking queries even more optimal for not-found results.	2022-02-15 18:23:17 -05:00
Daniel Nephin	3301f94004	rpc: improve docs for blockingQuery Follow the Go convention of accepting a small interface that documents the methods used by the function. Clarify the rules for implementing a query function passed to blockingQuery.	2022-02-15 14:20:14 -05:00
R.B. Boyer	115946da99	server: conditionally avoid writing a config entry to raft if it was already the same (#12321) This will both save on unnecessary raft operations as well as unnecessarily incrementing the raft modify index of config entries subject to no-op updates.	2022-02-14 14:39:12 -06:00
FFMMM	78264a8030	Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311 ) This commit syncs ENT changes to the OSS repo. Original commit details in ENT: ``` commit 569d25f7f4578981c3801e6e067295668210f748 Author: FFMMM <FFMMM@users.noreply.github.com> Date: Thu Feb 10 10:23:33 2022 -0800 Vendor fork net rpc (#1538) * replace net/rpc w consul-net-rpc/net/rpc Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * replace msgpackrpc and go-msgpack with fork from mono repo Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * gofmt all files touched Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> ``` Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-02-14 09:45:45 -08:00
R.B. Boyer	52009ae86a	missed this test adjustment (#12331)	2022-02-14 11:39:00 -06:00
R.B. Boyer	fa4577d1a9	local: fixes a data race in anti-entropy sync (#12324) The race detector noticed this initially in `TestAgentConfigWatcherSidecarProxy` but it is not restricted to just tests. The two main changes here were: - ensure that before we mutate the internal `agent/local` representation of a Service (for tags or VIPs) we clone those fields - ensure that there's no function argument joint ownership between the caller of a function and the local state when calling `AddService`, `AddCheck`, and related using `copystructure` for now.	2022-02-14 10:41:33 -06:00
Dao Thanh Tung	add15e12f7	URL-encode/decode resource names for HTTP API part 5 (#12297)	2022-02-14 10:47:06 -05:00
Mark Anderson	1a16f7ee70	Refactor to make ACL errors more structured. (#12308) * First phase of refactoring PermissionDeniedError Add extended type PermissionDeniedByACLError that captures information about the accessor, particular permission type and the object and name of the thing being checked. It may be worth folding the test and error return into a single helper function, that can happen at a later date. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-02-11 12:53:23 -08:00
Freddy	9580f79f86	Merge pull request #12223 from hashicorp/proxycfg/passthrough-cleanup	2022-02-10 17:35:51 -07:00
freddygv	ceb52d649a	Account for upstream targets in another DC. Transparent proxies typically cannot dial upstreams in remote datacenters. However, if their upstream configures a redirect to a remote DC then the upstream targets will be in another datacenter. In that sort of case we should use the WAN address for the passthrough.	2022-02-10 17:01:57 -07:00
freddygv	cbea3d203c	Fix race of upstreams with same passthrough ip Due to timing, a transparent proxy could have two upstreams to dial directly with the same address. For example: - The orders service can dial upstreams shipping and payment directly. - An instance of shipping at address 10.0.0.1 is deregistered. - Payments is scaled up and scheduled to have address 10.0.0.1. - The orders service receives the event for the new payments instance before seeing the deregistration for the shipping instance. At this point two upstreams have the same passthrough address and Envoy will reject the listener configuration. To disambiguate this commit considers the Raft index when storing passthrough addresses. In the example above, 10.0.0.1 would only be associated with the newer payments service instance.	2022-02-10 17:01:57 -07:00
freddygv	659ebc05a9	Ensure passthrough addresses get cleaned up Transparent proxies can set up filter chains that allow direct connections to upstream service instances. Services that can be dialed directly are stored in the PassthroughUpstreams map of the proxycfg snapshot. Previously these addresses were not being cleaned up based on new service health data. The list of addresses associated with an upstream service would only ever grow. As services scale up and down, eventually they will have instances assigned to an IP that was previously assigned to a different service. When IP addresses are duplicated across filter chain match rules the listener config will be rejected by Envoy. This commit updates the proxycfg snapshot management so that passthrough addresses can get cleaned up when no longer associated with a given upstream. There is still the possibility of a race condition here where due to timing an address is shared between multiple passthrough upstreams. That concern is mitigated by #12195, but will be further addressed in a follow-up.	2022-02-10 17:01:57 -07:00
Freddy	378a7258e3	Prevent xDS tight loop on cfg errors (#12195)	2022-02-10 15:37:36 -07:00
Dhia Ayachi	4f0a71d7b4	fix race when starting a service while the agent `serviceManager` is … (#12302) * fix race when starting a service while the agent `serviceManager` is stopping * add changelog	2022-02-10 13:30:49 -05:00
Daniel Nephin	01784470f3	Merge pull request #12277 from hashicorp/dnephin/panic-in-service-register catalog: initialize the refs map to prevent a nil panic	2022-02-09 19:48:22 -05:00
Daniel Nephin	82c264b2b3	config-entry: fix a panic when registering a service or ingress gateway	2022-02-09 18:49:48 -05:00
R.B. Boyer	89bd1f57b5	xds: allow only one outstanding delta request at a time (#12236) Fixes #11876 This enforces that multiple xDS mutations are not issued on the same ADS connection at once, so that we can 100% control the order that they are applied. The original code made assumptions about the way multiple in-flight mutations were applied on the Envoy side that were incorrect.	2022-02-08 10:36:48 -06:00
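
Commit cbea3d203c above describes disambiguating two upstreams that share a passthrough address by considering the Raft index when storing the address. The Go sketch below illustrates that idea only. The names `passthroughEntry`, `passthroughTable`, and `Assign` are hypothetical and are not taken from the Consul source; the actual proxycfg snapshot code may be structured quite differently.

```go
package main

import "fmt"

// passthroughEntry is a hypothetical record of which upstream currently owns
// an address and the Raft index of the event that assigned it.
type passthroughEntry struct {
	Upstream string
	Index    uint64
}

// passthroughTable maps an IP address to its current owner (assumed layout).
type passthroughTable map[string]passthroughEntry

// Assign associates addr with upstream unless an entry with an equal or newer
// Raft index already holds the address. It reports whether the table changed.
func (t passthroughTable) Assign(addr, upstream string, index uint64) bool {
	if cur, ok := t[addr]; ok && cur.Index >= index {
		return false // stale or duplicate event: keep the newer owner
	}
	t[addr] = passthroughEntry{Upstream: upstream, Index: index}
	return true
}

func main() {
	table := passthroughTable{}

	// The shipping instance at 10.0.0.1 was observed first (older index).
	table.Assign("10.0.0.1", "shipping", 7)
	// The new payments instance arrives before shipping's deregistration.
	table.Assign("10.0.0.1", "payments", 12)
	// A late, stale update for shipping must not reclaim the address.
	table.Assign("10.0.0.1", "shipping", 7)

	fmt.Println(table["10.0.0.1"].Upstream)
}
```

Keying the table by address means each IP has at most one owner, so in this sketch two upstreams can never share a passthrough address; the index comparison decides which one wins.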

5112 Commits (c384f244608d87a37460148adc41f9c287385387)