consul

Commit Graph

Author	SHA1	Message	Date
Kyle Havlovitz	bd0eb07ed3	Add /v1/internal/service-virtual-ip for manually setting service VIPs (#17294 )	2023-05-12 00:38:52 +00:00
R.B. Boyer	cd80ea18ff	grpc: ensure grpc resolver correctly uses lan/wan addresses on servers (#17270 ) The grpc resolver implementation is fed from changes to the router.Router. Within the router there is a map of various areas storing the addressing information for servers in those areas. All map entries are of the WAN variety except a single special entry for the LAN. Addressing information in the LAN "area" are local addresses intended for use when making a client-to-server or server-to-server request. The client agent correctly updates this LAN area when receiving lan serf events, so by extension the grpc resolver works fine in that scenario. The server agent only initially populates a single entry in the LAN area (for itself) on startup, and then never mutates that area map again. For normal RPCs a different structure is used for LAN routing. Additionally when selecting a server to contact in the local datacenter it will randomly select addresses from either the LAN or WAN addressed entries in the map. Unfortunately this means that the grpc resolver stack as it exists on server agents is either broken or only accidentally functions by having servers dial each other over the WAN-accessible address. If the operator disables the serf wan port completely likely this incidental functioning would break. This PR enforces that local requests for servers (both for stale reads or leader forwarded requests) exclusively use the LAN "area" information and also fixes it so that servers keep that area up to date in the router. A test for the grpc resolver logic was added, as well as a higher level full-stack test to ensure the externally perceived bug does not return.	2023-05-11 11:08:57 -05:00
Dan Upton	5030101cdb	resource: add missing validation to the `List` and `WatchList` endpoints (#17213 )	2023-05-10 10:38:48 +01:00
Derek Menteer	5ecab506a6	Fix ent bug caused by #17241 . (#17278 ) Fix ent bug caused by #17241 All tests passed in OSS, but not ENT. This is a patch to resolve the problem for both.	2023-05-09 16:36:29 -05:00
cskh	48f7d99305	snapshot: some improvements to the snapshot process (#17236 ) * snapshot: some improvements to the snapshot process Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2023-05-09 15:28:52 -04:00
Semir Patel	40eefaba18	Reaper controller for cascading deletes of owner resources (#17256 )	2023-05-09 13:57:40 -05:00
Freddy	7c3e9cd862	Hash namespace+proxy ID when creating socket path (#17204 ) UNIX domain socket paths are limited to 104-108 characters, depending on the OS. This limit was quite easy to exceed when testing the feature on Kubernetes, due to how proxy IDs encode the Pod ID eg: metrics-collector-59467bcb9b-fkkzl-hcp-metrics-collector-sidecar-proxy To ensure we stay under that character limit this commit makes a couple changes: - Use a b64 encoded SHA1 hash of the namespace + proxy ID to create a short and deterministic socket file name. - Add validation to proxy registrations and proxy-defaults to enforce a limit on the socket directory length.	2023-05-09 12:20:26 -06:00
Dan Upton	d53a1d4a27	resource: add helpers for more efficiently comparing IDs etc (#17224 )	2023-05-09 19:02:24 +01:00
Derek Menteer	4f6da20fe5	Fix multiple issues related to proxycfg health queries. (#17241 ) Fix multiple issues related to proxycfg health queries. 1. The datacenter was not being provided to a proxycfg query, which resulted in bypassing agentless query optimizations and using the normal API instead. 2. The health rpc endpoint would return a zero index when insufficient ACLs were detected. This would result in the agent cache performing an infinite loop of queries in rapid succession without backoff.	2023-05-09 12:37:58 -05:00
Dan Upton	972998203e	controller: deduplicate items in queue (#17168 )	2023-05-09 18:14:20 +01:00
Dan Upton	6e1bc57469	Controller Runtime	2023-05-09 15:25:55 +01:00
Matt Keeler	34915670f2	Register new catalog & mesh protobuf types with the resource registry (#17225 )	2023-05-08 15:36:35 -04:00
Derek Menteer	50ef6a697e	Fix issue with peer stream node cleanup. (#17235 ) Fix issue with peer stream node cleanup. This commit encompasses a few problems that are closely related due to their proximity in the code. 1. The peerstream utilizes node IDs in several locations to determine which nodes / services / checks should be cleaned up or created. While VM deployments with agents will likely always have a node ID, agentless uses synthetic nodes and does not populate the field. This means that for consul-k8s deployments, all services were likely bundled together into the same synthetic node in some code paths (but not all), resulting in strange behavior. The Node.Node field should be used instead as a unique identifier, as it should always be populated. 2. The peerstream cleanup process for unused nodes uses an incorrect query for node deregistration. This query is NOT namespace aware and results in the node (and corresponding services) being deregistered prematurely whenever it has zero default-namespace services and 1+ non-default-namespace services registered on it. This issue is tricky to find due to the incorrect logic mentioned in #1, combined with the fact that the affected services must be co-located on the same node as the currently deregistering service for this to be encountered. 3. The stream tracker did not understand differences between services in different namespaces and could therefore report incorrect numbers. It was updated to utilize the full service name to avoid conflicts and return proper results.	2023-05-08 13:13:25 -05:00
Semir Patel	991a002fcc	resource: List resources by owner (#17190 )	2023-05-08 12:26:19 -05:00
Dan Upton	917afcf3c6	controller: make the `WorkQueue` generic (#16982 )	2023-05-05 15:38:22 +01:00
John Eikenberry	bd76fdeaeb	enable auto-tidy expired issuers in vault (as CA) When using vault as a CA and generating the local signing cert, try to enable the PKI endpoint's auto-tidy feature with it set to tidy expired issuers.	2023-05-03 20:30:37 +00:00
Nathan Coleman	bdef22354b	Use auth context when evaluating service read permissions (#17207 ) Co-authored-by: Blake Covarrubias <1812+blake@users.noreply.github.com>	2023-05-02 16:23:42 -04:00
Poonam Jadhav	ef5d54fd4c	feat: add no-op reporting background routine (#17178 )	2023-04-28 20:07:03 -04:00
Eric Haberkorn	2c0da88ce7	fix panic in `injectSANMatcher` when `tlsContext` is `nil` (#17185 )	2023-04-28 16:27:57 -04:00
Paul Glass	e4a341c88a	Permissive mTLS: Config entry filtering and CLI warnings (#17183 ) This adds filtering for service-defaults: consul config list -filter 'MutualTLSMode == "permissive"'. It adds CLI warnings when the CLI writes a config entry and sees that either service-defaults or proxy-defaults contains MutualTLSMode=permissive, or sees that the mesh config entry contains AllowEnablingPermissiveMutualTLSMode=true.	2023-04-28 12:51:36 -05:00
R.B. Boyer	6b4986907d	peering: ensure that merged central configs of peered upstreams for partitioned downstreams work (#17179 ) Partitioned downstreams with peered upstreams could not properly merge central config info (i.e. proxy-defaults and service-defaults things like mesh gateway modes) if the upstream had an empty DestinationPartition field in Enterprise. Due to data flow, if this setup is done using Consul client agents the field is never empty and thus does not experience the bug. When a service is registered directly to the catalog, as is the case for consul-dataplane use, this field may be empty and the internal machinery of the merging function doesn't handle this well. This PR ensures the internal machinery of that function is referentially self-consistent.	2023-04-28 12:36:08 -05:00
Semir Patel	1037bf7f69	Sync .golangci.yml from ENT (#17180 )	2023-04-28 17:14:37 +00:00
John Landa	eded58b62a	Remove artificial ACLTokenMaxTTL limit for configuring acl token expiry (#17066 ) * Remove artificial ACLTokenMaxTTL limit for configuring acl token expiry * Add changelog * Remove test on default MaxTokenTTL * Change to imperative tense for changelog entry	2023-04-28 10:57:30 -05:00
Semir Patel	9fef1c7f17	Create tombstone on resource `Delete` (#17108 )	2023-04-28 10:49:08 -05:00
Dan Upton	eff5dd1812	resource: owner references must include a uid (#17169 )	2023-04-28 11:22:42 +01:00
Freddy	e02ef16f02	Update HCP bootstrapping to support existing clusters (#16916 ) * Persist HCP management token from server config We want to move away from injecting an initial management token into Consul clusters linked to HCP. The reasoning is that by using a separate class of token we can have more flexibility in terms of allowing HCP's token to co-exist with the user's management token. Down the line we can also more easily adjust the permissions attached to HCP's token to limit its scope. With these changes, the cloud management token is like the initial management token in that it has the same global management policy and if it is created it effectively bootstraps the ACL system. * Update SDK and mock HCP server The HCP management token will now be sent in a special field rather than as Consul's "initial management" token configuration. This commit also updates the mock HCP server to more accurately reflect the behavior of the CCM backend. * Refactor HCP bootstrapping logic and add tests We want to allow users to link Consul clusters that already exist to HCP. Existing clusters need care when bootstrapped by HCP, since we do not want to do things like change ACL/TLS settings for a running cluster. Additional changes: * Deconstruct MaybeBootstrap so that it can be tested. The HCP Go SDK requires HTTPS to fetch a token from the Auth URL, even if the backend server is mocked. By pulling the hcp.Client creation out we can modify its TLS configuration in tests while keeping the secure behavior in production code. * Add light validation for data received/loaded. * Sanitize initial_management token from received config, since HCP will only ever use the CloudConfig.MangementToken. * Add changelog entry	2023-04-27 22:27:39 +02:00
John Maguire	391ed069c4	APIGW: Update how status conditions for certificates are handled (#17115 ) * Move status condition for invalid certificate to reference the listener that is using the certificate * Fix where we set the condition status for listeners and certificate refs, added tests * Add changelog	2023-04-27 15:54:44 +00:00
Semir Patel	5eaeb7b8e5	Support Envoy's MaxEjectionPercent and BaseEjectionTime config entries for passive health checks (#15979 ) * Add MaxEjectionPercent to config entry * Add BaseEjectionTime to config entry * Add MaxEjectionPercent and BaseEjectionTime to protobufs * Add MaxEjectionPercent and BaseEjectionTime to api * Fix integration test breakage * Verify MaxEjectionPercent and BaseEjectionTime in integration test upstream configs * Website docs for MaxEjectionPercent and BaseEjection time * Add `make docs` to browse docs at http://localhost:3000 * Changelog entry * so that is the difference between consul-docker and dev-docker * blah * update proto funcs * update proto --------- Co-authored-by: Maliz <maliheh.monshizadeh@hashicorp.com>	2023-04-26 15:59:48 -07:00
Michael Wilkerson	80b1dbcc7d	fixed aliases for sameness group (sameness_group) (#17161 )	2023-04-26 14:53:23 -07:00
Eric Haberkorn	a87115c598	add acl filter logs (#17143 )	2023-04-26 10:57:35 -04:00
Dan Upton	faae7bb5f2	testing: `RunResourceService` helper (#17068 )	2023-04-26 11:57:10 +01:00
Semir Patel	e7bb8fdf15	Fix or disable pipeline breaking changes that made it into main in last day or so (#17130 ) * Fix straggler from renaming Register->RegisterTypes * somehow a lint failure got through previously * Fix lint-consul-retry errors * adding in fix for success jobs getting skipped. (#17132) * Temporarily disable inmem backend conformance test to get green pipeline * Another test needs disabling --------- Co-authored-by: John Murret <john.murret@hashicorp.com>	2023-04-25 15:17:48 -05:00
Dan Upton	b9c485dcb8	Controller Supervision (#17016 )	2023-04-25 12:52:35 +01:00
John Maguire	e47f3216e5	APIGW Normalize Status Conditions (#16994 ) * normalize status conditions for gateways and routes * Added tests for checking condition status and panic conditions for validating combinations, added dummy code for fsm store * get rid of unneeded gateway condition generator struct * Remove unused file * run go mod tidy * Update tests, add conflicted gateway status * put back removed status for test * Fix linting violation, remove custom conflicted status * Update fsm commands oss * Fix incorrect combination of type/condition/status * cleaning up from PR review * Change "invalidCertificate" to be of accepted status * Move status condition enums into api package * Update gateways controller and generated code * Update conditions in fsm oss tests * run go mod tidy on consul-container module to fix linting * Fix type for gateway endpoint test * go mod tidy from changes to api * go mod tidy on troubleshoot * Fix route conflicted reason * fix route conflict reason rename * Fix text for gateway conflicted status * Add valid certificate ref condition setting * Revert change to resolved refs to be handled in future PR	2023-04-24 16:22:55 -04:00
Michael Wilkerson	001d540afc	Add sameness group field to prepared queries (#17089 ) * added method for converting SamenessGroupConfigEntry - added new method `ToQueryFailoverTargets` for converting a SamenessGroupConfigEntry's members to a list of QueryFailoverTargets - renamed `ToFailoverTargets` to `ToServiceResolverFailoverTargets` to distinguish it from `ToQueryFailoverTargets` * Added SamenessGroup to PreparedQuery - exposed Service.Partition to API when defining a prepared query - added a method for determining if a QueryFailoverOptions is empty - This will be useful for validation - added unit tests * added method for retrieving a SamenessGroup to state store * added logic for using PQ with SamenessGroup - added branching path for SamenessGroup handling in execute. It will be handled separate from the normal PQ case - added a new interface so that the `GetSamenessGroupFailoverTargets` can be properly tested - separated the execute logic into a `targetSelector` function so that it can be used for both failover and sameness group PQs - split OSS only methods into new PQ OSS files - added validation that `samenessGroup` is an enterprise only feature * added documentation for PQ SamenessGroup	2023-04-24 13:21:28 -07:00
Derek Menteer	a33b224a55	Fix virtual services being included in intention topology as downstreams. (#17099 )	2023-04-24 12:03:26 -05:00
Semir Patel	46816071df	De-scope tenancy requirements to OSS only for now. (#17087 ) Partition and namespace must be "default". Peername must be "local".	2023-04-24 08:14:51 -05:00
Kyle Havlovitz	6d01d07cf8	Include virtual services from discovery chain in intention topology (#16862 )	2023-04-21 16:58:13 +00:00
Kyle Havlovitz	d5277af70d	Add manual virtual IP support to state store (#16815 )	2023-04-21 09:19:02 -07:00
Eric Haberkorn	53cdda8d17	Fix a bug with disco chain config entry fetching (#17078 ) Before this change, we were not fetching service resolvers (and therefore service defaults) configuration entries for services on members of sameness groups.	2023-04-21 09:18:32 -04:00
Semir Patel	53f49b2fa1	Enforce operator:write acl on `WriteStatus` endpoint (#17019 )	2023-04-20 16:25:33 +00:00
Eric Haberkorn	b1fae05983	Add sameness groups to service intentions. (#17064 )	2023-04-20 12:16:04 -04:00
hashicorp-copywrite[bot]	9f81fc01e9	[COMPLIANCE] Add Copyright and License Headers (#16854 ) Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> Co-authored-by: Ronald <roncodingenthusiast@users.noreply.github.com>	2023-04-20 12:40:22 +00:00
Paul Glass	f4406e69b9	[NET-3091] Update service intentions to support jwt provider references (#17037 ) * [NET-3090] Add new JWT provider config entry * Add initial test cases * update validations for jwt-provider config entry fields * more validation * start improving tests * more tests * Normalize * Improve tests and move validate fns * usage test update * Add split between ent and oss for partitions * fix lint issues * Added retry backoff, fixed tests, removed unused defaults * take into account default partitions * use countTrue and add aliases * omit audiences if empty * fix failing tests * add omit-entry * Add JWT intentions * generate proto * fix deep copy issues * remove extra field * added some tests * more tests * add validation for creating existing jwt * fix nil issue * More tests, fix conflicts and improve memdb call * fix namespace * add aliases * consolidate errors, skip duplicate memdb calls * reworked iteration over config entries * logic improvements from review --------- Co-authored-by: Ronald Ekambi <ronekambi@gmail.com>	2023-04-19 18:16:39 -04:00
Paul Glass	ac200cfec8	[NET-3090] Add new JWT provider config entry (#17036 ) * [NET-3090] Add new JWT provider config entry * Add initial test cases * update validations for jwt-provider config entry fields * more validation * start improving tests * more tests * Normalize * Improve tests and move validate fns * usage test update * Add split between ent and oss for partitions * fix lint issues * Added retry backoff, fixed tests, removed unused defaults * take into account default partitions * use countTrue and add aliases * omit audiences if empty * fix failing tests * add omit-entry * update copyright headers ids --------- Co-authored-by: Ronald Ekambi <ronekambi@gmail.com> Co-authored-by: Ronald <roncodingenthusiast@users.noreply.github.com>	2023-04-19 17:54:14 -04:00
Paul Glass	77ecff3209	Permissive mTLS (#17035 ) This implements permissive mTLS, which allows toggling services into "permissive" mTLS mode. Permissive mTLS mode allows incoming "non Consul-mTLS" traffic to be forwarded unmodified to the application. * Update service-defaults and proxy-defaults config entries with a MutualTLSMode field * Update the mesh config entry with an AllowEnablingPermissiveMutualTLS field and implement the necessary validation. AllowEnablingPermissiveMutualTLS must be true to allow changing to MutualTLSMode=permissive, but this does not require that all proxy-defaults and service-defaults are currently in strict mode. * Update xDS listener config to add a "permissive filter chain" when MutualTLSMode=permissive for a particular service. The permissive filter chain matches incoming traffic by the destination port. If the destination port matches the service port from the catalog, then no mTLS is required and the traffic sent is forwarded unmodified to the application.	2023-04-19 14:45:00 -05:00
R.B. Boyer	d07aac8d7e	Revert "cache: refactor agent cache fetching to prevent unnecessary f… (#16818 ) (#17046 ) Revert "cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956)" Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>	2023-04-19 13:17:21 -05:00
John Murret	2cefa8d9bd	ci: remove test-integrations CircleCI workflow (#16928 ) * remove all CircleCI files * remove references to CircleCI * remove more references to CircleCI * pin golangci-lint to v1.51.1 instead of v1.51	2023-04-19 16:19:29 +00:00
Luke Kysow	46212cc570	Don't send updates twice (#16999 )	2023-04-18 10:41:58 -07:00
Poonam Jadhav	5d7a7ff041	feat: set up reporting agent (#16991 )	2023-04-18 11:03:05 -04:00
Dan Upton	a37a441991	server: wire up in-process Resource Service (#16978 )	2023-04-18 10:03:23 +01:00
Semir Patel	2f7d591702	Tenancy wildcard validation for `Write`, `Read`, and `Delete` endpoints (#17004 )	2023-04-17 16:33:20 -05:00
Derek Menteer	87324c9ec8	Add PrioritizeByLocality to config entries. (#17007 ) This commit adds the PrioritizeByLocality field to both proxy-config and service-resolver config entries for locality-aware routing. The field is currently intended for enterprise only, and will be used to enable prioritization of service-mesh connections to services based on geographical region / zone.	2023-04-14 15:42:54 -05:00
Michael Wilkerson	0dd4ea2033	* added Sameness Group to proto files (#16998 ) - added Sameness Group to config entries - added Sameness Group to subscriptions * generated proto files * added Sameness Group events to the state store - added test cases * Refactored health RPC Client - moved code that is common to rpcclient under rpcclient common.go. This will help set us up to support future RPC clients * Refactored proxycfg glue views - Moved views to rpcclient config entry. This will allow us to reuse this code for a config entry client * added config entry RPC Client - Copied most of the testing code from rpcclient/health * hooked up new rpcclient in agent * fixed documentation and comments for clarity	2023-04-14 09:24:46 -07:00
Dhia Ayachi	79d4040b6c	add IP rate limiting config update (#16997 ) * add IP rate limiting config update * fix review comments	2023-04-14 09:26:38 -04:00
Semir Patel	79b30476e0	Enforce Owner rules in `Write` endpoint (#16983 )	2023-04-14 08:19:46 -05:00
Semir Patel	8611ec56f3	Fix delete when uid not provided (#16996 )	2023-04-14 08:18:24 -05:00
Eric Haberkorn	44b39240a8	move enterprise test cases out of open source (#16985 )	2023-04-13 09:07:06 -04:00
Semir Patel	b8c9e133be	Add mutate hook to `Write` endpoint (#16958 )	2023-04-12 16:50:07 -05:00
Semir Patel	3b83c7ee9a	Enforce ACLs on resource `Write` and `Delete` endpoints (#16956 )	2023-04-12 16:22:44 -05:00
Dhia Ayachi	b85a149eaf	Memdb Txn Commit race condition fix (#16871 ) * Add a test to reproduce the race condition * Fix race condition by publishing the event after the commit and adding a lock to prevent out of order events. * split publish to generate the list of events before committing the transaction. * add changelog * remove extra func * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * add comment to explain test --------- Co-authored-by: Dan Upton <daniel@floppy.co>	2023-04-12 13:18:01 -04:00
Poonam Jadhav	8255cc97f5	feat: add reporting config with reload (#16890 )	2023-04-11 15:04:02 -04:00
Dan Upton	d595e6ade9	resource: `WriteStatus` endpoint (#16886 )	2023-04-11 19:23:14 +01:00
Derek Menteer	1bcaeabfc3	Remove deprecated service-defaults upstream behavior. (#16957 ) Prior to this change, peer services would be targeted by service-default overrides as long as the new `peer` field was not found in the config entry. This commit removes that deprecated backwards-compatibility behavior. Now it is necessary to specify the `peer` field in order for upstream overrides to apply to a peer upstream.	2023-04-11 10:20:33 -05:00
Semir Patel	317240fca7	Resource validation hook for `Write` endpoint (#16950 )	2023-04-11 06:55:32 -05:00
Semir Patel	686f49346c	Check acls on resource `Read`, `List`, and `WatchList` (#16842 )	2023-04-11 06:10:14 -05:00
John Maguire	92be8bd762	APIGW: Routes with duplicate parents should be invalid (#16926 ) * ensure route parents are unique when creating an http route * Ensure tcp route parents are unique * Added unit tests	2023-04-10 13:20:32 -04:00
John Eikenberry	97173725b7	log warning about certificate expiring sooner and with more details The old setting of 24 hours was not enough time to deal with expiring certificates. This change ups it to 28 days OR 40% of the full cert duration, whichever is shorter. It also adds details to the log message to indicate which certificate it is logging about and a suggested action.	2023-04-07 20:38:07 +00:00
Chris Thain	175bb1a303	Wasm Envoy HTTP extension (#16877 )	2023-04-06 14:12:07 -07:00
Semir Patel	1794484298	Resource `Delete` endpoint (#16756 )	2023-04-06 08:58:54 -05:00
Dan Upton	4fa2537b3b	Resource `Write` endpoint (#16786 )	2023-04-06 10:40:04 +01:00
Dan Upton	671d5825ca	Raft storage backend (#16619 )	2023-04-04 17:30:06 +01:00
cskh	a319953576	docs: add envoy to the proxycfg diagram (#16834 ) * docs: add envoy to the proxycfg diagram	2023-04-04 09:42:42 -04:00
Freddy	f6de5ff635	Allow dialer to re-establish terminated peering (#16776 ) Currently, if an acceptor peer deletes a peering the dialer's peering will eventually get to a "terminated" state. If the two clusters need to be re-peered the acceptor will re-generate the token but the dialer will encounter this error on the call to establish: "failed to get addresses to dial peer: failed to refresh peer server addresses, will continue to use initial addresses: there is no active peering for "<<<ID>>>"" This is because in `exchangeSecret().GetDialAddresses()` we will get an error if fetching addresses for an inactive peering. The peering shows up as inactive at this point because of the existing terminated state. Rather than checking whether a peering is active we can instead check whether it was deleted. This way users do not need to delete terminated peerings in the dialing cluster before re-establishing them.	2023-04-03 12:07:45 -06:00
Chris S. Kim	a5397b1f23	Connect CA Primary Provider refactor (#16749 ) * Rename Intermediate cert references to LeafSigningCert Within the Consul CA subsystem, the term "Intermediate" is confusing because the meaning changes depending on provider and datacenter (primary vs secondary). For example, when using the Consul CA the "ActiveIntermediate" may return the root certificate in a primary datacenter. At a high level, we are interested in knowing which CA is responsible for signing leaf certs, regardless of its position in a certificate chain. This rename makes the intent clearer. * Move provider state check earlier * Remove calls to GenerateLeafSigningCert GenerateLeafSigningCert (formerly known as GenerateIntermediate) is vestigial in non-Vault providers, as it simply returns the root certificate in primary datacenters. By folding Vault's intermediate cert logic into `GenerateRoot` we can encapsulate the intermediate cert handling within `newCARoot`. * Move GenerateLeafSigningCert out of PrimaryProvider Now that the Vault Provider calls GenerateLeafSigningCert within GenerateRoot, we can remove the method from all other providers that never used it in a meaningful way. * Add test for IntermediatePEM * Rename GenerateRoot to GenerateCAChain "Root" was being overloaded in the Consul CA context, as different providers and configs resulted in a single root certificate or a chain originating from an external trusted CA. Since the Vault provider also generates intermediates, it seems more accurate to call this a CAChain.	2023-04-03 11:40:33 -04:00
Eric Haberkorn	a6d69adcf5	Add default resolvers to disco chains based on the default sameness group (#16837 )	2023-03-31 14:35:56 -04:00
Derek Menteer	8d40cf9858	Add sameness-group to exported-services config entries (#16836 ) This PR adds the sameness-group field to exported-service config entries, which allows for services to be exported to multiple destination partitions / peers easily.	2023-03-31 12:36:44 -05:00
Dan Upton	651549c97d	storage: fix resource leak in Watch (#16817 )	2023-03-31 13:24:19 +01:00
Eric Haberkorn	0d1d2fc4c9	add order by locality failover to Consul enterprise (#16791 )	2023-03-30 10:08:38 -04:00
Ronald	b64674623e	Copyright headers for missing files/folders (#16708 ) * copyright headers for agent folder	2023-03-28 18:48:58 -04:00
Ronald	94ec4eb2f4	copyright headers for agent folder (#16704 ) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files	2023-03-28 14:39:22 -04:00
John Maguire	c833464daf	Update normalization of route refs (#16789 ) * Use merge of enterprise meta's rather than new custom method * Add merge logic for tcp routes * Add changelog * Normalize certificate refs on gateways * Fix infinite call loop * Explicitly call enterprise meta	2023-03-28 11:23:49 -04:00
Michael Wilkerson	e5d58c59c9	changes to support new PQ enterprise fields (#16793 )	2023-03-27 15:40:49 -07:00
Semir Patel	440f11203f	Resource service List(..) endpoint (#16753 )	2023-03-27 16:25:27 -05:00
Dhia Ayachi	10df4d83aa	add ip rate limiter controller OSS parts (#16790 )	2023-03-27 17:00:25 -04:00
Kyle Havlovitz	42c5b29713	Allocate virtual ip for resolver/router/splitter config entries (#16760 )	2023-03-27 13:04:24 -07:00
Semir Patel	032aba3175	WatchList(..) endpoint for the resource service (#16726 )	2023-03-27 14:37:54 -05:00
John Maguire	351bdc3c0d	Fix struct tags for TCPService enterprise meta (#16781 ) * Fix struct tags for TCPService enterprise meta * Add changelog	2023-03-27 16:17:04 +00:00
Semir Patel	3415689eb6	Read(...) endpoint for the resource service (#16655 )	2023-03-27 10:35:39 -05:00
Derek Menteer	2236975011	Change partition for peers in discovery chain targets (#16769 ) This commit swaps the partition field to the local partition for discovery chains targeting peers. Prior to this change, peer upstreams would always use a value of default regardless of which partition they exist in. This caused several issues in xds / proxycfg because of id mismatches. Some prior fixes were made to deal with one-off id mismatches that this PR also cleans up, since they are no longer needed.	2023-03-24 15:40:19 -05:00
John Eikenberry	0b1dc4ec36	tests instantiating clients w/o shutting down (#16755 ) noticed via their port still in use messages.	2023-03-24 16:54:11 +00:00
Poonam Jadhav	3df271959c	fix: remove unused tenancy category from rate limit spec (#16740 )	2023-03-23 12:14:59 -04:00
Dhia Ayachi	3ba0eb5074	delete config when nil (#16690 ) * delete config when nil * fix mock interface implementation * fix handler test to use the right assertion * extract DeleteConfig as a separate API. * fix mock limiter implementation to satisfy the new interface * fix failing tests * add test comments	2023-03-22 15:19:54 -04:00
Eric Haberkorn	495ad4c7ef	add enterprise xds tests (#16738 )	2023-03-22 14:56:18 -04:00
Eric Haberkorn	3c5c53aa80	fix bug where pqs that failover to a cluster peer dont un-fail over (#16729 )	2023-03-22 09:24:13 -04:00
cskh	7f6f6891f7	fix: gracefully fail on invalid port number (#16721 )	2023-03-21 22:29:21 -04:00
John Maguire	8dd1d73874	Remove unused are hosts set check (#16691 ) * Remove unused are hosts set check * Remove all traces of unused 'AreHostsSet' parameter * Remove unused Hosts attribute * Remove commented out use of snap.APIGateway.Hosts	2023-03-21 16:23:23 +00:00
Nitya Dhanushkodi	b9bd2c3780	peering: peering partition failover fixes (#16673 ) add local source partition for peered upstreams	2023-03-20 10:00:29 -07:00
John Maguire	1ef9f4dade	Fix route subscription when using namespaces (#16677 ) * Fix route subscription when using namespaces * Update changelog * Fix changelog entry to reference that the bug was enterprise only	2023-03-20 12:42:30 -04:00
Melisa Griffin	606f8fbbab	Adds check to verify that the API Gateway is being created with at least one listener	2023-03-20 12:37:30 -04:00
Poonam Jadhav	9c64731a56	feat: add category annotation to RPC and gRPC methods (#16646 )	2023-03-20 11:24:29 -04:00
Eric Haberkorn	7477f52a16	add sameness groups to discovery chains (#16671 )	2023-03-20 09:12:37 -04:00
Andrew Stucki	501b87fd31	[API Gateway] Fix invalid cluster causing gateway programming delay (#16661 ) * Add test for http routes * Add fix * Fix tests * Add changelog entry * Refactor and fix flaky tests	2023-03-17 13:31:04 -04:00
Eric Haberkorn	eaa39f4ef5	add sameness group support to service resolver failover and redirects (#16664 )	2023-03-17 10:48:06 -04:00
Eric Haberkorn	57e034b746	fix confusing spiffe ids in golden tests (#16643 )	2023-03-15 14:30:36 -04:00
wangxinyi7	152c75349e	net 2731 ip config entry OSS version (#16642 ) * ip config entry * name changing * move to ent * ent version * renaming * change format * renaming * refactor * add default values	2023-03-15 11:21:24 -07:00
John Maguire	ff5887a99e	Update e2e tests for namespaces (#16627 ) * Refactored "NewGatewayService" to handle namespaces, fixed TestHTTPRouteFlattening test * Fixed existing http_route tests for namespacing * Squash aclEnterpriseMeta for ResourceRefs and HTTPServices, accept namespace for creating connect services and regular services * Use require instead of assert after creating namespaces in http_route_tests * Refactor NewConnectService and NewGatewayService functions to use cfg objects to reduce number of method args * Rename field on SidecarConfig in tests from `SidecarServiceName` to `Name` to avoid stutter	2023-03-15 17:51:36 +00:00
Freddy	724b752ca7	Backport ENT-4704 (#16612 )	2023-03-14 14:55:11 -06:00
Derek Menteer	8f75d99299	Fix issue with trust bundle read ACL check. (#16630 ) This commit fixes an issue where trust bundles could not be read by services in a non-default namespace, unless they had excessive ACL permissions given to them. Prior to this change, `service:write` was required in the default namespace in order to read the trust bundle. Now, `service:write` to a service in any namespace is sufficient.	2023-03-14 12:24:33 -05:00
Chris S. Kim	d5677e5680	Preserve CARoots when updating Vault CA configuration (#16592 ) If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates.	2023-03-13 17:32:59 -04:00
Derek Menteer	f2902e6608	Add sameness-group configuration entry. (#16608 ) This commit adds a sameness-group config entry to the API and structs packages. It includes some validation logic and a new memdb index that tracks the default sameness-group for each partition. Sameness groups will simplify the effort of managing failovers / intentions / exports for peers and partitions. Note that this change is purely to introduce the configuration entry and does not include the full functionality of sameness-groups.	2023-03-13 16:19:11 -05:00
Ashvitha	f95ffe0355	Allow HCP metrics collection for Envoy proxies Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory where a unix socket will be created with the name `<namespace>_<proxy_id>.sock` to forward Envoy metrics. If set, this will configure: - In bootstrap configuration a local stats_sink and static cluster. These will forward metrics to a loopback listener sent over xDS. - A dynamic listener listening at the socket path that the previously defined static cluster is sending metrics to. - A dynamic cluster that will forward traffic received at this listener to the hcp-metrics-collector service. Reasons for having a static cluster pointing at a dynamic listener: - We want to secure the metrics stream using TLS, but the stats sink can only be defined in bootstrap config. With dynamic listeners/clusters we can use the proxy's leaf certificate issued by the Connect CA, which isn't available at bootstrap time. - We want to intelligently route to the HCP collector. Configuring its address at bootstrap time limits our flexibility routing-wise. More on this below. Reasons for defining the collector as an upstream in `proxycfg`: - The HCP collector will be deployed as a mesh service. - Certificate management is taken care of, as mentioned above. - Service discovery and routing logic is automatically taken care of, meaning that no code changes are required in the xds package. - Custom routing rules can be added for the collector using discovery chain config entries. Initially the collector is expected to be deployed to each admin partition, but in the future could be deployed centrally in the default partition. These config entries could even be managed by HCP itself.	2023-03-10 13:52:54 -07:00
Eric Haberkorn	e298f506a5	Add Peer Locality to Discovery Chains (#16588 ) Add peer locality to discovery chains	2023-03-10 12:59:47 -05:00
Eric Haberkorn	57e2493415	allow setting locality on services and nodes (#16581 )	2023-03-10 09:36:15 -05:00
Semir Patel	176945aa86	GRPC stub for the ResourceService (#16528 )	2023-03-09 13:40:23 -06:00
Andrew Stucki	040647e0ba	auto-updated agent/uiserver/dist/ from commit `63204b518` (#16587 ) Co-authored-by: hc-github-team-consul-core <github-team-consul-core@hashicorp.com>	2023-03-09 13:56:53 -05:00
Eric Haberkorn	89de91b263	fix bug that can lead to peering service deletes impacting the state of local services (#16570 )	2023-03-08 11:24:03 -05:00
Eric Haberkorn	dbaf8bf49c	add agent locality and replicate it across peer streams (#16522 )	2023-03-07 14:05:23 -05:00
John Eikenberry	f5641ffccc	support vault auth config for alicloud ca provider Add support for using existing vault auto-auth configurations as the provider configuration when using Vault's CA provider with AliCloud. AliCloud requires 2 extra fields to enable it to use STS (its preferred auth setup). Our vault-plugin-auth-alicloud package contained a method to help generate them as they require you to make an http call to a faked endpoint proxy to get them (url and headers base64 encoded).	2023-03-07 03:02:05 +00:00
Melisa Griffin	fc232326a0	NET-2904 Fixes API Gateway Route Service Weight Division Error	2023-03-06 08:41:57 -05:00
Melisa Griffin	129eca8fdb	NET-2903 Normalize weight for http routes (#16512 ) * NET-2903 Normalize weight for http routes * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2023-03-03 16:39:59 -05:00
R.B. Boyer	9a485cdb49	proxycfg: ensure that an irrecoverable error in proxycfg closes the xds session and triggers a replacement proxycfg watcher (#16497 ) Receiving an "acl not found" error from an RPC in the agent cache and the streaming/event components will cause any request loops to cease under the assumption that they will never work again if the token was destroyed. This prevents log spam (#14144, #9738). Unfortunately due to things like: - authz requests going to stale servers that may not have witnessed the token creation yet - authz requests in a secondary datacenter happening before the tokens get replicated to that datacenter - authz requests from a primary TO a secondary datacenter happening before the tokens get replicated to that datacenter The caller will get an "acl not found" before the token exists, rather than just after. The machinery added above in the linked PRs will kick in and prevent the request loop from looping around again once the tokens actually exist. For `consul-dataplane` usages, where xDS is served by the Consul servers rather than the clients ultimately this is not a problem because in that scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS stream needing data for a specific service in the catalog. If the watching goroutines are terminated it ripples down and terminates the xDS stream, which CDP will eventually re-establish and restart everything. For Consul client usages, the `agent/proxycfg` machinery is ahead-of-time launched at service registration time (called "local" in some of the proxycfg machinery) so when the xDS stream comes in the data is already ready to go. If the watching goroutines terminate it should terminate the xDS stream, but there's no mechanism to re-spawn the watching goroutines. If the xDS stream reconnects it will see no `ConfigSnapshot` and will not get one again until the client agent is restarted, or the service is re-registered with something changed in it. 
This PR fixes a few things in the machinery: - there was an inadvertent deadlock in fetching snapshot from the proxycfg machinery by xDS, such that when the watching goroutine terminated the snapshots would never be fetched. This caused some of the xDS machinery to get indefinitely paused and not finish the teardown properly. - Every 30s we now attempt to re-insert all locally registered services into the proxycfg machinery. - When services are re-inserted into the proxycfg machinery we special case "dead" ones such that we unilaterally replace them rather than doing that conditionally.	2023-03-03 14:27:53 -06:00
John Eikenberry	56ffee6d42	add provider ca support for approle auth-method Adds support for the approle auth-method. Only handles using the approle role/secret to auth and it doesn't support the agent's extra management configuration options (wrap and delete after read) as they are not required as part of the auth (ie. they are vault agent things).	2023-03-03 19:29:53 +00:00
Andrew Stucki	cc0765b87d	Fix resolution of service resolvers with subsets for external upstreams (#16499 ) * Fix resolution of service resolvers with subsets for external upstreams * Add tests * Add changelog entry * Update view filter logic	2023-03-03 14:17:11 -05:00
Eric Haberkorn	5f81662066	Add support for failover policies (#16505 )	2023-03-03 11:12:38 -05:00
Andrew Stucki	5deffbd95b	Fix issue where terminating gateway service resolvers weren't properly cleaned up (#16498 ) * Fix issue where terminating gateway service resolvers weren't properly cleaned up * Add integration test for cleaning up resolvers * Add changelog entry * Use state test and drop integration test	2023-03-03 09:56:57 -05:00
Andrew Stucki	4b661d1e0c	Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495 ) * Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable * Regenerate golden files * Add RequestTimeout field * Add changelog entry	2023-03-03 09:37:12 -05:00
John Eikenberry	e8eec1fa80	add provider ca auth support for kubernetes Adds support for Kubernetes jwt/token file based auth. Only needs to read the file and save the contents as the jwt/token.	2023-03-02 22:05:40 +00:00
John Eikenberry	4211069080	add provider ca support for jwt file base auth Adds support for a jwt token in a file. Simply reads the file and sends the read in jwt along to the vault login. It also supports a legacy mode with the jwt string being passed directly, in which case the path is made optional.	2023-03-02 20:33:06 +00:00
Chris S. Kim	321439f5a7	Speed up test by registering services concurrently (#16509 )	2023-03-02 14:36:44 -05:00
John Eikenberry	4f2d9a91e5	add provider ca auth-method support for azure Does the required dance with the local HTTP endpoint to get the required data for the jwt based auth setup in Azure. Keeps support for 'legacy' mode where all login data is passed on via the auth methods parameters. Refactored check for hardcoded /login fields.	2023-03-01 00:07:33 +00:00
Dan Upton	73b9b407ba	grpc: fix data race in balancer registration (#16229 ) Registering gRPC balancers is thread-unsafe because they are stored in a global map variable that is accessed without holding a lock. Therefore, it's expected that balancers are registered _once_ at the beginning of your program (e.g. in a package `init` function) and certainly not after you've started dialing connections, etc. > NOTE: this function must only be called during initialization time > (i.e. in an init() function), and is not thread-safe. While this is fine for us in production, it's challenging for tests that spin up multiple agents in-memory. We currently register a balancer per- agent which holds agent-specific state that cannot safely be shared. This commit introduces our own registry that _is_ thread-safe, and implements the Builder interface such that we can call gRPC's `Register` method once, on start-up. It uses the same pattern as our resolver registry where we use the dial target's host (aka "authority"), which is unique per-agent, to determine which builder to use.	2023-02-28 10:18:38 +00:00
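The registry pattern described in commit 73b9b407ba (a lock-protected map of per-agent builders, looked up by the dial target's authority) can be sketched in plain Go. This is an illustration under assumptions: the `builder` and `registry` types are hypothetical stand-ins, and the sketch does not use or reproduce the gRPC balancer API.

```go
package main

import (
	"fmt"
	"sync"
)

// builder stands in for a per-agent balancer builder (hypothetical type).
type builder func() string

// registry maps a dial target authority to its builder and guards the
// map with a lock so that registration is safe from multiple goroutines.
type registry struct {
	mu          sync.RWMutex
	byAuthority map[string]builder
}

func newRegistry() *registry {
	return &registry{byAuthority: make(map[string]builder)}
}

// register associates a builder with an authority.
func (r *registry) register(authority string, b builder) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.byAuthority[authority] = b
}

// build looks up the builder for the authority and invokes it.
func (r *registry) build(authority string) (string, error) {
	r.mu.RLock()
	b, ok := r.byAuthority[authority]
	r.mu.RUnlock()
	if !ok {
		return "", fmt.Errorf("no builder registered for authority %q", authority)
	}
	return b(), nil
}

func main() {
	reg := newRegistry()
	var wg sync.WaitGroup
	// Simulate several in-memory agents registering concurrently.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			name := fmt.Sprintf("agent-%d", i)
			reg.register(name, func() string { return "balancer for " + name })
		}(i)
	}
	wg.Wait()
	out, err := reg.build("agent-2")
	fmt.Println(out, err)
}
```

With a registry like this, the process-wide registration can happen once at start-up while each agent adds or looks up its own entry at any time.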
Andrew Stucki	801a17329e	Fix attempt for test fail panics in xDS (#16319 ) * Fix attempt for test fail panics in xDS * switch to a mutex pointer	2023-02-24 17:00:31 -05:00
Chris S. Kim	a518893685	Fix various flaky tests (#16396 )	2023-02-23 14:52:18 -05:00
Eric Haberkorn	595131fca9	Refactor the disco chain -> xds logic (#16392 )	2023-02-23 11:32:32 -05:00
Paul Banks	8ac211b427	Correct WAL metrics registrations (#16388 )	2023-02-23 14:07:17 +00:00
Dhia Ayachi	ae9c228967	Rate limiter/add ip prefix (#16342 ) * add support for prefixes in the config tree * fix to use default config when the prefix has no config	2023-02-22 15:15:51 -05:00
Andrew Stucki	641737f32b	[API Gateway] Fix infinite loop in controller and binding non-accepted routes and gateways (#16377 )	2023-02-22 14:55:40 -05:00
Andrew Stucki	0972697661	[API Gateway] Various fixes for Config Entry fields (#16347 ) * [API Gateway] Various fixes for Config Entry fields * simplify logic per PR review	2023-02-22 04:02:04 +00:00
Andrew Stucki	18e2ee77ca	[API Gateway] Fix targeting service splitters in HTTPRoutes (#16350 ) * [API Gateway] Fix targeting service splitters in HTTPRoutes * Fix test description	2023-02-22 03:48:26 +00:00
Andrew Stucki	823fc821fa	[API Gateway] Turn down controller log levels (#16348 )	2023-02-21 20:42:01 -06:00
Derek Menteer	ad865f549b	Fix issue with peer services incorrectly appearing as connect-enabled. (#16339 ) Prior to this commit, all peer services were transmitted as connect-enabled as long as one or more mesh-gateways were healthy. With this change, there is now a difference between typical services and connect services transmitted via peering. A service will be reported as "connect-enabled" as long as any of these conditions are met: 1. a connect-proxy sidecar is registered for the service name. 2. a connect-native instance of the service is registered. 3. a service resolver / splitter / router is registered for the service name. 4. a terminating gateway has registered the service.	2023-02-21 13:59:36 -06:00
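The four conditions listed in commit ad865f549b amount to a logical OR. A minimal Go sketch follows; the `serviceFacts` struct and its field names are hypothetical and only mirror the wording of the commit message, not Consul's actual data structures.

```go
package main

import "fmt"

// serviceFacts records which of the four conditions hold for a service
// name. Hypothetical type for illustration.
type serviceFacts struct {
	HasSidecarProxy          bool // a connect-proxy sidecar is registered
	HasConnectNativeInstance bool // a connect-native instance is registered
	HasDiscoveryChainEntry   bool // a resolver / splitter / router exists
	OnTerminatingGateway     bool // a terminating gateway registered it
}

// connectEnabled reports true when any one condition is met.
func connectEnabled(f serviceFacts) bool {
	return f.HasSidecarProxy ||
		f.HasConnectNativeInstance ||
		f.HasDiscoveryChainEntry ||
		f.OnTerminatingGateway
}

func main() {
	fmt.Println(connectEnabled(serviceFacts{}))                             // false
	fmt.Println(connectEnabled(serviceFacts{HasDiscoveryChainEntry: true})) // true
}
```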
Andrew Stucki	7f9ec78932	[API Gateway] Validate listener name is not empty (#16340 ) * [API Gateway] Validate listener name is not empty * Update docstrings and test	2023-02-21 14:12:19 -05:00
cskh	8e5942f5ca	fix: add tls config to unix socket when https is used (#16301 ) * fix: add tls config to unix socket when https is used * unit test and changelog	2023-02-21 08:28:13 -05:00
Andrew Stucki	4607b535be	Fix HTTPRoute and TCPRoute expectation for enterprise metadata (#16322 )	2023-02-17 17:28:49 -05:00
Andrew Stucki	15d2684ecc	Normalize all API Gateway references (#16316 )	2023-02-17 21:37:34 +00:00
Matt Keeler	085c0addc0	Protobuf Refactoring for Multi-Module Cleanliness (#16302 ) Protobuf Refactoring for Multi-Module Cleanliness This commit includes the following: Moves all packages that were within proto/ to proto/private Rewrites imports to account for the packages being moved Adds in buf.work.yaml to enable buf workspaces Names the proto-public buf module so that we can override the Go package imports within proto/buf.yaml Bumps the buf version dependency to 1.14.0 (I was trying out the version to see if it would get around an issue - it didn't but it also doesn't break things and it seemed best to keep up with the toolchain changes) Why: In the future we will need to consume other protobuf dependencies such as the Google HTTP annotations for openapi generation or grpc-gateway usage. There were some recent changes to have our own ratelimiting annotations. The two combined were not working when I was trying to use them together (attempting to rebase another branch) Buf workspaces should be the solution to the problem Buf workspaces means that each module will have generated Go code that embeds proto file names relative to the proto dir and not the top level repo root. This resulted in proto file name conflicts in the Go global protobuf type registry. The solution to that was to add in a private/ directory into the path within the proto/ directory. That then required rewriting all the imports. Is this safe? AFAICT yes The gRPC wire protocol doesn't seem to care about the proto file names (although the Go grpc code does tack on the proto file name as Metadata in the ServiceDesc) Other than imports, there were no changes to any generated code as a result of this.	2023-02-17 16:14:46 -05:00
Dan Stough	f1436109ea	[OSS] security: update go to 1.20.1 (#16263 ) * security: update go to 1.20.1	2023-02-17 15:04:12 -05:00
Andrew Stucki	58801cc8aa	Add stricter validation and some normalization code for API Gateway ConfigEntries (#16304 ) * Add stricter validation and some normalization code for API Gateway ConfigEntries	2023-02-17 19:22:01 +00:00
Andrew Stucki	ee99d5c3a0	Fix panicky xDS test flakes (#16305 ) * Add defensive guard to make some tests less flaky and panic less * Do the actual fix	2023-02-17 14:07:49 -05:00
Andrew Stucki	e4a992c581	Fix hostname alignment checks for HTTPRoutes (#16300 ) * Fix hostname alignment checks for HTTPRoutes	2023-02-17 18:18:11 +00:00
Andrew Stucki	b3ddd4d24e	Inline API Gateway TLS cert code (#16295 ) * Include secret type when building resources from config snapshot * First pass at generating envoy secrets from api-gateway snapshot * Update comments for xDS update order * Add secret type + corresponding golden files to existing tests * Initialize test helpers for testing api-gateway resource generation * Generate golden files for new api-gateway xDS resource test * Support ADS for TLS certificates on api-gateway * Configure TLS on api-gateway listeners * Inline TLS cert code * update tests * Add SNI support so we can have multiple certificates * Remove commented out section from helper * regen deep-copy * Add tcp tls test --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>	2023-02-17 12:46:03 -05:00
Nitya Dhanushkodi	8dab825c36	troubleshoot: fixes and updated messages (#16294 )	2023-02-17 07:43:05 -08:00
Thomas Eckert	2460ac99c9	API Gateway Envoy Golden Listener Tests (#16221 ) * Simple API Gateway e2e test for tcp routes * Drop DNSSans since we don't front the Gateway with a leaf cert * WIP listener tests for api-gateway * Return early if no routes * Add back in leaf cert to testing * Fix merge conflicts * Re-add kind to setup * Fix iteration over listener upstreams * New tcp listener test * Add tests for API Gateway with TCP and HTTP routes * Move zero-route check back * Drop generateIngressDNSSANs * Check for chains not routes --------- Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>	2023-02-16 14:42:36 -05:00
Derek Menteer	30112288c8	Fix mesh gateways incorrectly matching peer locality. (#16257 ) Fix mesh gateways incorrectly matching peer locality. This fixes an issue where local mesh gateways use an incorrect address when attempting to forward traffic to a peered datacenter. Prior to this change it would use the lan address instead of the wan if the locality matched. This should never be done for peering, since we must route all traffic through the remote mesh gateway.	2023-02-16 09:22:41 -06:00
Nathan Coleman	514fb25a6f	Fix infinite recursion in inline-certificate config entry (#16276 ) * Fix infinite recursion on InlineCertificateConfigEntry GetNamespace() + GetMeta() were calling themselves. This change also simplifies by removing nil-checking to match pre-existing config entries Co-Authored-By: Andrew Stucki <3577250+andrewstucki@users.noreply.github.com> * Add tests for inline-certificate * Add alias for private key field on inline-certificate * Use valid certificate + private key for inline-certificate tests --------- Co-authored-by: Andrew Stucki <3577250+andrewstucki@users.noreply.github.com>	2023-02-15 13:49:34 -06:00
Derek Menteer	6599a9be1d	Fix nil-pointer panics from proxycfg package. (#16277 ) Prior to this PR, servers / agents would panic and crash if an ingress or api gateway were configured to use a discovery chain that both: 1. Referenced a peered service 2. Had a mesh gateway mode of local This could occur, because code for handling upstream watches was shared between both connect-proxy and the gateways. As a short-term fix, this PR ensures that the maps are always initialized for these gateway services. This PR also wraps the proxycfg execution and service registration calls with recover statements to ensure that future issues like this do not put the server into an unrecoverable state.	2023-02-15 11:54:44 -06:00
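The recover-based guard mentioned in commit 6599a9be1d can be sketched as a small wrapper that converts a panic into an error. The `safely` helper is hypothetical; the nil-map write in the example is only a convenient way to trigger a runtime panic similar to the uninitialized-map case the commit describes.

```go
package main

import "fmt"

// safely runs fn and converts a panic into an ordinary error so the
// caller can log it and continue. Hypothetical helper for illustration.
func safely(name string, fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%s panicked: %v", name, r)
		}
	}()
	return fn()
}

func main() {
	var upstreams map[string]int // nil map: writing to it panics

	err := safely("register service", func() error {
		upstreams["web"] = 1
		return nil
	})
	fmt.Println("recovered:", err != nil)
}
```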
Andrew Stucki	9bb0ecfc18	[API Gateway] Add integration test for HTTP routes (#16236 ) * [API Gateway] Add integration test for conflicted TCP listeners * [API Gateway] Update simple test to leverage intentions and multiple listeners * Fix broken unit test * [API Gateway] Add integration test for HTTP routes	2023-02-13 14:18:05 -05:00
Semir Patel	8979e64a94	Bump x/time to 0.3.0 and fix related breakage linked to RPCRateLimit (#16241 ) * Bump x/time to 0.3.0 and fix related breakage linked to RPCRateLimit initialization * Apply limitVal(...) to other rate.Limit config fields	2023-02-13 11:11:51 -06:00
Andrew Stucki	8ff2974dbe	[API Gateway] Update simple test to leverage intentions and multiple listeners (#16228 ) * [API Gateway] Add integration test for conflicted TCP listeners * [API Gateway] Update simple test to leverage intentions and multiple listeners * Fix broken unit test * PR suggestions	2023-02-10 21:13:44 +00:00
Andrew Stucki	4c848a554d	Fix missing references to enterprise metadata (#16237 )	2023-02-10 20:47:16 +00:00
Andrew Stucki	318ba215ab	[API Gateway] Add integration test for conflicted TCP listeners (#16225 )	2023-02-10 11:34:01 -06:00
Derek Menteer	4f2ce60654	Fix peering acceptors in secondary datacenters. (#16230 ) Prior to this commit, secondary datacenters could not be initialized as peering acceptors if ACLs were enabled. This is due to the fact that internal server-to-server API calls would fail because the management token was not generated. This PR makes it so that both primary and secondary datacenters generate their own management token whenever a leader is elected in their respective clusters.	2023-02-10 09:47:17 -06:00
Andrew Stucki	3b9c569561	Simple API Gateway e2e test for tcp routes (#16222 ) * Simple API Gateway e2e test for tcp routes * Drop DNSSans since we don't front the Gateway with a leaf cert	2023-02-09 16:20:12 -05:00
skpratt	db2bd404bf	Synthesize anonymous token pre-bootstrap when needed (#16200 ) * add bootstrapping detail for acl errors * error detail improvements * update acl bootstrapping test coverage * update namespace errors * update test coverage * consolidate error message code and update changelog * synthesize anonymous token * Update token language to distinguish Accessor and Secret ID usage (#16044) * remove legacy tokens * remove lingering legacy token references from docs * update language and naming for token secrets and accessor IDs * updates all tokenID references to clarify accessorID * remove token type references and lookup tokens by accessorID index * remove unnecessary constants * replace additional tokenID param names * Add warning info for deprecated -id parameter Co-authored-by: Paul Glass <pglass@hashicorp.com> * Update field comment Co-authored-by: Paul Glass <pglass@hashicorp.com> --------- Co-authored-by: Paul Glass <pglass@hashicorp.com> * revert naming change * add testing * revert naming change --------- Co-authored-by: Paul Glass <pglass@hashicorp.com>	2023-02-09 20:34:02 +00:00
Thomas Eckert	e81a0c2855	API Gateway to Ingress Gateway Snapshot Translation and Routes to Virtual Routers and Splitters (#16127 ) * Stub proxycfg handler for API gateway * Add Service Kind constants/handling for API Gateway * Begin stubbing for SDS * Add new Secret type to xDS order of operations * Continue stubbing of SDS * Iterate on proxycfg handler for API gateway * Handle BoundAPIGateway config entry subscription in proxycfg-glue * Add API gateway to config snapshot validation * Add API gateway to config snapshot clone, leaf, etc. * Subscribe to bound route + cert config entries on bound-api-gateway * Track routes + certs on API gateway config snapshot * Generate DeepCopy() for types used in watch.Map * Watch all active references on api-gateway, unwatch inactive * Track loading of initial bound-api-gateway config entry * Use proper proto package for SDS mapping * Use ResourceReference instead of ServiceName, collect resources * Fix typo, add + remove TODOs * Watch discovery chains for TCPRoute * Add TODO for updating gateway services for api-gateway * make proto * Regenerate deep-copy for proxycfg * Set datacenter on upstream ID from query source * Watch discovery chains for http-route service backends * Add ServiceName getter to HTTP+TCP Service structs * Clean up unwatched discovery chains on API Gateway * Implement watch for ingress leaf certificate * Collect upstreams on http-route + tcp-route updates * Remove unused GatewayServices update handler * Remove unnecessary gateway services logic for API Gateway * Remove outdate TODO * Use .ToIngress where appropriate, including TODO for cleaning up * Cancel before returning error * Remove GatewayServices subscription * Add godoc for handlerAPIGateway functions * Update terminology from Connect => Consul Service Mesh Consistent with terminology changes in https://github.com/hashicorp/consul/pull/12690 * Add missing TODO * Remove duplicate switch case * Rerun deep-copy generator * Use correct property on config 
snapshot * Remove unnecessary leaf cert watch * Clean up based on code review feedback * Note handler properties that are initialized but set elsewhere * Add TODO for moving helper func into structs pkg * Update generated DeepCopy code * gofmt * Begin stubbing for SDS * Start adding tests * Remove second BoundAPIGateway case in glue * TO BE PICKED: fix formatting of str * WIP * Fix merge conflict * Implement HTTP Route to Discovery Chain config entries * Stub out function to create discovery chain * Add discovery chain merging code (#16131) * Test adding TCP and HTTP routes * Add some tests for the synthesizer * Run go mod tidy * Pairing with N8 * Run deep copy * Clean up GatewayChainSynthesizer * Fix missing assignment of BoundAPIGateway topic * Separate out synthesizeChains and toIngressTLS * Fix build errors * Ensure synthesizer skips non-matching routes by protocol * Rebase on N8s work * Generate DeepCopy() for API gateway listener types * Improve variable name * Regenerate DeepCopy() code * Fix linting issue * fix protobuf import * Fix more merge conflict errors * Fix synthesize test * Run deep copy * Add URLRewrite to proto * Update agent/consul/discoverychain/gateway_tcproute.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Remove APIGatewayConfigEntry that was extra * Error out if route kind is unknown * Fix formatting errors in proto --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>	2023-02-09 17:58:55 +00:00
Andrew Stucki	f4210d47dd	Add basic smoke test to make sure an APIGateway runs (#16217 )	2023-02-09 11:32:10 -05:00
Andrew Stucki	0891b4554d	Clean-up Gateway Controller Binding Logic (#16214 ) * Fix detecting when a route doesn't bind to a gateway because it's already bound * Clean up status setting code * rework binding a bit * More cleanup * Flatten all files * Fix up docstrings	2023-02-09 10:17:25 -05:00
skpratt	6f0b226b0d	ACL error improvements: incomplete bootstrapping and non-existent token (#16105 ) * add bootstrapping detail for acl errors * error detail improvements * update acl bootstrapping test coverage * update namespace errors * update test coverage * add changelog * update message for unbootstrapped error * consolidate error message code and update changelog * logout message change	2023-02-08 23:49:44 +00:00
Nathan Coleman	72a73661c9	Implement APIGateway proxycfg snapshot (#16194 ) * Stub proxycfg handler for API gateway * Add Service Kind constants/handling for API Gateway * Begin stubbing for SDS * Add new Secret type to xDS order of operations * Continue stubbing of SDS * Iterate on proxycfg handler for API gateway * Handle BoundAPIGateway config entry subscription in proxycfg-glue * Add API gateway to config snapshot validation * Add API gateway to config snapshot clone, leaf, etc. * Subscribe to bound route + cert config entries on bound-api-gateway * Track routes + certs on API gateway config snapshot * Generate DeepCopy() for types used in watch.Map * Watch all active references on api-gateway, unwatch inactive * Track loading of initial bound-api-gateway config entry * Use proper proto package for SDS mapping * Use ResourceReference instead of ServiceName, collect resources * Fix typo, add + remove TODOs * Watch discovery chains for TCPRoute * Add TODO for updating gateway services for api-gateway * make proto * Regenerate deep-copy for proxycfg * Set datacenter on upstream ID from query source * Watch discovery chains for http-route service backends * Add ServiceName getter to HTTP+TCP Service structs * Clean up unwatched discovery chains on API Gateway * Implement watch for ingress leaf certificate * Collect upstreams on http-route + tcp-route updates * Remove unused GatewayServices update handler * Remove unnecessary gateway services logic for API Gateway * Remove outdate TODO * Use .ToIngress where appropriate, including TODO for cleaning up * Cancel before returning error * Remove GatewayServices subscription * Add godoc for handlerAPIGateway functions * Update terminology from Connect => Consul Service Mesh Consistent with terminology changes in https://github.com/hashicorp/consul/pull/12690 * Add missing TODO * Remove duplicate switch case * Rerun deep-copy generator * Use correct property on config snapshot * Remove unnecessary leaf cert watch * Clean 
up based on code review feedback * Note handler properties that are initialized but set elsewhere * Add TODO for moving helper func into structs pkg * Update generated DeepCopy code * gofmt * Generate DeepCopy() for API gateway listener types * Improve variable name * Regenerate DeepCopy() code * Fix linting issue * Temporarily remove the secret type from resource generation	2023-02-08 15:52:12 -06:00
Nitya Dhanushkodi	1f25289048	troubleshoot: output messages for the troubleshoot proxy command (#16208 )	2023-02-08 13:03:15 -08:00
Kyle Havlovitz	898e59b13c	Add the `operator usage instances` command and api endpoint (#16205 ) This endpoint shows total services, connect service instances and billable service instances in the local datacenter or globally. Billable instances = total service instances - connect services - consul server instances.	2023-02-08 12:07:21 -08:00
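The billable-instance formula in commit 898e59b13c is simple arithmetic. A worked Go example follows; the `usage` struct, its field names, and the sample counts are hypothetical and do not reflect the real API response shape.

```go
package main

import "fmt"

// usage holds the three counts named in the commit message.
// Hypothetical type for illustration.
type usage struct {
	ServiceInstances        int // total service instances
	ConnectServiceInstances int // connect service instances
	ServerInstances         int // consul server instances
}

// billable applies: total - connect services - consul servers.
func (u usage) billable() int {
	return u.ServiceInstances - u.ConnectServiceInstances - u.ServerInstances
}

func main() {
	u := usage{ServiceInstances: 100, ConnectServiceInstances: 30, ServerInstances: 3}
	fmt.Println(u.billable()) // 67
}
```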
Andrew Stucki	df03b45bbc	Add additional controller implementations (#16188 ) * Add additional controller implementations * remove additional interface * Fix comparison checks and mark unused contexts * Switch to time.Now().UTC() * Add a pointer helper for shadowing loop variables * Extract anonymous functions for readability * clean up logging * Add Type to the Condition proto * Update some comments and add additional space for readability * Address PR feedback * Fix up dirty checks and change to pointer receiver	2023-02-08 14:50:17 -05:00
Paul Banks	5397e9ee7f	Adding experimental support for a more efficient LogStore implementation (#16176 ) * Adding experimental support for a more efficient LogStore implementation * Adding changelog entry * Fix go mod tidy issues	2023-02-08 16:50:22 +00:00
cskh	e91bc9c058	feat: envoy extension - http local rate limit (#16196 ) - http local rate limit - Apply rate limit only to local_app - unit test and integ test	2023-02-07 21:56:15 -05:00
John Eikenberry	ed7367b6f4	remove redundant vault api retry logic (#16143 ) remove redundant vault api retry logic We upgraded Vault API module version to a version that has built-in retry logic. So this code is no longer necessary. Also add mention of re-configuring the provider in comments.	2023-02-07 20:52:22 +00:00
skpratt	1e7e52e3ef	revert method name change in xds server protocol for version compatibility (#16195 )	2023-02-07 14:19:09 -06:00
skpratt	9199e99e21	Update token language to distinguish Accessor and Secret ID usage (#16044 ) * remove legacy tokens * remove lingering legacy token references from docs * update language and naming for token secrets and accessor IDs * updates all tokenID references to clarify accessorID * remove token type references and lookup tokens by accessorID index * remove unnecessary constants * replace additional tokenID param names * Add warning info for deprecated -id parameter Co-authored-by: Paul Glass <pglass@hashicorp.com> * Update field comment Co-authored-by: Paul Glass <pglass@hashicorp.com> --------- Co-authored-by: Paul Glass <pglass@hashicorp.com>	2023-02-07 12:26:30 -06:00
wangxinyi7	906ebb97f6	change log level (#16128 )	2023-02-06 12:58:13 -08:00
Dhia Ayachi	c680a35b36	Net 2229/rpc reduce max retries 2 (#16165 ) * feat: calculate retry wait time with exponential back-off * test: add test for getWaitTime method * feat: enforce random jitter between min value from previous iteration and current * extract randomStagger to simplify tests and use Milliseconds to avoid float math. * rename variables * add test and rename comment --------- Co-authored-by: Poonam Jadhav <poonam.jadhav@hashicorp.com>	2023-02-06 14:07:41 -05:00
Nitya Dhanushkodi	b8b37c2357	refactor: remove troubleshoot module dependency on consul top level module (#16162 ) Ensure nothing in the troubleshoot go module depends on consul's top level module. This is so we can import troubleshoot into consul-k8s and not import all of consul. * turns troubleshoot into a go module [authored by @curtbushko] * gets the envoy protos into the troubleshoot module [authored by @curtbushko] * adds a new go module `envoyextensions` which has xdscommon and extensioncommon folders that both the xds package and the troubleshoot package can import * adds testing and linting for the new go modules * moves the unit tests in `troubleshoot/validateupstream` that depend on proxycfg/xds into the xds package, with a comment describing why those tests cannot be in the troubleshoot package * fixes all the imports everywhere as a result of these changes Co-authored-by: Curt Bushko <cbushko@gmail.com>	2023-02-06 09:14:35 -08:00
Poonam Jadhav	24c431270c	feat: client RPC retries on ErrRetryElsewhere error and forwardRequestToLeader method retries ErrRetryLater error (#16099 )	2023-02-06 11:31:25 -05:00
skpratt	a010902978	Remove legacy acl policies (#15922 ) * remove legacy tokens * remove legacy acl policies * flatten test policies to _prefix address oss feedback re: phrasing and tests	2023-02-06 15:35:52 +00:00
John Eikenberry	5c836f2aa9	fix goroutine leak in renew testing (#16142 ) fix goroutine leak in renew testing Test overwrote the stopWatcher() function variable for the test without keeping and calling the original value. The original value is the function that stops the goroutine... so it needs to be called.	2023-02-03 22:09:34 +00:00
sarahalsmiller	143b2bc1f0	API Gateway Controller Logic (#16058 ) * Add initial API gateway controller logic --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com> Co-authored-by: Thomas Eckert <teckert@hashicorp.com>	2023-02-03 21:55:48 +00:00
Derek Menteer	2f149d60cc	[OSS] Add Peer field to service-defaults upstream overrides (#15956 ) * Add Peer field to service-defaults upstream overrides. * add api changes, compat mode for service default overrides * Fixes based on testing --------- Co-authored-by: DanStough <dan.stough@hashicorp.com>	2023-02-03 10:51:53 -05:00
Paul Glass	a884d0d7c7	Use agent token for service/check deregistration during anti-entropy (#16097 ) Use only the agent token for deregistration during anti-entropy The previous behavior had the agent attempt to use the "service" token (i.e. from the `token` field in a service definition file), and if that was not set then it would use the agent token. The previous behavior was problematic because, if the service token had been deleted, the deregistration request would fail. The agent would retry the deregistration during each anti-entropy sync, and the situation would never resolve. The new behavior is to only/always use the agent token for service and check deregistration during anti-entropy. This approach is: * Simpler: No fallback logic to try different tokens * Faster (slightly): No time spent attempting the service token * Correct: The agent token is able to deregister services on that agent's node, because: * node:write permissions allow deregistration of services/checks on that node. * The agent token must have node:write permission, or else the agent is not able to (de)register itself into the catalog Co-authored-by: Vesa Hagström <weeezes@gmail.com>	2023-02-03 08:45:11 -06:00
Dan Upton	e40b731a52	rate: add prometheus definitions, docs, and clearer names (#15945 )	2023-02-03 12:01:57 +00:00
Nitya Dhanushkodi	8d4c3aa42c	refactor: move service to service validation to troubleshoot package (#16132 ) This is to reduce the dependency on xds from within the troubleshoot package.	2023-02-02 22:18:10 -08:00
Derek Menteer	06338c8ee7	Add unit test and update golden files. (#16115 )	2023-02-01 09:51:08 -06:00
Andrew Stucki	1fbfb5905b	APIGateway HTTPRoute scaffolding (#15859 ) * Stub Config Entries for Consul Native API Gateway (#15644) * Add empty InlineCertificate struct and protobuf * apigateway stubs * new files * Stub HTTPRoute in api pkg * checkpoint * Stub HTTPRoute in structs pkg * Simplify api.APIGatewayConfigEntry to be consistent w/ other entries * Update makeConfigEntry switch, add docstring for HTTPRouteConfigEntry * Add TCPRoute to MakeConfigEntry, return unique Kind * proto generated files * Stub BoundAPIGatewayConfigEntry in agent Since this type is only written by a controller and read by xDS, it doesn't need to be defined in the `api` pkg * Add RaftIndex to APIGatewayConfigEntry stub * Add new config entry kinds to validation allow-list * Add RaftIndex to other added config entry stubs * fix panic * Update usage metrics assertions to include new cfg entries * Regenerate proto w/ Go 1.19 * Run buf formatter on config_entry.proto * Add Meta and acl.EnterpriseMeta to all new ConfigEntry types * Remove optional interface method Warnings() for now Will restore later if we wind up needing it * Remove unnecessary Services field from added config entry types * Implement GetMeta(), GetEnterpriseMeta() for added config entry types * Add meta field to proto, name consistently w/ existing config entries * Format config_entry.proto * Add initial implementation of CanRead + CanWrite for new config entry types * Add unit tests for decoding of new config entry types * Add unit tests for parsing of new config entry types * Add unit tests for API Gateway config entry ACLs * Return typed PermissionDeniedError on BoundAPIGateway CanWrite * Add unit tests for added config entry ACLs * Add BoundAPIGateway type to AllConfigEntryKinds * Return proper kind from BoundAPIGateway * Add docstrings for new config entry types * Add missing config entry kinds to proto def * Update usagemetrics_oss_test.go * Use utility func for returning PermissionDeniedError * Add BoundAPIGateway to 
proto def Co-authored-by: Sarah Alsmiller <sarah.alsmiller@hashicorp.com> Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Add APIGateway validation * Fix comment * Add additional validations * Add cert ref validation * Add protobuf definitions * Tabs to spaces * Fix up field types * Add API structs * Move struct fields around a bit * EventPublisher subscriptions for Consul Native API Gateway (#15757) * Create new event topics in subscribe proto * Add tests for PBSubscribe func * Make configs singular, add all configs to PBToStreamSubscribeRequest * Add snapshot methods * Add config_entry_events tests * Add config entry kind to topic for new configs * Add unit tests for snapshot methods * Start adding integration test * Test using the new controller code * Update agent/consul/state/config_entry_events.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Check value of error Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Add controller stubs for API Gateway (#15837) * update initial stub implementation * move files, clean up mutex references * Remove embed, use idiomatic names for constructors * Remove stray file introduced in merge Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Initial server-side and proto defs * drop trailing whitespace * Add APIGateway validation (#15847) * Add APIGateway validation * Fix comment * Add additional validations * Add cert ref validation * Add protobuf definitions * Tabs to spaces * Fix up field types * Add API structs * Move struct fields around a bit * APIGateway InlineCertificate validation (#15856) * Add APIGateway validation * Add additional validations * Add protobuf definitions * Tabs to spaces * Add API structs * Move struct fields around a bit * Add validation for InlineCertificate * Fix ACL test * APIGateway BoundAPIGateway validation (#15858) * Add APIGateway validation * Fix comment * Add additional validations * Add cert ref validation * Add protobuf 
definitions * Tabs to spaces * Fix up field types * Add API structs * Move struct fields around a bit * Add validation for BoundAPIGateway * drop trailing whitespace * APIGateway TCPRoute validation (#15855) * Add APIGateway validation * Fix comment * Add additional validations * Add cert ref validation * Add protobuf definitions * Tabs to spaces * Fix up field types * Add API structs * Move struct fields around a bit * Add TCPRoute normalization and validation * Address PR feedback * Add forgotten Status * Add some more field docs in api package * Fix test * Fix bad merge * Remove duplicate helpers * Fix up proto defs * Fix up stray changes * remove extra newline --------- Co-authored-by: Thomas Eckert <teckert@hashicorp.com> Co-authored-by: Sarah Alsmiller <sarah.alsmiller@hashicorp.com> Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> Co-authored-by: sarahalsmiller <100602640+sarahalsmiller@users.noreply.github.com>	2023-02-01 07:59:49 -05:00
Derek Menteer	b19c5a94c7	Add Envoy extension metrics. (#16114 )	2023-01-31 14:50:30 -06:00
cskh	f6da81c9d0	improvement: prevent filter being added twice from any envoy extension (#16112 ) * improvement: prevent filter being added twice from any envoy extension * break if error != nil * update test	2023-01-31 16:49:45 +00:00
Poonam Jadhav	9db5b7d896	feat: apply retry policy to read only grpc endpoints (#16085 )	2023-01-31 10:44:25 -05:00
Derek Menteer	1b02749375	Add extension validation on config save and refactor extensions. (#16110 )	2023-01-30 15:35:26 -06:00
Nitya Dhanushkodi	8728a4496c	troubleshoot: service to service validation (#16096 ) * Add Tproxy support to Envoy Extensions (this is needed for service to service validation) * Add validation for Envoy configuration for an upstream service * Use both /config_dump and /cluster to validate Envoy configuration This is because of a bug in Envoy where the EndpointsConfigDump does not include a cluster_name, making it impossible to match an endpoint to verify it exists. This removes endpoints support for builtin extensions since only the validate plugin was using it, and it is no longer used. It also removes test cases for endpoint validation. Endpoints validation now only occurs in the top level test from config_dump and clusters json files. Co-authored-by: Eric <eric@haberkorn.co>	2023-01-27 11:43:16 -08:00
Andrew Stucki	da99514ac8	Add a server-only method for updating ConfigEntry Statuses (#16053 ) * Add a server-only method for updating ConfigEntry Statuses * Address PR feedback * Regen proto	2023-01-27 14:34:11 -05:00
skpratt	ad43846755	Remove legacy acl tokens (#15947 ) * remove legacy tokens * Update test comment Co-authored-by: Paul Glass <pglass@hashicorp.com> * fix imports * update docs for additional CLI changes * add test case for anonymous token * set deprecated api fields to json ignore and fix patch errors * update changelog to breaking-change * fix import * update api docs to remove legacy reference * fix docs nav data --------- Co-authored-by: Paul Glass <pglass@hashicorp.com>	2023-01-27 09:17:07 -06:00
Thomas Eckert	7814471159	Match route and listener protocols when binding (#16057 ) * Add GatewayMeta for matching routes to listeners based on protocols * Add GetGatewayMeta * Apply suggestions from code review Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Make GatewayMeta private * Bound -> BoundGateway * Document gatewayMeta more * Simplify conditional * Parallelize tests and simplify bind conditional * gofmt * 💧 getGatewayMeta --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>	2023-01-27 09:41:03 -05:00
Michael Wilkerson	a1498b015d	Mw/lambda envoy extension parse region (#4107 ) (#16069 ) * updated builtin extension to parse region directly from ARN - added a unit test - added some comments/light refactoring * updated golden files with proper ARNs - ARNs need to be right format now that they are being processed * updated tests and integration tests - removed 'region' from all EnvoyExtension arguments - added properly formatted ARN which includes the same region found in the removed "Region" field: 'us-east-1'	2023-01-26 15:44:52 -08:00
Andrew Stucki	3febdbff39	Add trigger for doing reconciliation based on watch sets (#16052 ) * Add trigger for doing reconciliation based on watch sets * update doc string * Fix my grammar fail	2023-01-26 15:20:37 -05:00
Poonam Jadhav	f4f62b5da6	feat: panic handler in rpc rate limit interceptor (#16022 ) * feat: handle panic in rpc rate limit interceptor * test: additional test cases to rpc rate limiting interceptor * refactor: remove unused listener	2023-01-25 14:13:38 -05:00
Nathan Coleman	e0f4f6c152	Run config entry controller routines on leader (#16054 )	2023-01-25 12:21:46 -06:00
Ronald	6167aef641	Warn when the token query param is used for auth (#16009 )	2023-01-24 16:21:41 +00:00
Thomas Eckert	20146f2916	Implement BindRoutesToGateways (#15950 ) * Stub out bind code * Move into a new package and flesh out binding * Fill in the actual binding logic * Bind to all listeners if not specified * Move bind code up to gateways package * Fix resource type check * Add UpsertRoute to listeners * Add RemoveRoute to listener * Implement binding as associated functions * Pass in gateways to BindRouteToGateways * Add a bunch of tests * Fix hopping from one listener on a gateway to another * Remove parents from HTTPRoute * Apply suggestions from code review * Fix merge conflict * Unify binding into a single variadic function 🙌 @nathancoleman * Remove vestigial error * Add TODO on protocol check	2023-01-20 15:11:16 -05:00
cskh	25396d81c9	Apply agent partition to load services and agent api (#16024 ) * Apply agent partition to load services and agent api changelog	2023-01-20 12:59:26 -05:00
Derek Menteer	5f5e6864ca	Fix proxy-defaults incorrectly merging config on upstreams. (#16021 )	2023-01-20 11:25:51 -06:00
John Murret	794277371f	Integration test for server rate limiting (#15960 ) * rate limit test * Have tests for the 3 modes * added assertions for logs and metrics * add comments to test sections * add check for rate limit exceeded text in log assertion section. * fix linting error * updating test to use KV get and put. move log assertion to last. * Adding logging for blocking messages in enforcing mode. refactoring tests. * modified test description * formatting * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * Update test/integration/consul-container/test/ratelimit/ratelimit_test.go Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> * expand log checking so that it ensures both logs are there when they are supposed to be and not there when they are not expected to be. * add retry on test * Warn once when rate limit exceed regardless of enforcing vs permissive. * Update test/integration/consul-container/test/ratelimit/ratelimit_test.go Co-authored-by: Dan Upton <daniel@floppy.co> Co-authored-by: Dan Upton <daniel@floppy.co> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2023-01-19 08:43:33 -07:00
Thomas Eckert	13da1a5285	Native API Gateway Config Entries (#15897 ) * Stub Config Entries for Consul Native API Gateway (#15644) * Add empty InlineCertificate struct and protobuf * apigateway stubs * Stub HTTPRoute in api pkg * Stub HTTPRoute in structs pkg * Simplify api.APIGatewayConfigEntry to be consistent w/ other entries * Update makeConfigEntry switch, add docstring for HTTPRouteConfigEntry * Add TCPRoute to MakeConfigEntry, return unique Kind * Stub BoundAPIGatewayConfigEntry in agent * Add RaftIndex to APIGatewayConfigEntry stub * Add new config entry kinds to validation allow-list * Add RaftIndex to other added config entry stubs * Update usage metrics assertions to include new cfg entries * Add Meta and acl.EnterpriseMeta to all new ConfigEntry types * Remove unnecessary Services field from added config entry types * Implement GetMeta(), GetEnterpriseMeta() for added config entry types * Add meta field to proto, name consistently w/ existing config entries * Format config_entry.proto * Add initial implementation of CanRead + CanWrite for new config entry types * Add unit tests for decoding of new config entry types * Add unit tests for parsing of new config entry types * Add unit tests for API Gateway config entry ACLs * Return typed PermissionDeniedError on BoundAPIGateway CanWrite * Add unit tests for added config entry ACLs * Add BoundAPIGateway type to AllConfigEntryKinds * Return proper kind from BoundAPIGateway * Add docstrings for new config entry types * Add missing config entry kinds to proto def * Update usagemetrics_oss_test.go * Use utility func for returning PermissionDeniedError * EventPublisher subscriptions for Consul Native API Gateway (#15757) * Create new event topics in subscribe proto * Add tests for PBSubscribe func * Make configs singular, add all configs to PBToStreamSubscribeRequest * Add snapshot methods * Add config_entry_events tests * Add config entry kind to topic for new configs * Add unit tests for snapshot methods * 
Start adding integration test * Test using the new controller code * Update agent/consul/state/config_entry_events.go * Check value of error * Add controller stubs for API Gateway (#15837) * update initial stub implementation * move files, clean up mutex references * Remove embed, use idiomatic names for constructors * Remove stray file introduced in merge * Add APIGateway validation (#15847) * Add APIGateway validation * Add additional validations * Add cert ref validation * Add protobuf definitions * Fix up field types * Add API structs * Move struct fields around a bit * APIGateway InlineCertificate validation (#15856) * Add APIGateway validation * Add additional validations * Add protobuf definitions * Tabs to spaces * Add API structs * Move struct fields around a bit * Add validation for InlineCertificate * Fix ACL test * APIGateway BoundAPIGateway validation (#15858) * Add APIGateway validation * Add additional validations * Add cert ref validation * Add protobuf definitions * Fix up field types * Add API structs * Move struct fields around a bit * Add validation for BoundAPIGateway * APIGateway TCPRoute validation (#15855) * Add APIGateway validation * Add additional validations * Add cert ref validation * Add protobuf definitions * Fix up field types * Add API structs * Add TCPRoute normalization and validation * Add forgotten Status * Add some more field docs in api package * Fix test * Format imports * Rename snapshot test variable names * Add plumbing for Native API GW Subscriptions (#16003) Co-authored-by: Sarah Alsmiller <sarah.alsmiller@hashicorp.com> Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> Co-authored-by: sarahalsmiller <100602640+sarahalsmiller@users.noreply.github.com> Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>	2023-01-18 22:14:34 +00:00
Chris Thain	2f4c8e50f2	Support Vault agent auth config for AWS/GCP CA provider auth (#15970 )	2023-01-18 11:53:04 -08:00
Derek Menteer	2facf50923	Fix configuration merging for implicit tproxy upstreams. (#16000 ) Fix configuration merging for implicit tproxy upstreams. Change the merging logic so that the wildcard upstream has correct proxy-defaults and service-defaults values combined into it. It did not previously merge all fields, and the wildcard upstream did not exist unless service-defaults existed (it ignored proxy-defaults, essentially). Change the way we fetch upstream configuration in the xDS layer so that it falls back to the wildcard when no matching upstream is found. This is what allows implicit peer upstreams to have the correct "merged" config. Change proxycfg to always watch local mesh gateway endpoints whenever a peer upstream is found. This simplifies the logic so that we do not have to inspect the "merged" configuration on peer upstreams to extract the mesh gateway mode.	2023-01-18 13:43:53 -06:00
Dan Upton	7a55de375c	xds: don't attempt to load-balance sessions for local proxies (#15789 ) Previously, we'd begin a session with the xDS concurrency limiter regardless of whether the proxy was registered in the catalog or in the server's local agent state. This caused problems for users who run `consul connect envoy` directly against a server rather than a client agent, as the server's locally registered proxies wouldn't be included in the limiter's capacity. Now, the `ConfigSource` is responsible for beginning the session and we only do so for services in the catalog. Fixes: https://github.com/hashicorp/consul/issues/15753	2023-01-18 12:33:21 -06:00
Chris S. Kim	e4a268e33e	Warn if ACL is enabled but no token is provided to Envoy (#15967 )	2023-01-16 12:31:56 -05:00
Dhia Ayachi	87ff8c1c95	avoid logging RPC errors when it's specific rate limiter errors (#15968 ) * avoid logging RPC errors when it's specific rate limiter errors * simplify if statements	2023-01-16 12:08:09 -05:00
Derek Menteer	19a46d6ca4	Enforce lowercase peer names. (#15697 ) Enforce lowercase peer names. Prior to this change peer names could be mixed case. This can cause issues, as peer names are used as DNS labels in various locations. It also caused issues with envoy configuration.	2023-01-13 14:20:28 -06:00
Dan Stough	6d2880e894	feat: add access logs to dataplane bootstrap rpc (#15951 )	2023-01-11 13:40:09 -05:00
Matt Keeler	5afd4657ec	Protobuf Modernization (#15949 ) * Protobuf Modernization Remove direct usage of golang/protobuf in favor of google.golang.org/protobuf Marshallers (protobuf and json) needed some changes to account for different APIs. Moved to using the google.golang.org/protobuf/types/known/* for the well known types including replacing some custom Struct manipulation with what's available in the structpb well known type package. This also updates our devtools script to install protoc-gen-go from the right location so that files it generates conform to the correct interfaces. * Fix go-mod-tidy make target to work on all modules	2023-01-11 09:39:10 -05:00
Paul Glass	f5231b9157	Add new config_file_service_registration token (#15828 )	2023-01-10 10:24:02 -06:00
Chris S. Kim	a7b34d50fc	Output user-friendly name for anonymous token (#15884 )	2023-01-09 12:28:53 -06:00
Dan Upton	644cd864a5	Rate limit improvements and fixes (#15917 ) - Fixes a panic when Operation.SourceAddr is nil (internal net/rpc calls) - Adds proper HTTP response codes (429 and 503) for rate limit errors - Makes the error messages clearer - Enables automatic retries for rate-limit errors in the net/rpc stack	2023-01-09 10:20:05 +00:00
Semir Patel	40c0bb24ae	emit metrics for global rate limiting (#15891 )	2023-01-06 17:49:33 -06:00
Dhia Ayachi	233eacf0a4	inject logger and create logdrop sink (#15822 ) * inject logger and create logdrop sink * init sink with an empty struct instead of nil * wrap a logger instead of a sink and add a discard logger to avoid double logging * fix compile errors * fix linter errors * Fix bug where log arguments aren't properly formatted * Move log sink construction outside of handler * Add prometheus definition and docs for log drop counter Co-authored-by: Daniel Upton <daniel@floppy.co>	2023-01-06 11:33:53 -07:00
Eric Haberkorn	8d923c1789	Add the Lua Envoy extension (#15906 )	2023-01-06 12:13:40 -05:00
Paul Glass	666c2b2e2b	Fix TLS_BadVerify test assertions on macOS (#15903 )	2023-01-05 11:47:45 -06:00
Dan Upton	b78de5a7a2	grpc/acl: fix bug where ACL token was required even if disabled (#15904 ) Fixes a bug introduced by #15346 where we'd always require an ACL token even if ACLs were disabled because we were erroneously treating `nil` identity as anonymous.	2023-01-05 16:31:18 +00:00
Dan Upton	d53ce39c32	grpc: switch servers and retry on error (#15892 ) This is the OSS portion of enterprise PR 3822. Adds a custom gRPC balancer that replicates the router's server cycling behavior. Also enables automatic retries for RESOURCE_EXHAUSTED errors, which we now get for free.	2023-01-05 10:21:27 +00:00
Nick Irvine	6fb628c07d	fix: return error when config file with unknown extension is passed (#15107 )	2023-01-04 16:57:00 -08:00
Florian Apolloner	077b0a48a3	Allow Operator Generated bootstrap token (#14437 ) Add support to provide an initial token via the bootstrap HTTP API, similar to hashicorp/nomad#12520	2023-01-04 20:19:33 +00:00
Semir Patel	a6482341a5	Wire up the rate limiter to net/rpc calls (#15879 )	2023-01-04 13:38:44 -06:00
Dan Upton	d4c435856b	grpc: `protoc` plugin for generating gRPC rate limit specifications (#15564 ) Adds automation for generating the map of `gRPC Method Name → Rate Limit Type` used by the middleware introduced in #15550, and will ensure we don't forget to add new endpoints. Engineers must annotate their RPCs in the proto file like so: ``` rpc Foo(FooRequest) returns (FooResponse) { option (consul.internal.ratelimit.spec) = { operation_type: READ, }; } ``` When they run `make proto` a protoc plugin `protoc-gen-consul-rate-limit` will be installed that writes rate-limit specs as a JSON array to a file called `.ratelimit.tmp` (one per protobuf package/directory). After running Buf, `make proto` will execute a post-process script that will ingest all of the `.ratelimit.tmp` files and generate a Go file containing the mappings in the `agent/grpc-middleware` package. In the enterprise repository, it will write an additional file with the enterprise-only endpoints. If an engineer forgets to add the annotation to a new RPC, the plugin will return an error like so: ``` RPC Foo is missing rate-limit specification, fix it with: import "proto-public/annotations/ratelimit/ratelimit.proto"; service Bar { rpc Foo(...) returns (...) { option (hashicorp.consul.internal.ratelimit.spec) = { operation_type: OPERATION_READ | OPERATION_WRITE | OPERATION_EXEMPT, }; } } ``` In the future, this annotation can be extended to support rate-limit category (e.g. KV vs Catalog) and to determine the retry policy.	2023-01-04 16:07:02 +00:00
Dan Upton	7c7503c849	grpc/acl: relax permissions required for "core" endpoints (#15346 ) Previously, these endpoints required `service:write` permission on _any_ service as a sort of proxy for "is the caller allowed to participate in the mesh?". Now, they're called as part of the process of establishing a server connection by any consumer of the consul-server-connection-manager library, which will include non-mesh workloads (e.g. Consul KV as a storage backend for Vault) as well as ancillary components such as consul-k8s' acl-init process, which likely won't have `service:write` permission. So this commit relaxes those requirements to accept any valid ACL token on the following gRPC endpoints: - `hashicorp.consul.dataplane.DataplaneService/GetSupportedDataplaneFeatures` - `hashicorp.consul.serverdiscovery.ServerDiscoveryService/WatchServers` - `hashicorp.consul.connectca.ConnectCAService/WatchRoots`	2023-01-04 12:40:34 +00:00
Derek Menteer	1f7e7abeac	Fix issue with incorrect proxycfg watch on upstream peer-targets. (#15865 ) This fixes an issue where the incorrect partition was given to the upstream target watch, which meant that failover logic would not work correctly.	2023-01-03 10:44:08 -06:00
Derek Menteer	f3776894bf	Fix agent cache incorrectly notifying unchanged protobufs. (#15866 ) Fix agent cache incorrectly notifying unchanged protobufs. This change fixes a situation where the protobuf private fields would be read by reflect.DeepEqual() and indicate data was modified. This resulted in change notifications being fired every time, which could cause performance problems in proxycfg.	2023-01-03 10:11:56 -06:00
Dan Upton	7747384f1f	Wire in rate limiter to handle internal and external gRPC calls (#15857 )	2022-12-23 13:42:16 -06:00
Dan Stough	b3bd3a6586	[OSS] feat: access logs for listeners and listener filters (#15864 ) * feat: access logs for listeners and listener filters * changelog * fix integration test	2022-12-22 15:18:15 -05:00
Nitya Dhanushkodi	24f01f96b1	add extensions for local service to GetExtensionConfigurations (#15871 ) This gets the extensions information for the local service into the snapshot and ExtensionConfigurations for a proxy. It grabs the extensions from config entries and puts them in structs.NodeService.Proxy field, which already is copied into the config snapshot. Also: * add EnvoyExtensions to api.AgentService so that it matches structs.NodeService	2022-12-22 10:03:33 -08:00
Nitya Dhanushkodi	c7ef04c597	[OSS] extensions: refactor PluginConfiguration into a more generic type ExtensionConfiguration (#15846 ) * extensions: refactor PluginConfiguration into a more generic type ExtensionConfiguration Also: * adds endpoints configuration to lambda golden tests * uses string constant for builtin/aws/lambda Co-authored-by: Eric <eric@haberkorn.co>	2022-12-20 22:26:20 -08:00
John Murret	f5e01f8c6b	Rate Limit Handler - ensure rate limiting is not in the code path when not configured (#15819 ) * Rate limiting handler - ensure configuration has changed before modifying limiters * Updating test to validate arguments to UpdateConfig * Removing duplicate test. Updating mock. * Renaming NullRateLimiter to NullRequestLimitsHandler * Rate Limit Handler - ensure rate limiting is not in the code path when not configured * Update agent/consul/rate/handler.go Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> * formatting handler.go * Rate limiting handler - ensure configuration has changed before modifying limiters * Updating test to validate arguments to UpdateConfig * Removing duplicate test. Updating mock. * adding logging for when UpdateConfig is called but the config has not changed. * Update agent/consul/rate/handler.go Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> * Update agent/consul/rate/handler_test.go Co-authored-by: Dan Upton <daniel@floppy.co> * modifying existing variable name based on pr feedback * updating a broken merge conflict; Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> Co-authored-by: Dan Upton <daniel@floppy.co>	2022-12-20 15:00:22 -07:00
John Murret	aba43d85d9	Rate limiting handler - ensure configuration has changed before modifying limiters (#15805 ) * Rate limiting handler - ensure configuration has changed before modifying limiters * Updating test to validate arguments to UpdateConfig * Removing duplicate test. Updating mock. * adding logging for when UpdateConfig is called but the config has not changed. * Update agent/consul/rate/handler.go Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2022-12-20 14:12:03 -07:00
Michael Wilkerson	1b28b89439	Enhancement: Consul Compatibility Checking (#15818 ) * add functions for returning the max and min Envoy major versions - added an UnsupportedEnvoyVersions list - removed an unused error from TestDetermineSupportedProxyFeaturesFromString - modified minSupportedVersion to use the function for getting the Min Envoy major version. Using just the major version without the patch is equivalent to using `.0` * added a function for executing the envoy --version command - added a new exec.go file to not be locked to unix system * added envoy version check when using consul connect envoy * added changelog entry * added docs change	2022-12-20 09:58:19 -08:00
Derek Menteer	74b11c416c	Fix incorrect protocol check on discovery chains with peer targets. (#15833 )	2022-12-20 10:15:03 -06:00
Semir Patel	799b34f1a9	Map net/rpc endpoints to a read/write/exempt op for rate-limiting (#15825 ) Also fixed TestRequestRecorder flaky tests due to loss of precision in elapsed time in the test.	2022-12-19 16:04:52 -06:00
Nitya Dhanushkodi	d382ca0aec	extensions: refactor serverless plugin to use extensions from config entry fields (#15817 ) docs: update config entry docs and the Lambda manual registration docs Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com> Co-authored-by: Eric <eric@haberkorn.co>	2022-12-19 12:19:37 -08:00
Chris S. Kim	d44b23cb31	Break instead (#15844 )	2022-12-19 11:53:05 -07:00
Chris S. Kim	831680d2c5	Add custom balancer to always remove subConns (#15701 ) The new balancer is a patched version of gRPC's default pick_first balancer which removes the behavior of preserving the active subconnection if a list of new addresses contains the currently active address.	2022-12-19 17:39:31 +00:00
Andrew Stucki	ab199a11b0	Add async reconciliation controller subpackage (#15534 ) * Add async reconciliation controller subpackage * Address initial feedback * Add tests for panic assertions * Fix comment	2022-12-16 16:49:26 -05:00
Dhia Ayachi	f04f88e4b9	add missing code and fix enterprise specific code (#15375 ) * add missing code and fix enterprise specific code * fix retry * fix flaky tests * fix linter error in test	2022-12-16 16:31:05 -05:00
Dhia Ayachi	2d902b26ac	add log-drop package (#15670 ) * add log-drop package * refactor to extract level * extract metrics * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * fix compile errors * change to implement a log sink * fix tests to remove sleep * rename and add go docs * fix expanding variadic Co-authored-by: Dan Upton <daniel@floppy.co>	2022-12-15 12:52:48 -05:00
Paul Glass	619032cfcd	Deprecate -join and -join-wan (#15598 )	2022-12-14 20:28:25 +00:00
Dhia Ayachi	6468e3e09c	Server side rate limiter: handle the race condition for limiters tree write in multilimiter (#15767 ) * change to perform all tree writes in the same go routine to avoid race condition. * rename runStoreOnce to reconcile * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * reduce nesting Co-authored-by: Dan Upton <daniel@floppy.co>	2022-12-14 17:32:11 +00:00
Semir Patel	bafa5c7156	Pass remote addr of incoming HTTP requests through to RPC(..) calls (#15700 )	2022-12-14 09:24:22 -06:00
John Murret	e027c94b52	adding config for request_limits (#15531 ) * server: add placeholder glue for rate limit handler This commit adds a no-op implementation of the rate-limit handler and adds it to the `consul.Server` struct and setup code. This allows us to start working on the net/rpc and gRPC interceptors and config logic. * Add handler errors * Set the global read and write limits * fixing multilimiter moving packages * Fix typo * Simplify globalLimit usage * add multilimiter and tests * exporting LimitedEntity * Apply suggestions from code review Co-authored-by: John Murret <john.murret@hashicorp.com> * add config update and rename config params * add doc string and split config * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * use timer to avoid go routine leak and change the interface * add comments to tests * fix failing test * add prefix with config edge, refactor tests * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * refactor to apply configs for limiters under a prefix * add fuzz tests and fix bugs found. Refactor reconcile loop to have a simpler logic * make KeyType an exported type * split the config and limiter trees to fix race conditions in config update * rename variables * fix race in test and remove dead code * fix reconcile loop to not create a timer on each loop * add extra benchmark tests and fix tests * fix benchmark test to pass value to func * server: add placeholder glue for rate limit handler This commit adds a no-op implementation of the rate-limit handler and adds it to the `consul.Server` struct and setup code. This allows us to start working on the net/rpc and gRPC interceptors and config logic. * Set the global read and write limits * fixing multilimiter moving packages * add server configuration for global rate limiting. 
* remove agent test * remove added stuff from handler * remove added stuff from multilimiter * removing unnecessary TODOs * Removing TODO comment from handler * adding in defaulting to infinite * add disabled status in there * adding in documentation for disabled mode. * make disabled the default. * Add mock and agent test * adding documentation and missing mock file. * Fixing test TestLoad_IntegrationWithFlags * updating docs based on PR feedback. * Updating Request Limits mode to use int based on PR feedback. * Adding RequestLimits struct so we have a nested struct in ReloadableConfig. * fixing linting references * Update agent/consul/rate/handler.go Co-authored-by: Dan Upton <daniel@floppy.co> * Update agent/consul/config.go Co-authored-by: Dan Upton <daniel@floppy.co> * removing the ignore of the request limits in JSON. adding builder logic to convert any read rate or write rate less than 0 to rate.Inf * added conversion function to convert request limits object to handler config. * Updating docs to reflect gRPC and RPC are rate limit and as a result, HTTP requests are as well. * Updating values for TestLoad_FullConfig() so that they were different and discernable. * Updating TestRuntimeConfig_Sanitize * Fixing TestLoad_IntegrationWithFlags test * putting nil check in place * fixing rebase * removing change for missing error checks. will put in another PR * Rebasing after default multilimiter config change * resolving rebase issues * updating reference for incomingRPCLimiter to use interface * updating interface * Updating interfaces * Fixing mock reference Co-authored-by: Daniel Upton <daniel@floppy.co> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2022-12-13 13:09:55 -07:00
Dan Stough	233dbcb67f	feat: add access logging API to proxy defaults (#15780 )	2022-12-13 14:52:18 -05:00
cskh	04bf24c8c1	feat(ingress-gateway): support outlier detection of upstream service for ingress gateway (#15614 ) * feat(ingress-gateway): support outlier detection of upstream service for ingress gateway * changelog Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>	2022-12-13 11:51:37 -05:00
Derek Menteer	e87d35e313	Fix DialedDirectly configuration for Consul dataplane. (#15760 ) Fix DialedDirectly configuration for Consul dataplane.	2022-12-13 09:16:31 -06:00
Dan Upton	c692802dec	grpc: add rate-limiting middleware (#15550 ) Implements the gRPC middleware for rate-limiting as a tap.ServerInHandle function (executed before the request is unmarshaled). Mappings between gRPC methods and their operation type are generated by a protoc plugin introduced by #15564.	2022-12-13 15:01:56 +00:00
Dan Upton	eef38c2199	server: add placeholder glue for rate limit handler (#15539 ) Adds a no-op implementation of the rate-limit handler and exposes it on the consul.Server struct. It allows us to start working on the net/rpc and gRPC interceptors and config (re)loading logic, without having to implement the full handler up-front. Co-authored-by: John Murret <john.murret@hashicorp.com> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2022-12-13 11:41:54 +00:00
John Murret	cd53120cd7	agent: Fix assignment of error when auto-reloading cert and key file changes. (#15769 ) * Adding the setting of errors missing in config file watcher code in agent. * add changelog	2022-12-12 12:24:39 -07:00
R.B. Boyer	4a32070210	test: remove variable shadowing in TestDNS_ServiceLookup_ARecordLimits (#15740 )	2022-12-09 10:19:02 -06:00
Eric Haberkorn	4268c1c25c	Remove the `connect.enable_serverless_plugin` agent configuration option (#15710 )	2022-12-08 14:46:42 -05:00
Dhia Ayachi	81e40c1fac	add multilimiter and tests (#15467 ) * add multilimiter and tests * exporting LimitedEntity * go mod tidy * Apply suggestions from code review Co-authored-by: John Murret <john.murret@hashicorp.com> * add config update and rename config params * add doc string and split config * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * use timer to avoid go routine leak and change the interface * add comments to tests * fix failing test * add prefix with config edge, refactor tests * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * refactor to apply configs for limiters under a prefix * add fuzz tests and fix bugs found. Refactor reconcile loop to have a simpler logic * make KeyType an exported type * split the config and limiter trees to fix race conditions in config update * rename variables * fix race in test and remove dead code * fix reconcile loop to not create a timer on each loop * add extra benchmark tests and fix tests * fix benchmark test to pass value to func * use a separate go routine to write limiters (#15643) * use a separate go routine to write limiters * Add updating limiter when another limiter is created * fix waiter to be a ticker, so we commit more than once. * fix tests and add tests for coverage * unexport members and add tests * make UpdateConfig thread safe and multi call to Run safe * replace switch with if * fix review comments * replace time.sleep with retries * fix flaky test and remove unnecessary init * fix test races * remove unnecessary negative test case * remove fixed todo Co-authored-by: John Murret <john.murret@hashicorp.com> Co-authored-by: Dan Upton <daniel@floppy.co>	2022-12-08 14:42:07 -05:00
cskh	3df68751f5	Flakiness test: case-cfg-splitter-peering-ingress-gateways (#15707 ) * integ-test: fix flaky test - case-cfg-splitter-peering-ingress-gateways * add retry peering to all peering cases Co-authored-by: Dan Stough <dan.stough@hashicorp.com>	2022-12-07 20:19:34 -05:00
Derek Menteer	97ec5279aa	Fix local mesh gateway with peering discovery chains. (#15690 ) Fix local mesh gateway with peering discovery chains. Prior to this patch, discovery chains with peers would not properly honor the mesh gateway mode for two reasons. 1. An incorrect target upstream ID was used to lookup the mesh gateway mode. To fix this, the parent upstream uid is now used instead of the discovery-chain-target-uid to find the intended mesh gateway mode. 2. The watch for local mesh gateways was never initialized for discovery chains. To fix this, the discovery chains are now scanned, and a local GW watch is spawned if: the mesh gateway mode is local and the target is a peering connection.	2022-12-07 13:07:42 -06:00
R.B. Boyer	5af94fb2a0	connect: use -dev-no-store-token for test vaults to reduce source of flakes (#15691 ) It turns out that by default the dev mode vault server will attempt to interact with the filesystem to store the provided root token. If multiple vault instances are running they'll all awkwardly share the filesystem and if timing results in one server stopping while another one is starting then the starting one will error with: Error initializing Dev mode: rename /home/circleci/.vault-token.tmp /home/circleci/.vault-token: no such file or directory This change uses `-dev-no-store-token` to bypass that source of flakes. Also the stdout/stderr from the vault process is included if the test fails. The introduction of more `t.Parallel` use in https://github.com/hashicorp/consul/pull/15669 increased the likelihood of this failure, but any of the tests with multiple vaults in use (or running multiple package tests in parallel that all use vault) were eventually going to flake on this.	2022-12-06 13:15:13 -06:00
R.B. Boyer	900584ca82	connect: ensure all vault connect CA tests use limited privilege tokens (#15669 ) All of the current integration tests where Vault is the Connect CA now use non-root tokens for the test. This helps us detect privilege changes in the vault model so we can keep our guides up to date. One larger change was that the RenewIntermediate function got refactored slightly so it could be used from a test, rather than the large duplicated function we were testing in a test which seemed error prone.	2022-12-06 10:06:36 -06:00
R.B. Boyer	4940a728ab	Detect Vault 1.11+ import in secondary datacenters and update default issuer (#15661 ) The fix outlined and merged in #15253 fixed the issue as it occurs in the primary DC. There is a similar issue that arises when vault is used as the Connect CA in a secondary datacenter that is fixed by this PR. Additionally: this PR adds support to run the existing suite of vault related integration tests against the last 4 versions of vault (1.9, 1.10, 1.11, 1.12)	2022-12-05 15:39:21 -06:00
Chris S. Kim	c046d1a4d8	Add warn log when all ACL policies are filtered out (#15632 )	2022-12-05 11:26:10 -05:00
cskh	36f05bc8fb	integ-test: test consul upgrade from the snapshot of a running cluster (#15595 ) * integ-test: test consul upgrade from the snapshot of a running cluster * use Target version as default Co-authored-by: Dan Stough <dan.stough@hashicorp.com>	2022-12-01 10:39:09 -05:00
R.B. Boyer	11a277f372	peering: better represent non-passing states during peer check flattening (#15615 ) During peer stream replication we flatten checks from the source cluster and build one thin overall check to hide the irrelevant details from the consuming cluster. This flattening logic did correctly flip to non-passing if there were any non-passing checks, but WHICH status it got during that was random (warn/error). Also it didn't represent "maintenance" operations. There is an api package call AggregatedStatus which more correctly flattened check statuses. This PR replicated the more complete logic into the peer stream package.	2022-11-30 11:29:21 -06:00
Freddy	941f6da202	Remove log line about server mgmt token init (#15610 ) * Remove log line about server mgmt token init Currently the server management token is only being bootstrapped in the primary datacenter. That means that servers on the secondary datacenter will never have this token available, and would log this line any time a token is resolved. Bootstrapping the token in secondary datacenters will be done in a follow-up. * Add changelog entry	2022-11-29 17:56:03 -05:00
James Oulman	7e78fb7818	Add support for configuring Envoys route idle_timeout (#14340 ) * Add idleTimeout Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2022-11-29 17:43:15 -05:00
Derek Menteer	95dc0c7b30	Add peering `.service` and `.node` DNS lookups. (#15596 ) Add peering `.service` and `.node` DNS lookups.	2022-11-29 12:23:18 -06:00
cskh	97c9432843	fix(peering): increase the gRPC limit to 8MB (#15503 ) * fix(peering): increase the gRPC limit to 50MB * changelog * update gRPC limit to 8MB	2022-11-28 17:48:43 -05:00
Chris S. Kim	c9ec9fa320	Fix Vault managed intermediate PKI bug (#15525 )	2022-11-28 16:17:58 -05:00
Chris S. Kim	27c53f6c82	Use backport-compatible assertion (#15546 ) * Use backport-compatible assertion * Add workaround for broken apt-get	2022-11-24 11:44:20 -05:00
Chris S. Kim	386da5439a	Use rpcHoldTimeout to calculate blocking timeout (#15541 ) Adds buffer to clients so that servers have time to respond to blocking queries.	2022-11-24 10:13:02 -05:00
Jared Kirschner	3e7e8ae9c5	Support RFC 2782 for prepared query DNS lookups (#14465 ) Format: _<query id or name>._tcp.query[.<datacenter>].<domain>	2022-11-20 17:21:24 -05:00
Alexander Scheel	2b90307f6d	Detect Vault 1.11+ import, update default issuer (#15253 ) Consul used to rely on implicit issuer selection when calling Vault endpoints to issue new CSRs. Vault 1.11+ changed that behavior, which caused Consul to check the wrong (previous) issuer when renewing its Intermediate CA. This patch allows Consul to explicitly set a default issuer when it detects that the response from Vault is 1.11+. Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-11-17 16:29:49 -05:00
cskh	435e16ecda	fix: clarifying error message when acquiring a lock in remote dc (#15394 ) * fix: clarifying error message when acquiring a lock in remote dc * Update website/content/commands/lock.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2022-11-16 15:27:37 -05:00
Kyle Havlovitz	f4c3e54b11	auto-config: relax node name validation for JWT authorization (#15370 ) * auto-config: relax node name validation for JWT authorization This changes the JWT authorization logic to allow all non-whitespace, non-quote characters when validating node names. Consul had previously allowed these characters in node names, until this validation was added to fix a security vulnerability with whitespace/quotes being passed to the `bexpr` library. This unintentionally broke node names with characters like `.` which aren't related to this vulnerability. * Update website/content/docs/agent/config/cli-flags.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2022-11-14 18:24:40 -06:00
Dhia Ayachi	225ae55e83	Leadership transfer cmd (#14132 ) * add leadership transfer command * add RPC call test (flaky) * add missing import * add changelog * add command registration * Apply suggestions from code review Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> * add the possibility of providing an id to raft leadership transfer. Add few tests. * delete old file from cherry pick * rename changelog filename to PR # * rename changelog and fix import * fix failing test * check for OperatorWrite Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> * rename from leader-transfer to transfer-leader * remove version check and add test for operator read * move struct to operator.go * first pass * add code for leader transfer in the grpc backend and tests * wire the http endpoint to the new grpc endpoint * remove the RPC endpoint * remove non needed struct * fix naming * add mog glue to API * fix comment * remove dead code * fix linter error * change package name for proto file * remove error wrapping * fix failing test * add command registration * add grpc service mock tests * fix receiver to be pointer * use defined values Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> * reuse MockAclAuthorizer * add documentation * remove usage of external.TokenFromContext * fix failing tests * fix proto generation * Apply suggestions from code review Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Apply suggestions from code review * add more context in doc for the reason * Apply suggestions from docs code review Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * regenerate proto * fix linter errors Co-authored-by: 
github-team-consul-core <github-team-consul-core@hashicorp.com> Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2022-11-14 15:35:12 -05:00
Freddy	706866fa00	Ensure that NodeDump imported nodes are filtered (#15356 )	2022-11-14 12:35:20 -07:00
Freddy	c58f86a00f	Fixup authz for data imported from peers (#15347 ) There are a few changes that needed to be made to handle authorizing reads for imported data: - If the data was imported from a peer we should not attempt to read the data using the traditional authz rules. This is because the name of services/nodes in a peer cluster are not equivalent to those of the importing cluster. - If the data was imported from a peer we need to check whether the token corresponds to a service, meaning that it has service:write permissions, or to a local read only token that can read all nodes/services in a namespace. This required changes at the policyAuthorizer level, since that is the only view available to OSS Consul, and at the enterprise partition/namespace level.	2022-11-14 11:36:27 -07:00
Kyle Havlovitz	dde5c524ad	connect: strip port from DNS SANs for ingress gateway leaf cert (#15320 ) * connect: strip port from DNS SANs for ingress gateway leaf cert * connect: format DNS SANs in CreateCSR * connect: Test wildcard case when formatting SANs	2022-11-14 10:27:03 -08:00
Derek Menteer	931cec42b3	Prevent serving TLS via ports.grpc (#15339 ) Prevent serving TLS via ports.grpc We remove the ability to run the ports.grpc in TLS mode to avoid confusion and to simplify configuration. This breaking change ensures that any user currently using ports.grpc in an encrypted mode will receive an error message indicating that ports.grpc_tls must be explicitly used. The suggested action for these users is to simply swap their ports.grpc to ports.grpc_tls in the configuration file. If both ports are defined, or if the user has not configured TLS for grpc, then the error message will not be printed.	2022-11-11 14:29:22 -06:00
Dan Stough	626249fbf5	[OSS] fix: wait and try longer to peer through mesh gw (#15328 )	2022-11-10 13:54:00 -05:00
Kyle Schochenmaier	bf0f61a878	removes ioutil usage everywhere which was deprecated in go1.16 (#15297 ) * update go version to 1.18 for api and sdk, go mod tidy * removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward. Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-11-10 10:26:01 -06:00
malizz	b51f0e25e9	update ACLs for cluster peering (#15317 ) * update ACLs for cluster peering * add changelog * Update .changelog/15317.txt Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com> Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>	2022-11-09 13:02:58 -08:00
malizz	b9a9e1219c	update config defaults, add docs (#15302 ) * update config defaults, add docs * update grpc tls port for non-default values * add changelog * Update website/content/docs/upgrading/upgrade-specific.mdx Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> * Update website/content/docs/agent/config/config-files.mdx Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> * update logic for setting grpc tls port value * move default config to default.go, update changelog * update docs * Fix config tests. * Fix linter error. * Fix ConnectCA tests. * Cleanup markdown on upgrade notes. Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>	2022-11-09 09:29:55 -08:00
Eric Haberkorn	c340922991	Log Warnings When Peering With Mesh Gateway Mode None (#15304 ) warn when mesh gateway mode is set to none for peering	2022-11-09 11:48:58 -05:00
Derek Menteer	418bd62c44	Fix mesh gateway configuration with proxy-defaults (#15186 ) * Fix mesh gateway proxy-defaults not affecting upstreams. * Clarify distinction with upstream settings Top-level mesh gateway mode in proxy-defaults and service-defaults gets merged into NodeService.Proxy.MeshGateway, and only gets merged with the mode attached to an upstream in proxycfg/xds. * Fix mgw mode usage for peered upstreams There were a couple issues with how mgw mode was being handled for peered upstreams. For starters, mesh gateway mode from proxy-defaults and the top-level of service-defaults gets stored in NodeService.Proxy.MeshGateway, but the upstream watch for peered data was only considering the mesh gateway config attached in NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway mode via global proxy-defaults or service-defaults on the downstream would not have an effect. Separately, transparent proxy watches for peered upstreams didn't consider mesh gateway mode at all. This commit addresses the first issue by ensuring that we overlay the upstream config for peered upstreams as we do for non-peered. The second issue is addressed by re-using setupWatchesForPeeredUpstream when handling transparent proxy updates. Note that for transparent proxies we do not yet support mesh gateway mode per upstream, so the NodeService.Proxy.MeshGateway mode is used. * Fix upstream mesh gateway mode handling in xds This commit ensures that when determining the mesh gateway mode for peered upstreams we consider the NodeService.Proxy.MeshGateway config as a baseline. In absence of this change, setting a mesh gateway mode via proxy-defaults or the top-level of service-defaults will not have an effect for peered upstreams. * Merge service/proxy defaults in cfg resolver Previously the mesh gateway mode for connect proxies would be merged at three points: 1. On servers, in ComputeResolvedServiceConfig. 2. On clients, in MergeServiceConfig. 3. 
On clients, in proxycfg/xds. The first merge returns a ServiceConfigResponse where there is a top-level MeshGateway config from proxy/service-defaults, along with per-upstream config. The second merge combines per-upstream config specified at the service instance with per-upstream config specified centrally. The third merge combines the NodeService.Proxy.MeshGateway config containing proxy/service-defaults data with the per-upstream mode. This third merge is easy to miss, which led to peered upstreams not considering the mesh gateway mode from proxy-defaults. This commit removes the third merge, and ensures that all mesh gateway config is available at the upstream. This way proxycfg/xds do not need to do additional overlays. * Ensure that proxy-defaults is considered in wc Upstream defaults become a synthetic Upstream definition under a wildcard key "". Now that proxycfg/xds expect Upstream definitions to have the final MeshGateway values, this commit ensures that values from proxy-defaults/service-defaults are the default for this synthetic upstream. Add changelog. Co-authored-by: freddygv <freddy@hashicorp.com>	2022-11-09 10:14:29 -06:00
Dan Upton	7b2d08d461	chore: remove unused argument from MergeNodeServiceWithCentralConfig (#15024 ) Previously, the MergeNodeServiceWithCentralConfig method accepted a ServiceSpecificRequest argument, of which only the Datacenter and QueryOptions fields were used. Digging a little deeper, it turns out these fields were only passed down to the ComputeResolvedServiceConfig method (through the ServiceConfigRequest struct) which didn't actually use them. As such, not all call-sites passed a valid ServiceSpecificRequest so it's safer to remove the argument altogether to prevent future changes from depending on it.	2022-11-09 14:54:57 +00:00
Derek Menteer	b64972d486	Bring back parameter ServerExternalAddresses in GenerateToken endpoint (#15267 ) Re-add ServerExternalAddresses parameter in GenerateToken endpoint This reverts commit `5e156772f6` and adds extra functionality to support newer peering behaviors.	2022-11-08 14:55:18 -06:00
cskh	a3f57cc5e8	fix(mesh-gateway): remove deregistered service from mesh gateway (#15272 ) * fix(mesh-gateway): remove deregistered service from mesh gateway * changelog Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> Co-authored-by: Evan Culver <eculver@users.noreply.github.com>	2022-11-07 20:30:15 -05:00
Freddy	7f5f7e9cf9	Avoid blocking child type updates on parent ack (#15083 )	2022-11-07 18:10:42 -07:00
Derek Menteer	c064ddf606	Backport test fix from ent. (#15279 )	2022-11-07 12:17:46 -06:00
Chris S. Kim	985a4ee1b1	Update hcp-scada-provider to fix diamond dependency problem with go-msgpack (#15185 )	2022-11-07 11:34:30 -05:00
Eric Haberkorn	1804b58799	Fix a bug in mesh gateway proxycfg where ACL tokens aren't passed. (#15273 )	2022-11-07 10:00:11 -05:00
Dan Stough	553312ef61	fix: persist peering CA updates to dialing clusters (#15243 ) fix: persist peering CA updates to dialing clusters	2022-11-04 12:53:20 -04:00
Derek Menteer	18d6c338f4	Backport tests from ent. (#15260 ) * Backport agent tests. Original commit: 0710b2d12fb51a29cedd1119b5fb086e5c71f632 Original commit: aaedb3c28bfe247266f21013d500147d8decb7cd (partial) * Backport test fix and reduce flaky failures.	2022-11-04 10:19:24 -05:00
Derek Menteer	0834fe349b	Backport test from ENT: "Fix missing test fields" (#15258 ) * Backport test from ENT: "Fix missing test fields" Original Author: Sarah Pratt Original Commit: a5c88bef7a969ea5d06ed898d142ab081ba65c69 * Update with proper linting.	2022-11-04 09:29:16 -05:00
Derek Menteer	f4cb2f82bf	Backport various fixes from ENT. (#15254 ) * Regenerate golden files. * Backport from ENT: "Avoid race" Original commit: 5006c8c858b0e332be95271ef9ba35122453315b Original author: freddygv * Backport from ENT: "chore: fix flake peerstream test" Original commit: b74097e7135eca48cc289798c5739f9ef72c0cc8 Original author: DanStough	2022-11-03 16:34:57 -05:00
malizz	617a5f2dc2	convert stream status time fields to pointers (#15252 )	2022-11-03 11:51:22 -07:00
sarahalsmiller	436160e155	Added check for empty peeringsni in restrictPeeringEndpoints (#15239 ) Add check for empty peeringSNI in restrictPeeringEndpoints Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>	2022-11-02 17:20:52 -05:00
Derek Menteer	bd1019fadb	Prevent peering acceptor from subscribing to addr updates. (#15214 )	2022-11-02 07:55:41 -05:00
Dan Stough	05e93f7569	test: refactor testcontainers and add peering integ tests (#15084 )	2022-11-01 15:03:23 -04:00
Derek Menteer	fa5d87c116	Decrease retry time for failed peering connections.	2022-10-31 14:30:27 -05:00
R.B. Boyer	97b9fcbf48	test: fix flaky TestSubscribeBackend_IntegrationWithServer_DeliversAllMessages test (#15195 ) Allow for some message duplication in subscription events during assertions. I'm pretty sure the subscriptions machinery allows for messages to occasionally be duplicated instead of dropping them, as a once-and-only-once queue is a pipe dream and you have to pick one of the other two options.	2022-10-31 12:10:43 -05:00
Evan Culver	62d4517f9e	connect: Add Envoy 1.24 to integration tests, remove Envoy 1.20 (#15093 )	2022-10-31 10:50:45 -05:00
Derek Menteer	693c8a4706	Allow peering endpoints to bypass verify_incoming.	2022-10-31 09:56:30 -05:00
Derek Menteer	2d4b62be3c	Add tests.	2022-10-31 08:45:00 -05:00
Derek Menteer	1483c94531	Fix peered service protocols using proxy-defaults.	2022-10-31 08:45:00 -05:00
Eric Haberkorn	cf50bdbe20	Fix peering metrics bug (#15178 ) This bug was caused by the peering health metric being set to NaN.	2022-10-28 10:51:12 -04:00
Chris S. Kim	0e176dd6aa	Allow consul debug on non-ACL consul servers (#15155 )	2022-10-27 09:25:18 -04:00
cskh	a9427e1310	fix(peering): nil pointer in calling handleUpdateService (#15160 ) * fix(peering): nil pointer in calling handleUpdateService * changelog	2022-10-26 11:50:34 -04:00
Eric Haberkorn	1bdad89026	fix bug that resulted in generating Envoy configs that use CDS with an EDS configuration (#15140 )	2022-10-25 14:49:57 -04:00
Luke Kysow	d3aa2bd9c5	ingress-gateways: don't log error when registering gateway (#15001 ) * ingress-gateways: don't log error when registering gateway Previously, when an ingress gateway was registered without a corresponding ingress gateway config entry, an error was logged because the watch on the config entry returned a nil result. This is expected so don't log an error.	2022-10-25 10:55:44 -07:00
Luke Kysow	9999672fd7	autoencrypt: helpful error for clients with wrong dc (#14832 ) * autoencrypt: helpful error for clients with wrong dc If clients have set a different datacenter than the servers they're connecting with for autoencrypt, give a helpful error message.	2022-10-25 10:13:41 -07:00
R.B. Boyer	3c44116a8f	cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956 ) This continues the work done in #14908 where a crude solution to prevent a goroutine leak was implemented. The former code would launch a perpetual goroutine family every iteration (+1 +1) and the fixed code simply caused a new goroutine family to first cancel the prior one to prevent the leak (-1 +1 == 0). This PR refactors this code completely to: - make it more understandable - remove the recursion-via-goroutine strangeness - prevent unnecessary RPC fetches when the prior one has errored. The core issue arose from a conflation of the entry.Fetching field to mean: 1. there is an RPC (blocking query) in flight right now 2. there is a goroutine running to manage the RPC fetch retry loop. The problem is that the goroutine-leak-avoidance check would treat Fetching like (2), but within the body of a goroutine it would flip that boolean back to false before the retry sleep. This would cause a new chain of goroutines to launch which #14908 would correct crudely. The refactored code uses a plain for-loop and changes the semantics to track state for "is there a goroutine associated with this cache entry" instead of the former. We use a uint64 unique identity per goroutine instead of a boolean so that any orphaned goroutines can tell when they've been replaced when the expiry loop deletes a cache entry while the goroutine is still running and is later replaced.	2022-10-25 10:27:26 -05:00
R.B. Boyer	da70daba43	test: ensure that all dependencies in a test agent use the test logger (#14996 )	2022-10-24 17:02:38 -05:00
Chris S. Kim	9f0ed81cfd	Remove invalid 1xx HTTP codes These tests started failing in go1.19, presumably due to support for valid 1xx responses being added. https://github.com/golang/go/issues/56346	2022-10-24 16:12:08 -04:00
Chris S. Kim	bde57c0dd0	Regenerate files according to 1.19.2 formatter	2022-10-24 16:12:08 -04:00
cskh	db82ffe503	fix(peering): replicating wan address (#15108 ) * fix(peering): replicating wan address * add changelog * unit test	2022-10-24 15:44:57 -04:00
Iryna Shustava	176abb5ff2	proxycfg: watch service-defaults config entries (#15025 ) To support Destinations on the service-defaults (for tproxy with terminating gateway), we need to now also make servers watch service-defaults config entries.	2022-10-24 12:50:28 -06:00
Chris S. Kim	b236e86030	Move oss-only test to its own file	2022-10-24 14:17:43 -04:00
R.B. Boyer	d04cf25fa8	test: fix flaky TestHealthServiceNodes_NodeMetaFilter by waiting until the streaming subsystem has a valid grpc connection (#15019 ) Also potentially unflakes TestHealthIngressServiceNodes for similar reasons.	2022-10-24 13:09:53 -05:00
R.B. Boyer	300860412c	chore: update golangci-lint to v1.50.1 (#15022 )	2022-10-24 11:48:02 -05:00
Venu Yanamandra	efc813e92d	Update error message when restoring ENT snapshot in OSS (#15066 )	2022-10-24 11:40:26 -04:00
freddygv	d65e60de86	Return forbidden on permission denied This commit updates the establish endpoint to bubble up a 403 status code to callers when the establishment secret from the token is invalid. This is a signal that a new peering token must be generated.	2022-10-20 17:11:49 -06:00
Chris S. Kim	a7ea26192b	Update expected encoding in test go-memdb was updated in v1.3.3 to make integers in indexes sortable, which changed how integers were encoded.	2022-10-20 14:32:42 -04:00
freddygv	6d9be5fb15	Use plain TaggedAddressWAN	2022-10-19 16:32:44 -06:00
freddygv	8d211cc9cc	Add unit test	2022-10-19 16:26:15 -06:00
cskh	058ee4fb84	fix: wan address isn't used by peering token	2022-10-19 16:33:25 -04:00
Nitya Dhanushkodi	5e156772f6	Remove ability to specify external addresses in GenerateToken endpoint (#14930 ) * Reverts "update generate token endpoint to take external addresses (#13844)" This reverts commit `f47319b7c6`.	2022-10-19 09:31:36 -07:00
Kyle Havlovitz	5c3427608b	Merge pull request #15035 from hashicorp/vault-ttl-update-warn Warn instead of returning error when missing intermediate mount tune permissions	2022-10-18 15:41:52 -07:00
cskh	d562d363fc	peering: skip registering duplicate node and check from the peer (#14994 ) * peering: skip register duplicate node and check from the peer * Prebuilt the nodes map and checks map to avoid repeated for loop * use key type to struct: node id, service id, and check id	2022-10-18 16:19:24 -04:00
Chris S. Kim	29a297d3e9	Refactor client RPC timeouts (#14965 ) Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.	2022-10-18 15:05:09 -04:00
Kyle Havlovitz	d122108992	Warn instead of returning an error when intermediate mount tune permission is missing	2022-10-18 12:01:25 -07:00
R.B. Boyer	0cca4c088d	test: possibly fix flake in TestIntentionGetExact (#15021 ) Restructure test setup to be similar to TestAgent_ServerCertificate and see if that's enough to avoid flaking after join.	2022-10-18 10:51:20 -05:00
R.B. Boyer	fe2d41ddad	cache: prevent goroutine leak in agent cache (#14908 ) There is a bug in the error handling code for the Agent cache subsystem discovered: 1. NotifyCallback calls notifyBlockingQuery which calls getWithIndex in a loop (which backs off on-error up to 1 minute) 2. getWithIndex calls fetch if there’s no valid entry in the cache 3. fetch starts a goroutine which calls Fetch on the cache-type, waits for a while (again with backoff up to 1 minute for errors) and then calls fetch to trigger a refresh The end result being that every 1 minute notifyBlockingQuery spawns an ancestry of goroutines that essentially lives forever. This PR ensures that the goroutine started by `fetch` cancels any prior goroutine spawned by the same line for the same key. In isolated testing where a cache type was tweaked to indefinitely error, this patch prevented goroutine counts from skyrocketing.	2022-10-17 14:38:10 -05:00
R.B. Boyer	02a858efa0	ca: fix a masked bug in leaf cert generation that would not be notified of root cert rotation after the first one (#15005 ) In practice this was masked by #14956 and was only uncovered fixing the other bug. go test ./agent -run TestAgentConnectCALeafCert_goodNotLocal would fail when only #14956 was fixed.	2022-10-17 13:24:27 -05:00
Chris S. Kim	3d2dffff16	Merge pull request #13388 from deblasis/feature/health-checks_windows_service Feature: Health checks windows service	2022-10-17 09:26:19 -04:00
Dan Upton	f8b4b41205	proxycfg: fix goroutine leak when service is re-registered (#14988 ) Fixes a bug where we'd leak a goroutine in state.run when the given context was canceled while there was a pending update.	2022-10-17 11:31:10 +01:00
Kyle Havlovitz	aaf892a383	Extend tcp keepalive settings to work for terminating gateways as well	2022-10-14 17:05:46 -07:00
Kyle Havlovitz	2c569f6b9c	Update docs and add tcp_keepalive_probes setting	2022-10-14 17:05:46 -07:00
Kyle Havlovitz	2242d1ec4a	Add TCP keepalive settings to proxy config for mesh gateways	2022-10-14 17:05:46 -07:00
Derek Menteer	2a33d0ff96	Fix issue with incorrect method signature on test.	2022-10-14 11:04:57 -05:00
Freddy	24d0c8801a	Merge pull request #14981 from hashicorp/peering/dial-through-gateways	2022-10-14 09:44:56 -06:00
Dan Upton	328e3ff563	proxycfg: rate-limit delivery of config snapshots (#14960 ) Adds a user-configurable rate limiter to proxycfg snapshot delivery, with a default limit of 250 updates per second. This addresses a problem observed in our load testing of Consul Dataplane where updating a "global" resource such as a wildcard intention or the proxy-defaults config entry could starve the Raft or Memberlist goroutines of CPU time, causing general cluster instability.	2022-10-14 15:52:00 +01:00
Derek Menteer	29ebcf5ff0	Add tests for peering state snapshots / restores.	2022-10-14 09:48:04 -05:00

5341 Commits (005e1b99265551d90d2f6ec1f9465c73acbccbc3)