Commit Graph

211 Commits (5a29f9b4f7d63188fb44867b552724f3535e520d)

Author SHA1 Message Date
Kyle Havlovitz 2904d0a431
Pull virtual IPs for filter chains from discovery chains (#17375) 2023-05-17 11:18:39 -07:00
Connor 0789661ce5
Rename hcp-metrics-collector to consul-telemetry-collector (#17327)
* Rename hcp-metrics-collector to consul-telemetry-collector

* Fix docs

* Fix doc comment

---------

Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>
2023-05-16 14:36:05 -04:00
Freddy 7c3e9cd862
Hash namespace+proxy ID when creating socket path (#17204)
UNIX domain socket paths are limited to 104-108 characters, depending on
the OS. This limit was quite easy to exceed when testing the feature on
Kubernetes, due to how proxy IDs encode the Pod ID eg:
metrics-collector-59467bcb9b-fkkzl-hcp-metrics-collector-sidecar-proxy

To ensure we stay under that character limit this commit makes a
couple changes:
- Use a b64 encoded SHA1 hash of the namespace + proxy ID to create a
  short and deterministic socket file name.
- Add validation to proxy registrations and proxy-defaults to enforce a
  limit on the socket directory length.
2023-05-09 12:20:26 -06:00
Semir Patel 5eaeb7b8e5
Support Envoy's MaxEjectionPercent and BaseEjectionTime config entries for passive health checks (#15979)
* Add MaxEjectionPercent to config entry

* Add BaseEjectionTime to config entry

* Add MaxEjectionPercent and BaseEjectionTime to protobufs

* Add MaxEjectionPercent and BaseEjectionTime to api

* Fix integration test breakage

* Verify MaxEjectionPercent and BaseEjectionTime in integration test upstream confings

* Website docs for MaxEjectionPercent and BaseEjection time

* Add `make docs` to browse docs at http://localhost:3000

* Changelog entry

* so that is the difference between consul-docker and dev-docker

* blah

* update proto funcs

* update proto

---------

Co-authored-by: Maliz <maliheh.monshizadeh@hashicorp.com>
2023-04-26 15:59:48 -07:00
Paul Glass 77ecff3209
Permissive mTLS (#17035)
This implements permissive mTLS , which allows toggling services into "permissive" mTLS mode.
Permissive mTLS mode allows incoming "non Consul-mTLS" traffic to be forward unmodified to the application.

* Update service-defaults and proxy-defaults config entries with a MutualTLSMode field
* Update the mesh config entry with an AllowEnablingPermissiveMutualTLS field and implement the necessary validation. AllowEnablingPermissiveMutualTLS must be true to allow changing to MutualTLSMode=permissive, but this does not require that all proxy-defaults and service-defaults are currently in strict mode.
* Update xDS listener config to add a "permissive filter chain" when MutualTLSMode=permissive for a particular service. The permissive filter chain matches incoming traffic by the destination port. If the destination port matches the service port from the catalog, then no mTLS is required and the traffic sent is forwarded unmodified to the application.
2023-04-19 14:45:00 -05:00
Chris Thain 175bb1a303
Wasm Envoy HTTP extension (#16877) 2023-04-06 14:12:07 -07:00
Derek Menteer 2236975011
Change partition for peers in discovery chain targets (#16769)
This commit swaps the partition field to the local partition for
discovery chains targeting peers. Prior to this change, peer upstreams
would always use a value of default regardless of which partition they
exist in. This caused several issues in xds / proxycfg because of id
mismatches.

Some prior fixes were made to deal with one-off id mismatches that this
PR also cleans up, since they are no longer needed.
2023-03-24 15:40:19 -05:00
Eric Haberkorn 495ad4c7ef
add enterprise xds tests (#16738) 2023-03-22 14:56:18 -04:00
Andrew Stucki 501b87fd31
[API Gateway] Fix invalid cluster causing gateway programming delay (#16661)
* Add test for http routes

* Add fix

* Fix tests

* Add changelog entry

* Refactor and fix flaky tests
2023-03-17 13:31:04 -04:00
Eric Haberkorn 57e034b746
fix confusing spiffe ids in golden tests (#16643) 2023-03-15 14:30:36 -04:00
Ashvitha f95ffe0355
Allow HCP metrics collection for Envoy proxies
Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>

Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory
where a unix socket will be created with the name
`<namespace>_<proxy_id>.sock` to forward Envoy metrics.

If set, this will configure:
- In bootstrap configuration a local stats_sink and static cluster.
  These will forward metrics to a loopback listener sent over xDS.

- A dynamic listener listening at the socket path that the previously
  defined static cluster is sending metrics to.

- A dynamic cluster that will forward traffic received at this listener
  to the hcp-metrics-collector service.


Reasons for having a static cluster pointing at a dynamic listener:
- We want to secure the metrics stream using TLS, but the stats sink can
  only be defined in bootstrap config. With dynamic listeners/clusters
  we can use the proxy's leaf certificate issued by the Connect CA,
  which isn't available at bootstrap time.

- We want to intelligently route to the HCP collector. Configuring its
  addreess at bootstrap time limits our flexibility routing-wise. More
  on this below.

Reasons for defining the collector as an upstream in `proxycfg`:
- The HCP collector will be deployed as a mesh service.

- Certificate management is taken care of, as mentioned above.

- Service discovery and routing logic is automatically taken care of,
  meaning that no code changes are required in the xds package.

- Custom routing rules can be added for the collector using discovery
  chain config entries. Initially the collector is expected to be
  deployed to each admin partition, but in the future could be deployed
  centrally in the default partition. These config entries could even be
  managed by HCP itself.
2023-03-10 13:52:54 -07:00
Andrew Stucki 4b661d1e0c
Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495)
* Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable

* Regenerate golden files

* Add RequestTimeout field

* Add changelog entry
2023-03-03 09:37:12 -05:00
Eric Haberkorn 595131fca9
Refactor the disco chain -> xds logic (#16392) 2023-02-23 11:32:32 -05:00
Andrew Stucki b3ddd4d24e
Inline API Gateway TLS cert code (#16295)
* Include secret type when building resources from config snapshot

* First pass at generating envoy secrets from api-gateway snapshot

* Update comments for xDS update order

* Add secret type + corresponding golden files to existing tests

* Initialize test helpers for testing api-gateway resource generation

* Generate golden files for new api-gateway xDS resource test

* Support ADS for TLS certificates on api-gateway

* Configure TLS on api-gateway listeners

* Inline TLS cert code

* update tests

* Add SNI support so we can have multiple certificates

* Remove commented out section from helper

* regen deep-copy

* Add tcp tls test

---------

Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
2023-02-17 12:46:03 -05:00
Thomas Eckert 2460ac99c9
API Gateway Envoy Golden Listener Tests (#16221)
* Simple API Gateway e2e test for tcp routes

* Drop DNSSans since we don't front the Gateway with a leaf cert

* WIP listener tests for api-gateway

* Return early if no routes

* Add back in leaf cert to testing

* Fix merge conflicts

* Re-add kind to setup

* Fix iteration over listener upstreams

* New tcp listener test

* Add tests for API Gateway with TCP and HTTP routes

* Move zero-route check back

* Drop generateIngressDNSSANs

* Check for chains not routes

---------

Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>
2023-02-16 14:42:36 -05:00
cskh e91bc9c058
feat: envoy extension - http local rate limit (#16196)
- http local rate limit
- Apply rate limit only to local_app
- unit test and integ test
2023-02-07 21:56:15 -05:00
Nitya Dhanushkodi 8d4c3aa42c
refactor: move service to service validation to troubleshoot package (#16132)
This is to reduce the dependency on xds from within the troubleshoot package.
2023-02-02 22:18:10 -08:00
Derek Menteer 06338c8ee7
Add unit test and update golden files. (#16115) 2023-02-01 09:51:08 -06:00
Nitya Dhanushkodi 8728a4496c
troubleshoot: service to service validation (#16096)
* Add Tproxy support to Envoy Extensions (this is needed for service to service validation)

* Add validation for Envoy configuration for an upstream service

* Use both /config_dump and /cluster to validate Envoy configuration
This is because of a bug in Envoy where the EndpointsConfigDump does not
include a cluster_name, making it impossible to match an endpoint to
verify it exists.

This removes endpoints support for builtin extensions since only the
validate plugin was using it, and it is no longer used. It also removes
test cases for endpoint validation. Endpoints validation now only occurs
in the top level test from config_dump and clusters json files.

Co-authored-by: Eric <eric@haberkorn.co>
2023-01-27 11:43:16 -08:00
Michael Wilkerson a1498b015d
Mw/lambda envoy extension parse region (#4107) (#16069)
* updated builtin extension to parse region directly from ARN
- added a unit test
- added some comments/light refactoring

* updated golden files with proper ARNs
- ARNs need to be right format now that they are being processed

* updated tests and integration tests
- removed 'region' from all EnvoyExtension arguments
- added properly formatted ARN which includes the same region found in the removed "Region" field: 'us-east-1'
2023-01-26 15:44:52 -08:00
Eric Haberkorn 8d923c1789
Add the Lua Envoy extension (#15906) 2023-01-06 12:13:40 -05:00
Dan Stough b3bd3a6586
[OSS] feat: access logs for listeners and listener filters (#15864)
* feat: access logs for listeners and listener filters

* changelog

* fix integration test
2022-12-22 15:18:15 -05:00
Nitya Dhanushkodi c7ef04c597
[OSS] extensions: refactor PluginConfiguration into a more generic type ExtensionConfiguration (#15846)
* extensions: refactor PluginConfiguration into a more generic type
ExtensionConfiguration

Also:
* adds endpoints configuration to lambda golden tests
* uses string constant for builtin/aws/lambda
Co-authored-by: Eric <eric@haberkorn.co>
2022-12-20 22:26:20 -08:00
cskh 04bf24c8c1
feat(ingress-gateway): support outlier detection of upstream service for ingress gateway (#15614)
* feat(ingress-gateway): support outlier detection of upstream service for ingress gateway

* changelog

Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
2022-12-13 11:51:37 -05:00
James Oulman 7e78fb7818
Add support for configuring Envoys route idle_timeout (#14340)
* Add idleTimeout

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-11-29 17:43:15 -05:00
Derek Menteer 418bd62c44
Fix mesh gateway configuration with proxy-defaults (#15186)
* Fix mesh gateway proxy-defaults not affecting upstreams.

* Clarify distinction with upstream settings

Top-level mesh gateway mode in proxy-defaults and service-defaults gets
merged into NodeService.Proxy.MeshGateway, and only gets merged with
the mode attached to an an upstream in proxycfg/xds.

* Fix mgw mode usage for peered upstreams

There were a couple issues with how mgw mode was being handled for
peered upstreams.

For starters, mesh gateway mode from proxy-defaults
and the top-level of service-defaults gets stored in
NodeService.Proxy.MeshGateway, but the upstream watch for peered data
was only considering the mesh gateway config attached in
NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway
mode via global proxy-defaults or service-defaults on the downstream
would not have an effect.

Separately, transparent proxy watches for peered upstreams didn't
consider mesh gateway mode at all.

This commit addresses the first issue by ensuring that we overlay the
upstream config for peered upstreams as we do for non-peered. The second
issue is addressed by re-using setupWatchesForPeeredUpstream when
handling transparent proxy updates.

Note that for transparent proxies we do not yet support mesh gateway
mode per upstream, so the NodeService.Proxy.MeshGateway mode is used.

* Fix upstream mesh gateway mode handling in xds

This commit ensures that when determining the mesh gateway mode for
peered upstreams we consider the NodeService.Proxy.MeshGateway config as
a baseline.

In absense of this change, setting a mesh gateway mode via
proxy-defaults or the top-level of service-defaults will not have an
effect for peered upstreams.

* Merge service/proxy defaults in cfg resolver

Previously the mesh gateway mode for connect proxies would be
merged at three points:

1. On servers, in ComputeResolvedServiceConfig.
2. On clients, in MergeServiceConfig.
3. On clients, in proxycfg/xds.

The first merge returns a ServiceConfigResponse where there is a
top-level MeshGateway config from proxy/service-defaults, along with
per-upstream config.

The second merge combines per-upstream config specified at the service
instance with per-upstream config specified centrally.

The third merge combines the NodeService.Proxy.MeshGateway
config containing proxy/service-defaults data with the per-upstream
mode. This third merge is easy to miss, which led to peered upstreams
not considering the mesh gateway mode from proxy-defaults.

This commit removes the third merge, and ensures that all mesh gateway
config is available at the upstream. This way proxycfg/xds do not need
to do additional overlays.

* Ensure that proxy-defaults is considered in wc

Upstream defaults become a synthetic Upstream definition under a
wildcard key "*". Now that proxycfg/xds expect Upstream definitions to
have the final MeshGateway values, this commit ensures that values from
proxy-defaults/service-defaults are the default for this synthetic
upstream.

* Add changelog.

Co-authored-by: freddygv <freddy@hashicorp.com>
2022-11-09 10:14:29 -06:00
Derek Menteer f4cb2f82bf
Backport various fixes from ENT. (#15254)
* Regenerate golden files.

* Backport from ENT: "Avoid race"

Original commit: 5006c8c858b0e332be95271ef9ba35122453315b
Original author: freddygv

* Backport from ENT: "chore: fix flake peerstream test"

Original commit: b74097e7135eca48cc289798c5739f9ef72c0cc8
Original author: DanStough
2022-11-03 16:34:57 -05:00
Eric Haberkorn 1bdad89026
fix bug that resulted in generating Envoy configs that use CDS with an EDS configuration (#15140) 2022-10-25 14:49:57 -04:00
Luke Kysow d3aa2bd9c5
ingress-gateways: don't log error when registering gateway (#15001)
* ingress-gateways: don't log error when registering gateway

Previously, when an ingress gateway was registered without a
corresponding ingress gateway config entry, an error was logged
because the watch on the config entry returned a nil result.
This is expected so don't log an error.
2022-10-25 10:55:44 -07:00
Kyle Havlovitz aaf892a383 Extend tcp keepalive settings to work for terminating gateways as well 2022-10-14 17:05:46 -07:00
Kyle Havlovitz 2c569f6b9c Update docs and add tcp_keepalive_probes setting 2022-10-14 17:05:46 -07:00
Kyle Havlovitz 2242d1ec4a Add TCP keepalive settings to proxy config for mesh gateways 2022-10-14 17:05:46 -07:00
James Oulman b8bd7a3058
Configure Envoy alpn_protocols based on service protocol (#14356)
* Configure Envoy alpn_protocols based on service protocol

* define alpnProtocols in a more standard way

* http2 protocol should be h2 only

* formatting

* add test for getAlpnProtocol()

* create changelog entry

* change scope is connect-proxy

* ignore errors on ParseProxyConfig; fixes linter

* add tests for grpc and http2 public listeners

* remove newlines from PR

* Add alpn_protocol configuration for ingress gateway

* Guard against nil tlsContext

* add ingress gateway w/ TLS tests for gRPC and HTTP2

* getAlpnProtocols: add TCP protocol test

* add tests for ingress gateway with grpc/http2 and per-listener TLS config

* add tests for ingress gateway with grpc/http2 and per-listener TLS config

* add Gateway level TLS config with mixed protocol listeners to validate ALPN

* update changelog to include ingress-gateway

* add http/1.1 to http2 ALPN

* go fmt

* fix test on custom-trace-listener
2022-10-10 13:13:56 -07:00
DanStough 77ab28c5c7 feat: xDS updates for peerings control plane through mesh gw 2022-10-07 08:46:42 -06:00
Eric Haberkorn 1633cf20ea
Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic (#14817)
Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic
2022-10-06 09:54:14 -04:00
Derek Menteer a279d2d329
Fix explicit tproxy listeners with discovery chains. (#14751)
Fix explicit tproxy listeners with discovery chains.
2022-10-05 14:38:25 -05:00
Alex Oskotsky 13da2c5fad
Add the ability to retry on reset connection to service-routers (#12890) 2022-10-05 13:06:44 -04:00
Freddy d9fe3578ac
Merge pull request #14734 from hashicorp/NET-643-update-mesh-gateway-envoy-config-for-inbound-peering-control-plane-traffic 2022-10-03 12:54:11 -06:00
freddygv b15d41534f Update xds generation for peering over mesh gws
This commit adds the xDS resources needed for INBOUND traffic from peer
clusters:

- 1 filter chain for all inbound peering requests.
- 1 cluster for all inbound peering requests.
- 1 endpoint per voting server with the gRPC TLS port configured.

There is one filter chain and cluster because unlike with WAN
federation, peer clusters will not attempt to dial individual servers.
Peer clusters will only dial the local mesh gateway addresses.
2022-10-03 12:42:27 -06:00
cskh 69f40df548
feat(ingress gateway: support configuring limits in ingress-gateway c… (#14749)
* feat(ingress gateway: support configuring limits in ingress-gateway config entry

- a new Defaults field with max_connections, max_pending_connections, max_requests
  is added to ingress gateway config entry
- new field max_connections, max_pending_connections, max_requests in
  individual services to overwrite the value in Default
- added unit test and integration test
- updated doc

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
2022-09-28 14:56:46 -04:00
Eric Haberkorn 6570d5f004
Enable outbound peered requests to go through local mesh gateway (#14763) 2022-09-27 09:49:28 -04:00
Derek Menteer aa4709ab74
Add envoy connection balancing. (#14616)
Add envoy connection balancing config.
2022-09-26 11:29:06 -05:00
Eric Haberkorn aa8268e50c
Implement Cluster Peering Redirects (#14445)
implement cluster peering redirects
2022-09-09 13:58:28 -04:00
malizz b3ac8f48ca
Add additional parameters to envoy passive health check config (#14238)
* draft commit

* add changelog, update test

* remove extra param

* fix test

* update type to account for nil value

* add test for custom passive health check

* update comments and tests

* update description in docs

* fix missing commas
2022-09-01 09:59:11 -07:00
Chris S. Kim e62f830fa8
Merge pull request #13998 from jorgemarey/f-new-tracing-envoy
Add new envoy tracing configuration
2022-09-01 08:57:23 -04:00
Eric Haberkorn 3726a0ab7a
Finish up cluster peering failover (#14396) 2022-08-30 11:46:34 -04:00
Jorge Marey 3f3bb8831e Fix typos. Add test. Add documentation 2022-08-30 16:59:02 +02:00
Eric Haberkorn 72f90754ae
Update max_ejection_percent on outlier detection for peered clusters to 100% (#14373)
We can't trust health checks on peered services when service resolvers,
splitters and routers are used.
2022-08-29 13:46:41 -04:00
cskh 41aea65214
Fix: the inboundconnection limit filter should be placed in front of http co… (#14325)
* fix: the inboundconnection limit should be placed in front of http connection manager

Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-08-24 14:13:10 -04:00
Eric Haberkorn ebd5513d4b
Refactor failover code to use Envoy's aggregate clusters (#14178) 2022-08-12 14:30:46 -04:00