Ensure nothing in the troubleshoot go module depends on consul's top level module. This is so we can import troubleshoot into consul-k8s and not import all of consul.
* turns troubleshoot into a go module [authored by @curtbushko]
* gets the envoy protos into the troubleshoot module [authored by @curtbushko]
* adds a new go module `envoyextensions` which has xdscommon and extensioncommon folders that both the xds package and the troubleshoot package can import
* adds testing and linting for the new go modules
* moves the unit tests in `troubleshoot/validateupstream` that depend on proxycfg/xds into the xds package, with a comment describing why those tests cannot be in the troubleshoot package
* fixes all the imports everywhere as a result of these changes
Co-authored-by: Curt Bushko <cbushko@gmail.com>
* Add Tproxy support to Envoy Extensions (this is needed for service to service validation)
* Add validation for Envoy configuration for an upstream service
* Use both /config_dump and /cluster to validate Envoy configuration
This is because of a bug in Envoy where the EndpointsConfigDump does not
include a cluster_name, making it impossible to match an endpoint to
verify it exists.
This removes endpoints support for builtin extensions since only the
validate plugin was using it, and it is no longer used. It also removes
test cases for endpoint validation. Endpoints validation now only occurs
in the top level test from config_dump and clusters json files.
Co-authored-by: Eric <eric@haberkorn.co>
Fix configuration merging for implicit tproxy upstreams.
Change the merging logic so that the wildcard upstream has correct proxy-defaults
and service-defaults values combined into it. It did not previously merge all fields,
and the wildcard upstream did not exist unless service-defaults existed (it ignored
proxy-defaults, essentially).
Change the way we fetch upstream configuration in the xDS layer so that it falls back
to the wildcard when no matching upstream is found. This is what allows implicit peer
upstreams to have the correct "merged" config.
Change proxycfg to always watch local mesh gateway endpoints whenever a peer upstream
is found. This simplifies the logic so that we do not have to inspect the "merged"
configuration on peer upstreams to extract the mesh gateway mode.
* Protobuf Modernization
Remove direct usage of golang/protobuf in favor of google.golang.org/protobuf
Marshallers (protobuf and json) needed some changes to account for different APIs.
Moved to using the google.golang.org/protobuf/types/known/* for the well known types including replacing some custom Struct manipulation with whats available in the structpb well known type package.
This also updates our devtools script to install protoc-gen-go from the right location so that files it generates conform to the correct interfaces.
* Fix go-mod-tidy make target to work on all modules
* Configure Envoy alpn_protocols based on service protocol
* define alpnProtocols in a more standard way
* http2 protocol should be h2 only
* formatting
* add test for getAlpnProtocol()
* create changelog entry
* change scope is connect-proxy
* ignore errors on ParseProxyConfig; fixes linter
* add tests for grpc and http2 public listeners
* remove newlines from PR
* Add alpn_protocol configuration for ingress gateway
* Guard against nil tlsContext
* add ingress gateway w/ TLS tests for gRPC and HTTP2
* getAlpnProtocols: add TCP protocol test
* add tests for ingress gateway with grpc/http2 and per-listener TLS config
* add tests for ingress gateway with grpc/http2 and per-listener TLS config
* add Gateway level TLS config with mixed protocol listeners to validate ALPN
* update changelog to include ingress-gateway
* add http/1.1 to http2 ALPN
* go fmt
* fix test on custom-trace-listener
This commit adds the xDS resources needed for INBOUND traffic from peer
clusters:
- 1 filter chain for all inbound peering requests.
- 1 cluster for all inbound peering requests.
- 1 endpoint per voting server with the gRPC TLS port configured.
There is one filter chain and cluster because unlike with WAN
federation, peer clusters will not attempt to dial individual servers.
Peer clusters will only dial the local mesh gateway addresses.
Routing peering control plane traffic through mesh gateways can be
enabled or disabled at runtime with the mesh config entry.
This commit updates proxycfg to add or cancel watches for local servers
depending on this central config.
Note that WAN federation over mesh gateways is determined by a service
metadata flag, and any updates to the gateway service registration will
force the creation of a new snapshot. If enabled, WAN-fed over mesh
gateways will trigger a local server watch on initialize().
Because of this we will only add/remove server watches if WAN federation
over mesh gateways is disabled.
* add golden files
* add support to http in tgateway egress destination
* fix slice sorting to include both address and port when using server_names
* fix listener loop for http destination
* fix routes to generate a route per port and a virtualhost per port-address combination
* sort virtual hosts list to have a stable order
* extract redundant serviceNode
Now that peered upstreams can generate envoy resources (#13758), we need a way to disambiguate local from peered resources in our metrics. The key difference is that datacenter and partition will be replaced with peer, since in the context of peered resources partition is ambiguous (could refer to the partition in a remote cluster or one that exists locally). The partition and datacenter of the proxy will always be that of the source service.
Regexes were updated to make emitting datacenter and partition labels mutually exclusive with peer labels.
Listener filter names were updated to better match the existing regex.
Cluster names assigned to peered upstreams were updated to be synthesized from local peer name (it previously used the externally provided primary SNI, which contained the peer name from the other side of the peering). Integration tests were updated to assert for the new peer labels.
Peered upstreams has a separate loop in xds from discovery chain upstreams. This PR adds similar but slightly modified code to add filters for peered upstream listeners, clusters, and endpoints in the case of transparent proxy.
When the protocol is http-like, and an intention has a peered source
then the normal RBAC mTLS SAN field check is replaces with a joint combo
of:
mTLS SAN field must be the service's local mesh gateway leaf cert
AND
the first XFCC header (from the MGW) must have a URI field that matches the original intention source
Also:
- Update the regex program limit to be much higher than the teeny
defaults, since the RBAC regex constructions are more complicated now.
- Fix a few stray panics in xds generation.
This is only configured in xDS when a service with an L7 protocol is
exported.
They also load any relevant trust bundles for the peered services to
eventually use for L7 SPIFFE validation during mTLS termination.
When converting from Consul intentions to xds RBAC rules, services imported from other peers must encode additional data like partition (from the remote cluster) and trust domain.
This PR updates the PeeringTrustBundle to hold the sending side's local partition as ExportedPartition. It also updates RBAC code to encode SpiffeIDs of imported services with the ExportedPartition and TrustDomain.
Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied.
Note: this does not change the callers to use these mesh gateways.
This is the OSS portion of enterprise PR 1994
Rather than directly interrogating the agent-local state for HTTP
checks using the `HTTPCheckFetcher` interface, we now rely on the
config snapshot containing the checks.
This reduces the number of changes required to support server xDS
sessions.
It's not clear why the fetching approach was introduced in
931d167ebb.
Envoy's SPIFFE certificate validation extension allows for us to
validate against different root certificates depending on the trust
domain of the dialing proxy.
If there are any trust bundles from peers in the config snapshot then we
use the SPIFFE validator as the validation context, rather than the
usual TrustedCA.
The injected validation config includes the local root certificates as
well.
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.
Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.
This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.
Description
Add x-fowarded-client-cert information on trusted incoming connections.
Envoy provides support forwarding and annotating the
x-forwarded-client-cert header via the forward_client_cert_details
set_current_client_cert_details filter fields. It would be helpful for
consul to support this directly in its config. The escape hatches are
a bit cumbersome for this purpose.
This has been implemented on incoming connections to envoy. Outgoing
(from the local service through the sidecar) will not have a
certificate, and so are left alone.
A service on an incoming connection will now get headers something like this:
```
X-Forwarded-Client-Cert:[By=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/counting;Hash=61ad5cbdfcb50f5a3ec0ca60923d61613c149a9d4495010a64175c05a0268ab2;Cert="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Chain="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Subject="";URI=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/dashboard]
```
Closes#12852
- `tls.incoming`: applies to the inbound mTLS targeting the public
listener on `connect-proxy` and `terminating-gateway` envoy instances
- `tls.outgoing`: applies to the outbound mTLS dialing upstreams from
`connect-proxy` and `ingress-gateway` envoy instances
Fixes#11966
Due to timing, a transparent proxy could have two upstreams to dial
directly with the same address.
For example:
- The orders service can dial upstreams shipping and payment directly.
- An instance of shipping at address 10.0.0.1 is deregistered.
- Payments is scaled up and scheduled to have address 10.0.0.1.
- The orders service receives the event for the new payments instance
before seeing the deregistration for the shipping instance. At this
point two upstreams have the same passthrough address and Envoy will
reject the listener configuration.
To disambiguate this commit considers the Raft index when storing
passthrough addresses. In the example above, 10.0.0.1 would only be
associated with the newer payments service instance.
Transparent proxies can set up filter chains that allow direct
connections to upstream service instances. Services that can be dialed
directly are stored in the PassthroughUpstreams map of the proxycfg
snapshot.
Previously these addresses were not being cleaned up based on new
service health data. The list of addresses associated with an upstream
service would only ever grow.
As services scale up and down, eventually they will have instances
assigned to an IP that was previously assigned to a different service.
When IP addresses are duplicated across filter chain match rules the
listener config will be rejected by Envoy.
This commit updates the proxycfg snapshot management so that passthrough
addresses can get cleaned up when no longer associated with a given
upstream.
There is still the possibility of a race condition here where due to
timing an address is shared between multiple passthrough upstreams.
That concern is mitigated by #12195, but will be further addressed
in a follow-up.
The gist here is that now we use a value-type struct proxycfg.UpstreamID
as the map key in ConfigSnapshot maps where we used to use "upstream
id-ish" strings. These are internal only and used just for bidirectional
trips through the agent cache keyspace (like the discovery chain target
struct).
For the few places where the upstream id needs to be projected into xDS,
that's what (proxycfg.UpstreamID).EnvoyID() is for. This lets us ALWAYS
inject the partition and namespace into these things without making
stuff like the golden testdata diverge.