* peerstream: dialer should reconnect when stream closes
If the stream is closed unexpectedly (i.e. when we haven't received
a terminated message), the dialer should attempt to re-establish the
stream.
Previously, `HandleStream` returned `nil` when the stream was closed.
The caller then assumed the stream had been terminated on purpose and
did not reconnect, even when the stream had stopped unexpectedly and
the dialer should have attempted to reconnect.
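The reconnect behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual peerstream code: the `message` type, `errStreamClosed`, `handleStream`, `runDialer`, and `scripted` are all hypothetical names, and a real dialer would back off between attempts.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// message is a hypothetical stand-in for a peer stream message.
type message struct {
	Terminated bool
}

// errStreamClosed reports that the stream ended without a terminated message.
var errStreamClosed = errors.New("stream closed unexpectedly")

// handleStream drains recv until the stream ends. It returns nil only when
// the remote side sent a terminated message; a bare close is an error.
func handleStream(recv func() (message, error)) error {
	for {
		msg, err := recv()
		if errors.Is(err, io.EOF) {
			return errStreamClosed
		}
		if err != nil {
			return err
		}
		if msg.Terminated {
			return nil
		}
	}
}

// runDialer opens streams via open until one terminates cleanly or
// maxAttempts is exhausted. It returns the number of streams opened.
func runDialer(open func() func() (message, error), maxAttempts int) (int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		lastErr = handleStream(open())
		if lastErr == nil {
			return attempt, nil
		}
		// A real dialer would wait with backoff here before redialing.
	}
	return maxAttempts, lastErr
}

// scripted returns a recv function that replays msgs and then reports io.EOF.
func scripted(msgs ...message) func() (message, error) {
	i := 0
	return func() (message, error) {
		if i >= len(msgs) {
			return message{}, io.EOF
		}
		m := msgs[i]
		i++
		return m, nil
	}
}

func main() {
	streams := [][]message{
		{{}, {}},                 // closes without a terminated message
		{{}, {Terminated: true}}, // terminates deliberately
	}
	next := 0
	open := func() func() (message, error) {
		s := scripted(streams[next]...)
		next++
		return s
	}
	attempts, err := runDialer(open, 3)
	fmt.Println(attempts, err)
}
```

The key design point is that a closed stream and a terminated stream produce different return values, so the caller no longer has to guess whether to redial.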
Ensure that the peer stream replication rpc can successfully be used with TLS activated.
Also:
- If key material is configured for the gRPC port but HTTPS is not
enabled, TLS will now still be activated for the gRPC port.
- peerstream replication stream opened by the establishing-side will now
ignore grpc.WithBlock so that TLS errors will bubble up instead of
being awkwardly delayed or suppressed
* Made changes based on Adam's suggestions
* updating list layout in systems integration guide. updating wan federation docs.
* fixing env vars on systems integration page
* fixing h3 to h2 on enterprise license page
* Changed `The following steps will be performed` to `Complete the following steps`
* Replaced `These steps will be repeated for each datacenter` with `Repeat the following steps for each datacenter in the cluster`
* Emphasizing that kv2 secrets only need to be stored once.
* Move the sentence indicating where the vault path maps to the helm chart out of the -> Note callout
* remaining suggestions
* Removing store the secret in Vault from server-tls page
* Making the Bootstrapping the Server PKI Engine sections the same on server-tls and webhook-cert pages
* Apply suggestions from code review
Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
* Updating VAULT_ADDR on systems-integration to get it out of the shell.
* Updating intro paragraph of Overview on systems-integration.mdx to what Adam suggested.
* Putting the GKE, AKS, EKS info into tabs on the systems integration page.
* Apply suggestions from code review
Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
A Node ID is not a required field in Consul’s data model, so we cannot reliably expect every node to have one. However, the node name is required and must be unique, so it is an equally good key for the internal healthSnapshot node tracking.
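To illustrate why the node name is the safer key, here is a minimal sketch. The `checkServiceNode` type and `snapshotByNodeName` function are hypothetical and far simpler than the real healthSnapshot; they only show that records with an empty ID would collide under an ID key but stay distinct under a name key.

```go
package main

import "fmt"

// checkServiceNode is a hypothetical, trimmed-down health record.
type checkServiceNode struct {
	NodeID   string // optional in the data model, may be empty
	NodeName string // required and unique
	Service  string
}

// snapshotByNodeName groups service names under the name of the node that
// hosts them. Keying by NodeID instead would merge every node whose ID is
// empty into a single entry.
func snapshotByNodeName(records []checkServiceNode) map[string][]string {
	out := make(map[string][]string)
	for _, r := range records {
		out[r.NodeName] = append(out[r.NodeName], r.Service)
	}
	return out
}

func main() {
	records := []checkServiceNode{
		{NodeName: "node-a", Service: "web"},
		{NodeName: "node-b", Service: "db"},
		{NodeName: "node-a", Service: "api"},
	}
	snap := snapshotByNodeName(records)
	fmt.Println(len(snap), snap["node-a"], snap["node-b"])
}
```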
Prior to this, the dialing side of the peering would only ever work within the default partition. This commit properly parses the partition field out of the API struct request body, the query param, and the header.
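A rough sketch of resolving the partition from those three sources might look like this. The precedence order (body, then query param, then header), the `partition` query param name, the `X-Consul-Partition` header name, and the `resolvePartition` helper are assumptions made for illustration; the commit message does not spell them out.

```go
package main

import (
	"fmt"
	"net/http"
)

// partitionHeader is an assumed header name used only for this sketch.
const partitionHeader = "X-Consul-Partition"

// resolvePartition picks the first non-empty value among the request body
// field, the query param, and the header, falling back to "default".
// The precedence order is an assumption, not taken from the commit.
func resolvePartition(body, query, header string) string {
	for _, candidate := range []string{body, query, header} {
		if candidate != "" {
			return candidate
		}
	}
	return "default"
}

func main() {
	// The URL is a placeholder used only to carry a query parameter.
	req, err := http.NewRequest(http.MethodPost, "http://example.invalid/?partition=team-a", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set(partitionHeader, "team-b")

	got := resolvePartition("", req.URL.Query().Get("partition"), req.Header.Get(partitionHeader))
	fmt.Println(got)
}
```

With no source set at all, the helper falls back to the default partition, which matches the only behaviour that worked before this change.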
* Don't request nodes/services `with-peers` anymore
This will be automatic - no need for the query-param anymore.
* Return peering data based on feature flag mock-api services/nodes
* Update tests to reflect removed with-peers query-param
* setup cookie for turning peer feature flag on in mock-api in testing
* Add missing `S` for renamed PEERING feature-flag cookie
This is the OSS portion of enterprise PR 2265.
This PR provides a server-local implementation of the
proxycfg.FederationStateListMeshGateways interface based on blocking queries.
This is the OSS portion of enterprise PR 2259.
This PR provides a server-local implementation of the proxycfg.GatewayServices
interface based on blocking queries.
This is the OSS portion of enterprise PR 2250.
This PR provides server-local implementations of the proxycfg.TrustBundle and
proxycfg.TrustBundleList interfaces, based on local blocking queries.
This is the OSS portion of enterprise PR 2249.
This PR introduces an implementation of the proxycfg.Health interface based on a
local materialized view of the health events.
It reuses the view and request machinery from agent/rpcclient/health, which made
it super straightforward.
This is the OSS portion of enterprise PR 2242.
This PR introduces a server-local implementation of the proxycfg.ServiceList
interface, backed by streaming events and a local materializer.
We cannot do this for "subscribe" and "partition" this easily without
breakage so those are omitted.
Any protobuf message passed around via an Any construct will have the
fully qualified package name embedded in the protobuf as a string. Also
RPC method dispatch will include the package of the service during
serialization.
- We will be passing pbservice and pbpeering through an Any as part of
peer stream replication.
- We will be exposing two new gRPC services via pbpeering and
pbpeerstream.
Previously, public referred to gRPC services that are both exposed on
the dedicated gRPC port and have their definitions in the proto-public
directory (so were considered usable by 3rd parties), whereas private
referred to services on the multiplexed server port that are only usable
by agents and other servers.
Now, we're splitting these definitions, such that external/internal
refers to the port and public/private refers to whether they can be used
by 3rd parties.
This is necessary because the peering replication API needs to be
exposed on the dedicated port, but is not (yet) suitable for use by 3rd
parties.
- Use some protobuf construction helper methods for brevity.
- Rename a local variable to avoid later shadowing.
- Rename the Nonce field to be more like xDS's naming.
- Be more explicit about which PeerID fields are empty.
If someone were to switch a peer-exported service from L4 to L7 there
would be a brief SAN validation hiccup as traffic shifted to the mesh
gateway for termination.
This PR sends the mesh gateway SpiffeID down all the time so the clients
always expect a switch.
For L4/tcp exported services the mesh gateways will not be terminating
TLS. A caller in one peer will be directly establishing TLS connections
to the ultimate exported service in the other peer.
The caller will be doing SAN validation using the replicated SpiffeID
values shipped from the exporting side. There is a class of discovery
chain edits on the exporting side that would introduce a new SpiffeID
value. Between the config entry update on the exporting side and the
importing side receiving updated peer stream data, requests to the
exported service would fail due to SAN validation errors.
This is unacceptable so instead prohibit the exporting peer from making
changes that would break peering in this way.
Because peerings are pairwise, between two (datacenter, partition)
tuples, any exported reference via a discovery chain that crosses out
of the peered datacenter or partition ultimately cannot work, for
various reasons. The biggest one is that there is no way in the
ultimate destination to configure an intention that allows an external
SpiffeID to access a service.
This PR ensures that a user simply cannot do this, so they won't run
into weird situations like this.
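The kind of check described above could be sketched as follows. The `target` type and `validateExportedTargets` function are hypothetical; the real validation operates on compiled discovery chains and config entries rather than a flat list.

```go
package main

import "fmt"

// target is a hypothetical, flattened discovery chain target.
type target struct {
	Service    string
	Datacenter string
	Partition  string
}

// validateExportedTargets rejects any target that leaves the local
// (datacenter, partition) tuple the peering was established from.
func validateExportedTargets(localDC, localPartition string, targets []target) error {
	for _, t := range targets {
		if t.Datacenter != localDC || t.Partition != localPartition {
			return fmt.Errorf(
				"exported service %q resolves to %s/%s, outside the peered %s/%s",
				t.Service, t.Datacenter, t.Partition, localDC, localPartition,
			)
		}
	}
	return nil
}

func main() {
	targets := []target{
		{Service: "web", Datacenter: "dc1", Partition: "default"},
		{Service: "web-failover", Datacenter: "dc2", Partition: "default"},
	}
	if err := validateExportedTargets("dc1", "default", targets); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Rejecting the config entry at write time surfaces the problem to the operator immediately instead of leaving an export that can never be reached.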
* Request peering permissions when peering is active
* Update peering ability to use peering resource
* fix canDelete peer permission to check write permission
* use super call in abilities.peer#canDelete
* ui: use environment variable for feature flagging peers
* Add documentation for `features`-service
* Allow setting feature flag for peers via bookmarklet
* don't use features service for flagging peers
* add ability for checking if peers feature is enabled
* Use abilities to conditionally use peers feature
* Remove unused features service