Commit Graph

10320 Commits (000ca15db7d64606168c183dfcabd95740396fab)

Author SHA1 Message Date
Jack Pearkes 17569d7714 Update CHANGELOG.md 2019-07-12 15:54:29 -07:00
Jack Pearkes 338aed32af Merge branch 'master' into release/1-6 2019-07-12 14:51:25 -07:00
Matt Keeler 4728329aeb
Various Gateway Fixes (#6093)
* Ensure the mesh gateway configuration comes back in the api within each upstream

* Add a test for the MeshGatewayConfig in the ToAPI functions

* Ensure we don’t use gateways for dc local connections

* Update the svc kind index for deletions

* Replace the proxycfg.state cache with an interface for testing

Also start implementing proxycfg state testing.

* Update the state tests to verify some gateway watches for upstream-targets of a discovery chain.
2019-07-12 17:19:37 -04:00
Sarah Adams ce3a6e8660
fix flaky test TestACLEndpoint_SecureIntroEndpoints_OnlyCreateLocalData (#6116)
* fix test to write only to dc2 (typo)
* fix retry behavior in existing test (was being used incorrectly)
2019-07-12 14:14:42 -07:00
R.B. Boyer bcd2de3a2e
implement some missing service-router features and add more xDS testing (#6065)
- also implement OnlyPassing filters for non-gateway clusters
2019-07-12 14:16:21 -05:00
R.B. Boyer 8a90185bbd
unknown fields now fail, so omit these unimplemented fields (#6125) 2019-07-12 14:04:15 -05:00
Freddy 97ecc0517c
Fix some retries in api pkg (#6124) 2019-07-12 12:57:41 -06:00
R.B. Boyer 9138a97054
Fix bug in service-resolver redirects if the destination uses a default resolver. (#6122)
Also:
- add back an internal http endpoint to dump a compiled discovery chain for debugging purposes

Before the CompiledDiscoveryChain.IsDefault() method would test:

- is this chain just one resolver step?
- is that resolver step just the default?

But what I forgot to test:

- is that resolver step for the same service that the chain represents?

This last point is important because if you configured just one config
entry:

    kind = "service-resolver"
    name = "web"
    redirect {
      service = "other"
    }

and requested the chain for "web" you'd get back a **default** resolver
for "other".  In the xDS code the IsDefault() method is used to
determine if this chain is "empty". If it is then we use the
pre-discovery-chain logic that just uses data embedded in the Upstream
object (and still lets the escape hatches function).

In the example above that means certain parts of the xDS code were going
to try referencing a cluster named "web..." despite the other parts of
the xDS code maintaining clusters named "other...".
2019-07-12 12:21:25 -05:00
R.B. Boyer 67a36e3452
handle structs.ConfigEntry decoding similarly to api.ConfigEntry decoding (#6106)
Both 'consul config write' and server bootstrap config entries take a
decoding detour through mapstructure on the way from HCL to an actual
struct. They both may take in snake_case or CamelCase (for consistency)
so need very similar handling.

Unfortunately since they are operating on mirror universes of structs
(api.* vs structs.*) the code cannot be identitical, so try to share the
kind-configuration and duplicate the rest for now.
2019-07-12 12:20:30 -05:00
Matt Keeler 6e65811db2
Envoy CLI bind addresses (#6107)
* Ensure we MapWalk the proxy config in the NodeService and ServiceNode structs

This gets rid of some json encoder errors in the catalog endpoints

* Allow passing explicit bind addresses to envoy

* Move map walking to the ConnectProxyConfig struct

Any place where this struct gets JSON encoded will benefit as opposed to having to implement it everywhere.

* Fail when a non-empty address is provided and not bindable

* camel case

* Update command/connect/envoy/envoy.go

Co-Authored-By: Paul Banks <banks@banksco.de>
2019-07-12 12:57:31 -04:00
R.B. Boyer 911ed76e5b
tests: further reduce envoy integration test flakiness (#6112)
In addition to waiting until s2 shows up healthy in the Catalog, wait
until s2 endpoints show up healthy via EDS in the s1 upstream clusters.
2019-07-12 11:12:56 -05:00
Freddy 5873c56a03
Flaky test overhaul (#6100) 2019-07-12 09:52:26 -06:00
Freddy 4033a4d632
Remove dummy config (#6121) 2019-07-12 09:50:14 -06:00
Freddy 0b9111714b
Update TestServer creation in sdk/testutil (#6084)
* Retry the creation of the test server three times.
* Reduce the retry timeout for the API wait to 2 seconds, opting to fail faster and start over.
* Remove wait for leader from server creation. This wait can be added on a test by test basis now that the function is being exported.
* Remove wait for anti-entropy sync. This is built into the existing WaitForSerfCheck func, so that can be used if the anti-entropy wait is needed
2019-07-12 09:37:29 -06:00
Freddy 0f8837824e
Clean up StatsFetcher work when context is exceeded (#6086) 2019-07-12 08:23:28 -06:00
Matt Keeler de23af071a
Move ctx and cancel func setup into the Replicator.Start (#6115)
Previously a sequence of events like:

Start
Stop
Start
Stop

would segfault on the second stop because the original ctx and cancel func were only initialized during the constructor and not during Start.
2019-07-12 10:10:48 -04:00
hashicorp-ci 79e36385ce Merge Consul OSS branch 'master' at commit 3f3cd9a164 2019-07-11 02:00:37 +00:00
R.B. Boyer d4e58e9773 test: for envoy integration tests bump the time to wait for the upstream to be healthy (#6109) 2019-07-10 18:07:47 -04:00
R.B. Boyer 20caa4f744
test: for envoy integration tests, wait until 's2' is healthy in consul before interrogating envoy (#6108)
When the envoy healthy panic threshold was explicitly disabled as part
of L7 traffic management it changed how envoy decided to load balance to
endpoints in a cluster. This only matters when envoy is in "panic mode"
aka "when you have a bunch of unhealthy endpoints". Panic mode sends
traffic to unhealthy instances in certain circumstances.

Note: Prior to explicitly disabling the healthy panic threshold, the
default value is 50%.

What was happening is that the test harness was bringing up consul the
sidecars, and the service instances all at once and sometimes the
proxies wouldn't have time to be checked by consul to be labeled as
'passing' in the catalog before a round of EDS happened.

The xDS server in consul effectively queries /v1/health/connect/s2 and
gets 1 result, but that one result has a 'critical' check so the xDS
server sends back that endpoint labeled as UNHEALTHY.

Envoy sees that 100% of the endpoints in the cluster are unhealthy and
would enter panic mode and still send traffic to s2. This is why the
test suites PRIOR to disabling the healthy panic threshold worked. They
were _incorrectly_ passing.

When the healthy panic threshol is disabled, envoy never enters panic
mode in this situation and thus the cluster has zero healthy endpoints
so load balancing goes nowhere and the tests fail.

Why does this only affect the test suites for envoy 1.8.0? My guess is
that https://github.com/envoyproxy/envoy/pull/4442 was merged into the
1.9.x series and somehow that plays a role.

This PR modifies the bats scripts to explicitly wait until the upstream
sidecar is healthy as measured by /v1/health/connect/s2?passing BEFORE
trying to interrogate envoy which should make the tests less racy.
2019-07-10 15:58:25 -05:00
Jack Pearkes da1f0af18e
Update CHANGELOG.md 2019-07-09 15:41:32 +02:00
Jack Pearkes df6eab83d4 website: link to beta changelog 2019-07-09 13:43:29 +02:00
Jack Pearkes ad9dcfcfe5 website: fix use-case dropdown size 2019-07-09 08:45:58 +02:00
Jack Pearkes 6cf9abcf2f website: remove configuration use-case 2019-07-09 07:47:49 +02:00
Jack Pearkes 81aba02015 website: better mesh call to actions 2019-07-09 05:55:58 +02:00
Jack Pearkes 12dc9019af website: better mesh links into new docs 2019-07-09 05:51:23 +02:00
R.B. Boyer 068a069512
config entry doc snippet for mesh gateways (#6095) 2019-07-08 21:25:25 -05:00
R.B. Boyer edd0d4be5a
Initial L7 Documentation (#6056) 2019-07-08 21:11:19 -05:00
Judith Malnick a0ee71baf9
[docs] Guide - Connecting Services Across Datacenters (#6052)
* add connect gateway guide

* Remove stray space

Co-Authored-By: Freddy <freddygv@users.noreply.github.com>

* Specify stanza and exact options

Co-Authored-By: Freddy <freddygv@users.noreply.github.com>

* incorporate comments from freddy

* integrate feedback from matt

* make snippets all json

* incorporate more comments from matt

* added links

* incorporate comments from neena on google doc draft

* make learn lnks relative

* clarify that gateways are new

* change socat to netcat

* add more description about replication token permissions

* Apply suggestions from code review

Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com>

* add the prerequisite to enable centralized service config

* finish adding docs links
2019-07-09 02:07:51 +02:00
Matt Keeler d4a3c0e661
Initial Mesh Gateway Docs (#6090) 2019-07-08 19:40:57 -04:00
Freddy 3f3cd9a164
Update CHANGELOG.md 2019-07-08 13:43:15 -06:00
Paul Banks 3716bac3b7
Better gateway image 2019-07-08 16:30:51 +02:00
Jack Pearkes e453e8aabb Putting source back into Dev Mode 2019-07-08 16:30:03 +02:00
Jack Pearkes 9013bc5199 website: changes for 1.6.0 beta (#6083)
* website: link to 1.6.0 beta in downloads page

* website: reorganize intention replication/ca federation

* website: remove announcement bar

* Update website/source/docs/connect/connect-internals.html.md

Co-Authored-By: Paul Banks <banks@banksco.de>

* website: update homepage and service mesh page

Aligning messaging to current product.

* website: fix link TODOs

* Add Mesh Gateway to mesh page, update use case wording
2019-07-08 15:12:42 +01:00
hashicorp-ci 83b929accf
Release v1.6.0-beta1 2019-07-08 13:20:36 +00:00
hashicorp-ci 7e379bb7f7
update bindata_assetfs.go 2019-07-08 13:20:35 +00:00
Matt Keeler 52997acb02
Update CHANGELOG.md 2019-07-08 08:54:10 -04:00
Jack Pearkes e6f1b78efb Make cluster names SNI always (#6081)
* Make cluster names SNI always

* Update some tests

* Ensure we check for prepared query types

* Use sni for route cluster names

* Proper mesh gateway mode defaulting when the discovery chain is used

* Ignore service splits from PatchSliceOfMaps

* Update some xds golden files for proper test output

* Allow for grpc/http listeners/cluster configs with the disco chain

* Update stats expectation
2019-07-08 12:48:48 +01:00
hashicorp-ci d2adfc0bda Merge Consul OSS branch 'master' at commit c2c154eaf4 2019-07-08 02:00:37 +00:00
Judith Malnick c2c154eaf4
[docs] Link to TLS guide in Encryption doc (#6071)
Fixes issue #6067
2019-07-07 16:55:03 +02:00
Jack Pearkes b8f5f08efe
Update CHANGELOG.md 2019-07-05 10:37:42 -07:00
Jack Pearkes 7d07a0b397
Update CHANGELOG.md 2019-07-05 10:22:48 -07:00
Michael Schurter b5aab27c21 connect: allow overriding envoy listener bind_address (#6033)
* connect: allow overriding envoy listener bind_address

* Update agent/xds/config.go

Co-Authored-By: Kyle Havlovitz <kylehav@gmail.com>

* connect: allow overriding envoy listener bind_port

* envoy: support unix sockets for grpc in bootstrap

Add AgentSocket BootstrapTplArgs which if set overrides the AgentAddress
and AgentPort to generate a bootstrap which points Envoy to a unix
socket file instead of an ip:port.

* Add a test for passing the consul addr as a unix socket

* Fix config formatting for envoy bootstrap tests

* Fix listeners test cases for bind addr/port

* Update website/source/docs/connect/proxies/envoy.md
2019-07-05 16:06:47 +01:00
John Cowen dcb9800442
ui: Gateway Addresses (#6075)
- Removes 'type' icons (basically the proxy icon, not the text itself)

- Add support for Mesh Gateways plus their addresses
This adds a 'Mesh Gateway' type label to service and service instance
pages, plus a new 'Addresses' tab if the service is a Mesh Gateway
showing a table of addresses for the service - plus tests
2019-07-05 09:07:25 +01:00
Matt Keeler 3d562bee5c Fix Internal.ServiceDump blocking (#6076)
maxIndexWatchTxn was only watching the IndexEntry of the max index of all the entries. It needed to watch all of them regardless of which was the max.

Also plumbed the query source through in the proxy config to help better track requests.
2019-07-04 16:17:49 +01:00
Matt Keeler 52c608df78
`make test-docker` (#6059)
* Implement the test-docker make target

Running tests within docker allows us to resource constrain them better to not take over our systems. Additionally it allows us to run the tests on linux instead of the host OS which often times is macOS.

* Use GOMAXPROCS instead of -p

* Add a comment about docker cpus
2019-07-04 10:22:59 -04:00
Matt Keeler 387a54cfaa
Don't use WatchedDatacenters in the xds code(#6068)
* Don't use WatchedDatacenters in the xds code as that map gets nil'ed before the ConfigSnapshot is sent to the xds layer.
2019-07-03 10:21:34 -04:00
Matt Keeler 62ad0294d4 Don't use WatchedDatacenters in the xds code as thsoe get nil'ed out prior to sending to xds 2019-07-03 09:59:21 -04:00
Matt Keeler d2b00bd591
xds message ordering (#6061)
xds message ordering
2019-07-03 09:18:58 -04:00
hashicorp-ci 7a32c5a618 Merge Consul OSS branch 'master' at commit a58d8e91ac 2019-07-03 02:00:31 +00:00
R.B. Boyer 065550e1c5
ensure consul config write has snake case conversions for MeshGateway (#6062) 2019-07-02 17:15:30 -05:00