Commit Graph

1383 Commits (348fd0f9bae390ea1663b2c11ff3576368d6c2d3)

Author SHA1 Message Date
Paul Banks 8336b5e6b9 XDS Server Config (#4730)
* Config for the coming XDS server

* Default gRPC to 8502 for -dev mode; Re-merge the command Info output that shows gRPC.
2018-10-10 16:55:34 +01:00
Paul Banks 0f27ffd163 Proxy Config Manager (#4729)
* Proxy Config Manager

This component watches for local state changes on the agent and ensures that each service registered locally with Kind == connect-proxy has it's state being actively populated in the cache.

This serves two purposes:
 1. For the built-in proxy, it ensures that the state needed to accept connections is available in RAM shortly after registration and likely before the proxy actually starts accepting traffic.
 2. For (future - next PR) xDS server and other possible future proxies that require _push_ based config discovery, this provides a mechanism to subscribe and be notified about updates to a proxy instance's config including upstream service discovery results.

* Address review comments

* Better comments; Better delivery of latest snapshot for slow watchers; Embed Config

* Comment typos

* Add upstream Stringer for funsies
2018-10-10 16:55:34 +01:00
Paul Banks 96b9b95a19 Add cache.Notify to abstract watching for cache updates for types that support blocking semantics. (#4695) 2018-10-10 16:55:34 +01:00
Paul Banks e812f5516a Add -sidecar-for and new /agent/service/:service_id endpoint (#4691)
- A new endpoint `/v1/agent/service/:service_id` which is a generic way to look up the service for a single instance. The primary value here is that it:
   - **supports hash-based blocking** and so;
   - **replaces `/agent/connect/proxy/:proxy_id`** as the mechanism the built-in proxy uses to read its config.
   - It's not proxy specific and so works for any service.
   - It has a temporary shim to call through to the existing endpoint to preserve current managed proxy config defaulting behaviour until that is removed entirely (tested).
 - The built-in proxy now uses the new endpoint exclusively for it's config
 - The built-in proxy now has a `-sidecar-for` flag that allows the service ID of the _target_ service to be specified, on the condition that there is exactly one "sidecar" proxy (that is one that has `Proxy.DestinationServiceID` set) for the service registered.
 - Several fixes for edge cases for SidecarService
 - A fix for `Alias` checks - when running locally they didn't update their state until some external thing updated the target. If the target service has no checks registered as below, then the alias never made it past critical.
2018-10-10 16:55:34 +01:00
Paul Banks 1e7eace066 Add SidecarService Syntax sugar to Service Definition (#4686)
* Added new Config for SidecarService in ServiceDefinitions.

* WIP: all the code needed for SidecarService is written... none of it is tested other than config :). Need API updates too.

* Test coverage for the new sidecarServiceFromNodeService method.

* Test API registratrion with SidecarService

* Recursive Key Translation 🤦

* Add tests for nested sidecar defintion arrays to ensure they are translated correctly

* Use dedicated internal state rather than Service Meta for tracking sidecars for deregistration.

Add tests for deregistration.

* API struct for agent register. No other endpoint should be affected yet.

* Additional test cases to cover updates to API registrations
2018-10-10 16:55:34 +01:00
Paul Banks b83bbf248c Add Proxy Upstreams to Service Definition (#4639)
* Refactor Service Definition ProxyDestination.

This includes:
 - Refactoring all internal structs used
 - Updated tests for both deprecated and new input for:
   - Agent Services endpoint response
   - Agent Service endpoint response
   - Agent Register endpoint
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Register
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Services endpoint response
   - Catalog Node endpoint response
   - Catalog Service endpoint response
 - Updated API tests for all of the above too (both deprecated and new forms of register)

TODO:
 - config package changes for on-disk service definitions
 - proxy config endpoint
 - built-in proxy support for new fields

* Agent proxy config endpoint updated with upstreams

* Config file changes for upstreams.

* Add upstream opaque config and update all tests to ensure it works everywhere.

* Built in proxy working with new Upstreams config

* Command fixes and deprecations

* Fix key translation, upstream type defaults and a spate of other subtele bugs found with ned to end test scripts...

TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct.

* Fix translated keys in API registration.
≈

* Fixes from docs
 - omit some empty undocumented fields in API
 - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally.

* Documentation updates for Upstreams in service definition

* Fixes for tests broken by many refactors.

* Enable travis on f-connect branch in this branch too.

* Add consistent Deprecation comments to ProxyDestination uses

* Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs
2018-10-10 16:55:34 +01:00
Paul Banks b06ddc9187 Rename proxy package (re-run of #4550) (#4638)
* Rename agent/proxy package to reflect that it is limited to managed proxy processes

Rationale: we have several other components of the agent that relate to Connect proxies for example the ProxyConfigManager component needed for Envoy work. Those things are pretty separate from the focus of this package so far which is only concerned with managing external proxy processes so it's nota good fit to put code for that in here, yet there is a naming clash if we have other packages related to proxy functionality that are not in the `agent/proxy` package.

Happy to bikeshed the name. I started by calling it `managedproxy` but `managedproxy.Manager` is especially unpleasant. `proxyprocess` seems good in that it's more specific about purpose but less clearly connected with the concept of "managed proxies". The names in use are cleaner though e.g. `proxyprocess.Manager`.

This rename was completed automatically using golang.org/x/tools/cmd/gomvpkg.

Depends on #4541

* Fix missed windows tagged files
2018-10-10 16:55:34 +01:00
Paul Banks 88388d760d Support Agent Caching for Service Discovery Results (#4541)
* Add cache types for catalog/services and health/services and basic test that caching works

* Support non-blocking cache types with Cache-Control semantics.

* Update API docs to include caching info for every endpoint.

* Comment updates per PR feedback.

* Add note on caching to the 10,000 foot view on the architecture page to make the new data path more clear.

* Document prepared query staleness quirk and force all background requests to AllowStale so we can spread service discovery load across servers.
2018-10-10 16:55:34 +01:00
Igal Shprincis e1fe3af37f watch: don't set TLSConfig.Address explicitly (#4727)
Don't set the value of TLSConfig.Address explicitly.

This will make sure env vars like CONSUL_TLS_SERVER_NAME are taken into account for the connection. Fixes #4718.
2018-10-08 22:01:17 +02:00
Paul Banks e8ba527f23
Add a Close method to cache that stops background goroutines. (#4746)
In a real agent the `cache` instance is alive until the agent shuts down so this is not a real leak in production, however in out test suite, every testAgent that is started and stops leaks goroutines that never get cleaned up which accumulate consuming CPU and memory through subsequent test in the `agent` package which doesn't help our test flakiness.

This adds a Close method that doesn't invalidate or clean up the cache, and still allows concurrent blocking queries to run (for up to 10 mins which might still affect tests). But at least it doesn't maintain them forever with background refresh and an expiry watcher routine.

It would be nice to cancel any outstanding blocking requests as well when we close but that requires much more invasive surgery right into our RPC protocol since we don't have a way to cancel requests currently.

Unscientifically this seems to make tests pass a bit quicker and more reliably locally but I can't really be sure of that!
2018-10-04 11:27:11 +01:00
Paul O'Connor 6b7f03911e Fix prometheus error message (#4745) 2018-10-03 14:47:56 -07:00
R.B. Boyer 491826ddbc
cli: forward SIGTERM to child process of 'lock' and 'watch' subcommands (#4737)
cli: forward SIGTERM to child process of 'lock' and 'watch' subcommands on unix

This also removes the signal handler for SIGKILL as it's impossible to receive these signals.
2018-10-02 15:57:21 -05:00
Alex Dadgar 43d0f96c42 do not bootstrap with non voters 2018-09-19 17:41:36 -07:00
Kyle Havlovitz 57deb28ade connect/ca: tighten up the intermediate signing verification 2018-09-14 16:08:54 -07:00
Kyle Havlovitz 2919519665 connect/ca: add intermediate functions to Vault ca provider 2018-09-13 13:38:32 -07:00
Kyle Havlovitz 52e8652ac5 connect/ca: add intermediate functions to Consul CA provider 2018-09-13 13:09:21 -07:00
Kyle Havlovitz d515d25856
Merge pull request #4644 from hashicorp/ca-refactor
connect/ca: rework initialization/root generation in providers
2018-09-13 13:08:34 -07:00
mkeeler 48d287ef69
Release v1.2.3 2018-09-13 15:22:25 +00:00
Paul Banks 74f2a80a42
Fix CA pruning when CA config uses string durations. (#4669)
* Fix CA pruning when CA config uses string durations.

The tl;dr here is:

 - Configuring LeafCertTTL with a string like "72h" is how we do it by default and should be supported
 - Most of our tests managed to escape this by defining them as time.Duration directly
 - Out actual default value is a string
 - Since this is stored in a map[string]interface{} config, when it is written to Raft it goes through a msgpack encode/decode cycle (even though it's written from server not over RPC).
 - msgpack decode leaves the string as a `[]uint8`
 - Some of our parsers required string and failed
 - So after 1 hour, a default configured server would throw an error about pruning old CAs
 - If a new CA was configured that set LeafCertTTL as a time.Duration, things might be OK after that, but if a new CA was just configured from config file, intialization would cause same issue but always fail still so would never prune the old CA.
 - Mostly this is just a janky error that got passed tests due to many levels of complicated encoding/decoding.

tl;dr of the tl;dr: Yay for type safety. Map[string]interface{} combined with msgpack always goes wrong but we somehow get bitten every time in a new way :D

We already fixed this once! The main CA config had the same problem so @kyhavlov already wrote the mapstructure DecodeHook that fixes it. It wasn't used in several places it needed to be and one of those is notw in `structs` which caused a dependency cycle so I've moved them.

This adds a whole new test thta explicitly tests the case that broke here. It also adds tests that would have failed in other places before (Consul and Vaul provider parsing functions). I'm not sure if they would ever be affected as it is now as we've not seen things broken with them but it seems better to explicitly test that and support it to not be bitten a third time!

* Typo fix

* Fix bad Uint8 usage
2018-09-13 15:43:00 +01:00
Hans Hasselberg 8e235a72b4
Allow disabling the HTTP API again. (#4655)
If you provide an invalid HTTP configuration consul will still start again instead of failing. But if you do so the build-in proxy won't be able to start which you might need for connect.
2018-09-13 16:06:04 +02:00
Kyle Havlovitz 5c7fbc284d connect/ca: hash the consul provider ID and include isRoot 2018-09-12 13:44:15 -07:00
Pierre Souchay 1a906ef34e Fix more unstable tests in agent and command 2018-09-12 14:49:27 +01:00
Kyle Havlovitz c112a72880
connect/ca: some cleanup and reorganizing of the new methods 2018-09-11 16:43:04 -07:00
Pierre Souchay 2fe728c7bd Ensure that Proxies ARE always cleaned up, event with DeregisterCriticalServiceAfter (#4649)
This fixes https://github.com/hashicorp/consul/issues/4648
2018-09-11 17:34:09 +01:00
Matt Keeler d3ee66eed4
Add ECS option to EDNS responses where appropriate (#4647)
This implements parts of RFC 7871 where Consul is acting as an authoritative name server (or forwarding resolver when recursors are configured)

If ECS opt is present in the request we will mirror it back and return a response with a scope of 0 (global) or with the same prefix length as the request (indicating its valid specifically for that subnet).

We only mirror the prefix-length (non-global) for prepared queries as those could potentially use nearness checks that could be affected by the subnet. In the future we could get more sophisticated with determining the scope bits and allow for better caching of prepared queries that don’t rely on nearness checks.

The other thing this does not do is implement the part of the ECS RFC related to originating ECS headers when acting as a intermediate DNS server (forwarding resolver). That would take a quite a bit more effort and in general provide very little value. Consul will currently forward the ECS headers between recursors and the clients transparently, we just don't originate them for non-ECS clients to get potentially more accurate "location aware" results.
2018-09-11 09:37:46 -04:00
Pierre Souchay 22500f242e Fix unstable tests in agent, api, and command/watch 2018-09-10 16:58:53 +01:00
Mitchell Hashimoto 49b165965d
Merge pull request #4642 from hashicorp/f-ui-meta
agent: aggregate service instance meta for UI purposes
2018-09-07 17:36:23 -07:00
Mitchell Hashimoto b95348c4b1
agent: ExternalSources instead of Meta 2018-09-07 10:06:55 -07:00
Matt Keeler cc8327ed9a
Ensure that errors setting up the DNS servers get propagated back to the shell (#4598)
Fixes: #4578 

Prior to this fix if there was an error binding to ports for the DNS servers the error would be swallowed by the gated log writer and never output. This fix propagates the DNS server errors back to the shell with a multierror.
2018-09-07 10:48:29 -04:00
Pierre Souchay eddcf228ea Implementation of Weights Data structures (#4468)
* Implementation of Weights Data structures

Adding this datastructure will allow us to resolve the
issues #1088 and #4198

This new structure defaults to values:
```
   { Passing: 1, Warning: 0 }
```

Which means, use weight of 0 for a Service in Warning State
while use Weight 1 for a Healthy Service.
Thus it remains compatible with previous Consul versions.

* Implemented weights for DNS SRV Records

* DNS properly support agents with weight support while server does not (backwards compatibility)

* Use Warning value of Weights of 1 by default

When using DNS interface with only_passing = false, all nodes
with non-Critical healthcheck used to have a weight value of 1.
While having weight.Warning = 0 as default value, this is probably
a bad idea as it breaks ascending compatibility.

Thus, we put a default value of 1 to be consistent with existing behaviour.

* Added documentation for new weight field in service description

* Better documentation about weights as suggested by @banks

* Return weight = 1 for unknown Check states as suggested by @banks

* Fixed typo (of -> or) in error message as requested by @mkeeler

* Fixed unstable unit test TestRetryJoin

* Fixed unstable tests

* Fixed wrong Fatalf format in `testrpc/wait.go`

* Added notes regarding DNS SRV lookup limitations regarding number of instances

* Documentation fixes and clarification regarding SRV records with weights as requested by @banks

* Rephrase docs
2018-09-07 15:30:47 +01:00
Kyle Havlovitz 546bdf8663
connect/ca: add Configure/GenerateRoot to provider interface 2018-09-06 19:18:59 -07:00
Mitchell Hashimoto e9ea190df0
agent: aggregate service instance meta for UI purposes 2018-09-06 12:19:05 -07:00
Mitchell Hashimoto 99eb154f6f
agent: configure k8s go-discover 2018-09-05 13:38:13 -07:00
Martin feb3ce4ee0 Use target service name instead of ID as connect proxy service name (#4620) 2018-09-05 20:33:17 +01:00
Pierre Souchay 9a2ae6e8eb Fixed more flaky tests in ./agent/consul (#4617) 2018-09-04 14:02:47 +01:00
Pierre Souchay 92acdaa94c Fixed flaky tests (#4626) 2018-09-04 12:31:51 +01:00
Siva Prasad ca35d04472
Adds a new command line flag -log-file for file based logging. (#4581)
* Added log-file flag to capture Consul logs in a user specified file

* Refactored code.

* Refactored code. Added flags to rotate logs based on bytes and duration

* Added the flags for log file and log rotation on the webpage

* Fixed TestSantize from failing due to the addition of 3 flags

* Introduced changes : mutex, data-dir log writes, rotation logic

* Added test for logfile and updated the default log destination for docs

* Log name now uses UnixNano

* TestLogFile is now uses t.Parallel()

* Removed unnecessary int64Val function

* Updated docs to reflect default log name for log-file

* No longer writes to data-dir and adds .log if the filename has no extension
2018-08-29 16:56:58 -04:00
Freddy d7a404f2ee
Bugfix: Use "%#v" when formatting structs (#4600) 2018-08-28 12:37:34 -04:00
Siva Prasad b1a34f899f
TestAgentAntiEntropy: Wait until Consul service is up on the agent. (#4591)
* Anti-Entropy test wait for Consul service added

* Reverted some tests back to using WaitForLeader
2018-08-28 09:52:11 -04:00
Pierre Souchay 5e0218ccf4 Fix unit test TestOperatorAutopilotGetConfigCommand (#4594) 2018-08-27 13:29:25 -04:00
Pierre Souchay aea31d3c5d Fixed unstable test TestUiNodeInfo (#4586) 2018-08-27 11:49:14 -04:00
Pierre Souchay b898131723 [BUGFIX] Avoid returning empty data on startup of a non-leader server (#4554)
Ensure that DB is properly initialized when performing stale queries

Addresses:
- https://github.com/hashicorp/consul-replicate/issues/82
- https://github.com/hashicorp/consul/issues/3975
- https://github.com/hashicorp/consul-template/issues/1131
2018-08-23 12:06:39 -04:00
Miroslav Bagljas 3c23979afd Fixes #4483: Add support for Authorization: Bearer token Header (#4502)
Added Authorization Bearer token support as per RFC6750

* appended Authorization header token parsing after X-Consul-Token
* added test cases
* updated website documentation to mention Authorization header

* improve tests, improve Bearer parsing
2018-08-17 16:18:42 -04:00
Matt Keeler e81c85c051
Fix #4515: Segfault when serf_wan port was -1 but reconnect_time_wan was set (#4531)
Fixes #4515 

This just slightly refactors the logic to only attempt to set the serf wan reconnect timeout when the rest of the serf wan settings are configured - thus avoiding a segfault.
2018-08-17 14:44:25 -04:00
Kyle Havlovitz e5e1f867e5
Merge branch 'master' into ca-snapshot-fix 2018-08-16 13:00:54 -07:00
Kyle Havlovitz f186edc42c
fsm: add connect service config to snapshot/restore test 2018-08-16 12:58:54 -07:00
nickmy9729 beddf03b26 Added code to allow snapshot inclusion of NodeMeta (#4527) 2018-08-16 15:33:35 -04:00
Kyle Havlovitz b51d76f469
fsm: add missing CA config to snapshot/restore logic 2018-08-16 11:58:50 -07:00
Kyle Havlovitz 4b35d877ca
autopilot: don't follow the normal server removal rules for nonvoters 2018-08-14 14:24:51 -07:00
Kyle Havlovitz ea14482376
Fix stats fetcher healthcheck RPCs not being independent 2018-08-14 14:23:52 -07:00
Pierre Souchay 0d6de257a2 Display more information about check being not properly added when it fails (#4405)
* Display more information about check being not properly added when it fails

It follows an incident where we add lots of error messages:

  [WARN] consul.fsm: EnsureRegistration failed: failed inserting check: Missing service registration

That seems related to Consul failing to restart on respective agents.

Having Node information as well as service information would help diagnose the issue.

* Renamed ensureCheckIfNodeMatches() as requested by @banks
2018-08-14 17:45:33 +01:00
Freddy 6d43d24edb
Improve reliability of tests with TestAgent (#4525)
- Add WaitForTestAgent to tests flaky due to missing serfHealth registration

- Fix bug in retries calling Fatalf with *testing.T

- Convert TestLockCommand_ChildExitCode to table driven test
2018-08-14 12:08:33 -04:00
Pierre Souchay ef3b81ab13 Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415)
* Allow to rename nodes with IDs, will fix #3974 and #4413

This change allow to rename any well behaving recent agent with an
ID to be renamed safely, ie: without taking the name of another one
with case insensitive comparison.

Deprecated behaviour warning
----------------------------

Due to asceding compatibility, it is still possible however to
"take" the name of another name by not providing any ID.

Note that when not providing any ID, it is possible to have 2 nodes
having similar names with case differences, ie: myNode and mynode
which might lead to DB corruption on Consul server side and
lead to server not properly restarting.

See #3983 and #4399 for Context about this change.

Disabling registration of nodes without IDs as specified in #4414
should probably be the way to go eventually.

* Removed the case-insensitive search when adding a node within the else
block since it breaks the test TestAgentAntiEntropy_Services

While the else case is probably legit, it will be fixed with #4414 in
a later release.

* Added again the test in the else to avoid duplicated names, but
enforce this test only for nodes having IDs.

Thus most tests without any ID will work, and allows us fixing

* Added more tests regarding request with/without IDs.

`TestStateStore_EnsureNode` now test registration and renaming with IDs

`TestStateStore_EnsureNodeDeprecated` tests registration without IDs
and tests removing an ID from a node as well as updated a node
without its ID (deprecated behaviour kept for backwards compatibility)

* Do not allow renaming in case of conflict, including when other node has no ID

* Fixed function GetNodeID that was not working due to wrong type when searching node from its ID

Thus, all tests about renaming were not working properly.

Added the full test cas that allowed me to detect it.

* Better error messages, more tests when nodeID is not a valid UUID in GetNodeID()

* Added separate TestStateStore_GetNodeID to test GetNodeID.

More complete test coverage for GetNodeID

* Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn`

Also fixed comments to be clearer after remarks from @banks

* Fixed error message in unit test to match test case

* Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler
2018-08-10 11:30:45 -04:00
Siva Prasad c88900aaa9
PR to fix TestAgent_IndexChurn and TestPreparedQuery_Wrapper. (#4512)
* Fixes TestAgent_IndexChurn

* Fixes TestPreparedQuery_Wrapper

* Increased sleep in agent_test for IndexChurn to 500ms

* Made the comment about joinWAN operation much less of a cliffhanger
2018-08-09 12:40:07 -04:00
Armon Dadgar 4f1fd34e9e consul: Update buffer sizes 2018-08-08 10:26:58 -07:00
Siva Prasad 288d350a73
Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix." (#4497)
* Revert "BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)"

This reverts commit cec5d72396.

* Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)"

This reverts commit 589b589b53.
2018-08-07 08:29:48 -04:00
Pierre Souchay cec5d72396 BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)
- Improve resilience of testrpc.WaitForLeader()

- Add additionall retry to CI

- Increase "go test" timeout to 8m

- Add wait for cluster leader to several tests in the agent package

- Add retry to some tests in the api and command packages
2018-08-06 19:46:09 -04:00
Siva Prasad 589b589b53
CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)
* connect: fix an issue with Consul CA bootstrapping being interrupted

* streamline change server id test
2018-08-06 16:15:24 -04:00
Siva Prasad 865068a358
DNS : Fixes recursors answering the DNS query to properly return the correct response. (#4461)
* Fixes the DNS recursor properly resolving the requests

* Added a test case for the recursor bug

* Refactored code && added a test case for all failing recursors

* Inner indentation moved into else if check
2018-08-02 10:12:52 -04:00
Paul Banks 71dd3b408a
Fixes memory leak when blocking on /event/list (#4482) 2018-08-02 14:54:48 +01:00
mkeeler e716d1b5f8
Release v1.2.2 2018-07-30 16:01:13 +00:00
Matt Keeler 870a6ad6a8
Handle resolving proxy tokens when parsing HTTP requests (#4453)
Fixes: #4441

This fixes the issue with Connect Managed Proxies + ACLs being broken.

The underlying problem was that the token parsed for most http endpoints was sent untouched to the servers via the RPC request. These changes make it so that at the HTTP endpoint when parsing the token we additionally attempt to convert potential proxy tokens into regular tokens before sending to the RPC endpoint. Proxy tokens are only valid on the agent with the managed proxy so the resolution has to happen before it gets forwarded anywhere.
2018-07-30 09:11:51 -04:00
Matt Keeler 0e0227792b
Gossip tuneables (#4444)
Expose a few gossip tuneables for both lan and wan interfaces

gossip_nodes
gossip_interval
probe_timeout
probe_interval
retransmit_mult
suspicion_mult
2018-07-26 11:39:49 -04:00
Kyle Havlovitz fa0d8aff33
fix inconsistency in TestConnectCAConfig_GetSet 2018-07-26 07:46:47 -07:00
Paul Banks 8dd50d5b2d
Add config option to disable HTTP printable char path check (#4442) 2018-07-26 13:53:39 +01:00
Kyle Havlovitz ed87949385
Merge pull request #4400 from hashicorp/leaf-cert-ttl
Add configurable leaf cert TTL to Connect CA
2018-07-25 17:53:25 -07:00
Kyle Havlovitz f67a4d59c0
connect/ca: simplify passing of leaf cert TTL 2018-07-25 17:51:45 -07:00
Siva Prasad f4a1c381a5 Vendoring update for go-discover. (#4412)
* New Providers added and updated vendoring for go-discover

* Vendor.json formatted using make vendorfmt

* Docs/Agent/auto-join: Added documentation for the new providers introduced in this PR

* Updated the golang.org/x/sys/unix in the vendor directory

* Agent: TestGoDiscoverRegistration updated to reflect the addition of new providers

* Deleted terraform.tfstate from vendor.

* Deleted terraform.tfstate.backup

Deleted terraform state file artifacts from unknown runs.

* Updated x/sys/windows vendor for Windows binary compilation
2018-07-25 16:21:04 -07:00
Paul Banks 8cbeb29e73
Fixes #4421: General solution to stop blocking queries with index 0 (#4437)
* Fix theoretical cache collision bug if/when we use more cache types with same result type

* Generalized fix for blocking query handling when state store methods return zero index

* Refactor test retry to only affect CI

* Undo make file merge

* Add hint to error message returned to end-user requests if Connect is not enabled when they try to request cert

* Explicit error for Roots endpoint if connect is disabled

* Fix tests that were asserting old behaviour
2018-07-25 20:26:27 +01:00
Paul Banks 5635227fa6
Allow config-file based Service Definitions for unmanaged proxies and Connect-natice apps. (#4443) 2018-07-25 19:55:41 +01:00
Paul Banks d5e934f9ff
Ooops that was meant to be to a branch no master... EMORECOFFEE
Revert "Add config option to disable HTTP printable char path check"

This reverts commit eebe45a47b.
2018-07-25 15:54:11 +01:00
Paul Banks eebe45a47b
Add config option to disable HTTP printable char path check 2018-07-25 15:52:37 +01:00
Paul Banks e954450dec
Merge pull request #4353 from azam/add-serf-lan-wan-port-args
Make RPC, Serf LAN, Serf WAN port configurable from CLI
2018-07-24 12:33:10 +01:00
Kyle Havlovitz ce10de036e
connect/ca: check LeafCertTTL when rotating expired roots 2018-07-20 16:04:04 -07:00
Mitchell Hashimoto 7fa6bb022f
Merge pull request #4320 from hashicorp/f-alias-check
Add "Alias" Check Type
2018-07-20 13:01:33 -05:00
azam 342bcb1c24 Make Serf LAN & WAN port configurable from CLI
Make RPC port accessible to CLI

Add tests and documentation for server-port, serf-lan-port, serf-wan-port CLI arguments
2018-07-21 02:17:21 +09:00
Mitchell Hashimoto b3854fdd28
agent/local: silly spacing on select statements 2018-07-19 14:21:30 -05:00
Mitchell Hashimoto 8c72bb0cdf
agent/local: address remaining test feedback 2018-07-19 14:20:50 -05:00
Matt Keeler 560c9c26f7 Use the agent logger instead of log module 2018-07-19 11:22:01 -04:00
Matt Keeler ca5851318d Update a couple erroneous tests. 2018-07-19 09:20:51 -04:00
Mitchell Hashimoto 9f128e40d6
agent/local: don't use time.After in test since notify is instant 2018-07-18 16:16:28 -05:00
Matt Keeler 3fe5f566f2 Persist proxies from config files
Also change how loadProxies works. Now it will load all persisted proxies into a map, then when loading config file proxies will look up the previous proxy token in that map.
2018-07-18 17:04:35 -04:00
Kyle Havlovitz d6ca015a42
connect/ca: add configurable leaf cert TTL 2018-07-16 13:33:37 -07:00
Matt Keeler c891e264ca Fix issue with choosing a client addr that is 0.0.0.0 or :: 2018-07-16 16:30:15 -04:00
Mitchell Hashimoto 9a90400821
agent/checks: prevent overflow of backoff 2018-07-12 10:21:49 -07:00
Mitchell Hashimoto d6ecd97d1d
agent: use the correct ACL token for alias checks 2018-07-12 10:17:53 -07:00
Mitchell Hashimoto f97bfd5be8
agent: address some basic feedback 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto 19ced12668
agent: alias checks have no interval 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto 5bc27feb0b
agent/structs: check is alias if node is empty 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto 36e330941a
agent/checks: support node-only checks 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto 1e9233eec1
agent/checks: set critical if RPC fails 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto e9914ee71c
agent/checks: use local state for local services 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto 7543d270e2
agent/local: support local alias checks 2018-07-12 09:36:10 -07:00
Mitchell Hashimoto 4a67beb734
agent: run alias checks 2018-07-12 09:36:10 -07:00
Mitchell Hashimoto 60c75b88da
agent/checks: reflect node failure as alias check failure 2018-07-12 09:36:10 -07:00
Mitchell Hashimoto f0658a0ede
agent/config: support configuring alias check 2018-07-12 09:36:10 -07:00
Mitchell Hashimoto 632e4a2c69
agent/checks: add Alias check type 2018-07-12 09:36:09 -07:00
mkeeler 39f93f011e
Release v1.2.1 2018-07-12 16:33:56 +00:00
Matt Keeler 63d5c069fc
Merge pull request #4379 from hashicorp/persist-intermediates
connect: persist intermediate CAs on leader change
2018-07-12 12:09:13 -04:00
Paul Banks 9015cd62ab
Merge pull request #4381 from hashicorp/proxy-check-default
Proxy check default
2018-07-12 17:08:35 +01:00
Matt Keeler 0e83059d1f
Revert "Allow changing Node names since Node now have IDs" 2018-07-12 11:19:21 -04:00
Matt Keeler 91150cca59 Fixup formatting 2018-07-12 10:14:26 -04:00
Matt Keeler 3807e04de9 Revert PR 4294 - Catalog Register: Generate UUID for services registered without one
UUID auto-generation here causes trouble in a few cases. The biggest being older
nodes reregistering will fail when the UUIDs are different and the names match

This reverts commit 0f70034082.
This reverts commit d1a8f9cb3f.
This reverts commit cf69ec42a4.
2018-07-12 10:06:50 -04:00
Matt Keeler 7572ca0f37
Merge pull request #4374 from hashicorp/feature/proxy-env-vars
Setup managed proxy environment with API client env vars
2018-07-12 09:13:54 -04:00
Paul Banks 8405b41f2b
Update proxy config docs and add test for ipv6 2018-07-12 13:07:48 +01:00
Paul Banks bb9a5c703b
Default managed proxy TCP check address sanely when proxy is bound to 0.0.0.0.
This also provides a mechanism to configure custom address or disable the check entirely from managed proxy config.
2018-07-12 12:57:10 +01:00
Matt Keeler 0f56ed2d01 Set api.Config’s InsecureSkipVerify to the value of !RuntimeConfig.VerifyOutgoing 2018-07-12 07:49:23 -04:00
Matt Keeler 22e4058893 Use type switch instead of .Network for more reliably detecting UnixAddrs 2018-07-12 07:30:17 -04:00
Matt Keeler 700a275ddf Look specifically for tcp instead of unix
Add runtime -> api.Config tests
2018-07-11 17:25:36 -04:00
Matt Keeler c8df4b824c Update proxy manager test - test passing ProxyEnv vars 2018-07-11 16:50:27 -04:00
Kyle Havlovitz f95c6807e7
connect: use reflect.DeepEqual instead for test 2018-07-11 13:10:58 -07:00
Matt Keeler 98ead2a8f8
Merge pull request #3983 from pierresouchay/node_renaming
Allow changing Node names since Node now have IDs
2018-07-11 16:03:02 -04:00
Kyle Havlovitz 4e5fb6bc19
connect: add provider state to snapshots 2018-07-11 11:34:49 -07:00
Kyle Havlovitz 462ace4867
connect: update leader initializeCA comment 2018-07-11 10:00:42 -07:00
Kyle Havlovitz 1d3f4b5099
connect: persist intermediate CAs on leader change 2018-07-11 09:44:30 -07:00
Matt Keeler c54b43bef3 PR Updates
Proxy now doesn’t need to know anything about the api as we pass env vars to it instead of the api config.
2018-07-11 09:44:54 -04:00
Matt Keeler 4d1ead10b3
Merge pull request #4371 from hashicorp/bugfix/gh-4358
Remove https://prefix from TLSConfig.Address
2018-07-11 08:50:10 -04:00
Pierre Souchay fecae3de21 When renaming a node, ensure the name is not taken by another node.
Since DNS is case insensitive and DB as issues when similar names with different
cases are added, check for unicity based on case insensitivity.

Following another big incident we had in our cluster, we also validate
that adding/renaming a not does not conflicts with case insensitive
matches.

We had the following error once:

 - one node called: mymachine.MYDC.mydomain was shut off
 - another node (different ID) was added with name: mymachine.mydc.mydomain before
   72 hours

When restarting the consul server of domain, the consul server restarted failed
to start since it detected an issue in RAFT database because
mymachine.MYDC.mydomain and mymachine.mydc.mydomain had the same names.

Checking at registration time with case insensitivity should definitly fix
those issues and avoid Consul DB corruption.
2018-07-11 14:42:54 +02:00
Matt Keeler bd76a34002
Merge pull request #4365 from pierresouchay/fix_test_warning
Fixed compilation warning about wrong type
2018-07-10 16:53:29 -04:00
Matt Keeler 3b6eef8ec6 Pass around an API Config object and convert to env vars for the managed proxy 2018-07-10 12:13:51 -04:00
Pierre Souchay 7d2e4b77ec Use %q, not %s as it used to 2018-07-10 16:52:08 +02:00
Matt Keeler 0fd7e97c2d Merge remote-tracking branch 'origin/master' into bugfix/prevent-multi-cname 2018-07-10 10:26:45 -04:00
Matt Keeler d19c7d8882
Merge pull request #4303 from pierresouchay/non_blocking_acl
Only send one single ACL cache refresh across network when TTL is over
2018-07-10 08:57:33 -04:00
Matt Keeler d066fb7b18
Merge pull request #4362 from hashicorp/bugfix/gh-4354
Ensure TXT RRs always end up in the Additional section except for ANY or TXT queries
2018-07-10 08:50:31 -04:00
Pierre Souchay b112bdd52d Fixed compilation warning about wrong type
It fixes the following warnings:

  agent/config/builder.go:1201: Errorf format %q has arg s of wrong type *string
  agent/config/builder.go:1240: Errorf format %q has arg s of wrong type *string
2018-07-09 23:43:56 +02:00
Paul Banks 41c3a4ac8e
Merge pull request #4038 from pierresouchay/ACL_additional_info
Track calls blocked by ACLs using metrics
2018-07-09 20:21:21 +01:00
MagnumOpus21 371f0c3d5f Tests/Proxy : Changed function name to match the system being tested. 2018-07-09 13:18:57 -04:00
MagnumOpus21 9d57b72e81 Resolved merge conflicts 2018-07-09 12:48:34 -04:00
MagnumOpus21 300330e24b Agent/Proxy: Formatting and test cases fix 2018-07-09 12:46:10 -04:00
Matt Keeler 962f6a1816 Remove https://prefix from TLSConfig.Address 2018-07-09 12:31:15 -04:00
Matt Keeler cbf8f14451 Ensure TXT RRs always end up in the Additional section except for ANY or TXT queries
This also changes where the enforcement of the enable_additional_node_meta_txt configuration gets applied.

formatNodeRecord returns the main RRs and the meta/TXT RRs in separate slices. Its then up to the caller to add to the appropriate sections or not.
2018-07-09 12:30:11 -04:00
MagnumOpus21 94e8ff55cf Proxy/Tests: Added test cases to check env variables 2018-07-09 12:28:29 -04:00
MagnumOpus21 6cecf2961d Agent/Proxy : Properly passes env variables to child 2018-07-09 12:28:29 -04:00
Pierre Souchay ff53648df2 Merge remote-tracking branch 'origin/master' into ACL_additional_info 2018-07-07 14:09:18 +02:00
Pierre Souchay 0e4e451a56 Fixed indentation in test 2018-07-07 14:03:34 +02:00
Kyle Havlovitz 401b206a2e
Store the time CARoot is rotated out instead of when to prune 2018-07-06 16:05:25 -07:00
MagnumOpus21 1cd1b55682 Agent/Proxy : Properly passes env variables to child 2018-07-05 22:04:29 -04:00
Matt Keeler e3783a75e7 Refactor to make this much less confusing 2018-07-03 11:04:19 -04:00
Matt Keeler 554035974e Add a bunch of comments about preventing multi-cname
Hopefully this a bit clearer as to the reasoning
2018-07-03 10:32:52 -04:00
Matt Keeler 22c2be5bf1 Fix some edge cases and add some tests. 2018-07-02 16:58:52 -04:00
Matt Keeler 9a8500412b Only allow 1 CNAME when querying for a service.
This just makes sure that if multiple services are registered with unique service addresses that we don’t blast back multiple CNAMEs for the same service DNS name and keeps us within the DNS specs.
2018-07-02 16:12:06 -04:00
Kyle Havlovitz 1492243e0a
connect/ca: add logic for pruning old stale RootCA entries 2018-07-02 10:35:05 -07:00
Matt Keeler 8a12d803fd
Merge pull request #4315 from hashicorp/bugfix/fix-server-enterprise
Move starting enterprise functionality
2018-07-02 12:28:10 -04:00
Pierre Souchay bd023f352e Updated swith case to use same branch for async-cache and extend-cache 2018-07-02 17:39:34 +02:00
Pierre Souchay 1e7665c0d5 Updated documentation and adding more test case for async-cache 2018-07-01 23:50:30 +02:00
Pierre Souchay abde81a3e7 Added async-cache with similar behaviour as extend-cache but asynchronously 2018-07-01 23:50:30 +02:00
Pierre Souchay 9406ca1c95 Only send one single ACL cache refresh across network when TTL is over
It will allow the following:

 * when connectivity is limited (saturated linnks between DCs), only one
   single request to refresh an ACL will be sent to ACL master DC instead
   of statcking ACL refresh queries
 * when extend-cache is used for ACL, do not wait for result, but refresh
   the ACL asynchronously, so no delay is not impacting slave DC
 * When extend-cache is not used, keep the existing blocking mechanism,
   but only send a single refresh request.

This will fix https://github.com/hashicorp/consul/issues/3524
2018-07-01 23:50:30 +02:00
Abhishek Chanda 36306c0076 Change bind_port to an int 2018-06-30 14:18:13 +01:00
Matt Keeler 22b7b688a3
Move starting enterprise functionality 2018-06-29 17:38:29 -04:00
Mitchell Hashimoto 6ef28dece0
agent/config: parse upstreams with multiple service definitions 2018-06-28 15:13:33 -05:00
Mitchell Hashimoto e155d58b19
Merge pull request #4297 from hashicorp/b-intention-500-2
agent: 400 error on invalid UUID format, api handles errors properly
2018-06-28 05:27:19 +02:00
Matt Keeler 0f70034082 Move default uuid test into the consul package 2018-06-27 09:21:58 -04:00
Matt Keeler d1a8f9cb3f go fmt changes 2018-06-27 09:07:22 -04:00
Mitchell Hashimoto 1c3e9af316
agent: 400 error on invalid UUID format, api handles errors properly 2018-06-27 07:40:06 +02:00
Matt Keeler cf69ec42a4 Make sure to generate UUIDs when services are registered without one
This makes the behavior line up with the docs and expected behavior
2018-06-26 17:04:08 -04:00
mkeeler 28141971f9
Release v1.2.0 2018-06-25 19:45:20 +00:00
mkeeler 6813a99081 Merge remote-tracking branch 'connect/f-connect' 2018-06-25 19:42:51 +00:00
Kyle Havlovitz 162daca4d7 revert go changes to hide rotation config 2018-06-25 12:26:18 -07:00
Kyle Havlovitz c20bbf8760 connect/ca: hide the RotationPeriod config field since it isn't used yet 2018-06-25 12:26:18 -07:00
Mitchell Hashimoto a76f652fd2 agent: convert the proxy bind_port to int if it is a float 2018-06-25 12:26:18 -07:00
Matt Keeler 677d6dac80 Remove x509 name constraints
These were only added as SPIFFE intends to use the in the future but currently does not mandate their usage due to patch support in common TLS implementations and some ambiguity over how to use them with URI SAN certificates. We included them because until now everything seem fine with it, however we've found the latest version of `openssl` (1.1.0h) fails to validate our certificats if its enabled. LibreSSL as installed on OS X by default doesn’t have these issues. For now it's most compatible not to have them and later we can find ways to add constraints with wider compatibility testing.
2018-06-25 12:26:10 -07:00
Matt Keeler 163fe11101 Make sure we omit the Kind value in JSON if empty 2018-06-25 12:26:10 -07:00
Jack Pearkes 105c4763dc update UI to latest 2018-06-25 12:25:42 -07:00
Kyle Havlovitz 3baa67cdef connect/ca: pull the cluster ID from config during a rotation 2018-06-25 12:25:42 -07:00
Kyle Havlovitz 8c2c9705d9 connect/ca: use weak type decoding in the Vault config parsing 2018-06-25 12:25:42 -07:00
Kyle Havlovitz b4ef7bb64d connect/ca: leave blank root key/cert out of the default config (unnecessary) 2018-06-25 12:25:42 -07:00
Kyle Havlovitz 050da22473 connect/ca: undo the interface changes and use sign-self-issued in Vault 2018-06-25 12:25:42 -07:00
Kyle Havlovitz 914d9e5e20 connect/ca: add leaf verify check to cross-signing tests 2018-06-25 12:25:41 -07:00
Kyle Havlovitz bc997688e3 connect/ca: update Consul provider to use new cross-sign CSR method 2018-06-25 12:25:41 -07:00
Kyle Havlovitz 8a70ea64a6 connect/ca: update Vault provider to add cross-signing methods 2018-06-25 12:25:41 -07:00
Kyle Havlovitz 6a2fc00997 connect/ca: add URI SAN support to the Vault provider 2018-06-25 12:25:41 -07:00
Kyle Havlovitz 226a59215d connect/ca: fix vault provider URI SANs and test 2018-06-25 12:25:41 -07:00
Kyle Havlovitz 1a8ac686b2 connect/ca: add the Vault CA provider 2018-06-25 12:25:41 -07:00
Paul Banks 51fc48e8a6 Sign certificates valid from 1 minute earlier to avoid failures caused by clock drift 2018-06-25 12:25:41 -07:00
Paul Banks e33bfe249e Note leadership issues in comments 2018-06-25 12:25:41 -07:00
Paul Banks b5f24a21cb Fix test broken by final telemetry PR change! 2018-06-25 12:25:40 -07:00
Paul Banks e514570dfa Actually return Intermediate certificates bundled with a leaf! 2018-06-25 12:25:40 -07:00
Matt Keeler e22b9c8e15 Output the service Kind in the /v1/internal/ui/services endpoint 2018-06-25 12:25:40 -07:00
Paul Banks 17789d4fe3 register TCP check for managed proxies 2018-06-25 12:25:40 -07:00
Paul Banks 280f14d64c Make proxy only listen after initial certs are fetched 2018-06-25 12:25:40 -07:00
Paul Banks 420ae3df69 Limit proxy telemetry config to only be visible with authenticated with a proxy token 2018-06-25 12:25:39 -07:00
Paul Banks 597e55e8e2 Misc test fixes 2018-06-25 12:25:39 -07:00
Paul Banks c6ef6a61c9 Refactor to use embedded struct. 2018-06-25 12:25:39 -07:00
Paul Banks 9f559da913 Revert telemetry config changes ready for cleaner approach 2018-06-25 12:25:39 -07:00
Paul Banks 38405bd4a9 Allow user override of proxy telemetry config 2018-06-25 12:25:38 -07:00
Paul Banks 7649d630c6 Basic proxy telemetry working; not sure if it's too ugly; need to instrument things we care about 2018-06-25 12:25:38 -07:00
Paul Banks d83f2e8e21 Expose telemetry config from RuntimeConfig to proxy config endpoint 2018-06-25 12:25:38 -07:00
Paul Banks 8aeb7bd206 Disable TestAgent proxy execution properly 2018-06-25 12:25:38 -07:00
Paul Banks 2e223ea2b7 Fix hot loop in cache for RPC returning zero index. 2018-06-25 12:25:37 -07:00
Paul Banks 43b48bc06b Get agent cache tests passing without global hit count (which is racy).
Few other fixes in here just to get a clean run locally - they are all also fixed in other PRs but shouldn't conflict.

This should be robust to timing between goroutines now.
2018-06-25 12:25:37 -07:00
Mitchell Hashimoto 155bb67c52 Update UI for beta3 2018-06-25 12:25:16 -07:00
Mitchell Hashimoto 6b1e0a3003 agent/cache: always schedule the refresh 2018-06-25 12:25:14 -07:00
Mitchell Hashimoto 7cbbac43a3 agent: clarify comment 2018-06-25 12:25:14 -07:00
Mitchell Hashimoto a08faf5a11 agent: add additional assertion to test 2018-06-25 12:25:13 -07:00
Paul Banks 2c21ead80e More test tweaks 2018-06-25 12:25:13 -07:00
Paul Banks 05a8097c5d Fix misc test failures (some from other PRs) 2018-06-25 12:25:13 -07:00
Paul Banks 382ce8f98a Only set precedence on write path 2018-06-25 12:25:13 -07:00
Paul Banks 4a54f8f7e3 Fix some tests failures caused by the sorting change and some cuased by previous UpdatePrecedence() change 2018-06-25 12:25:13 -07:00
Paul Banks bf7a62e0e0 Sort intention list by precedence 2018-06-25 12:25:13 -07:00
Mitchell Hashimoto 181fbcc9b9 agent: intention update/delete responess match ACL/KV behavior 2018-06-25 12:25:12 -07:00