127 Commits (5ff0c19709e6a23b74eeb32025fd7b780c056a80)

Author SHA1 Message Date
Pierre Souchay bfcfcc06d0 Revendor memberlist to Fix #3217
Upgrade leads to protocol version (2) is incompatible: [1, 0] (#5313)

This is fixed in https://github.com/hashicorp/memberlist/pull/178, bump
memberlist to fix possible split brain in Consul.
2019-02-05 10:20:14 -05:00
Matt Keeler 8f0d622a54
Revendor serf to pull in keyring list truncation changes. (#5251) 2019-01-22 16:07:04 -05:00
R.B. Boyer b96391ecff
update github.com/hashicorp/{serf,memberlist,go-sockaddr} (#5189)
This activates large-cluster improvements in the gossip layer from
https://github.com/hashicorp/memberlist/pull/167
2019-01-07 15:00:47 -06:00
Matt Keeler 18b29c45c4
New ACLs (#4791)
This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here, but most of it is related to docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week.
Description

At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers.

On the surface and with a new cluster the ACL system will look very similar to that of Nomad. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though.

    Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though.
    All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management.
    Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are:
        A server running the new system must still support other clients using the legacy system.
        A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system.
        The primary ACL DC's servers running in legacy mode need to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode.
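
The datacenter scoping mentioned in the list above can be pictured with a small sketch. It is not Consul's actual data model: the `policy` type and `appliesIn` method are hypothetical names, and treating an empty datacenter list as "valid everywhere" is an assumption made for illustration.

```go
package main

import "fmt"

// policy is a hypothetical, simplified stand-in for an ACL policy record.
// It is stored globally but may be limited to some datacenters.
type policy struct {
	Name        string
	Datacenters []string // assumption: an empty list means valid in every datacenter
}

// appliesIn reports whether the policy is in effect in the given datacenter.
func (p policy) appliesIn(dc string) bool {
	if len(p.Datacenters) == 0 {
		return true
	}
	for _, d := range p.Datacenters {
		if d == dc {
			return true
		}
	}
	return false
}

func main() {
	p := policy{Name: "web-read", Datacenters: []string{"dc1"}}
	fmt.Println(p.appliesIn("dc1"), p.appliesIn("dc2")) // prints "true false"
}
```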

So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.
2018-10-19 12:04:07 -04:00
Matt Keeler d1e52e5292
Update Raft Vendoring (#4539)
Pulls in a fix for a potential memory leak regarding consistent reads that invoke VerifyLeader.
2018-09-06 15:07:42 -04:00
Mitchell Hashimoto bbb13598bf
vendor k8s client lib 2018-09-05 14:59:02 -07:00
Mitchell Hashimoto 66e31f02f7
Update go-discover vendor 2018-09-05 13:31:10 -07:00
Paul Banks 9ce10769ce Update Serf and memberlist (#4511)
This includes fixes that improve gossip scalability on very large (> 10k node) clusters.

The Serf changes:
 - take snapshot disk IO out of the critical path for handling messages hashicorp/serf#524
 - make snapshot compaction much less aggressive - the old fixed threshold caused snapshots to be constantly compacted (synchronously with request handling) on clusters larger than about 2000 nodes! hashicorp/serf#525

Memberlist changes:
 - prioritize handling alive messages over suspect/dead to improve stability, and handle queue in LIFO order to avoid acting on info that's already stale in the queue by the time we handle it. hashicorp/memberlist#159
 - limit the number of concurrent pushPull requests being handled at once to 128. In one test scenario with 10s of thousands of servers we saw channel and lock blocking cause over 3000 pushPulls at once which ballooned the memory of the server because each push pull contained a de-serialised list of all known 10k+ nodes and their tags for a total of about 60 million objects and 7GB of memory stuck. While the rest of the fixes here should prevent the same root cause from blocking in the same way, this prevents any other bug or source of contention from allowing pushPull messages to stack up and eat resources. hashicorp/memberlist#158
2018-08-09 13:16:13 -04:00
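
The pushPull cap described in the commit above can be sketched as a simple in-flight counter. This is a minimal illustration, not memberlist's actual code: `limiter`, `tryAcquire` and `release` are hypothetical names, and only the limit of 128 comes from the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// maxPushPullRequests is the cap quoted in the commit message.
const maxPushPullRequests = 128

// limiter is a hypothetical helper that counts in-flight requests.
type limiter struct {
	inFlight int32
}

// tryAcquire reserves a slot, or reports false when the cap is reached
// so the caller can reject the request instead of letting it pile up.
func (l *limiter) tryAcquire() bool {
	if atomic.AddInt32(&l.inFlight, 1) > maxPushPullRequests {
		atomic.AddInt32(&l.inFlight, -1)
		return false
	}
	return true
}

// release frees a slot taken by a successful tryAcquire.
func (l *limiter) release() {
	atomic.AddInt32(&l.inFlight, -1)
}

func main() {
	var l limiter
	accepted := 0
	// Simulate 200 requests arriving while none have finished yet.
	for i := 0; i < 200; i++ {
		if l.tryAcquire() {
			accepted++
		}
	}
	fmt.Println(accepted) // prints 128
}
```

Rejecting work above the cap bounds memory use no matter what caused the backlog, which is the safety-net role the commit message describes.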
Siva Prasad f4a1c381a5 Vendoring update for go-discover. (#4412)
* New Providers added and updated vendoring for go-discover

* Vendor.json formatted using make vendorfmt

* Docs/Agent/auto-join: Added documentation for the new providers introduced in this PR

* Updated the golang.org/x/sys/unix in the vendor directory

* Agent: TestGoDiscoverRegistration updated to reflect the addition of new providers

* Deleted terraform.tfstate from vendor.

* Deleted terraform.tfstate.backup

Deleted terraform state file artifacts from unknown runs.

* Updated x/sys/windows vendor for Windows binary compilation
2018-07-25 16:21:04 -07:00
mkeeler 6813a99081 Merge remote-tracking branch 'connect/f-connect' 2018-06-25 19:42:51 +00:00
Matt Keeler 98e98fa815 Remove build tags from vendored vault file to allow for this to merge properly into enterprise 2018-06-25 12:26:10 -07:00
Matt Keeler 01f82717b4 Vendor the vault api 2018-06-25 12:26:10 -07:00
Leo Zhang 7f6d727aa5
Fix invalid vendor.json syntax for go-discover 2018-06-15 02:02:12 -07:00
Matt Keeler 2786ec979e Update yamux vendoring
Pulls in logging fixes.
2018-06-04 16:02:50 -04:00
Kyle Havlovitz bd42da760b
vendor: pull in latest version of go-discover 2018-05-10 15:40:16 -07:00
Preetha Appan fff532cf84
Update serf to pick up clean leave fix 2018-05-04 15:51:55 -05:00
Preetha Appan 7091595595
Update yamux to pick up performance improvements 2018-03-26 08:56:40 -05:00
Preetha dfd484c090
Fix panic in azure go discover provider (#3876) 2018-02-08 16:46:33 -06:00
James Phillips fb31d0ec6b
Updates hashicorp/go-discover to pull in support for Azure Virtual Machine Scale Sets. 2018-01-19 16:24:08 -08:00
James Phillips 5800474f02
Updates Serf to pick up fix for spammy zero RTT log messages.
Fixes #3789.
2018-01-19 14:47:12 -08:00
James Phillips aaf43f999b
Updates go-discover to get monkey patch for golang.org/x/net/trace. 2018-01-03 13:22:42 -08:00
James Phillips 88d475595a
Updates hashicorp/go-cleanhttp to pick up new sanitizer. 2017-12-20 13:48:49 -08:00
James Phillips bcc9aea18f
Updates Serf to pull in new queue depth controls. 2017-12-06 17:06:08 -08:00
James Phillips 9f2989424e
Updated memberlist to fix negative RTT measurements.
Fixes #3704
2017-11-21 01:37:49 -08:00
James Phillips e738bd584c
Updates memberlist to pick up https://github.com/hashicorp/memberlist/pull/69.
Fixes #3671
2017-11-10 09:31:02 -08:00
Frank Schröder 3673aca010 vendor: update github.com/hashicorp/go-sockaddr (#3633)
Pull in changes for

 * hashicorp/go-sockaddr#12
 * hashicorp/go-sockaddr#13
 * hashicorp/go-sockaddr#14
 * hashicorp/go-sockaddr#16
2017-10-31 17:05:57 -07:00
Frank Schröder a052255f86 vendor: update go-discover (#3634)
* vendor: update go-discover

Pull in providers:

 * Aliyun (Alibaba Cloud)
 * Digital Ocean
 * OpenStack (os)
 * Scaleway

* doc: use ... instead of xxx

* doc: strip trailing whitespace

* doc: add docs for aliyun, digitalocean, os and scaleway

* agent: fix test
2017-10-31 17:03:54 -05:00
Preetha Appan 1af51560d0 Update serf library to pick up coordinate persistence fix 2017-10-21 21:19:43 -05:00
Preetha Appan f94ba25b9d Rebase master serf 2017-10-20 10:33:59 -05:00
Preetha Appan 9449a60fae Vendor update serf to pick up fix for out of range ping periods in coordinate subsystem 2017-10-20 10:14:15 -05:00
Matt McQuillan aa7f712b09 Updating go-checkpoint lib to have a fixed timeout (#3564)
* Updating go-checkpoint lib to have a fixed timeout

* formatting vendor/vendor.json file per project spec
2017-10-17 17:01:23 -07:00
Preetha Appan 4fef528054 Update go-retryablehttp 2017-10-06 13:42:11 -05:00
Frank Schroeder c58c310419 Update github.com/hashicorp/go-discover to pull in new config parser
This patch updates the go-discover library to use the new config parser
which uses a different encoding scheme for the go-discover config DSL.
Values are no longer URL encoded but taken literally unless they contain
spaces, backslashes or double quotes. To support keys or values with
these special characters the string needs to be double quoted and usual
escaping rules apply.

Fixes #3417
2017-10-04 19:12:28 +02:00
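
The quoting rule from the commit above can be illustrated with a short sketch. `quoteValue` is a hypothetical helper, not part of go-discover, and reading "usual escaping rules" as a backslash placed before each backslash and double quote is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteValue is a hypothetical helper, not a go-discover function.
// Values without spaces, backslashes or double quotes are returned
// literally; anything else is wrapped in double quotes.
func quoteValue(v string) string {
	if !strings.ContainsAny(v, " \\\"") {
		return v
	}
	// Assumed escaping: backslash before each backslash and double quote.
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(v) + `"`
}

func main() {
	fmt.Println("tag_value=" + quoteValue("consul-server")) // taken literally
	fmt.Println("tag_value=" + quoteValue("consul server")) // needs double quotes
}
```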
Preetha Appan f7d009b177 Updates vendor directory for raft to record raft v1.0.0. No functional changes 2017-10-03 17:19:10 -05:00
Frank Schroeder 99246d38a7
Revert monkey patch since it is not clear whether this is an issue at all. 2017-09-26 13:42:32 +02:00
Frank Schroeder 2567a94a81
serf: monkey patch https://github.com/hashicorp/serf/pull/486 2017-09-26 13:40:26 +02:00
Frank Schroeder 3011c828c9
Replace monkey patch with code from https://github.com/hashicorp/hcl/pull/213 2017-09-26 12:42:03 +02:00
Frank Schröder 12216583a1 New config parser, HCL support, multiple bind addrs (#3480)
* new config parser for agent

This patch implements a new config parser for the consul agent which
makes the following changes to the previous implementation:

 * add HCL support
 * all configuration fragments in tests and for default config are
   expressed as HCL fragments
 * HCL fragments can be provided on the command line so that they
   can eventually replace the command line flags.
 * HCL/JSON fragments are parsed into a temporary Config structure
   which can be merged using reflection (all values are pointers).
   The existing merge logic of overwrite for values and append
   for slices has been preserved.
 * A single builder process generates a typed runtime configuration
   for the agent.
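
The merge rule in the list above (overwrite for values, append for slices) can be sketched with the `reflect` package. This is a simplified illustration under stated assumptions, not the agent's actual code: the `fragment` type, its fields and the `merge` function are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// fragment is a hypothetical stand-in for the temporary Config structure.
// Scalar settings are pointers so that "not set" (nil) is distinguishable.
type fragment struct {
	Datacenter *string
	Server     *bool
	RetryJoin  []string
}

// merge folds fragments left to right: a non-nil pointer overwrites the
// earlier value, and a slice is appended to the earlier slice.
func merge(frags ...fragment) fragment {
	var out fragment
	dst := reflect.ValueOf(&out).Elem()
	for _, f := range frags {
		src := reflect.ValueOf(f)
		for i := 0; i < src.NumField(); i++ {
			sf, df := src.Field(i), dst.Field(i)
			switch sf.Kind() {
			case reflect.Ptr:
				if !sf.IsNil() {
					df.Set(sf)
				}
			case reflect.Slice:
				df.Set(reflect.AppendSlice(df, sf))
			}
		}
	}
	return out
}

func main() {
	dc1, dc2 := "dc1", "dc2"
	a := fragment{Datacenter: &dc1, RetryJoin: []string{"10.0.0.1"}}
	b := fragment{Datacenter: &dc2, RetryJoin: []string{"10.0.0.2"}}
	m := merge(a, b)
	fmt.Println(*m.Datacenter, m.RetryJoin) // prints "dc2 [10.0.0.1 10.0.0.2]"
}
```

Because every scalar is a pointer, a later fragment that leaves a field unset does not clobber an earlier value, which is why the list notes that all values are pointers.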

The new implementation is more strict and fails in the builder process
if no valid runtime configuration can be generated. Therefore,
additional validations in other parts of the code should be removed.

The builder also pre-computes all required network addresses so that no
address/port magic should be required where the configuration is used
and should therefore be removed.

* Upgrade github.com/hashicorp/hcl to support int64

* improve error messages

* fix directory permission test

* Fix rtt test

* Fix ForceLeave test

* Skip performance test for now until we know what to do

* Update github.com/hashicorp/memberlist to update log prefix

* Make memberlist use the default logger

* improve config error handling

* do not fail on non-existing data-dir

* experiment with non-uniform timeouts to get a handle on stalled leader elections

* Run tests for packages separately to eliminate the spurious port conflicts

* refactor private address detection and unify approach for ipv4 and ipv6.

Fixes #2825

* do not allow unix sockets for DNS

* improve bind and advertise addr error handling

* go through builder using test coverage

* minimal update to the docs

* more coverage tests fixed

* more tests

* fix makefile

* cleanup

* fix port conflicts with external port server 'porter'

* stop test server on error

* do not run api test that change global ENV concurrently with the other tests

* Run remaining api tests concurrently

* no need for retry with the port number service

* monkey patch race condition in go-sockaddr until we understand why that fails

* monkey patch hcl decoder race condition until we understand why that fails

* monkey patch spurious errors in strings.EqualFold from here

* add test for hcl decoder race condition. Run with go test -parallel 128

* Increase timeout again

* cleanup

* don't log port allocations by default

* use base command arg parsing to format help output properly

* handle -dc deprecation case in Build

* switch autopilot.max_trailing_logs to int

* remove duplicate test case

* remove unused methods

* remove comments about flag/config value inconsistencies

* switch got and want around since the error message was misleading.

* Removes a stray debug log.

* Removes a stray newline in imports.

* Fixes TestACL_Version8.

* Runs go fmt.

* Adds a default case for unknown address types.

* Reorders and reformats some imports.

* Adds some comments and fixes typos.

* Reorders imports.

* add unix socket support for dns later

* drop all deprecated flags and arguments

* fix wrong field name

* remove stray node-id file

* drop unnecessary patch section in test

* drop duplicate test

* add test for LeaveOnTerm and SkipLeaveOnInt in client mode

* drop "bla" and add clarifying comment for the test

* split up tests to support enterprise/non-enterprise tests

* drop raft multiplier and derive values during build phase

* sanitize runtime config reflectively and add test

* detect invalid config fields

* fix tests with invalid config fields

* use different values for wan sanitization test

* drop recursor in favor of recursors

* allow dns_config.udp_answer_limit to be zero

* make sure tests run on machines with multiple ips

* Fix failing tests in a few more places by providing a bind address in the test

* Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder.

* Add porter to server_test.go to make tests there less flaky

* go fmt
2017-09-25 11:40:42 -07:00
Frank Schroeder 9362cbcbc2
Add support to discover public v4 and v6 addresses on AWS (#3471)
Update github.com/hashicorp/go-discover/provider/aws to support the
'addr_type' option which allows detection of private_v4, public_v4 and
public_v6 addresses.

Fixes #3471
2017-09-25 03:16:27 +02:00
Preetha Appan 276f26ea70 Updating vendor directory for raft address provider interface changes 2017-08-30 09:57:48 -05:00
Preetha Appan 30fd0d25a5 Update raft library for windows snapshot fsync fixes. This fixes #3409 2017-08-24 16:44:05 -05:00
Frank Schroeder ad82659eed vendor: upgrade github.com/hashicorp/go-discover
Pull in improved debug logging for AWS
2017-08-23 21:23:34 +02:00
Preetha Appan c9d5e17410 Update serf to pick up fixes for fsyncing snapshots and panic when coordinates are disabled 2017-08-17 16:35:06 -05:00
Preetha Appan 0e73777ce2 Update memberlist for a deadlock fix 2017-08-15 18:07:28 -05:00
James Phillips 1eea530ce6
Propagates a better error message from memberlist.
Fixes #3312.
2017-08-07 16:35:57 -07:00
Preetha Appan 454b3a2a61 Pick up raft library change that fsyncs snapshot files correctly 2017-08-04 10:36:41 -05:00
Frank Schroeder 6346ac34cf
vendor: update hashicorp/go-discover to pull in hashicorp/go-discover#7 2017-08-03 21:00:37 +02:00
Frank Schroeder e7285af6cc vendor: add go-discover 2017-08-01 11:41:43 +02:00
Preetha Appan b841c99b87 Govendor update go-memdb and go-immutable-radix to pick up changes for DeletePrefix 2017-07-25 17:28:43 -05:00
James Phillips 31a7701891 Updates memberlist to pick up Lifeguard research findings. (#3287)
See https://www.hashicorp.com/blog/making-gossip-more-robust-with-lifeguard/.
2017-07-17 12:54:17 -07:00