207 Commits (4ca65733847abe538cd210b5d24e7471138202d9)

Author SHA1 Message Date
Dhia Ayachi 7f6c52a9ee
bump raft version to v1.3.11 (#14897)
* bump raft version to v1.3.11

* Add change log

* fix go.sum
2022-10-12 08:51:52 -04:00
John Murret 79a541fd7d
Upgrade serf to v0.10.1 and memberlist to v0.5.0 to get memberlist size metrics and broadcast queue depth metric (#14873)
* updating to serf v0.10.1 and memberlist v0.5.0 to get memberlist size metrics and memberlist broadcast queue depth metric

* update changelog

* update changelog

* correcting changelog

* adding "QueueCheckInterval" for memberlist to test

* updating integration test containers to grab latest api
2022-10-04 17:51:37 -06:00
Nick Ethier 1c1b0994b8
add HCP integration component (#14723)
* add HCP integration

* lint: use non-deprecated logging interface
2022-09-26 14:58:15 -04:00
freddygv f30bc96239 Test fixes
- Pulls in CLI test fix from main
- Updates psutils to fix TestAgent_Host on M1 Mac
2022-09-16 17:57:10 -06:00
cskh e84e4b8868
Fix: upgrade pkg imdario/mergo to prevent merge config panic (#14237)
* upgrade imdario/mergo to prevent merge config panic

* test: service definition takes precedence over service-defaults in merged results
2022-08-17 21:14:04 -04:00
cskh 81931e52c3
feat(telemetry): add labels to serf and memberlist metrics (#14161)
* feat(telemetry): add labels to serf and memberlist metrics
* changelog
* doc update

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-08-11 22:09:56 -04:00
Nitya Dhanushkodi f47319b7c6
update generate token endpoint to take external addresses (#13844)
Update generate token endpoint (rpc, http, and api module)

If ServerExternalAddresses are set, they override any addresses obtained from the "consul" service; they are used in the token instead and dialed by the dialer. This allows, for example, setting up a load balancer in front of the Consul servers.
2022-07-21 14:56:11 -07:00
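The override rule described in the generate token commit above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: `tokenAddresses` and its parameter names are hypothetical, and the addresses are made up.

```go
package main

import "fmt"

// tokenAddresses picks the server addresses to place in a peering token.
// externalAddrs stands in for the ServerExternalAddresses setting and
// discovered stands in for addresses looked up from the "consul" service.
// Both names are hypothetical.
func tokenAddresses(externalAddrs, discovered []string) []string {
	if len(externalAddrs) > 0 {
		// Explicit external addresses win over discovered ones.
		return externalAddrs
	}
	return discovered
}

func main() {
	discovered := []string{"10.0.0.1:8502", "10.0.0.2:8502"}
	fmt.Println(tokenAddresses(nil, discovered))
	fmt.Println(tokenAddresses([]string{"lb.example.com:8502"}, discovered))
}
```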
Paul Glass 77afe0e76e
Extract AWS auth implementation out of Consul (#13760) 2022-07-19 16:26:44 -05:00
Daniel Kimsey feead0b11b Update go-grpc/grpc to resolve connection memory leak
Reported in #12288

The initial test reported was ported and accurately reproduced the issue.
However, since it is a test of an upstream library's internal behavior it won't
be codified in our test suite. Refer to the ticket/PR for details on how to
demonstrate the behavior.
2022-06-08 11:29:29 +01:00
Chris S. Kim f0a9b30174
Update repo to use go:embed (#10996)
Replace bindata packages with stdlib go:embed.
Modernize some uiserver code with newer interfaces introduced in go 1.16 (mainly working with fs.File instead of http.File).
Remove steps that are no longer used from our build files.
Add Github Action to detect differences in agent/uiserver/dist and verify that the files are correct (by compiling UI assets and comparing contents).
2022-05-31 15:33:56 -04:00
Kyle Havlovitz 4bc6c23357 Add connection limit setting to service defaults 2022-05-24 10:13:38 -07:00
R.B. Boyer 2e72f44fda
peering: accept replication stream of discovery chain information at the importing side (#13151) 2022-05-19 16:37:52 -05:00
Dhia Ayachi 78412ae069
upgrade serf to v0.9.8 (#13062)
* upgrade serf to v0.9.8

* add changelog

* Update .changelog/13062.txt

Co-authored-by: Dan Upton <daniel@floppy.co>
2022-05-16 14:13:23 -04:00
Dhia Ayachi b895dd7d2d
change mod go version to 1.18 (#12976)
* change mod go version to 1.18

* fix go.mod format for 1.18
2022-05-09 13:29:43 -04:00
Dan Upton 7a6f86c1d4
Upgrade Raft to v1.3.9 for saturation metrics (#12865) 2022-04-27 17:17:31 +01:00
Dhia Ayachi b83a790927
update raft to v1.3.8 (#12844)
* update raft to v1.3.7

* add changelog

* fix compilation error

* fix HeartbeatTimeout

* fix ElectionTimeout to reload only if value is valid

* fix default values for `ElectionTimeout` and `HeartbeatTimeout`

* fix test defaults

* bump raft to v1.3.8
2022-04-25 10:19:26 -04:00
R.B. Boyer f507f62f3c
peering: initial sync (#12842)
- Add endpoints related to peering: read, list, generate token, initiate peering
- Update node/service/check table indexing to account for peers
- Foundational changes for pushing service updates to a peer
- Plumb peer name through Health.ServiceNodes path

see: ENT-1765, ENT-1280, ENT-1283, ENT-1283, ENT-1756, ENT-1739, ENT-1750, ENT-1679,
     ENT-1709, ENT-1704, ENT-1690, ENT-1689, ENT-1702, ENT-1701, ENT-1683, ENT-1663,
     ENT-1650, ENT-1678, ENT-1628, ENT-1658, ENT-1640, ENT-1637, ENT-1597, ENT-1634,
     ENT-1613, ENT-1616, ENT-1617, ENT-1591, ENT-1588, ENT-1596, ENT-1572, ENT-1555

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Evan Culver <eculver@hashicorp.com>
Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
2022-04-21 17:34:40 -05:00
DanStough 95250e7915 Update go version to 1.18.1 2022-04-18 11:41:10 -04:00
R.B. Boyer 8beaca4e01
deps: update to latest go-discover (#12739)
Fixes #11253

    $ go mod why -m github.com/dgrijalva/jwt-go
    # github.com/dgrijalva/jwt-go
    (main module does not need module github.com/dgrijalva/jwt-go)

    $ go mod why -m github.com/form3tech-oss/jwt-go
    # github.com/form3tech-oss/jwt-go
    github.com/hashicorp/consul/agent
    github.com/hashicorp/go-discover
    github.com/hashicorp/go-discover/provider/azure
    github.com/Azure/go-autorest/autorest
    github.com/Azure/go-autorest/autorest/adal
    github.com/form3tech-oss/jwt-go
2022-04-12 13:41:12 -05:00
Matt Keeler a553982506
Enable running autopilot state updates on all servers (#12617)
* Fixes a lint warning about t.Errorf not supporting %w

* Enable running autopilot on all servers

Non-leader servers only update the state and do not attempt any modifications.

* Fix the RPC conn limiting tests

Technically they were relying on racy behavior before. Now they should be reliable.
2022-04-07 10:48:48 -04:00
Eric e4b4f175ed Bump go-control-plane
* `go get cloud.google.com/go@v0.59.0`
* `go get github.com/envoyproxy/go-control-plane@v0.9.9`
* `make envoy-library`
* Bump protoc to 3.15.8
2022-03-30 13:11:27 -04:00
Matt Keeler 15ddbbc686
Update raft-boltdb to pull in new writeCapacity metric (#12646) 2022-03-30 11:38:44 -04:00
Eric 776f5843d0 remove gogo from pbservice 2022-03-23 12:18:01 -04:00
FFMMM db27ea3484
[sync oss] add net/rpc interceptor implementation (#12573)
* sync ent changes from 866dcb0667

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

* update oss go.mod

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-17 16:02:26 -07:00
R.B. Boyer 58e053c336
raft: upgrade to v1.3.6 (#12496)
Add additional protections on the Consul side to prevent NonVoters from bootstrapping raft.

This should un-flake TestServer_Expect_NonVoters
2022-03-02 17:00:02 -06:00
R.B. Boyer 7b0548dd8d
server: suppress spurious blocking query returns where multiple config entries are involved (#12362)
Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect:

- `DiscoveryChain.Get`
- `ConfigEntry.ResolveServiceConfig`
- `Intentions.Match`

All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism.

### No Config Entries Found

In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110.

### No Change Since Last Wakeup

Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110.

This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report.

### Extra Endpoints

Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated:

- `ConfigEntry.List`
- `Internal.IntentionUpstreams` (tproxy)
2022-02-25 15:46:34 -06:00
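The "No Change Since Last Wakeup" idea from the commit above can be sketched roughly as follows. This is a minimal illustration, not Consul's implementation: `versions`, `errNotChanged`, `readIfChanged`, and `blockingQuery` are hypothetical names, and a channel of wakeups stands in for the state store's watch mechanism.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotChanged is a hypothetical sentinel meaning "woke up, but nothing
// this query cares about changed".
var errNotChanged = errors.New("no change since last wakeup")

// versions maps a config entry name to its modify index.
type versions map[string]uint64

func sameVersions(a, b versions) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if bv, ok := b[k]; !ok || bv != v {
			return false
		}
	}
	return true
}

// readIfChanged re-reads the entries and returns the sentinel when they
// match what the previous loop iteration saw.
func readIfChanged(prev versions, read func() versions) (versions, error) {
	cur := read()
	if sameVersions(prev, cur) {
		return nil, errNotChanged
	}
	return cur, nil
}

// blockingQuery replies only when a wakeup carries genuinely new data.
// A closed wakeups channel is treated as the query timing out.
func blockingQuery(wakeups <-chan struct{}, prev versions, read func() versions) (versions, bool) {
	for range wakeups {
		cur, err := readIfChanged(prev, read)
		if errors.Is(err, errNotChanged) {
			continue // spurious wakeup: go back to sleep
		}
		return cur, true
	}
	return prev, false
}

func main() {
	prev := versions{"service-defaults/web": 5}
	states := []versions{
		{"service-defaults/web": 5}, // unrelated write elsewhere
		{"service-defaults/web": 5}, // unrelated write elsewhere
		{"service-defaults/web": 9}, // relevant change
	}
	wakeups := make(chan struct{}, len(states))
	for range states {
		wakeups <- struct{}{}
	}
	close(wakeups)

	i := 0
	read := func() versions { v := states[i]; i++; return v }
	got, changed := blockingQuery(wakeups, prev, read)
	fmt.Println(got, changed, "reads:", i)
}
```

In this sketch the first two wakeups are absorbed inside the loop and only the third produces a reply, which mirrors the commit's goal that only goroutines with new data terminate and respond.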
Dhia Ayachi cd9d8d44a5
file watcher to be used for configuration auto-reload feature (#12301)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fails on linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* fix flaky test

* Apply suggestions from code review

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>

* fix a possible stack overflow

* do not reset timer on errors, rename OS specific files

* start the watcher when creating it

* fix data race in tests

* rename New func

* do not call handler when a remove event happen

* events trigger on write and rename

* fix watcher tests

* make handler async

* remove recursive call

* do not produce events for sub directories

* trim "/" at the end of a directory when adding

* add missing test

* fix logging

* add todo

* fix failing test

* fix flaking tests

* fix flaky test

* add logs

* fix log text

* increase timeout

* reconcile when remove

* check reconcile when removed

* fix reconcile move test

* fix logging

* delete invalid file

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix review comments

* fix is watched to properly catch a remove

* change test timeout

* fix test and rename id

* fix test to create files with different mod time.

* fix deadlock when stopping watcher

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix a deadlock when calling stop while emitting event is blocked

* make sure to close the event channel after the event loop is done

* add go doc

* back date file instead of sleeping

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* check error

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-02-21 11:36:52 -05:00
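Several bullets in the file watcher commit above deal with detecting that a watched file was replaced rather than modified in place. The commit mentions using the inode on Linux and the creation time on Windows; the sketch below swaps in the standard library's `os.SameFile`, which compares device and inode on Unix-like systems, to show the same idea. `replaced` and `demo` are hypothetical names, and the example assumes a Unix-like filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// replaced reports whether path now refers to a different file than the
// one described by old.
func replaced(old os.FileInfo, path string) (bool, error) {
	cur, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	return !os.SameFile(old, cur), nil
}

// demo rewrites a file in place, then swaps it via rename, and reports
// what replaced observed after each step.
func demo() (afterWrite, afterRename bool, err error) {
	dir, err := os.MkdirTemp("", "watch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "config.hcl")
	if err = os.WriteFile(path, []byte("a = 1"), 0o600); err != nil {
		return
	}
	old, err := os.Stat(path)
	if err != nil {
		return
	}

	// An in-place write keeps the same file identity.
	if err = os.WriteFile(path, []byte("a = 2"), 0o600); err != nil {
		return
	}
	if afterWrite, err = replaced(old, path); err != nil {
		return
	}

	// Writing a new file and renaming it over the path changes identity.
	tmp := path + ".tmp"
	if err = os.WriteFile(tmp, []byte("a = 3"), 0o600); err != nil {
		return
	}
	if err = os.Rename(tmp, path); err != nil {
		return
	}
	afterRename, err = replaced(old, path)
	return
}

func main() {
	fmt.Println(demo())
}
```

The rename case is used deliberately: deleting and recreating a file can reuse the same inode on some filesystems, which matches the commit's note that the remove/creation case was problematic on Linux.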
R.B. Boyer 80dfcb1bcd
raft: update to v1.3.5 (#12325)
This includes closing some leadership transfer gaps and adding snapshot
restore progress logging.
2022-02-14 13:48:52 -06:00
FFMMM 78264a8030
Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311)
This commit syncs ENT changes to the OSS repo.

Original commit details in ENT:

```
commit 569d25f7f4578981c3801e6e067295668210f748
Author: FFMMM <FFMMM@users.noreply.github.com>
Date:   Thu Feb 10 10:23:33 2022 -0800

    Vendor fork net rpc (#1538)

    * replace net/rpc w consul-net-rpc/net/rpc

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * replace msgpackrpc and go-msgpack with fork from mono repo

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * gofmt all files touched

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
```

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-02-14 09:45:45 -08:00
Dhia Ayachi 25c36af222
update serf to v0.9.7 (#12057)
* update serf to v0.9.7

* add change log

* update changelog
2022-01-18 13:03:22 -05:00
dependabot[bot] f3ac9dafa6
build(deps): bump github.com/ryanuber/columnize (#12062)
Bumps [github.com/ryanuber/columnize](https://github.com/ryanuber/columnize) from 2.1.0+incompatible to 2.1.2+incompatible.
- [Release notes](https://github.com/ryanuber/columnize/releases)
- [Commits](https://github.com/ryanuber/columnize/compare/v2.1.0...v2.1.2)

---
updated-dependencies:
- dependency-name: github.com/ryanuber/columnize
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 12:39:13 -05:00
dependabot[bot] 2c8de28005
build(deps): bump github.com/kr/text from 0.1.0 to 0.2.0 (#12063)
Bumps [github.com/kr/text](https://github.com/kr/text) from 0.1.0 to 0.2.0.
- [Release notes](https://github.com/kr/text/releases)
- [Commits](https://github.com/kr/text/compare/v0.1.0...v0.2.0)

---
updated-dependencies:
- dependency-name: github.com/kr/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 12:35:00 -05:00
dependabot[bot] 4947d6d29f
build(deps): bump github.com/mitchellh/pointerstructure (#12072)
Bumps [github.com/mitchellh/pointerstructure](https://github.com/mitchellh/pointerstructure) from 1.0.0 to 1.2.1.
- [Release notes](https://github.com/mitchellh/pointerstructure/releases)
- [Commits](https://github.com/mitchellh/pointerstructure/compare/v1.0.0...v1.2.1)

---
updated-dependencies:
- dependency-name: github.com/mitchellh/pointerstructure
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 11:56:26 -05:00
dependabot[bot] 97348e25a7
Bump github.com/aws/aws-sdk-go from 1.25.41 to 1.42.34 (#12083)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.25.41 to 1.42.34.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.25.41...v1.42.34)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 11:54:41 -05:00
dependabot[bot] 151e516a8c
Bump github.com/hashicorp/go-memdb from 1.3.1 to 1.3.2 (#11066)
Bumps [github.com/hashicorp/go-memdb](https://github.com/hashicorp/go-memdb) from 1.3.1 to 1.3.2.
- [Release notes](https://github.com/hashicorp/go-memdb/releases)
- [Changelog](https://github.com/hashicorp/go-memdb/blob/master/changes.go)
- [Commits](https://github.com/hashicorp/go-memdb/compare/v1.3.1...v1.3.2)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-memdb
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-13 10:54:36 -08:00
dependabot[bot] b9ed140b4b
Bump github.com/hashicorp/go-raftchunking from 0.6.1 to 0.6.2 (#11065)
Bumps [github.com/hashicorp/go-raftchunking](https://github.com/hashicorp/go-raftchunking) from 0.6.1 to 0.6.2.
- [Release notes](https://github.com/hashicorp/go-raftchunking/releases)
- [Commits](https://github.com/hashicorp/go-raftchunking/compare/v0.6.1...v0.6.2)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-raftchunking
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-13 10:28:49 -08:00
dependabot[bot] 8878f0ea21
build(deps): bump github.com/hashicorp/go-multierror from 1.1.0 to 1.1.1
Bumps [github.com/hashicorp/go-multierror](https://github.com/hashicorp/go-multierror) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/hashicorp/go-multierror/releases)
- [Commits](https://github.com/hashicorp/go-multierror/compare/v1.1.0...v1.1.1)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-multierror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-13 17:41:15 +00:00
Chris S. Kim ea0df29afb
Update memberlist to 0.3.1 (#12042) 2022-01-12 12:00:18 -05:00
Dhia Ayachi d0274d38a9
upgrade raft to v1.3.3 (#11958)
* upgrade raft to v1.3.3

* add change log

* reword the changelog

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
2022-01-06 14:09:13 -05:00
kisunji 59ddc44cc7 Update golang.org/x/net to address CVE-2021-44716 2021-12-15 11:54:47 -05:00
Daniel Nephin dccd3f5806 Merge remote-tracking branch 'origin/main' into serve-panic-recovery 2021-12-07 16:30:41 -05:00
Matt Keeler 42a5635bc3 Use raft-boltdb/v2 2021-12-02 16:56:15 -05:00
Mike Morris 25826e3ee4
deps: update gopsutil to fix Windows ARM and macOS non-Apple LLVM builds (#11586)
Bumps transitive dep go-ole to v1.2.6 with fixes
2021-11-16 15:40:11 -05:00
R.B. Boyer eb21649f82
partitions: various refactors to support partitioning the serf LAN pool (#11568) 2021-11-15 09:51:14 -06:00
Giulio Micheloni af7b7b5693
Merge branch 'main' into serve-panic-recovery 2021-11-06 16:12:06 +01:00
FFMMM 0954d261ae
use *telemetry.MetricsPrefix as prometheus.PrometheusOpts.Name (#11290)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2021-10-21 13:33:01 -07:00
Dhia Ayachi ab31c50915
update raft to v1.3.2 (#11375)
* update raft to v1.3.2

* add changelog

* fix changelog
2021-10-21 13:21:22 -04:00
Giulio Micheloni fecce25658 Separate test file and no stack trace in ret error 2021-10-16 18:02:03 +01:00
Giulio Micheloni 0c78ddacde Merge branch 'main' of https://github.com/hashicorp/consul into hashicorp-main 2021-10-16 16:59:32 +01:00
Jeff Widman 2dc62aa0c4
Bump `go-discover` to fix broken dep tree (#10898) 2021-09-16 15:31:22 -04:00
freddygv a78390a30b Update yamux 2021-08-25 19:46:12 -06:00
Giulio Micheloni cbf437efdb Fix go.sum with go mod tidy 2021-08-22 19:50:10 +01:00
Giulio Micheloni 655da1fc42
Merge branch 'main' into serve-panic-recovery 2021-08-22 20:31:11 +02:00
Daniel Nephin 31bcd80528 debug: improve a couple of the test cases
Use gotest.tools/v3/fs to make better assertions about the files

Remove the TestAgent from TestDebugCommand_Prepare_ValidateTiming, since we can test that validation
without making any API calls.
2021-08-18 12:29:34 -04:00
Roopak Venkatakrishnan e43cf46267 Update x/sys to support go 1.17 2021-08-18 03:00:22 +00:00
Mike Morris 3bae53a989
deps: upgrade gogo-protobuf to v1.3.2 (#10813)
* deps: upgrade gogo-protobuf to v1.3.2

* go mod tidy using go 1.16

* proto: regen protobufs after upgrading gogo/protobuf

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-08-12 14:05:46 -04:00
Giulio Micheloni 2b14a9b59a grpc Server: turn panic into error through middleware 2021-08-07 13:21:12 +01:00
Daniel Nephin 5f753dedab Update armon/go-metrics
To pickup new InMemSink.Stream method
2021-07-26 15:58:17 -04:00
Daniel Nephin 414ce3f09b Update serf
To pick up data race fixes
2021-07-14 18:58:16 -04:00
Dhia Ayachi c72ee2063e
upgrade golang crypto from 0.0.0-20200930160638-afb6bcd081ae => v0.0.0-20210513164829-c07d793c2f9a (#10390) 2021-06-14 12:38:42 -04:00
Dhia Ayachi 005ad9e46d
generate a single debug file for a long duration capture (#10279)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even when debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fails the debug tests

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture appends to the file if it already exists

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* extract wait group outside the go routine to avoid a race condition

* capture pprof in a separate go routine

* perform a single capture for pprof data for the whole duration

* add missing vendor dependency

* add a change log and fix documentation to reflect the change

* create function for timestamp dir creation and simplify error handling

* use error groups and ticker to simplify interval capture loop

* Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval

* refactor Logs capture routine and add log capture specific test

* improve error reporting when log test fail

* change test duration to 1s

* make time parsing in log line more robust

* refactor log time format in a const

* test on log line empty the earliest possible and return

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* rename function to captureShortLived

* more specific changelog

Co-authored-by: Paul Banks <banks@banksco.de>

* update documentation to reflect current implementation

* add test for behavior when invalid param is passed to the command

* fix argument line in test

* a more detailed description of the new behaviour

Co-authored-by: Paul Banks <banks@banksco.de>

* print success right after the capture is done

* remove an unnecessary error check

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 13:00:51 -04:00
Matt Keeler 3a0007b158
Bump raft-autopilot version to the latest. (#10306) 2021-05-27 12:59:14 -04:00
Daniel Nephin 9a7fb48dcb Update a couple dependencies
To pickup bug fixes
2021-05-04 14:09:10 -04:00
Paul Banks 3ad754ca7b
Make Raft trailing logs and snapshot timing reloadable (#10129)
* WIP reloadable raft config

* Pre-define new raft gauges

* Update go-metrics to change gauge reset behaviour

* Update raft to pull in new metric and reloadable config

* Add snapshot persistence timing and installSnapshot to our 'protected' list as they can be infrequent but are important

* Update telemetry docs

* Update config and telemetry docs

* Add note to oldestLogAge on when it is visible

* Add changelog entry

* Update website/content/docs/agent/options.mdx

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2021-05-04 15:36:53 +01:00
Daniel Nephin 567e588d93 vendor: commit changes from update-vendor
I guess a couple PRs crossed streams and somehow that resulted in this line not being
needed anymore in go.sum
2021-04-16 14:31:56 -04:00
R.B. Boyer 2c3d7da5dd
mod: bump to github.com/hashicorp/mdns v1.0.4 (#10018) 2021-04-14 14:17:52 -05:00
Daniel Nephin 46279547ec Update memberlist to v0.2.3
To pickup data race fixes
2021-03-24 18:20:19 -04:00
Daniel Nephin cae4b2c0eb Update go-memdb
To use a version that will not panic when an iterator is used with modifications.
2021-01-28 17:19:55 -05:00
Daniel Nephin 727a402810
Merge pull request #9302 from hashicorp/dnephin/add-service-3
agent: remove ServiceManager.Start goroutine
2021-01-28 16:59:41 -05:00
Matt Keeler f561462064
Upgrade raft-autopilot and wait for autopilot to stop when revoking leadership (#9644)
Fixes: 9626
2021-01-27 11:14:52 -05:00
Daniel Nephin 28736e60fd lib/mutex: add mutex with TryLock and update vendor 2021-01-25 18:01:47 -05:00
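A mutex with `TryLock`, as named in the commit above, can be built on a buffered channel. The sketch below shows one common way to do it and is not necessarily how `lib/mutex` is implemented; the `Mutex` type and `NewMutex` constructor here are hypothetical. (Since Go 1.18 the standard `sync.Mutex` also offers a `TryLock` method.)

```go
package main

import "fmt"

// Mutex is a channel-backed lock: the single buffer slot is the lock token.
type Mutex struct {
	ch chan struct{}
}

// NewMutex returns an unlocked Mutex.
func NewMutex() *Mutex {
	return &Mutex{ch: make(chan struct{}, 1)}
}

// Lock blocks until the lock is acquired.
func (m *Mutex) Lock() {
	m.ch <- struct{}{}
}

// TryLock acquires the lock if it is free and reports whether it did so.
// It never blocks.
func (m *Mutex) TryLock() bool {
	select {
	case m.ch <- struct{}{}:
		return true
	default:
		return false
	}
}

// Unlock releases the lock and panics if the mutex is not locked.
func (m *Mutex) Unlock() {
	select {
	case <-m.ch:
	default:
		panic("unlock of unlocked Mutex")
	}
}

func main() {
	m := NewMutex()
	fmt.Println(m.TryLock()) // lock is free
	fmt.Println(m.TryLock()) // already held
	m.Unlock()
	fmt.Println(m.TryLock()) // free again
}
```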
Daniel Nephin aeb9c09e25 Update mapstructure 2021-01-12 12:24:56 -05:00
Pierre Souchay 408b249fc5 [bugfix] Prometheus metrics without warnings
go-metrics is updated to 0.3.6 to properly handle help in prometheus metrics

This fixes https://github.com/hashicorp/consul/issues/9303 and
https://github.com/hashicorp/consul/issues/9471
2021-01-06 13:54:05 +01:00
Matt Keeler c048e86bb2
Switch to using the external autopilot module 2020-11-09 09:22:11 -05:00
Mike Morris 75019baadd
chore: upgrade to gopsutil/v3 (#9118)
* deps: update golang.org/x/sys

* deps: update imports to gopsutil/v3

* chore: make update-vendor
2020-11-06 20:48:38 -05:00
Kit Patella 9636906e53 rollback golang.org/x/sys version to fix distro-build 2020-11-05 12:09:07 -08:00
Kit Patella b203b8874b apply make update-vendor 2020-11-05 11:51:58 -08:00
Kit Patella 9d9c4a646b upgrade go-metrics to latest 2020-11-04 14:02:13 -08:00
Kyle Havlovitz 4abe96aa74 vendor: Update github.com/hashicorp/yamux 2020-10-09 05:05:46 -07:00
Kyle Havlovitz dd6ed08924 vendor: Update github.com/hashicorp/mdns 2020-10-09 04:43:27 -07:00
Kyle Havlovitz 1cc012b202 vendor: Update github.com/hashicorp/hil 2020-10-09 04:43:27 -07:00
Kyle Havlovitz b95ab0d33c vendor: Update github.com/hashicorp/go-version 2020-10-09 04:43:27 -07:00
Kyle Havlovitz f389f1184d vendor: Update github.com/hashicorp/go-memdb 2020-10-09 04:43:27 -07:00
Kyle Havlovitz 40481e2b8f vendor: Update github.com/hashicorp/go-checkpoint 2020-10-09 04:43:27 -07:00
Mike Morris 708957a982
chore: update raft to v1.2.0 (#8822) 2020-10-08 15:07:10 -04:00
Matt Keeler 38f5ddce2a
Add per-agent reconnect timeouts (#8781)
This allows client agents to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If an agent advertises a per-agent reconnect timeout using the advertise_reconnect_timeout configuration, then when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it.

This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.
2020-10-08 15:02:19 -04:00
Mike Morris 1ebc2fb006
chore(deps): update gopsutil to v2.20.9 (#8843)
* chore(deps): bump golang.org/x/sys

To resolve /go/pkg/mod/github.com/shirou/gopsutil@v2.20.9+incompatible/host/host_bsd.go:20:13: undefined: unix.SysctlTimeval

* chore(deps): make update-vendor
2020-10-07 12:57:18 -04:00
Daniel Nephin 627449a870 Vendor gofuzz and google/go-cmp 2020-09-28 18:28:37 -04:00
Kyle Havlovitz b1b21139ca Merge branch 'master' into vault-ca-renew-token 2020-09-15 14:39:04 -07:00
Kyle Havlovitz 1cd7c43544 Update vault CA for latest api client 2020-09-15 13:33:55 -07:00
Kyle Havlovitz 74dc50a771 vendor: Update vault api package 2020-09-15 12:45:29 -07:00
Daniel Nephin 0c87cf468c Update go-metrics dependencies, to use metrics.Default() 2020-09-14 19:05:22 -04:00
Hans Hasselberg a932aafc91
add primary keys to list keyring (#8522)
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the JSON response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully set up as the primary key.

Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:

```json
[
  {
    "WAN": true,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "NumNodes": 6
  },
  {
    "WAN": false,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  },
  {
    "WAN": false,
    "Datacenter": "dc1",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  }
]
```

I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output

Fixes #3393.
2020-08-18 09:50:24 +02:00
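Given the response shape shown in the keyring commit above, an operator-side check for "has every node switched to the new primary key?" could be sketched like this. The struct mirrors only the fields visible in the sample JSON; `keyringEntry` and `rotationComplete` are hypothetical names, and the keys in `main` are shortened placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyringEntry mirrors one element of the sample keyring response.
type keyringEntry struct {
	WAN         bool
	Datacenter  string
	Segment     string
	Keys        map[string]int
	PrimaryKeys map[string]int
	NumNodes    int
}

// rotationComplete reports whether, in every pool, want is the only
// primary key and is in use by all nodes of that pool.
func rotationComplete(raw []byte, want string) (bool, error) {
	var entries []keyringEntry
	if err := json.Unmarshal(raw, &entries); err != nil {
		return false, err
	}
	for _, e := range entries {
		n, ok := e.PrimaryKeys[want]
		if !ok || len(e.PrimaryKeys) != 1 || n != e.NumNodes {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	raw := []byte(`[
	  {"WAN": true,  "Datacenter": "dc2", "Keys": {"old": 6, "new": 6},
	   "PrimaryKeys": {"new": 6}, "NumNodes": 6},
	  {"WAN": false, "Datacenter": "dc1", "Keys": {"old": 3, "new": 8},
	   "PrimaryKeys": {"new": 8}, "NumNodes": 8}
	]`)
	fmt.Println(rotationComplete(raw, "new"))
}
```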
s-christoff 102b7e55da
Update Go-Metrics 0.3.4 (#8478) 2020-08-11 11:17:43 -05:00
Kyle Havlovitz f4efd53d57 vendor: Update github.com/armon/go-metrics to v0.3.3 2020-07-23 11:37:33 -07:00
Matt Keeler a6a1a0e3d6
Update mapstructure to v1.3.3 (#8361)
This was done in preparation for another PR where I was running into https://github.com/mitchellh/mapstructure/issues/202 and implemented a fix for the library.
2020-07-22 15:13:21 -04:00
R.B. Boyer e853368c23
gossip: Avoid issue where two unique leave events for the same node could lead to infinite rebroadcast storms (#8343)
bump serf to v0.9.3 to include fix for https://github.com/hashicorp/serf/pull/606
2020-07-21 15:48:10 -05:00
Pierre Souchay 20d1ea7d2d
Upgrade go-connlimit to v0.3.0 / return http 429 on too many connections (#8221)
Fixes #7527

I want to highlight this and explain what I think the implications are and make sure we are aware:

* `HTTPConnStateFunc` closes the connection when it is beyond the limit. `Close` does not block.
* `HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond)` blocks until the following is done (worst case):
  1) `conn.SetDeadline(10*time.Millisecond)` so that
  2) `conn.Write(429error)` is guaranteed to time out after 10ms, so that the http 429 can be written and
  3) `conn.Close` can happen

The implication of this change is that accepting any new connection is delayed by 10ms in the worst case, but only after a client has already reached the limit.
2020-07-03 09:25:07 +02:00
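The three steps listed in the go-connlimit commit above (set a deadline, write the 429, close) can be sketched with the standard library alone. This is not the go-connlimit implementation: `reject`, `tooMany`, and `demo` are hypothetical names, and `net.Pipe` stands in for a real TCP connection. Note that `net.Conn.SetDeadline` takes an absolute `time.Time`, so the timeout is added to the current time.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// tooMany is a minimal HTTP 429 response.
const tooMany = "HTTP/1.1 429 Too Many Requests\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"

// reject writes a 429 response, bounded by timeout, and then closes the
// connection. The write can block for at most timeout.
func reject(conn net.Conn, timeout time.Duration) error {
	defer conn.Close()
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	_, err := io.WriteString(conn, tooMany)
	return err
}

// demo rejects one in-memory connection and returns what the client read.
func demo() (string, error) {
	server, client := net.Pipe()
	defer client.Close()

	done := make(chan error, 1)
	go func() { done <- reject(server, time.Second) }()

	data, err := io.ReadAll(client) // returns once the server side closes
	if err != nil {
		return "", err
	}
	return string(data), <-done
}

func main() {
	resp, err := demo()
	fmt.Printf("%q %v\n", resp, err)
}
```

The demo uses a one-second bound to keep the example robust on slow machines; the commit discusses a 10ms bound, which caps how long accepting the next connection can be delayed.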
Hans Hasselberg 95c027a3ea
Update gopsutil (#8208)
https://github.com/shirou/gopsutil/pull/895 is merged and fixes our
problem. Time to update. Since there is no new version just yet,
updating to the sha.
2020-07-01 14:47:56 +02:00
Matt Keeler e9835610f3
Add a test for go routine leaks
This is in its own separate package so that it will be a separate test binary that runs thus isolating the go runtime from other tests and allowing accurate go routine leak checking.

This test would ideally use goleak.VerifyTestMain but that will fail 100% of the time due to some architectural things (blocking queries and net/rpc uncancellability).

This test is not comprehensive. We should enable/exercise more features and more cluster configurations. However, it's a start.
2020-06-24 17:09:50 -04:00