Commit Graph

8870 Commits (ca35d04472a79cfbde76fa0ecf27caa3ef387a33)

Author SHA1 Message Date
Siva Prasad ca35d04472
Adds a new command line flag -log-file for file based logging. (#4581)
* Added log-file flag to capture Consul logs in a user specified file

* Refactored code.

* Refactored code. Added flags to rotate logs based on bytes and duration

* Added the flags for log file and log rotation on the webpage

* Fixed TestSantize from failing due to the addition of 3 flags

* Introduced changes : mutex, data-dir log writes, rotation logic

* Added test for logfile and updated the default log destination for docs

* Log name now uses UnixNano

* TestLogFile is now uses t.Parallel()

* Removed unnecessary int64Val function

* Updated docs to reflect default log name for log-file

* No longer writes to data-dir and adds .log if the filename has no extension
2018-08-29 16:56:58 -04:00
John Cowen 40e71f1b91
UI: Simplify/refactor the actions/notification layer (#4572) + (#4573)
* Move notification texts to a slightly different layer (#4572)
* Further Simplify/refactor the actions/notification layer (#4573)

1. Move the 'with-feedback' actions to a 'with-blocking-action' mixin
which better describes what it does
2. Additional set of unit tests almost over the entire layer to prove
things work/add confidence for further changes

The multiple 'with-action' mixins used for every 'index/edit' combo are
now reduced down to only contain the functionality related to their
specific routes, i.e. where to redirect.

The actual functionality to block and carry out the action and then
notify are 'almost' split out so that their respective classes/objects do
one thing and one thing 'well'.

Mixins are chosen for the moment as the decoration approach used by
mixins feels better than multiple levels of inheritence, but I would
like to take this fuether in the future to a 'compositional' based
approach.

There is still possible further work to be done here, but I'm a lot
happier now this is reduced down into separate parts.
2018-08-29 19:14:31 +01:00
John Cowen d59331f115
Update CHANGELOG.md (#4435) 2018-08-29 12:18:28 +01:00
John Cowen b41cad6fdf
UI: CSS refactor (#4430) + Fullscreen Layout (#4435)
* Begin refactoring CSS into component folders. Moved most
components into layout/skin folders, left out a couple of ones I want
to think about more.
* Adjust grays based on recent Structure changes 
* Switch to fullscreen layout for lists and detail, left aligned forms (#4435)
* Specifically use the 'actions_close' label, not just the :last-child (actions-group)
* Replace some non-var-ed colours in vaults code skin, plus prefixing (black and white)
2018-08-29 12:11:58 +01:00
John Cowen 7bb35c4c78
UI: Repo layer integration tests (#4454) (#4563)
ui: Repo layer integration tests for methods that touch the API

Includes a `repo` test helper to make repetitive tasks easier, plus a
injectable reporter for sending performance metrics to a centralized metrics
system

Also noticed somewhere in the ember models that I'd like to improve, but left
for the moment to make sure I concentrate on one task at a time, more or less:

The tests currently asserts against the existing JSON tree, which doesn't
seem to be a very nice tree.

The work at hand here is to refactor what is there, so test for the not
nice tree to ensure we don't get any regression, and add a skipped test
so I can come back here later
2018-08-29 10:00:15 +01:00
John Cowen 1b3d566a7a
UI: Begin unskipping some more trivial tests (#4574)
WIP Unskip some lower level trivial tests.

This is the beginning of work to unskip some of the more trivial tests that I'd skipped a while back (if the thing they are testing broke, they would have failed higher up in other acceptance tests).

I'd rather keep the tests, as they do test things in a more isolated manner, and the plan was to always come back and work to unskip them time allowing.

I didn't get to far into this work in progress here, but I'd rather merge what I've done all the same and come back at a later date and continue.
2018-08-29 09:59:02 +01:00
Freddy fbe45513d8
Correct rpc telemetry docs (#4602) 2018-08-28 16:38:22 -04:00
Freddy 4c5758d891
Remove operator_area note from godoc overview (#4603) 2018-08-28 16:02:24 -04:00
Freddy d7a404f2ee
Bugfix: Use "%#v" when formatting structs (#4600) 2018-08-28 12:37:34 -04:00
Jack Pearkes 82f9e48a18
github: some minor changes to issue templates (#4521)
Just cleaning these up a tiny bit based on seeing how they're
being used.
2018-08-28 09:07:28 -07:00
Jack Pearkes c1bf14be30
website: use 127.0.0.1 instead of consul.rocks (#4523)
By default, the Consul agent listens on the local interface
at port 8500 for API requests. This change makes the API examples
using `curl` copy-pasteable for this default configuration.
2018-08-28 09:07:15 -07:00
Siva Prasad b1a34f899f
TestAgentAntiEntropy: Wait until Consul service is up on the agent. (#4591)
* Anti-Entropy test wait for Consul service added

* Reverted some tests back to using WaitForLeader
2018-08-28 09:52:11 -04:00
Pierre Souchay 8e7b8bb524 Fixed unit test TestCatalogListServicesCommand (#4592) 2018-08-27 13:53:46 -04:00
Pierre Souchay 5e0218ccf4 Fix unit test TestOperatorAutopilotGetConfigCommand (#4594) 2018-08-27 13:29:25 -04:00
Pierre Souchay aea31d3c5d Fixed unstable test TestUiNodeInfo (#4586) 2018-08-27 11:49:14 -04:00
Pierre Souchay 3101086a27 Fixed unstable test TestProxy_public (#4587) 2018-08-27 11:45:07 -04:00
Pierre Souchay af90c88f6a Fixed unstable test TestRTTCommand_LAN in command/rtt (#4585) 2018-08-27 11:37:13 -04:00
Pierre Souchay 3f9d1370b7 Fix unstable test TestRegisterMonitor_heartbeat (#4568) 2018-08-24 13:33:58 -04:00
Kyle Havlovitz 2eefa74710
Merge pull request #4577 from remijouannet/patch-1
Update monitoring-telegraf.html.md
2018-08-24 09:13:53 -07:00
Rémi Jouannet 2767ae860b
Update monitoring-telegraf.html.md 2018-08-24 16:48:02 +02:00
John Cowen 4ebd70e6cd
UI: Fixes healthy node listing resize on large portrait screens (#4564)
1. Split the resizing functionality of into a separate mixin to be
shared across components
2. Add basic integration tests to prove that everything is getting
called through out the lifetime of the app. I decided against unit
testing as there isn't really any isolated logic to be tested, more
checking that things are being called in the correct order etc i.e. the
integration is correct.

Adds assertion to with-resizing so its obvious to override `resize`
2018-08-24 12:35:52 +02:00
Freddy a00a2dc148
Update CHANGELOG.md (#4562) 2018-08-23 15:20:26 -04:00
Pierre Souchay b898131723 [BUGFIX] Avoid returning empty data on startup of a non-leader server (#4554)
Ensure that DB is properly initialized when performing stale queries

Addresses:
- https://github.com/hashicorp/consul-replicate/issues/82
- https://github.com/hashicorp/consul/issues/3975
- https://github.com/hashicorp/consul-template/issues/1131
2018-08-23 12:06:39 -04:00
Paul Banks 4d658f34cf
Intention ACL API clarification (#4547) 2018-08-20 20:33:15 +01:00
jjshanks 657b8d27ac Update intentions documentation to clarify ACL behavior (#4546)
* Update intentions documentation to clarify ACL behavior

* Incorprate @banks suggestions into docs

* Fix my own typos!
2018-08-20 20:03:53 +01:00
Matt Keeler 542cace9a2
Update CHANGELOG.md 2018-08-17 16:20:34 -04:00
Miroslav Bagljas 3c23979afd Fixes #4483: Add support for Authorization: Bearer token Header (#4502)
Added Authorization Bearer token support as per RFC6750

* appended Authorization header token parsing after X-Consul-Token
* added test cases
* updated website documentation to mention Authorization header

* improve tests, improve Bearer parsing
2018-08-17 16:18:42 -04:00
Matt Keeler 1b1a9c6fc6
Update CHANGELOG.md 2018-08-17 14:45:52 -04:00
Matt Keeler e81c85c051
Fix #4515: Segfault when serf_wan port was -1 but reconnect_time_wan was set (#4531)
Fixes #4515 

This just slightly refactors the logic to only attempt to set the serf wan reconnect timeout when the rest of the serf wan settings are configured - thus avoiding a segfault.
2018-08-17 14:44:25 -04:00
Siva Prasad f8cc241b28
Fixed a make build issue with Windows Binaries. (#4538)
* Fixed an issue where Windows binary had trouble being copied correctly

* Enclosed binname inside angular brackets
2018-08-17 09:31:57 -04:00
Kyle Havlovitz 9257300f7e
Merge pull request #4535 from hashicorp/ca-snapshot-fix
fsm: add missing CA config to snapshot/restore logic
2018-08-16 13:05:47 -07:00
Kyle Havlovitz e5e1f867e5
Merge branch 'master' into ca-snapshot-fix 2018-08-16 13:00:54 -07:00
Kyle Havlovitz f186edc42c
fsm: add connect service config to snapshot/restore test 2018-08-16 12:58:54 -07:00
Matt Keeler efe14620ea
Update CHANGELOG.md 2018-08-16 15:35:02 -04:00
nickmy9729 beddf03b26 Added code to allow snapshot inclusion of NodeMeta (#4527) 2018-08-16 15:33:35 -04:00
Kyle Havlovitz b51d76f469
fsm: add missing CA config to snapshot/restore logic 2018-08-16 11:58:50 -07:00
Kyle Havlovitz fa8990c5a2
Merge pull request #4528 from hashicorp/autopilot-fixes
Fix inconsistency caused by the autopilot StatsFetcher
2018-08-15 11:35:52 -07:00
Siva Prasad 771e0bafd2
Addresses the flakiness of CatalogNodes (#4530) 2018-08-15 11:16:05 -04:00
sandstrom 14f19f75a6 Clarify port usage for agents (#4510) 2018-08-14 16:10:01 -07:00
Kyle Havlovitz 4b35d877ca
autopilot: don't follow the normal server removal rules for nonvoters 2018-08-14 14:24:51 -07:00
Kyle Havlovitz ea14482376
Fix stats fetcher healthcheck RPCs not being independent 2018-08-14 14:23:52 -07:00
Shubheksha fc3997f266 replace old fork of text package (#4501) 2018-08-14 12:23:18 -07:00
Pierre Souchay 0d6de257a2 Display more information about check being not properly added when it fails (#4405)
* Display more information about check being not properly added when it fails

It follows an incident where we add lots of error messages:

  [WARN] consul.fsm: EnsureRegistration failed: failed inserting check: Missing service registration

That seems related to Consul failing to restart on respective agents.

Having Node information as well as service information would help diagnose the issue.

* Renamed ensureCheckIfNodeMatches() as requested by @banks
2018-08-14 17:45:33 +01:00
Freddy 6d43d24edb
Improve reliability of tests with TestAgent (#4525)
- Add WaitForTestAgent to tests flaky due to missing serfHealth registration

- Fix bug in retries calling Fatalf with *testing.T

- Convert TestLockCommand_ChildExitCode to table driven test
2018-08-14 12:08:33 -04:00
Paul Banks e34acd275f
Update intentions.html.md 2018-08-14 15:09:45 +01:00
Freddy e305443db4
Address flakiness in command/exec tests (#4517)
* Add fn to wait for TestAgent node and check registration

* Add waits for TestAgent and retries before timeouts in exec_test
2018-08-10 15:04:07 -04:00
Geoffrey Grosenbach a03512496f
Consul Production Deployment Guide
Renames guide to "Production Deployment"
Adds link in sidebar menu.
Implements edits suggested by Consul engineering team.
2018-08-10 11:51:05 -07:00
Matt Keeler daacbaa520
Update CHANGELOG.md 2018-08-10 11:33:22 -04:00
Pierre Souchay ef3b81ab13 Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415)
* Allow to rename nodes with IDs, will fix #3974 and #4413

This change allow to rename any well behaving recent agent with an
ID to be renamed safely, ie: without taking the name of another one
with case insensitive comparison.

Deprecated behaviour warning
----------------------------

Due to asceding compatibility, it is still possible however to
"take" the name of another name by not providing any ID.

Note that when not providing any ID, it is possible to have 2 nodes
having similar names with case differences, ie: myNode and mynode
which might lead to DB corruption on Consul server side and
lead to server not properly restarting.

See #3983 and #4399 for Context about this change.

Disabling registration of nodes without IDs as specified in #4414
should probably be the way to go eventually.

* Removed the case-insensitive search when adding a node within the else
block since it breaks the test TestAgentAntiEntropy_Services

While the else case is probably legit, it will be fixed with #4414 in
a later release.

* Added again the test in the else to avoid duplicated names, but
enforce this test only for nodes having IDs.

Thus most tests without any ID will work, and allows us fixing

* Added more tests regarding request with/without IDs.

`TestStateStore_EnsureNode` now test registration and renaming with IDs

`TestStateStore_EnsureNodeDeprecated` tests registration without IDs
and tests removing an ID from a node as well as updated a node
without its ID (deprecated behaviour kept for backwards compatibility)

* Do not allow renaming in case of conflict, including when other node has no ID

* Fixed function GetNodeID that was not working due to wrong type when searching node from its ID

Thus, all tests about renaming were not working properly.

Added the full test cas that allowed me to detect it.

* Better error messages, more tests when nodeID is not a valid UUID in GetNodeID()

* Added separate TestStateStore_GetNodeID to test GetNodeID.

More complete test coverage for GetNodeID

* Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn`

Also fixed comments to be clearer after remarks from @banks

* Fixed error message in unit test to match test case

* Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler
2018-08-10 11:30:45 -04:00
Paul Banks 9ce10769ce Update Serf and memberlist (#4511)
This includes fixes that improve gossip scalability on very large (> 10k node) clusters.

The Serf changes:
 - take snapshot disk IO out of the critical path for handling messages hashicorp/serf#524
 - make snapshot compaction much less aggressive - the old fixed threshold caused snapshots to be constantly compacted (synchronously with request handling) on clusters larger than about 2000 nodes! hashicorp/serf#525

Memberlist changes:
 - prioritize handling alive messages over suspect/dead to improve stability, and handle queue in LIFO order to avoid acting on info that 's already stale in the queue by the time we handle it. hashicorp/memberlist#159
 - limit the number of concurrent pushPull requests being handled at once to 128. In one test scenario with 10s of thousands of servers we saw channel and lock blocking cause over 3000 pushPulls at once which ballooned the memory of the server because each push pull contained a de-serialised list of all known 10k+ nodes and their tags for a total of about 60 million objects and 7GB of memory stuck. While the rest of the fixes here should prevent the same root cause from blocking in the same way, this prevents any other bug or source of contention from allowing pushPull messages to stack up and eat resources. hashicorp/memberlist#158
2018-08-09 13:16:13 -04:00