Commit Graph

2231 Commits (2d30d864ceed5a4feb14416ce79e9b7b8f3d68c5)

Author SHA1 Message Date
Daniel Nephin 4fa0fdc0e0 stream: Move EventPublisher to stream package
The EventPublisher is the central hub of the PubSub system. It is toughly coupled with much of
stream. Some stream internals were exported exclusively for EventPublisher.

The two Subscribe cases (with or without index) were also awkwardly split between two packages. By
moving EventPublisher into stream they are now both in the same package (although still in different files).
2020-07-14 15:57:47 -04:00
Daniel Nephin 489876c86b state: Make handleACLUpdate async once again
So that we keep as much as possible out of the FSM commit hot path.
2020-07-14 15:57:47 -04:00
Daniel Nephin 5f9db94956 state: Use interface for Txn
Also store the index in Changes instead of the Txn.

This change is in preparation for movinng EventPublisher to the stream package, and
making handleACLUpdates async once again.
2020-07-14 15:57:46 -04:00
Daniel Nephin a709ed1ab5 stream.Subscription unexport fields and additiona docstrings 2020-07-14 15:57:46 -04:00
Daniel Nephin 17b833b4c9 Add a context for stopping EventPublisher goroutine 2020-07-14 15:57:46 -04:00
Daniel Nephin 2c8342f115 EventPublisher: Make Unsubscribe a function on Subscription
It is critical that Unsubscribe be called with the same pointer to a
SubscriptionRequest that was used to create the Subscription. The
docstring made that clear, but it sill allowed a caler to get it wrong by
creating a new SubscriptionRequest.

By hiding this detail from the caller, and only exposing an Unsubscribe
method, it should be impossible to fail to Unsubscribe.

Also update some godoc strings.
2020-07-14 15:57:46 -04:00
Daniel Nephin 1622bb3a45 EventPublisher: handleACL changes synchronously
Use a separate lock for subscriptions.ByToken to allow it to happen synchronously
in the commit flow.
This removes the need to create a new txn for the goroutine, and removes
the need for EventPublisher to contain a reference to DB.
2020-07-14 15:57:46 -04:00
Daniel Nephin effab15131 stream.EventSnapshot: reduce the fields on the struct
Many of the fields are only needed in one place, and by using a closure
they can be removed from the struct. This reduces the scope of the variables
making it esier to see how they are used.
2020-07-14 15:57:45 -04:00
Daniel Nephin a5cf933fe8 stream.EventBuffer: Seed the fuzz test with time.Now()
Otherwise the test will run with exactly the same values each time.

By printing the seed we can attempt to reproduce the test by adding an env var to override the seed
2020-07-14 15:57:45 -04:00
Daniel Nephin bbe7272d8e state: memdb_wrapper.go -> memdb.go
Renaming in a separate commit so that git can merge changes to the file.
2020-07-14 15:57:45 -04:00
Daniel Nephin 2a8a8f7b8d state: publish changes from Commit
Make topicRegistry use functions instead of unbound methods
Use a regular memDB in EventPublisher to remove a reference cycle
Removes the need for EventPublisher to use a store
2020-07-14 15:57:45 -04:00
Daniel Nephin f5ecd5de5f EventPublisher: docstrings and getTopicBuffer
also rename commitCh -> publishCh
2020-07-14 15:57:45 -04:00
Daniel Nephin 555cfe52d9 ProcessChanges: use stream.Event
Also remove secretHash, which was used to hash tokens. We don't expose
these tokens anywhere, so we can use the string itself instead of a
Hash.

Fix acl_events_test.go for storing a structs type.
2020-07-14 15:57:45 -04:00
Daniel Nephin 4e0bc8013b stream: Use local types for Event Topic SubscriptionRequest 2020-07-14 15:57:45 -04:00
Daniel Nephin aacd514dca Rename stream_publisher.go -> event_publisher.go 2020-07-14 15:57:44 -04:00
Daniel Nephin c0b0109e80 Add streaming package with Subscription and Snapshot components.
The remaining files from 7965767de0bd62ab07669b85d6879bd5f815d157

Co-authored-by: Paul Banks <banks@banksco.de>
2020-07-14 15:57:44 -04:00
Matt Keeler 8beca1cb8d
Add ability for notifications when one of the agent tokens is updated (#8301)
Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
2020-07-14 09:53:55 -04:00
Hans Hasselberg 496fb5fc5b
add support for envoy 1.14.4, 1.13.4, 1.12.6 (#8216) 2020-07-13 15:44:44 -05:00
Chris Piraino 4d857d117f
Set enterprise metadata after resolving the token (#8302)
The token can encode enterprise metadata information, and we must make
sure we set that on the reply so that we can correct filter ACLs.
2020-07-13 13:39:57 -05:00
Daniel Nephin 7b3f26e61d watch: Allow args from different types
Fixes a bug where specifying a slice of args with a single item was being converted to
a string when config was loaded, causing an error.
2020-07-10 17:18:32 -04:00
Freddy e72af87918
Add api mod support for /catalog/gateway-services (#8278) 2020-07-10 13:01:45 -06:00
Daniel Nephin 653c938edc watch: extract makeWatchPlan to facilitate testing
There is a bug in here now that slices in opaque config are unsliced. But to test that bug fix we
need a function that can be easily tested.
2020-07-10 13:33:45 -04:00
R.B. Boyer 1eef096dfe
xds: version sniff envoy and switch regular expressions from 'regex' to 'safe_regex' on newer envoy versions (#8222)
- cut down on extra node metadata transmission
- split the golden file generation to compare all envoy version
2020-07-09 17:04:51 -05:00
Daniel Nephin f22f3d300d
Merge pull request #8231 from hashicorp/dnephin/unembed-HTTPServer-Server
agent/http: un-embed the http.Server
2020-07-09 17:42:33 -04:00
Daniel Nephin df4088291c agent/http: Update TestSetupHTTPServer_HTTP2
To remove the need to store the http.Server. This will allow us to
remove the http.Server field from the HTTPServer struct.
2020-07-09 16:42:19 -04:00
Daniel Nephin d98a4c1317
Merge pull request #8237 from hashicorp/dnephin/remove-acls-enabled-from-delegate
Remove ACLsEnabled from delegate interface
2020-07-09 16:35:43 -04:00
Matt Keeler 4fb535ba48
Pass the Config and TLS Configurator into the AutoConfig constructor
This is instead of having the AutoConfigBackend interface provide functions for retrieving them.

NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.
2020-07-08 12:36:11 -04:00
Matt Keeler f2f32735ce
Rename (*Server).forward to (*Server).ForwardRPC
Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.
2020-07-08 11:05:44 -04:00
Matt Keeler d2e4869c7c
Refactor AutoConfig RPC to not have a direct dependency on the Server type
Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.
2020-07-08 11:05:44 -04:00
Chris Piraino 735337b170
Append port number to ingress host domain (#8190)
A port can be sent in the Host header as defined in the HTTP RFC, so we
take any hosts that we want to match traffic to and also add another
host with the listener port added.

Also fix an issue with envoy integration tests not running the
case-ingress-gateway-tls test.
2020-07-07 10:43:04 -05:00
Daniel Nephin 5247ef4c70 Remove ACLsEnabled from delegate interface
In all cases (oss/ent, client/server) this method was returning a value from config. Since the
value is consistent, it doesn't need to be part of the delegate interface.
2020-07-03 17:00:20 -04:00
Daniel Nephin a7f69b615a
Merge pull request #8215 from hashicorp/dnephin/support-not-modified-response-server
agent/consul: Add support for NotModified to two endpoints
2020-07-03 16:15:31 -04:00
Pierre Souchay 20d1ea7d2d
Upgrade go-connlimit to v0.3.0 / return http 429 on too many connections (#8221)
Fixes #7527

I want to highlight this and explain what I think the implications are and make sure we are aware:

* `HTTPConnStateFunc` closes the connection when it is beyond the limit. `Close` does not block.
* `HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond)` blocks until the following is done (worst case):
  1) `conn.SetDeadline(10*time.Millisecond)` so that
  2) `conn.Write(429error)` is guaranteed to timeout after 10ms, so that the http 429 can be written and 
  3) `conn.Close` can happen

The implication of this change is that accepting any new connection is worst case delayed by 10ms. But only after a client reached the limit already.
2020-07-03 09:25:07 +02:00
Daniel Nephin a5e45defb1 agent/http: un-embed the HTTPServer
The embedded HTTPServer struct is not used by the large HTTPServer
struct. It is used by tests and the agent. This change is a small first
step in the process of removing that field.

The eventual goal is to reduce the scope of HTTPServer making it easier
to test, and split into separate packages.
2020-07-02 17:21:12 -04:00
Daniel Nephin 5d36f98710 agent/consul: Add support for NotModified to two endpoints
A query made with AllowNotModifiedResponse and a MinIndex, where the
result has the same Index as MinIndex, will return an empty response
with QueryMeta.NotModified set to true.

Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>
2020-07-02 17:05:46 -04:00
Matt Keeler f8e8f48125
Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various 2020-07-02 09:49:49 -04:00
Yury Evtikhov 10361dd210 DNS: add IsErrQueryNotFound function for easier error evaluation 2020-07-01 03:41:44 +01:00
Yury Evtikhov 8d18422f19 DNS: fix agent returning SERVFAIL where NXDOMAIN should be returned 2020-07-01 01:51:21 +01:00
Yury Evtikhov 3b4ddaaab5 DNS: add test to verify NXDOMAIN is returned when a non-existent domain is queried over RPC 2020-07-01 01:51:16 +01:00
Matt Keeler 6e7acfa618
Add an AutoEncrypt “integration” test
Also fix a bug where Consul could segfault if TLS was enabled but no client certificate was provided. How no one has reported this as a problem I am not sure.
2020-06-30 15:23:29 -04:00
Matt Keeler 2ddcba00c6
Overwrite agent leaf cert trust domain on the servers 2020-06-30 09:59:08 -04:00
Matt Keeler 19040f1166
Store the Connect CA rate limiter on the server
This fixes a bug where auto_encrypt was operating without utilizing a common rate limiter.
2020-06-30 09:59:07 -04:00
Matt Keeler a5a9560bbd
Initialize the agent leaf cert cache result with a state to prevent unnecessary second certificate signing 2020-06-30 09:59:07 -04:00
Matt Keeler 39b567a55a
Fix auto_encrypt IP/DNS SANs
The initial auto encrypt CSR wasn’t containing the user supplied IP and DNS SANs. This fixes that. Also We were configuring a default :: IP SAN. This should be ::1 instead and was fixed.
2020-06-30 09:59:07 -04:00
Matt Keeler 85fd8c552f
Merge pull request #8193 from hashicorp/feature/auto-config/suppress-config-warnings 2020-06-27 10:06:52 -04:00
R.B. Boyer 462f0f37ed
connect: various changes to make namespaces for intentions work more like for other subsystems (#8194)
Highlights:

- add new endpoint to query for intentions by exact match

- using this endpoint from the CLI instead of the dump+filter approach

- enforcing that OSS can only read/write intentions with a SourceNS or
  DestinationNS field of "default".

- preexisting OSS intentions with now-invalid namespace fields will
  delete those intentions on initial election or for wildcard namespaces
  an attempt will be made to downgrade them to "default" unless one
  exists.

- also allow the '-namespace' CLI arg on all of the intention subcommands

- update lots of docs
2020-06-26 16:59:15 -05:00
Matt Keeler be576c9737
Use the DNS and IP SANs from the auto config stanza when set 2020-06-26 16:01:30 -04:00
Matt Keeler e8b39dd255
Overhaul the auto-config translation
This fixes some issues around spurious warnings about using enterprise configuration in OSS.
2020-06-26 15:25:21 -04:00
Freddy 10d6e9c458
Split up unused key validation for oss/ent (#8189)
Split up unused key validation in config entry decode for oss/ent.

This is needed so that we can return an informative error in OSS if namespaces are provided.
2020-06-25 13:58:29 -06:00
Daniel Nephin a891ee8428
Merge pull request #8176 from hashicorp/dnephin/add-linter-unparam-1
lint: add unparam linter and fix some of the issues
2020-06-25 15:34:48 -04:00
Matt Keeler 7041f69892
Merge pull request #8184 from hashicorp/bugfix/goroutine-leaks 2020-06-25 09:22:19 -04:00
Chris Piraino df48db0abd
Merge pull request #7932 from hashicorp/ingress/internal-ui-endpoint-multiple-ports
Update gateway-services-nodes API endpoint to allow multiple addresses
2020-06-24 17:11:01 -05:00
Chris Piraino f213d3592a remove obsolete comments about test parallelization 2020-06-24 16:36:13 -05:00
Chris Piraino b3db907bdf Update gateway-services-nodes API endpoint to allow multiple addresses
Previously, we were only returning a single ListenerPort for a single
service. However, we actually allow a single service to be serviced over
multiple ports, as well as allow users to define what hostnames they
expect their services to be contacted over. When no hosts are defined,
we return the default ingress domain for any configured DNS domain.

To show this in the UI, we modify the gateway-services-nodes API to
return a GatewayConfig.Addresses field, which is a list of addresses
over which the specific service can be contacted.
2020-06-24 16:35:23 -05:00
Matt Keeler e9835610f3
Add a test for go routine leaks
This is in its own separate package so that it will be a separate test binary that runs thus isolating the go runtime from other tests and allowing accurate go routine leak checking.

This test would ideally use goleak.VerifyTestMain but that will fail 100% of the time due to some architectural things (blocking queries and net/rpc uncancellability).

This test is not comprehensive. We should enable/exercise more features and more cluster configurations. However its a start.
2020-06-24 17:09:50 -04:00
Matt Keeler 29d0cfdd7d
Fix go routine leak in auto encrypt ca roots tracking 2020-06-24 17:09:50 -04:00
Matt Keeler 25a4f3c83b
Allow cancelling blocking queries in response to shutting down. 2020-06-24 17:09:50 -04:00
Daniel Nephin 0279bf6fe5 Update TestAgent_GetCoordinate
The old test case was a very specific regresion test for a case that is no longer possible.
Replaced with a new test that checks the default coordinate is returned.
2020-06-24 13:00:15 -04:00
Daniel Nephin f65e21e6dc Remove unused return values 2020-06-24 13:00:15 -04:00
Daniel Nephin 010a609912 Fix a bunch of unparam lint issues 2020-06-24 13:00:14 -04:00
Matt Keeler 15e7b3940c
Ensure that retryLoopBackoff can be cancelled
We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways.

Fix go routine leak in the gateway locator
2020-06-24 12:41:08 -04:00
Matt Keeler e2cfa93f02
Don’t leak metrics go routines in tests (#8182) 2020-06-24 10:15:25 -04:00
gitforbit 808f632346
agent-http: cleanup: return nil instead of err (#8043)
Since err is already checked, it should return `nil`
2020-06-24 14:29:21 +02:00
R.B. Boyer c63c994b04
connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165) 2020-06-23 15:19:56 -05:00
freddygv c791fbc79c Update namespaces subject-verb agreement 2020-06-23 10:57:30 -06:00
freddygv 044d027ff8 Remove break 2020-06-22 19:59:04 -06:00
freddygv 70810b0602 Let users know namespaces are ent only in config entry decode 2020-06-22 19:59:04 -06:00
Pierre Souchay 35d852fd9a
Returns DNS Error NSDOMAIN when DC does not exists (#8103)
This will allow to increase cache value when DC is not valid (aka
return SOA to avoid too many consecutive requests) and will
distinguish DC being temporarily not available from DC not existing.

Implements https://github.com/hashicorp/consul/issues/8102
2020-06-22 09:01:48 -04:00
Matt Keeler 4a5b352c18
Require enabling TLS to enable Auto Config (#8159)
On the servers they must have a certificate.

On the clients they just have to set verify_outgoing to true to attempt TLS connections for RPCs.

Eventually we may relax these restrictions but right now all of the settings we push down (acl tokens, acl related settings, certificates, gossip key) are sensitive and shouldn’t be transmitted over an unencrypted connection. Our guides and docs should recoommend verify_server_hostname on the clients as well.

Another reason to do this is weird things happen when making an insecure RPC when TLS is not enabled. Basically it tries TLS anyways. We should probably fix that to make it clearer what is going on.
2020-06-19 16:38:14 -04:00
Freddy 5baa7b1b04
Always return a gateway cluster (#8158) 2020-06-19 13:31:39 -06:00
Matt Keeler d6e05482ab
Allow cancelling startup when performing auto-config (#8157)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2020-06-19 15:16:00 -04:00
Daniel Nephin 4f17350928
Merge pull request #8147 from hashicorp/dnephin/remove-private-ip-2
Remove some dead code from agent/consul/util.go
2020-06-18 15:51:09 -04:00
Matt Keeler b0fcf86140 Change auto config authorizer to allow for future extension
The envisioned changes would allow extra settings to enable dynamically defined auth methods to be used instead of  or in addition to the statically defined one in the configuration.
2020-06-18 15:22:24 -04:00
Daniel Nephin b0ba546a1f Remove bytesToUint64 from agent/consul 2020-06-18 12:45:43 -04:00
Daniel Nephin a00f007c5e Remove unused private IP code from agent/consul 2020-06-18 12:40:38 -04:00
Matt Keeler 3dbbd2d37d
Implement Client Agent Auto Config
There are a couple of things in here.

First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism.

This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.
2020-06-17 16:49:46 -04:00
Matt Keeler 8b7d669a27
Allow the Agent its its child Client/Server to share a connection pool
This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.
2020-06-17 16:19:33 -04:00
Matt Keeler 51c3a605ad
Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc 2020-06-17 16:07:25 -04:00
Chris Piraino 79a862d019
Remove ACLEnforceVersion8 from tests (#8138)
The field had been deprecated for a while and was recently removed,
however a PR which added these tests prior to removal was merged.
2020-06-17 14:58:01 -05:00
Daniel Nephin 692a4a8fc8
Merge pull request #7762 from hashicorp/dnephin/warn-on-unknown-service-file
config: warn if a config file is being skipped because of its file extension
2020-06-17 15:14:40 -04:00
Daniel Nephin be29d6bf75 config: warn when a config file is skipped
All commands which read config (agent, services, and validate) will now
print warnings when one of the config files is skipped because it did
not match an expected format.

Also ensures that config validate prints all warnings.
2020-06-17 13:08:54 -04:00
Daniel Nephin 5afcf5c1bc
Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4
ci: enable SA4006 staticcheck check and add ineffassign
2020-06-17 12:16:02 -04:00
Matt Keeler 9b01f9423c
Implement the insecure version of the Cluster.AutoConfig RPC endpoint
Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.
2020-06-17 11:25:29 -04:00
Pierre Souchay d31691dc87
gossip: Ensure that metadata of Consul Service is updated (#7903)
While upgrading servers to a new version, I saw that metadata of
existing servers are not upgraded, so the version and raft meta
is not up to date in catalog.

The only way to do it was to:
 * update Consul server
 * make it leave the cluster, then metadata is accurate

That's because the optimization to avoid updating catalog does
not take into account metadata, so no update on catalog is performed.
2020-06-17 12:16:13 +02:00
Daniel Nephin d345cd8d30 ci: Add ineffsign linter
And fix an additional ineffective assignment that was not caught by staticcheck
2020-06-16 17:32:50 -04:00
Daniel Nephin a9851e1812
Merge pull request #8070 from hashicorp/dnephin/add-gofmt-simplify
ci: Enable gofmt simplify
2020-06-16 17:18:38 -04:00
Matt Keeler 9f7b22a5eb
Agent Auto Configuration: Configuration Syntax Updates (#8003) 2020-06-16 15:03:22 -04:00
Daniel Nephin 068b43df90 Enable gofmt simplify
Code changes done automatically with 'gofmt -s -w'
2020-06-16 13:21:11 -04:00
Daniel Nephin cb050b280c ci: enable SA4006 staticcheck check
And fix the 'value not used' issues.

Many of these are not bugs, but a few are tests not checking errors, and
one appears to be a missed error in non-test code.
2020-06-16 13:10:11 -04:00
Daniel Nephin f7c84ad802 Rename txnWrapper to txn 2020-06-16 13:06:02 -04:00
Daniel Nephin 32aa3ada35 Rename db 2020-06-16 13:04:31 -04:00
Daniel Nephin deef6fcc32 Handle return value from txn.Commit 2020-06-16 13:04:31 -04:00
Daniel Nephin 59bac0f99d state: Update docstrings for changeTrackerDB and txn
And un-embed memdb.DB to prevent accidental access to underlying
methods.
2020-06-16 13:04:31 -04:00
Paul Banks f6ac08be04 state: track changes so that they may be used to produce change events 2020-06-16 13:04:29 -04:00
Matt Keeler d3881dd754
ACL Node Identities (#7970)
A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy).

Half of this commit is for golden file based tests of the acl token and role cli output. Another big updates was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boiler plate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.
2020-06-16 12:54:27 -04:00
Daniel Nephin 476b57fe22 config: refactor to consolidate all File->Source loading
Previously the logic for reading ConfigFiles and produces Sources was split
between NewBuilder and Build. This commit moves all of the logic into NewBuilder
so that Build() can operate entirely on Sources.

This change is in preparation for logging warnings when files have an
unsupported extension.

It also reduces the scope of BuilderOpts, and gets us very close to removing
Builder.options.
2020-06-16 12:52:23 -04:00
Daniel Nephin 219790ca49 config: Make ConfigFormat not a pointer
The nil value was never used. We can avoid a bunch of complications by
making the field a string value instead of a pointer.

This change is in preparation for fixing a silent config failure.
2020-06-16 12:52:22 -04:00
Daniel Nephin 77101eee82 config: rename Flags to BuilderOpts
Flags is an overloaded term in this context. It generally is used to
refer to command line flags. This struct, however, is a data object
used as input to the construction.

It happens to be partially populated by command line flags, but
otherwise has very little to do with them.

Renaming this struct should make the actual responsibility of this struct
more obvious, and remove the possibility that it is confused with
command line flags.

This change is in preparation for adding additional fields to
BuilderOpts.
2020-06-16 12:51:19 -04:00
Daniel Nephin 85e0338136 config: remove Args field from Flags
This field was populated for one reason, to test that it was empty.
Of all the callers, only a single one used this functionality. The rest
constructed a `Flags{}` struct which did not set Args.

I think this shows that the logic was in the wrong place. Only the agent
command needs to care about validating the args.

This commit removes the field, and moves the logic to the one caller
that cares.

Also fix some comments.
2020-06-16 12:49:53 -04:00
Daniel Nephin 73cd0b6fac agent/service_manager: remove 'updateCh' field from serviceConfigWatch
Passing the channel to the function which uses it significantly
reduces the scope of the variable, and makes its usage more explicit. It
also moves the initialization of the channel closer to where it is used.

Also includes a couple very small cleanups to remove a local var and
read the error from `ctx.Err()` directly instead of creating a channel
to check for an error.
2020-06-16 12:15:57 -04:00