Commit Graph

155 Commits (a9851e181259f3e09240d1afee7a8406755851fd)

Author SHA1 Message Date
Matt Keeler 9f7b22a5eb
Agent Auto Configuration: Configuration Syntax Updates (#8003) 2020-06-16 15:03:22 -04:00
Hans Hasselberg 72f92ae7ca
agent: add option to disable agent cache for HTTP endpoints (#8023)
This allows the operator to disable agent caching for the http endpoint.
It is on by default for backwards compatibility and if disabled will
ignore the url parameter `cached`.
2020-06-08 10:08:12 +02:00
R.B. Boyer ffb9c7d6f7
acl: remove the deprecated `acl_enforce_version_8` option (#7991)
Fixes #7292
2020-05-29 16:16:03 -05:00
Daniel Nephin c88fae0aac ci: Add staticcheck and fix most errors
Three of the checks are temporarily disabled to limit the size of the
diff, and allow us to enable all the other checks in CI.

In a follow up we can fix the issues reported by the other checks one
at a time, and enable them.
2020-05-28 11:59:58 -04:00
Pierre Souchay e9d176db2a
Allow to restrict servers that can join a given Serf Consul cluster. (#7628)
Based on work done in https://github.com/hashicorp/memberlist/pull/196
this allows to restrict the IP ranges that can join a given Serf cluster
and be a member of the cluster.

Restrictions on IPs can be done separatly using 2 new differents flags
and config options to restrict IPs for LAN and WAN Serf.
2020-05-20 11:31:19 +02:00
Matt Keeler cbe3a70f56
Update enterprise configurations to be in OSS
This will emit warnings about the configs not doing anything but still allow them to be parsed.

This also added the warnings for enterprise fields that we already had in OSS but didn’t change their enforcement behavior. For example, attempting to use a network segment will cause a hard error in OSS.
2020-05-04 10:21:05 -04:00
Hans Hasselberg 1194fe441f
auto_encrypt: add validations for auto_encrypt.{tls,allow_tls} (#7704)
Fixes https://github.com/hashicorp/consul/issues/7407.
2020-04-24 15:51:38 +02:00
Kit Patella e2467f4b2c
Merge pull request #7656 from hashicorp/feature/audit/oss-merge
agent: stub out auditing functionality in OSS
2020-04-17 13:33:06 -07:00
Kit Patella 927f584761 agent: stub out auditing functionality in OSS 2020-04-16 15:07:52 -07:00
Kyle Havlovitz e9e8c0e730
Ingress Gateways for TCP services (#7509)
* Implements a simple, tcp ingress gateway workflow

This adds a new type of gateway for allowing Ingress traffic into Connect from external services.

Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
2020-04-16 14:00:48 -07:00
Pierre Souchay be1c5c4b48
config: validate system limits against limits.http_max_conns_per_client (#7434)
I spent some time today on my local Mac to figure out why Consul 1.6.3+
was not accepting limits.http_max_conns_per_client.

This adds an explicit check on number of file descriptors to be sure
it might work (this is no guarantee as if many clients are reaching
the agent, it might consume even more file descriptors)

Anyway, many users are fighting with RLIMIT_NOFILE, having a clear
message would allow them to figure out what to fix.

Example of message (reload or start):

```
2020-03-11T16:38:37.062+0100 [ERROR] agent: Error starting agent: error="system allows a max of 512 file descriptors, but limits.http_max_conns_per_client: 8192 needs at least 8212"
```
2020-04-02 09:22:17 +02:00
Pierre Souchay 54b22c638d
config: allow running `consul agent -dev -ui-dir=some_path` (#7525)
When run in with `-dev` in DevMode, it is not possible to replace
the embeded UI with another one because `-dev` implies `-ui`.

This commit allows this an slightly change the error message
about Consul 0.7.0 which is very old and does not apply to
current version anyway.
2020-03-31 22:36:20 +02:00
Freddy 18d356899c
Enable CLI to register terminating gateways (#7500)
* Enable CLI to register terminating gateways

* Centralize gateway proxy configuration
2020-03-26 10:20:56 -06:00
R.B. Boyer 6adad71125
wan federation via mesh gateways (#6884)
This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch:

There are several distinct chunks of code that are affected:

* new flags and config options for the server

* retry join WAN is slightly different

* retry join code is shared to discover primary mesh gateways from secondary datacenters

* because retry join logic runs in the *agent* and the results of that
  operation for primary mesh gateways are needed in the *server* there are
  some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur
  at multiple layers of abstraction just to pass the data down to the right
  layer.

* new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers

* the function signature for RPC dialing picked up a new required field (the
  node name of the destination)

* several new RPCs for manipulating a FederationState object:
  `FederationState:{Apply,Get,List,ListMeshGateways}`

* 3 read-only internal APIs for debugging use to invoke those RPCs from curl

* raft and fsm changes to persist these FederationStates

* replication for FederationStates as they are canonically stored in the
  Primary and replicated to the Secondaries.

* a special derivative of anti-entropy that runs in secondaries to snapshot
  their local mesh gateway `CheckServiceNodes` and sync them into their upstream
  FederationState in the primary (this works in conjunction with the
  replication to distribute addresses for all mesh gateways in all DCs to all
  other DCs)

* a "gateway locator" convenience object to make use of this data to choose
  the addresses of gateways to use for any given RPC or gossip operation to a
  remote DC. This gets data from the "retry join" logic in the agent and also
  directly calls into the FSM.

* RPC (`:8300`) on the server sniffs the first byte of a new connection to
  determine if it's actually doing native TLS. If so it checks the ALPN header
  for protocol determination (just like how the existing system uses the
  type-byte marker).

* 2 new kinds of protocols are exclusively decoded via this native TLS
  mechanism: one for ferrying "packet" operations (udp-like) from the gossip
  layer and one for "stream" operations (tcp-like). The packet operations
  re-use sockets (using length-prefixing) to cut down on TLS re-negotiation
  overhead.

* the server instances specially wrap the `memberlist.NetTransport` when running
  with gateway federation enabled (in a `wanfed.Transport`). The general gist is
  that if it tries to dial a node in the SAME datacenter (deduced by looking
  at the suffix of the node name) there is no change. If dialing a DIFFERENT
  datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh
  gateways to eventually end up in a server's :8300 port.

* a new flag when launching a mesh gateway via `consul connect envoy` to
  indicate that the servers are to be exposed. This sets a special service
  meta when registering the gateway into the catalog.

* `proxycfg/xds` notice this metadata blob to activate additional watches for
  the FederationState objects as well as the location of all of the consul
  servers in that datacenter.

* `xds:` if the extra metadata is in place additional clusters are defined in a
  DC to bulk sink all traffic to another DC's gateways. For the current
  datacenter we listen on a wildcard name (`server.<dc>.consul`) that load
  balances all servers as well as one mini-cluster per node
  (`<node>.server.<dc>.consul`)

* the `consul tls cert create` command got a new flag (`-node`) to help create
  an additional SAN in certs that can be used with this flavor of federation.
2020-03-09 15:59:02 -05:00
Kim Ngo a8f4123d37
agent/txn_endpoint: configure max txn request length (#7388)
configure max transaction size separately from kv limit
2020-03-05 15:42:37 -06:00
Hans Hasselberg 315d57bfb1
agent: sensible keyring error (#7272)
Fixes #7231. Before an agent would always emit a warning when there is
an encrypt key in the configuration and an existing keyring stored,
which is happening on restart.

Now it only emits that warning when the encrypt key from the
configuration is not part of the keyring.
2020-02-13 20:35:09 +01:00
Akshay Ganeshen 8beb716414
feat: support sending body in HTTP checks (#6602) 2020-02-10 09:27:12 -07:00
Freddy cb77fc6d01
Add managed service provider token (#7218)
Stubs for enterprise-only ACL token to be used by managed service providers.
2020-02-04 13:58:56 -07:00
Hans Hasselberg 5531678e9e
Security fixes (#7182)
* Mitigate HTTP/RPC Services Allow Unbounded Resource Usage

Fixes #7159.

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2020-01-31 11:19:37 -05:00
Chris Piraino 401221de58
Allow users to configure either unstructured or JSON logging (#7130)
* hclog Allow users to choose between unstructured and JSON logging
2020-01-28 17:50:41 -06:00
R.B. Boyer 0f44bcd3d8
agent: default the primary_datacenter to the datacenter if not configured (#7111)
Something similar already happens inside of the server
(agent/consul/server.go) but by doing it in the general config parsing
for the agent we can have agent-level code rely on the PrimaryDatacenter
field, too.
2020-01-23 09:59:31 -06:00
Hans Hasselberg 804eb17094
connect: check if intermediate cert needs to be renewed. (#6835)
Currently when using the built-in CA provider for Connect, root certificates are valid for 10 years, however secondary DCs get intermediates that are valid for only 1 year. There is no mechanism currently short of rotating the root in the primary that will cause the secondary DCs to renew their intermediates.
This PR adds a check that renews the cert if it is half way through its validity period.

In order to be able to test these changes, a new configuration option was added: IntermediateCertTTL which is set extremely low in the tests.
2020-01-17 23:27:13 +01:00
Hans Hasselberg 87f32c8ba6
auto_encrypt: set dns and ip san for k8s and provide configuration (#6944)
* Add CreateCSRWithSAN
* Use CreateCSRWithSAN in auto_encrypt and cache
* Copy DNSNames and IPAddresses to cert
* Verify auto_encrypt.sign returns cert with SAN
* provide configuration options for auto_encrypt dnssan and ipsan
* rename CreateCSRWithSAN to CreateCSR
2020-01-17 23:25:26 +01:00
Aestek ba8fd8296f Add support for dual stack IPv4/IPv6 network (#6640)
* Use consts for well known tagged adress keys

* Add ipv4 and ipv6 tagged addresses for node lan and wan

* Add ipv4 and ipv6 tagged addresses for service lan and wan

* Use IPv4 and IPv6 address in DNS
2020-01-17 09:54:17 -05:00
Matej Urbas ce023359fe agent: configurable MaxQueryTime and DefaultQueryTime. (#3777) 2020-01-17 14:20:57 +01:00
Matt Keeler 3faee222f2
OSS changes to allow for parsing the enterprise DNS config prop… (#6959) 2019-12-18 10:16:35 -05:00
Matt Keeler 5934f803bf
Sync of OSS changes to support namespaces (#6909) 2019-12-09 21:26:41 -05:00
Hans Hasselberg 9ff69194a2
tls: auto_encrypt and verify_incoming (#6811) (#6899)
* relax requirements for auto_encrypt on server
* better error message when auto_encrypt and verify_incoming on
* docs: explain verify_incoming on Consul clients.
2019-12-06 21:36:13 +01:00
Paul Banks cd1b613352
connect: Add AWS PCA provider (#6795)
* Update AWS SDK to use PCA features.

* Add AWS PCA provider

* Add plumbing for config, config validation tests, add test for inheriting existing CA resources created by user

* Unparallel the tests so we don't exhaust PCA limits

* Merge updates

* More aggressive polling; rate limit pass through on sign; Timeout on Sign and CA create

* Add AWS PCA docs

* Fix Vault doc typo too

* Doc typo

* Apply suggestions from code review

Co-Authored-By: R.B. Boyer <rb@hashicorp.com>
Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Doc fixes; tests for erroring if State is modified via API

* More review cleanup

* Uncomment tests!

* Minor suggested clean ups
2019-11-21 17:40:29 +00:00
Sarah Christoff 5e1c6e907b
Set MinQuorum variable in Autopilot (#6654)
* Add MinQuorum to Autopilot
2019-10-29 09:04:41 -05:00
PHBourquin 039615641e Checks to passing/critical only after reaching a consecutive success/failure threshold (#5739)
A check may be set to become passing/critical only if a specified number of successive
checks return passing/critical in a row. Status will stay identical as before until
the threshold is reached.
This feature is available for HTTP, TCP, gRPC, Docker & Monitor checks.
2019-10-14 21:49:49 +01:00
Sarah Christoff 194f5740ce
ui_content_path config option fix (#6601)
* fix ui-content-path config option
2019-10-09 09:14:48 -05:00
Freddy fdd10dd8b8
Expose HTTP-based paths through Connect proxy (#6446)
Fixes: #5396

This PR adds a proxy configuration stanza called expose. These flags register
listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only
listening on the loopback interface, while still accepting traffic from non
Connect-enabled services.

Under expose there is a boolean checks flag that would automatically expose all
registered HTTP and gRPC check paths.

This stanza also accepts a paths list to expose individual paths. The primary
use case for this functionality would be to expose paths for third parties like
Prometheus or the kubelet.

Listeners for requests to exposed paths are be configured dynamically at run
time. Any time a proxy, or check can be registered, a listener can also be
created.

In this initial implementation requests to these paths are not
authenticated/encrypted.
2019-09-25 20:55:52 -06:00
Hans Hasselberg faa54ab989
auto_encrypt: verify_incoming_rpc is good enough for auto_encrypt.allow_tls (#6376)
Previously `verify_incoming` was required when turning on `auto_encrypt.allow_tls`, but that doesn't work together with HTTPS UI in some scenarios. Adding `verify_incoming_rpc` to the allowed configurations.
2019-08-27 14:36:36 +02:00
Mike Morris 65be58703c
connect: remove managed proxies (#6220)
* connect: remove managed proxies implementation and all supporting config options and structs

* connect: remove deprecated ProxyDestination

* command: remove CONNECT_PROXY_TOKEN env var

* agent: remove entire proxyprocess proxy manager

* test: remove all managed proxy tests

* test: remove irrelevant managed proxy note from TestService_ServerTLSConfig

* test: update ContentHash to reflect managed proxy removal

* test: remove deprecated ProxyDestination test

* telemetry: remove managed proxy note

* http: remove /v1/agent/connect/proxy endpoint

* ci: remove deprecated test exclusion

* website: update managed proxies deprecation page to note removal

* website: remove managed proxy configuration API docs

* website: remove managed proxy note from built-in proxy config

* website: add note on removing proxy subdirectory of data_dir
2019-08-09 15:19:30 -04:00
Paul Banks e87cef2bb8 Revert "connect: support AWS PCA as a CA provider" (#6251)
This reverts commit 3497b7c00d.
2019-07-31 09:08:10 -04:00
Todd Radel 3497b7c00d
connect: support AWS PCA as a CA provider (#6189)
Port AWS PCA provider from consul-ent
2019-07-30 22:57:51 -04:00
Todd Radel 2552f4a11a
connect: Support RSA keys in addition to ECDSA (#6055)
Support RSA keys in addition to ECDSA
2019-07-30 17:47:39 -04:00
R.B. Boyer c6c4a2251a Merge Consul OSS branch master at commit b3541c4f34 2019-07-26 10:34:24 -05:00
Jeff Mitchell 94c73d0c92 Chunking support (#6172)
* Initial chunk support

This uses the go-raft-middleware library to allow for chunked commits to the KV
2019-07-24 17:06:39 -04:00
Matt Keeler 3053342198
Envoy Mesh Gateway integration tests (#6187)
* Allow setting the mesh gateway mode for an upstream in config files

* Add envoy integration test for mesh gateways

This necessitated many supporting changes in most of the other test cases.

Add remote mode mesh gateways integration test
2019-07-24 17:01:42 -04:00
Alvin Huang ef6b80bab2 resolve circleci config conflicts 2019-07-23 20:18:36 -04:00
Paul Banks f38da47c55
Allow raft TrailingLogs to be configured. (#6186)
This fixes pathological cases where the write throughput and snapshot size are both so large that more than 10k log entries are written in the time it takes to restore the snapshot from disk. In this case followers that restart can never catch up with leader replication again and enter a loop of constantly downloading a full snapshot and restoring it only to find that snapshot is already out of date and the leader has truncated its logs so a new snapshot is sent etc.

In general if you need to adjust this, you are probably abusing Consul for purposes outside its design envelope and should reconsider your usage to reduce data size and/or write volume.
2019-07-23 15:19:57 +01:00
hashicorp-ci a4431da1cc Merge Consul OSS branch 'master' at commit ef257b084d 2019-07-20 02:00:29 +00:00
javicrespo b006060d4c log rotation: limit count of rotated log files (#5831) 2019-07-19 15:36:34 -06:00
Matt Keeler 8d953f5840 Implement Mesh Gateways
This includes both ingress and egress functionality.
2019-07-01 16:28:30 -04:00
hashicorp-ci 43bda6fb76 Merge Consul OSS branch 'master' at commit e91f73f592 2019-06-30 02:00:31 +00:00
Hans Hasselberg 33a7df3330
tls: auto_encrypt enables automatic RPC cert provisioning for consul clients (#5597) 2019-06-27 22:22:07 +02:00
Akshay Ganeshen 98a35fbe69 dns: support alt domains for dns resolution (#5940)
this adds an option for an alt domain to be used with dns while migrating to a new consul domain.
2019-06-27 12:00:37 +02:00
hashicorp-ci f4304e2e5b Merge Consul OSS branch 'master' at commit 4eb73973b6 2019-06-27 02:00:41 +00:00