consul

Commit Graph

Author	SHA1	Message	Date
Kyle Havlovitz	955ee64b95	Merge pull request #7373 from hashicorp/acl-segments-fix Add stub methods for ACL/segment bug fix from enterprise	2020-03-09 14:25:49 -07:00
R.B. Boyer	6adad71125	wan federation via mesh gateways (#6884) This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch: There are several distinct chunks of code that are affected: * new flags and config options for the server * retry join WAN is slightly different * retry join code is shared to discover primary mesh gateways from secondary datacenters * because retry join logic runs in the agent and the results of that operation for primary mesh gateways are needed in the server there are some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur at multiple layers of abstraction just to pass the data down to the right layer. * new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers * the function signature for RPC dialing picked up a new required field (the node name of the destination) * several new RPCs for manipulating a FederationState object: `FederationState:{Apply,Get,List,ListMeshGateways}` * 3 read-only internal APIs for debugging use to invoke those RPCs from curl * raft and fsm changes to persist these FederationStates * replication for FederationStates as they are canonically stored in the Primary and replicated to the Secondaries. * a special derivative of anti-entropy that runs in secondaries to snapshot their local mesh gateway `CheckServiceNodes` and sync them into their upstream FederationState in the primary (this works in conjunction with the replication to distribute addresses for all mesh gateways in all DCs to all other DCs) * a "gateway locator" convenience object to make use of this data to choose the addresses of gateways to use for any given RPC or gossip operation to a remote DC. This gets data from the "retry join" logic in the agent and also directly calls into the FSM. * RPC (`:8300`) on the server sniffs the first byte of a new connection to determine if it's actually doing native TLS. If so it checks the ALPN header for protocol determination (just like how the existing system uses the type-byte marker). * 2 new kinds of protocols are exclusively decoded via this native TLS mechanism: one for ferrying "packet" operations (udp-like) from the gossip layer and one for "stream" operations (tcp-like). The packet operations re-use sockets (using length-prefixing) to cut down on TLS re-negotiation overhead. * the server instances specially wrap the `memberlist.NetTransport` when running with gateway federation enabled (in a `wanfed.Transport`). The general gist is that if it tries to dial a node in the SAME datacenter (deduced by looking at the suffix of the node name) there is no change. If dialing a DIFFERENT datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh gateways to eventually end up in a server's :8300 port. * a new flag when launching a mesh gateway via `consul connect envoy` to indicate that the servers are to be exposed. This sets a special service meta when registering the gateway into the catalog. * `proxycfg/xds` notice this metadata blob to activate additional watches for the FederationState objects as well as the location of all of the consul servers in that datacenter. * `xds:` if the extra metadata is in place additional clusters are defined in a DC to bulk sink all traffic to another DC's gateways. For the current datacenter we listen on a wildcard name (`server.<dc>.consul`) that load balances all servers as well as one mini-cluster per node (`<node>.server.<dc>.consul`) * the `consul tls cert create` command got a new flag (`-node`) to help create an additional SAN in certs that can be used with this flavor of federation.	2020-03-09 15:59:02 -05:00
Matt Keeler	e3891db55b	Gather instance counts of aggregated services (#7415)	2020-03-09 11:56:19 -04:00
Pierre Souchay	864f7efffa	agent: configuration reload preserves check's statuses for services (#7345) This fixes issue #7318. Between versions 1.5.2 and 1.5.3, a regression was introduced regarding the health of services. Patch #6144 had been issued for health checks of nodes, but not for health checks of services. What happened during a reload was: 1. save all healthcheck statuses 2. cleanup everything 3. add new services with healthchecks In step 3, the state of healthchecks was taken into account locally, but since we cleaned up at step 2, that state was lost. This PR introduces the snap parameter, so step 3 can use information from step 1.	2020-03-09 12:59:41 +01:00
Hans Hasselberg	c46e2ae59b	docs: add docs for kv_max_value_size (#7405) Apart from the added docs, the error messages are similar now and are pointing to the corresponding options. Fixes #6708.	2020-03-09 11:13:40 +01:00
Kim Ngo	a8f4123d37	agent/txn_endpoint: configure max txn request length (#7388) configure max transaction size separately from kv limit	2020-03-05 15:42:37 -06:00
Matt Keeler	7584dfe8c8	Fix session backwards incompatibility with 1.6.x and earlier.	2020-03-05 15:34:55 -05:00
John Cowen	e83fb1882c	Adds http_config.response_headers to the UI headers plus tests (#7369)	2020-03-03 13:18:35 +00:00
Pierre Souchay	2300e2d4ba	agent: take Prometheus MIME-type header into account (#7371) This avoids having to add format=prometheus to the request and makes it easier to collect metrics using Prometheus. This commit follows https://github.com/hashicorp/consul/pull/6514, which was closed, and extends it by accepting the old Prometheus MIME type.	2020-03-03 14:18:19 +01:00
Kyle Havlovitz	7c57837908	Add stub methods for ACL/segment bug fix from enterprise	2020-03-02 10:30:23 -08:00
Hans Hasselberg	e05ac57e8f	tls: support tls 1.3 (#7325)	2020-02-19 23:22:31 +01:00
Matt Keeler	861f754dad	Properly detect no alt domain set (#7323)	2020-02-19 14:41:43 -05:00
Matt Keeler	4c9577678e	xDS Mesh Gateway Resolver Subset Fixes (#7294) * xDS Mesh Gateway Resolver Subset Fixes The first fix was that clusters were being generated for every service resolver subset regardless of there being any service instances of the associated service in that dc. The previous logic didn’t care at all but now it will omit generating those clusters unless we also have service instances that should be proxied. The second fix was to respect the DefaultSubset of a service resolver so that mesh-gateways would configure the endpoints of the unnamed subset cluster to only those endpoints matched by the default subset's filters. * Refactor the gateway endpoint generation to be a little easier to read	2020-02-19 11:57:55 -05:00
rerorero	2630a949f7	fix: Destroying a session that doesn't exist returns status cod… (#6905) fix #6840	2020-02-18 11:13:15 -05:00
Wim	3a2c865ff6	Fix high cpu usage with IPv6 recursor address. Closes #6120 (#6128)	2020-02-18 11:09:11 -05:00
Chris Piraino	47ff532735	Fixes envoy config when both RetryOn* values are set (#7280)	2020-02-18 09:25:47 -06:00
Lars Lehtonen	6bcd596539	agent/proxycfg: fix dropped error in state.initWatchesMeshGateway() (#7267)	2020-02-18 14:41:01 +01:00
Matt Keeler	b137060630	Allow the PolicyResolve and RoleResolve endpoints to process na… (#7296)	2020-02-13 14:55:27 -05:00
Hans Hasselberg	315d57bfb1	agent: sensible keyring error (#7272) Fixes #7231. Previously, an agent would always emit a warning when there was an encrypt key in the configuration and an existing keyring stored, which happens on restart. Now it only emits that warning when the encrypt key from the configuration is not part of the keyring.	2020-02-13 20:35:09 +01:00
Hans Hasselberg	cb0f94487c	config: increase http_max_conns_per_client default to 200 (#7289)	2020-02-13 16:27:33 +01:00
R.B. Boyer	12876983cf	avoid 'panic: Log in goroutine after TestCacheGet_refreshAge has completed' (#7276)	2020-02-12 10:01:51 -06:00
R.B. Boyer	80b1165976	fix use of hclog logger (#7264)	2020-02-12 09:37:16 -06:00
Matt Keeler	f523469529	Merge branch 'master' of github.com:hashicorp/consul	2020-02-11 11:54:58 -05:00
hashicorp-ci	f0cac9260f	update bindata_assetfs.go	2020-02-11 15:19:16 +00:00
ShimmerGlass	68e0f6bf84	agent: add server raft.{last,applied}_index gauges (#6694) These metrics are useful for: * Tracking the rate of updates to the db * Getting a rough idea of when an index originated	2020-02-11 10:50:18 +01:00
gaoxinge	216eb29d6b	tests: convert windows style path to posix style path to avoid hcl parsing error (#6351)	2020-02-11 10:13:31 +01:00
Matt Keeler	e231d62bc9	Make the config entry and leaf cert cache types ns aware (#7256)	2020-02-10 19:26:01 -05:00
Hans Hasselberg	6739fe6e83	connect: add validations around intermediate cert ttl (#7213)	2020-02-11 00:05:49 +01:00
R.B. Boyer	73ba5d9990	make the TestRPC_RPCMaxConnsPerClient test less flaky (#7255)	2020-02-10 15:13:53 -06:00
Sarah Christoff	6678c8898a	Fix flaky TestAutopilot_BootstrapExpect (#7242)	2020-02-10 14:52:58 -06:00
Kit Patella	55f19a9eb2	rpc: measure blocking queries (#7224) * agent: measure blocking queries * agent.rpc: update docs to mention we only record blocking queries * agent.rpc: make go fmt happy * agent.rpc: fix non-atomic read and decrement with bitwise xor of uint64 0 * agent.rpc: clarify review question * agent.rpc: today I learned that one must declare all variables before interacting with goto labels * Update agent/consul/server.go agent.rpc: more precise comment on `Server.queriesBlocking` Co-Authored-By: Paul Banks <banks@banksco.de> * Update website/source/docs/agent/telemetry.html.md agent.rpc: improve queries_blocking description Co-Authored-By: Paul Banks <banks@banksco.de> * agent.rpc: fix some bugs found in review * add a note about the updated counter behavior to telemetry.md * docs: add upgrade-specific note on consul.rpc.quer{y,ies_blocking} behavior Co-authored-by: Paul Banks <banks@banksco.de>	2020-02-10 10:01:15 -08:00
Akshay Ganeshen	8beb716414	feat: support sending body in HTTP checks (#6602)	2020-02-10 09:27:12 -07:00
Matt Keeler	4f21bbdb4e	OSS Changes for agent local state namespace testing (#7250)	2020-02-10 11:25:12 -05:00
Matt Keeler	d0cd092e3b	Catalog + Namespace OSS changes. (#7219) * Various Prepared Query + Namespace things * Last round of OSS changes for a namespaced catalog	2020-02-10 10:40:44 -05:00
R.B. Boyer	8c596953b0	agent: ensure that we always use the same settings for msgpack (#7245) We set RawToString=true so that []uint8 => string when decoding an interface{}. We set the MapType so that map[interface{}]interface{} decodes to map[string]interface{}. Add tests to ensure that this doesn't break existing usages. Fixes #7223	2020-02-07 15:50:24 -06:00
Freddy	01855d8579	Remove outdated TODO (#7244)	2020-02-07 13:14:48 -07:00
Matt Keeler	444517080b	Fix a bug with ACL enforcement of reads on namespaced config entries. (#7239)	2020-02-07 08:30:40 -05:00
Kit Patella	9a220f3010	agent/consul server: fix LeaderTest_ChangeNodeID (#7236) * fix LeaderTest_ChangeNodeID to use StatusLeft and add waitForAnyLANLeave * unextract the waitFor... fn, simplify, and provide a more descriptive error	2020-02-06 16:37:53 -08:00
Matt Keeler	9e5fd7f925	OSS Changes for various config entry namespacing bugs (#7226)	2020-02-06 10:52:25 -05:00
Hans Hasselberg	6a18f01b42	agent: ensure node info sync and full sync. (#7189) This fixes #7020. There are two problems this PR solves: * if the node info changes it is highly likely to get service and check registration permission errors unless those service tokens have node:write. Hopefully services you register don’t have this permission. * the timer for a full sync gets reset for every partial sync which means that many partial syncs are preventing a full sync from happening Instead of syncing node info last, after services and checks, and possibly saving one RPC because it is included in every service sync, I am syncing node info first. It is only ever going to be a single RPC that we are only doing when node info has changed. This way we are guaranteed to sync node info even when something goes wrong with services or checks which is more likely because there are more syncs happening for them.	2020-02-06 15:30:58 +01:00
R.B. Boyer	0ecb4538c1	agent: differentiate wan vs lan loggers in memberlist and serf (#7205) This should be a helpful change until memberlist and serf can be properly switched to native hclog.	2020-02-05 09:52:43 -06:00
Matt Keeler	dceb107325	Fix disco chain graph validation for namespaces (#7217) Previously this happened to be validating only the chains in the default namespace. Now it will validate all chains in all namespaces when the global proxy-defaults is changed.	2020-02-05 10:06:27 -05:00
Matt Keeler	228da48f5d	Minor Non-Functional Updates (#7215) * Cleanup the discovery chain compilation route handling Nothing functionally should be different here. The real difference is that when creating new targets or handling route destinations we use the router config entries name and namespace instead of that of the top level request. Today they SHOULD always be the same but that may not always be the case. This hopefully also makes it easier to understand how the router entries are handled. * Refactor a small bit of the service manager tests in oss We used to use the stringHash function to compute part of the filename where things would get persisted to. This has been changed in the core code to calling the StringHash method on the ServiceID type. It just so happens that the new method will output the same value for anything in the default namespace (by design actually). However, logically this filename computation in the test should do the same thing as the core code itself so I updated it here. Also of note is that newer enterprise-only tests for the service manager cannot use the old stringHash function at all because it will produce incorrect results for non-default namespaces.	2020-02-05 10:06:11 -05:00
Freddy	cb77fc6d01	Add managed service provider token (#7218) Stubs for enterprise-only ACL token to be used by managed service providers.	2020-02-04 13:58:56 -07:00
Hans Hasselberg	f6ec8ed92b	agent: increase watchLimit to 8192. (#7200) The previous value was too conservative and users with many instances were having problems because of it. This change increases the limit to 8192 which reportedly fixed most of the issues with that. Related: #4984, #4986, #5050.	2020-02-04 13:11:30 +01:00
Matt Keeler	dfb0177dbc	Testing updates to support namespaced testing of the agent/xds… (#7185) * Various testing updates to support namespaced testing of the agent/xds package * agent/proxycfg package updates to support better namespace testing	2020-02-03 09:26:47 -05:00
Davor Kapsa	3cb4def563	auto_encrypt: check previously ignored error (#6604)	2020-02-03 10:35:11 +01:00
hashicorp-ci	1fcf4bfc10	update bindata_assetfs.go	2020-01-31 21:38:38 +00:00
Hans Hasselberg	5531678e9e	Security fixes (#7182) * Mitigate HTTP/RPC Services Allow Unbounded Resource Usage Fixes #7159. Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>	2020-01-31 11:19:37 -05:00
Matt Keeler	d5f9268222	ACL enforcement for the agent/health/services endpoints (#7191) ACL enforcement for the agent/health/services endpoints	2020-01-31 11:16:24 -05:00


1830 Commits (955ee64b95ffe6d53bf73fc36881cae5a00e4967)