diff --git a/CHANGELOG.md b/CHANGELOG.md index 5736dba690..bfa311ec2a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,31 +1,12 @@ ## 0.7.0 (UNRELEASED) -BACKWARDS INCOMPATIBILITIES: +FEATURES: -* `skip_leave_on_interrupt`'s default behavior is now dependent on whether or - not the agent is acting as a server or client. When Consul is started as a - server the default is `true` and `false` when a client. [GH-1909] -* HTTP check output is truncated to 4k, similar to script check output. [GH-1952] - -IMPROVEMENTS: - -* Implemented a new set of feedback controls for the gossip layer that help - prevent degraded nodes that can't meet the soft real-time requirements from - erroneously causing `serfHealth` flapping in other, healthy nodes. [GH-2101] -* Joins based on a DNS lookup will use TCP and attempt to join with the full - list of returned addresses. [GH-2101] -* Added a new network tomogroaphy visualization to the UI. [GH-2046] * Added a new `/v1/txn` state store transaction API that allows for atomic updates to and fetches from multiple entries in the key/value store. [GH-2028] -* Consul agents will now periodically reconnect to available Consul servers - in order to redistribute their RPC query load. Consul clients will, by - default, attempt to establish a new connection every 120s to 180s unless - the size of the cluster is sufficiently large. The rate at which agents - begin to query new servers is proportional to the size of the Consul - cluster (servers should never receive more than 64 new connections per - second per Consul server as a result of rebalancing). Clusters in stable - environments who use `allow_stale` should see a more even distribution of - query load across all of their Consul servers. [GH-1743] +* Script checks now support an optional `timeout` parameter. 
[GH-1762] +* Reap time for failed nodes is now configurable via new `reconnect_timeout` and + `reconnect_timeout_wan` config options ([use with caution](https://www.consul.io/docs/agent/options.html#reconnect_timeout)). [GH-1935] * Consul agents can now limit the number of UDP answers returned via the DNS interface. The default number of UDP answers is `3`, however by adjusting the `dns_config.udp_answer_limit` configuration parameter, it is now @@ -37,15 +18,41 @@ IMPROVEMENTS: [agent options](https://www.consul.io/docs/agent/options.html#udp_answer_limit) documentation for additional details for when this should be used. [GH-1712] +* Prepared queries support baking in the `Near` sorting parameter [GH-2137] + +BACKWARDS INCOMPATIBILITIES: + +* `skip_leave_on_interrupt`'s default behavior is now dependent on whether or + not the agent is acting as a server or client. When Consul is started as a + server the default is `true` and `false` when a client. [GH-1909] +* HTTP check output is truncated to 4k, similar to script check output. [GH-1952] + +IMPROVEMENTS: + +* Consul will now retry RPC calls that result in "no leader" errors for up to + 5 seconds. This allows agents to ride out leader elections with a delayed + response vs. an error. [GH-2175] +* Implemented a new set of feedback controls for the gossip layer that help + prevent degraded nodes that can't meet the soft real-time requirements from + erroneously causing `serfHealth` flapping in other, healthy nodes. [GH-2101] +* Joins based on a DNS lookup will use TCP and attempt to join with the full + list of returned addresses. [GH-2101] +* Added a new network tomography visualization to the UI. [GH-2046] +* Consul agents will now periodically reconnect to available Consul servers + in order to redistribute their RPC query load. Consul clients will, by + default, attempt to establish a new connection every 120s to 180s unless + the size of the cluster is sufficiently large. 
The rate at which agents + begin to query new servers is proportional to the size of the Consul + cluster (servers should never receive more than 64 new connections per + second per Consul server as a result of rebalancing). Clusters in stable + environments who use `allow_stale` should see a more even distribution of + query load across all of their Consul servers. [GH-1743] * Consul will now refuse to start with a helpful message if the same UNIX socket is used for more than one listening endpoint. [GH-1910] * Removed an obsolete warning message when Consul starts on Windows. [GH-1920] * Defaults bind address to 127.0.0.1 when running in `-dev` mode. [GH-1878] * Builds Consul releases with Go 1.6.1. [GH-1948] * HTTP health checks limit saved output to 4K to avoid performance issues. [GH-1952] -* Reap time for failed nodes is now configurable via new `reconnect_timeout` and - `reconnect_timeout_wan` config options ([use with caution](https://www.consul.io/docs/agent/options.html#reconnect_timeout)). [GH-1935] -* Script checks now support an optional `timeout` parameter. [GH-1762] BUG FIXES: diff --git a/api/prepared_query.go b/api/prepared_query.go index c8141887c4..63e741e050 100644 --- a/api/prepared_query.go +++ b/api/prepared_query.go @@ -25,6 +25,11 @@ type ServiceQuery struct { // Service is the service to query. Service string + // Near allows baking in the name of a node to automatically distance- + // sort from. The magic "_agent" value is supported, which sorts near + // the agent which initiated the request by default. + Near string + // Failover controls what we do if there are no healthy nodes in the // local datacenter. Failover QueryDatacenterOptions @@ -40,6 +45,17 @@ type ServiceQuery struct { Tags []string } +// QueryTemplate carries the arguments for creating a templated query. +type QueryTemplate struct { + // Type specifies the type of the query template. Currently only + // "name_prefix_match" is supported. This field is required. 
+ Type string + + // Regexp allows specifying a regex pattern to match against the name + // of the query being executed. + Regexp string +} + // PrepatedQueryDefinition defines a complete prepared query. type PreparedQueryDefinition struct { // ID is this UUID-based ID for the query, always generated by Consul. @@ -67,6 +83,11 @@ type PreparedQueryDefinition struct { // DNS has options that control how the results of this query are // served over DNS. DNS QueryDNSOptions + + // Template is used to pass through the arguments for creating a + // prepared query with an attached template. If a template is given, + // interpolations are possible in other struct fields. + Template QueryTemplate } // PreparedQueryExecuteResponse has the results of executing a query. diff --git a/command/agent/command.go b/command/agent/command.go index 56885dc657..bb40fa2632 100644 --- a/command/agent/command.go +++ b/command/agent/command.go @@ -62,6 +62,7 @@ func (c *Command) readConfig() *Config { var retryIntervalWan string var dnsRecursors []string var dev bool + var dcDeprecated string cmdFlags := flag.NewFlagSet("agent", flag.ContinueOnError) cmdFlags.Usage = func() { c.Ui.Output(c.Help()) } @@ -72,7 +73,8 @@ func (c *Command) readConfig() *Config { cmdFlags.StringVar(&cmdConfig.LogLevel, "log-level", "", "log level") cmdFlags.StringVar(&cmdConfig.NodeName, "node", "", "node name") - cmdFlags.StringVar(&cmdConfig.Datacenter, "dc", "", "node datacenter") + cmdFlags.StringVar(&dcDeprecated, "dc", "", "node datacenter (deprecated: use 'datacenter' instead)") + cmdFlags.StringVar(&cmdConfig.Datacenter, "datacenter", "", "node datacenter") cmdFlags.StringVar(&cmdConfig.DataDir, "data-dir", "", "path to the data directory") cmdFlags.BoolVar(&cmdConfig.EnableUi, "ui", false, "enable the built-in web UI") cmdFlags.StringVar(&cmdConfig.UiDir, "ui-dir", "", "path to the web UI directory") @@ -239,6 +241,14 @@ func (c *Command) readConfig() *Config { } } + // Output a warning if the 'dc' 
flag has been used. + if dcDeprecated != "" { + c.Ui.Error("WARNING: the 'dc' flag has been deprecated. Use 'datacenter' instead") + + // Making sure that we don't break previous versions. + config.Datacenter = dcDeprecated + } + // Ensure the datacenter is always lowercased. The DNS endpoints automatically // lowercase all queries, and internally we expect DC1 and dc1 to be the same. config.Datacenter = strings.ToLower(config.Datacenter) @@ -1074,7 +1084,8 @@ Options: -dev Starts the agent in development mode. -recursor=1.2.3.4 Address of an upstream DNS server. Can be specified multiple times. - -dc=east-aws Datacenter of the agent + -dc=east-aws Datacenter of the agent (deprecated: use 'datacenter' instead). + -datacenter=east-aws Datacenter of the agent. -encrypt=key Provides the gossip encryption key -join=1.2.3.4 Address of an agent to join at start time. Can be specified multiple times. diff --git a/command/agent/dns.go b/command/agent/dns.go index 2a8d8dd5c3..9e2357dea4 100644 --- a/command/agent/dns.go +++ b/command/agent/dns.go @@ -598,6 +598,15 @@ func (d *DNSServer) preparedQueryLookup(network, datacenter, query string, req, Token: d.agent.config.ACLToken, AllowStale: d.config.AllowStale, }, + + // Always pass the local agent through. In the DNS interface, there + // is no provision for passing additional query parameters, so we + // send the local agent's data through to allow distance sorting + // relative to ourself on the server side. + Agent: structs.QuerySource{ + Datacenter: d.agent.config.Datacenter, + Node: d.agent.config.NodeName, + }, } // TODO (slackpad) - What's a safe limit we can set here? 
It seems like diff --git a/command/agent/dns_test.go b/command/agent/dns_test.go index 863a0bfefb..1182d9172e 100644 --- a/command/agent/dns_test.go +++ b/command/agent/dns_test.go @@ -3166,3 +3166,37 @@ func TestDNS_InvalidQueries(t *testing.T) { } } } + +func TestDNS_PreparedQuery_AgentSource(t *testing.T) { + dir, srv := makeDNSServer(t) + defer os.RemoveAll(dir) + defer srv.agent.Shutdown() + + testutil.WaitForLeader(t, srv.agent.RPC, "dc1") + + m := MockPreparedQuery{} + if err := srv.agent.InjectEndpoint("PreparedQuery", &m); err != nil { + t.Fatalf("err: %v", err) + } + + m.executeFn = func(args *structs.PreparedQueryExecuteRequest, reply *structs.PreparedQueryExecuteResponse) error { + // Check that the agent inserted its self-name and datacenter to + // the RPC request body. + if args.Agent.Datacenter != srv.agent.config.Datacenter || + args.Agent.Node != srv.agent.config.NodeName { + t.Fatalf("bad: %#v", args.Agent) + } + return nil + } + + { + m := new(dns.Msg) + m.SetQuestion("foo.query.consul.", dns.TypeSRV) + + c := new(dns.Client) + addr, _ := srv.agent.config.ClientListener("", srv.agent.config.Ports.DNS) + if _, _, err := c.Exchange(m, addr.String()); err != nil { + t.Fatalf("err: %v", err) + } + } +} diff --git a/command/agent/prepared_query_endpoint.go b/command/agent/prepared_query_endpoint.go index bf643f7c26..1a6ff6d72e 100644 --- a/command/agent/prepared_query_endpoint.go +++ b/command/agent/prepared_query_endpoint.go @@ -96,6 +96,10 @@ func parseLimit(req *http.Request, limit *int) error { func (s *HTTPServer) preparedQueryExecute(id string, resp http.ResponseWriter, req *http.Request) (interface{}, error) { args := structs.PreparedQueryExecuteRequest{ QueryIDOrName: id, + Agent: structs.QuerySource{ + Node: s.agent.config.NodeName, + Datacenter: s.agent.config.Datacenter, + }, } s.parseSource(req, &args.Source) if done := s.parse(resp, req, &args.Datacenter, &args.QueryOptions); done { @@ -131,6 +135,10 @@ func (s *HTTPServer) 
preparedQueryExecute(id string, resp http.ResponseWriter, r func (s *HTTPServer) preparedQueryExplain(id string, resp http.ResponseWriter, req *http.Request) (interface{}, error) { args := structs.PreparedQueryExecuteRequest{ QueryIDOrName: id, + Agent: structs.QuerySource{ + Node: s.agent.config.NodeName, + Datacenter: s.agent.config.Datacenter, + }, } s.parseSource(req, &args.Source) if done := s.parse(resp, req, &args.Datacenter, &args.QueryOptions); done { diff --git a/command/agent/prepared_query_endpoint_test.go b/command/agent/prepared_query_endpoint_test.go index 8997de05f3..ff757e0acf 100644 --- a/command/agent/prepared_query_endpoint_test.go +++ b/command/agent/prepared_query_endpoint_test.go @@ -286,6 +286,10 @@ func TestPreparedQuery_Execute(t *testing.T) { Datacenter: "dc1", Node: "my-node", }, + Agent: structs.QuerySource{ + Datacenter: srv.agent.config.Datacenter, + Node: srv.agent.config.NodeName, + }, QueryOptions: structs.QueryOptions{ Token: "my-token", RequireConsistent: true, @@ -323,6 +327,38 @@ func TestPreparedQuery_Execute(t *testing.T) { } }) + // Ensure the proper params are set when no special args are passed + httpTest(t, func(srv *HTTPServer) { + m := MockPreparedQuery{} + if err := srv.agent.InjectEndpoint("PreparedQuery", &m); err != nil { + t.Fatalf("err: %v", err) + } + + m.executeFn = func(args *structs.PreparedQueryExecuteRequest, reply *structs.PreparedQueryExecuteResponse) error { + if args.Source.Node != "" { + t.Fatalf("expect node to be empty, got %q", args.Source.Node) + } + expect := structs.QuerySource{ + Datacenter: srv.agent.config.Datacenter, + Node: srv.agent.config.NodeName, + } + if !reflect.DeepEqual(args.Agent, expect) { + t.Fatalf("expect: %#v\nactual: %#v", expect, args.Agent) + } + return nil + } + + req, err := http.NewRequest("GET", "/v1/query/my-id/execute", nil) + if err != nil { + t.Fatalf("err: %v", err) + } + + resp := httptest.NewRecorder() + if _, err := srv.PreparedQuerySpecific(resp, req); err != nil 
{ + t.Fatalf("err: %v", err) + } + }) + httpTest(t, func(srv *HTTPServer) { body := bytes.NewBuffer(nil) req, err := http.NewRequest("GET", "/v1/query/not-there/execute", body) @@ -357,6 +393,10 @@ func TestPreparedQuery_Explain(t *testing.T) { Datacenter: "dc1", Node: "my-node", }, + Agent: structs.QuerySource{ + Datacenter: srv.agent.config.Datacenter, + Node: srv.agent.config.NodeName, + }, QueryOptions: structs.QueryOptions{ Token: "my-token", RequireConsistent: true, diff --git a/consul/acl.go b/consul/acl.go index 24cedf8fc0..fa3f558a6c 100644 --- a/consul/acl.go +++ b/consul/acl.go @@ -180,7 +180,14 @@ func (c *aclCache) lookupACL(id, authDC string) (acl.ACL, error) { if strings.Contains(err.Error(), aclNotFound) { return nil, errors.New(aclNotFound) } else { - c.logger.Printf("[ERR] consul.acl: Failed to get policy for '%s': %v", id, err) + s := id + // Print last 3 chars of the token if long enough, otherwise completely hide it + if len(s) > 3 { + s = fmt.Sprintf("token ending in '%s'", s[len(s)-3:]) + } else { + s = redactedToken + } + c.logger.Printf("[ERR] consul.acl: Failed to get policy for %s: %v", s, err) } // Unable to refresh, apply the down policy diff --git a/consul/catalog_endpoint_test.go b/consul/catalog_endpoint_test.go index 28b22b2d0e..8171551121 100644 --- a/consul/catalog_endpoint_test.go +++ b/consul/catalog_endpoint_test.go @@ -34,7 +34,7 @@ func TestCatalogRegister(t *testing.T) { var out struct{} err := msgpackrpc.CallWithCodec(codec, "Catalog.Register", &arg, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } @@ -198,7 +198,7 @@ func TestCatalogDeregister(t *testing.T) { var out struct{} err := msgpackrpc.CallWithCodec(codec, "Catalog.Deregister", &arg, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } @@ -302,7 +302,7 @@ func TestCatalogListNodes(t *testing.T) { } var out structs.IndexedNodes err := 
msgpackrpc.CallWithCodec(codec, "Catalog.ListNodes", &args, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } @@ -621,7 +621,7 @@ func TestCatalogListServices(t *testing.T) { } var out structs.IndexedServices err := msgpackrpc.CallWithCodec(codec, "Catalog.ListServices", &args, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } @@ -810,7 +810,7 @@ func TestCatalogListServiceNodes(t *testing.T) { } var out structs.IndexedServiceNodes err := msgpackrpc.CallWithCodec(codec, "Catalog.ServiceNodes", &args, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } @@ -857,7 +857,7 @@ func TestCatalogListServiceNodes_DistanceSort(t *testing.T) { } var out structs.IndexedServiceNodes err := msgpackrpc.CallWithCodec(codec, "Catalog.ServiceNodes", &args, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } @@ -944,7 +944,7 @@ func TestCatalogNodeServices(t *testing.T) { } var out structs.IndexedNodeServices err := msgpackrpc.CallWithCodec(codec, "Catalog.NodeServices", &args, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } @@ -1001,7 +1001,7 @@ func TestCatalogRegister_FailedCase1(t *testing.T) { var out struct{} err := msgpackrpc.CallWithCodec(codec, "Catalog.Register", &arg, &out) - if err == nil || err.Error() != "No cluster leader" { + if err != nil { t.Fatalf("err: %v", err) } diff --git a/consul/config.go b/consul/config.go index 78b3fc8753..8e252b6351 100644 --- a/consul/config.go +++ b/consul/config.go @@ -224,6 +224,13 @@ type Config struct { // are willing to apply in one period. After this limit we will issue a // warning and discard the remaining updates. CoordinateUpdateMaxBatches int + + // RPCHoldTimeout is how long an RPC can be "held" before it is errored. 
+ // This is used to paper over a loss of leadership by instead holding RPCs, + // so that the caller experiences a slow response rather than an error. + // This period is meant to be long enough for a leader election to take + // place, and a small jitter is applied to avoid a thundering herd. + RPCHoldTimeout time.Duration } // CheckVersion is used to check if the ProtocolVersion is valid @@ -286,6 +293,9 @@ func DefaultConfig() *Config { CoordinateUpdatePeriod: 5 * time.Second, CoordinateUpdateBatchSize: 128, CoordinateUpdateMaxBatches: 5, + + // Hold an RPC for up to 5 seconds by default + RPCHoldTimeout: 5 * time.Second, } // Increase our reap interval to 3 days instead of 24h. diff --git a/consul/prepared_query/walk_test.go b/consul/prepared_query/walk_test.go index db9a75c1cb..05294e3b65 100644 --- a/consul/prepared_query/walk_test.go +++ b/consul/prepared_query/walk_test.go @@ -20,6 +20,7 @@ func TestWalk_ServiceQuery(t *testing.T) { Failover: structs.QueryDatacenterOptions{ Datacenters: []string{"dc1", "dc2"}, }, + Near: "_agent", Tags: []string{"tag1", "tag2", "tag3"}, } if err := walk(service, fn); err != nil { @@ -30,6 +31,7 @@ func TestWalk_ServiceQuery(t *testing.T) { ".Service:the-service", ".Failover.Datacenters[0]:dc1", ".Failover.Datacenters[1]:dc2", + ".Near:_agent", ".Tags[0]:tag1", ".Tags[1]:tag2", ".Tags[2]:tag3", diff --git a/consul/prepared_query_endpoint.go b/consul/prepared_query_endpoint.go index d30b6c10dd..47b6fe1a77 100644 --- a/consul/prepared_query_endpoint.go +++ b/consul/prepared_query_endpoint.go @@ -368,10 +368,45 @@ func (p *PreparedQuery) Execute(args *structs.PreparedQueryExecuteRequest, // Shuffle the results in case coordinates are not available if they // requested an RTT sort. reply.Nodes.Shuffle() - if err := p.srv.sortNodesByDistanceFrom(args.Source, reply.Nodes); err != nil { + + // Build the query source. This can be provided by the client, or by + // the prepared query. Client-specified takes priority. 
+ qs := args.Source + if qs.Datacenter == "" { + qs.Datacenter = args.Agent.Datacenter + } + if query.Service.Near != "" && qs.Node == "" { + qs.Node = query.Service.Near + } + + // Respect the magic "_agent" flag. + if qs.Node == "_agent" { + qs.Node = args.Agent.Node + } + + // Perform the distance sort + err = p.srv.sortNodesByDistanceFrom(qs, reply.Nodes) + if err != nil { return err } + // If we applied a distance sort, make sure that the node queried for is in + // position 0, provided the results are from the same datacenter. + if qs.Node != "" && reply.Datacenter == qs.Datacenter { + for i, node := range reply.Nodes { + if node.Node.Node == qs.Node { + reply.Nodes[0], reply.Nodes[i] = reply.Nodes[i], reply.Nodes[0] + break + } + + // Put a cap on the depth of the search. The local agent should + // never be further in than this if distance sorting was applied. + if i == 9 { + break + } + } + } + // Apply the limit if given. if args.Limit > 0 && len(reply.Nodes) > args.Limit { reply.Nodes = reply.Nodes[:args.Limit] diff --git a/consul/prepared_query_endpoint_test.go b/consul/prepared_query_endpoint_test.go index cb10eb8f8c..49075ed928 100644 --- a/consul/prepared_query_endpoint_test.go +++ b/consul/prepared_query_endpoint_test.go @@ -1607,6 +1607,225 @@ func TestPreparedQuery_Execute(t *testing.T) { t.Fatalf("unique shuffle ratio too low: %d/100", len(uniques)) } + // Set the query to return results nearest to node3. This is the only + // node with coordinates, and it carries the service we are asking for, + // so node3 should always show up first. + query.Op = structs.PreparedQueryUpdate + query.Query.Service.Near = "node3" + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Apply", &query, &query.Query.ID); err != nil { + t.Fatalf("err: %v", err) + } + + // Now run the query and make sure the sort looks right. 
+ { + req := structs.PreparedQueryExecuteRequest{ + Agent: structs.QuerySource{ + Datacenter: "dc1", + Node: "node3", + }, + Datacenter: "dc1", + QueryIDOrName: query.Query.ID, + QueryOptions: structs.QueryOptions{Token: execToken}, + } + + var reply structs.PreparedQueryExecuteResponse + + for i := 0; i < 10; i++ { + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Execute", &req, &reply); err != nil { + t.Fatalf("err: %v", err) + } + if n := len(reply.Nodes); n != 10 { + t.Fatalf("expect 10 nodes, got: %d", n) + } + if node := reply.Nodes[0].Node.Node; node != "node3" { + t.Fatalf("expect node3 first, got: %q", node) + } + } + } + + // Query again, but this time set a client-supplied query source. This + // proves that we allow overriding the baked-in value with ?near. + { + // Set up the query with a non-existent node. This will cause the + // nodes to be shuffled if the passed node is respected, proving + // that we allow the override to happen. + req := structs.PreparedQueryExecuteRequest{ + Source: structs.QuerySource{ + Datacenter: "dc1", + Node: "foo", + }, + Agent: structs.QuerySource{ + Datacenter: "dc1", + Node: "node3", + }, + Datacenter: "dc1", + QueryIDOrName: query.Query.ID, + QueryOptions: structs.QueryOptions{Token: execToken}, + } + + var reply structs.PreparedQueryExecuteResponse + + shuffled := false + for i := 0; i < 10; i++ { + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Execute", &req, &reply); err != nil { + t.Fatalf("err: %v", err) + } + if n := len(reply.Nodes); n != 10 { + t.Fatalf("expect 10 nodes, got: %d", n) + } + if node := reply.Nodes[0].Node.Node; node != "node3" { + shuffled = true + break + } + } + + if !shuffled { + t.Fatalf("expect nodes to be shuffled") + } + } + + // If the exact node we are sorting near appears in the list, make sure it + // gets popped to the front of the result. 
+ { + req := structs.PreparedQueryExecuteRequest{ + Source: structs.QuerySource{ + Datacenter: "dc1", + Node: "node1", + }, + Datacenter: "dc1", + QueryIDOrName: query.Query.ID, + QueryOptions: structs.QueryOptions{Token: execToken}, + } + + var reply structs.PreparedQueryExecuteResponse + + for i := 0; i < 10; i++ { + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Execute", &req, &reply); err != nil { + t.Fatalf("err: %v", err) + } + if n := len(reply.Nodes); n != 10 { + t.Fatalf("expect 10 nodes, got: %d", n) + } + if node := reply.Nodes[0].Node.Node; node != "node1" { + t.Fatalf("expect node1 first, got: %q", node) + } + } + } + + // Bake the magic "_agent" flag into the query. + query.Query.Service.Near = "_agent" + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Apply", &query, &query.Query.ID); err != nil { + t.Fatalf("err: %v", err) + } + + // Check that we sort the local agent first when the magic flag is set. + { + req := structs.PreparedQueryExecuteRequest{ + Agent: structs.QuerySource{ + Datacenter: "dc1", + Node: "node3", + }, + Datacenter: "dc1", + QueryIDOrName: query.Query.ID, + QueryOptions: structs.QueryOptions{Token: execToken}, + } + + var reply structs.PreparedQueryExecuteResponse + + for i := 0; i < 10; i++ { + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Execute", &req, &reply); err != nil { + t.Fatalf("err: %v", err) + } + if n := len(reply.Nodes); n != 10 { + t.Fatalf("expect 10 nodes, got: %d", n) + } + if node := reply.Nodes[0].Node.Node; node != "node3" { + t.Fatalf("expect node3 first, got: %q", node) + } + } + } + + // Check that the query isn't just sorting "node3" first because we + // provided it in the Agent query source. Proves that we use the + // Agent source when the magic "_agent" flag is passed. 
+ { + req := structs.PreparedQueryExecuteRequest{ + Agent: structs.QuerySource{ + Datacenter: "dc1", + Node: "foo", + }, + Datacenter: "dc1", + QueryIDOrName: query.Query.ID, + QueryOptions: structs.QueryOptions{Token: execToken}, + } + + var reply structs.PreparedQueryExecuteResponse + + // Expect the set to be shuffled since we have no coordinates + // on the "foo" node. + shuffled := false + for i := 0; i < 10; i++ { + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Execute", &req, &reply); err != nil { + t.Fatalf("err: %v", err) + } + if n := len(reply.Nodes); n != 10 { + t.Fatalf("expect 10 nodes, got: %d", n) + } + if node := reply.Nodes[0].Node.Node; node != "node3" { + shuffled = true + break + } + } + + if !shuffled { + t.Fatal("expect nodes to be shuffled") + } + } + + // Shuffles if the response comes from a non-local DC. Proves that the + // agent query source does not interfere with the order. + { + req := structs.PreparedQueryExecuteRequest{ + Source: structs.QuerySource{ + Datacenter: "dc2", + Node: "node3", + }, + Agent: structs.QuerySource{ + Datacenter: "dc1", + Node: "node3", + }, + Datacenter: "dc1", + QueryIDOrName: query.Query.ID, + QueryOptions: structs.QueryOptions{Token: execToken}, + } + + var reply structs.PreparedQueryExecuteResponse + + shuffled := false + for i := 0; i < 10; i++ { + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Execute", &req, &reply); err != nil { + t.Fatalf("err: %v", err) + } + if n := len(reply.Nodes); n != 10 { + t.Fatalf("expect 10 nodes, got: %d", n) + } + if reply.Nodes[0].Node.Node != "node3" { + shuffled = true + break + } + } + + if !shuffled { + t.Fatal("expect node shuffle for remote results") + } + } + + // Un-bake the near parameter. + query.Query.Service.Near = "" + if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Apply", &query, &query.Query.ID); err != nil { + t.Fatalf("err: %v", err) + } + // Update the health of a node to mark it critical. 
setHealth := func(node string, health string) { req := structs.RegisterRequest{ @@ -1683,7 +1902,6 @@ func TestPreparedQuery_Execute(t *testing.T) { } // Make the query more picky so it excludes warning nodes. - query.Op = structs.PreparedQueryUpdate query.Query.Service.OnlyPassing = true if err := msgpackrpc.CallWithCodec(codec1, "PreparedQuery.Apply", &query, &query.Query.ID); err != nil { t.Fatalf("err: %v", err) diff --git a/consul/rpc.go b/consul/rpc.go index 6105e3ae55..fc040b22cc 100644 --- a/consul/rpc.go +++ b/consul/rpc.go @@ -10,6 +10,7 @@ import ( "time" "github.com/armon/go-metrics" + "github.com/hashicorp/consul/consul/agent" "github.com/hashicorp/consul/consul/state" "github.com/hashicorp/consul/consul/structs" "github.com/hashicorp/consul/lib" @@ -39,7 +40,8 @@ const ( // jitterFraction is a the limit to the amount of jitter we apply // to a user specified MaxQueryTime. We divide the specified time by - // the fraction. So 16 == 6.25% limit of jitter + // the fraction. So 16 == 6.25% limit of jitter. This same fraction + // is applied to the RPCHoldTimeout jitterFraction = 16 // Warn if the Raft command is larger than this. 
@@ -189,6 +191,8 @@ func (s *Server) handleConsulConn(conn net.Conn) { // forward is used to forward to a remote DC or to forward to the local leader // Returns a bool of if forwarding was performed, as well as any error func (s *Server) forward(method string, info structs.RPCInfo, args interface{}, reply interface{}) (bool, error) { + var firstCheck time.Time + // Handle DC forwarding dc := info.RequestDatacenter() if dc != s.config.Datacenter { @@ -201,20 +205,51 @@ func (s *Server) forward(method string, info structs.RPCInfo, args interface{}, return false, nil } - // Handle leader forwarding - if !s.IsLeader() { - err := s.forwardLeader(method, args, reply) +CHECK_LEADER: + // Find the leader + isLeader, remoteServer := s.getLeader() + + // Handle the case we are the leader + if isLeader { + return false, nil + } + + // Handle the case of a known leader + if remoteServer != nil { + err := s.forwardLeader(remoteServer, method, args, reply) return true, err } - return false, nil + + // Gate the request until there is a leader + if firstCheck.IsZero() { + firstCheck = time.Now() + } + if time.Now().Sub(firstCheck) < s.config.RPCHoldTimeout { + jitter := lib.RandomStagger(s.config.RPCHoldTimeout / jitterFraction) + select { + case <-time.After(jitter): + goto CHECK_LEADER + case <-s.shutdownCh: + } + } + + // No leader found and hold time exceeded + return true, structs.ErrNoLeader } -// forwardLeader is used to forward an RPC call to the leader, or fail if no leader -func (s *Server) forwardLeader(method string, args interface{}, reply interface{}) error { +// getLeader returns if the current node is the leader, and if not +// then it returns the leader which is potentially nil if the cluster +// has not yet elected a leader. 
+func (s *Server) getLeader() (bool, *agent.Server) { + // Check if we are the leader + if s.IsLeader() { + return true, nil + } + // Get the leader leader := s.raft.Leader() if leader == "" { - return structs.ErrNoLeader + return false, nil } // Lookup the server @@ -222,6 +257,12 @@ func (s *Server) forwardLeader(method string, args interface{}, reply interface{ server := s.localConsuls[leader] s.localLock.RUnlock() + // Server could be nil + return false, server +} + +// forwardLeader is used to forward an RPC call to the leader, or fail if no leader +func (s *Server) forwardLeader(server *agent.Server, method string, args interface{}, reply interface{}) error { // Handle a missing server if server == nil { return structs.ErrNoLeader diff --git a/consul/rpc_test.go b/consul/rpc_test.go new file mode 100644 index 0000000000..e77f6ea10f --- /dev/null +++ b/consul/rpc_test.go @@ -0,0 +1,72 @@ +package consul + +import ( + "os" + "testing" + "time" + + "github.com/hashicorp/consul/consul/structs" + "github.com/hashicorp/consul/testutil" + "github.com/hashicorp/net-rpc-msgpackrpc" +) + +func TestRPC_NoLeader_Fail(t *testing.T) { + dir1, s1 := testServerWithConfig(t, func(c *Config) { + c.RPCHoldTimeout = 1 * time.Millisecond + }) + defer os.RemoveAll(dir1) + defer s1.Shutdown() + codec := rpcClient(t, s1) + defer codec.Close() + + arg := structs.RegisterRequest{ + Datacenter: "dc1", + Node: "foo", + Address: "127.0.0.1", + } + var out struct{} + + // Make sure we eventually fail with a no leader error, which we should + // see given the short timeout. + err := msgpackrpc.CallWithCodec(codec, "Catalog.Register", &arg, &out) + if err == nil || err.Error() != structs.ErrNoLeader.Error() { + t.Fatalf("bad: %v", err) + } + + // Now make sure it goes through. 
+ testutil.WaitForLeader(t, s1.RPC, "dc1") + err = msgpackrpc.CallWithCodec(codec, "Catalog.Register", &arg, &out) + if err != nil { + t.Fatalf("bad: %v", err) + } +} + +func TestRPC_NoLeader_Retry(t *testing.T) { + dir1, s1 := testServerWithConfig(t, func(c *Config) { + c.RPCHoldTimeout = 10 * time.Second + }) + defer os.RemoveAll(dir1) + defer s1.Shutdown() + codec := rpcClient(t, s1) + defer codec.Close() + + arg := structs.RegisterRequest{ + Datacenter: "dc1", + Node: "foo", + Address: "127.0.0.1", + } + var out struct{} + + // This isn't sure-fire but tries to check that we don't have a + // leader going into the RPC, so we exercise the retry logic. + if ok, _ := s1.getLeader(); ok { + t.Fatalf("should not have a leader yet") + } + + // The timeout is long enough to ride out any reasonable leader + // election. + err := msgpackrpc.CallWithCodec(codec, "Catalog.Register", &arg, &out) + if err != nil { + t.Fatalf("bad: %v", err) + } +} diff --git a/consul/structs/prepared_query.go b/consul/structs/prepared_query.go index b1b20c9ed3..5e9c31847b 100644 --- a/consul/structs/prepared_query.go +++ b/consul/structs/prepared_query.go @@ -34,6 +34,12 @@ type ServiceQuery struct { // discarded) OnlyPassing bool + // Near allows the query to always prefer the node nearest the given + // node. If the node does not exist, results are returned in their + // normal randomly-shuffled order. Supplying the magic "_agent" value + // is supported to sort near the agent which initiated the request. + Near string + // Tags are a set of required and/or disallowed tags. If a tag is in // this list it must be present. If the tag is preceded with "!" then // it is disallowed. @@ -177,6 +183,10 @@ type PreparedQueryExecuteRequest struct { // network coordinates. Source QuerySource + // Agent is used to carry around a reference to the agent which initiated + // the execute request. Used to distance-sort relative to the local node. 
+ Agent QuerySource + // QueryOptions (unfortunately named here) controls the consistency // settings for the query lookup itself, as well as the service lookups. QueryOptions diff --git a/contrib/bash-completion/_consul b/contrib/bash-completion/_consul new file mode 100644 index 0000000000..bb74802049 --- /dev/null +++ b/contrib/bash-completion/_consul @@ -0,0 +1,264 @@ +# This completion file has been inspired by the completion files of the Git and +# the Docker projects. + +__consulcomp() { + local all c s=$'\n' IFS=' '$'\t'$'\n' + local cur="${COMP_WORDS[COMP_CWORD]}" + + for c in $1; do + case "$c$4" in + --*=*) all="$all$c$4$s" ;; + *) all="$all$c$4 $s" ;; + esac + done + IFS=$s + COMPREPLY=($(compgen -P "$2" -W "$all" -- "$cur")) + return +} + +__consul_agent() { + local subcommands=" + -advertise + -advertise-wan + -atlas + -atlas-join + -atlas-token + -atlas-endpoint + -bootstrap + -bind + -http-port + -bootstrap-expect + -client + -config-file + -config-dir + -data-dir + -recursor + -dc + -encrypt + -join + -join-wan + -retry-join + -retry-interval + -retry-max-wan + -log-level + -node + -protocol + -rejoin + -server + -syslog + -ui + -ui-dir + -pid-file + " + __consulcomp "-help $subcommands" +} + +__consul_configtest() { + local subcommands=" + -config-file + -config-dir + " + __consulcomp "-help $subcommands" +} + +__consul_event() { + local subcommands=" + -http-addr + -datacenter + -name + -node + -service + -tag + -token + " + __consulcomp "-help $subcommands" +} + +__consul_exec() { + local subcommands=" + -http-addr + -datacenter + -prefix + -node + -service + -tag + -wait + -wait-repl + -token + " + + __consulcomp "-help $subcommands" +} + +__consul_force_leave() { + __consulcomp "-help -rpc-addr" +} + +__consul_info() { + __consulcomp "-help -rpc-addr" +} + +__consul_join() { + local subcommands=" + -rpc-addr + -wan + " + + __consulcomp "-help $subcommands" +} + +__consul_keygen() { + # NOTE: left empty on purpose. 
+ return +} + +__consul_keyring() { + local subcommands=" + -install + -list + -remove + -token + -use + -rpc-addr + " + + __consulcomp "-help $subcommands" +} + +__consul_leave() { + __consulcomp "-help -rpc-addr" +} + +__consul_lock() { + local subcommands=" + -http-addr + -n + -name + -token + -pass-stdin + -try + -monitor-retry + -verbose + " + + __consulcomp "-help $subcommands" +} + +__consul_maint() { + local subcommands=" + -enable + -disable + -reason + -service + -token + -http-addr + " + + __consulcomp "-help $subcommands" +} + +__consul_members() { + local subcommands=" + -detailed + -rpc-addr + -status + -wan + " + + __consulcomp "-help $subcommands" +} + +__consul_monitor() { + local subcommands=" + -log-level + -rpc-addr + " + + __consulcomp "-help $subcommands" +} + +__consul_reload() { + __consulcomp "-help -rpc-addr" +} + +__consul_rtt() { + local subcommands=" + -wan + -http-addr + " + + __consulcomp "-help $subcommands" +} + +__consul_version() { + # NOTE: left empty on purpose. + return +} + +__consul_watch() { + local subcommands=" + -http-addr + -datacenter + -token + -key + -name + -passingonly + -prefix + -service + -state + -tag + -type + " + + __consulcomp "-help $subcommands" +} + +__consul() { + local c=1 command + while [ $c -lt $COMP_CWORD ]; do + cmd="${COMP_WORDS[c]}" + case "$cmd" in + -*) ;; + *) command="$cmd" + esac + c=$((++c)) + done + + local cmds=" + agent + configtest + event + exec + force-leave + info + join + keygen + keyring + leave + lock + maint + members + monitor + reload + rtt + version + watch + " + + local globalflags="--help --version" + + # Complete a command. + if [ $c -eq $COMP_CWORD -a -z "$command" ]; then + case "${COMP_WORDS[COMP_CWORD]}" in + -*|--*) __consulcomp "$globalflags" ;; + *) __consulcomp "$cmds" ;; + esac + return + fi + + # Command options. 
+ local completion_func="__consul_${command//-/_}" + declare -f $completion_func >/dev/null && $completion_func && return +} + +complete -o default -o nospace -F __consul consul diff --git a/terraform/aws/consul.tf b/terraform/aws/consul.tf index ce5d6f0a7b..3deeb4f51c 100644 --- a/terraform/aws/consul.tf +++ b/terraform/aws/consul.tf @@ -16,7 +16,7 @@ resource "aws_instance" "server" { } provisioner "file" { - source = "${path.module}/scripts/${lookup(var.service_conf, var.platform)}" + source = "${path.module}/../shared/scripts/${lookup(var.service_conf, var.platform)}" destination = "/tmp/${lookup(var.service_conf_dest, var.platform)}" } @@ -30,9 +30,9 @@ resource "aws_instance" "server" { provisioner "remote-exec" { scripts = [ - "${path.module}/scripts/install.sh", - "${path.module}/scripts/service.sh", - "${path.module}/scripts/ip_tables.sh", + "${path.module}/../shared/scripts/install.sh", + "${path.module}/../shared/scripts/service.sh", + "${path.module}/../shared/scripts/ip_tables.sh", ] } } diff --git a/terraform/aws/variables.tf b/terraform/aws/variables.tf index c1e3f760bf..f00e4b80fd 100644 --- a/terraform/aws/variables.tf +++ b/terraform/aws/variables.tf @@ -8,6 +8,7 @@ variable "user" { ubuntu = "ubuntu" rhel6 = "ec2-user" centos6 = "centos" + centos7 = "centos" rhel7 = "ec2-user" } } @@ -28,6 +29,8 @@ variable "ami" { us-west-2-centos6 = "ami-1255b321" us-east-1-rhel7 = "ami-2051294a" us-west-2-rhel7 = "ami-775e4f16" + us-east-1-centos7 = "ami-6d1c2007" + us-west-1-centos7 = "ami-af4333cf" } } @@ -36,6 +39,7 @@ variable "service_conf" { ubuntu = "debian_upstart.conf" rhel6 = "rhel_upstart.conf" centos6 = "rhel_upstart.conf" + centos7 = "rhel_consul.service" rhel7 = "rhel_consul.service" } } @@ -44,6 +48,7 @@ variable "service_conf_dest" { ubuntu = "upstart.conf" rhel6 = "upstart.conf" centos6 = "upstart.conf" + centos7 = "consul.service" rhel7 = "consul.service" } } diff --git a/terraform/google/README.md b/terraform/google/README.md new file mode 
100644 index 0000000000..0369ef4072 --- /dev/null +++ b/terraform/google/README.md @@ -0,0 +1,33 @@ +## Running the Google Cloud Platform templates to set up a Consul cluster + +The `platform` variable defines the target OS; the default is `ubuntu`. + +Supported Machine Images: +- Ubuntu 14.04 (`ubuntu`) +- RHEL6 (`rhel6`) +- RHEL7 (`rhel7`) +- CentOS6 (`centos6`) +- CentOS7 (`centos7`) + +For the Google Cloud provider, set up your environment as outlined here: https://www.terraform.io/docs/providers/google/index.html + +To set up an Ubuntu-based cluster, replace `key_path` with the actual value and run: + + +```shell +terraform apply -var 'key_path=/Users/xyz/consul.pem' +``` + +_or_ + +```shell +terraform apply -var 'key_path=/Users/xyz/consul.pem' -var 'platform=ubuntu' +``` + +To set up a RHEL6-based cluster, run: + +```shell +terraform apply -var 'key_path=/Users/xyz/consul.pem' -var 'platform=rhel6' +``` + +**Note:** For RHEL- and CentOS-based clusters, you need to have an [SSH key added](https://console.cloud.google.com/compute/metadata/sshKeys) for the user `root`.
\ No newline at end of file diff --git a/terraform/google/consul.tf b/terraform/google/consul.tf new file mode 100644 index 0000000000..066f586c30 --- /dev/null +++ b/terraform/google/consul.tf @@ -0,0 +1,68 @@ +resource "google_compute_instance" "consul" { + count = "${var.servers}" + + name = "consul-${count.index}" + zone = "${var.region_zone}" + tags = ["${var.tag_name}"] + + machine_type = "${var.machine_type}" + + disk { + image = "${lookup(var.machine_image, var.platform)}" + } + + network_interface { + network = "default" + + access_config { + # Ephemeral + } + } + + service_account { + scopes = ["https://www.googleapis.com/auth/compute.readonly"] + } + + connection { + user = "${lookup(var.user, var.platform)}" + key_path = "${var.key_path}" + } + + provisioner "file" { + source = "${path.module}/../shared/scripts/${lookup(var.service_conf, var.platform)}" + destination = "/tmp/${lookup(var.service_conf_dest, var.platform)}" + } + + provisioner "remote-exec" { + inline = [ + "echo ${var.servers} > /tmp/consul-server-count", + "echo ${google_compute_instance.consul.0.network_interface.0.address} > /tmp/consul-server-addr", + ] + } + + provisioner "remote-exec" { + scripts = [ + "${path.module}/../shared/scripts/install.sh", + "${path.module}/../shared/scripts/service.sh", + "${path.module}/../shared/scripts/ip_tables.sh", + ] + } +} + +resource "google_compute_firewall" "consul_ingress" { + name = "consul-internal-access" + network = "default" + + allow { + protocol = "tcp" + ports = [ + "8300", # Server RPC + "8301", # Serf LAN + "8302", # Serf WAN + "8400", # RPC + ] + } + + source_tags = ["${var.tag_name}"] + target_tags = ["${var.tag_name}"] +} diff --git a/terraform/google/outputs.tf b/terraform/google/outputs.tf new file mode 100644 index 0000000000..66d031cb22 --- /dev/null +++ b/terraform/google/outputs.tf @@ -0,0 +1,4 @@ +output "server_address" { + value = "${google_compute_instance.consul.0.network_interface.0.address}" +} + diff --git 
a/terraform/google/variables.tf b/terraform/google/variables.tf new file mode 100644 index 0000000000..d877e359d7 --- /dev/null +++ b/terraform/google/variables.tf @@ -0,0 +1,72 @@ +variable "platform" { + default = "ubuntu" + description = "The OS Platform" +} + +variable "user" { + default = { + ubuntu = "ubuntu" + rhel6 = "root" + rhel7 = "root" + centos6 = "root" + centos7 = "root" + } +} + +variable "machine_image" { + default = { + ubuntu = "ubuntu-os-cloud/ubuntu-1404-trusty-v20160314" + rhel6 = "rhel-cloud/rhel-6-v20160303" + rhel7 = "rhel-cloud/rhel-7-v20160303" + centos6 = "centos-cloud/centos-6-v20160301" + centos7 = "centos-cloud/centos-7-v20160301" + } +} + +variable "service_conf" { + default = { + ubuntu = "debian_upstart.conf" + rhel6 = "rhel_upstart.conf" + rhel7 = "rhel_consul.service" + centos6 = "rhel_upstart.conf" + centos7 = "rhel_consul.service" + } +} +variable "service_conf_dest" { + default = { + ubuntu = "upstart.conf" + rhel6 = "upstart.conf" + rhel7 = "consul.service" + centos6 = "upstart.conf" + centos7 = "consul.service" + } +} + +variable "key_path" { + description = "Path to the private key used to access the cloud servers" +} + +variable "region" { + default = "us-central1" + description = "The region of Google Cloud where to launch the cluster" +} + +variable "region_zone" { + default = "us-central1-f" + description = "The zone of Google Cloud in which to launch the cluster" +} + +variable "servers" { + default = "3" + description = "The number of Consul servers to launch" +} + +variable "machine_type" { + default = "f1-micro" + description = "Google Cloud Compute machine type" +} + +variable "tag_name" { + default = "consul" + description = "Name tag for the servers" +} diff --git a/terraform/aws/scripts/debian_upstart.conf b/terraform/shared/scripts/debian_upstart.conf similarity index 96% rename from terraform/aws/scripts/debian_upstart.conf rename to terraform/shared/scripts/debian_upstart.conf index 7c57a0efc2..eb52354a72 
100644 --- a/terraform/aws/scripts/debian_upstart.conf +++ b/terraform/shared/scripts/debian_upstart.conf @@ -15,7 +15,7 @@ script # Make sure to use all our CPUs, because Consul can block a scheduler thread export GOMAXPROCS=`nproc` - # Get the public IP + # Get the local IP BIND=`ifconfig eth0 | grep "inet addr" | awk '{ print substr($2,6) }'` exec /usr/local/bin/consul agent \ diff --git a/terraform/aws/scripts/install.sh b/terraform/shared/scripts/install.sh similarity index 94% rename from terraform/aws/scripts/install.sh rename to terraform/shared/scripts/install.sh index 9c392606be..08e2fdffb0 100644 --- a/terraform/aws/scripts/install.sh +++ b/terraform/shared/scripts/install.sh @@ -36,7 +36,7 @@ then echo "Installing Upstart service..." sudo mkdir -p /etc/consul.d sudo mkdir -p /etc/service - sudo chown root:root /tmp/upstart.conf + sudo chown root:root /tmp/upstart.conf sudo mv /tmp/upstart.conf /etc/init/consul.conf sudo chmod 0644 /etc/init/consul.conf sudo mv /tmp/consul_flags /etc/service/consul @@ -44,7 +44,7 @@ then else echo "Installing Systemd service..." 
sudo mkdir -p /etc/systemd/system/consul.d - sudo chown root:root /tmp/consul.service + sudo chown root:root /tmp/consul.service sudo mv /tmp/consul.service /etc/systemd/system/consul.service sudo chmod 0644 /etc/systemd/system/consul.service sudo mv /tmp/consul_flags /etc/sysconfig/consul diff --git a/terraform/aws/scripts/ip_tables.sh b/terraform/shared/scripts/ip_tables.sh similarity index 85% rename from terraform/aws/scripts/ip_tables.sh rename to terraform/shared/scripts/ip_tables.sh index b304cd1a8c..acf853402e 100644 --- a/terraform/aws/scripts/ip_tables.sh +++ b/terraform/shared/scripts/ip_tables.sh @@ -4,6 +4,7 @@ set -e sudo iptables -I INPUT -s 0/0 -p tcp --dport 8300 -j ACCEPT sudo iptables -I INPUT -s 0/0 -p tcp --dport 8301 -j ACCEPT sudo iptables -I INPUT -s 0/0 -p tcp --dport 8302 -j ACCEPT +sudo iptables -I INPUT -s 0/0 -p tcp --dport 8400 -j ACCEPT if [ -d /etc/sysconfig ]; then sudo iptables-save | sudo tee /etc/sysconfig/iptables diff --git a/terraform/aws/scripts/rhel_consul.service b/terraform/shared/scripts/rhel_consul.service similarity index 100% rename from terraform/aws/scripts/rhel_consul.service rename to terraform/shared/scripts/rhel_consul.service diff --git a/terraform/aws/scripts/rhel_upstart.conf b/terraform/shared/scripts/rhel_upstart.conf similarity index 100% rename from terraform/aws/scripts/rhel_upstart.conf rename to terraform/shared/scripts/rhel_upstart.conf diff --git a/terraform/aws/scripts/service.sh b/terraform/shared/scripts/service.sh similarity index 100% rename from terraform/aws/scripts/service.sh rename to terraform/shared/scripts/service.sh diff --git a/website/source/docs/agent/basics.html.markdown b/website/source/docs/agent/basics.html.markdown index e86fe8958c..502e96a888 100644 --- a/website/source/docs/agent/basics.html.markdown +++ b/website/source/docs/agent/basics.html.markdown @@ -55,8 +55,8 @@ There are several important messages that [`consul agent`](/docs/commands/agent. 
* **Datacenter**: This is the datacenter in which the agent is configured to run. Consul has first-class support for multiple datacenters; however, to work efficiently, - each node must be configured to report its datacenter. The [`-dc`](/docs/agent/options.html#_dc) flag - can be used to set the datacenter. For single-DC configurations, the agent + each node must be configured to report its datacenter. The [`-datacenter`](/docs/agent/options.html#_datacenter) + flag can be used to set the datacenter. For single-DC configurations, the agent will default to "dc1". * **Server**: This indicates whether the agent is running in server or client mode. diff --git a/website/source/docs/agent/http/agent.html.markdown b/website/source/docs/agent/http/agent.html.markdown index 1e017ffd7e..796c7e6e8d 100644 --- a/website/source/docs/agent/http/agent.html.markdown +++ b/website/source/docs/agent/http/agent.html.markdown @@ -394,7 +394,7 @@ body must look like: } ``` -The `Name` field is mandatory, If an `ID` is not provided, it is set to `Name`. +The `Name` field is mandatory. If an `ID` is not provided, it is set to `Name`. You cannot have duplicate `ID` entries per agent, so it may be necessary to provide an ID in the case of a collision. diff --git a/website/source/docs/agent/http/query.html.markdown b/website/source/docs/agent/http/query.html.markdown index bad3ae1288..d2737384d9 100644 --- a/website/source/docs/agent/http/query.html.markdown +++ b/website/source/docs/agent/http/query.html.markdown @@ -70,6 +70,7 @@ query, like this example: "Name": "my-query", "Session": "adf4238a-882b-9ddc-4a9d-5b6758e4159e", "Token": "", + "Near": "node1", "Service": { "Service": "redis", "Failover": { @@ -114,6 +115,16 @@ attribute which can be set on functions. This change in effect moves Consul from using `SECURITY DEFINER` by default to `SECURITY INVOKER` by default for new Prepared Queries. 
+ +`Near` allows specifying a particular node to sort near based on distance +sorting using [Network Coordinates](/docs/internals/coordinates.html). The +nearest instance to the specified node will be returned first, and subsequent +nodes in the response will be sorted in ascending order of estimated round-trip +times. If the node given does not exist, the nodes in the response will +be shuffled. Using the magic `_agent` value is supported, and will automatically +return results nearest the agent servicing the request. If unspecified, the +response will be shuffled by default. + The set of fields inside the `Service` structure define the query's behavior. `Service` is the name of the service to query. This is required. @@ -365,8 +376,9 @@ blocking queries, but it does support all consistency modes. Adding the optional "?near=" parameter with a node name will sort the resulting list in ascending order based on the estimated round trip time from that node. Passing "?near=_agent" will use the agent's node for the sort. If this is not -present, then the nodes will be shuffled randomly and will be in a different -order each time the query is executed. +present, the default behavior will shuffle the nodes randomly each time the +query is executed. Passing this option will override the built-in +near parameter of a prepared query, if present. An optional "?limit=" parameter can be used to limit the size of the list to the given number of nodes. This is applied after any sorting or shuffling. diff --git a/website/source/docs/agent/options.html.markdown b/website/source/docs/agent/options.html.markdown index e5165a9fce..470f6ffea5 100644 --- a/website/source/docs/agent/options.html.markdown +++ b/website/source/docs/agent/options.html.markdown @@ -135,7 +135,7 @@ The options below are all specified on the command-line. prototyping or developing against the API. This mode is **not** intended for production use as it does not write any data to disk. 
-* `-dc` - This flag controls the datacenter in +* `-datacenter` - This flag controls the datacenter in which the agent is running. If not provided, it defaults to "dc1". Consul has first-class support for multiple datacenters, but it relies on proper configuration. Nodes in the same datacenter should be on a single @@ -449,7 +449,7 @@ Consul will not enable TLS for the HTTP API unless the `https` port has been ass [`-client` command-line flag](#_client). * `datacenter` Equivalent to the - [`-dc` command-line flag](#_dc). + [`-datacenter` command-line flag](#_datacenter). * `data_dir` Equivalent to the [`-data-dir` command-line flag](#_data_dir). diff --git a/website/source/docs/agent/services.html.markdown b/website/source/docs/agent/services.html.markdown index f8255daf16..a34dbbf62d 100644 --- a/website/source/docs/agent/services.html.markdown +++ b/website/source/docs/agent/services.html.markdown @@ -3,7 +3,7 @@ layout: "docs" page_title: "Service Definition" sidebar_current: "docs-agent-services" description: |- - One of the main goals of service discovery is to provide a catalog of available services. To that end, the agent provides a simple service definition format to declare the availability of a service and to potentially associate it with a health check. A health check is considered to be application level if it associated with a service. A service is defined in a configuration file or added at runtime over the HTTP interface. + One of the main goals of service discovery is to provide a catalog of available services. To that end, the agent provides a simple service definition format to declare the availability of a service and to potentially associate it with a health check. A health check is considered to be application level if it is associated with a service. A service is defined in a configuration file or added at runtime over the HTTP interface. 
--- # Services @@ -11,7 +11,7 @@ description: |- One of the main goals of service discovery is to provide a catalog of available services. To that end, the agent provides a simple service definition format to declare the availability of a service and to potentially associate it with -a health check. A health check is considered to be application level if it +a health check. A health check is considered to be application level if it is associated with a service. A service is defined in a configuration file or added at runtime over the HTTP interface. diff --git a/website/source/docs/upgrade-specific.html.markdown b/website/source/docs/upgrade-specific.html.markdown index d55e397b47..1a97794a2a 100644 --- a/website/source/docs/upgrade-specific.html.markdown +++ b/website/source/docs/upgrade-specific.html.markdown @@ -14,6 +14,21 @@ details provided for their upgrades as a result of new features or changed behavior. This page is used to document those details separately from the standard upgrade flow. +## Consul 0.7 + +Consul version 0.7 adds a feature which allows prepared queries to store a +["Near" parameter](/docs/agent/http/query.html#near) in the query definition +itself. This feature enables using the distance sorting features of prepared +queries without explicitly providing the node to sort near in requests, but +requires the agent servicing a request to send additional information about +itself to the Consul servers when executing the prepared query. Agents prior +to 0.7.0 do not send this information, which means they are unable to properly +execute prepared queries configured with a `Near` parameter. Similarly, any +server nodes prior to version 0.7.0 are unable to store the `Near` parameter, +making them unable to properly serve requests for prepared queries using the +feature. It is recommended that all agents be running version 0.7.0 prior to +using this feature. 
+ ## Consul 0.6.4 Consul 0.6.4 made some substantial changes to how ACLs work with prepared diff --git a/website/source/downloads_tools.html.erb b/website/source/downloads_tools.html.erb index f034c04aa5..600491a909 100644 --- a/website/source/downloads_tools.html.erb +++ b/website/source/downloads_tools.html.erb @@ -53,6 +53,9 @@ description: |-
- Are you the author of a tool and you would like to be featured on this page? Email us at hello@hashicorp.com! + Are you the author of a tool that you would like to see featured on this page? The Consul website is open source and lives in the Consul repository on GitHub. You can submit a pull request to add your tool to the list and we will gladly review it.