Commit Graph

308 Commits (838e49e7b8dfc05967bf4ddc95025550b80cd5cd)

Author SHA1 Message Date
Lukas Kämmerling b6955bf1ca
Add hetzner service discovery (#7822)
Signed-off-by: Lukas Kämmerling <lukas.kaemmerling@hetzner-cloud.de>
2020-08-21 15:49:19 +02:00
Julien Pivotto d867491364
Human-friendly durations in PromQL (#7713)
* Add support for user-friendly durations

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-08-04 21:12:41 +02:00
Lars Nielsen 019d031f3e
Updated documentation (#5390)
Updated documentation to include YAML example for file_sd_config

Signed-off-by: Lars Nielsen <nellemandela@gmail.com>
2020-08-03 15:36:33 +01:00
Julien Pivotto f482c7bdd7
Add per scrape-config targets limit (#7554)
* Add per scrape-config targets limit

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-30 14:20:24 +02:00
Julien Pivotto 924e7239b7
Docker Swarm SD: Support tasks and service without published ports (#7686)
* Support tasks and service without published ports

Mimics k8s discovery behaviour.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-29 20:56:30 +02:00
Julien Pivotto 88bdb13c55
DNS SD: add srv record target and port meta labels (#7678)
* DNS SD: add srv record target and port meta labels

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-28 22:09:01 +02:00
Julien Pivotto 9c599f1ee2
Add new SD's to alertmanager config (#7584)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-15 20:51:14 +02:00
Julien Pivotto be96951c56
Add Docker Swarm configuration example (#7542)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-09 20:05:21 +02:00
John Bampton 98a69b77d1
Fix spelling (#7512)
Signed-off-by: John Bampton <jbampton@users.noreply.github.com>
2020-07-04 14:54:26 +02:00
Julien Pivotto 74a6959d46
Docs: fix types (#7508)
I have batched a bunch of fixes around types in the documentation.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-02 23:27:14 +02:00
Steffen Neubauer 9c9b872087
OpenStack SD: Add availability config option, to choose endpoint type (#7494)
* OpenStack SD: Add availability config option, to choose endpoint type

In some environments Prometheus must query OpenStack via an alternative
endpoint type (gophercloud calls this `availability`.

This commit implements this option.

Co-Authored-By: Dennis Kuhn <d.kuhn@syseleven.de>
Signed-off-by: Steffen Neubauer <s.neubauer@syseleven.de>
2020-07-02 15:17:56 +01:00
Julien Pivotto 800c0aefcf
Fix types in k8s+dns docs (#7474)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-28 09:13:23 +02:00
Julien Pivotto 59de58d380
Docker Swarm service discovery (#7420)
* Docker Swarm service discovery

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-26 12:25:58 +02:00
Julien Pivotto 0444a419d7
Consul: document health meta label (#7466)
implemented in #5313

fixes #770

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-26 12:14:51 +02:00
Julien Pivotto c61141ce51
Add DigitalOcean service discovery (#7407)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-18 17:04:41 +02:00
Julien Pivotto 7b24bb3116
Docs: normalize bearer_token_file type (#7408)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-16 23:09:05 +02:00
Alex Vandiver 3c753aba5f
Add missing newline before inline-code block (#7401)
Sections with three backticks require a blank line before them.

Signed-off-by: Alex Vandiver <alex@chmrr.net>
2020-06-16 07:13:27 +02:00
Martin Lee b5d61fb66c
Add AMI to labels scraped during service discovery. (#7386)
Signed-off-by: Martin Lee <martin@martinlee.org>
2020-06-11 18:25:58 +01:00
Julien Pivotto ef4d8a38ca
Change metrics relabel terminology (#7362)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-09 05:40:45 +01:00
Julien Pivotto 2209fa98b4
Fix consul_sd_config to follow types convention (#7316)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-05-29 21:20:37 +02:00
Jop Zinkweg 1f69c38ba4
Add discovery support for triton compute nodes (#7250)
Added optional configuration item role, defaults to 'container' (backwards-compatible).
Setting role to 'cn' will discover compute nodes instead.

Human-friendly compute node hostname discovery depends on cmon 1.7.0:
c1a2aeca36

Adjust testcases to use discovery config per case as two different types are now supported.

Updated documentation:
* new role setting
* clarify what the name 'container' covers as triton uses different names in different locations

Signed-off-by: jzinkweg <jzinkweg@gmail.com>
2020-05-22 16:19:21 +01:00
Callum Styan 386aea7774
Add missing remote write/read config name to docs. (#7105)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-04-14 09:27:33 -07:00
Frederic Hemberger fe47c9c86e
[Docs] consul_sd_config: Add default value for `allow_stale` (#7075)
Ref: https://github.com/prometheus/prometheus/blob/master/discovery/consul/consul.go#L97
Signed-off-by: Frederic Hemberger <mail@frederic-hemberger.de>
2020-03-31 18:55:25 +01:00
Deepjyoti Mondal c38ca2ca95
Fix #6999 : Add architecture meta label for EC2 (#7000)
This PR adds architecture meta labels for EC2 instances

Signed-off-by: Deepjyoti Mondal <djmdeveloper060796@gmail.com>
2020-03-28 20:41:37 +00:00
coding3min 4dfbf328f2
[OpenStack SD] Add HypervisorID meta labels about id (#6962)
Add extra meta labels which will be useful in the case
Prometheus discovery hypervisor .

Signed-off-by: pzqu <pzqu@qq.com>

Co-authored-by: pzqu <pzqu@example.com>
2020-03-11 08:38:14 +00:00
Alex Gaganov df92a00838
Expose EC2 instance lifecycle as label (#6914)
Signed-off-by: Alex Gaganov <alex.gaganov@fiverr.com>
2020-03-03 08:03:16 +00:00
李国忠 029b45aa30
add service type metadata to kubernetes_sd_config service role #6496 (#6684)
* [service discovery] add service type metadata to kubernetes_sd_config service role

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [fix] ServiceType -> string

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [fix] fix testcase

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [style]

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [doc] add service type

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>

* [doc] sort

Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>
2020-02-25 09:22:14 +01:00
Aleksandra Gacek 8e53c19f9c discovery/kubernetes: expose label_selector and field_selector
Close #6807

Co-authored-by @shuttie
Signed-off-by: Aleksandra Gacek <algacek@google.com>
2020-02-15 14:57:56 +01:00
Grebennikov Roman b4445ff03f discovery/kubernetes: expose label_selector and field_selector
Closes #6096

Signed-off-by: Grebennikov Roman <grv@dfdx.me>
2020-02-15 14:57:38 +01:00
Julien Pivotto 9d9bc524e5 Add query log (#6520)
* Add query log, make stats logged in JSON like in the API

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-08 13:28:43 +00:00
li mengyang 1c6d2194c4 fix spelling mistakes in docs (#5952)
Signed-off-by: hwdef <hwdef97@gmail.com>
2019-08-27 11:33:40 -06:00
Chris Marchbanks a6a55c433c Improve desired shards calculation (#5763)
The desired shards calculation now properly keeps track of the rate of
pending samples, and uses the previously unused integralAccumulator to
adjust for missing information in the desired shards calculation.

Also, configure more capacity for each shard.  The default 10 capacity
causes shards to block on each other while
sending remote requests. Default to a 500 sample capacity and explain in
the documentation that having more capacity will help throughput.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2019-08-13 10:10:21 +01:00
Dan P a9dea68ee6 removed document reference to meta label that doesnt exist in the kubernetes_sd (#5821)
Signed-off-by: Dan Potepa <dan@danpotepa.co.uk>
2019-08-01 12:34:23 +01:00
beorn7 5973acd65d Clarifying honor_labels documentation
Previously, the wording could be misunderstood as setting honor_labels
to "false" for federation.

This also adds scraping the Pushgateway as a typical use case for
honor_labels=true.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-07-02 13:23:20 +02:00
Svend Sorensen 8d54650d06 Document behavior of empty ec2_sd_config region (#5711)
Document the behavior of an empty `ec2_sd_config` `region` setting. If this is
omitted or blank, the region is discovered from the instance metadata, if available.
If it is blank and instance region metadata is not available, an error will
result ("EC2 SD configuration requires a region").

Signed-off-by: Svend Sorensen <svend@svends.net>
2019-06-27 18:35:54 +01:00
Max Leonard Inden 41c22effbe
config&notifier: Add option to use Alertmanager API v2
With v0.16.0 Alertmanager introduced a new API (v2). This patch adds a
configuration option for Prometheus to send alerts to the v2 endpoint
instead of the defautl v1 endpoint.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-06-21 16:33:53 +02:00
Björn Rabenstein f3f016d464
Merge pull request #5604 from cstyan/default-capacity-docs
Update queue config documentation
2019-06-17 13:05:14 +02:00
Callum Styan e9129abeff Remove max_retries from queue_config since it's not used in remote write
anymore.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-06-10 12:43:08 -07:00
sh0rez 8ba23fb336
fix(style): container_is_init to container_init
Removes 'is' keyword to comply style guide

Signed-off-by: sh0rez <me@shorez.de>
2019-05-29 16:16:19 +02:00
sh0rez 88b79bae64
chore(style): Comply with style guide, order list
Signed-off-by: sh0rez <me@shorez.de>
2019-05-29 11:22:10 +02:00
Callum Styan babb8a0572 Update queue config documentation to reflect default value change for capacity.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-05-28 14:12:57 -07:00
sh0rez 1b144e499f
doc(discovery/kubernetes): container_is_init meta label
Signed-off-by: sh0rez <me@shorez.de>
2019-05-28 16:52:13 +02:00
Frederic Branczyk 04f22700b7
Merge pull request #5571 from simonpasquier/extend-k8s-endpoint-metadata
discovery/kubernetes: add node name and hostname to endpoints
2019-05-16 20:19:29 +02:00
Samuel Alfageme 425b07f3c4 Updated the 'consistency-modes' consul.io/api link to point to its new location (#5572)
Ref: 626392eb62

Signed-off-by: Samuel Alfageme <samuel@alfage.me>
2019-05-16 15:52:35 +01:00
Simon Pasquier 3441ecdea1 discovery/kubernetes: add node name and hostname to endpoints
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-16 10:49:13 +02:00
EarthmanT 35be8c9e25 Add azure public ip label (#5475)
* Update Azure SD Config with Public IP label

Signed-off-by: earthmant <trammell@cloudify.co>
2019-04-17 16:05:44 +01:00
Simon Pasquier dafd1632a2 discovery/kubernetes: add present labels for labels/annotations (#5443)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-10 13:21:42 +01:00
Kien Nguyen-Tuan 813b58367a [OpenStack SD] Add ProjectID and UserID meta labels (#5431)
Add extra meta labels which will be useful in the case
Prometheus discovery instances from all projects.

Signed-off-by: Kien Nguyen <kiennt2609@gmail.com>
2019-04-04 10:02:31 +01:00
Julien Pivotto 4397916cb2 Add honor_timestamps (#5304)
Fixes #5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-03-15 10:04:15 +00:00
Callum Styan 83c46fd549 update Consul vendor code so that catalog.ServiceMultipleTags can be (#5151)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-03-12 10:31:27 +00:00
tuanvcw 9de0ab3c8a Update remaining deprecated links in docs (#5271)
Signed-off-by: Vu Cong Tuan <tuanvc@vn.fujitsu.com>
2019-02-26 10:16:38 +00:00
LongKB e4a741cb7d Replacing 'HTTP' by 'HTTPS' for securing links (#5252)
Currently, when we access the modified pages with **HTTP**, it is
redirected to **HTTPS** automatically. So this commit aims to
replace **HTTP** to **HTTPs** for security.

Co-Authored-By: Nguyen Phuong An <AnNP@vn.fujitsu.com>
Signed-off-by: Kim Bao Long <longkb@vn.fujitsu.com>
2019-02-22 14:33:02 +01:00
LongKB 23480bef43 Remove the duplicated words (#5251)
Although it is spelling mistakes, it might make an affects while reading.

Co-Authored-By: Nguyen Phuong An <AnNP@vn.fujitsu.com>
Signed-off-by: Kim Bao Long <longkb@vn.fujitsu.com>
2019-02-22 14:32:34 +01:00
Simon Pasquier c8a1a5a93c
discovery/kubernetes: fix support for password_file and bearer_token_file (#5211)
* discovery/kubernetes: fix support for password_file

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Create and pass custom RoundTripper to Kubernetes client

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use inline HTTPClientConfig

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-20 11:22:34 +01:00
Kevin Bulebush 718344434c openstack_sd: Supporting application credential for authentication. (#4968)
* openstack_sd: Support application credentials for authentication.
Updated gophercloud

Signed-off-by: Kevin Bulebush <kmbulebu@gmail.com>
2019-01-09 15:18:58 +00:00
Fabian Reinartz 93d13c59d0 Sort
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2019-01-03 13:10:57 +01:00
Fabian Reinartz 7a41038695 Add Azure tenant and subscription ID labels
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2019-01-03 13:09:13 +01:00
Julien Pivotto 2e725a195a Niptick about relabel config (#4994)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2018-12-21 10:42:18 +00:00
Marcel D. Juhnke c7d83b2b6a discovery: add support for Managed Identity authentication in Azure SD (#4590)
Signed-off-by: Marcel Juhnke <marrat@marrat.de>
2018-12-19 10:03:33 +00:00
Tariq Ibrahim de6f3b6af7 expose kubernetes service cluster ip (#4940)
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2018-12-18 15:17:34 +00:00
Samuel Alfageme 240321acee Add taggedAddress to the labels in ConsulSD (#5001)
Useful when multiple (tagged) addresses for a node are exposed on the catalog API
Ref. https://www.consul.io/api/catalog.html#taggedaddresses

Signed-off-by: Samuel Alfageme <samuel@alfage.me>
2018-12-18 11:51:05 +01:00
Tariq Ibrahim e3bdc463fa Revert "add logic to check if an azure VM is deallocated or not (#4908)" (#4980)
This reverts commit 61cf4365

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-12-12 09:27:12 +01:00
Ryota Arai 135d580ab2 Introduce min_shards for remote write to set minimum number of shards. (#4924)
Signed-off-by: Ryota Arai <ryota.arai@gmail.com>
2018-12-04 17:32:14 +00:00
Tariq Ibrahim 61cf4365d6 add logic to check if an azure VM is deallocated or not (#4908)
* add logic to check if an azure VM is deallocated or not
* update documentation  with the new azure power state label

Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-11-30 11:32:40 +00:00
Serghei Anicheev 8e659a5109 Adding private_dns_name to the list of ec2 labels which can be used i… (#4693)
* Adding private_dns_name to the list of ec2 labels which can be used in node naming for dynamic environments

Signed-off-by: Serghei Anicheev <serghei@rentalcover.com>
2018-11-30 11:11:06 +00:00
Bryan Boreham cf37e1feb4 Add __meta_kubernetes_pod_phase label in discovery (#4824)
This lets you add a relabel rule to drop scrapes for pods which are
not running.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-11-06 14:40:24 +00:00
Silvio Gissi 6100f160ad EC2 Platform meta label (#4663)
Set __meta_ec2_platform label with the instance platform string. Set to 'windows' on Windows servers and absent otherwise.


Signed-off-by: Silvio Gissi <silvio@gissilabs.com>
2018-11-06 14:39:48 +00:00
Timo Beckers 36143be234
docs - refer to documentation/examples/prometheus-marathon.yml
Signed-off-by: Timo Beckers <timo@incline.eu>
2018-10-25 18:02:59 +02:00
Kien Nguyen-Tuan 9c5370fdfe Support discover instances from all projects (#4682)
By default, OpenStack SD only queries for instances
from specified project. To discover instances from other
projects, users have to add more openstack_sd_configs for
each project.

This patch adds `all_tenants` <bool> options to
openstack_sd_configs. For example:

- job_name: 'openstack_all_instances'
  openstack_sd_configs:
    - role: instance
      region: RegionOne
      identity_endpoint: http://<identity_server>/identity/v3
      username: <username>
      password: <super_secret_password>
      domain_name: Default
      all_tenants: true

Co-authored-by: Kien Nguyen <kiennt2609@gmail.com>
Signed-off-by: dmatosl <danielmatos.lima@gmail.com>
2018-10-17 13:01:33 +01:00
Brian Brazil 468e49417c Update remote_write queue docs to present defaults. (#4715)
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-10-10 18:51:27 +01:00
Richard Kiene b537f6047a Add ability to filter triton_sd targets by pre-defined groups (#4701)
Additionally, add triton groups metadata to the discovery reponse
and correct a documentation error regarding the triton server id
metadata.

Signed-off-by: Richard Kiene <richard.kiene@joyent.com>
2018-10-10 10:03:34 +01:00
Simon Pasquier a2a78d0a09 discovery/openstack: discover all interfaces (#4649)
* discovery/openstack: discover all interfaces
* Add address pool label

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-10-09 16:17:08 +01:00
Simon Pasquier ff08c40091 discovery/openstack: support tls_config
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-25 14:31:32 +02:00
Tariq Ibrahim f708fd5c99 Adding support for multiple azure environments (#4569)
Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>
2018-09-04 17:55:40 +02:00
Javier Kohen 1c89984778 Expose EC2 instance owner as a discovery label.
This exposes the OwnerID field of the DescribeInstances respons as .

Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-17 11:30:18 -04:00
Javier Kohen 2d4bcb3ee1 Document the new __meta_gce_instance_id discovery label.
Signed-off-by: Javier Kohen <jkohen@google.com>
2018-08-10 11:59:22 -04:00
Johannes Scheuermann 7608ee87d0 Inital support for Azure VMSS (#4202)
* Inital support for Azure VMSS

Signed-off-by: Johannes Scheuermann <johannes.scheuermann@inovex.de>

* Add documentation for the newly introduced label

Signed-off-by: Johannes M. Scheuermann <joh.scheuer@gmail.com>
2018-08-01 12:52:21 +01:00
José Martínez 791c13b142 discovery/ec2: Add primary_subnet_id label
Signed-off-by: José Martínez <xosemp@gmail.com>
2018-07-25 09:20:58 +01:00
Jannick Fahlbusch ฏ๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎๎ 0be25f92e2 EC2 Discovery: Allow to set a custom endpoint (#4333)
Allowing to set a custom endpoint makes it easy to monitor targets on non AWS providers with EC2 compliant APIs.

Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2018-07-18 10:48:14 +01:00
Romain Baugue b41be4ef52 Discovery consul service meta (#4280)
* Upgrade Consul client
* Add ServiceMeta to the labels in ConsulSD

Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
2018-07-18 05:06:56 +01:00
Simon Pasquier ed99af0b05 docs: fix OpenStack SD for the hypervisor role
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-07-15 12:37:57 +01:00
Damien Lespiau e64037053d Expose controller kind and name to labelling rules
Relabelling rules can use this information to attach the name of the controller
that has created a pod.

In turn, this can be used to slice metrics by workload at query time, ie.
"Give me all metrics that have been created by the $name Deployment"

Signed-off-by: Damien Lespiau <damien@weave.works>
2018-05-09 11:51:37 +02:00
Nathan Graves 5b27996cb3 Include GCE labels during service discovery. Updated vendor files for Google API. (#4150)
Signed-off-by: Nathan Graves <nathan.graves@kofile.us>
2018-05-08 17:37:47 +01:00
Daisy T b424eb42e3 document remote write queue parameters (#4126) 2018-04-30 20:08:45 +02:00
Adam Shannon 809881d7f5 support reading basic_auth password_file for HTTP basic auth (#4077)
Issue: https://github.com/prometheus/prometheus/issues/4076

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-04-25 18:19:06 +01:00
Philippe Laflamme 2aba238f31 Use common HTTPClientConfig for marathon_sd configuration (#4009)
This adds support for basic authentication which closes #3090

The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`.

DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this.

Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.
2018-04-05 09:08:18 +01:00
albatross0 0245fd55bf Add a machine type label to GCE SD (#4032) 2018-03-31 09:20:19 +01:00
Kristiyan Nikolov be85ba3842 discovery/ec2: Support filtering instances in discovery (#4011) 2018-03-31 07:51:11 +01:00
Corentin Chary 60dafd425c consul: improve consul service discovery (#3814)
* consul: improve consul service discovery

Related to #3711

- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services`
  allow filtering by node-meta, and returns a `map[string]string` or `service`->`tags`).
  Tags and nore-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important
  because on large cluster `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale
  reads can be costly, even more when you are doing them to a remote
  datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the
  service endpoint. This is needed because of that kind of behavior from
  consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog
  on a large cluster would basically change *all* the time. No need to discover
  targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.

Benchmarks
----------

```yaml
scrape_configs:

- job_name: prometheus
  scrape_interval: 60s
  static_configs:
    - targets: ["127.0.0.1:9090"]

- job_name: "observability-by-tag"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      tag: marathon-user-observability  # Used in After
      refresh_interval: 30s             # Used in After+delay
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      regex: ^(.*,)?marathon-user-observability(,.*)?$
      action: keep

- job_name: "observability-by-name"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - observability-cerebro
        - observability-portal-web

- job_name: "fake-fake-fake"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - fake-fake-fake
```

Note: tested with ~1200 services, ~5000 nodes.

| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
|/service-discovery size|5K|85MiB|27k|27k|27k|
|`go_memstats_heap_objects`|100k|1M|120k|110k|
|`go_memstats_heap_alloc_bytes`|24MB|150MB|28MB|27MB|
|`rate(go_memstats_alloc_bytes_total[5m])`|0.2MB/s|28MB/s|2MB/s|0.3MB/s|
|`rate(process_cpu_seconds_total[5m])`|0.1%|15%|2%|0.01%|
|`process_open_fds`|16|*1236*|22|22|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])`|~0|1|1|*0.03*|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])`|0.1|*80*|0.5|0.5|
|`prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}`|N/A|200ms|0.2ms|0.2ms|
|Network bandwidth|~10kbps|~2.8Mbps|~1.6Mbps|~10kbps|

Filtering by tag using relabel_configs uses **100kiB and 23kiB/s per service per job** and quite a lot of CPU. Also sends and additional *1Mbps* of traffic to consul.
Being a little bit smarter about this reduces the overhead quite a lot.
Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.

* consul: tweak `refresh_interval` behavior

`refresh_interval` now does what is advertised in the documentation,
there won't be more that one update per `refresh_interval`. It now
defaults to 30s (which was also the current waitTime in the consul query).

This also make sure we don't wait another 30s if we already waited 29s
in the blocking call by substracting the number of elapsed seconds.

Hopefully this will do what people expect it does and will be safer
for existing consul infrastructures.
2018-03-23 14:48:43 +00:00
Yecheng Fu 56ed29fbf7 Map target infos of endpoints to prometheus meta labels. (#3770) 2018-03-09 10:07:00 +00:00
Pedro Araújo 575f665944 Add OS type meta label to Azure SD (#3863)
There is currently no way to differentiate Windows instances from Linux
ones. This is needed when you have a mix of node_exporters /
wmi_exporters for OS-level metrics and you want to have them in separate
scrape jobs.

This change allows you to do just that. Example:

```
  - job_name: 'node'
    azure_sd_configs:
      - <azure_sd_config>
    relabel_configs:
      - source_labels: [__meta_azure_machine_os_type]
        regex: Linux
        action: keep
```

The way the vendor'd AzureSDK provides to get the OsType is a bit
awkward - as far as I can tell, this information can only be gotten from
the startup disk. Newer versions of the SDK appear to improve this a
bit (by having OS information in the InstanceView), but the current way
still works.
2018-02-19 15:40:57 +00:00
Andrea Giardini 3a9637fa3c docs: Fix remote_read/remote_timeout default (#3829) 2018-02-12 12:52:33 +00:00
zemek 8a01a0fbed Set consul server default to localhost:8500 (#3703) 2018-01-24 12:14:32 +00:00
James Turnbull 00f4821178 Added missing ingress from role list (#3666) 2018-01-08 21:23:01 +00:00
Brian Brazil fba80da635
Fix default of read_recent to be false. (#3617)
This is what is documented in the migration guide, and the default settings
should make sense for a true long term storage.

Document the setting.
2017-12-23 17:21:38 +00:00
Brian Brazil 9083d41d3a
Add 2.0 stability guarantees (#3484)
As discussed generally consider SDs as unstable, as realistically they
are never going to be. Drop the words "experimental/beta" from most
places in the docs, as users are getting the wrong impression from this.
2017-12-14 12:54:32 +00:00
Simon Pasquier aa25dff1ea Update the openstack_sd_config section
openstack_sd_config requires a 'role' parameter which wasn't documented.
2017-12-14 12:20:28 +00:00
Tobias Schmidt 28205f5ca9 Remove wrong statement about alertmanager URL configuration 2017-12-14 12:20:28 +00:00
Brian Brazil e0711c2e9b Document consul sd tls_config (#3440)
Fixes https://github.com/prometheus/docs/issues/681
2017-12-14 12:20:28 +00:00
James Turnbull 330735aca6 Added another full link to the configuration docs (#3553) 2017-12-07 08:31:15 +00:00
Amy Holt 607a675617 Add prefix to relative 3 URLs (#3551) 2017-12-06 21:16:53 +00:00
James Turnbull 47311bf005 Update configuration.md (#3513)
1. Removed https://prometheus.io prefix 
2. Fixed broken file discovery link.
2017-11-27 14:52:32 +00:00
Tom Wilkie 7d4f7c4b71 Update docs for __meta_kubernetes_pod_uid 2017-11-24 15:02:53 +00:00
Tobias Schmidt 7098c56474 Add remote read filter option
For special remote read endpoints which have only data for specific
queries, it is desired to limit the number of queries sent to the
configured remote read endpoint to reduce latency and performance
overhead.
2017-11-13 23:30:01 +01:00
Brian Brazil a5b7955ace Tweak marathon wording around clustering. 2017-11-02 13:03:19 +00:00
Goutham Veeramachaneni 646e33242e docs: Fix minor issues with the docs. (#3389)
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-11-01 15:35:50 +00:00
Brian Brazil b6494960d1
docs: Document new recording rule format (#3378) 2017-11-01 12:58:32 +00:00
Tobias Schmidt f432b8176d Consolidate configuration and rules docs in docs/configuration/ 2017-10-27 09:54:02 +02:00