prometheus

Commit Graph

Author	SHA1	Message	Date
Callum Styan	67838643ee	Add config option for remote job name (#6043 ) * Track remote write queues via a map so we don't care about index. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Support a job name for remote write/read so we can differentiate between them using the name. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Remote write/read has Name to not confuse the meaning of the field with scrape job names. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Split queue/client label into remote_name and url labels. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Don't allow for duplicate remote write/read configs. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Ensure we restart remote write queues if the hash of their config has not changed, but the remote name has changed. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Include name in remote read/write config hashes, simplify duplicates check, update test accordingly. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-12-12 12:47:23 -08:00
johncming	8d3083e256	config: add test case for scrape interval larger than timeout. (#6037 ) Signed-off-by: johncming <johncming@yahoo.com>	2019-09-23 13:26:56 +02:00
Bartek Plotka	f0863a604e	Removed extra tsdb/testutil after merge. Signed-off-by: Bartek Plotka <bwplotka@gmail.com>	2019-08-14 10:12:32 +01:00
Chris Marchbanks	529ccff07b	Remove all usages of stretchr/testify Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>	2019-08-08 19:49:27 -06:00
Max Leonard Inden	41c22effbe	config&notifier: Add option to use Alertmanager API v2 With v0.16.0 Alertmanager introduced a new API (v2). This patch adds a configuration option for Prometheus to send alerts to the v2 endpoint instead of the defautl v1 endpoint. Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2019-06-21 16:33:53 +02:00
Callum Styan	5603b857a9	Check if label value is valid when unmarhsaling external labels from YAML, add a test to config_tests for valid/invalid external label value. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-03-18 20:31:12 +00:00
Tom Wilkie	c7b3535997	Use pkg/relabelling in remote write. - Unmarshall external_labels config as labels.Labels, add tests. - Convert some more uses of model.LabelSet to labels.Labels. - Remove old relabel pkg (fixes #3647). - Validate external label names. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2019-03-18 20:31:12 +00:00
Julien Pivotto	4397916cb2	Add honor_timestamps (#5304 ) Fixes #5302 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2019-03-15 10:04:15 +00:00
Callum Styan	83c46fd549	update Consul vendor code so that catalog.ServiceMultipleTags can be (#5151 ) Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-03-12 10:31:27 +00:00
Simon Pasquier	027d2ece14	config: resolve more file paths (#5284 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-03-12 10:24:15 +00:00
Simon Pasquier	e72c875e63	config: fix Kubernetes config with empty API server (#5256 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-02-22 15:51:47 +01:00
Simon Pasquier	c8a1a5a93c	discovery/kubernetes: fix support for password_file and bearer_token_file (#5211 ) * discovery/kubernetes: fix support for password_file Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Create and pass custom RoundTripper to Kubernetes client Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Use inline HTTPClientConfig Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-02-20 11:22:34 +01:00
Simon Pasquier	f678e27eb6	: use latest release of staticcheck (#5057 ) : use latest release of staticcheck It also fixes a couple of things in the code flagged by the additional checks. Signed-off-by: Simon Pasquier <spasquie@redhat.com> Use official release of staticcheck Also run 'go list' before staticcheck to avoid failures when downloading packages. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-01-04 14:47:38 +01:00
Marcel D. Juhnke	c7d83b2b6a	discovery: add support for Managed Identity authentication in Azure SD (#4590 ) Signed-off-by: Marcel Juhnke <marrat@marrat.de>	2018-12-19 10:03:33 +00:00
Bartek Płotka	62c8337e77	Moved configuration into `relabel` package. (#4955 ) Adapted top dir relabel to use pkg relabel structs. Removal of this in a separate tracked here: https://github.com/prometheus/prometheus/issues/3647 Signed-off-by: Bartek Plotka <bwplotka@gmail.com>	2018-12-18 11:26:36 +00:00
Julius Volz	d28246e337	Fix config loading panics on nil pointer slice elements (#4942 ) Fixes https://github.com/prometheus/prometheus/issues/4902 Fixes https://github.com/prometheus/prometheus/issues/4889 Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-12-03 18:09:02 +08:00
mengnan	a5d39361ab	discovery/azure: Fail hard when Azure authentication parameters are missing (#4907 ) * discovery/azure: fail hard when client_id/client_secret is empty Signed-off-by: mengnan <supernan1994@gmail.com> * discovery/azure: fail hard when authentication parameters are missing Signed-off-by: mengnan <supernan1994@gmail.com> * add unit test Signed-off-by: mengnan <supernan1994@gmail.com> * add unit test Signed-off-by: mengnan <supernan1994@gmail.com> * format code Signed-off-by: mengnan <supernan1994@gmail.com>	2018-11-29 16:47:59 +01:00
Ben Kochie	c6399296dc	Fix spelling/typos (#4921 ) * Fix spelling/typos Fix spelling/typos reported by codespell/misspell. * UK -> US spelling changes. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-11-27 17:44:29 +01:00
Simon Pasquier	ff08c40091	discovery/openstack: support tls_config Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-09-25 14:31:32 +02:00
Simon Pasquier	128ff546b8	config: add test for OpenStack SD (#4594 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-09-13 21:44:27 +05:30
Tariq Ibrahim	f708fd5c99	Adding support for multiple azure environments (#4569 ) Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>	2018-09-04 17:55:40 +02:00
Daisy T	7d01ead689	change time.duration to model.duration for standardization (#4479 ) Signed-off-by: Daisy T <daisyts@gmx.com>	2018-08-24 16:55:21 +02:00
Paul Gier	d24d2acd11	config: set target group source index during unmarshalling (#4245 ) * config: set target group source index during unmarshalling Fixes issue #4214 where the scrape pool is unnecessarily reloaded for a config reload where the config hasn't changed. Previously, the discovery manager changed the static config after loading which caused the in-memory config to differ from a freshly reloaded config. Signed-off-by: Paul Gier <pgier@redhat.com> * [issue #4214] Test that static targets are not modified by discovery manager Signed-off-by: Paul Gier <pgier@redhat.com>	2018-06-13 16:34:59 +01:00
Philippe Laflamme	2aba238f31	Use common HTTPClientConfig for marathon_sd configuration (#4009 ) This adds support for basic authentication which closes #3090 The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`. DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this. Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.	2018-04-05 09:08:18 +01:00
Manos Fokas	25f929b772	Yaml UnmarshalStrict implementation. (#4033 ) * Updated yaml vendor package. * remove checkOverflow duplicate in rulefmt * remove duplicated HTTPClientConfig.Validate() * Added yaml static check.	2018-04-04 09:07:39 +01:00
Kristiyan Nikolov	be85ba3842	discovery/ec2: Support filtering instances in discovery (#4011 )	2018-03-31 07:51:11 +01:00
Corentin Chary	60dafd425c	consul: improve consul service discovery (#3814 ) * consul: improve consul service discovery Related to #3711 - Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services` allow filtering by node-meta, and returns a `map[string]string` or `service`->`tags`). Tags and nore-meta are also used in `/catalog/service` requests. - Do not require a call to the catalog if services are specified by name. This is important because on large cluster `/catalog/services` changes all the time. - Add `allow_stale` configuration option to do stale reads. Non-stale reads can be costly, even more when you are doing them to a remote datacenter with 10k+ targets over WAN (which is common for federation). - Add `refresh_interval` to minimize the strain on the catalog and on the service endpoint. This is needed because of that kind of behavior from consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog on a large cluster would basically change all the time. No need to discover targets in 1sec if we scrape them every minute. - Added plenty of unit tests. Benchmarks ---------- ```yaml scrape_configs: - job_name: prometheus scrape_interval: 60s static_configs: - targets: ["127.0.0.1:9090"] - job_name: "observability-by-tag" scrape_interval: "60s" metrics_path: "/metrics" consul_sd_configs: - server: consul.service.par.consul.prod.crto.in:8500 tag: marathon-user-observability # Used in After refresh_interval: 30s # Used in After+delay relabel_configs: - source_labels: [__meta_consul_tags] regex: ^(.,)?marathon-user-observability(,.)?$ action: keep - job_name: "observability-by-name" scrape_interval: "60s" metrics_path: "/metrics" consul_sd_configs: - server: consul.service.par.consul.prod.crto.in:8500 services: - observability-cerebro - observability-portal-web - job_name: "fake-fake-fake" scrape_interval: "15s" metrics_path: "/metrics" consul_sd_configs: - server: consul.service.par.consul.prod.crto.in:8500 services: - fake-fake-fake ``` Note: tested with ~1200 services, ~5000 nodes. \| Resource \| Empty \| Before \| After \| After + delay \| \| -------- \|:-----:\|:------:\|:-----:\|:-------------:\| \|/service-discovery size\|5K\|85MiB\|27k\|27k\|27k\| \|`go_memstats_heap_objects`\|100k\|1M\|120k\|110k\| \|`go_memstats_heap_alloc_bytes`\|24MB\|150MB\|28MB\|27MB\| \|`rate(go_memstats_alloc_bytes_total[5m])`\|0.2MB/s\|28MB/s\|2MB/s\|0.3MB/s\| \|`rate(process_cpu_seconds_total[5m])`\|0.1%\|15%\|2%\|0.01%\| \|`process_open_fds`\|16\|1236\|22\|22\| \|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])`\|~0\|1\|1\|0.03\| \|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])`\|0.1\|80\|0.5\|0.5\| \|`prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}`\|N/A\|200ms\|0.2ms\|0.2ms\| \|Network bandwidth\|~10kbps\|~2.8Mbps\|~1.6Mbps\|~10kbps\| Filtering by tag using relabel_configs uses 100kiB and 23kiB/s per service per job and quite a lot of CPU. Also sends and additional 1Mbps of traffic to consul. Being a little bit smarter about this reduces the overhead quite a lot. Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery. * consul: tweak `refresh_interval` behavior `refresh_interval` now does what is advertised in the documentation, there won't be more that one update per `refresh_interval`. It now defaults to 30s (which was also the current waitTime in the consul query). This also make sure we don't wait another 30s if we already waited 29s in the blocking call by substracting the number of elapsed seconds. Hopefully this will do what people expect it does and will be safer for existing consul infrastructures.	2018-03-23 14:48:43 +00:00
pasquier-s	fc8cf08f42	Prevent invalid label names with labelmap (#3868 ) This change ensures that the relabeling configurations using labelmap can't generate invalid label names.	2018-02-21 10:02:22 +00:00
Shubheksha Jalan	0471e64ad1	Use shared types from the `common` repo (#3674 ) * refactor: use shared types from common repo, remove util/config * vendor: add common/config * fix nit	2018-01-11 16:10:25 +01:00
Shubheksha Jalan	ec94df49d4	Refactor SD configuration to remove `config` dependency (#3629 ) * refactor: move targetGroup struct and CheckOverflow() to their own package * refactor: move auth and security related structs to a utility package, fix import error in utility package * refactor: Azure SD, remove SD struct from config * refactor: DNS SD, remove SD struct from config into dns package * refactor: ec2 SD, move SD struct from config into the ec2 package * refactor: file SD, move SD struct from config to file discovery package * refactor: gce, move SD struct from config to gce discovery package * refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil * refactor: consul, move SD struct from config into consul discovery package * refactor: marathon, move SD struct from config into marathon discovery package * refactor: triton, move SD struct from config to triton discovery package, fix test * refactor: zookeeper, move SD structs from config to zookeeper discovery package * refactor: openstack, remove SD struct from config, move into openstack discovery package * refactor: kubernetes, move SD struct from config into kubernetes discovery package * refactor: notifier, use targetgroup package instead of config * refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup * refactor: retrieval, use targetgroup package instead of config.TargetGroup * refactor: storage, use config util package * refactor: discovery manager, use targetgroup package instead of config.TargetGroup * refactor: use HTTPClient and TLS config from configUtil instead of config * refactor: tests, use targetgroup package instead of config.TargetGroup * refactor: fix tagetgroup.Group pointers that were removed by mistake * refactor: openstack, kubernetes: drop prefixes * refactor: remove import aliases forced due to vscode bug * refactor: move main SD struct out of config into discovery/config * refactor: rename configUtil to config_util * refactor: rename yamlUtil to yaml_config * refactor: kubernetes, remove prefixes * refactor: move the TargetGroup package to discovery/ * refactor: fix order of imports	2017-12-29 21:01:34 +01:00
Alberto Cortés	29da2fb9cd	testutil: update to go1.9 testing.Helper	2017-12-08 19:06:53 +01:00
Alberto Cortés	8f6a9f7833	config: simplify tests by using testutil.NotOk (#3289 ) Also include filename in all LoadFile errors Also add mesage to testuitl.NotOk so we can identify failing tests when using table driven tests.	2017-12-08 16:52:25 +00:00
Tobias Schmidt	7098c56474	Add remote read filter option For special remote read endpoints which have only data for specific queries, it is desired to limit the number of queries sent to the configured remote read endpoint to reduce latency and performance overhead.	2017-11-13 23:30:01 +01:00
Thibault Chataigner	bf4a279a91	Remote storage reads based on oldest timestamp in primary storage (#3129 ) Currently all read queries are simply pushed to remote read clients. This is fine, except for remote storage for wich it unefficient and make query slower even if remote read is unnecessary. So we need instead to compare the oldest timestamp in primary/local storage with the query range lower boundary. If the oldest timestamp is older than the mint parameter, then there is no need for remote read. This is an optionnal behavior per remote read client. Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>	2017-10-18 12:08:14 +01:00
Alberto Cortés	6c67296423	config: fix error message for unexpected result of yaml marshal	2017-10-12 19:50:07 +02:00
Alberto Cortés	0f3d8ea075	config: use testutil package	2017-10-12 19:50:07 +02:00
Fabian Reinartz	87918f3097	Merge branch 'master' into dev-2.0	2017-09-04 14:09:21 +02:00
Max Leonard Inden	1c96fbb992	Expose current Prometheus config via /status/config This PR adds the `/status/config` endpoint which exposes the currently loaded Prometheus config. This is the same config that is displayed on `/config` in the UI in YAML format. The response payload looks like such: ``` { "status": "success", "data": { "yaml": <CONFIG> } } ```	2017-08-13 22:21:18 +02:00
Fabian Reinartz	25f3e1c424	Merge branch 'master' into mergemaster	2017-08-10 17:04:25 +02:00
Yuki Ito	1bf3b91ae0	Make sure that url for remote_read/write is not nil (#3024 )	2017-08-07 08:49:45 +01:00
Tom Wilkie	5169f990f9	Review feedback: add yaml struct tags, don't embed queue config. Also, rename QueueManageConfig to QueueConfig, for consistency with tags.	2017-08-01 14:43:56 +01:00
Tom Wilkie	454b661145	Make queue manager configurable.	2017-07-25 13:47:34 +01:00
Fabian Reinartz	dba7586671	Merge branch 'master' into dev-2.0	2017-07-11 17:22:14 +02:00
Fuente, Pablo Andres	902fafb8e7	Fixing tests for Windows Fixing the config/config_test, the discovery/file/file_test and the promql/promql_test tests for Windows. For most of the tests, the fix involved correct handling of path separators. In the case of the promql tests, the issue was related to the removal of the temporal directories used by the storage. The issue is that the RemoveAll() call returns an error when it tries to remove a directory which is not empty, which seems to be true due to some kind of process that is still running after closing the storage. To fix it I added some retries to the remove of the temporal directories. Adding tags file from Universal Ctags to .gitignore	2017-07-09 01:59:30 -03:00
Fabian Reinartz	65b087bcc1	config: resolve file SD paths relative to config	2017-07-04 11:40:26 +02:00
Roman Vynar	dbe2eb2afc	Hide consul token on UI. (#2797 )	2017-06-01 22:14:23 +01:00
Julius Volz	240bb671e2	config: Fix overflow checking in global config (#2783 )	2017-05-30 20:58:06 +02:00
Conor Broderick	6766123f93	Replace regex with Secret type and remarshal config to hide secrets (#2775 )	2017-05-29 12:46:23 +01:00
Brian Akins	27d66628a1	Allow limiting Kubernetes service discover to certain namespaces Allow namespace discovery to be more easily extended in the future by using a struct rather than just a list. Rename fields for kubernetes namespace discovery	2017-04-27 07:41:36 -04:00
Julius Volz	525da88c35	Merge pull request #2479 from YKlausz/consul-tls Adding consul capability to connect via tls	2017-03-20 11:40:18 +01:00

1 2 3

142 Commits (b38b25e9e1efdacc2f0f793ce35a1f7e2848d552)