prometheus

Commit Graph

Author	SHA1	Message	Date
Björn Rabenstein	9c43ac451c	Merge pull request #9129 from PhilipGough/bz-1984365 mixin: Filter instance by selected job for Prometheus overview dashboard	2021-08-13 14:03:16 +02:00
TJ Hoplock	7baf084092	optimize Linode SD by polling for event changes during refresh (#8980) * optimize Linode SD by polling for event changes during refresh Most accounts are fairly "static", in the sense that they're not cycling through instances constantly. So rather than do a full refresh every interval and potentially make several behind-the-scenes paginated API calls, this will now poll the `/account/events/` endpoint every minute with a list of events that we care about. If a matching event is found, we then do a full refresh. Co-authored-by: William Smith <wsmith@linode.com> Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> Signed-off-by: William Smith <wsmith@linode.com>	2021-08-04 12:05:49 +02:00
Philip Gough	751ca03fad	mixin: Filter instance by job for Prometheus overview dashboard Signed-off-by: Philip Gough <philip.p.gough@gmail.com>	2021-07-28 14:34:26 +01:00
Julius Volz	179b2155d1	Fix: Use json.Unmarshal() instead of json.Decoder (#9033) * Fix: Use json.Unmarshal() instead of json.Decoder See https://ahmet.im/blog/golang-json-decoder-pitfalls/ json.Decoder is for JSON streams, not single JSON objects / bodies. Signed-off-by: Julius Volz <julius.volz@gmail.com> * Revert modifications to targetgroup parsing Signed-off-by: Julius Volz <julius.volz@gmail.com>	2021-07-02 09:38:14 +01:00
Ben Kochie	7cb55d5732	Merge pull request #8802 from mwasilew2/yaml-linting Adds yamllinting to Makefile.common	2021-06-24 15:59:35 +02:00
Levi Harrison	4a4882d4c7	Replace godoc.org links Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-17 07:18:51 -04:00
Julien Duchesne	8855c2e626	Add `prometheus_tsdb_clean_start` metric (#8824) Add cleanup of the lockfile when the db is cleanly closed The metric describes the status of the lockfile on startup 0: Already existed 1: Did not exist -1: Disabled Therefore, if the min value over time of this metric is 0, that means that executions have exited uncleanly We can then use that metric to have a much lower threshold on the crashlooping alert: If the metric exists and it has been zero, two restarts is enough to trigger the alarm If it does not exist (old prom version for example), the current five restarts threshold remains Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com> * Change metric name + set unset value to -1 Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com> * Only check the last value of the clean start alert Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com> * Fix test + nit Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>	2021-06-16 15:03:02 +05:30
Michal Wasilewski	3f686cad8b	fixes yamllint errors Signed-off-by: Michal Wasilewski <mwasilewski@gmx.com>	2021-06-12 12:47:47 +02:00
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-11 12:28:36 -04:00
Julien Pivotto	20c6739adc	Merge pull request #8833 from hanjm/feature/add-scape-read-body-limit Add body_size_limit to prevent bad targets response large body cause Prometheus server OOM (#8827)	2021-06-02 09:24:59 +02:00
TJ Hoplock	dc22c65349	Add Linode Service Discovery (#8846) * Add Linode Service Discovery Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>	2021-06-01 20:32:36 +02:00
hanjm	1df05bfd49	Add body_size_limit to prevent bad targets response large body cause Prometheus server OOM (#8827) Signed-off-by: hanjm <hanjinming@outlook.com>	2021-05-29 07:05:42 +08:00
Levi Harrison	2826fbeeb7	SD: Add target creation failure counter and change failure handling (#8786) * Added metric and changed failure/drop strategy Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-05-28 23:50:59 +02:00
Callum Styan	8fd73b1d28	Add Exemplar Remote Write support (#8296) * Write exemplars to the WAL and send them over remote write. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Update example for exemplars, print data in a more obvious format. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Add metrics for remote write of exemplars. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix incorrect slices passed to send in remote write. Signed-off-by: Callum Styan <callumstyan@gmail.com> * We need to unregister the new metrics. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address review comments Signed-off-by: Callum Styan <callumstyan@gmail.com> * Order of exemplar append vs write exemplar to WAL needs to change. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Several fixes to prevent sending uninitialized or incorrect samples with an exemplar. Fix dropping exemplar for missing series. Add tests for queue_manager sending exemplars Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Store both samples and exemplars in the same timeseries buffer to remove the alloc when building final request, keep sub-slices in separate buffers for re-use Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Condense sample/exemplar delivery tests to parameterized sub-tests Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Rename test methods for clarity now that they also handle exemplars Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Rename counter variable. Fix instances where metrics were not updated correctly Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Add exemplars to LoadWAL benchmark Signed-off-by: Callum Styan <callumstyan@gmail.com> * last exemplars timestamp metric needs to convert value to seconds with ms precision Signed-off-by: Callum Styan <callumstyan@gmail.com> * Process exemplar records in a separate go routine when loading the WAL. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address review comments related to clarifying comments and variable names. Also refactor sample/exemplar to enqueue prompb types. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Regenerate types proto with comments, update protoc version again. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Put remote write of exemplars behind a feature flag. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address some of Ganesh's review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Move exemplar remote write feature flag to a config file field. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address Bartek's review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Don't allocate exemplar buffers in queue_manager if we're not going to send exemplars over remote write. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Add ValidateExemplar function, validate exemplars when appending to head and log them all to WAL before adding them to exemplar storage. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address more review comments from Ganesh. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Add exemplar total label length check. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address a few last review comments Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Martin Disibio <mdisibio@gmail.com>	2021-05-06 13:53:52 -07:00
Damien Grisonnet	b50f9c1c84	Add label scrape limits (#8777) * scrape: add label limits per scrape Add three new limits to the scrape configuration to provide some mechanism to defend against unbound number of labels and excessive label lengths. If any of these limits are broken by a sample from a scrape, the whole scrape will fail. For all of these configuration options, a zero value means no limit. The `label_limit` configuration will provide a mechanism to bound the number of labels per-scrape of a certain sample to a user defined limit. This limit will be tested against the sample labels plus the discovery labels, but it will exclude the __name__ from the count since it is a mandatory Prometheus label to which applying constraints isn't meaningful. The `label_name_length_limit` and `label_value_length_limit` will prevent having labels of excessive lengths. These limits also skip the __name__ label for the same reasons as the `label_limit` option and will also make the scrape fail if any sample has a label name/value length that exceed the predefined limits. Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com> * scrape: add metrics and alert to label limits Add three gauge, one for each label limit to easily access the limit set by a certain scrape target. Also add a counter to count the number of targets that exceeded the label limits and thus were dropped. This is useful for the `PrometheusLabelLimitHit` alert that will notify the users that scraping some targets failed because they had samples exceeding the label limits defined in the scrape configuration. Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com> * scrape: apply label limits to __name__ label Apply limits to the __name__ label that was previously skipped and truncate the label names and values in the error messages as they can be very very long. Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com> * scrape: remove label limits gauges and refactor Remove `prometheus_target_scrape_pool_label_limit`, `prometheus_target_scrape_pool_label_name_length_limit`, and `prometheus_target_scrape_pool_label_value_length_limit` as they are not really useful since we don't have the information on the labels in it. Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>	2021-05-06 09:56:21 +01:00
Gezim Sejdiu	97acd170b2	Fix a broken link for the bcrypt ref. at the web-config.yml example Signed-off-by: Gezim Sejdiu <g.sejdiu@gmail.com>	2021-04-20 22:43:37 +02:00
zhangshj	1956f07197	update redirected url Signed-off-by: zhangshj <zhangshj@inspur.com>	2021-04-14 13:54:40 +08:00
Robert Jacob	b253056163	Implement Docker discovery (#8629) * Implement Docker discovery Signed-off-by: Robert Jacob <xperimental@solidproject.de>	2021-03-29 22:30:23 +02:00
Rémy Léone	f690b811c5	add support for scaleway service discovery (#8555) Co-authored-by: Patrik <patrik@ptrk.io> Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Signed-off-by: Rémy Léone <rleone@scaleway.com>	2021-03-10 15:10:17 +01:00
Julien Pivotto	432d5ebc6c	Rename default branch to main Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-02-22 20:28:02 +01:00
Julien Pivotto	8787f0aed7	Update common to support credentials type Most of the backwards compat tests is done in common. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-02-18 23:28:22 +01:00
Tom Wilkie	d479151f1f	Various enhancements and refactorings for remote write receiver: - Remove unrelated changes - Refactor code out of the API module - that is already getting pretty crowded. - Don't track reference for AddFast in remote write. This has the potential to consume unlimited server-side memory if a malicious client pushes a different label set for every series. For now, it's easier and safer to always use the 'slow' path. - Return 400 on out of order samples. - Use remote.DecodeWriteRequest in the remote write adapters. - Put this behind the 'remote-write-server' feature flag - Add some (very) basic docs. - Used named return & add test for commit error propagation Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2021-02-08 20:41:23 +00:00
ravilr	adc8807851	Update remote-write alert rules mixin (#8423) Signed-off-by: ravilr <raviprasad_lr@yahoo.com>	2021-01-31 20:07:49 +00:00
Julien Pivotto	5bd7145e55	Merge pull request #8327 from roidelapluie/tlsexemple https: Add example configuration file	2021-01-15 09:50:52 +01:00
Julien Pivotto	08c259cda6	https: Add example configuration file Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-01-15 01:37:50 +01:00
Frederic Branczyk	62bc755733	mixin: Scope grafana config In its current form this configuration clashes in one of the most widely used configurations (kube-prometheus). This patch scopes the configuration to prevent this. Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>	2020-12-30 17:50:34 +01:00
Nicolas Lamirault	aa1ca13025	Add: Custom tags and prefix in Prometheus Mixin (#8287) * Add: custom tags and prefix Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com> * Fix: fmt Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>	2020-12-16 18:49:06 +01:00
Björn Rabenstein	511511324a	Merge pull request #8235 from Allex1/master Update remote-write grafana mixin	2020-12-08 14:50:47 +01:00
beorn7	553f904f2d	mixin: Add a capability to exclude non-prod AM instances Signed-off-by: beorn7 <beorn@grafana.com>	2020-12-03 20:59:53 +01:00
birca	3ec4161575	Update remote-write grafana mixin Signed-off-by: birca <birca@adobe.com>	2020-12-02 09:50:15 +02:00
beorn7	638e99c814	prometheus-mixin: Make PrometheusRemoteWriteBehind more generic Currently, it relies on `job, instance` being the labels completely identifying a Prometheus instance. However, what's intended is to simply not match on `remote_name, url`. Signed-off-by: beorn7 <beorn@grafana.com>	2020-11-17 13:29:49 +01:00
beorn7	371ca9ff46	prometheus-mixin: add HA-group aware alerts There is certainly a potential to add more of these. This is mostly meant to introduce the concept and cover a few critical parts. Signed-off-by: beorn7 <beorn@grafana.com>	2020-11-11 19:45:34 +01:00
Julien Pivotto	6c56a1faaa	Testify: move to require (#8122) * Testify: move to require Moving testify to require to fail tests early in case of errors. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * More moves Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-29 09:43:23 +00:00
like-inspur	29b551225b	add networking.k8s.io for ingress (#8091) * add networking.k8s.io for ingress level=error ts=2020-10-19T08:32:30.544Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:494: Failed to watch v1beta1.Ingress: failed to list v1beta1.Ingress: ingresses.networking.k8s.io is forbidden: User \"system:serviceaccount:monitoring:prometheus\" cannot list resource \"ingresses\" in API group \"networking.k8s.io\" at the cluster scope" Signed-off-by: root <likerj@inspur.com> * Update rbac-setup.yml Signed-off-by: root <likerj@inspur.com>	2020-10-22 15:08:12 -06:00
Julien Pivotto	4e5b1722b3	Move away from testutil, refactor imports (#8087) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-22 11:00:08 +02:00
Matthias Loibl	13ba013a24	Use absolute jsonnet import paths This should be the way forward when importing libraries in jsonnet. It's closer to how Go imports look and makes it more obvious where packages live. This is not breaking anything, as the old imports were already symlinks to the now directly used directories. Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>	2020-10-20 11:42:30 +02:00
Björn Rabenstein	d49f267f76	Merge pull request #8054 from simonpasquier/improve-not-ingesting-samples-alert documentation/prometheus-mixin: improve PrometheusNotIngestingSamples	2020-10-15 12:29:39 +02:00
Simon Pasquier	f381d8a9bd	documentation/prometheus-mixin: improve PrometheusNotIngestingSamples The alert shouldn't fire when there's no target and no rule configured. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2020-10-15 11:13:17 +02:00
Julien Pivotto	4596abee4d	Mixin: Ignore unset remote write timestamp (#8046) * Mixin: Ignore unset remote write timestamp This pull request ignores the zero value of highest_sent_timestamp_seconds in Highest Timestamp In vs. Highest Timestamp Sent which just show that remote write has not been successful yet. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-15 09:15:59 +02:00
garanews	c38816828f	fix few typo (#8023) Signed-off-by: garanews <puntogtg@tiscali.it>	2020-10-07 16:51:31 +01:00
Luke Chen	3364875ae5	update the doc link in internal_arthitecture.md (#7966) * update the doc link in internal_arthitecture.md * address reviewer's comment to remove out-dated wrapper Signed-off-by: Luke Chen <showuon@gmail.com>	2020-09-24 09:10:41 +01:00
Julien Pivotto	e208afcc95	web: Remove APIv2 (#7935) * web: Remove APIv2 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-09-15 09:30:55 +02:00
kangwoo	7c0d5ae4e7	Add Eureka Service Discovery (#3369) Signed-off-by: kangwoo <kangwoo@gmail.com>	2020-08-26 17:36:59 +02:00
Simon Pasquier	e693af6c01	.circleci/config.yml: check mixins (#6895) * .circleci/config.yml: check mixins Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Run jsonnetfmt Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Install tools in the image instead of using coreos/jsonnet-ci The latter is deprecated Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Update jsonnetfile.json Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2020-08-25 15:59:41 +02:00
Lukas Kämmerling	b6955bf1ca	Add hetzner service discovery (#7822) Signed-off-by: Lukas Kämmerling <lukas.kaemmerling@hetzner-cloud.de>	2020-08-21 15:49:19 +02:00
Julien Pivotto	f482c7bdd7	Add per scrape-config targets limit (#7554) * Add per scrape-config targets limit Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-30 14:20:24 +02:00
Frederic Branczyk	9f9fb1ab33	documentation: Adapt Kubernetes RBAC to use metrics roles (#3661)	2020-07-24 16:36:56 +02:00
Julien Pivotto	48140e5189	Improve docker swarm configuration example Improve to use the unix socket as this is what is enabled by default. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-20 13:42:57 +02:00
Julien Pivotto	be96951c56	Add Docker Swarm configuration example (#7542) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-09 20:05:21 +02:00
John Bampton	98a69b77d1	Fix spelling (#7512) Signed-off-by: John Bampton <jbampton@users.noreply.github.com>	2020-07-04 14:54:26 +02:00
Tom Wilkie	27b1009acd	Rename the dashboard in the mixin to 'Prometheus Overview'. (#7489) Due to https://github.com/grafana/grafana/issues/15642, this prevents users putting this dashboard in a Grafana folder called 'Prometheus'. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2020-06-30 15:45:44 +01:00
Julien Pivotto	c61141ce51	Add DigitalOcean service discovery (#7407) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-06-18 17:04:41 +02:00
Manuel Fontan	6e7554639b	Update Readme since jsonnetfmt is available in the jsonnet go implementation since v0.16.0 Signed-off-by: Manuel Fontan <mfontangarcia@slack-corp.com>	2020-06-16 10:41:58 +01:00
TakumaNakagame	7a541bd9a7	fix document rabbitmq example (#7297) * remove prometheus.io annotations and add scrape_configs Signed-off-by: TakumaNakagame <5129906+TakumaNakagame@users.noreply.github.com>	2020-05-27 11:34:05 +01:00
Bartlomiej Plotka	1d13a2cd2f	Updated different swagger output. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>	2020-04-23 16:52:14 +01:00
Marek Slabicki	8224ddec23	Capitalizing first letter of all log lines (#7043) Signed-off-by: Marek Slabicki <thaniri@gmail.com>	2020-04-11 09:22:18 +01:00
Callum Styan	5400e71b91	Update mixin dashboards and alerts for new remote write label names. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2020-04-08 12:56:00 -07:00
qinng	e31b7b2679	[Doc] Fix wrong description in kubernetes example (#7012) Signed-off-by: guoruyi1 <guoruyi1@xiaomi.com> Co-authored-by: guoruyi1 <guoruyi1@xiaomi.com>	2020-03-20 08:03:43 +00:00
Julien Pivotto	ef63d8d16d	Update vendors Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-02-25 10:33:41 +01:00
Marco Pracucci	1e1785690a	Fix queue in alerts annotation Signed-off-by: Marco Pracucci <marco@pracucci.com>	2020-02-12 12:48:13 +01:00
paulfantom	7321f1d227	documentation/prometheus-mixin: add dependency on grafonnet Signed-off-by: paulfantom <pawel@krupa.net.pl>	2020-01-11 23:18:04 +01:00
Josh Soref	91d76c8023	Spelling (#6517) * spelling: alertmanager Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: attributes Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: autocomplete Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: bootstrap Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: caught Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: chunkenc Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: compaction Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: corrupted Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: deletable Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: expected Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: fine-grained Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: initialized Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: iteration Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: javascript Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: multiple Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: number Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: overlapping Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: possible Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: postings Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: procedure Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: programmatic Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: queuing Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: querier Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: repairing Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: received Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: reproducible Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: retention Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: sample Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: segements Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: semantic Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: software [LICENSE] Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: staging Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: timestamp Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: unfortunately Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: uvarint Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: subsequently Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: ressamples Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>	2020-01-02 15:54:09 +01:00
Callum Styan	f4fb6dc208	Simplify remote write dashboard in mixin. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-11-18 19:58:07 -08:00
beorn7	9c8f9bfa63	Fix the description template for PrometheusRemoteWriteDesiredShards Signed-off-by: beorn7 <beorn@grafana.com>	2019-10-30 13:27:37 +01:00
Björn Rabenstein	7c039a6b3b	Merge pull request #6242 from prometheus/beorn7/mixin Fix PrometheusRemoteWriteDesiredShards	2019-10-29 16:01:09 +01:00
Benoit Gagnon	6d931a2195	Fix Windows support for custom-sd adapter (#6217) * add test to custom-sd/adapter writeOutput() function Signed-off-by: Benoit Gagnon <benoit.gagnon@ubisoft.com> * fix Adapter.writeOutput() function to work on Windows On that platform, files cannot be moved while a process holds a handle to them. Added an explicit Close() before that move. With this change, the unit test succeeds. Signed-off-by: Benoit Gagnon <benoit.gagnon@ubisoft.com> * add missing dot to comment Signed-off-by: Benoit Gagnon <benoit.gagnon@ubisoft.com>	2019-10-29 10:41:31 +01:00
beorn7	61617eb2d9	Fix PrometheusRemoteWriteDesiredShards This rule has the same labels on both sides. We don't want `group_right` and `on`, we want nothing. Signed-off-by: beorn7 <beorn@grafana.com>	2019-10-29 00:23:39 +01:00
Callum Styan	da6d46625f	Repeat shards panels on the queue label. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-10-21 11:03:50 -07:00
Callum Styan	818974ff8f	Rewrite remote write dashboard using base grafonnet. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-10-17 15:40:58 -07:00
Callum Styan	81fa63006c	Add additional shards/segment graphs to remote write dashboard. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-10-09 09:59:02 -07:00
Simon Pasquier	e36ab7e192	prometheus-mixin: improve description of sample alerts (#6050) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-09-24 17:44:27 +02:00
Björn Rabenstein	3b3eaf3496	Merge pull request #5787 from cstyan/reshard-max-logging Add metrics for max/min/desired shards to queue manager.	2019-09-09 22:32:54 +02:00
Callum Styan	a98599bea8	Update remote write max shards alert; properly template/query for max shards in description. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-09-09 12:01:11 -07:00
李国忠	d89e783217	[bugfix] custom SD: when ip out of order, reflect.deepEqual can not correctly identify whether there is a change (#5856) * [bugfix] custom SD: when ip out of order, reflect.deepEqual can not correctly identify whether there is a change Signed-off-by: fuling <fuling.lgz@alibaba-inc.com> * [format] makefile:Makefile.common:116: common-style Signed-off-by: fuling <fuling.lgz@alibaba-inc.com> * [bugfix] custom sd: simonpasquier comment,It would be simpler to sort the targets alphabetically and keep reflect.DeepEqual. Signed-off-by: fuling <fuling.lgz@alibaba-inc.com> * [bugfix]custom SD:fix sort Signed-off-by: fuling <fuling.lgz@alibaba-inc.com> * [bugfix] custom SD : adapter.go need an empty line after "sort" Signed-off-by: fuling <fuling.lgz@alibaba-inc.com> * [bugfix]custom SD:test sign-off Signed-off-by: fuling <fuling.lgz@alibaba-inc.com> * [bugfix]custom SD: fix adaper_test.go Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>	2019-08-22 11:49:45 +02:00
Ganesh Vernekar	5ecef3542d	Cleanup after merging tsdb into prometheus Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2019-08-13 14:04:14 +05:30
Callum Styan	3b75614892	Add a warning alert, since the remote write behind alert will probably already be going off, about desired shards being higher than max shards. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-08-08 06:45:46 -07:00
Simon Pasquier	dd174963a2	prometheus-mixin: remove PrometheusTSDBWALCorruptions The counter is only increased when tsdb.Open() is called which Prometheus does only once in its lifetime (when it initializes). If the corruption can't be recovered, tsdb.Open() returns an error and Prometheus exits. Hence the metric is either 0 (no corruption) or 1 (corruption detected and repaired). If the latter, the alert isn't actionable and the only way to resolve it is to restart Prometheus which would reset the counter. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-08-06 14:36:56 +02:00
Vadym Martsynovskyy	a9970a47ef	Fix incorrect examples in docs Signed-off-by: Vadym Martsynovskyy <vmartsynovskyy@gmail.com>	2019-08-04 16:42:42 -07:00
Matthias Loibl	20d12ff1c7	Fix prometheus-mixin dashboards to use grafanaDashboards Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>	2019-07-11 15:40:26 +02:00
beorn7	4825585834	Tweak tenses Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-28 17:37:49 +02:00
beorn7	9a2177949d	Protect gauge-based alerts against failed scrapes Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-28 16:46:19 +02:00
beorn7	52707535b8	Remove/improve unused variables and weird doc comments Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-28 15:41:31 +02:00
beorn7	7a25a2586d	Sync with alerts from kube-prometheus While doing so, re-introduce the summary/description annotations. Also, add a few more rules and tweak a few of the existing ones. Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-27 23:50:26 +02:00
beorn7	ded0705bdc	Update remote repo for grafana-builder dependency Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-27 14:39:38 +02:00
beorn7	1336a28848	Use a config variable for the Prometheus name Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-27 14:34:11 +02:00
beorn7	613cb5430c	Add a "work in progress" disclaimer. Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 23:24:22 +02:00
beorn7	e34af6d4d3	Address various comments from the review Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 23:22:16 +02:00
beorn7	23c03207e9	Fixed indentation Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 20:31:05 +02:00
beorn7	d5845ad05b	Fix formatting This is the outcome of `make fmt`. Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 16:23:25 +02:00
beorn7	d45e8a0f61	Adjust to jsonnet v0.13 Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 16:22:21 +02:00
beorn7	5c04ef3935	Make README.md immediately useful Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 16:12:59 +02:00
beorn7	ddfabda152	Add Makefile and suitable jsonnet files This makes the mixins usable as advertised. Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 15:30:55 +02:00
beorn7	e943803a3c	Add .gitignore file Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-26 15:22:23 +02:00
Björn Rabenstein	498d31e178	Merge pull request #5681 from prometheus/beorn7/mixin Merge master into mixin	2019-06-19 23:17:41 +02:00
Callum Styan	a5762f3681	Add dashboard for remote write to prometheus-mixin. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-06-17 15:02:42 -07:00
beorn7	5639aaf0a4	Merge branch 'master' into mixin	2019-06-17 13:07:11 +02:00
Romain Baugue	95193fa027	Exhaust every request body before closing it (#5166) (#5479) From the documentation: > The default HTTP client's Transport may not > reuse HTTP/1.x "keep-alive" TCP connections if the Body is > not read to completion and closed. This effectively enable keep-alive for the fixed requests. Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>	2019-04-18 09:50:37 +01:00
qinng	cc75c27580	Fix multiple response.WriteHeader calls error in remote read adapter (#5159) * fix multiple response.WriteHeader calls in remote read adapter * remove useless return Signed-off-by: qinng <guoruyi1@xiaomi.com>	2019-04-10 13:25:35 +01:00
Tariq Ibrahim	8fdfa8abea	refine error handling in prometheus (#5388) i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors. ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives. iii) Does away with the use of fmt package for errors in favour of pkg/errors Signed-off-by: tariqibrahim <tariq181290@gmail.com>	2019-03-26 00:01:12 +01:00
Tom Wilkie	38a9bbbec2	Loosen off PrometheusRemoteWriteBehind alert. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2019-03-04 12:47:24 +00:00
Tom Wilkie	b615069289	Update metric names. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2019-03-01 07:39:48 -08:00
LongKB	23480bef43	Remove the duplicated words (#5251) Although it is spelling mistakes, it might make an affects while reading. Co-Authored-By: Nguyen Phuong An <AnNP@vn.fujitsu.com> Signed-off-by: Kim Bao Long <longkb@vn.fujitsu.com>	2019-02-22 14:32:34 +01:00
Nguyen Hai Truong	5fbda4c9d7	Secure http links (#5244) Fix http link to https link for secure, modify http to https in the links of project. Have some http links doesn't redirect into https. Co-Authored-By: Nguyen Van Trung trungnv@vn.fujitsu.com Signed-off-by: Nguyen Hai Truong <truongnh@vn.fujitsu.com>	2019-02-21 10:48:47 +01:00
Kim Bao Long	94f5352951	Trivial fix: Fix some typos in comments Co-Authored-By: Nguyen Phuong An <AnNP@vn.fujitsu.com> Signed-off-by: Kim Bao Long <longkb@vn.fujitsu.com>	2019-02-21 09:07:49 +07:00
Tom Wilkie	e248ffb220	Add alert for WAL remote write falling behind. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2019-02-12 15:22:58 +00:00
Callum Styan	5358f76c5c	update remote write path proto so that Labels/Timeseries can't be nil (#4957 ) Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-01-15 19:13:39 +00:00
Simon Pasquier	375ad1185c	*: bump gRPC dependencies (#5075 ) *: bump gRPC dependencies This change updates the gRPC dependencies to more recent versions: github.com/gogo/protobuf => v1.2.0 * github.com/grpc-ecosystem/grpc-gateway => v1.6.3 * google.golang.org/grpc => v1.17.0 In addition scripts/genproto.sh leverages Go modules information instead of hardcoding SHA1 commits. This ensures that the code is generated from the exact same sources. Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Run 'make proto' in CI Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Revert tabs -> spaces change Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Fix 'make proto' step Signed-off-by: Simon Pasquier <spasquie@redhat.com> * 'go get' grpc/protobuf dependencies Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Prepopulate cache with go mod download Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-01-15 15:32:05 +01:00
Simon Pasquier	f678e27eb6	*: use latest release of staticcheck (#5057 ) *: use latest release of staticcheck It also fixes a couple of things in the code flagged by the additional checks. Signed-off-by: Simon Pasquier <spasquie@redhat.com> Use official release of staticcheck Also run 'go list' before staticcheck to avoid failures when downloading packages. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-01-04 14:47:38 +01:00
Thomas J. Fox	11a93b2b37	fix link in docs/examples/k8s-rabbitmq readme (#4967 ) Signed-off-by: Thomas J. Fox <thomasjfox1@gmail.com>	2018-12-18 11:53:55 +01:00
Yaqiang Wang	8b85d876f2	fix file_sd never stop update 'custom_sd.json' file in adapter.go (#4567 ) Signed-off-by: wangyaqiang1 <wangyaqiang1@jd.com>	2018-11-30 10:32:17 +01:00
Ben Kochie	c6399296dc	Fix spelling/typos (#4921 ) * Fix spelling/typos Fix spelling/typos reported by codespell/misspell. * UK -> US spelling changes. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-11-27 17:44:29 +01:00
Alex Yu	5dcce32ef8	update promlog to latest version (#4876 ) * update promlog to latest version Signed-off-by: Alex Yu <yu.alex96@gmail.com> * Update api tests, fix main setup Signed-off-by: Alex Yu <yu.alex96@gmail.com> * tidy go.sum Signed-off-by: Alex Yu <yu.alex96@gmail.com> * revendor prometheus/common Signed-off-by: Alex Yu <yu.alex96@gmail.com> * only initialize config; use kingpin for remote_storage_adapter Signed-off-by: Alex Yu <yu.alex96@gmail.com> * actually parse the flags Signed-off-by: Alex Yu <yu.alex96@gmail.com> * clean up imports Signed-off-by: Alex Yu <yu.alex96@gmail.com>	2018-11-23 14:22:40 +01:00
Tom Wilkie	638204c775	Typo Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-19 12:23:42 +00:00
Simon Pasquier	ed19373a78	*: remove use of golang.org/x/net/context (#4869 ) *: remove use of golang.org/x/net/context Signed-off-by: Simon Pasquier <spasquie@redhat.com> scrape: fix TestTargetScrapeScrapeCancel Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-11-19 12:31:16 +01:00
Tom Wilkie	8f42192e52	Add Prometheus alerts from kube-prometheus, remove the alertmanager alerts. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-19 11:22:55 +00:00
Tom Wilkie	dfbdf8d3bb	Add a basic readme with link to the mixin docs. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-16 17:23:14 +00:00
Tom Wilkie	5fd712b210	copypasta. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-16 17:17:47 +00:00
Tom Wilkie	50861d586a	Alert if more than 1% of alerts fail for a given integration. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-16 17:17:47 +00:00
Tom Wilkie	266ba185fe	Remove PromScrapeFailed alert. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-16 17:17:47 +00:00
Tom Wilkie	e8a8ce5654	Basic Prometheus dashboard. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-16 17:17:47 +00:00
Tom Wilkie	ee1427faad	Prometheus monitoring mixin for Prometheus itself. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-11-16 17:17:47 +00:00
Timo Beckers	b12ed54f95	documentation - add marathon-sd example configuration Signed-off-by: Timo Beckers <timo@incline.eu>	2018-10-25 18:02:59 +02:00
Brian Pandola	3241c527d0	Fix typo (#4760 ) Signed-off-by: Brian Pandola <bpandola@hsdp.io>	2018-10-18 21:19:21 +01:00
Simon Pasquier	07152ecc48	Merge pull request #4575 from Nexucis/bugfix/fix-unregistered-source [ServiceDiscovery] Unregister source when the target is empty	2018-09-27 09:12:01 +02:00
Augustin Husson	f60620ec0b	format comment Signed-off-by: Augustin Husson <husson.augustin@gmail.com>	2018-09-26 10:48:35 +02:00
Tom Wilkie	d3a1ff1abf	Reduce memory usage of remote read by reducing pointer usage. (#4655 ) Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	2018-09-25 19:14:00 +01:00
Augustin Husson	9e6dc6f96c	fix targetGroup that disappear totally Signed-off-by: Augustin Husson <husson.augustin@gmail.com>	2018-09-25 19:05:02 +02:00
Augustin Husson	3c0b130e5e	apply review Signed-off-by: Augustin Husson <husson.augustin@gmail.com>	2018-09-18 11:08:38 +02:00
beorn7	4fb59d1e61	Remove use of deprecated prometheus.Handler Signed-off-by: beorn7 <beorn@soundcloud.com>	2018-09-17 13:05:43 +02:00
Augustin Husson	e03869de76	add unit test and isolate the method that generate the target Signed-off-by: Augustin Husson <husson.augustin@gmail.com>	2018-09-16 23:50:10 +02:00
Augustin Husson	97950a3fae	remove group if the target is empty at adapter level Signed-off-by: Augustin Husson <husson.augustin@gmail.com>	2018-09-16 14:38:21 +02:00
Julius Volz	8fbe1b5133	Handle a bunch of unchecked errors (#4461 ) There are many more (mostly finalizers like Close/Stop/etc.), but most of the others seemed like one couldn't do much about them anyway. Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-08-17 17:24:35 +02:00
Harsh Agarwal	6a464ae174	expose log.level for promlog for remote_storage_adapter (#4195 ) * expose log.level for promlog for remote_storage_adapter Signed-off-by: sipian <cs15btech11019@iith.ac.in> * replace flag description Signed-off-by: Harsh Agarwal <cs15btech11019@iith.ac.in>	2018-07-22 16:11:38 +05:30
Julius Volz	d8153ac5d5	Update internal architecture diagram (#4398 ) Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-07-18 22:10:23 +02:00
Julius Volz	a215aed9b6	Document internal Prometheus server architecture (#4295 ) * Document internal Prometheus server architecture Signed-off-by: Julius Volz <julius.volz@gmail.com> * Review fixups Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-07-18 10:06:41 +02:00
Peter Gallerani	a9d5034add	Fix missing 'msg' in remote storage adapter main.go .Log info message (#4377 ) Signed-off-by: Peter Gallerani <peter.gallerani@gmail.com>	2018-07-12 20:54:21 +02:00
Callum Styan	d0ee4da932	fix minor issues in custom SD example (#4278 ) Signed-off-by: Callum Styan <callumstyan@gmail.com>	2018-06-18 16:08:02 +01:00
Callum Styan	03578d5df8	add example usage of SD adapter for converting unsupported SD type to filesd (#3720 ) Signed-off-by: Callum Styan <callumstyan@gmail.com>	2018-05-30 13:14:34 +01:00
Bartek Plotka	03a9e7f72e	example: Commented out annotation examples as they are meant only for example not as an idiomatic way of relabelling. Signed-off-by: Bartek Plotka <bwplotka@gmail.com>	2018-05-02 13:42:23 +01:00
Paul Gier	85a3c974b7	minor yaml indentation consistency fix in example configs (#3946 )	2018-03-11 23:06:13 +00:00
ferhat elmas	ffa673f7d8	General simplifications (#3887 ) Another try as in #1516	2018-02-26 07:58:10 +00:00
Goutham Veeramachaneni	3de10e3b44	Add CleanTombstones API endpoint Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>	2017-11-30 19:51:44 +05:30
root	80e5867a87	Fixed RBAC Example, added ingress privileges; @brian-brazil	2017-11-21 11:04:07 +01:00
Matthew Pound	e6dcc72f9a	Fix instructions on updating prometheus.yml for Remote Write Adapter Example (#3422 )	2017-11-07 20:02:43 +00:00
James Turnbull	4db6592d01	Removing external_labels from example conf file (#3409 ) It's unclear why this is in the example configuration file. Probably best to keep that super simple, c.f. https://github.com/prometheus/docs/pull/895#discussion_r148924390	2017-11-06 16:11:04 +00:00
David	e3b926c03b	Fix typo in AM config field static_configs (#3415 ) * typo in prometheus.yml field causes prometheus to throw an error Fixes #3414	2017-11-06 09:46:09 +00:00
Krasi Georgiev	5d8f93a22a	now using only github.com/gogo/protobuf bumped all grpc-gateway packages to v1.2.2 updated and run the denproto.sh script	2017-11-02 11:31:57 +00:00
Julien Pivotto	3382f39046	Explicitely add alertmanager to example config (#3383 ) As alertmanager needs to be configured in the config file in Prometheus 2, I think it is useful to have it in the example config. Also renamed the rules in the example config so they are explicitely yml files.	2017-10-31 22:02:08 +00:00
Julius Volz	099df0c5f0	Migrate "golang.org/x/net/context" -> "context" (#3333 ) In some places, where ctxhttp or gRPC are concerned, we still need to use the old contexts.	2017-10-24 21:21:42 -07:00
Fabian Reinartz	abf7c975c9	Merge branch 'master' into dev-2.0	2017-10-07 13:37:21 +02:00
Jack Neely	128b31d058	Log failure to send NaN values to remote store as Debug (#3235 ) This was a warning and can be a frequent occurrence. Let's not fill up logs unless we are asked to.	2017-10-06 11:22:55 +01:00
Goutham Veeramachaneni	3f0267c548	Merge branch 'dev-2.0' into go-kit/log Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>	2017-09-15 23:15:27 +05:30
Goutham Veeramachaneni	f5aed810f9	logging: Port to common/promlog Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>	2017-09-15 12:40:50 +05:30
Fabian Reinartz	e746282772	Merge branch 'master' into dev-2.0	2017-09-11 10:55:19 +02:00
Fabian Reinartz	d21f149745	*: migrate to go-kit/log	2017-09-08 22:01:51 +05:30
Fabian Reinartz	87918f3097	Merge branch 'master' into dev-2.0	2017-09-04 14:09:21 +02:00
Johannes 'fish' Ziemke	70f3d1e9f9	k8s: Support discovery of ingresses (#3111 ) * k8s: Support discovery of ingresses * Move additional labels below allocation This makes it more obvious why the additional elements are allocated. Also fix allocation for node where we only set a single label. * k8s: Remove port from ingress discovery * k8s: Add comment to ingress discovery example	2017-09-04 13:10:44 +02:00
Julius Volz	aa5cdcb11e	Remove extra space in log output	2017-08-29 15:24:00 +02:00
gdmello	35c952e344	Added logging for remote storage adapter (#3106 ) * Added logging for remote storage adapter on startup and on any error condition during /read or /write. * CR feedback.	2017-08-29 15:22:56 +02:00
Fabian Reinartz	25f3e1c424	Merge branch 'master' into mergemaster	2017-08-10 17:04:25 +02:00
Felicity	f30b10223a	documentation: update Kubernetes example for 1.7 (#2918 ) Kubernetes 1.7+ no longer exposes cAdvisor metrics on the Kubelet metrics endpoint. Update the example configuration to scrape cAdvisor in addition to Kubelet. The provided configuration works for 1.7.3+ and commented notes are given for 1.7.2 and earlier versions. Also remove the comment about node (Kubelet) CA not matching the master CA. Since the example no longer connects directly to the nodes, it doesn't matter what CA they're using. References: - https://github.com/kubernetes/kubernetes/issues/48483 - https://github.com/kubernetes/kubernetes/pull/49079	2017-07-21 14:10:02 +02:00
Tom Wilkie	cf105f9d57	Update example remote adapters for change in proto location.	2017-07-19 16:39:02 +01:00
Fabian Reinartz	32226e30f5	Guard reload and quit endpoints by flag	2017-07-11 14:25:07 +02:00
Fabian Reinartz	ccf9e62972	*: add admin grpc API	2017-07-10 09:14:14 +02:00
Julius Volz	e0f046396a	Fix InfluxDB retention policy usage in read adapter (#2781 )	2017-05-29 16:24:24 +02:00
Tom Wilkie	3141a6b36b	Compress remote storage requests and responses with unframed/raw snappy. (#2696 ) * Compress remote storage requests and responses with unframed/raw snappy, for compatibility with other languages. * Remove backwards compatibility code from remote_storage_adapter, update example_write_adapter * Add /documentation/examples/remote_storage/example_write_adapter/example_writer_adapter to .gitignore	2017-05-10 16:42:59 +02:00
Jorrit Salverda	14d0604aba	Kubernetes config scrape node via api proxy (#2641 ) * scrape kubelet metrics via api node proxy * add manifests to setup serviceaccount, clusterrole and clusterrolebinding to work with rbac * removed .cluster.local and added newline to address comments	2017-05-09 13:57:49 +02:00
Svend Sorensen	94a3e863e4	Document what ports are scraped by default in k8s example The Kubernetes pod SD creates a target for each declared port, as documented: https://prometheus.io/docs/operating/configuration/#pod > The pod role discovers all pods and exposes their containers as targets. For > each declared port of a container, a single target is generated. If a > container has no specified ports, a port-free target per container is created > for manually adding a port via relabeling. This results in the default port being the declared port, or no port if none are declared.	2017-05-01 15:58:48 -07:00
Brian Brazil	0e0fc5a7f4	Correct example name to adapter. (#2590 )	2017-04-10 17:24:53 +01:00
Brian Brazil	c813c824d4	Separate out remote read responses. Fixes #2574	2017-04-06 15:49:47 +01:00
Julius Volz	3581057ea4	Update remote storage bridge README.md	2017-04-03 01:42:49 +02:00
Julius Volz	b391cbb808	Add InfluxDB read-back support to remote storage bridge	2017-04-03 01:42:43 +02:00
Julius Volz	b5b0e00923	Merge pull request #2499 from prometheus/remote-read Remote Read	2017-03-27 14:43:44 +02:00
Julius Volz	428e1ad42c	Remove PromDash from architecture diagram	2017-03-23 13:11:05 +01:00
Julius Volz	815762a4ad	Move retrieval.NewHTTPClient -> httputil.NewClientFromConfig	2017-03-20 14:17:04 +01:00
Stephen Soltesz	3f29324e04	Fix kubernetes host:port relabel regex. This change corrects a bug introduced by PR https://github.com/prometheus/prometheus/pull/2427 The regex uses three groups: the hostname, an optional port, and the prefered port from a kubernetes annotation. Previously, the second group should have been ignored if a :port was not present in the input. However, making the port group optional with the "?" had the unintended side-effect of allowing the hostname regex "(.+)" to match greedily, which included the ":port" patterns up to the ";" separating the hostname from the kubernetes port annotation. This change updates the regex for the hostname to match any non-":" characters. This forces the regex to stop if a ":port" is present and allow the second group to match the optional port.	2017-02-16 14:46:04 -05:00
Stephen Soltesz	0b1790ee44	Match addresses with or without declared ports. This change updates port relabeling for pod and service discovery so the relabeling regex matches addresses with or without declared ports. As well, this change uses a consistent style in the replacement pattern for the two expressions. Previously, for both services or pods that did not have declared ports, the relabel config regex would fail to match: __meta_kubernetes_service_annotation_prometheus_io_port regex: (.+)(?::\d+);(\d+) __meta_kubernetes_pod_annotation_prometheus_io_port regex: (.+):(?:\d+);(\d+) Both regexes expected a <host>:<port> pattern. The new regex matches addresses with or without declared ports by making the :<port> pattern optional. __meta_kubernetes_service_annotation_prometheus_io_port __meta_kubernetes_pod_annotation_prometheus_io_port regex: (.+)(?::\d+)?;(\d+)	2017-02-14 20:12:38 -05:00
Julius Volz	beb3c4b389	Remove legacy remote storage implementations This removes legacy support for specific remote storage systems in favor of only offering the generic remote write protocol. An example bridge application that translates from the generic protocol to each of those legacy backends is still provided at: documentation/examples/remote_storage/remote_storage_bridge See also https://github.com/prometheus/prometheus/issues/10 The next step in the plan is to re-add support for multiple remote storages.	2017-02-14 17:52:05 +01:00
Svend Sorensen	3a96d0e267	Kubernetes SD: Fix namespace meta label Replace one more instance of `__meta_kubernetes_service_namespace` with `__meta_kubernetes_namespace`.	2017-02-06 13:28:12 -08:00
Julius Volz	b16371595d	Add standalone remote storage bridge example In preparation for removing specific remote storage implementations, this offers an example of how to achieve the same in a separate process. Rather than having three separate bridges for OpenTSDB, InfluxDB, and Graphite, I decided to support all in one binary. For now, this is in the example documenation directory, but perhaps we will want to make a first-class project / repository out of it.	2017-02-01 13:22:41 +01:00
beorn7	5770d9e545	Kubernetes SD: More fixes to example config - Avoid mentioning the `in_cluster` option. (It doesn't exist anymore.) - Replace `__meta_kubernetes_service_namespace` and `__meta_kubernetes_pod_namespace` (which don't exist anymore) by `__meta_kubernetes_namespace`.	2016-11-29 18:42:35 +01:00
gambrose	52c762e9f1	The defaults stated in the example config where wrong (#2110 ) * The stated defaults where wrong * Update prometheus.yml	2016-11-21 09:53:59 +01:00
Jimmi Dyson	473dd5b89a	Kubernetes SD: Add endpoints role to API servers job to actually discover some API servers	2016-11-10 09:46:36 +00:00
Jimmi Dyson	da23543f29	Kubernetes SD: Update example config to use endpoints role for API server discovery	2016-11-02 20:48:01 +00:00
Jimmi Dyson	4d37dca669	Kubernetes SD: Update config for discovery in 1.3	2016-11-02 15:06:20 +00:00
Julius Volz	b5163351bf	Simplify and fix remote write example After removing gRPC, this can be simplified again. Also, the configuration for the remote storage moved from flags to the config file.	2016-10-05 17:53:01 +02:00
Tom Wilkie	d83879210c	Switch back to protos over HTTP, instead of GRPC. My aim is to support the new grpc generic write path in Frankenstein. On the surface this seems easy - however I've hit a number of problems that make me think it might be better to not use grpc just yet. The explanation of the problems requires a little background. At weave, traffic to frankenstein need to go through a couple of services first, for SSL and to be authenticated. So traffic goes: internet -> frontend -> authfe -> frankenstein - The frontend is Nginx, and adds/removes SSL. Its done this way for legacy reasons, so the certs can be managed in one place, although eventually we imagine we'll merge it with authfe. All traffic from frontend is sent to authfe. - Authfe checks the auth tokens / cookie etc and then picks the service to forward the RPC to. - Frankenstein accepts the reads and does the right thing with them. First problem I hit was Nginx won't proxy http2 requests - it can accept them, but all calls downstream are http1 (see https://trac.nginx.org/nginx/ticket/923). This wasn't such a big deal, so it now looks like: internet --(grpc/http2)--> frontend --(grpc/http1)--> authfe --(grpc/http1)--> frankenstein Next problem was golang grpc server won't accept http1 requests (see https://groups.google.com/forum/#!topic/grpc-io/JnjCYGPMUms). It is possible to link a grpc server in with a normal go http mux, as long as the mux server is serving over SSL, as the golang http client & server won't do http2 over anything other than an SSL connection. This would require making all our service to service comms SSL. So I had a go a writing a grpc http1 server, and got pretty far. But is was a bit of a mess. So finally I thought I'd make a separate grpc frontend for this, running in parallel with the frontend/authfe combo on a different port - and first up I'd need a grpc reverse proxy. Ideally we'd have some nice, generic reverse proxy that only knew about a map from service names -> downstream service, and didn't need to decode & re-encode every request as it went through. It seems like this can't be done with golang's grpc library - see https://github.com/mwitkow/grpc-proxy/issues/1. And then I was surprised to find you can't do grpc from browsers! See http://www.grpc.io/faq/ - not important to us, but I'm starting to question why we decided to use grpc in the first place? It would seem we could have most of the benefits of grpc with protos over HTTP, and this wouldn't preclude moving to grpc when its a bit more mature? In fact, the grcp FAQ even admits as much: > Why is gRPC better than any binary blob over HTTP/2? > This is largely what gRPC is on the wire.	2016-09-15 23:21:54 +01:00
Julius Volz	aa3f2b7216	Generic write cleanups and changes. - fold metric name into labels - return initialization errors back to main - add snappy compression - better context handling - pre-allocation of labels - remove generic naming - other cleanups	2016-08-30 17:24:48 +02:00
Brian Brazil	36d2c4bd0b	Add generic write path using grpc. This uses a new proto format, with scope for multiple samples per timeseries in future. This will allow users to pump samples out to whatever they like without having to change the core Prometheus code. There's also an example receiver to save users figuring out the boilerplate themselves.	2016-08-30 17:19:18 +02:00
Fabian Reinartz	9a269b5507	Clarify comment on rule evaluation Fixes #1866	2016-08-03 08:29:51 +02:00
Audun Fauchald Strand	50e044bb00	added path to pods scrape job	2016-07-27 15:13:53 +02:00
William Stewart	f97cd29e47	Drop '__meta_kubernetes_role' since we have role in the config	2016-07-21 15:46:14 +02:00
William Stewart	599fafd2aa	Add node job	2016-07-21 15:45:42 +02:00
William Martin Stewart	58a3771e49	Add roles to prometheus kubernetes example Needed with Prometheus 1.0	2016-07-21 13:16:23 +02:00
Jimmi Dyson	5733de0dfe	Kubernetes SD: Update example config with TLS options	2016-06-27 14:38:51 +01:00
beorn7	44aa7ec46d	doc: Update scrape config in example prometheus.yml	2016-06-14 09:57:03 +02:00
Pieter Lange	427b322078	Minor typo	2016-05-24 11:12:42 +02:00
Patrick Bogen	ae413704e8	kubernetes pod-level discovery	2016-05-18 17:18:52 -07:00
Julius Volz	657d65d6d6	Remove invalid scrape timeout from example config. It can't be greater than the scrape interval. Let's just remove it.	2016-02-24 21:06:36 +01:00
Julius Volz	e3baa35e9f	Fix typo in documentation/examples/kubernetes-rabbitmq/README.md	2016-02-08 02:00:10 +01:00


388 Commits (a574335d6bd09f301c94a909200fce0ae940db8b)