prometheus

Commit Graph

Author	SHA1	Message	Date
Julius Volz	1a2c645dfa	Correctly handle error unwrapping in rules and remote write receiver errors.Unwrap() actually dangerously returns nil if the error does not have an Unwrap() method, which is the case in at least one of these places where I noticed that no error was being logged at all when it should have. Signed-off-by: Julius Volz <julius.volz@gmail.com>	2 years ago
Dimitar Dimitrov	03ab8dcca0	Add comments on EvalTimestamp Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>	2 years ago
Ganesh Vernekar	46b26c4f09	Fix notifier relabel changing the labels of active alerts (#11427 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Dimitar Dimitrov	3fb881af26	Simplify rule group's EvalTimestamp formula I found it hard to understand how EvalTimestamp works, so I wanted to simplify the math there. This PR should be a noop. Current formula is: ``` offset = g.hash % g.interval adjNow = startTime - offset base = adjNow - (adjNow % g.interval) EvalTimestamp = base + offset ``` I simplify `EvalTimestamp` ``` EvalTimestamp = base + offset # expand base = adjNow - (adjNow % g.interval) + offset # expand adjNow = startTime - offset - ((startTime - offset) % g.interval) + offset # cancel out offset = startTime - ((startTime - offset) % g.interval) # expand A+B (mod M) = (A (mod M) + B (mod M)) (mod M) = startTime - (startTime % g.interval - offset % g.interval) % g.interval # expand offset = startTime - (startTime % g.interval - ((g.hash % g.interval) % g.interval)) % g.interval # remove redundant mod g.interval = startTime - (startTime % g.interval - g.hash % g.interval) % g.interval # simplify (A (mod M) + B (mod M)) (mod M) = A+B (mod M) = startTime - (startTime - g.hash) % g.interval offset = (startTime - g.hash) % g.interval EvalTimestamp = startTime - offset ``` Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>	2 years ago
Bryan Boreham	8297f5cb6b	rules: in tests use labels.FromStrings And a number of `EmptyLabels()` instead of `nil`. Replacing code which assumes the internal structure of `Labels`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Cosrider	bef6556ca5	delete redundant alias (#11180 ) Signed-off-by: Cosrider <cosrider7@gmail.com> Signed-off-by: Cosrider <cosrider7@gmail.com>	2 years ago
Bryan Boreham	8b863c42dd	Optimise relabeling by re-using memory (#11147 ) * model/relabel: Add benchmark Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * model/relabel: re-use Builder across relabels Saves memory allocations. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * labels.Builder: allow re-use of result slice This reduces memory allocations where the caller has a suitable slice available. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * model/relabel: re-use source values slice To reduce memory allocations. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Unwind one change causing test failures Restore original behaviour in PopulateLabels, where we must not overwrite the input set. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * relabel: simplify values optimisation Use a stack-based array for up to 16 source labels, which will be the vast majority of cases. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * lint Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Jimmie Han	a5fea2cdd0	Use atomic field avoid (AlertingRule).mtx wait when template expanding (#10858 ) Use atomic field avoid (*AlertingRule).mtx wait when template expanding (#10703) Signed-off-by: hanjm <hanjinming@outlook.com>	2 years ago
Matthieu MOREL	ddfa9a7cc5	refactor (rules): move from github.com/pkg/errors to 'errors' and 'fmt' (#10855 ) * refactor (rules): move from github.com/pkg/errors to 'errors' and 'fmt' Signed-off-by: Matthieu MOREL <mmorel-35@users.noreply.github.com>	2 years ago
Julien Pivotto	3a56817a30	Rules: set otel status to ERROR when a rule fails (#10745 ) Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	3 years ago
Julien Pivotto	0d94cdf107	rules: remove classic UI code (#10730 ) Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	3 years ago
Łukasz Mierzwa	d3c9c4f574	Stop rule manager before TSDB is stopped (#10680 ) During shutdown TSDB is stopped before rule manager is stopped. Since TSDB shutdown can take a long time (minutes or 10s of minutes) it keeps rule manager running while parts of Prometheus are already stopped (most notebly scrape manager). This can cause false positive alerts to fire, mostly those that rely on absent() calls since new sample appends will stop while alert queries are still evaluated. Stop rules before stopping TSDB and scrape manager to avoid this problem. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	3 years ago
Matthieu MOREL	e2ede285a2	refactor: move from io/ioutil to io and os packages (#10528 ) * refactor: move from io/ioutil to io and os packages * use fs.DirEntry instead of os.FileInfo after os.ReadDir Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>	3 years ago
Wilbert Guo	83a2e52bc2	Add SyncForState Implementation for Ruler HA (#10070 ) * continuously syncing activeAt for alerts Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * add import Signed-off-by: Yijie Qin <qinyijie@amazon.com> Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Refactor SyncForState and add unit tests Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Format code Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * Add hook for syncForState Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go lint Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Refactor syncForState override implementation Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Add syncForState override func as argument to Update() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix go formatting Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Fix circleci test errors Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> Remove overrideFunc as argument to run() Signed-off-by: Wilbert Guo <wilbeguo@amazon.com> * remove the syncForState Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use the override function to decide if need to replace the activeAt or not Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix test case Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix format Signed-off-by: Yijie Qin <qinyijie@amazon.com> * Trigger build Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * return the result of map of alerts instead of single one Signed-off-by: Yijie Qin <qinyijie@amazon.com> * upper case the QueryforStateSeries Signed-off-by: Yijie Qin <qinyijie@amazon.com> * use a more generic rule group post process function type Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix indentation Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix gofmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix lint Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fixing naming Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fix comments Signed-off-by: Yijie Qin <qinyijie@amazon.com> * add the lastEvalTimestamp as parameter Signed-off-by: Yijie Qin <qinyijie@amazon.com> * fmt Signed-off-by: Yijie Qin <qinyijie@amazon.com> * change funcType to func Signed-off-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <qinyijie@amazon.com> Co-authored-by: Yijie Qin <63399121+qinxx108@users.noreply.github.com>	3 years ago
Alan Protasio	606ef33d91	Track and report Samples Queried per query We always track total samples queried and add those to the standard set of stats queries can report. We also allow optionally tracking per-step samples queried. This must be enabled both at the engine and query level to be tracked and rendered. The engine flag is exposed via a Prometheus feature flag, while the query flag is set when stats=all. Co-authored-by: Alan Protasio <approtas@amazon.com> Co-authored-by: Andrew Bloomgarden <blmgrdn@amazon.com> Co-authored-by: Harkishen Singh <harkishensingh@hotmail.com> Signed-off-by: Andrew Bloomgarden <blmgrdn@amazon.com>	3 years ago
Alvin Lin	cd739214dd	Log rule name when evaluating rule groups' Eval function logs anything (#10454 ) * Add benchingmark test for rule group eval Signed-off-by: Alvin Lin <alvinlin@amazon.com>	3 years ago
Matej Gera	2c61d29b2a	Tracing: Migrate to OpenTelemetry library (#9724 ) Signed-off-by: Matej Gera <matejgera@gmail.com>	3 years ago
Björn Rabenstein	7e42acd3b1	tsdb: Rework iterators (#9877 ) - Pick At... method via return value of Next/Seek. - Do not clobber returned buckets. - Add partial FloatHistogram suppert. Note that the promql package is now _only_ dealing with FloatHistograms, following the idea that PromQL only knows float values. As a byproduct, I have removed the histogramSeries metric. In my understanding, series can have both float and histogram samples, so that metric doesn't make sense anymore. As another byproduct, I have converged the sampleBuf and the histogramSampleBuf in memSeries into one. The sample type stored in the sampleBuf has been extended to also contain histograms even before this commit. Signed-off-by: beorn7 <beorn@grafana.com>	3 years ago
beorn7	c954cd9d1d	Move packages out of deprecated pkg directory This creates a new `model` directory and moves all data-model related packages over there: exemplar labels relabel rulefmt textparse timestamp value All the others are more or less utilities and have been moved to `util`: gate logging modetimevfs pool runtime Signed-off-by: beorn7 <beorn@grafana.com>	3 years ago
Bryan Boreham	26d8ae0e41	Rules: simplify map key for stale series detection The rules manager keeps a note of which series were generated by the last run, so it can write a stale marker to those that disappeared. Since the keys are not for human eyes, we can use a simpler format and save the effort of quoting label values. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	3 years ago
Yijie Qin	6fce45838a	Add access function for restoration state of alerting rule (#9665 )	3 years ago
Ganesh Vernekar	c8b267efd6	Get histograms from TSDB to the rate() function implementation Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	3 years ago
Mateusz Gozdek	1a6c2283a3	Format Go source files using 'gofumpt -w -s -extra' Part of #9557 Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	3 years ago
Levi Harrison	d81bbe154d	Rule alerts/series limit updates (#9541 ) * Add docs and do not limit inactive alerts. Signed-off-by: Levi Harrison <git@leviharrison.dev>	3 years ago
Levi Harrison	dc2f1993d8	Limit number of alerts or series produced by a rule (#9260 ) * Add limit to rules Signed-off-by: Levi Harrison <git@leviharrison.dev>	3 years ago
George Robinson	049b4f4f13	Support customization of template options in TemplateExpander (#9290 ) Signed-off-by: George Robinson <george.robinson@grafana.com>	3 years ago
Levi Harrison	8c29046ab2	Remove unneeded state modifications Signed-off-by: Levi Harrison <git@leviharrison.dev>	3 years ago
Michal Wasilewski	3f686cad8b	fixes yamllint errors Signed-off-by: Michal Wasilewski <mwasilewski@gmx.com>	3 years ago
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	4 years ago
Levi Harrison	26274527df	Updated/added tests Signed-off-by: Levi Harrison <git@leviharrison.dev>	4 years ago
Levi Harrison	17ea8d006a	Added external URL access Signed-off-by: Levi Harrison <git@leviharrison.dev>	4 years ago
Owen Diehl	23999df27c	expose rule metrics fields Signed-off-by: Owen Diehl <ow.diehl@gmail.com>	4 years ago
Goutham Veeramachaneni	2efdf660b1	Increase evaluation failures on Commit() (#8770 ) I think we should increment the metric here, we're setting the rule health anyways. This means even if the "evaluation" suceeded, none of the samples made it to storage. This is a simplified solution to: https://github.com/prometheus/prometheus/pull/8410/ Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	4 years ago
Goutham Veeramachaneni	4b5ab80ca6	[rule] Update rule health for append/commit fails (#8619 ) * [rule] Update rule health for append/commit fails Similar to https://github.com/prometheus/prometheus/pull/8410 will provide more context. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Add test for updating health on append fails Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	4 years ago
Tom Wilkie	7369561305	Combine Appender.Add and AddFast into a single Append method. (#8489 ) This moves the label lookup into TSDB, whilst still keeping the cached-ref optimisation for repeated Appends. This makes the API easier to consume and implement. In particular this change is motivated by the scrape-time-aggregation work, which I don't think is possible to implement without it as it needs access to label values. Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>	4 years ago
jessicagreben	75654715d3	fix panics Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	4 years ago
jessicagreben	6980bcf671	unexport backfiller Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	4 years ago
Julien Pivotto	6c56a1faaa	Testify: move to require (#8122 ) * Testify: move to require Moving testify to require to fail tests early in case of errors. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * More moves Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	4 years ago
Julien Pivotto	1282d1b39c	Refactor test assertions (#8110 ) * Refactor test assertions This pull request gets rid of assert.True where possible to use fine-grained assertions. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	4 years ago
Julien Pivotto	4e5b1722b3	Move away from testutil, refactor imports (#8087 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	4 years ago
Łukasz Mierzwa	19c190b406	Add a rule_group_samples metric (#7977 ) This new metric allows tracking how many samples did each rule group generate. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	4 years ago
李国忠	4a52faf2ae	Unnecessary go routine spawn. (#7951 ) Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>	4 years ago
jessicagreben	dfa510086b	add alignment, mv rule importer to promtool dir, add queryRange Signed-off-by: jessicagreben <jessicagrebens@gmail.com>	4 years ago
Anonymous	8219b442c8	rules: Remove redundant RLock to avoid double RLock (#7183 ) Signed-off-by: BurtonQin <bobbqqin@gmail.com>	4 years ago
johncming	2f2a51a43a	web/api/v1: make names consistent. (#7841 ) Signed-off-by: johncming <johncming@yahoo.com>	4 years ago
Goutham Veeramachaneni	cb830b0a9c	Label rule_group_iterations metric with group name (#7823 ) * Label rule_group_iterations metric with group name evalTotal and evalFailures having the label but iterations not having it is an odd mismatch. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Remove the metrics when a group is deleted. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Initialise the metrics Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	4 years ago
johncming	362080ba28	rules: add evaluationTimestamp when copy state. (#7775 ) Signed-off-by: johncming <johncming@yahoo.com>	4 years ago
Callum Styan	c7a17f6491	Add additional tag/log to rules Manager Eval trace span. (#7708 ) Signed-off-by: Callum Styan <callumstyan@gmail.com>	4 years ago
Julien Pivotto	e76c436e9c	Goleak in discoveries, scrape, rules (#7662 ) * Add go leak tests for discoveries with goroutines Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Add go leak tests in rules Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Add go leak tests in scrape tests Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	4 years ago
Annanay	7f98a744e5	Add context to Appender interface Signed-off-by: Annanay <annanayagarwal@gmail.com>	4 years ago

1 2 3 4 5 ...

615 Commits (cebcdce78a7412c8821e9b1e794f0c2b5e714043)