prometheus

Commit Graph

Author	SHA1	Message	Date
Paschalis Tsilias	c173cd57c9	Add a header to count retried remote write requests (#12729) Header name is `Retry-Attempt`, only set when >0. Signed-off-by: Marc Tuduri <marctc@protonmail.com> Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>	1 year ago
zenador	69edd8709b	Add warnings (and annotations) to PromQL query results (#12152) Return annotations (warnings and infos) from PromQL queries This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations". Annotations can be warnings or infos (the latter could be false positives). We do not treat them differently in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future. The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms). The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warning is more tricky to create with good fidelity and we will tackle it later. Another TODO is to take annotations into account when evaluating recording rules. --------- Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	1 year ago
Arve Knudsen	156222cc50	Add context argument to LabelQuerier.LabelValues (#12665) Add context argument to LabelQuerier.LabelValues and LabelQuerier.SortedLabelValues. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Arve Knudsen	a964349e97	Add context argument to LabelQuerier.LabelNames (#12666) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Arve Knudsen	4451ba10b4	Add context argument to IndexReader.Postings (#12667) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Arve Knudsen	6ef9ed0bc3	Add context argument to DB.Delete (#12834) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Arve Knudsen	6daee89e5f	Add context argument to Querier.Select (#12660) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Ziqi Zhao	eaaa21aa7f	promtool tsdb dump support native histogram (#12775) Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>	1 year ago
Gregor Zeitlinger	f01718262a	Unit tests for native histograms (#12668) promql: Extend testing framework to support native histograms This includes both the internal testing framework as well as the rules unit test feature of promtool. This also adds a bunch of basic tests. Many of the code level tests can now be converted to tests within the framework, and more tests can be added easily. --------- Signed-off-by: Harold Dost <h.dost@criteo.com> Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com> Signed-off-by: Stephen Lang <stephen.lang@grafana.com> Co-authored-by: Harold Dost <h.dost@criteo.com> Co-authored-by: Stephen Lang <stephen.lang@grafana.com> Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>	1 year ago
Sylvain Rabot	4399959f79	Remove native histograms / memory snapshot restriction Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>	1 year ago
Goutham Veeramachaneni	ad4f514e66	Add OTLP Ingestion endpoint (#12571) * Add OTLP Ingestion endpoint We copy files from the otel-collector-contrib. See the README in `storage/remote/otlptranslator/README.md`. This supersedes: https://github.com/prometheus/prometheus/pull/11965 Signed-off-by: gouthamve <gouthamve@gmail.com> * Return a 200 OK It is what the OTEL Golang SDK expects :( https://github.com/open-telemetry/opentelemetry-go/issues/4363 Signed-off-by: Goutham <gouthamve@gmail.com> --------- Signed-off-by: gouthamve <gouthamve@gmail.com> Signed-off-by: Goutham <gouthamve@gmail.com>	1 year ago
Julien Pivotto	b3b669fd9a	Add experimental flag and docs Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	1 year ago
Rob Skillington	e1ace8d00e	Add PromQL format and label matcher set/delete commands to promtool Signed-off-by: Rob Skillington <rob@chronosphere.io> Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	1 year ago
Justin Lei	32d87282ad	Add Zstandard compression option for wlog (#11666) Snappy remains as the default compression but there is now a flag to switch the compression algorithm. Signed-off-by: Justin Lei <justin.lei@grafana.com>	1 year ago
Bryan Boreham	578e2b6a3f	re-order imports for linter Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Bryan Boreham	5255bf06ad	Replace sort.Slice with faster slices.SortFunc The generic version is more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
João Vilaça	81394ea1c5	Add --run flag to promtool test rules Signed-off-by: João Vilaça <jvilaca@redhat.com>	1 year ago
Julien Pivotto	1214d314c3	Merge pull request #12225 from fgouteroux/feat/promtool_check_rules_stdin promtool: read from stdin if no filenames are provided in check rules	1 year ago
Julien Pivotto	771f512757	Merge pull request #12299 from fgouteroux/promtool_push_metrics_cmd feat(promtool): add push metrics command	1 year ago
François Gouteroux	58d38c4c56	fix: apply suggested changes Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	1 year ago
tyltr	0941ea4afc	typo Signed-off-by: tyltr <tylitianrui@126.com>	1 year ago
François Gouteroux	f676d4a756	feat refactoring checkrules func Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
Nidhey Nitin Indurkar	a8772a4178	Feat: Get block by id directly on promtool analyze & get latest block if ID not provided (#12031) * feat: analyze latest block or block by ID in CLI (promtool) Signed-off-by: nidhey27 <nidhey.indurkar@infracloud.io> * address remarks Signed-off-by: nidhey60@gmail.com <nidhey.indurkar@infracloud.io> * address latest review comments Signed-off-by: nidhey60@gmail.com <nidhey.indurkar@infracloud.io> --------- Signed-off-by: nidhey27 <nidhey.indurkar@infracloud.io> Signed-off-by: nidhey60@gmail.com <nidhey.indurkar@infracloud.io>	2 years ago
François Gouteroux	6ae4a46845	feat: enhance stdin check and add tests parsing error Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
François Gouteroux	4341b98eb2	fix: apply suggested changes Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
François Gouteroux	934c5ddb8d	feat: make push metrics labels generic and repeatable Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
François Gouteroux	3524a16aa0	feat: add suggested changes, tests, and stdin support Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
Baskar Shanmugam	905a0bd63a	Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336) * Added 'topN' query parameter support to /api/v1/status/tsdb endpoint Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Updated query parameter for tsdb status to 'limit' Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Corrected Stats() parameter name from topN to limit Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Fixed p.Stats CI failure Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> --------- Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>	2 years ago
Callum Styan	0d2108ad79	[tsdb] re-implement WAL watcher to read via a "notification" channel (#11949) * WIP implement WAL watcher reading via notifications over a channel from the TSDB code Signed-off-by: Callum Styan <callumstyan@gmail.com> * Notify via head appenders Commit (finished all WAL logging) rather than on each WAL Log call Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix misspelled Notify plus add a metric for dropped Write notifications Signed-off-by: Callum Styan <callumstyan@gmail.com> * Update tests to handle new notification pattern Signed-off-by: Callum Styan <callumstyan@gmail.com> * this test maybe needs more time on windows? Signed-off-by: Callum Styan <callumstyan@gmail.com> * does this test need more time on windows as well? Signed-off-by: Callum Styan <callumstyan@gmail.com> * read timeout is already a time.Duration Signed-off-by: Callum Styan <callumstyan@gmail.com> * remove mistakenly committed benchmark data files Signed-off-by: Callum Styan <callumstyan@gmail.com> * address some review feedback Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix missed changes from previous commit Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix issues from wrapper function Signed-off-by: Callum Styan <callumstyan@gmail.com> * try fixing race condition in test by allowing tests to overwrite the read ticker timeout instead of calling the Notify function Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix linting Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2 years ago
Björn Rabenstein	37fe9b89dc	Merge pull request #12055 from leizor/leizor/prometheus/issues/12009 Adjust samplesPerChunk from 120 to 220	2 years ago
François Gouteroux	b1bab7bc54	feat(promtool): add push metrics command Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
Matthieu MOREL	bae9a21200	Merge branch 'main' into linter/nilerr Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2 years ago
beorn7	5b53aa1108	style: Replace `else if` cascades with `switch` Wiser coders than myself have come to the conclusion that a `switch` statement is almost always superior to a statement that includes any `else if`. The exceptions that I have found in our codebase are just these two: * The `if else` is followed by an additional statement before the next condition (separated by a `;`). * The whole thing is within a `for` loop and `break` statements are used. In this case, using `switch` would require tagging the `for` loop, which probably tips the balance. Why are `switch` statements more readable? For one, fewer curly braces. But more importantly, the conditions all have the same alignment, so the whole thing follows the natural flow of going down a list of conditions. With `else if`, in contrast, all conditions but the first are "hidden" behind `} else if `, harder to spot and (for no good reason) presented differently from the first condition. I'm sure the aforementioned wise coders can list even more reasons. In any case, I like it so much that I have found myself recommending it in code reviews. I would like to make it a habit in our code base, without making it a hard requirement that we would test on the CI. But for that, there has to be a role model, so this commit eliminates all `if else` occurrences, unless it is autogenerated code or fits one of the exceptions above. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
beorn7	c3c7d44d84	lint: Adjust to the lint warnings raised by current versions of golint-ci We haven't updated golint-ci in our CI yet, but this commit prepares for that. There are a lot of new warnings, and it is mostly because the "revive" linter got updated. I agree with most of the new warnings, mostly around not naming unused function parameters (although it is justified in some cases for documentation purposes – while things like mocks are a good example where not naming the parameter is clearer). I'm pretty upset about the "empty block" warning to include `for` loops. It's such a common pattern to do something in the head of the `for` loop and then have an empty block. There is still an open issue about this: https://github.com/mgechev/revive/issues/810 I have disabled "revive" altogether in files where empty blocks are used excessively, and I have made the effort to add individual `// nolint:revive` where empty blocks are used just once or twice. It's borderline noisy, though, but let's go with it for now. I should mention that none of the "empty block" warnings for `for` loop bodies were legitimate. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
Ben Ye	fd3630b9a3	add ctx to QueryEngine interface Signed-off-by: Ben Ye <benye@amazon.com>	2 years ago
Justin Lei	052993414a	Add storage.tsdb.samples-per-chunk flag Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
Matthieu MOREL	fb3eb21230	enable gocritic, unconvert and unused linters Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2 years ago
beorn7	c0879d64cf	promql: Separate `Point` into `FPoint` and `HPoint`

In other words: Instead of having a “polymorphous” `Point` that can either contain a float value or a histogram value, use an `FPoint` for floats and an `HPoint` for histograms.

This seemingly small change has a _lot_ of repercussions throughout the codebase.

The idea here is to avoid the increase in size of `Point` arrays that happened after native histograms had been added.

The higher-level data structures (`Sample`, `Series`, etc.) are still “polymorphous”. The same idea could be applied to them, but at each step the trade-offs needed to be evaluated.

The idea with this change is to do the minimum necessary to get back to pre-histogram performance for functions that do not touch histograms. Here are comparisons for the `changes` function. The test data doesn't include histograms yet. Ideally, there would be no change in the benchmark result at all.

First runtime v2.39 compared to directly prior to this commit:

```
name                                                  old time/op  new time/op  delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16          391µs ± 2%   542µs ± 1%  +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16         452µs ± 2%   617µs ± 2%  +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16       1.12ms ± 1%  1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16      7.83ms ± 1%  8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16         2.98ms ± 0%  3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16        3.66ms ± 1%  4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16       10.5ms ± 0%  11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16      77.6ms ± 1%  87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16     30.4ms ± 2%  32.8ms ± 1%   +8.01%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16    37.1ms ± 2%  40.6ms ± 2%   +9.64%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16    105ms ± 1%   117ms ± 1%  +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16   783ms ± 3%   876ms ± 1%  +11.83%  (p=0.000 n=9+10)
```

And then runtime v2.39 compared to after this commit:

```
name                                                  old time/op  new time/op  delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16          391µs ± 2%   547µs ± 1%  +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16         452µs ± 2%   616µs ± 2%  +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16       1.12ms ± 1%  1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16      7.83ms ± 1%  7.95ms ± 1%   +1.59%  (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16         2.98ms ± 0%  3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16        3.66ms ± 1%  4.02ms ± 1%   +9.80%  (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16       10.5ms ± 0%  10.8ms ± 1%   +3.08%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16      77.6ms ± 1%  78.1ms ± 1%   +0.58%  (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16     30.4ms ± 2%  33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16    37.1ms ± 2%  40.0ms ± 1%   +7.98%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16    105ms ± 1%   107ms ± 1%   +1.92%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16   783ms ± 3%   775ms ± 1%   -1.02%  (p=0.019 n=9+9)
```

In summary, the runtime doesn't really improve with this change for queries with just a few steps. For queries with many steps, this commit essentially reinstates the old performance. This is good because the many-step queries are the ones that matter most (longest absolute runtime).

In terms of allocations, though, this commit doesn't make a dent at all (numbers not shown). The reason is that most of the allocations happen in the sampleRingIterator (in the storage package), which has to be addressed in a separate commit.

Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
François Gouteroux	8472596fd0	fix: apply suggested changes Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
François Gouteroux	034eb2b3f2	promtool: read from stdin if no filenames are provided in check rules Signed-off-by: François Gouteroux <francois.gouteroux@gmail.com>	2 years ago
Julien Pivotto	391473141d	Check health & ready: move to flags (#12223) This makes it more consistent with other commands like import rules. We don't have strict rules and uniformity across promtool unfortunately, but I think it's better to only have the http config on relevant check commands to avoid thinking Prometheus can e.g. check the config over the wire. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Ganesh Vernekar	5588cab8b2	Merge pull request #12173 from bboreham/builder-no-empty-labels labels: simplify call to get Labels from Builder	2 years ago
Nidhey Nitin Indurkar	3f7beeecc6	feat: health and readiness check of prometheus server in CLI (promtool) (#12096) * feat: health and readiness check of prometheus server in CLI (promtool) Signed-off-by: nidhey27 <nidhey.indurkar@infracloud.io>	2 years ago
Łukasz Mierzwa	f2b9a39a48	Use a random port in cmd/prometheus tests There are a few tests that will run prometheus command. This can fail if there's already something listening on port :9090 since --web.listen-address defaults to 0.0.0.0:9090. To fix that we can tell prometheus to use a random port on loopback interface. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2 years ago
Bryan Boreham	b987afa7ef	labels: simplify call to get Labels from Builder It took a `Labels` where the memory could be re-used, but in practice this hardly ever benefitted. Especially after converting `relabel.Process` to `relabel.ProcessBuilder`. Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil` so the slice was reallocated multiple times by `append`. Lastly `Builder.Labels()` now estimates that the final size will depend on labels added and deleted. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Julien Pivotto	1922db0586	Document command line tools Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Bryan Boreham	b96b89ef8b	Merge pull request #12048 from bboreham/faster-targets Scraping targets are synced by creating the full set, then adding/removing any which have changed. This PR speeds up the process of creating the full set. I added a benchmark for `TargetsFromGroup`; it uses configuration from a typical Kubernetes SD. The crux of the change is to do relabeling inside labels.Builder instead of converting to labels.Labels and back again for every rule. The change is broken into several commits for easier review. This is a breaking change to `scrape.PopulateLabels()`, but `relabel.Process` is left as-is, with a new `relabel.ProcessBuilder` option.	2 years ago
Julien Pivotto	0c56e5d014	Update our own dependencies, support proxy from env Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Bryan Boreham	f4fd9b0d68	scrape: re-use memory in TargetsFromGroup Common service discovery mechanisms such as Kubernetes can generate a lot of target groups, so this function was allocating a lot of memory which then immediately became garbage. Re-using the structures across an entire Sync saves effort. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	5cfe759348	scrape: make TargetsFromGroup work with Builder not []Label Save work converting to `Labels` then to `Builder`. `PopulateLabels()` now takes a Builder as input. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Julien Pivotto	599b70a05d	Add include scrape configs Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Martin Chodur	f1de2cec3d	fix: set the http round tripper for promtool import command Signed-off-by: Martin Chodur <m.chodur@seznam.cz>	2 years ago
Martin Chodur	3ebe4b48db	feat: add promtool http config support Signed-off-by: Martin Chodur <m.chodur@seznam.cz>	2 years ago
Amin Borjian	90d6873c7f	promtool: add support of selecting timeseries for TSDB dump Dumping without any limit on the data being dumped will generate a large amount of data. Also, sometimes it is necessary to dump only a part of the data in order to change or transfer it. This change allows specifying a part of the data to dump and, by default, works the same as before. (no public API change) Signed-off-by: Amin Borjian <borjianamin98@outlook.com>	2 years ago
Marc Tudurí	9474610baf	Support FloatHistogram in TSDB (#11522) Extends Appender.AppendHistogram function to accept the FloatHistogram. TSDB supports appending, querying, WAL replay, for this new type of histogram. Signed-off-by: Marc Tudurí <marctc@protonmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Bryan Boreham	10b27dfb84	Simplify IndexReader.Series interface Instead of passing in a `ScratchBuilder` and `Labels`, just pass the builder and the caller can extract labels from it. In many cases the caller didn't use the Labels value anyway. Now in `Labels.ScratchBuilder` we need a slightly different API: one to assign what will be the result, instead of overwriting some other `Labels`. This is safer and easier to reason about. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	19f300e6f0	Update package cmd/promtool tests for new labels.Labels type Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	bf2c827d91	Update package cmd/promtool for new labels.Labels type Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	d3d96ec887	tsdb/index: use ScratchBuilder to create Labels This necessitates a change to the `tsdb.IndexReader` interface: `index.Reader` is used from multiple goroutines concurrently, so we can't have state in it. We do retain a `ScratchBuilder` in `blockBaseSeriesSet` which is iterator-like. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	3c7de69059	storage: allow re-use of iterators Patterned after `Chunk.Iterator()`: pass the old iterator in so it can be re-used to avoid allocating a new object. (This commit does not do any re-use; it is just changing all the method signatures so re-use is possible in later commits.) Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Ganesh Vernekar	6dd4e907a3	Update dependencies for 2.40 (#11524) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Ganesh Vernekar	04b370da00	Disable snapshot on shutdown if native histograms are enabled (#11473) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Ganesh Vernekar	05b7af28ee	Merge pull request #11450 from codesome/fix-conflict Sync sparsehistogram branch with main branch	2 years ago
Ganesh Vernekar	648be89822	Merge remote-tracking branch 'upstream/main' into fix-conflict Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Ganesh Vernekar	081ad2d690	Update help text for enable-feature to mention native-histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Ganesh Vernekar	3cbf87b83d	Enable protobuf negotiation only when histograms are enabled Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Ganesh Vernekar	46b26c4f09	Fix notifier relabel changing the labels of active alerts (#11427) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Jesus Vazquez	e934d0f011	Merge 'main' into sparsehistogram Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>	2 years ago
Ganesh Vernekar	f34aeefe6e	Allow overlapping blocks by default (#11331) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Paschalis Tsilias	f2ee959354	Remove 'metadata-storage' CLI flag (#11351) Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>	2 years ago
Jesus Vazquez	c1b669bf9b	Add out-of-order sample support to the TSDB (#11075) * Introduce out-of-order TSDB support This implementation is based on this design doc: https://docs.google.com/document/d/1Kppm7qL9C-BJB1j6yb6-9ObG3AbdZnFUBYPNNWwDBYM/edit?usp=sharing This commit adds support to accept out-of-order ("OOO") samples into the TSDB up to a configurable time allowance. If OOO is enabled, overlapping queries are automatically enabled. Most of the additions have been borrowed from https://github.com/grafana/mimir-prometheus/ Here is the list of the original commits cherry picked from mimir-prometheus into this branch: - `4b2198d7ec` - `2836e5513f` - `00b379c3a5` - `ff0dc75758` - `a632c73352` - `c6f3d4ab33` - `5e8406a1d4` - `abde1e0ba1` - `e70e769889` - `df59320886` Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Dieter Plaetinck <dieter@grafana.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * gofumpt files Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Add license header to missing files Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix OOO tests due to existing chunk disk mapper implementation Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix truncate int overflow Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Add Sync method to the WAL and update tests Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * remove useless sync Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Update minOOOTime after truncating Head * Update minOOOTime after truncating Head Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix lint Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add a unit test Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Load OutOfOrderTimeWindow only once per appender Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix OOO Head LabelValues and PostingsForMatchers Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix replay of OOO mmap chunks Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Remove unnecessary err check Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Prevent panic with ApplyConfig Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Run OOO compaction after restart if there is OOO data from WBL Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Apply Bartek's suggestions Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Refactor OOO compaction Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Address comments and TODOs - Added a comment explaining why we need the allow overlapping compaction toggle - Clarified TSDBConfig OutOfOrderTimeWindow doc - Added an owner to all the TODOs in the code Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Run go format Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix remaining review comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix tests Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Change wbl reference when truncating ooo in TestHeadMinOOOTimeUpdate Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> * Fix TestWBLAndMmapReplay test failure on windows Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Address most of the feedback Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Refactor the block meta for out of order Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix windows error Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix review comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Ganesh Vernekar 15064823+codesome@users.noreply.github.com Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Dieter Plaetinck <dieter@grafana.com> Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2 years ago
Ganesh Vernekar	d354f20c2a	Add a feature flag to control native histogram ingestion (#11253) * Add runtime config to control native histogram ingestion Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Make the config into a CLI flag Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Bryan Boreham	c438b50133	cmd/promtool: in tests use labels.FromStrings Replacing code which assumes the internal structure of `Labels`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	735914f692	cmd/prometheus: in tests use labels.FromStrings Replacing code which assumes the internal structure of `Labels`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Cosrider	bef6556ca5	delete redundant alias (#11180) Signed-off-by: Cosrider <cosrider7@gmail.com> Signed-off-by: Cosrider <cosrider7@gmail.com>	2 years ago
Paschalis Tsilias	5a8e202f94	Append metadata to the WAL in the scrape loop (#10312) * Append metadata to the WAL Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra whitespace; Reword some docstrings and comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use RLock() for hasNewMetadata check Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use single byte for metric type in RefMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Update proposed WAL format for single-byte type metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Address first round of review comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Amend description of metadata in wal.md Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Correct key used to retrieve metadata from cache When we're setting metadata entries in the scrapeCache, we're using the p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and use it as the cache key. When checking for cache entries though, we used p.Series() as the key, which included the metric name _with_ its labels. That meant that we were never actually hitting the cache. We're fixing this by utilizing the __name__ internal label for correctly getting the cache entries after they've been set by setHelp(), setType() or setUnit(). Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Put feature behind a feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reorder WAL format document Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix CR comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Extract logic about changing metadata in an anonymous function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Implement new proposed WAL format and amend relevant tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use 'const' for metadata field names Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Apply metadata to head memSeries in Commit, not in AppendMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add docstring and rename extracted helper in scrape.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix review comments around TestMetadata* tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rebase with merged TSDB changes; fix duplicate definitions after rebase Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove leftover changes on db_test.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rename feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Simplify updateMetadata helper function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra newline Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>	2 years ago
Bryan Boreham	8b863c42dd	Optimise relabeling by re-using memory (#11147 ) * model/relabel: Add benchmark Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * model/relabel: re-use Builder across relabels Saves memory allocations. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * labels.Builder: allow re-use of result slice This reduces memory allocations where the caller has a suitable slice available. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * model/relabel: re-use source values slice To reduce memory allocations. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Unwind one change causing test failures Restore original behaviour in PopulateLabels, where we must not overwrite the input set. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * relabel: simplify values optimisation Use a stack-based array for up to 16 source labels, which will be the vast majority of cases. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * lint Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
beorn7	c9fd3c235d	Merge branch 'main' into sparsehistogram	2 years ago
Levi Harrison	fa9bc5184a	Update and fix interface (#11131 ) Signed-off-by: Levi Harrison <git@leviharrison.dev>	2 years ago
Levi Harrison	d61459d826	`no-default-scrape-port` feature flag (#9523 ) * Add `no-default-scrape-port` flag Signed-off-by: Levi Harrison <git@leviharrison.dev>	2 years ago
Paschalis Tsilias	d1122e0743	Introduce TSDB changes for appending metadata to the WAL (#10972 ) * Append metadata to the WAL Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove extra whitespace; Reword some docstrings and comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use RLock() for hasNewMetadata check Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use single byte for metric type in RefMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Update proposed WAL format for single-byte type metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Implement MetadataAppender interface for the Agent Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Address first round of review comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Amend description of metadata in wal.md Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Correct key used to retrieve metadata from cache When we're setting metadata entries in the scrapeCache, we're using the p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and use it as the cache key. When checking for cache entries though, we used p.Series() as the key, which included the metric name _with_ its labels. That meant that we were never actually hitting the cache. We're fixing this by utilizing the __name__ internal label for correctly getting the cache entries after they've been set by setHelp(), setType() or setUnit(). 
Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Put feature behind a feature flag Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix AppendMetadata docstring Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reorder WAL format document Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Change error message of AppendMetadata; Fix access of s.meta in AppendMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reuse temporary buffer in Metadata encoder Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Only keep latest metadata for each refID during checkpointing Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix test that's referencing decoding metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Avoid creating metadata block if no new metadata are present Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add tests for corrupt metadata block and relevant record type Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix CR comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Extract logic about changing metadata in an anonymous function Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Implement new proposed WAL format and amend relevant tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Use 'const' for metadata field names Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Apply metadata to head memSeries in Commit, not in AppendMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add docstring and rename extracted helper in scrape.go Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add tests for tsdb-related cases Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix linter issues vol1 Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix linter issues vol2 Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix Windows test by closing WAL reader files Signed-off-by: 
Paschalis Tsilias <paschalist0@gmail.com> * Use switch instead of two if statements in metadata decoding Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix review comments around TestMetadata* tests Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Add code for replaying WAL; test correctness of in-memory data after a replay Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Remove scrape-loop related code from PR Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Address first round of comments Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Simplify tests by sorting slices before comparison Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix test to use separate transactions Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Empty out buffer and record slices after encoding latest metadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix linting issue Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Update calculation for DroppedMetadata metric Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Rename MetadataAppender interface and AppendMetadata method to MetadataUpdater/UpdateMetadata Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Reuse buffer when encoding latest metadata for each series Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Fix review comments; Check all returned error values using two helpers Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Simplify use of helpers Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com> * Satisfy linter Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>	2 years ago
beorn7	28f028e938	Merge branch 'main' into sparsehistogram	2 years ago
Julien Pivotto	7a2d24b76a	Fix flakiness in windows tests (#10983 ) Our windows CI is too slow, process takes lots of time to start. Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Julien Pivotto	13bd4fd3c8	Fix promtool check config not erroring properly on failures (#10952 ) Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
lixin18	735a07444a	Update main_unix_test.go (#10917 ) so->,so Signed-off-by: lixin18 <68135097+lixin963@users.noreply.github.com>	2 years ago
beorn7	40ad5e284a	Merge branch 'main' into beorn7/sparsehistogram	3 years ago
David Leadbeater	355b8bcf0b	Add --lint-fatal option (#10815 ) This keeps the previous behaviour of printing details about duplicate rules but doesn't exit with a fatal exit code unless turned on. Signed-off-by: David Leadbeater <dgl@dgl.cx>	3 years ago
Ben Kochie	9570924511	Merge pull request #9638 from prometheus/superq/agentMode Add agent mode identifier	3 years ago
Matthieu MOREL	36eee11434	refactor (package cmd): move from github.com/pkg/errors to 'errors' and 'fmt' packages (#10733 ) Signed-off-by: Matthieu MOREL <mmorel-35@users.noreply.github.com> Co-authored-by: Matthieu MOREL <mmorel-35@users.noreply.github.com>	3 years ago
Łukasz Mierzwa	44e5f220c0	Move prometheus_ready metric to web package (#10729 ) This moves prometheus_ready to the web package and links it with the ready variable that decides if HTTP requests should return 200 or 503. This is a follow up change from #10682 Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	3 years ago
Łukasz Mierzwa	070e409dba	Add prometheus_ready metric (#10682 ) When Prometheus starts it can take a long time before the WAL is replayed and it can do anything useful. While it's starting it exposes metrics and other Prometheus servers can scrape it. We do have alerts that fire if any Prometheus server is not ingesting samples, and so far we've been dealing with instances that take a long time to start by adding a check on Prometheus process uptime. Relying on uptime isn't ideal because the time needed to start depends on the number of metrics scraped, and so on the amount of data in the WAL. To help write better alerts it would be great if Prometheus exposed a metric that tells us it's fully started; that way any alert that is supposed to notify us about a runtime issue can filter out starting instances. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	3 years ago
Ben Ye	af5ea213f7	promtool: support matchers when querying label values (#10727 ) * promtool: support matchers when querying label values Signed-off-by: Ben Ye <ben.ye@bytedance.com> * address review comment Signed-off-by: Ben Ye <ben.ye@bytedance.com>	3 years ago
Łukasz Mierzwa	d3c9c4f574	Stop rule manager before TSDB is stopped (#10680 ) During shutdown TSDB is stopped before rule manager is stopped. Since TSDB shutdown can take a long time (minutes or tens of minutes) it keeps rule manager running while parts of Prometheus are already stopped (most notably scrape manager). This can cause false positive alerts to fire, mostly those that rely on absent() calls, since new sample appends will stop while alert queries are still evaluated. Stop rules before stopping TSDB and scrape manager to avoid this problem. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	3 years ago
Alban Hurtaud	41630b8e88	Add hidden flag to configure discovery loop interval (#10634 ) * Add hidden flag to configure discovery loop interval Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>	3 years ago
beorn7	3bc711e333	Merge branch 'main' into sparsehistogram	3 years ago
Matthieu MOREL	e2ede285a2	refactor: move from io/ioutil to io and os packages (#10528 ) * refactor: move from io/ioutil to io and os packages * use fs.DirEntry instead of os.FileInfo after os.ReadDir Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>	3 years ago
Filip Petkovski	1c1b174a8e	Add a --lint flag to the promtool check rules and check config commands (#10435 ) * Add a --lint flag to the promtool check rules and check config commands Checking rules with promtool emits warnings in the case of duplicate rules. These warnings do not result in a non-zero exit code and are difficult to spot in CI environments. Additionally, checking for duplicates is closer to a lint check rather than a syntax check. This commit adds a --lint flag to commands which include checking rules. The flag can be used to enable or disable certain linting options and cause the execution to return a non-zero exit code in case those options are not met. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com> * Exit with status 3 on lint error Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	3 years ago
beorn7	7ee1836ef5	Merge branch 'main' into sparsehistogram	3 years ago
Julien Pivotto	390956d317	Log gomaxprocs messages (#10506 ) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	3 years ago
TomasKohout	c0fd228bad	Add dependency on go.uber.org/automaxprocs (#10498 ) * add dependency on go.uber.org/automaxprocs Signed-off-by: Tomas Kohout <tomas.kohout1995@gmail.com> Co-authored-by: Peter Bourgon <peterbourgon@users.noreply.github.com> Co-authored-by: Julien Pivotto <roidelapluie@gmail.com>	3 years ago

735 Commits (c8c1ab36dc0ce6fda90e0f0a266e6f3772384ca3)