prometheus

Commit Graph

Author	SHA1	Message	Date
Charles Korn	15fa680117	Add benchmark for query using timestamp() Signed-off-by: Charles Korn <charles.korn@grafana.com>	1 year ago
cui fliter	096ceca44f	remove repetitive words (#12556) Signed-off-by: cui fliter <imcusg@gmail.com>	1 year ago
beorn7	162612ea86	histograms: Improve comment Oversight during review of #12525. Signed-off-by: beorn7 <beorn@grafana.com>	1 year ago
Ziqi Zhao	42d9169ba1	enhance histogram_quantile to get min/max value Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>	1 year ago
Carrie Edwards	2f9bc98b8a	Add tests for min and max functions Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>	1 year ago
Carrie Edwards	bc0ee4a469	Implement native histogram min and max query functions Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>	1 year ago
Bryan Boreham	ce153e3fff	Replace sort.Sort with faster slices.SortFunc The generic version is more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Giedrius Statkevičius	3f230fc9f8	promql: convert QueryOpts to interface Convert QueryOpts to an interface so that downstream projects like https://github.com/thanos-community/promql-engine could extend the query options with engine specific options that are not in the original engine. Will be used to enable query analysis per-query. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	1 year ago
Bryan Boreham	67d2ef004d	Placate lint I think the version using scoping was better, but I'm out of energy to fight the linter. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	bb0d8320dd	promql: include parsing in active-query tracking So that the max-concurrency limit is applied. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	71fc4f1516	promql: refactor: create query object before parsing Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	1f3821379c	promql: refactor: extract fn to wait on concurrency limit Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
zenador	191bf9055b	Handle more arithmetic operators for native histograms (#12262) Handle more arithmetic operators and aggregators for native histograms This includes operators for multiplication (formerly known as scaling), division, and subtraction. Plus aggregations for average and the avg_over_time function. Stdvar and stddev will (for now) ignore histograms properly (rather than counting them but adding a 0 for them). Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2 years ago
beorn7	9e500345f3	textparse/scrape: Add option to scrape both classic and native histograms So far, if a target exposes a histogram with both classic and native buckets, a native-histogram enabled Prometheus would ignore the classic buckets. With the new scrape config option `scrape_classic_histograms` set, both buckets will be ingested, creating all the series of a classic histogram in parallel to the native histogram series. For example, a histogram `foo` would create a native histogram series `foo` and classic series called `foo_sum`, `foo_count`, and `foo_bucket`. This feature can be used in a migration strategy from classic to native histograms, where it is desired to have a transition period during which both native and classic histograms are present. Note that two bugs in classic histogram parsing were found and fixed as a byproduct of testing the new feature: 1. Series created from classic _gauge_ histograms didn't get the _sum/_count/_bucket suffix set. 2. Values of classic _float_ histograms weren't parsed properly. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
Justin Lei	7bbf24b707	Make MemoizedSeriesIterator not implement chunkenc.Iterator Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
Justin Lei	6985dcbe73	Optimize and test MemoizedSeriesIterator Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
Matthieu MOREL	7e9acc2e46	golangci-lint: remove skip-cache and restore singleCaseSwitch rule Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2 years ago
beorn7	5b53aa1108	style: Replace `else if` cascades with `switch` Wiser coders than myself have come to the conclusion that a `switch` statement is almost always superior to a statement that includes any `else if`. The exceptions that I have found in our codebase are just these two: * The `if else` is followed by an additional statement before the next condition (separated by a `;`). * The whole thing is within a `for` loop and `break` statements are used. In this case, using `switch` would require tagging the `for` loop, which probably tips the balance. Why are `switch` statements more readable? For one, fewer curly braces. But more importantly, the conditions all have the same alignment, so the whole thing follows the natural flow of going down a list of conditions. With `else if`, in contrast, all conditions but the first are "hidden" behind `} else if `, harder to spot and (for no good reason) presented differently from the first condition. I'm sure the aforementioned wise coders can list even more reasons. In any case, I like it so much that I have found myself recommending it in code reviews. I would like to make it a habit in our code base, without making it a hard requirement that we would test on the CI. But for that, there has to be a role model, so this commit eliminates all `if else` occurrences, unless it is autogenerated code or fits one of the exceptions above. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
beorn7	c3c7d44d84	lint: Adjust to the lint warnings raised by current versions of golint-ci We haven't updated golint-ci in our CI yet, but this commit prepares for that. There are a lot of new warnings, and it is mostly because the "revive" linter got updated. I agree with most of the new warnings, mostly around not naming unused function parameters (although it is justified in some cases for documentation purposes – while things like mocks are a good example where not naming the parameter is clearer). I'm pretty upset about the "empty block" warning to include `for` loops. It's such a common pattern to do something in the head of the `for` loop and then have an empty block. There is still an open issue about this: https://github.com/mgechev/revive/issues/810 I have disabled "revive" altogether in files where empty blocks are used excessively, and I have made the effort to add individual `// nolint:revive` where empty blocks are used just once or twice. It's borderline noisy, though, but let's go with it for now. I should mention that none of the "empty block" warnings for `for` loop bodies were legitimate. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
Ben Ye	fd3630b9a3	add ctx to QueryEngine interface Signed-off-by: Ben Ye <benye@amazon.com>	2 years ago
ianwoolf	79e4bdee8e	add Close for ActiveQueryTracker to close the file. Signed-off-by: ianwoolf <btw515wolf2@gmail.com>	2 years ago
Matthieu MOREL	fb3eb21230	enable gocritic, unconvert and unused linters Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2 years ago
beorn7	551de0346f	promql: Do not return nil slices to the pool Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
beorn7	817a2396cb	Name float values as "floats", not as "values" In the past, every sample value was a float, so it was fine to call a variable holding such a float "value" or "sample". With native histograms, a sample might have a histogram value. And a histogram value is still a value. Calling a float value just "value" or "sample" or "V" is therefore misleading. Over the last few commits, I already renamed many variables, but this cleans up a few more places where the changes are more invasive. Note that we do not attempt renaming in the JSON APIs or in the protobufs. That would be quite a disruption. However, internally, we can call variables as we want, and we should go with the option of avoiding misunderstandings. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
beorn7	c0879d64cf	promql: Separate `Point` into `FPoint` and `HPoint` In other words: Instead of having a “polymorphous” `Point` that can either contain a float value or a histogram value, use an `FPoint` for floats and an `HPoint` for histograms. This seemingly small change has a _lot_ of repercussions throughout the codebase. The idea here is to avoid the increase in size of `Point` arrays that happened after native histograms had been added. The higher-level data structures (`Sample`, `Series`, etc.) are still “polymorphous”. The same idea could be applied to them, but at each step the trade-offs needed to be evaluated. The idea with this change is to do the minimum necessary to get back to pre-histogram performance for functions that do not touch histograms. Here are comparisons for the `changes` function. The test data doesn't include histograms yet. Ideally, there would be no change in the benchmark result at all. First runtime v2.39 compared to directly prior to this commit:
```
name                                                  old time/op  new time/op  delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16         391µs ± 2%   542µs ± 1%   +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16        452µs ± 2%   617µs ± 2%   +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16       1.12ms ± 1%  1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16      7.83ms ± 1%  8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16         2.98ms ± 0%  3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16        3.66ms ± 1%  4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16       10.5ms ± 0%  11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16      77.6ms ± 1%  87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16     30.4ms ± 2%  32.8ms ± 1%  +8.01%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16    37.1ms ± 2%  40.6ms ± 2%  +9.64%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16   105ms ± 1%   117ms ± 1%   +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16  783ms ± 3%   876ms ± 1%   +11.83%  (p=0.000 n=9+10)
```
And then runtime v2.39 compared to after this commit:
```
name                                                  old time/op  new time/op  delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16         391µs ± 2%   547µs ± 1%   +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16        452µs ± 2%   616µs ± 2%   +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16       1.12ms ± 1%  1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16      7.83ms ± 1%  7.95ms ± 1%  +1.59%   (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16         2.98ms ± 0%  3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16        3.66ms ± 1%  4.02ms ± 1%  +9.80%   (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16       10.5ms ± 0%  10.8ms ± 1%  +3.08%   (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16      77.6ms ± 1%  78.1ms ± 1%  +0.58%   (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16     30.4ms ± 2%  33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16    37.1ms ± 2%  40.0ms ± 1%  +7.98%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16   105ms ± 1%   107ms ± 1%   +1.92%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16  783ms ± 3%   775ms ± 1%   -1.02%   (p=0.019 n=9+9)
```
In summary, the runtime doesn't really improve with this change for queries with just a few steps. For queries with many steps, this commit essentially reinstates the old performance. This is good because the many-step queries are the ones that matter most (longest absolute runtime). In terms of allocations, though, this commit doesn't make a dent at all (numbers not shown). The reason is that most of the allocations happen in the sampleRingIterator (in the storage package), which has to be addressed in a separate commit. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
Łukasz Mierzwa	b6573353c1	Add query_samples_total metric query_samples_total is a counter that tracks the total number of samples loaded by all queries. The goal with this metric is to be able to see the amount of 'work' done by Prometheus to service queries. At the moment we have metrics with the number of queries, plus more detailed metrics showing how much time each step of a query takes. While those metrics do help they don't show us the whole picture. Queries that do load more samples are (in general) more expensive than queries that do load fewer samples. This means that looking only at the number of queries doesn't tell us how much 'work' Prometheus received. Adding a counter that tracks the total number of samples loaded allows us to see if there was a spike in the cost of queries, not just the number of them. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2 years ago
Oleg Zaytsev	6e2905a4d4	Use zeropool.Pool to workaround SA6002 (#12189) * Use zeropool.Pool to workaround SA6002 I built a tiny library called https://github.com/colega/zeropool to workaround the SA6002 staticcheck issue. While searching for the references of that SA6002 staticcheck issue on Github, the first result was Prometheus itself, with quite a lot of ignores of it. This changes the usages of `sync.Pool` to `zeropool.Pool[T]` where a pointer is not available. Also added a benchmark for HeadAppender Append/Commit when series already exist, which is one of the most usual cases IMO, as I didn't find any. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Improve BenchmarkHeadAppender with more cases Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * A little copying is better than a little dependency https://www.youtube.com/watch?v=PAAkCSZUG1c&t=9m28s Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Fix imports order Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add license header Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Copyright should be on one of the first 3 lines Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Use require.Equal for testing I don't depend on testify in my lib, but here we have it available. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Avoid flaky test Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Also use zeropool for pointsPool in engine.go Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> --------- Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2 years ago
Bryan Boreham	f2fd85df82	promql: use faster heap method for topk/bottomk Call `Fix()` instead of `Pop()` followed by `Push()`. This is slightly faster. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	cf54a14f9c	promql: add a benchmark for topk with k > 1 I picked k = 5. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	b987afa7ef	labels: simplify call to get Labels from Builder It took a `Labels` where the memory could be re-used, but in practice this hardly ever benefitted. Especially after converting `relabel.Process` to `relabel.ProcessBuilder`. Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil` so the slice was reallocated multiple times by `append`. Lastly `Builder.Labels()` now estimates that the final size will depend on labels added and deleted. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Filip Petkovski	3d7783e663	Add nolint for NewParser function Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>	2 years ago
Filip Petkovski	97c7fffbb8	parser: Allow parsing arbitrary functions In Thanos we would like to start experimenting with custom functions that are currently not part of the PromQL spec. We would do this by adding an implementation for those functions in the Thanos engine: https://github.com/thanos-community/promql-engine and allow users to decide which engine they want to use on a per-query basis. Since we use the PromQL parser from Prometheus, injecting functions in the global `Functions` variable would mean they also become available for the Prometheus engine. To avoid this side-effect, this commit exposes a Parser interface in which the supported functions can be injected as an option. If no functions are injected, the parser implementation will default to the functions defined in the global Functions variable. Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>	2 years ago
Trevor Whitney	dd94ebb87b	promql: set CounterResetHint after rate and sum Signed-off-by: Trevor Whitney <trevorjwhitney@gmail.com>	2 years ago
Justin Lei	60ad864667	Remove hacky promql.Test native histogram thing Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
Justin Lei	c16b6a0185	Handle native histograms in remote read Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
Julien Pivotto	1fd59791e1	Update tests Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2 years ago
Bryan Boreham	be4a9c25f0	promql: disable some slow cases in TestConcurrentRangeQueries TestConcurrentRangeQueries runs many queries, up to 4 at the same time, to try to expose any race conditions. This change stops four of them from running with a thousand or more steps: `holt_winters(a_X[1d], 0.3, 0.3)` `changes(a_X[1d])` `rate(a_X[1d])` `absent_over_time(a_X[1d])` Particularly when the test runs with `-race` in CI, this reduces the time and resources required. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
tyltr	24a9678dcc	typo 'efficcient' (#12090) Signed-off-by: tylitianrui <tylitianrui@126.com>	2 years ago
Justin Lei	af1d9e01c7	Refactor tsdbutil for tests/native histograms (#11948) * Add float histograms to ChunkFromSamplesGeneric Signed-off-by: Justin Lei <justin.lei@grafana.com> * Add GenerateSamples functions to tsdbutil Signed-off-by: Justin Lei <justin.lei@grafana.com> PR responses Signed-off-by: Justin Lei <justin.lei@grafana.com> --------- Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
beorn7	1cfc8f65a3	histograms: Return actually useful counter reset hints This is a bit more conservative than we could be. As long as a chunk isn't the first in a block, we can be pretty sure that the previous chunk won't disappear. However, the incremental gain of returning NotCounterReset in these cases is probably very small and might not be worth the code complications. With this, we now also pay attention to an explicitly set counter reset during ingestion. While the case doesn't show up in practice yet, there could be scenarios where the metric source knows there was a counter reset even if it might not be visible from the values in the histogram. It is also useful for testing. Signed-off-by: beorn7 <beorn@grafana.com>	2 years ago
Bryan Boreham	9ae3572d24	TestConcurrentRangeQueries: log query with error We've seen some timeouts in CI, and wanted to know what queries are involved. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Ganesh Vernekar	3c2ea91a83	tsdb: Test gauge float histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Bryan Boreham	80ac0d7c82	promql: add benchmark for match against blank string Blank strings are not handled efficiently by tsdb. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Marc Tudurí	49f775d8a0	histograms: Add missing float histograms tests for PromQL (#11780) * test: TestSparseHistogramRate * test: TestSparseHistogram_HistogramQuantile * test: TestSparseHistogram_HistogramFraction * test: TestSparseHistogram_HistogramFraction * test: TestSparseHistogram_Sum_Count_AddOperator * test: TestSparseHistogram_HistogramCountAndSum * tests: fix TestSparseHistogram_HistogramCountAndSum * linter * refactor TestSparseHistogram_HistogramCountAndSum * wrap TestSparseHistogram_HistogramCountAndSum Signed-off-by: Marc Tuduri <marctc@protonmail.com>	2 years ago
Marc Tudurí	9474610baf	Support FloatHistogram in TSDB (#11522) Extends Appender.AppendHistogram function to accept the FloatHistogram. TSDB supports appending, querying, WAL replay, for this new type of histogram. Signed-off-by: Marc Tudurí <marctc@protonmail.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2 years ago
Bryan Boreham	1b0a29701b	promql: optimise aggregation with no labels For a query like 'sum (foo)', we can quickly skip to the empty labels that its result needs. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	aafef011b7	Promql: reuse LabelBuilder in aggregations We have a LabelBuilder in EvalNodeHelper; use it instead of creating a new one at every step. Need to take some care that different uses of enh.lb do not overlap. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	2c382f5e24	promql: extract function to initialize LabelBuilder Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	dbd7021cc2	promql: add test for race conditions in query engine (#11743) * promql: refactor BenchmarkRangeQuery so we can re-use test cases Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * promql: add test for race conditions in query engine Note we skip large count_values queries - `count_values` allocates a slice per unique value in the output, and this test has unique values on every step of every series so it adds up to a lot of slices. Add Go runtime overhead for checking `-race`, and it chews up many gigabytes. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * TestConcurrentRangeQueries: wait before starting goroutine Instead of starting 100 goroutines which just wait for the semaphore. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago
Bryan Boreham	aa634e0b7e	Update package promql tests for new labels.Labels type Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 years ago

838 Commits (9e90b90eb3eddb119d694e20a8bf2d08c75b5eb5)