Commit Graph

77 Commits (ca3119bd24c451851c15e47642e85b449552d85c)

Author SHA1 Message Date
Björn Rabenstein 125a90899c
promqltest: Complete the tests for info annotations (#15429)
promqltest: Complete the tests for info annotations

So far, we did not test for the _absence_ of an info annotation
(because many tests triggered info annotations, which we haven't taken
into account so far).

The test for info annotations was also missed for range queries.

This completes the tests for info annotations (and refactors the many
`if` statements into a somewhat more compact `switch` statement).

It fixes most tests to not emit an info annotation anymore. Or it
changes the `eval` to `eval_info` where we actually want to test for
the info annotation.

It also fixes a few spelling errors in comments.

---------

Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-21 14:20:38 +01:00
Neeraj Gartia a6fb16fcb4
PromQL: Convert more native histogram tests to promql-test framework (#15419)
This converts `TestNativeHistogram_SubOperator` to the promql testing framework. It also removes `TestNativeHistogram_Sum_Count_Add_AvgOperator`, which got converted earlier.

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-11-20 11:41:36 +01:00
Björn Rabenstein 4ef1170868
Merge pull request #15422 from NeerajGartia21/promql-corrections
[BUGFIX] PromQL: Fix `count_values` for histograms
2024-11-20 11:27:32 +01:00
George Krajcsovits 5cd9855999
tests(promql/testdata): add regression test for and-on (#15425)
* tests(promql/testdata): add regression test for and-on

I'd like to use queries of the form "x and on() (vector(y)==1)" to be
able to include and exclude series for dashboards. This helps migration
to native histograms in dashboards by using a dashboard variable to
set "y" to either -1 or 1 to exclude or include the result.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-11-20 10:29:18 +01:00
Neeraj Gartia 048222867a fix count_values for histograms
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-11-20 02:07:31 +05:30
Charles Korn 62e6e55c07
promql: fix issues with comparison binary operations with `bool` modifier and native histograms (#15413)
* Fix issue where comparison operations with `bool` modifier and native histograms return histograms rather than 0 or 1

* Don't emit anything for comparisons between floats and histograms when `bool` modifier is used

* Don't emit anything for comparisons between floats and histograms when `bool` modifier is used between a vector and a scalar

---------

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-11-19 09:13:34 +01:00
Neeraj Gartia 789c9b1a5e
[BUGFIX] PromQL: Corrects the behaviour of some operator and aggregators with Native Histograms (#15245)
PromQL: Correct the behaviour of some operator and aggregators with Native Histograms

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-11-12 15:37:05 +01:00
Joshua Hesketh ed2668bbda
Merge branch 'main' into jhesketh/clamp
Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
2024-11-12 10:20:58 +11:00
Bryan Boreham 6979b237b9 Merge branch 'release-2.55' into merge-2.55-into-main2 2024-11-04 12:58:38 +00:00
Ben Ye b7aca45de7 fix round function ignoring enableDelayedNameRemoval feature flag
Signed-off-by: Ben Ye <benye@amazon.com>
2024-10-31 00:30:22 -07:00
Joshua Hesketh 14ef1ce8ab Round function should ignore native histograms
As per the documentation, native histograms are skipped. This is in line with other simpleFunc's.

Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
2024-10-17 15:39:48 +11:00
Joshua Hesketh 5a4e4f6936
Fix stddev/stdvar when aggregating histograms, NaNs, and infinities (#14941)
promql: Fix stddev/stdvar when aggregating histograms, NaNs, and Infs

Native histograms are ignored when calculating stddev or stdvar.

However, for the first series of each group, a `groupedAggregation` is
always created. If the first series that was encountered is a histogram
then it acts as the equivalent of a 0 point.

This change creates the first `groupedAggregation` with the `seen` field set to `false` if the point is a
histogram, thus ignoring it like the rest of the aggregation function does. A new `groupedAggregation`
will then be created once an actual float value is encountered.

This commit also sets the `floatValue` field of the `groupedAggregation` to `NaN`, if the first
float value of a group is `NaN` or `±Inf`, so that the outcome is consistently `NaN` once those
values are in the mix.

(The added tests fail without this change).

Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
Signed-off-by: beorn7 <beorn@grafana.com>

---------

Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
Signed-off-by: beorn7 <beorn@grafana.com>
Co-authored-by: beorn7 <beorn@grafana.com>
2024-10-16 15:00:46 +02:00
Joshua Hesketh 31d19381f6 Clamp functions should ignore native histograms
As per the documentation, native histograms are skipped. This is in line
with other `simpleFunc`'s.

Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
2024-10-16 15:10:54 +11:00
Neeraj Gartia d4b1f9eb33
Corrects the behaviour of binary opperators between histogram and float (#14726)
promql: corrects binary operators functioning for mixed sample with histogram and float

For invalid pairings of sample types, an annotation is added now.

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-10-15 14:44:36 +02:00
Fiona Liao 8650d25804
Add additional basic nhcb unit tests (#15086)
* Add additional basic nhcb unit tests
* Update promql/promqltest/testdata/histograms.test

Signed-off-by: Fiona Liao <fiona.liao@grafana.com>
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2024-10-08 14:34:32 +02:00
Björn Rabenstein df9916ef66
Merge pull request #14677 from prometheus/beorn7/histogram
promql(native histograms): Introduce exponential interpolation
2024-09-19 18:08:59 +02:00
Jan Fajerski 96e5a94d29 promql: rename holt_winters to double_exponential_smoothing
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-09-19 15:29:01 +02:00
Björn Rabenstein 1639450172 Merge pull request #14821 from charleskorn/nh-negative-multiplication-division
promql: correctly handle unary negation of native histograms and add tests for multiplication and division of native histograms by negative scalars
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-19 14:07:37 +01:00
beorn7 6fcd225aee promql(native histograms): Introduce exponential interpolation
The linear interpolation (assuming that observations are uniformly
distributed within a bucket) is a solid and simple assumption in lack
of any other information. However, the exponential bucketing used by
standard schemas of native histograms has been chosen to cover the
whole range of observations in a way that bucket populations are
spread out over buckets in a reasonably way for typical distributions
encountered in real-world scenarios.

This is the origin of the idea implemented here: If we divide a given
bucket into two (or more) smaller exponential buckets, we "most
naturally" expect that the samples in the original buckets will split
among those smaller buckets in a more or less uniform fashion. With
this assumption, we end up with an "exponential interpolation", which
therefore appears to be a better match for histograms with exponential
bucketing.

This commit leaves the linear interpolation in place for NHCB, but
changes the interpolation for exponential native histograms to
exponential. This affects `histogram_quantile` and
`histogram_fraction` (because the latter is more or less the inverse
of the former).

The zero bucket has to be treated specially because the assumption
above would lead to an "interpolation to zero" (the bucket density
approaches infinity around zero, and with the postulated uniform usage
of buckets, we would end up with an estimate of zero for all quantiles
ending up in the zero bucket). We simply fall back to linear
interpolation within the zero bucket.

At the same time, this commit makes the call to stick with the
assumption that the zero bucket only contains positive observations
for native histograms without negative buckets (and vice versa). (This
is an assumption relevant for interpolation. It is a mostly academic
point, as the zero bucket is supposed to be very small anyway.
However, in cases where it _is_ relevantly broad, the assumption helps
a lot in practice.)

This commit also updates and completes the documentation to match both
details about interpolation.

As a more high level note: The approach here attempts to strike a
balance between a more simplistic approach without any assumption, and
a more involved approach with more sophisticated assumptions. I will
shortly describe both for reference:

The "zero assumption" approach would be to not interpolate at all, but
_always_ return the harmonic mean of the bucket boundaries of the
bucket the quantile ends up in. This has the advantage of minimizing
the maximum possible relative error of the quantile estimation.
(Depending on the exact definition of the relative error of an
estimation, there is also an argument to return the arithmetic mean of
the bucket boundaries.) While limiting the maximum possible relative
error is a good property, this approach would throw away the
information if a quantile is closer to the upper or lower end of the
population within a bucket. This can be valuable trending information
in a dashboard. With any kind of interpolation, the maximum possible
error of a quantile estimation increases to the full width of a bucket
(i.e. it more than doubles for the harmonic mean approach, and
precisely doubles for the arithmetic mean approach). However, in
return the _expectation value_ of the error decreases. The increase of
the theoretical maximum only has practical relevance for pathologic
distributions. For example, if there are thousand observations within
a bucket, they could _all_ be at the upper bound of the bucket. If the
quantile calculation picks the 1st observation in the bucket as the
relevant one, an interpolation will yield a value close to the lower
bucket boundary, while the true quantile value is close to the upper
boundary.

The "fancy interpolation" approach would be one that analyses the
_actual_ distribution of samples in the histogram. A lot of statistics
could be applied based on the information we have available in the
histogram. This would include the population of neighboring (or even
all) buckets in the histogram. In general, the resolution of a native
histogram should be quite high, and therefore, those "fancy"
approaches would increase the computational cost quite a bit with very
little practical benefits (i.e. just tiny corrections of the estimated
quantile value). The results are also much harder to reason with.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-09-19 14:19:10 +02:00
Jan Fajerski 91608c002f Merge branch 'main' into release-3.0-beta.0
Conflicts:
	scrape/scrape_test.go
          Pick both changes.
2024-09-10 20:51:20 +02:00
Charles Korn e8c7482137
Return negative counts when multiplied or divided by a negative value
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-09-09 14:37:59 +10:00
Charles Korn 113de6301c
Add failing test cases for unary negation and multiplication and division with negative scalars
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-09-04 16:20:28 +10:00
Charles Korn 9b451abec7
Make positive and negative bucket counts different in existing test cases
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-09-04 16:08:05 +10:00
Jan Fajerski 956245b25b promqltest: adjust eval times and range selector
In order to fix new tests for changes added in
https://github.com/prometheus/prometheus/pull/13904.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-09-02 11:27:39 +02:00
Jan Fajerski 00315ce15e Merge branch 'main' into 3.0-main-sync-24-08-30
using -Xours

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-09-02 11:27:18 +02:00
Neeraj Gartia 8c7bf39d96
Moves TestNativeHistogram_MulDivOperator to promql testing framework (#14688)
PromQL: add test for mul and div operator

Also, remove the converted test from the engine_test.go file.

This also includes an extension of the test framework to allow NaN/Inf in histogram buckets.

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-08-29 16:42:35 +02:00
Jorge Creixell e9e3d64b7c
PromQL engine: Delay deletion of __name__ label to the end of the query evaluation (#14477)
PromQL engine: Delay deletion of __name__ label to the end of the query evaluation

  - This change allows optionally preserving the `__name__` label via the `label_replace` and `label_join` functions, and helps prevent the dreaded "vector cannot contain metrics with the same labelset" error.
  - The implementation extends the `Series` and `Sample` structs with a boolean flag indicating whether the `__name__` label should be deleted at the end of the query evaluation.
  - The `label_replace` and `label_join` functions can still access the value of the `__name__` label, even if it has been previously marked for deletion. If  `__name__` is used as target label, it won't be dropped at the end of the query evaluation.
  - Fixes https://github.com/prometheus/prometheus/issues/11397
  - See https://github.com/jcreixell/prometheus/pull/2 for previous discussion, including the decision to create this PR and benchmark it before considering other alternatives (like refactoring `labels.Labels`).
  - See https://github.com/jcreixell/prometheus/pull/1 for an alternative implementation using a special label instead of boolean flags.
  - Note: a feature flag  `promql-delayed-name-removal` has been added as it changes the behavior of some "weird" queries (see https://github.com/prometheus/prometheus/issues/11397#issuecomment-1451998792)

Example (this always fails, as `__name__` is being dropped by `count_over_time`):

```
count_over_time({__name__!=""}[1m])

=> Error executing query: vector cannot contain metrics with the same labelset
```

Before:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")

=> Error executing query: vector cannot contain metrics with the same labelset
```

After:

```
label_replace(count_over_time({__name__!=""}[1m]), "__name__", "count_$1", "__name__", "(.+)")

=>
count_go_gc_cycles_automatic_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
count_go_gc_cycles_forced_gc_cycles_total{instance="localhost:9090", job="prometheus"} 1
...
```

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

---------

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
2024-08-29 15:50:39 +02:00
Björn Rabenstein 849215d90c
Merge pull request #14585 from fatsheep9146/covert-TestNativeHistogram_Sum_Count_Add_AvgOperator-to-framework
convert TestNativeHistogram_Sum_Count_Add_AvgOperator into testing framework
2024-08-28 17:21:32 +02:00
Jan Fajerski 7c8c748399 promql tests: adjust range query intervals
Some test queries need their interval adjusted to account for
https://github.com/prometheus/prometheus/pull/13904. Otherwise the
queries don't return enough samples.
promql/engine_test.go:TestHistogramCopyFromIteratorRegression needed the
same, but also the result needed a fix since `increase` interpolates
over the full range.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-08-21 12:33:52 +02:00
Björn Rabenstein 7fad1ec8ee
Merge pull request #14655 from suntala/suntala/sort-by-label-enhancement
promql: Fall back to full label sets when sorting by label
2024-08-21 12:28:55 +02:00
Jan Fajerski 5138922b0d Merge branch 'main' into 3.0-main-sync-24-08-21 2024-08-21 09:09:36 +02:00
Ziqi Zhao 8f828d45c1 convert TestNativeHistogram_Sum_Count_Add_AvgOperator into testing framework
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2024-08-21 09:24:50 +08:00
Charles Korn 52818a97e2
Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms
# Conflicts:
#	promql/promqltest/testdata/native_histograms.test
2024-08-14 07:52:08 +10:00
György Krajcsovits 386fc8b9f6 Update from review comments.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-13 15:26:07 +02:00
György Krajcsovits 6aee5b4b38 fix typo
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-12 12:04:45 +02:00
György Krajcsovits 06a8886b94 Native histograms: define behavior when rate is null.
Histogram quantile returns NaN in this case, which might be
surprising, so add a unit test that clarifies that this is
intentional.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-12 10:40:21 +02:00
suntala fd2f44af7f Fall back to comparing by label set when sorting by label desc
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-08-11 21:44:03 +02:00
suntala 94ad489328 Fall back to comparing by label set when sorting by label
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-08-11 21:44:03 +02:00
Charles Korn f992f81bd0
Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms
Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com>
2024-08-09 13:58:54 +10:00
Charles Korn 5cfdde327c
Address PR feedback: add extra test case
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-09 13:57:37 +10:00
Charles Korn 82bb35fabb
Address PR feedback: fix typo and rename variable
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-09 13:51:31 +10:00
Charles Korn f07b3ae67b
Fix issue where `avg` over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results or panics
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 15:32:35 +10:00
Charles Korn 5ee94f49a2
Fix issue where `sum` over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 15:30:01 +10:00
Charles Korn 424cefcf5e
Fix "cannot reduce resolution to custom buckets schema" panic in `rate` over native histograms with mix of custom and exponential buckets
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 14:45:02 +10:00
Björn Rabenstein ee5bba07c0
Merge pull request #14413 from prometheus/beorn7/promql
promql: more Kahan summation (avg) and less incremental mean calculation (avg, avg_over_time)
2024-08-06 19:56:32 +02:00
Charles Korn aadec25faf
promql: Fix issue where some native histogram-related annotations are not emitted by `rate` (#14575)
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-06 09:10:40 +01:00
Jan Fajerski adf5d6bce1 Merge branch 'main' into 3.0-main-sync-24-07-18
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

Conflicts:
	VERSION
          pick 3.0.0
	promql/promqltest/testdata/histograms.test
          pick changes from c39776c5b5,
          but adjust 5m range selectors to 10m to account for
          https://github.com/prometheus/prometheus/pull/13904.
Fixes:
        promql/promqltest/testdata/functions.test
	promql/promqltest/testdata/staleness.test
          Tests added in https://github.com/prometheus/prometheus/pull/9138
          need to be adjusted to account for
          https://github.com/prometheus/prometheus/pull/13904.
2024-07-18 15:56:40 +02:00
beorn7 c39776c5b5 promql: Add NHCB tests
This adds equivalent NHCB tests to the existing classic histogram
tests.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-07-16 12:20:43 +02:00
beorn7 cff0429b1a promql: make avg_over_time faster and more precise
Same idea as for the avg aggregator before: Most of the time, there is
no overflow, so we don't have to revert to the more expensive and less
precise incremental calculation of the mean value.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-07-10 19:20:24 +02:00
beorn7 c46074f4dd promql: make avg aggregation more precise and less expensive
The basic idea here is that the previous code was always doing
incremental calculation of the mean value, which is more costly and
can be less precise. It protects against overflows, but in most cases,
an overflow doesn't happen anyway.

The other idea applied here is to expand on #14074, where Kahan
summation was applied to sum().

With this commit, the average is calculated in a conventional way
(adding everything up and divide in the end) as long as the sum isn't
overflowing float64. This is combined with Kahan summation so that the
avg aggregation, in most cases, is really equivalent to the sum
aggregation with a following division (which is the user's expectation
as avg is supposed to be syntactic sugar for sum with a following
divison).

If the sum hits ±Inf, the calculation reverts to incremental
calculation of the mean value. Kahan summation is also applied here,
although it cannot fully compensate for the numerical errors
introduced by the incremental mean calculation. (The tests added in
this commit would fail if incremental mean calculation was always
used.)

Signed-off-by: beorn7 <beorn@grafana.com>
2024-07-10 19:20:24 +02:00