promql: fix some aggregations for histograms
This PR fixes the behaviour of `topk`,`bottomk`, `limitk` and `limit_ratio` with histograms. The fixed behaviour are as follows:
- For `topk` and `bottomk` histograms are ignored and add info annotations added.
- For `limitk` and `limit_ratio` histograms are included in the results(if applicable).
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
promqltest: Complete the tests for info annotations
So far, we did not test for the _absence_ of an info annotation
(because many tests triggered info annotations, which we haven't taken
into account so far).
The test for info annotations was also missed for range queries.
This completes the tests for info annotations (and refactors the many
`if` statements into a somewhat more compact `switch` statement).
It fixes most tests to not emit an info annotation anymore. Or it
changes the `eval` to `eval_info` where we actually want to test for
the info annotation.
It also fixes a few spelling errors in comments.
---------
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
PromQL: Correct the behaviour of some operator and aggregators with Native Histograms
---------
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
promql: Fix stddev/stdvar when aggregating histograms, NaNs, and Infs
Native histograms are ignored when calculating stddev or stdvar.
However, for the first series of each group, a `groupedAggregation` is
always created. If the first series that was encountered is a histogram
then it acts as the equivalent of a 0 point.
This change creates the first `groupedAggregation` with the `seen` field set to `false` if the point is a
histogram, thus ignoring it like the rest of the aggregation function does. A new `groupedAggregation`
will then be created once an actual float value is encountered.
This commit also sets the `floatValue` field of the `groupedAggregation` to `NaN`, if the first
float value of a group is `NaN` or `±Inf`, so that the outcome is consistently `NaN` once those
values are in the mix.
(The added tests fail without this change).
Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
Signed-off-by: beorn7 <beorn@grafana.com>
---------
Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
Signed-off-by: beorn7 <beorn@grafana.com>
Co-authored-by: beorn7 <beorn@grafana.com>
The basic idea here is that the previous code was always doing
incremental calculation of the mean value, which is more costly and
can be less precise. It protects against overflows, but in most cases,
an overflow doesn't happen anyway.
The other idea applied here is to expand on #14074, where Kahan
summation was applied to sum().
With this commit, the average is calculated in a conventional way
(adding everything up and divide in the end) as long as the sum isn't
overflowing float64. This is combined with Kahan summation so that the
avg aggregation, in most cases, is really equivalent to the sum
aggregation with a following division (which is the user's expectation
as avg is supposed to be syntactic sugar for sum with a following
divison).
If the sum hits ±Inf, the calculation reverts to incremental
calculation of the mean value. Kahan summation is also applied here,
although it cannot fully compensate for the numerical errors
introduced by the incremental mean calculation. (The tests added in
this commit would fail if incremental mean calculation was always
used.)
Signed-off-by: beorn7 <beorn@grafana.com>
This can give a more precise result, by keeping a separate running
compensation value to accumulate small errors.
See https://en.wikipedia.org/wiki/Kahan_summation_algorithm
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>