Commit Graph

67 Commits (98dcd28b1afe9a0db08dc00de090208e892ba473)

Author SHA1 Message Date
Björn Rabenstein df9916ef66
Merge pull request #14677 from prometheus/beorn7/histogram
promql(native histograms): Introduce exponential interpolation
2024-09-19 18:08:59 +02:00
Jan Fajerski 96e5a94d29 promql: rename holt_winters to double_exponential_smoothing
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-09-19 15:29:01 +02:00
beorn7 6fcd225aee promql(native histograms): Introduce exponential interpolation
The linear interpolation (assuming that observations are uniformly
distributed within a bucket) is a solid and simple assumption in lack
of any other information. However, the exponential bucketing used by
standard schemas of native histograms has been chosen to cover the
whole range of observations in a way that bucket populations are
spread out over buckets in a reasonably way for typical distributions
encountered in real-world scenarios.

This is the origin of the idea implemented here: If we divide a given
bucket into two (or more) smaller exponential buckets, we "most
naturally" expect that the samples in the original buckets will split
among those smaller buckets in a more or less uniform fashion. With
this assumption, we end up with an "exponential interpolation", which
therefore appears to be a better match for histograms with exponential
bucketing.

This commit leaves the linear interpolation in place for NHCB, but
changes the interpolation for exponential native histograms to
exponential. This affects `histogram_quantile` and
`histogram_fraction` (because the latter is more or less the inverse
of the former).

The zero bucket has to be treated specially because the assumption
above would lead to an "interpolation to zero" (the bucket density
approaches infinity around zero, and with the postulated uniform usage
of buckets, we would end up with an estimate of zero for all quantiles
ending up in the zero bucket). We simply fall back to linear
interpolation within the zero bucket.

At the same time, this commit makes the call to stick with the
assumption that the zero bucket only contains positive observations
for native histograms without negative buckets (and vice versa). (This
is an assumption relevant for interpolation. It is a mostly academic
point, as the zero bucket is supposed to be very small anyway.
However, in cases where it _is_ relevantly broad, the assumption helps
a lot in practice.)

This commit also updates and completes the documentation to match both
details about interpolation.

As a more high level note: The approach here attempts to strike a
balance between a more simplistic approach without any assumption, and
a more involved approach with more sophisticated assumptions. I will
shortly describe both for reference:

The "zero assumption" approach would be to not interpolate at all, but
_always_ return the harmonic mean of the bucket boundaries of the
bucket the quantile ends up in. This has the advantage of minimizing
the maximum possible relative error of the quantile estimation.
(Depending on the exact definition of the relative error of an
estimation, there is also an argument to return the arithmetic mean of
the bucket boundaries.) While limiting the maximum possible relative
error is a good property, this approach would throw away the
information if a quantile is closer to the upper or lower end of the
population within a bucket. This can be valuable trending information
in a dashboard. With any kind of interpolation, the maximum possible
error of a quantile estimation increases to the full width of a bucket
(i.e. it more than doubles for the harmonic mean approach, and
precisely doubles for the arithmetic mean approach). However, in
return the _expectation value_ of the error decreases. The increase of
the theoretical maximum only has practical relevance for pathologic
distributions. For example, if there are thousand observations within
a bucket, they could _all_ be at the upper bound of the bucket. If the
quantile calculation picks the 1st observation in the bucket as the
relevant one, an interpolation will yield a value close to the lower
bucket boundary, while the true quantile value is close to the upper
boundary.

The "fancy interpolation" approach would be one that analyses the
_actual_ distribution of samples in the histogram. A lot of statistics
could be applied based on the information we have available in the
histogram. This would include the population of neighboring (or even
all) buckets in the histogram. In general, the resolution of a native
histogram should be quite high, and therefore, those "fancy"
approaches would increase the computational cost quite a bit with very
little practical benefits (i.e. just tiny corrections of the estimated
quantile value). The results are also much harder to reason with.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-09-19 14:19:10 +02:00
Jan Fajerski 15cea39136 promql: put holt_winters behind experimental feature flag
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-09-18 15:39:58 +02:00
Björn Rabenstein 7fad1ec8ee
Merge pull request #14655 from suntala/suntala/sort-by-label-enhancement
promql: Fall back to full label sets when sorting by label
2024-08-21 12:28:55 +02:00
suntala ce4eac859a Link to specific feature flag entry
Signed-off-by: suntala <arati.rana@grafana.com>
2024-08-14 13:22:12 +02:00
suntala 532904a1d6 Document changes to sort by label
Co-authored-by: Aleks Fazlieva <britishrum@users.noreply.github.com>
Signed-off-by: suntala <arati.rana@grafana.com>
2024-08-11 21:44:03 +02:00
suntala 77d111e501 Fix links to feature flags page
Signed-off-by: suntala <arati.rana@grafana.com>
2024-08-02 14:25:22 +02:00
咸鱼暄 bab098a4c1 change all lists to bullets
Signed-off-by: 咸鱼暄 <30610597+smd1121@users.noreply.github.com>
2024-07-11 17:05:23 +02:00
咸鱼暄 ad03ede602 fix markdown list
Signed-off-by: 咸鱼暄 <30610597+smd1121@users.noreply.github.com>
2024-07-11 17:05:23 +02:00
Rick Rackow 9290d1308d
fix(docs/querying): explain `ceil` behaviour more explicitly with examples (#11987)
* fix(docs/querying): explain `ceil` behaviour more explicitly with examples

Signed-off-by: Rick Rackow <rick.rackow@gmail.com>

* fix(docs/querying): explain `floor` behaviour more explicitly with examples

Signed-off-by: Rick Rackow <rick.rackow@paymenttools.com>

---------

Signed-off-by: Rick Rackow <rick.rackow@gmail.com>
Signed-off-by: Rick Rackow <rick.rackow@paymenttools.com>
Co-authored-by: Rick Rackow <rick.rackow@paymenttools.com>
2024-06-28 22:18:04 +02:00
Charles Korn 76b1237215
Document sorting behaviour
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-05-17 13:54:08 +10:00
Faustas Butkus 6feffeb92e
promql: add histogram_avg function (#13467)
Add histogram_avg function

---------

Signed-off-by: Faustas Butkus <faustas.butkus@gmail.com>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-02-01 18:28:42 +01:00
Björn Rabenstein 89523cf9b3
doc: Mark `mad_over_time` as experimental (#13440)
We forgot to do that in
https://github.com/prometheus/prometheus/pull/13059

Signed-off-by: beorn7 <beorn@grafana.com>
2024-01-23 17:05:34 +01:00
Ivan Babrou a6b35ff304
promql: use natural sort in sort_by_label and sort_by_label_desc (#13411)
These functions are intended for humans, as robots can already sort the results
however they please. Humans like things sorted "naturally":

* https://blog.codinghorror.com/sorting-for-humans-natural-sort-order/

A similar thing has been done to Grafana, which is also used by humans:

* https://github.com/grafana/grafana/pull/78024
* https://github.com/grafana/grafana/pull/78494

Signed-off-by: Ivan Babrou <github@ivan.computer>
2024-01-16 21:34:09 -03:00
Jeanette Tan 9bf4cc993e Add mad_over_time function
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-12-01 01:22:58 +08:00
beorn7 0eb0ca42c5 Update “conventional histogram” → “classic histogram”
Signed-off-by: beorn7 <beorn@grafana.com>
2023-11-29 15:22:58 +01:00
Julien Pivotto c1ec6ae851 sort_by_label: Switch to feature flag
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-11-28 15:10:12 +01:00
Alexander Trost 5051a993ab promql: add sort_by_label and sort_by_label_desc functions
This adds functions to sort a vector by its label value.

Based on https://github.com/prometheus/prometheus/pull/1533

Signed-off-by: Alexander Trost <galexrt@googlemail.com>
2023-11-28 14:40:07 +01:00
zenador ccfe14d7e7
PromQL: ignore small errors for bucketQuantile (#13153)
promql: Improve histogram_quantile calculation for classic buckets

Tiny differences between classic buckets are most likely caused by floating point precision issues. With this commit, relative changes below a certain threshold are ignored. This makes the result of histogram_quantile more meaningful, and also avoids triggering the _input to histogram_quantile needed to be fixed for monotonicity_ annotations in unactionable cases.

This commit also adds explanation of the new adjustment and of the monotonicity annotation to the documentation of `histogram_quantile`.

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-11-25 00:05:38 +01:00
Julien Pivotto ac0919d48c
Update docs/querying/functions.md
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-09-29 19:10:41 +02:00
lasea75 f15f0ac16a
Update functions.md
Signed-off-by: lasea75 <lasea75@gmail.com>
2023-09-28 12:26:46 -05:00
zenador 54aaa2bd7e
Add `histogram_stdvar` and `histogram_stddev` functions (#12614)
* Add new function: histogram_stdvar and histogram_stddev

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-08-24 21:02:14 +02:00
Salar Nosrati-Ershad fd96996b75 docs: fix: correct reference to native histograms feature flag
Signed-off-by: Salar Nosrati-Ershad <snosratiershad@gmail.com>
2023-08-19 17:35:20 +03:30
Julien Pivotto 4c81a8f681
Merge pull request #11578 from chancefeick/fix/querying-documentation
Fix Querying Documentation Links
2023-08-08 09:21:31 +02:00
Julien Pivotto bb90379163
Merge pull request #11404 from gberche-orange/patch-2
docs (label_replace): illustrate use of named capturing group
2023-07-28 13:23:29 +02:00
Guillaume Berche f5fb37dbab
Update functions.md
Add missing linefeed as requested by https://github.com/prometheus/prometheus/pull/11404#discussion_r1266625301

Signed-off-by: Guillaume Berche <guillaume.berche@orange.com>
2023-07-19 08:27:15 +02:00
Guillaume Berche 60b380da70
Refine functions.md as suggested during review
See https://github.com/prometheus/prometheus/pull/11404#issuecomment-1631165746

Signed-off-by: Guillaume Berche <guillaume.berche@orange.com>
2023-07-12 09:08:23 +02:00
Ziqi Zhao 42d9169ba1 enhance histogram_quantile to get min/max value
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2023-07-12 04:29:54 +08:00
Carrie Edwards f93ac97867 Update querying function docs
Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com>
2023-07-11 21:51:20 +08:00
marcoderama a308ea773d
Update functions.md
Fix small typo

Signed-off-by: marcoderama <marcoderamagit@gmail.com>
2023-05-26 16:39:55 -07:00
zenador 191bf9055b
Handle more arithmetic operators for native histograms (#12262)
Handle more arithmetic operators and aggregators for native histograms

This includes operators for multiplication (formerly known as scaling), division, and subtraction. Plus aggregations for average and the avg_over_time function.

Stdvar and stddev will (for now) ignore histograms properly (rather than counting them but adding a 0 for them).

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-05-16 21:15:20 +02:00
beorn7 c0879d64cf promql: Separate `Point` into `FPoint` and `HPoint`
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.

This seemingly small change has a _lot_ of repercussions throughout
the codebase.

The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.

The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs needed to be evaluated.

The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.

First runtime v2.39 compared to directly prior to this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     542µs ± 1%  +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     617µs ± 2%  +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    32.8ms ± 1%   +8.01%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.6ms ± 2%   +9.64%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     117ms ± 1%  +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     876ms ± 1%  +11.83%  (p=0.000 n=9+10)
```

And then runtime v2.39 compared to after this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     547µs ± 1%  +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     616µs ± 2%  +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    7.95ms ± 1%   +1.59%  (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.02ms ± 1%   +9.80%  (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    10.8ms ± 1%   +3.08%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    78.1ms ± 1%   +0.58%  (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.0ms ± 1%   +7.98%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     107ms ± 1%   +1.92%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     775ms ± 1%   -1.02%  (p=0.019 n=9+9)
```

In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the one that matter most (longest
absolute runtime).

In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:16 +02:00
Chance Feick 52270d6216 Fix relative link to use .md file extension
Signed-off-by: Chance Feick <cfeick@gitlab.com>
2022-11-14 13:30:22 -08:00
Björn Rabenstein 41035469d3
Document the native histogram feature flag and PromQL (#11446)
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-10-14 18:16:12 +05:30
Guillaume Berche e6b84ac1e0
functions.md doc refinement
Removed named capture group example in label_replace

Signed-off-by: Guillaume Berche <guillaume.berche@orange.com>
2022-10-13 10:40:52 +02:00
Jesus Vazquez e934d0f011 Merge 'main' into sparsehistogram
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
2022-10-05 22:14:49 +02:00
Guillaume Berche ea40c15fe9
Refine documentation for label_replace
Illustrate use of named capturing group syntax available from https://github.com/google/re2/wiki/Syntax and their usage in the replacement field

Signed-off-by: Guillaume Berche <guillaume.berche@orange.com>
2022-10-03 18:06:27 +02:00
Björn Rabenstein bcd548c88b
Merge pull request #10076 from mtfoley/docs-deriv
docs: update function docs on deriv
2022-08-11 11:49:21 +02:00
beorn7 9eafed0f79 promql: Add `histogram_count` and `histogram_sum`
This follow a simple function-based approach to access the count and
sum fields of a native Histogram. It might be more elegant to
implement “accessors” via the dot operator, as considered in the
brainstorming doc [1]. However, that would require the introduction of
a whole new concept in PromQL. For the PoC, we should be fine with the
function-based approch. Even the obvious inefficiencies (rate'ing a
whole histogram twice when we only want to rate each the count and the
sum once) could be optimized behind the scenes.

Note that the function-based approach elegantly solves the problem of
detecting counter resets in the sum of observations in the case of
negative observations. (Since the whole native Histogram is rate'd,
the counter reset is detected for the Histogram as a whole.)

We will decide later if an “accessor” approach is really needed. It
would change the example expression for average duration in
functions.md from

      histogram_sum(rate(http_request_duration_seconds[10m]))
	/
      histogram_count(rate(http_request_duration_seconds[10m]))

to

      rate(http_request_duration_seconds.sum[10m])
	/
      rate(http_request_duration_seconds.count[10m])

[1]: https://docs.google.com/document/d/1ch6ru8GKg03N02jRjYriurt-CZqUVY09evPg6yKTA1s/edit

Signed-off-by: beorn7 <beorn@grafana.com>
2022-06-28 18:16:48 +02:00
beorn7 a3a8f58bb3 promql: Add histogram_fraction function
Signed-off-by: beorn7 <beorn@grafana.com>
2022-06-28 15:58:03 +02:00
Julien Pivotto 42f574d5ac
Remove "This function was added in Prometheus 2.0" (#10719)
Since Prometheus documentation is versioned, do not write down that a
specific function was added in Prom 2.0, for consistency.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2022-05-20 21:43:58 +02:00
Ivo Gosemann e22b54e253 Adds day_of_year function to PromQL
Signed-off-by: Ivo Gosemann <ivo.gosemann@sap.com>
2022-05-20 14:08:34 +02:00
Matthew b6630876d3 remove impl details from docs
Signed-off-by: mtfoley <mtfoley.mae@gmail.com>
2022-02-19 08:44:43 -05:00
Matthew 00578d245b
Merge branch 'prometheus:main' into docs-deriv 2022-02-19 08:37:09 -05:00
jyz0309 02e032884a add doc
Signed-off-by: jyz0309 <45495947@qq.com>
2022-02-13 21:59:03 +08:00
Matthew 57b86cfe9e docs: update function docs on deriv
Signed-off-by: mtfoley <mtfoley.mae@gmail.com>
2021-12-22 21:32:04 -05:00
Levi Harrison 6faca22eec Add inverse hyperbolic functions
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-09-15 22:47:10 -04:00
Levi Harrison 74faea64dd Removed specification of pi digits
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-09-15 22:47:10 -04:00
Levi Harrison 53d88fd147 Added hyperbolic trig functions
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-09-15 22:47:10 -04:00