This is a bit more conservative than we could be. As long as a chunk
isn't the first in a block, we can be pretty sure that the previous
chunk won't disappear. However, the incremental gain of returning
NotCounterReset in these cases is probably very small and might not be
worth the code complications.
Wwith this, we now also pay attention to an explicitly set counter
reset during ingestion. While the case doesn't show up in practice
yet, there could be scenarios where the metric source knows there was
a counter reset even if it might not be visible from the values in the
histogram. It is also useful for testing.
Signed-off-by: beorn7 <beorn@grafana.com>
Extends Appender.AppendHistogram function to accept the FloatHistogram. TSDB supports appending, querying, WAL replay, for this new type of histogram.
Signed-off-by: Marc Tudurí <marctc@protonmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
We have a LabelBuilder in EvalNodeHelper; use it instead of creating a new one at every step.
Need to take some care that different uses of enh.lb do not overlap.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* promql: refactor BenchmarkRangeQuery so we can re-use test cases
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* promql: add test for race conditions in query engine
Note we skip large count_values queries -
`count_values` allocates a slice per unique value in the output, and
this test has unique values on every step of every series so it adds up
to a lot of slices. Add Go runtime overhead for checking `-race`, and
it chews up many gigabytes.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* TestConcurrentRangeQueries: wait before starting goroutine
Instead of starting 100 goroutines which just wait for the semaphore.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
We use `labels.Builder` to parse metrics, to avoid depending on the
internal implementation. This is not efficient, but the feature is only
used in tests. It wasn't efficient previously either - calling `Sort()`
after adding each label.
`createLabelsForAbsentFunction` also uses a Builder now, and gets
an extra `map` to replace the previous `Has()` usage.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Fix up promql to compile with changes to Labels
Re-use previous memory if it is already of the correct type.
In `NewListSeries` we hoist the conversion to an interface value out
so it only allocates once.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Patterned after `Chunk.Iterator()`: pass the old iterator in so it
can be re-used to avoid allocating a new object.
(This commit does not do any re-use; it is just changing all the method
signatures so re-use is possible in later commits.)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Switch from 'sanity' to more inclusive lanuage
"Removing ableist language in code is important; it helps to create and
maintain an environment that welcomes all developers of all backgrounds,
while emphasizing that we as developers select the most articulate,
precise, descriptive language we can rather than relying on metaphors.
The phrase sanity check is ableist, and unnecessarily references mental
health in our code bases. It denotes that people with mental illnesses
are inferior, wrong, or incorrect, and the phrase sanity continues to be
used by employers and other individuals to discriminate against these
people."
From https://gist.github.com/seanmhanson/fe370c2d8bd2b3228680e38899baf5cc
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The bucket receiving math.MaxFloat64 observations now has
math.MaxFloat64 as upper bound, while the bucket after it (the last
possible bucket) has +Inf.
This also adds a test for getBound and moves the getBound code to
generic.go (where it should have been in the first place).
Signed-off-by: beorn7 <beorn@grafana.com>
* histogram: Simplify iterators
We don't really need currLower and currUpper and can calculate it when
needed (as already done for the floatBucketIterator). The calculation
is cheap, while keeping those extra variables around costs RAM
(potentially a lot with many iterators).
* histogram: Convert Bucket/FloatBucket to one generic type
* histogram: Move some bucket iterator code into generic base iterator
* histogram: Remove cumulative iterator for FloatHistogram
We added it in the past for completeness (Histogram has one), but it
has never been used. Plus, even the cumulative iterator for Histogram
is only there for test reasons.
We can always add it back, and then maybe even using generics.
Signed-off-by: beorn7 <beorn@grafana.com>
If we are populating series for a subquery then set the interval
parameter accordingly so that downstream users could use that
information.
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Use new experimental package `golang.org/x/exp/slices`.
slices.Sort works on values that are directly comparable, like ints,
so avoids the overhad of an interface call to `.Less()`.
Left tests unchanged, because they don't need the speed and it may be
a cross-check that slices.Sort gives the same answer.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
And use the new method to call to compact Histograms during
parsing. This happens for both `Histogram` and `FloatHistogram`. In
this way, if targets decide to optimize the exposition size by merging
spans with empty buckets in between, we still get a normalized
results. It will also normalize away any valid but weird
representations like empty spans, spans with offset zero, and empty
buckets at the start or end of a span.
The implementation seemed easy at first as it just turns the
`compactBuckets` helper into a generic function (which now got its own
file). However, the integer Histograms have delta buckets instead of
absolute buckets, which had to be treated specially in the generic
`compactBuckets` function. To make sure it works, I have added plenty
of explicit tests for `Histogram` in addition to the `FloatHistogram`
tests.
I have also updated the doc comment for the `Compact` method.
Based on the insights now expressed in the doc comment, compacting
with a maxEmptyBuckets > 0 is rarely useful. Therefore, this commit
also sets the value to 0 in the two cases we were using 3 so far. We
might still want to reconsider, so I don't want to remove the
maxEmptyBuckets parameter right now.
Signed-off-by: beorn7 <beorn@grafana.com>
And a few cases of `EmptyLabels()`.
Replacing code which assumes the internal structure of `Labels`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Update go to 1.19, set min version to 1.18
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
* Update golangci-lint
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
* model/relabel: Add benchmark
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* model/relabel: re-use Builder across relabels
Saves memory allocations.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* labels.Builder: allow re-use of result slice
This reduces memory allocations where the caller has a suitable slice available.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* model/relabel: re-use source values slice
To reduce memory allocations.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Unwind one change causing test failures
Restore original behaviour in PopulateLabels, where we must not overwrite the input set.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* relabel: simplify values optimisation
Use a stack-based array for up to 16 source labels, which will be the
vast majority of cases.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* lint
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Add histogram validation
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Correct negative offset validation
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Address review comments
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Validation benchmark
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Add more checks
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Attempt to fix tests
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Fix stuff
Signed-off-by: Levi Harrison <git@leviharrison.dev>
* Prettifier: Add spaces with non-callable keywords
I prefer to have a difference between, on one side: functions calls, end(), start(), and on the other side with, without, ignoring, by and group_rrigt, group_left.
The reasoning is that the former ones are not calls, while other are
functions. Additionally, it matches the examples in our documentation.
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
* Fix tests
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
We print the stacktrace of a panic when query causes one, but there's no
information about the query itself, which makes it harder to debug and
reproduce the issue.
This adds the 'expr' string to the logged panic.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
* Shortcut Matrix.ContainsSameLabelset()
It's quite often to execute this check on a Matrix that has zero or only
one series. There's no need to allocate a map for those cases.
There's also a one-liner for two-series case, so why not using it?
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Add license header
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Optimize Vector.ContainsSameLabelset
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Implement Pretty() function for AST nodes.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
This commit adds .Pretty() for all nodes of PromQL AST.
Each .Pretty() prettifies the node it belongs to, and under
no circustance, the parent or child node is touch/prettified.
Read more in the "Approach" part in `prettier.go`
* Refactor functions between printer.go & prettier.go
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
This commit removes redundancy between printer.go and prettier.go
by taking out the common code into separate private functions.
* Add more unit tests for Prettier.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
* Add support for spliting function calls with 1 arg & unary expressions.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
This commit does 2 things:
1. It adds support to split function calls that have 1 arg and exceeds the max_characters_per_line
to multiple lines.
2. Splits Unary expressions that exceed the max_characters_per_line. This is done by formatting the child node
and then removing the prefix indent, which is already applied before the unary operator.
This follow a simple function-based approach to access the count and
sum fields of a native Histogram. It might be more elegant to
implement “accessors” via the dot operator, as considered in the
brainstorming doc [1]. However, that would require the introduction of
a whole new concept in PromQL. For the PoC, we should be fine with the
function-based approch. Even the obvious inefficiencies (rate'ing a
whole histogram twice when we only want to rate each the count and the
sum once) could be optimized behind the scenes.
Note that the function-based approach elegantly solves the problem of
detecting counter resets in the sum of observations in the case of
negative observations. (Since the whole native Histogram is rate'd,
the counter reset is detected for the Histogram as a whole.)
We will decide later if an “accessor” approach is really needed. It
would change the example expression for average duration in
functions.md from
histogram_sum(rate(http_request_duration_seconds[10m]))
/
histogram_count(rate(http_request_duration_seconds[10m]))
to
rate(http_request_duration_seconds.sum[10m])
/
rate(http_request_duration_seconds.count[10m])
[1]: https://docs.google.com/document/d/1ch6ru8GKg03N02jRjYriurt-CZqUVY09evPg6yKTA1s/edit
Signed-off-by: beorn7 <beorn@grafana.com>
Essentially, this mirrors the existing behavior for negative buckets:
If a histogram has only negative buckets, the upper bound of the zero
bucket is assumed to be zero.
Furthermore, it makes sure that the zero bucket boundaries are not
modified if a histogram that has no buckets at all but samples in the
zero bucket.
Also, add an TODO to vet if we really want this behavior.
Signed-off-by: beorn7 <beorn@grafana.com>
* Labels: create signature with/without labels
Instead of creating a new Labels slice then converting to signature,
go directly to the signature and save time.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Labels: refactor Builder tests
Have one test with a range of cases, and have them check the final
output rather than checking the internal structure of the Builder.
Also add a couple of cases where the value is "", which should be
interpreted as 'delete'.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Labels: add 'Keep' function to Builder
This lets us replace `Labels.WithLabels` with the more general `Builder`.
In `engine.resultMetric()` we can call `Keep()` instead of checking
and calling `Del()`.
Avoid calling `Sort()` in `Builder.Labels()` if we didn't add anything,
so that `Keep()` has the same performance as `WithLabels()`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
While empty buckets can make sense in the internal representation (by
joining spans that would otherwise need more overhead for separate
representation), there are no spans in the JSON rendering. Therefore,
the JSON should not contain any empty buckets, since any buckets not
included in the output counts as empty anyway.
This changes both the inefficient MarshalJSON implementation as well
as the jsoniter implementation.
Signed-off-by: beorn7 <beorn@grafana.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
This now even enables jsoniter marshaling of Points in an instant
query (which previously used the traditional JSON marshaling).
Signed-off-by: beorn7 <beorn@grafana.com>
For conventional histograms, we need to gather all the individual
bucket timeseries at a data point to do the quantile calculation. The
code so far mirrored this behavior for the new native
histograms. However, since a single data point contains all the
buckets alreade, that's actually not needed. This PR simplifies the
code while still detecting a mix of conventional and native
histograms.
The weird signature calculation for the conventional histograms is
getting even weirder because of that. If this PR turns out to do the
right thing, I will implement a proper fix for the signature
calculation upstream.
Signed-off-by: beorn7 <beorn@grafana.com>