The ChunkReader interface's Chunk() has been changed to ChunkOrIterable().
This is a precursor to OOO native histogram support - with OOO native histograms, the chunks.Meta passed to Chunk() can result in multiple chunks being returned rather than just a single chunk (e.g. if oooMergedChunk has a counter reset in the middle).
To support this, ChunkOrIterable() requires either a single chunk or an iterable to be returned. If an iterable is returned, the caller has the responsibility of converting the samples from the iterable into possibly multiple chunks. The OOOHeadChunkReader now returns an iterable rather than a chunk to prepare for the native histograms case. Also as a beneficial side effect, oooMergedChunk and boundedChunk has been simplified as they only need to implement the Iterable interface now, not the full Chunk interface.
---------
Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
I initially didn't copy the otlptranslator/prometheus folder because I
assumed it wouldn't get changes. But it did. So this PR fixes that and
updates the Collector version.
Supersedes: https://github.com/prometheus/prometheus/pull/12809
Signed-off-by: Goutham <gouthamve@gmail.com>
* Fix issue where `chainSampleIterator` can obscure errors
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Address PR feedback.
Signed-off-by: Charles Korn <charles.korn@grafana.com>
---------
Signed-off-by: Charles Korn <charles.korn@grafana.com>
Two issues are fixed here, that lead to the same problem:
1. If `newSampleRing` is called with an unknown ValueType including
ValueNone, we have initialized the interface buffer (`iBuf`).
However, we would still use a specialized buffer for the first
sample, opportunistically assuming that we might still not
encounter mixed samples and we should go down the more efficient
road.
2. If the `sampleRing` is `reset`, we leave all buffers alone,
including `iBuf`, which is generally fine, but not for `iBuf`, see
below.
In both cases, `iBuf` already contains values, but we will fill one of
the specialized buffers first. Once we then actually encounter mixed
samples, the content of the specialized buffer is copied into `iBuf`
using `append`. That's by itself the right idea because `iBuf` might
be `nil`, and even if not, it might or might not have the right
capacity. However, this approach assumes that `iBuf` is empty, or more
precisely has a length of zero.
This commit makes sure that `iBuf` does not get needlessly initialized
in `newSampleRing` and that it is emptied upon `reset`.
A test case is added to demonstrate both issues above.
Signed-off-by: beorn7 <beorn@grafana.com>
Instead of setting to nil and allocating a new slice every time the
merge is advanced, re-use the previous slice.
This is safe because the `currentSets` member is only used inside member
functions, and explicitly copied in `At()`, the only place it leaves the
struct.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
The new version features a set of test cases that simplify the addition
of new HTTP status codes.
Signed-off-by: William Dumont <william.dumont@grafana.com>
Header name is `Retry-Attempt`, only set when >0.
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
Return annotations (warnings and infos) from PromQL queries
This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations".
Annotations can be warnings or infos (the latter could be false positives). We do not treat them different in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future.
The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms).
The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warnings is more tricky to create with good fidelity and we will tackle it later.
Another TODO is to take annotations into account when evaluating recording rules.
---------
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
promql: Extend testing framework to support native histograms
This includes both the internal testing framework as well as the rules unit test feature of promtool.
This also adds a bunch of basic tests. Many of the code level tests can now be converted to tests within the framework, and more tests can be added easily.
---------
Signed-off-by: Harold Dost <h.dost@criteo.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Harold Dost <h.dost@criteo.com>
Co-authored-by: Stephen Lang <stephen.lang@grafana.com>
Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Add a chunk size limit in bytes
This creates a hard cap for XOR chunks of 1024 bytes.
The limit for histogram chunk is also 1024 bytes, but it is a soft limit as a histogram has a dynamic size, and even a single one could be larger than 1024 bytes.
This also avoids cutting new histogram chunks if the existing chunk has fewer than 10 histograms yet. In that way, we are accepting "jumbo chunks" in order to have at least 10 histograms in a chunk, allowing compression to kick in.
Signed-off-by: Justin Lei <justin.lei@grafana.com>
So far, `ValidateHistogram` would not detect if the count did not
include the count in the zero bucket. This commit fixes the problem
and updates all the tests that have been undetected offenders so far.
Note that this problem would only ever create false negatives, so we
never falsely rejected to store a histogram because of it.
On the other hand, `ValidateFloatHistogram` has been to strict with
the count being at least as large as the sum of the counts in all the
buckets. Float precision issues could create false positives here, see
products of PromQL evaluations, it's actually quite hard to put an
upper limit no the floating point imprecision. Users could produce the
weirdest expressions, maxing out float precision problems. Therefore,
this commit simply removes that particular check from
`ValidateFloatHistogram`.
Signed-off-by: beorn7 <beorn@grafana.com>
* Add OTLP Ingestion endpoint
We copy files from the otel-collector-contrib. See the README in
`storage/remote/otlptranslator/README.md`.
This supersedes: https://github.com/prometheus/prometheus/pull/11965
Signed-off-by: gouthamve <gouthamve@gmail.com>
* Return a 200 OK
It is what the OTEL Golang SDK expect :(
https://github.com/open-telemetry/opentelemetry-go/issues/4363
Signed-off-by: Goutham <gouthamve@gmail.com>
---------
Signed-off-by: gouthamve <gouthamve@gmail.com>
Signed-off-by: Goutham <gouthamve@gmail.com>
* WIP implement WAL watcher reading via notifications over a channel from
the TSDB code
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Notify via head appenders Commit (finished all WAL logging) rather than
on each WAL Log call
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix misspelled Notify plus add a metric for dropped Write notifications
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Update tests to handle new notification pattern
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* this test maybe needs more time on windows?
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* does this test need more time on windows as well?
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* read timeout is already a time.Duration
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* remove mistakenly commited benchmark data files
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* address some review feedback
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* fix missed changes from previous commit
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix issues from wrapper function
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* try fixing race condition in test by allowing tests to overwrite the
read ticker timeout instead of calling the Notify function
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* fix linting
Signed-off-by: Callum Styan <callumstyan@gmail.com>
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
The storage.ChunkSeries iterator assumes that a histogram sample can always be
appended to the currently open chunk. This is not the case when there is a counter reset,
or when appending a stale sample to a chunk with non-stale samples. In addition, the open chunk sometimes
needs to be recoded before a sample can be appended.
This commit addresses the issue by implementing a RecodingAppender which can recode incoming
samples in a transparent way. It also detects cases when a sample cannot be appended at all and
returns `false` so that the caller can open a new chunk.
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>