* fix: try to reproduce the bug from https://github.com/prometheus/prometheus/issues/13979 in a test case
Signed-off-by: David Vavra <sevenood@gmail.com>
* fix: data corruption in remote write if max_sample_age is applied
Signed-off-by: David Vavra <sevenood@gmail.com>
* add benchmark for buildTimeSeries which does the filtering
Signed-off-by: Callum Styan <callumstyan@gmail.com>
---------
Signed-off-by: David Vavra <sevenood@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: David Vavra <sevenood@gmail.com>
Co-authored-by: Callum Styan <callumstyan@gmail.com>
Now the error will include the timestamp and the existing and new values.
When you are trying to track down the source of this error, it can be
useful to see that the values are close, or alternating, or something
else.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
To facilitate generating OTel translation code for other Prometheus
compatible backends, modify the prometheusremotewrite sources slightly
so that the PrometheusConverter.TimeSeries method is in a file called
timeseries.go. The rationale is to allow other backends to define their
own implementation of this method.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
After a lot of productive discussion between the Prometheus and
OpenTelemetry community we decided that it made sense for Prometheus to
own its own copy of the code in charge for handling OTLP ingestion
traffic.
This commit is removing the README and update-copy.sh files that had the
previous steps to update the code.
Also it is updating the licensing of all the files to make sure the
OpenTelemetry provenance is explicit and to state the new ownership.
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
I have seen prometheis instances misebehaving because of broken chinked remote
read requests.
In order to avoid OOM's when this happens, I propose to close the
queries used by the streamed remote read requests earlier.
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
Dogfood native histograms.
Allow dependent projects to migrate to native histograms.
I took the defaults from client_golang.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* storage/remote: disable resharding during active retry backoffs
Today, remote_write reshards based on pure throughput. This is
problematic if throughput has been diminished because of HTTP 429s;
increasing the number of shards due to backpressure will only exacerbate
the problem.
This commit disables resharding for twice the retry backoff, ensuring
that resharding will never occur during an active backoff, and that
resharding does not become enabled again until enough time has elapsed
to allow any pending requests to be retried.
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* storage/remote: test that resharding is disabled on retry
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* storage/remote: address review feedback
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
* storage/remote: track time where resharding initially got disabled
This change introduces a second atomic int64 to roughly track when
resharding got disabled. This int64 is only updated after updating the
disabled timestamp if resharding was previously enabled.
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
---------
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
On the incoming path, `writeHandler.write()` creates a new table for
each request.
`labelProtosToLabels` takes a `ScratchBuilder` now.
Call `NewScratchBuilder` as required in tests.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This adds support for the new grammar of `{"metric_name", "l1"="val"}` to promql and some of the exposition formats.
This grammar will also be valid for non-UTF-8 names.
UTF-8 names will not be considered valid unless model.NameValidationScheme is changed.
This does not update the go expfmt parser in text_parse.go, which will be addressed by https://github.com/prometheus/common/issues/554/.
Part of https://github.com/prometheus/prometheus/issues/13095
Signed-off-by: Owen Williams <owen.williams@grafana.com>
If given a single querier, just return it instead of constructing a
complicated wrapper. The code in `mergeGenericQuerier` which skipped
merging when there was only one is not needed any more.
This change required a few tests to be tweaked, because they relied on
the specific behaviour of `mergeGenericQuerier.Select()`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
One was silently doing nothing; one was doing something but the work
didn't go up linearly with iteration count.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Relabeling can take a pre-populated `Builder` instead of making a new
one every time. This is much more efficient.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This PR is a reference implementation of the proposal described in #10420.
In addition to what described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer an hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing).
Follow up work
As mentioned in #10420, if this PR is accepted I'm also open to upload another foundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes.
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Optimize histogram iterators
Histogram iterators allocate new objects in the AtHistogram and
AtFloatHistogram methods, which makes calculating rates over long
ranges expensive.
In #13215 we allowed an existing object to be reused
when converting an integer histogram to a float histogram. This commit follows
the same idea and allows injecting an existing object in the AtHistogram and
AtFloatHistogram methods. When the injected value is nil, iterators allocate
new histograms, otherwise they populate and return the injected object.
The commit also adds a CopyTo method to Histogram and FloatHistogram which
is used in the BufferedIterator to overwrite items in the ring instead of making
new copies.
Note that a specialized HPoint pool is needed for all of this to work
(`matrixSelectorHPool`).
---------
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Prometheus is hard-coded to use a fanout storage between TSDB and
a remote storage which by default is empty.
This change detects the empty storage and skips merging between
result sets, which would make `Select()` sort results.
Bottom line: we skip a sort unless there really is some remote storage
configured.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>