Łukasz Mierzwa
277f04f0c4
Stop compactions if there's a block to write (#13754)
...
* Stop compactions if there's a block to write
db.Compact() checks if there's a block to write with HEAD chunks before calling db.compactBlocks().
This is to ensure that if we need to write a block then it happens ASAP, otherwise memory usage might keep growing.
But what can also happen is that we don't need to write any block, we start db.compactBlocks(),
compaction takes hours, and in the meantime HEAD needs to write out chunks to a block.
This can be especially problematic if, for example, you run a Thanos sidecar that uploads blocks,
which requires compactions to be disabled, and you later disable the sidecar and re-enable compactions.
When db.compactBlocks() is finally called it might have a huge number of blocks to compact, which might
take a very long time, during which HEAD cannot write out chunks to a new block.
In such a case memory usage will keep growing until either:
- compactions are finally finished and HEAD can write a block
- we run out of memory and Prometheus gets OOM-killed
This change adds a check for pending HEAD block writes inside db.compactBlocks(), so that
we bail out early if there are still compactions to run, but we also need to write a new
block.
Also add a test for compactBlocks.
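A minimal sketch of the early bail-out described above, not the actual Prometheus code. The `head` type, its `compactable` field, and the `compact` callback are hypothetical stand-ins; the real check lives inside db.compactBlocks() and inspects the TSDB HEAD.

```go
package main

import "fmt"

// head is a hypothetical stand-in for the TSDB HEAD. It only records
// whether enough data has accumulated that a block should be written.
type head struct{ compactable bool }

// compactBlocks runs the planned compactions one at a time and stops
// as soon as the head reports a pending block write, so the caller can
// write that block first. It returns the number of compactions completed.
func compactBlocks(h *head, plans []string, compact func(plan string)) int {
	done := 0
	for _, p := range plans {
		if h.compactable {
			// Bail out early: writing the HEAD block takes priority.
			return done
		}
		compact(p)
		done++
	}
	return done
}

func main() {
	h := &head{}
	plans := []string{"a", "b", "c", "d"}
	n := compactBlocks(h, plans, func(p string) {
		fmt.Println("compacting", p)
		if p == "b" {
			// Simulate the HEAD filling up while a compaction runs.
			h.compactable = true
		}
	})
	fmt.Println("completed", n, "of", len(plans))
}
```

In this sketch the remaining plans are simply left for the next compaction cycle, after the pending block has been written.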
---------
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>
2024-04-07 18:28:28 +01:00
Bryan Boreham
fc567a1df8
Merge pull request #13889 from komisan19/refactor/utilize_standard_functions_max/min
...
refactor: utilize standard functions max/min in promtool and tsdb
2024-04-06 10:23:18 +01:00
Arthur Silva Sens
b4a973753c
Merge pull request #13897 from dashpole/unregister_scrape_metrics
2024-04-05 14:44:32 -03:00
David Ashpole
c755fa9935
support unregistering scrape manager metrics
...
Signed-off-by: David Ashpole <dashpole@google.com>
2024-04-05 16:00:52 +00:00
Bryan Boreham
2278d2377c
Merge pull request #13744 from bboreham/wip-aggr-index
...
[ENHANCEMENT] PromQL: Re-structure aggregations for clarity and performance
2024-04-05 16:34:57 +01:00
Bryan Boreham
12961c6a37
promql: refactor: eliminate one 'else'
...
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
0ac927515b
promql: move group-seen into group struct
...
Save allocating an auxiliary array.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
7499d90913
promql: remove pointer to aggregation groups
...
Just allocate in one slice.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
cfbeb6681b
promql: re-use one heap for topk and bottomk
...
Slightly ugly casting saves memory.
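One way to picture the heap re-use, as a sketch only and not the code from this commit: two named heap types share the same underlying slice type, so a single backing slice can be converted to either ordering. The names `minHeap`, `maxHeap`, and `keepK` are hypothetical.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// minHeap and maxHeap have the same underlying type, so a pointer to one
// []float64 can be converted to either of them without copying.
type minHeap []float64

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i] < h[j] }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(float64)) }
func (h *minHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

type maxHeap []float64

func (h maxHeap) Len() int           { return len(h) }
func (h maxHeap) Less(i, j int) bool { return h[i] > h[j] }
func (h maxHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *maxHeap) Push(x any)        { *h = append(*h, x.(float64)) }
func (h *maxHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// keepK returns the k largest values (top == true) or the k smallest
// values (top == false), sorted ascending, using one backing slice.
func keepK(values []float64, k int, top bool) []float64 {
	if k <= 0 {
		return nil
	}
	store := make([]float64, 0, k)
	var h heap.Interface
	if top {
		h = (*minHeap)(&store) // root is the smallest value kept
	} else {
		h = (*maxHeap)(&store) // root is the largest value kept
	}
	for _, v := range values {
		if len(store) < k {
			heap.Push(h, v)
			continue
		}
		// Replace the root when the new value belongs in the kept set.
		if (top && v > store[0]) || (!top && v < store[0]) {
			store[0] = v
			heap.Fix(h, 0)
		}
	}
	sort.Float64s(store)
	return store
}

func main() {
	values := []float64{5, 1, 9, 3, 7}
	fmt.Println(keepK(values, 2, true))  // the two largest values
	fmt.Println(keepK(values, 2, false)) // the two smallest values
}
```

The pointer conversions are the "casting": they pick the ordering while both aggregations write into the same slice.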
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
5e3914a27c
promql: remove histogramMean from groupedAggregation
...
Re-use histogramValue since we don't need them separately.
Tidy up initialization.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
2cf3c9de8f
promql: store labels per-group only for count_values
...
This saves memory in other kinds of aggregation.
We don't need `orderedResult` in `aggregationCountValues`; the ordering
is not guaranteed.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
185290a0d2
promql: pull checking of q and k out of loop
...
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
4584f67e17
promql: inline nextSample function
...
Move Sample out of loop to reduce allocations, otherwise it escapes to
the heap.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
526ce4ee7a
promql: simplify data collection in aggregations
...
We don't need a Sample, just the float and histogram values.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
2f03acbafc
promql: refactor: split topk/bottomk from sum/avg/etc
...
They aggregate results in different ways.
topk/bottomk don't consider histograms so can simplify data collection.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
74eed67ef6
promql: refactor: pull fetching input data out of rangeEvalAgg
...
This is a cleaner split of responsibilities.
We now check the sample count after calling rangeEvalAgg.
Changed re-use of samples to use `Clone` and `defer`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
602eb69edf
promql: refactor: extract function nextSample
...
With sub-function nextValues which we shall use shortly.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
eb41e770b7
promql: refactor: extract function addToSeries
...
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
53a3138eeb
promql aggregations: pre-generate mapping from inputs to outputs
...
So we don't have to re-create it on every time step.
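The idea can be sketched as follows, assuming a simplified model in which each input series maps to exactly one output group. The `series` type and the `groupIndexes` and `sumStep` helpers are hypothetical and do not mirror the PromQL engine's real types.

```go
package main

import "fmt"

// series is a hypothetical, simplified input series identified by two labels.
type series struct{ job, instance string }

// groupIndexes computes once, before any time step is evaluated, which
// output group each input series belongs to. It returns the mapping and
// the number of distinct groups.
func groupIndexes(input []series, key func(series) string) ([]int, int) {
	seen := map[string]int{}
	idx := make([]int, len(input))
	for i, s := range input {
		k := key(s)
		g, ok := seen[k]
		if !ok {
			g = len(seen)
			seen[k] = g
		}
		idx[i] = g
	}
	return idx, len(seen)
}

// sumStep aggregates the values of one time step using the precomputed
// mapping; no label hashing or map lookups happen per step.
func sumStep(values []float64, idx []int, groups int) []float64 {
	out := make([]float64, groups)
	for i, v := range values {
		out[idx[i]] += v
	}
	return out
}

func main() {
	input := []series{{"api", "1"}, {"api", "2"}, {"db", "1"}}
	// Group by the job label, as in `sum by (job) (...)`.
	idx, groups := groupIndexes(input, func(s series) string { return s.job })

	steps := [][]float64{{1, 2, 3}, {4, 5, 6}}
	for _, values := range steps {
		fmt.Println(sumStep(values, idx, groups))
	}
}
```

The mapping is built once per query in this sketch, and each time step then only indexes into it.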
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
cb6c4b3092
promql: simplify k/q parameter to topk/bottomk/quantile
...
Pass it as a float64 not as interface{}.
Make k a simple int, since that is the parameter to make().
Pull invalid quantile warning out of the loop.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
b3bda7df4b
promql: aggregations: skip copying input to a Vector
...
We can work directly from the inputMatrix on each timestep.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
c9b6c4c55a
promql: aggregations: output directly to matrix for instant queries
...
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
3851b74db1
promql: aggregations: skip result vector in range queries
...
Adjust test to match the lower count, since samples in the vector
are no longer counted.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
59548b8a0b
promql: refactor: move collection of results into aggregation()
...
We don't need to check for duplicates as aggregation cannot generate them.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
bd9bdccb22
promql: refactor: simplify internal data structures
...
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
5f10d17cef
promql: refactor: split out aggregations over range
...
The new function `rangeEvalAgg` is mostly a copy of `rangeEval`, but
without `initSeries` which we don't need and inlining the callback to
`aggregation()`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
e5f667537c
promql: refactor: initialize aggregation before storing in map
...
This seems more consistent to me.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
29244fb841
promql: refactor: extract count_values implementation
...
The existing aggregation function is very long and covers very different
cases.
`aggregationCountValues` is just for `count_values`, which differs from
other aggregations in that it outputs as many series per group as there
are values in the input.
Remove the top-level switch on string parameter type; use the same `Op`
check there as elsewhere.
Pull checking parameters out to caller, where it is only executed once.
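To illustrate why `count_values` differs from the other aggregations, here is a small sketch for a single group. The `countValues` helper is hypothetical and is not the `aggregationCountValues` implementation; it only shows that the number of outputs depends on the distinct input values.

```go
package main

import (
	"fmt"
	"strconv"
)

// countValues returns one entry per distinct value in the group. The key
// is the value formatted as a string, standing in for the label that
// would carry it, and the entry is how many inputs had that value.
func countValues(values []float64) map[string]int {
	out := map[string]int{}
	for _, v := range values {
		out[strconv.FormatFloat(v, 'f', -1, 64)]++
	}
	return out
}

func main() {
	// Five input samples in one group, with three distinct values,
	// give three output series for that group.
	versions := []float64{1, 1, 2, 1, 3}
	counts := countValues(versions)
	fmt.Println(len(counts), counts)
}
```

An aggregation such as `sum` would produce a single output for the same group.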
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
Bryan Boreham
8e04ab6dd4
promql: refactor: extract generateGroupingLabels function
...
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-05 15:47:54 +01:00
David Ashpole
bbfc72b4e2
support unregistering discovery manager metrics (#13896)
...
Signed-off-by: David Ashpole <dashpole@google.com>
2024-04-05 16:19:07 +02:00
Julien
8b72ed77f8
Merge pull request #13869 from prometheus/dependabot/go_modules/go-opentelemetry-io-1393210b43
...
build(deps): bump the go-opentelemetry-io group with 3 updates
2024-04-05 14:59:36 +02:00
Sven Dewit
dc7d3fbc3c
fix: scrape_config/interval relabelling is not experimental any more
...
Signed-off-by: Sven Dewit <sven.dewit@1und1.de>
2024-04-05 12:22:16 +02:00
Julien
8eb9228af8
Merge pull request #13864 from yeya24/expose-compactor-metrics
...
Expose compactor metrics
2024-04-05 11:24:41 +02:00
Julien
48c8ec19bc
Merge pull request #13882 from prometheus/update-featureflag-docs
...
Update documentation about existing feature-flags
2024-04-05 11:22:46 +02:00
dandrucz
38b75bc0d7
Linode IPv6 Range support, Optional region filtering, Missing fields in Documentation fixed (#13774)
...
* Add support for discovering Linode IPv6 ranges associated with linodes.
* Add optional but recommended region filtering (faster queries, more relevant information).
* Added missing fields in configuration.md, updated linode test cases.
* Convert to TableDrivenTests as per tjhop request.
Signed-off-by: David Andruczyk <dandrucz@akamai.com>
2024-04-05 09:31:59 +01:00
Jonathan Halterman
113938aeb8
Log out of order when writing a block (#13888)
...
Signed-off-by: Jonathan Halterman <jonathan@grafana.com>
2024-04-04 14:26:13 +02:00
Bryan Boreham
8799581b24
Merge pull request #13554 from machine424/k8s-failures
...
discovery(k8s): add metric prometheus_sd_kubernetes_failures_total
2024-04-04 10:43:44 +01:00
komisan19
0249e080b4
refactor: utilize standard functions max/min
...
Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>
2024-04-04 03:15:38 +09:00
Bryan Boreham
31491eb37c
Merge pull request #13885 from bboreham/readable-test
...
[TESTS] Truncate some long test names, for readability
2024-04-03 16:55:14 +01:00
Bryan Boreham
7c28521451
[TESTS] Truncate some long test names, for readability
...
The strings produced by these tests can run to thousands of characters,
which makes test logs difficult to read.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-03 10:10:39 +01:00
Charles Korn
cd72ebb05f
promql: include more details in error message when creating test query fails or an unexpected series is returned (#13847)
...
* promql: include more details in error message when creating test query fails
Signed-off-by: Charles Korn <charles.korn@grafana.com>
* Include more details when an unexpected metric is returned
Signed-off-by: Charles Korn <charles.korn@grafana.com>
---------
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-04-03 10:57:08 +02:00
Nicolas Takashi
8125634086
[refactor] moving mergedOOOChunks Iterator (#13881)
...
Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>
2024-04-03 10:14:34 +02:00
Arthur Silva Sens
db64d2dcdc
Update documentation about existing feature-flags
...
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
2024-04-02 19:18:57 -03:00
Julius Volz
9b7de47787
Remove unused Dmn field on EvalNodeHelper (#13877)
...
https://github.com/prometheus/prometheus/pull/13446 removed the last usage of
this field, but didn't remove the field.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2024-04-02 18:45:46 +02:00
dependabot[bot]
b9453ff51f
build(deps): bump bufbuild/buf-lint-action from 1.1.0 to 1.1.1
...
Bumps [bufbuild/buf-lint-action](https://github.com/bufbuild/buf-lint-action) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/bufbuild/buf-lint-action/releases)
- [Commits](044d13acb1...06f9dd823d)
---
updated-dependencies:
- dependency-name: bufbuild/buf-lint-action
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 23:44:18 +00:00
dependabot[bot]
785f761004
build(deps): bump bufbuild/buf-breaking-action from 1.1.2 to 1.1.4
...
Bumps [bufbuild/buf-breaking-action](https://github.com/bufbuild/buf-breaking-action) from 1.1.2 to 1.1.4.
- [Release notes](https://github.com/bufbuild/buf-breaking-action/releases)
- [Commits](f47418c81c...c57b3d842a)
---
updated-dependencies:
- dependency-name: bufbuild/buf-breaking-action
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 23:44:13 +00:00
dependabot[bot]
2e6c1c35a4
build(deps): bump the go-opentelemetry-io group with 3 updates
...
Bumps the go-opentelemetry-io group with 3 updates: [go.opentelemetry.io/collector/featuregate](https://github.com/open-telemetry/opentelemetry-collector), [go.opentelemetry.io/collector/pdata](https://github.com/open-telemetry/opentelemetry-collector) and [go.opentelemetry.io/collector/semconv](https://github.com/open-telemetry/opentelemetry-collector).
Updates `go.opentelemetry.io/collector/featuregate` from 1.3.0 to 1.4.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-collector/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-collector/blob/main/CHANGELOG-API.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-collector/compare/pdata/v1.3.0...pdata/v1.4.0)
Updates `go.opentelemetry.io/collector/pdata` from 1.3.0 to 1.4.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-collector/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-collector/blob/main/CHANGELOG-API.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-collector/compare/pdata/v1.3.0...pdata/v1.4.0)
Updates `go.opentelemetry.io/collector/semconv` from 0.96.0 to 0.97.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-collector/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-collector/blob/main/CHANGELOG-API.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-collector/compare/v0.96.0...v0.97.0)
---
updated-dependencies:
- dependency-name: go.opentelemetry.io/collector/featuregate
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: go-opentelemetry-io
- dependency-name: go.opentelemetry.io/collector/pdata
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: go-opentelemetry-io
- dependency-name: go.opentelemetry.io/collector/semconv
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: go-opentelemetry-io
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 23:37:04 +00:00
dependabot[bot]
1eb88a8723
build(deps): bump github.com/prometheus/prometheus
...
Bumps [github.com/prometheus/prometheus](https://github.com/prometheus/prometheus) from 0.50.1 to 0.51.1.
- [Release notes](https://github.com/prometheus/prometheus/releases)
- [Changelog](https://github.com/prometheus/prometheus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/prometheus/compare/v0.50.1...v0.51.1)
---
updated-dependencies:
- dependency-name: github.com/prometheus/prometheus
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 23:33:16 +00:00
Augustin Husson
9e2c335bab
Merge pull request #13855 from prometheus/merge-2.51-into-main
...
Merge 2.51.1 into main
2024-04-01 21:54:20 +02:00
carehabit
a672662073
all: fix some typos (#13863)
...
Signed-off-by: carehabit <shenyuting@outlook.com>
2024-04-01 18:06:05 +02:00