prometheus

Author	SHA1	Message	Date
Łukasz Mierzwa	277f04f0c4	Stop compactions if there's a block to write (#13754 ) * Stop compactions if there's a block to write db.Compact() checks if there's a block to write with HEAD chunks before calling db.compactBlocks(). This is to ensure that if we need to write a block then it happens ASAP, otherwise memory usage might keep growing. But what can also happen is that we don't need to write any block, we start db.compactBlocks(), compaction takes hours, and in the meantime HEAD needs to write out chunks to a block. This can be especially problematic if, for example, you run a Thanos sidecar that's uploading blocks, which requires that compactions are disabled. Then you disable the Thanos sidecar and re-enable compactions. When db.compactBlocks() is finally called it might have a huge number of blocks to compact, which might take a very long time, during which HEAD cannot write out chunks to a new block. In such a case memory usage will keep growing until either: - compactions are finally finished and HEAD can write a block - we run out of memory and Prometheus gets OOM-killed This change adds a check for pending HEAD block writes inside db.compactBlocks(), so that we bail out early if there are still compactions to run, but we also need to write a new block. Also add a test for compactBlocks. --------- Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com> Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>	8 months ago
Jonathan Halterman	113938aeb8	Log out of order when writing a block (#13888 ) Signed-off-by: Jonathan Halterman <jonathan@grafana.com>	8 months ago
komisan19	0249e080b4	refactor: utilize standard functions max/min Signed-off-by: komisan19 <18901496+komisan19@users.noreply.github.com>	8 months ago
Nicolas Takashi	8125634086	[refactor] moving mergedOOOChunks Iterator (#13881 ) Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>	8 months ago
carehabit	a672662073	all: fix some typos (#13863 ) Signed-off-by: carehabit <shenyuting@outlook.com>	8 months ago
Ben Ye	ded35ef20d	expose compactor metrics Signed-off-by: Ben Ye <benye@amazon.com>	8 months ago
Nicolas Takashi	0b762db154	[refactor] moving mergedOOOChunks to ooo_head_read Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>	8 months ago
George Krajcsovits	4eab18abd6	[nhcb branch] Use single bit to differentiate between optimized bounds and floats (#13828 ) * Use single bit to differentiate between optimized bounds and floats Use one bit to decide what kind of data to read/write. This reduces storage need of floats from 72 bits to 65 bits and makes the integers store in 5 to 32 bits instead of 16. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com> Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com> Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com> Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>	8 months ago
Arve Knudsen	35aab01de0	tsdb/wlog.Checkpoint: Handle also float histograms Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	8 months ago
Nick Pillitteri	481f14e1c0	TSDB: Don't rely on integer overflow in head compaction check (#13755 ) * TSDB: Don't compact the head block when empty Don't compact the Head block if there have not yet been any samples appended. Previously, the logic for determining if the head should be compacted relied on the default values for min and max time and integer overflow when they were checked in `Head.compactable()`. The check in `Head.compactable()` effectively did `math.MinInt64 - math.MaxInt64` which overflowed and wrapped to `1`. Since `1` is less than `1.5` times the chunk range, compaction did not happen. This was the correct behavior but relying on overflow wrapping is surprising. This change adds a method for checking if the min and max time for the head are unset and uses it to short-circuit compaction in that case. It also replaces several explicit checks for the default value to determine if the head has not yet had any samples added. Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>	8 months ago
Ben Ye	ceca6c4716	[ENHANCEMENT] TSDB: Log more statistics during startup (#13838 ) * log chunk snapshot and mmap chunks replay duration together with total replay duration Signed-off-by: Ben Ye <benye@amazon.com>	8 months ago
zenador	4acbb7dea6	Add custom buckets to native histogram chunks encoding (#13706 ) * add custom bounds to chunks encoding * change custom buckets schema number * rename custom bounds to custom values Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	8 months ago
machine424	2a2e2ed28b	chore(tsdb): set the wbl to nil as well in DBReadOnly.loadDataAsQueryable Signed-off-by: machine424 <ayoubmrini424@gmail.com>	8 months ago
Arve Knudsen	07332f7427	TestTimeRetention: Split into two sub-tests Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	8 months ago
Arve Knudsen	af694dc295	Merge TestDB_BeyondTimeRetention into TestTimeRetention Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	8 months ago
Arve Knudsen	9c7a734063	tsdb.BeyondTimeRetention: Fix comment and test at retention duration Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	8 months ago
Darshan Chaudhary	b7047f7fcb	Fix retention boundary so 2h retention deletes blocks right at the 2h boundary (#9633 ) Signed-off-by: darshanime <deathbullet@gmail.com>	8 months ago
Arve Knudsen	cef1025ea8	tsdb/wlog.Checkpoint: Fix counting of histogram samples Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	8 months ago
Bryan Boreham	d45b5deb75	TSDB: move function only used in tests Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	8 months ago
Bryan Boreham	3274cac0d3	TSDB: remove unused function Was only used in old WAL implementation. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	8 months ago
Arve Knudsen	1de49d5b69	Remove unused function tsdb/chunks.PopulatedChunk (#13763 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	8 months ago
Bryan Boreham	87edf1f960	[Cleanup] TSDB: Remove old deprecated WAL implementation Deprecated since 2018. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	8 months ago
Bryan Boreham	d08f054950	[ENHANCEMENT] TSDB: Check CRC without allocating (#13742 ) Use the existing utility function which does this. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	9 months ago
carrychair	856f6e49c8	fix function and struct name Signed-off-by: carrychair <linghuchong404@gmail.com>	9 months ago
Bryan Boreham	bbe39af99f	tsdb: zero out Labels and memSeries pointers in pool (#13712 ) * tsdb: zero out Labels and memSeries pointers in pool So that the garbage-collector doesn't see this memory as still in use. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> --------- Signed-off-by: Bryan Boreham <bjboreham@gmail.com> Signed-off-by: Björn Rabenstein <github@rabenste.in> Co-authored-by: Björn Rabenstein <github@rabenste.in>	9 months ago
György Krajcsovits	4d4d822c36	Add native histograms to latency/duration metrics Dogfood native histograms. Allow dependent projects to migrate to native histograms. I took the defaults from client_golang. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	9 months ago
machine424	f477e0539a	Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 Prevent adding back golang.org/x/exp/slices. Signed-off-by: machine424 <ayoubmrini424@gmail.com>	9 months ago
Bryan Boreham	925134e6de	tsdb tests: make work with labels SymbolTable Need to initialize decoders with SymbolTable. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	9 months ago
Bryan Boreham	93b72ec5dd	tsdb: create SymbolTables for labels as required Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	9 months ago
Bryan Boreham	6ed56c9f04	WAL watcher: improve comments Clarify in the first comment that it is `watch()` that waits, and reduce verbiage. The second comment was slightly contradictory to the first and otherwise didn't seem to add much, since `currentSegment` was incremented just a few lines later. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	9 months ago
Bryan Boreham	a975a83079	tsdb: clean up Watcher debug messages Print lastSegment after it gets initialized. Move variable declaration to first use. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	9 months ago
Bryan Boreham	78f46bccca	tsdb/wlog tests: remove unnecessary sleep check Sleep() is documented to return immediately on negative or zero input. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	9 months ago
Callum Styan	0c71230784	fix bug that would cause us to endlessly fall behind (#13583 ) * fix bug that would cause us to only read from the WAL on the 15s fallback timer if remote write had fallen behind and is no longer reading from the WAL segment that is currently being written to Signed-off-by: Callum Styan <callumstyan@gmail.com> * remove unintended logging, fix lint, plus allow test to take slightly longer because cloud CI Signed-off-by: Callum Styan <callumstyan@gmail.com> * address review feedback Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix watcher sleeps in test, flu brain is smooth Signed-off-by: Callum Styan <callumstyan@gmail.com> * increase timeout, unfortunately cloud CI can require a longer timeout Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	9 months ago
Fiona Liao	841a133514	Move histogramsAppended to be more consistent Signed-off-by: Fiona Liao <fiona.liao@grafana.com>	9 months ago
Fiona Liao	52389647b2	Add type label to outOfOrderSamplesAppended metric Signed-off-by: Fiona Liao <fiona.liao@grafana.com>	9 months ago
Bryan Boreham	c0e36e6bb3	Standardise exemplar label as "trace_id" This is consistent with the OpenTelemetry standard, and an example in OpenMetrics. https://github.com/open-telemetry/opentelemetry-specification/blob/89aa01348139/specification/metrics/data-model.md#exemplars https://github.com/OpenObservability/OpenMetrics/blob/138654493130/specification/OpenMetrics.md#exemplars-1 Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	9 months ago
Bryan Boreham	12cac5bd5c	tsdb tests: use go-cmp instead of DeepEquals Also one simpler call checking nil. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	10 months ago
Bryan Boreham	17f48f2b3b	Tests: use replacement DeepEquals in more places Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	10 months ago
Bryan Boreham	39af788dbd	Tests: use replacement DeepEquals using go-cmp Use DeepEqual replacement using go-cmp, which is more flexible. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	10 months ago
Peter Štibraný	e2b9cfeeeb	Enforce chunks ordering when writing index. (#8085 ) Document conditions on chunks. Add check on chunk time ordering. Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>	10 months ago
Mikhail Fesenko	419dd265cc	Fix strange code, add messages to code brought in #8106 (#13509 ) Signed-off-by: Mikhail Fesenko <proggga@gmail.com>	10 months ago
Bryan Boreham	16e68c01e4	tests: remove err from message when testify prints it already For instance `require.NoError` will print the unexpected error; we don't need to include it in the message. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	10 months ago
Mikhail Fesenko	5f2c3a5d3e	Small improvements, add const, remove copypasta (#8106 ) Signed-off-by: Mikhail Fesenko <proggga@gmail.com> Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>	10 months ago
Paweł Szulik	5961f78186	Refactor tsdb tests to use testify. Signed-off-by: Paweł Szulik <paul.szulik@gmail.com>	10 months ago
Bryan Boreham	34230bb172	tsdb/wlog: close segment files sooner 'defer' runs at the end of the whole function; we should close each segment file as soon as we finished reading it. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	10 months ago
Marco Pracucci	501bc6419e	Add ShardedPostings() support to TSDB (#10421 ) This PR is a reference implementation of the proposal described in #10420. In addition to what is described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer a hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing). Follow-up work: as mentioned in #10420, if this PR is accepted I'm also open to upload another fundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes. Signed-off-by: Marco Pracucci <marco@pracucci.com>	10 months ago
Bryan Boreham	66237c1996	tsdb: use cheaper Mutex on series Mutex is 8 bytes; RWMutex is 24 bytes and much more complicated. Since `RLock` is only used in two places, `UpdateMetadata` and `Delete`, neither of which are hotspots, we should use the cheaper one. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	10 months ago
Marco Pracucci	ec9cada56e	Remove unused isRegexMetaCharacter() Signed-off-by: Marco Pracucci <marco@pracucci.com>	10 months ago
Marco Pracucci	515890ec53	Use Matcher.SetMatches() Signed-off-by: Marco Pracucci <marco@pracucci.com>	10 months ago
Marco Pracucci	a1a45990a2	Fix TestPostingsForMatcher Signed-off-by: Marco Pracucci <marco@pracucci.com>	10 months ago

1195 Commits (ab64966e9d21ce3a3e42415da3a4227f8220b15c)