prometheus

Commit Graph

Author	SHA1	Message	Date
Alban Hurtaud	4b56af7eb8	Add hidden flag for the delayed compaction random time window (#14919 ) * Add hidden flag for the delayed compaction random time window Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Update cmd/prometheus/main.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> * Update cmd/prometheus/main.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> * Update tsdb/db.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> * Fix flag name according to review - add test for delay Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Fix afer main rebase Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Implement review comments Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> * Update generatedelaytest to try with limit values Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> --------- Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com> Signed-off-by: Alban Hurtaud <alban.hurtaud@amadeus.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>	3 weeks ago
Bryan Boreham	105c692f77	[BUGFIX] TSDB: Don't read in-order chunks from before head MinTime Because we are reimplementing the `IndexReader` to fetch in-order and out-of-order chunks together, we must reproduce the behaviour of `Head.indexRange()`, which floors the minimum time queried at `head.MinTime()`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 month ago
TJ Hoplock	6ebfbd2d54	chore!: adopt log/slog, remove go-kit/log For: #14355 This commit updates Prometheus to adopt stdlib's log/slog package in favor of go-kit/log. As part of converting to use slog, several other related changes are required to get prometheus working, including: - removed unused logging util func `RateLimit()` - forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger - move some of the json file logging functionality to use prom/common package functionality - refactored some of the new json file logging for scraping - changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers - updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition - added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>	2 months ago
Bryan Boreham	91de19fbef	[BUGFIX] TSDB: Don't read in-order chunks from before head MinTime Because we are reimplementing the `IndexReader` to fetch in-order and out-of-order chunks together, we must reproduce the behaviour of `Head.indexRange()`, which floors the minimum time queried at `head.MinTime()`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 months ago
Bryan Boreham	6f0d6038b7	[BUGFIX] TSDB: Only query chunks up to truncation time (#14948 ) If the query overlaps the range currently undergoing compaction, we should only fetch chunks up to that time. Need to store that min time in `HeadAndOOOIndexReader`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 months ago
Bryan Boreham	9215252221	[BUGFIX] TSDB: Only query chunks up to truncation time (#14948 ) If the query overlaps the range currently undergoing compaction, we should only fetch chunks up to that time. Need to store that min time in `HeadAndOOOIndexReader`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2 months ago
Carrie Edwards	14e3c05ce8	tsdb: Add support for ingestion of out-of-order native histogram samples (#14546 ) Add support for ingesting OOO native histograms * Add flag for enabling and disabling OOO native histogram ingestion * Update OOO querying tests to include native histogram samples * Add OOO head tests * Add test for OOO native histogram counter reset headers Signed-off-by: Carrie Edwards <edwrdscarrie@gmail.com> Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Co-authored by: Carrie Edwards <edwrdscarrie@gmail.com> Co-authored by: Jeanette Tan <jeanette.tan@grafana.com> Co-authored by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Co-authored by: Fiona Liao <fiona.liao@grafana.com>	2 months ago
beorn7	0f760f63dd	lint: Revamp our linting rules, mostly around doc comments Several things done here: - Set `max-issues-per-linter` to 0 so that we actually see all linter warnings and not just 50 per linter. (As we also set `max-same-issues` to 0, I assume this was the intention from the beginning.) - Stop using the golangci-lint default excludes (by setting `exclude-use-default: false`. Those are too generous and don't match our style conventions. (I have re-added some of the excludes explicitly in this commit. See below.) - Re-add the `errcheck` exclusion we have used so far via the defaults. - Exclude the signature requirement `govet` has for `Seek` methods because we use non-standard `Seek` methods a lot. (But we keep other requirements, while the default excludes completely disabled the check for common method segnatures.) - Exclude warnings about missing doc comments on exported symbols. (We used to be pretty adamant about doc comments, but stopped that at some point in the past. By now, we have about 500 missing doc comments. We may consider reintroducing this check, but that's outside of the scope of this commit. The default excludes of golangci-lint essentially ignore doc comments completely.) - By stop using the default excludes, we now get warnings back on malformed doc comments. That's the most impactful change in this commit. It does not enforce doc comments (again), but _if_ there is a doc comment, it has to have the recommended form. (Most of the changes in this commit are fixing this form.) - Improve wording/spelling of some comments in .golangci.yml, and remove an outdated comment. - Leave `package-comments` inactive, but add a TODO asking if we should change that. - Add a new sub-linter `comment-spacings` (and fix corresponding comments), which avoids missing spaces after the leading `//`. Signed-off-by: beorn7 <beorn@grafana.com>	3 months ago
Arve Knudsen	3a78e76282	Upgrade golangci-lint to v1.60.1 Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	3 months ago
Bryan Boreham	e7e50a3afd	TSDB: Remove code for querying OOO-head only Just query via `HeadAndOOOQuerier`, which will skip series where no in-order chunks are in range. Now we don't need `OOORangeHead`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	3 months ago
Bryan Boreham	e04d137649	[PERF] TSDB: Query head and ooo-head together Add `HeadAndOOOQuerier` which iterates just once over series, then where necessary merges chunks from in-order and out-of-order lists. Add a ChunkQuerier for in-order and ooo together Add copy-last-chunk behaviour to HeadAndOOOChunkReader Out-of-order chunk IDs are distinguished from in-order by setting bit 23. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	3 months ago
Ben Ye	b7a58dcf3d	Add hidden flag to disable overlapping compaction (#14581 ) TSDB: add hidden flag to disable overlapping compaction Signed-off-by: Ben Ye <benye@amazon.com> --------- Signed-off-by: Ben Ye <benye@amazon.com>	4 months ago
machine424	92873d3009	feat: allow to delay head compaction start time helping Prometheus instances to avoid simultaneous compactions and reduce stress on shared resources. This is enabled via `--enable-feature=delayed-compaction`. Signed-off-by: machine424 <ayoubmrini424@gmail.com>	4 months ago
Bryan Boreham	bded853035	[Test] TSDB: TestOOOCompaction with samples added after compaction starts Test fails due to bug. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	4 months ago
Bryan Boreham	5281a6bc1b	TSDB: rebuild labels symbol-table on each compaction Log begin/end for timing, plus some stats. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	5 months ago
Ben Ye	5585a3c7e5	tsdb: expose hook to customize block querier (#14114 ) * expose hook for block querier Signed-off-by: Ben Ye <benye@amazon.com> * update comment Signed-off-by: Ben Ye <benye@amazon.com> * use defined type Signed-off-by: Ben Ye <benye@amazon.com> --------- Signed-off-by: Ben Ye <benye@amazon.com>	5 months ago
Charles Korn	2c5e88748e	Fix issue where pending OOO read can be left dangling if creating querier fails Signed-off-by: Charles Korn <charles.korn@grafana.com>	5 months ago
Ben Ye	5a218708f1	tsdb: Extend compactor interface to allow compactions to create multiple output blocks (#14143 ) * add hook to allow head compaction to create multiple output blocks Signed-off-by: Ben Ye <benye@amazon.com> * change Compact interface; remove BlockPopulator changes Signed-off-by: Ben Ye <benye@amazon.com> * rebase main Signed-off-by: Ben Ye <benye@amazon.com> * fix lint Signed-off-by: Ben Ye <benye@amazon.com> * fix unit test Signed-off-by: Ben Ye <benye@amazon.com> * address feedbacks; add unit test Signed-off-by: Ben Ye <benye@amazon.com> * Apply suggestions from code review Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Update tsdb/compact_test.go Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> --------- Signed-off-by: Ben Ye <benye@amazon.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	5 months ago
Ben Ye	8a08f452b6	tsdb: Allow passing a custom compactor to override the default one (#14113 ) * expose hook in tsdb to allow customizing compactor Signed-off-by: Ben Ye <benye@amazon.com> * address comment Signed-off-by: Ben Ye <benye@amazon.com> --------- Signed-off-by: Ben Ye <benye@amazon.com>	6 months ago
Arve Knudsen	d699dc3c77	Fix language in docs and comments (#14041 ) Fix language in docs and comments --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Björn Rabenstein <github@rabenste.in>	7 months ago
machine424	c5a1cc9148	chore(tsdb): add a sandboxDir to DBReadOnly, the directory can be used for transient file writes. use it in loadDataAsQueryable to make sure the RO Head doesn't truncate or cut new chunks in data/chunks_head/. add a -sandbox-dir-root flag to "promtool tsdb dump/dump-openmetrics" to control the root of that sandbox directory. Signed-off-by: machine424 <ayoubmrini424@gmail.com>	7 months ago
Matthieu MOREL	6f595c6762	golangci-lint: enable whitespace linter (#13905 ) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	8 months ago
Jonathan Halterman	633224886a	Write out of order hint when initially creating meta file (#13894 ) Signed-off-by: Jonathan Halterman <jonathan@grafana.com> Signed-off-by: Jonathan Halterman <jhalterman@gmail.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>	8 months ago
Łukasz Mierzwa	277f04f0c4	Stop compactions if there's a block to write (#13754 ) * Stop compactions if there's a block to write db.Compact() checks if there's a block to write with HEAD chunks before calling db.compactBlocks(). This is to ensure that if we need to write a block then it happens ASAP, otherwise memory usage might keep growing. But what can also happen is that we don't need to write any block, we start db.compactBlocks(), compaction takes hours, and in the meantime HEAD needs to write out chunks to a block. This can be especially problematic if, for example, you run Thanos sidecar that's uploading block, which requires that compactions are disabled. Then you disable Thanos sidecar and re-enable compactions. When db.compactBlocks() is finally called it might have a huge number of blocks to compact, which might take a very long time, during which HEAD cannot write out chunks to a new block. In such case memory usage will keep growing until either: - compactions are finally finished and HEAD can write a block - we run out of memory and Prometheus gets OOM-killed This change adds a check for pending HEAD block writes inside db.compactBlocks(), so that we bail out early if there are still compactions to run, but we also need to write a new block. Also add a test for compactBlocks. --------- Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com> Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>	8 months ago
carehabit	a672662073	all: fix some typos (#13863 ) Signed-off-by: carehabit <shenyuting@outlook.com>	8 months ago
machine424	2a2e2ed28b	chore(tsdb): set the wbl to nil as well in DBReadOnly.loadDataAsQueryable Signed-off-by: machine424 <ayoubmrini424@gmail.com>	8 months ago
Arve Knudsen	9c7a734063	tsdb.BeyondTimeRetention: Fix comment and test at retention duration Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	8 months ago
Darshan Chaudhary	b7047f7fcb	Fix retention boundary so 2h retention deletes blocks right at the 2h boundary (#9633 ) Signed-off-by: darshanime <deathbullet@gmail.com>	8 months ago
Bryan Boreham	d45b5deb75	TSDB: move function only used in tests Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	8 months ago
Bryan Boreham	3274cac0d3	TSDB: remove unused function Was only used in old WAL implementation. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	8 months ago
Bryan Boreham	87edf1f960	[Cleanup] TSDB: Remove old deprecated WAL implementation Deprecated since 2018. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	8 months ago
György Krajcsovits	4d4d822c36	Add native histograms to latency/duration metrics Dogfood native histograms. Allow dependent projects to migrate to native histograms. I took the defaults from client_golang. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	9 months ago
machine424	f477e0539a	Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 Prevent adding back golang.org/x/exp/slices. Signed-off-by: machine424 <ayoubmrini424@gmail.com>	9 months ago
Marco Pracucci	501bc6419e	Add ShardedPostings() support to TSDB (#10421 ) This PR is a reference implementation of the proposal described in #10420. In addition to what described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer an hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing). Follow up work As mentioned in #10420, if this PR is accepted I'm also open to upload another foundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes. Signed-off-by: Marco Pracucci <marco@pracucci.com>	10 months ago
Giedrius Statkevičius	b695e069b8	tsdb/main: wire "EnableOverlappingCompaction" to tsdb.Options (#13398 ) This added the https://github.com/prometheus/prometheus/pull/13393 "EnableOverlappingCompaction" parameter to the compactor code but not to the tsdb.Options. I forgot about that. Add it to `tsdb.Options` too and set it to `true` in Prometheus. Copy/paste the description from https://github.com/prometheus/prometheus/pull/13393#issuecomment-1891787986 Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	10 months ago
Giedrius Statkevičius	61b4080a14	tsdb/{index,compact}: allow using custom postings encoding format (#13242 ) * tsdb/{index,compact}: allow using custom postings encoding format We would like to experiment with a different postings encoding format in Thanos so in this change I am proposing adding another argument to `NewWriter` which would allow users to change the format if needed. Also, wire the leveled compactor so that it would be possible to change the format there too. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com> * tsdb/compact: use a struct for leveled compactor options As discussed on Slack, let's use a struct for the options in leveled compactor. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com> * tsdb: make changes after Bryan's review - Make changes less intrusive - Turn the postings encoder type into a function - Add NewWriterWithEncoder() Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com> --------- Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	11 months ago
Giedrius Statkevičius	f36b56a62c	tsdb: remove unused option (#13282 ) Digging around the TSDB code and I've found that this flag is unused so let's remove it. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	12 months ago
Matthieu MOREL	8f6cf3aabb	tsdb: use Go standard errors Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	12 months ago
Charles Korn	59844498f7	Fix issue where queries can fail or omit OOO samples if OOO head compaction occurs between creating a querier and reading chunks (#13115 ) * Add failing test. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Don't run OOO head garbage collection while reads are running. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Add further test cases for different order of operations. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Ensure all queriers are closed if `DB.blockChunkQuerierForRange()` fails. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Ensure all queriers are closed if `DB.Querier()` fails. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Invert error handling in `DB.Querier()` and `DB.blockChunkQuerierForRange()` to make it clearer Signed-off-by: Charles Korn <charles.korn@grafana.com> * Ensure that queries that touch OOO data can't block OOO head garbage collection forever. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Address PR feedback: fix parameter name in comment Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com> * Address PR feedback: use `lastGarbageCollectedMmapRef` Signed-off-by: Charles Korn <charles.korn@grafana.com> * Address PR feedback: ensure pending reads are cleaned up if creating an OOO querier fails Signed-off-by: Charles Korn <charles.korn@grafana.com> --------- Signed-off-by: Charles Korn <charles.korn@grafana.com> Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>	1 year ago
Matthieu MOREL	dd8871379a	replace errors.Errorf by fmt.Errorf Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	1 year ago
Márcio Carôso	dff1c395f6	Expose --storage.tsdb.retention.time in metric prometheus_tsdb_retention_limit_seconds (#12986 ) * Expose --storage.tsdb.retention.time in a metric Signed-off-by: Marcio Caroso <msscaroso@gmail.com> --------- Signed-off-by: Marcio Caroso <msscaroso@gmail.com>	1 year ago
George Krajcsovits	7d7b9eacff	Fix int32 overflow issues (#12978 ) On a 32 bit architecture the size of int is 32 bits. Thus converting from int64, uint64 can overflow it and flip the sign. Try for yourself in playground: package main import "fmt" func main() { x := int64(0x1F0000001) y := int64(1) z := int32(x - y) // numerically this is 0x1F0000000 fmt.Printf("%v\n", z) } Prints -268435456 as if x was smaller. Followup to #12650 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	1 year ago
Ganesh Vernekar	4df2f2432b	Additionally wrap WBL replay error (#12406 ) * Additionally wrap WBL replay error Although WBL replay is already wrapped with errLoadWbl, there are other errors that can happen during a WBL replay. We should not try to repair WAL in those cases. This commit additionally wraps the final error in Head.Init again with errLoadWbl so that WBL replay errors can be identified properly. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com> Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com> Signed-off-by: Levi Harrison <git@leviharrison.dev>	1 year ago
Ganesh Vernekar	f5913266a1	Additionally wrap WBL replay error (#12406 ) * Additionally wrap WBL replay error Although WBL replay is already wrapped with errLoadWbl, there are other errors that can happen during a WBL replay. We should not try to repair WAL in those cases. This commit additionally wraps the final error in Head.Init again with errLoadWbl so that WBL replay errors can be identified properly. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com> Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>	1 year ago
Goutham Veeramachaneni	86729d4d7b	Update exp package (#12650 )	1 year ago
Arve Knudsen	4451ba10b4	Add context argument to IndexReader.Postings (#12667 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Arve Knudsen	6ef9ed0bc3	Add context argument to DB.Delete (#12834 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Arve Knudsen	6daee89e5f	Add context argument to Querier.Select (#12660 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Bryan Boreham	0d283effa8	promql: force mmap of head chunks in BenchmarkRangeQuery Otherwise we have a highly unusual situation of over 100 chunks in the headChunks list of each series, which heavily skews performance. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
SuperQ	8d38d59fc5	Cleanup temporary chunk snapshot dirs Similar to cleanup of WAL files on startup, cleanup temporary chunk_snapshot dirs. This prevents storage space leaks due to terminated snapshots on shutdown. Signed-off-by: SuperQ <superq@gmail.com>	1 year ago


165 Commits (release-3.0)