prometheus

Commit Graph

Author	SHA1	Message	Date
Ben Ye	ceca6c4716	[ENHANCEMENT] TSDB: Log more statistics during startup (#13838 ) * log chunk snapshot and mmap chunks replay duration together with total replay duration Signed-off-by: Ben Ye <benye@amazon.com>	2024-03-26 11:16:27 +00:00
György Krajcsovits	4d4d822c36	Add native histograms to latency/duration metrics Dogfood native histograms. Allow dependent projects to migrate to native histograms. I took the defaults from client_golang. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-03-01 14:44:38 +01:00
Bryan Boreham	93b72ec5dd	tsdb: create SymbolTables for labels as required Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-02-26 11:45:25 +00:00
Fiona Liao	52389647b2	Add type label to outOfOrderSamplesAppended metric Signed-off-by: Fiona Liao <fiona.liao@grafana.com>	2024-02-19 15:24:39 +00:00
Bryan Boreham	cd4562d3a6	Merge pull request #13473 from bboreham/pure-mutex tsdb: use cheaper Mutex on series	2024-01-30 09:57:08 +00:00
Marco Pracucci	501bc6419e	Add ShardedPostings() support to TSDB (#10421 ) This PR is a reference implementation of the proposal described in #10420. In addition to what described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer an hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because built before stringlabels was a thing). Follow up work As mentioned in #10420, if this PR is accepted I'm also open to upload another foundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes. Signed-off-by: Marco Pracucci <marco@pracucci.com>	2024-01-29 11:57:27 +00:00
Bryan Boreham	66237c1996	tsdb: use cheaper Mutex on series Mutex is 8 bytes; RWMutex is 24 bytes and much more complicated. Since `RLock` is only used in two places, `UpdateMetadata` and `Delete`, neither of which are hotspots, we should use the cheaper one. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-01-26 11:01:39 +00:00
Bryan Boreham	b9eab6e4b8	tsdb: simplify internal series delete function (#13261 ) Lifting an optimisation from Agent code, `seriesHashmap.del` can use the unique series reference, doesn't need to check Labels. Also streamline the logic for deleting from `unique` and `conflicts` maps, and add some comments to help the next person. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-01-25 11:57:54 +01:00
Bryan Boreham	3f30ad3cc2	Merge pull request #13015 from bboreham/smaller-txring tsdb: make transaction isolation data structures smaller	2024-01-25 10:48:15 +00:00
Matthieu MOREL	8f6cf3aabb	tsdb: use Go standard errors Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-12-11 12:18:54 +00:00
Arthur Silva Sens	5082655392	Append Created Timestamps (#12733 ) * Append created timestamps. Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Log when created timestamps are ignored Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Proposed changes to Append CT PR. Changes: * Changed textparse Parser interface for consistency and robustness. * Changed CT interface to be more explicit and handle validation. * Simplified test, change scrapeManager to allow testability. * Added TODOs. Signed-off-by: bwplotka <bwplotka@gmail.com> * Updates. Signed-off-by: bwplotka <bwplotka@gmail.com> * Addressed comments. Signed-off-by: bwplotka <bwplotka@gmail.com> * Refactor head_appender test Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Fix linter issues Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> * Use model.Sample in head appender test Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> --------- Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com> Signed-off-by: bwplotka <bwplotka@gmail.com> Co-authored-by: bwplotka <bwplotka@gmail.com>	2023-12-11 08:43:42 +00:00
Xiaochao Dong	28d8f1650c	tsdb: Make sure the cache for postings cardinality properly honors the label name (#12653 ) Add a string remembering which label and limit the cache corresponds to. Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2023-11-28 13:54:37 +00:00
Arve Knudsen	1200c89d0c	Fix tsdb.stripeSeries.gc so it handles conflicts properly (#13195 ) * Fix tsdb.stripeSeries.gc so it handles conflicts properly tsdb.stripeSeries.gc needs to prune seriesHashmap.conflicts first, otherwise seriesHashmap replaces the unique field with the first among the conflicts. Also add regression test. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> * TestStripeSeries_gc: Support stringlabels, don't use internals Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-11-28 14:43:35 +01:00
Arve Knudsen	ecc37588b0	tsdb: seriesHashmap.set by making receiver a pointer (#13193 ) * Fix tsdb.seriesHashmap.set by making receiver a pointer The method tsdb.seriesHashmap.set currently doesn't set the conflicts field properly, due to the receiver being a non-pointer. Fix by turning the receiver into a pointer, and add a corresponding regression test. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-11-27 15:40:30 +00:00
Charles Korn	59844498f7	Fix issue where queries can fail or omit OOO samples if OOO head compaction occurs between creating a querier and reading chunks (#13115 ) * Add failing test. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Don't run OOO head garbage collection while reads are running. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Add further test cases for different order of operations. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Ensure all queriers are closed if `DB.blockChunkQuerierForRange()` fails. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Ensure all queriers are closed if `DB.Querier()` fails. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Invert error handling in `DB.Querier()` and `DB.blockChunkQuerierForRange()` to make it clearer Signed-off-by: Charles Korn <charles.korn@grafana.com> * Ensure that queries that touch OOO data can't block OOO head garbage collection forever. Signed-off-by: Charles Korn <charles.korn@grafana.com> * Address PR feedback: fix parameter name in comment Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com> Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com> * Address PR feedback: use `lastGarbageCollectedMmapRef` Signed-off-by: Charles Korn <charles.korn@grafana.com> * Address PR feedback: ensure pending reads are cleaned up if creating an OOO querier fails Signed-off-by: Charles Korn <charles.korn@grafana.com> --------- Signed-off-by: Charles Korn <charles.korn@grafana.com> Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>	2023-11-24 12:38:38 +01:00
Bryan Boreham	f13bc1a5c9	Merge pull request #13040 from bboreham/smaller-stripeseries TSDB: make the global hash lookup table smaller	2023-11-20 12:12:09 +00:00
Oleg Zaytsev	f997c72f29	Make head block ULIDs descriptive (#13100 ) * Make head block ULIDs descriptive As far as I understand, these ULIDs aren't persisted anywhere, so it should be safe to change them. When debugging an issue, seeing an ULID like `2ZBXFNYVVFDXFPGSB1CHFNYQTZ` or `33DXR7JA39CHDKMQ9C40H6YVVF` isn't very helpful, so I propose to make them readable in their ULID string version. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Set a different ULID for RangeHead Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> --------- Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2023-11-17 12:29:36 +01:00
Matthieu MOREL	dd8871379a	remplace errors.Errorf by fmt.Errorf Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-11-14 13:04:31 +00:00
Bryan Boreham	65a443e6e3	TSDB: initialize conflicts map only when we need it. Suggested by @songjiayang. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-11-13 16:20:31 +00:00
Bryan Boreham	e6c0f69f98	TSDB: Only pay for hash collisions when they happen Instead of a map of slices of `memSeries`, ready for any of them to hold series where hash values collide, split into a map of `memSeries` and a map of slices which is usually empty, since hash collisions are a one-in-a-billion thing. The `del` method gets more complicated, to maintain the invariant that a series is only in one of the two maps. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-11-09 07:44:39 -06:00
Bryan Boreham	ce4e757704	TSDB: refine variable naming in chunk gc Slight further refactor. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-11-09 07:44:39 -06:00
Bryan Boreham	071d5732af	TSDB: refactor cleanup of chunks and series Extract the middle of the loop into a function, so it will be easier to modify the `seriesHashmap` data structure. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-11-09 07:44:39 -06:00
machine424	a32fbc3658	head.go: Remove an unneeded snapshot trigger that was moved in https://github.com/prometheus/prometheus/pull/9328 and brougt back by mistake in `095f572d4a` as part of https://github.com/prometheus/prometheus/pull/11447 Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2023-11-09 11:46:46 +01:00
Oleksandr Redko	fa90ca46e5	ci(lint): enable godot; append dot at the end of comments Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>	2023-10-31 19:53:38 +02:00
Bryan Boreham	90e98e0235	tsdb: create isolation transaction slice on demand When Prometheus restarts it creates every series read in from the WAL, but many of those series will be finished, and never receive any more samples. By defering allocation of the txRing slice to when it is first needed, we save 32 bytes per stale series. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-10-21 13:45:47 +00:00
Ganesh Vernekar	f5913266a1	Additionally wrap WBL replay error (#12406 ) * Additionally wrap WBL replay error Although WBL replay is already wrapped with errLoadWbl, there are other errors that can happen during a WBL replay. We should not try to repair WAL in those cases. This commit additionally wraps the final error in Head.Init again with errLoadWbl so that WBL replay errors can be identified properly. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com> Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>	2023-10-13 14:21:35 +02:00
Bryan Boreham	6dcbd653e9	tsdb: register metrics after Head is initialized (#12876 ) This avoids situations where metrics are scraped before the data they are trying to look at is initialized. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-09-25 21:57:08 +01:00
Alan Protasio	959c98441b	Add context argument to tsdb.PostingsForMatchers Signed-off-by: Alan Protasio <alanprot@gmail.com>	2023-09-16 18:13:32 +02:00
Arve Knudsen	6ef9ed0bc3	Add context argument to DB.Delete (#12834 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2023-09-13 15:43:06 +02:00
Justin Lei	8ef7dfdeeb	Add a chunk size limit in bytes (#12054 ) Add a chunk size limit in bytes This creates a hard cap for XOR chunks of 1024 bytes. The limit for histogram chunk is also 1024 bytes, but it is a soft limit as a histogram has a dynamic size, and even a single one could be larger than 1024 bytes. This also avoids cutting new histogram chunks if the existing chunk has fewer than 10 histograms yet. In that way, we are accepting "jumbo chunks" in order to have at least 10 histograms in a chunk, allowing compression to kick in. Signed-off-by: Justin Lei <justin.lei@grafana.com>	2023-08-24 15:21:17 +02:00
Oleg Zaytsev	61daa30bb1	Pass ref to SeriesLifecycleCallback.PostDeletion (#12626 ) When a particular SeriesLifecycleCallback tries to optimize and run closer to the Head, keeping track of the HeadSeriesRef instead of the labelsets, it's impossible to handle the PostDeletion callback properly as there's no way to know which series refs were deleted from the head. This changes the callback to provide the series refs alongside the labelsets, so the implementation can choose what to do. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2023-08-03 10:56:27 +02:00
cui fliter	f26dfc95e6	fix struct name in comment (#12624 ) Signed-off-by: cui fliter <imcusg@gmail.com>	2023-08-01 12:24:42 +02:00
Łukasz Mierzwa	3c80963e81	Use a linked list for memSeries.headChunk (#11818 ) Currently memSeries holds a single head chunk in-memory and a slice of mmapped chunks. When append() is called on memSeries it might decide that a new headChunk is needed to use for given append() call. If that happens it will first mmap existing head chunk and only after that happens it will create a new empty headChunk and continue appending our sample to it. Since appending samples uses write lock on memSeries no other read or write can happen until any append is completed. When we have an append() that must create a new head chunk the whole memSeries is blocked until mmapping of existing head chunk finishes. Mmapping itself uses a lock as it needs to be serialised, which means that the more chunks to mmap we have the longer each chunk might wait for it to be mmapped. If there's enough chunks that require mmapping some memSeries will be locked for long enough that it will start affecting queries and scrapes. Queries might timeout, since by default they have a 2 minute timeout set. Scrapes will be blocked inside append() call, which means there will be a gap between samples. This will first affect range queries or calls using rate() and such, since the time range requested in the query might have too few samples to calculate anything. To avoid this we need to remove mmapping from append path, since mmapping is blocking. But this means that when we cut a new head chunk we need to keep the old one around, so we can mmap it later. This change makes memSeries.headChunk a linked list, memSeries.headChunk still points to the 'open' head chunk that receives new samples, while older, yet to be mmapped, chunks are linked to it. Mmapping is done on a schedule by iterating all memSeries one by one. Thanks to this we control when mmapping is done, since we trigger it manually, which reduces the risk that it will have to compete for mmap locks with other chunks. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2023-07-31 11:10:24 +02:00
Merrick Clay	70e41fc5ac	improve incorrect doc comment Signed-off-by: Merrick Clay <merrick.e.clay@gmail.com>	2023-07-10 16:52:00 -06:00
Justin Lei	89af351730	Remove samplesPerChunk from memSeries (#12390 ) Signed-off-by: Justin Lei <justin.lei@grafana.com>	2023-05-25 11:18:41 +02:00
Baskar Shanmugam	905a0bd63a	Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336 ) * Added 'topN' query parameter support to /api/v1/status/tsdb endpoint Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Updated query parameter for tsdb status to 'limit' Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Corrected Stats() parameter name from topN to limit Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Fixed p.Stats CI failure Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> --------- Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>	2023-05-22 14:37:07 +02:00
Callum Styan	0d2108ad79	[tsdb] re-implement WAL watcher to read via a "notification" channel (#11949 ) * WIP implement WAL watcher reading via notifications over a channel from the TSDB code Signed-off-by: Callum Styan <callumstyan@gmail.com> * Notify via head appenders Commit (finished all WAL logging) rather than on each WAL Log call Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix misspelled Notify plus add a metric for dropped Write notifications Signed-off-by: Callum Styan <callumstyan@gmail.com> * Update tests to handle new notification pattern Signed-off-by: Callum Styan <callumstyan@gmail.com> * this test maybe needs more time on windows? Signed-off-by: Callum Styan <callumstyan@gmail.com> * does this test need more time on windows as well? Signed-off-by: Callum Styan <callumstyan@gmail.com> * read timeout is already a time.Duration Signed-off-by: Callum Styan <callumstyan@gmail.com> * remove mistakenly commited benchmark data files Signed-off-by: Callum Styan <callumstyan@gmail.com> * address some review feedback Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix missed changes from previous commit Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix issues from wrapper function Signed-off-by: Callum Styan <callumstyan@gmail.com> * try fixing race condition in test by allowing tests to overwrite the read ticker timeout instead of calling the Notify function Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix linting Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2023-05-15 12:31:49 -07:00
Björn Rabenstein	37fe9b89dc	Merge pull request #12055 from leizor/leizor/prometheus/issues/12009 Adjust samplesPerChunk from 120 to 220	2023-05-10 14:45:12 +02:00
Bryan Boreham	0ab9553611	tsdb: drop deleted series from the WAL sooner (#12297 ) `head.deleted` holds the WAL segment in use at the time each series was removed from the head. At the end of `truncateWAL()` we will delete all segments up to `last`, so we can drop any series that were last seen in a segment at or before that point. (same change in Prometheus Agent too) Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-05-01 16:43:15 +01:00
Matthieu MOREL	bae9a21200	Merge branch 'main' into linter/nilerr Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-19 19:56:39 +02:00
Đurica Yuri Nikolić	b028112331	Making the number of CPU cores used for sorting postings lists editable (#12247 ) Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>	2023-04-18 12:13:05 +02:00
Justin Lei	052993414a	Add storage.tsdb.samples-per-chunk flag Signed-off-by: Justin Lei <justin.lei@grafana.com>	2023-04-13 15:59:49 -07:00
Matthieu MOREL	fb3eb21230	enable gocritic, unconvert and unused linters Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-13 19:20:22 +00:00
beorn7	817a2396cb	Name float values as "floats", not as "values" In the past, every sample value was a float, so it was fine to call a variable holding such a float "value" or "sample". With native histograms, a sample might have a histogram value. And a histogram value is still a value. Calling a float value just "value" or "sample" or "V" is therefore misleading. Over the last few commits, I already renamed many variables, but this cleans up a few more places where the changes are more invasive. Note that we do not to attempt naming in the JSON APIs or in the protobufs. That would be quite a disruption. However, internally, we can call variables as we want, and we should go with the option of avoiding misunderstandings. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-13 19:25:24 +02:00
Ganesh Vernekar	e709b0b36e	Merge pull request #12127 from codesome/ooo-mmap-replay Update OOO min/max time properly after replaying m-map chunks	2023-04-04 12:05:57 +05:30
Oleg Zaytsev	6e2905a4d4	Use zeropool.Pool to workaround SA6002 (#12189 ) * Use zeropool.Pool to workaround SA6002 I built a tiny library called https://github.com/colega/zeropool to workaround the SA6002 staticheck issue. While searching for the references of that SA6002 staticheck issues on Github first results was Prometheus itself, with quite a lot of ignores of it. This changes the usages of `sync.Pool` to `zeropool.Pool[T]` where a pointer is not available. Also added a benchmark for HeadAppender Append/Commit when series already exist, which is one of the most usual cases IMO, as I didn't find any. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Improve BenchmarkHeadAppender with more cases Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * A little copying is better than a little dependency https://www.youtube.com/watch?v=PAAkCSZUG1c&t=9m28s Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Fix imports order Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add license header Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Copyright should be on one of the first 3 lines Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Use require.Equal for testing I don't depend on testify in my lib, but here we have it available. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Avoid flaky test Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Also use zeropool for pointsPool in engine.go Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> --------- Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2023-03-29 20:34:34 +01:00
Abhijit Mukherjee	8f6d5dcd45	Fix: getting rid of EncOOOXOR chunk encoding (#12111 ) Signed-off-by: mabhi <abhijit.mukherjee@infracloud.io>	2023-03-16 15:53:47 +05:30
Ganesh Vernekar	2af44f9558	tsdb: Update OOO min/max time properly after replaying m-map chunks Without this fix, if snapshots were enabled, and wbl goes missing between restarts, then TSDB does not recognize that there are ooo mmap chunks on disk and we cannot query them until those chunks are compacted into blocks. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2023-03-13 13:14:00 +05:30
Ganesh Vernekar	c9d06f2826	tsdb: Replay m-map chunk only when required M-map chunks replayed on startup are discarded if there was no WAL and no snapshot loaded, because there is no series created in the Head that it can map to. So only load m-map chunks from disk if there is either a snapshot loaded or there is WAL on disk. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2023-03-13 13:13:42 +05:30
Ganesh Vernekar	6c008ec56a	Merge pull request #11962 from jesusvazquez/jvp/protect-new-compaction-head-from-uninitialized-wbl TSDB: Protect NewOOOCompactionHead from an uninitialized wbl	2023-03-13 10:52:03 +05:30

1 2 3 4 5

238 Commits (ceca6c4716e275d39bd28798bd6ac0ec998938e7)