prometheus

Commit Graph

Author	SHA1	Message	Date
Arve Knudsen	6ef9ed0bc3	Add context argument to DB.Delete (#12834 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
György Krajcsovits	b2fa4d910a	Fix more counterResetInAnyBucket edge cases Case a) empty span is at the beginning of the spans. Case b) two consecutive empty spans with positive offsets. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	1 year ago
Fiona Liao	4419399e4e	Do WBL mmap marker replay concurrently (#12801 ) * Benchmark WBL Extended WAL benchmark test with WBL parts too - added basic cases for OOO handling - a percentage of series have a percentage of samples set as OOO ones. Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>	1 year ago
Shirley	d3a1044354	WBL loading: don't send empty buffers over chan (#12808 ) Signed-off-by: Shirley Leu <4163034+fridgepoet@users.noreply.github.com> Co-authored-by: Fiona Liao <fiona.y.liao@gmail.com>	1 year ago
Arve Knudsen	6daee89e5f	Add context argument to Querier.Select (#12660 ) Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	1 year ago
Fiona Liao	f211fcd92d	Remove duplicated ms.mmMaxTime check in WAL Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>	1 year ago
George Krajcsovits	b6f903b5f9	Fix handling of explicit counter reset header in histograms. (#12772 ) * Fix handling of explicit counter reset header in histograms. Explicit counter reset were being ignored. Also there was no unit test coverage. Add test case for the first sample in a chunk. Add test case for non first sample in chunk. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> --------- Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	1 year ago
Dimitar Dimitrov	b40865833d	PostingsForMatchers race with creating new series (#12558 ) Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>	1 year ago
Bryan Boreham	bdc7983956	TSDB: re-use iterator when moving between series Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Bryan Boreham	0d283effa8	promql: force mmap of head chunks in BenchmarkRangeQuery Otherwise we have a highly unusual situation of over 100 chunks in the headChunks list of each series, which heavily skews performance. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Gregor Zeitlinger	f01718262a	Unit tests for native histograms (#12668 ) promql: Extend testing framework to support native histograms This includes both the internal testing framework as well as the rules unit test feature of promtool. This also adds a bunch of basic tests. Many of the code level tests can now be converted to tests within the framework, and more tests can be added easily. --------- Signed-off-by: Harold Dost <h.dost@criteo.com> Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com> Signed-off-by: Stephen Lang <stephen.lang@grafana.com> Co-authored-by: Harold Dost <h.dost@criteo.com> Co-authored-by: Stephen Lang <stephen.lang@grafana.com> Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>	1 year ago
Michal Biesek	04d7b4dbee	lint: Fix `SA1019` Using a deprecated function `rand.Read` has been deprecated since Go 1.20 `crypto/rand.Read` is more appropriate Ref: https://tip.golang.org/doc/go1.20 Signed-off-by: Michal Biesek <michalbiesek@gmail.com>	1 year ago
Justin Lei	8ef7dfdeeb	Add a chunk size limit in bytes (#12054 ) Add a chunk size limit in bytes This creates a hard cap for XOR chunks of 1024 bytes. The limit for histogram chunk is also 1024 bytes, but it is a soft limit as a histogram has a dynamic size, and even a single one could be larger than 1024 bytes. This also avoids cutting new histogram chunks if the existing chunk has fewer than 10 histograms yet. In that way, we are accepting "jumbo chunks" in order to have at least 10 histograms in a chunk, allowing compression to kick in. Signed-off-by: Justin Lei <justin.lei@grafana.com>	1 year ago
beorn7	aa82fe198f	tsdb: Fix histogram validation So far, `ValidateHistogram` would not detect if the count did not include the count in the zero bucket. This commit fixes the problem and updates all the tests that have been undetected offenders so far. Note that this problem would only ever create false negatives, so we never falsely rejected to store a histogram because of it. On the other hand, `ValidateFloatHistogram` has been too strict with the count being at least as large as the sum of the counts in all the buckets. Float precision issues could create false positives here, see products of PromQL evaluations, it's actually quite hard to put an upper limit on the floating point imprecision. Users could produce the weirdest expressions, maxing out float precision problems. Therefore, this commit simply removes that particular check from `ValidateFloatHistogram`. Signed-off-by: beorn7 <beorn@grafana.com>	1 year ago
Mustafa Ateş Uzun	e5e51bebef	fix: error message typo Signed-off-by: Mustafa Ateş Uzun <mustafauzun0@gmail.com>	1 year ago
SuperQ	8d38d59fc5	Cleanup temporary chunk snapshot dirs Similar to cleanup of WAL files on startup, cleanup temporary chunk_snapshot dirs. This prevents storage space leaks due to terminated snapshots on shutdown. Signed-off-by: SuperQ <superq@gmail.com>	1 year ago
Oleg Zaytsev	6ea6def0d3	Use zeropool when replaying agent's DB WAL (#12651 ) Same as https://github.com/prometheus/prometheus/pull/12189 but for tsdb/agent/db.go Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	1 year ago
Oleg Zaytsev	c810e7cae3	Fix typo in Appender.AppendHistogram() arg name Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	1 year ago
Oleg Zaytsev	61daa30bb1	Pass ref to SeriesLifecycleCallback.PostDeletion (#12626 ) When a particular SeriesLifecycleCallback tries to optimize and run closer to the Head, keeping track of the HeadSeriesRef instead of the labelsets, it's impossible to handle the PostDeletion callback properly as there's no way to know which series refs were deleted from the head. This changes the callback to provide the series refs alongside the labelsets, so the implementation can choose what to do. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	1 year ago
Oleg Zaytsev	cd7d0b69a2	Check nil err first when committing (#12625 ) The most common case is to have a nil error when appending series, so let's check that first instead of checking the 3 error types first. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	1 year ago
cui fliter	f26dfc95e6	fix struct name in comment (#12624 ) Signed-off-by: cui fliter <imcusg@gmail.com>	1 year ago
Łukasz Mierzwa	3c80963e81	Use a linked list for memSeries.headChunk (#11818 ) Currently memSeries holds a single head chunk in-memory and a slice of mmapped chunks. When append() is called on memSeries it might decide that a new headChunk is needed to use for given append() call. If that happens it will first mmap existing head chunk and only after that happens it will create a new empty headChunk and continue appending our sample to it. Since appending samples uses write lock on memSeries no other read or write can happen until any append is completed. When we have an append() that must create a new head chunk the whole memSeries is blocked until mmapping of existing head chunk finishes. Mmapping itself uses a lock as it needs to be serialised, which means that the more chunks to mmap we have the longer each chunk might wait for it to be mmapped. If there's enough chunks that require mmapping some memSeries will be locked for long enough that it will start affecting queries and scrapes. Queries might timeout, since by default they have a 2 minute timeout set. Scrapes will be blocked inside append() call, which means there will be a gap between samples. This will first affect range queries or calls using rate() and such, since the time range requested in the query might have too few samples to calculate anything. To avoid this we need to remove mmapping from append path, since mmapping is blocking. But this means that when we cut a new head chunk we need to keep the old one around, so we can mmap it later. This change makes memSeries.headChunk a linked list, memSeries.headChunk still points to the 'open' head chunk that receives new samples, while older, yet to be mmapped, chunks are linked to it. Mmapping is done on a schedule by iterating all memSeries one by one. Thanks to this we control when mmapping is done, since we trigger it manually, which reduces the risk that it will have to compete for mmap locks with other chunks. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	1 year ago
Robert Fratto	886945cda7	tsdb/agent: ensure that new series get written to WAL on rollback (#12592 ) If a new series is introduced in a storage.Appender instance, that series should be written to the WAL once the storage.Appender is closed, even on Rollback. Previously, new series would only be written to the WAL when calling Commit. However, because the series is stored in memory regardless, subsequent calls to Commit may write samples to the WAL which reference a series ID that was never written. Related to #11589. It's likely that this fix also resolves this issue, but we need more testing from users to see if the problem persists after this fix; there may be more cases where samples get written to the WAL in Prometheus Agent mode without the corresponding series record. Signed-off-by: Robert Fratto <robertfratto@gmail.com>	1 year ago
George Krajcsovits	6cd2d1621f	Hide histogram chunk append and reset header internals (#12352 ) tsdb: Hide histogram chunk append and reset header internals Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>	1 year ago
György Krajcsovits	d4e355243a	tsdbutil/ChunkFromSamplesGeneric should not panic Add error handling instead. Prepares for #12352 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	1 year ago
cui fliter	096ceca44f	remove repetitive words (#12556 ) Signed-off-by: cui fliter <imcusg@gmail.com>	1 year ago
beorn7	0e3f35324b	scrape: Enable ingestion of multiple exemplars per sample This has become a requirement for native histograms, as a single histogram sample commonly has many buckets, so that providing many exemplars makes sense. Since OM text doesn't support native histograms yet, the test had to be expanded to also support protobuf test cases. Signed-off-by: beorn7 <beorn@grafana.com>	1 year ago
Justin Lei	32d87282ad	Add Zstandard compression option for wlog (#11666 ) Snappy remains as the default compression but there is now a flag to switch the compression algorithm. Signed-off-by: Justin Lei <justin.lei@grafana.com>	1 year ago
Julien Pivotto	bf5bf1a4b3	TSDB: Remove unused import of sort Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	1 year ago
Merrick Clay	70e41fc5ac	improve incorrect doc comment Signed-off-by: Merrick Clay <merrick.e.clay@gmail.com>	1 year ago
Bryan Boreham	ce153e3fff	Replace sort.Sort with faster slices.SortFunc The generic version is more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Marc Tudurí	4851ced266	tsdb: Support native histograms in snapshot on shutdown (#12258 ) Signed-off-by: Marc Tuduri <marctc@protonmail.com>	1 year ago
Patrick Oyarzun	68e5937474	Apply relevant label matchers in LabelValues before fetching extra postings (#12274 ) * Apply matchers when fetching label values Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com> * Avoid extra copying of label values Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com> --------- Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>	1 year ago
Bryan Boreham	5255bf06ad	Replace sort.Slice with faster slices.SortFunc The generic version is more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	1 year ago
Marco Pracucci	35069910f5	Fix infinite loop in index Writer when a series contains duplicated label names Signed-off-by: Marco Pracucci <marco@pracucci.com>	1 year ago
Marco Pracucci	031d22df9e	Fix race condition in ChunkDiskMapper.Truncate() (#12500 ) * Fix race condition in ChunkDiskMapper.Truncate() Signed-off-by: Marco Pracucci <marco@pracucci.com> * Added unit test Signed-off-by: Marco Pracucci <marco@pracucci.com> * Update tsdb/chunks/head_chunks.go Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Marco Pracucci <marco@pracucci.com> --------- Signed-off-by: Marco Pracucci <marco@pracucci.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	1 year ago
Nidhey Nitin Indurkar	a8772a4178	Feat: Get block by id directly on promtool analyze & get latest block if ID not provided (#12031 ) * feat: analyze latest block or block by ID in CLI (promtool) Signed-off-by: nidhey27 <nidhey.indurkar@infracloud.io> * address remarks Signed-off-by: nidhey60@gmail.com <nidhey.indurkar@infracloud.io> * address latest review comments Signed-off-by: nidhey60@gmail.com <nidhey.indurkar@infracloud.io> --------- Signed-off-by: nidhey27 <nidhey.indurkar@infracloud.io> Signed-off-by: nidhey60@gmail.com <nidhey.indurkar@infracloud.io>	2 years ago
Alan Protasio	73078bf738	Optimizing Group Regex (#12375 ) Signed-off-by: Alan Protasio <alanprot@gmail.com>	2 years ago
Justin Lei	e73d8b2084	Also pass chunkOpts into appendPreprocessor Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
Justin Lei	4c4454e4c9	Group args to append to memSeries in chunkOpts Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
Justin Lei	89af351730	Remove samplesPerChunk from memSeries (#12390 ) Signed-off-by: Justin Lei <justin.lei@grafana.com>	2 years ago
zenador	37e5249e33	Use DefaultSamplesPerChunk in tsdb (#12387 ) Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2 years ago
Baskar Shanmugam	905a0bd63a	Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336 ) * Added 'topN' query parameter support to /api/v1/status/tsdb endpoint Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Updated query parameter for tsdb status to 'limit' Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Corrected Stats() parameter name from topN to limit Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Fixed p.Stats CI failure Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> --------- Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>	2 years ago
Alan Protasio	8c5d4b4add	Optimize MatchNotEqual (#12377 ) Signed-off-by: Alan Protasio <alanprot@gmail.com>	2 years ago
Matthieu MOREL	c8e7f95a3c	ci(lint): enable predeclared linter Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2 years ago
George Krajcsovits	92d6980360	Fix populateWithDelChunkSeriesIterator and gauge histograms (#12330 ) Use AppendableGauge to detect corrupt chunk with gauge histograms. Detect if first sample is a gauge but the chunk is not set up to contain gauge histograms. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com> Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>	2 years ago
Baskar Shanmugam	f731a90a7f	Fix LabelValueStats in posting stats (#12342 ) Problem: LabelValueStats - This will provide a list of the label names and memory used in bytes. It is calculated by adding the length of all values for a given label name. But internally Prometheus stores the name and the value independently for each series. Solution: MemPostings struct maintains the values to seriesRef map which is used to get the number of series which contains the label values. Using that LabelValueStats is calculated as: seriesCnt * len(value name) Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>	2 years ago
Xiaochao Dong	80b7f73d26	Copy tombstone intervals to avoid race (#12245 ) Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2 years ago
Callum Styan	0d2108ad79	[tsdb] re-implement WAL watcher to read via a "notification" channel (#11949 ) * WIP implement WAL watcher reading via notifications over a channel from the TSDB code Signed-off-by: Callum Styan <callumstyan@gmail.com> * Notify via head appenders Commit (finished all WAL logging) rather than on each WAL Log call Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix misspelled Notify plus add a metric for dropped Write notifications Signed-off-by: Callum Styan <callumstyan@gmail.com> * Update tests to handle new notification pattern Signed-off-by: Callum Styan <callumstyan@gmail.com> * this test maybe needs more time on windows? Signed-off-by: Callum Styan <callumstyan@gmail.com> * does this test need more time on windows as well? Signed-off-by: Callum Styan <callumstyan@gmail.com> * read timeout is already a time.Duration Signed-off-by: Callum Styan <callumstyan@gmail.com> * remove mistakenly committed benchmark data files Signed-off-by: Callum Styan <callumstyan@gmail.com> * address some review feedback Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix missed changes from previous commit Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix issues from wrapper function Signed-off-by: Callum Styan <callumstyan@gmail.com> * try fixing race condition in test by allowing tests to overwrite the read ticker timeout instead of calling the Notify function Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix linting Signed-off-by: Callum Styan <callumstyan@gmail.com> --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2 years ago
György Krajcsovits	c6618729c9	Fix HistogramAppender.Appendable array out of bound error The code did not handle spans with 0 length properly. Spans with length zero are now skipped in the comparison. Span index check not done against length-1, since length is a uint32, thus subtracting 1 from zero wraps around to 2^32-1, not -1. Fixes and unit tests for both integer and float histograms added. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2 years ago

995 Commits (113938aeb894e60c5706ff9ca993344a990a96e7)