Commit Graph

164 Commits (48287b15d4fe3a2c9578982bdffeccdc0e14d1ef)

Author SHA1 Message Date
Oleg Zaytsev cd1f8ac129
MemPostings: keep a map of label values slices (#15426)
While investigating lock contention on `MemPostings`, we saw that lots
of locking is happening in `LabelValues` and
`PostingsForLabelsMatching`, both copying the label values slices while
holding the mutex.

This adds an extra map that holds an append-only label values slice for
each one of the label names. Since the slice is append-only, it can be
copied without holding the mutex.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-29 12:52:56 +01:00
Bryan Boreham ca3119bd24 TSDB: eliminate one yolostring
When the only use of a []byte->string conversion is as a map key, Go
doesn't allocate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-26 17:21:55 +00:00
Bryan Boreham e98c19c1ce [PERF] TSDB: Cache all symbols for compaction
Trade a bit more memory for a lot less CPU spent looking up symbols.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-26 17:21:55 +00:00
Oleg Zaytsev 9aa6e041d3
MemPostings: allocate ListPostings once in PFALV (#15465)
Same as #15427 but for the new method added in #14144

Instead of allocating each ListPostings one by one, allocate them all in
one go.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-26 16:03:45 +01:00
Oleg Zaytsev cc390aab64
MemPostings: allocate ListPostings once in PFLM (#15427)
Instead of allocating ListPostings pointers one by one, allocate a slice
and take pointers from that. It's faster, and also generates less
garbage (NewListPostings is one of the top offenders in number of
allocations).

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-20 17:52:20 +01:00
Arve Knudsen 89bbb885e5
Upgrade to golangci-lint v1.62.0 (#15424)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-20 17:22:20 +01:00
Arve Knudsen 06d54fcc6c
[PERF] TSDB: Optimize inverse matching (#14144)
Simple follow-up to #13620. Modify `tsdb.PostingsForMatchers` to use the optimized tsdb.IndexReader.PostingsForLabelMatching method also for inverse matching.

Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method.

The performance is much improved for a subset of the cases; there are up to
~60% CPU gains and ~12.5% reduction in memory usage. 

Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since
`inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-19 15:49:01 +00:00
Ben Ye 140f4aa9ae
feat: Allow customizing TSDB postings decoder (#13567)
* allow customizing TSDB postings decoder

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-11 07:59:24 +01:00
Oleg Zaytsev b1e4052682
MemPostings.Delete(): make pauses to unlock and let the readers read (#15242)
This introduces back some unlocking that was removed in #13286 but in a
more balanced way, as suggested by @pracucci.

For TSDBs with a lot of churn, Delete() can take a couple of seconds,
and while it's holding the mutex, reads and writes are blocked waiting
for that mutex, increasing the number of connections handled and memory
usage.

This implementation pauses every 4K labels processed (note that also
compared to #13286 we're not processing all the label-values anymore,
but only the affected ones, because of #14307), makes sure that it's
possible to get the read lock, and waits for a few milliseconds more.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2024-11-05 12:59:57 +01:00
Bryan Boreham 2fbbfc3da8 Revert "Fix `MemPostings.Add` and `MemPostings.Get` data race (#15141)"
This reverts commit 50ef0dc954.

Memory allocation goes so high in Prombench that the system is unusable.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-11-03 12:30:34 +00:00
Bryan Boreham e2e01c1cff
Merge pull request #15216 from yeya24/log-last-series-labels
log last series labelset when hitting OOO series labels
2024-11-01 14:15:39 +00:00
Oleg Zaytsev ba11a55df4
Revert "Process `MemPostings.Delete()` with `GOMAXPROCS` workers"
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-10-29 17:13:40 +01:00
Ben Ye 99882eec3b log last series labelset when hitting OOO series labels during compaction
Signed-off-by: Ben Ye <benye@amazon.com>
2024-10-24 09:27:15 -07:00
Oleg Zaytsev 50ef0dc954
Fix `MemPostings.Add` and `MemPostings.Get` data race (#15141)
* Tests for Mempostings.{Add,Get} data race
* Fix MemPostings.{Add,Get} data race

We can't modify the postings list that are held in MemPostings as they
might already be in use by some readers.

* Modify BenchmarkHeadStripeSeriesCreate to have common labels

If there are no common labels on the series, we don't excercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.

---------

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-10-11 15:21:15 +02:00
Bryan Boreham 54de4fb780
Merge pull request #14975 from colega/process-mempostings-delete-with-gomaxprocs-workers
Process `MemPostings.Delete()` with `GOMAXPROCS` workers
2024-09-29 07:58:42 +01:00
Oleg Zaytsev ada8a6ef10
Add some more tests for MemPostings_Delete
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-27 10:14:39 +02:00
Oleg Zaytsev 4fd2556baa
Extract processWithBoundedParallelismAndConsistentWorkers
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-26 15:43:19 +02:00
Oleg Zaytsev ccd0308abc
Don't do anything if MemPostings are empty
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 15:00:10 +02:00
Oleg Zaytsev 9c417aa710
Fix deadlock with empty MemPostings
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 14:08:50 +02:00
Bryan Boreham 5d8f0ef0c2
Merge pull request #14721 from bboreham/exp-grow-postings
[PERF] TSDB: Grow postings by doubling
2024-09-25 10:47:55 +01:00
Oleg Zaytsev e196b977af
Process MemPostings.Delete() with GOMAXPROCS workers
We are still seeing lock contention on MemPostings.mtx, and MemPostings.Delete() is by far the most expensive operation on that mutex.

This adds parallelism to that method, trying to reduce the amount of time we spend with the mutex held.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-09-25 10:38:47 +02:00
Bryan Boreham ca673eb749 Merge remote-tracking branch 'origin/release-2.55' into merge-2.55-into-main
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-22 17:49:34 +01:00
Bryan Boreham 31c5760551
Neater string vs byte-slice conversions (#14425)
unsafe.Slice and unsafe.StringData were added in Go 1.20

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-21 12:19:21 +02:00
Ganesh Vernekar 5ccb069414 Backward compatibility with upcoming index v3
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-09-19 10:27:52 +01:00
Bryan Boreham 33adbe47b1 [PERF] TSDB: Grow postings by doubling
Go's built-in append() grows larger slices with factor 1.3, which means we do a lot more allocating and copying for larger postings.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-24 11:16:58 +01:00
beorn7 0f760f63dd lint: Revamp our linting rules, mostly around doc comments
Several things done here:

- Set `max-issues-per-linter` to 0 so that we actually see all linter
  warnings and not just 50 per linter. (As we also set
  `max-same-issues` to 0, I assume this was the intention from the
  beginning.)

- Stop using the golangci-lint default excludes (by setting
  `exclude-use-default: false`. Those are too generous and don't match
  our style conventions. (I have re-added some of the excludes
  explicitly in this commit. See below.)

- Re-add the `errcheck` exclusion we have used so far via the
  defaults.

- Exclude the signature requirement `govet` has for `Seek` methods
  because we use non-standard `Seek` methods a lot. (But we keep other
  requirements, while the default excludes completely disabled the
  check for common method segnatures.)

- Exclude warnings about missing doc comments on exported symbols. (We
  used to be pretty adamant about doc comments, but stopped that at
  some point in the past. By now, we have about 500 missing doc
  comments. We may consider reintroducing this check, but that's
  outside of the scope of this commit. The default excludes of
  golangci-lint essentially ignore doc comments completely.)

- By stop using the default excludes, we now get warnings back on
  malformed doc comments. That's the most impactful change in this
  commit. It does not enforce doc comments (again), but _if_ there is
  a doc comment, it has to have the recommended form. (Most of the
  changes in this commit are fixing this form.)

- Improve wording/spelling of some comments in .golangci.yml, and
  remove an outdated comment.

- Leave `package-comments` inactive, but add a TODO asking if we
  should change that.

- Add a new sub-linter `comment-spacings` (and fix corresponding
  comments), which avoids missing spaces after the leading `//`.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-08-22 17:36:11 +02:00
Arve Knudsen 3a78e76282 Upgrade golangci-lint to v1.60.1
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-18 12:13:25 +02:00
Oleg Zaytsev 726ed124e4
Replace `ListPostings.Seek`'s binary search call by the generic `slices.BinarySearch` (#14393)
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-07-02 14:51:05 +01:00
Oleg Zaytsev fd1a89b7c8
Pass affected labels to `MemPostings.Delete()` (#14307)
* Pass affected labels to MemPostings.Delete

As suggested by @bboreham, we can track the labels of the deleted series
and avoid iterating through all the label/value combinations.

This looks much faster on the MemPostings.Delete call. We don't have a
benchmark on stripeSeries.gc() where we'll pay the price of iterating
the labels of each one of the deleted series.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 10:28:56 +00:00
Ben Ye 0e6fca8e76 add unit test
Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-16 12:09:42 -07:00
Ben Ye e7db2e30a4 fix check context cancellation not incrementing count
Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-15 11:43:26 -07:00
Oleg Zaytsev 64a9abb8be
Change LabelValuesFor() to accept index.Postings (#14280)
The only call we have to LabelValuesFor() has an index.Postings, and we
expand it to pass to this method, which will iterate over the values.

That's a waste of resources: we can iterate on the index.Postings
directly.

If there's any downstream implementation that has a slice of series,
they can always do an index.ListPostings from them: doing that is
cheaper than expanding an abstract index.Postings.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-11 15:36:46 +02:00
Oleg Zaytsev 10a3c7220b
`MemPostings.PostingsForLabelMatching()`: don't hold the mutex while matching (#14286)
* MemPostings.PostingsForLabelMatching: let mutex go

This changes the `MemPostings.PostingsForLabelMatching` implementation
to stop holding the read mutex while matching the label values.

We've seen that this method can be slow when the matcher is expensive,
that's why we even added a context expiration check.

However, there are critical process that might be waiting on this mutex:
writes (adding new series) and compaction (deleting the
garbage-collected ones), so we should avoid holding it for a long period
of time.

Given that we've copied the values to a slice anyway, there's no need to
hold the lock while matching.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-10 14:24:17 +02:00
Oleg Zaytsev 2dc177d8af
`MemPostings.Delete()`: reduce locking/unlocking (#13286)
* MemPostings: reduce locking/unlocking

MemPostings.Delete is called from Head.gc(), i.e. it gets the IDs of the
series that have churned.

I'd assume that many label values aren't affected by that churn at all,
so it doesn't make sense to touch the lock while checking them.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-10 14:23:22 +02:00
Arve Knudsen b8b9015e38 tsdb/index: Fix TestReader_PostingsForLabelMatchingHonorsContextCancel
Fix number of series in
TestReader_PostingsForLabelMatchingHonorsContextCancel (off by one).

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-06-03 17:29:06 +02:00
Oleg Zaytsev fe9cb5a803
Check context every 128 labels instead of 100 (#14118)
Follow up on https://github.com/prometheus/prometheus/pull/14096

As promised, I bring a benchmark, which shows a very small improvement
if context is checked every 128 iterations of label instead of every
100.

It's much easier for a computer to check modulo 128 than modulo 100.
This is a very small 0-2% improvement but I'd say this is one of the
hottest paths of the app so this is still relevant.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-21 11:30:43 +02:00
Arve Knudsen 5ca56eeb6b
tsdb/index: Refactor Reader tests (#14071)
tsdb/index: Refactor Reader tests

Co-authored-by: Björn Rabenstein <github@rabenste.in>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-05-16 11:51:46 +02:00
Oleksandr Redko f10c3454e9 Enable perfsprint linter and fix up code
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>
2024-05-15 17:51:05 +03:00
György Krajcsovits b215a41be4 tsdb/index/postings: fix missing lock unlock
Followup to #14096

Unfortunately the previous PR introduced this bug by not releasing the
lock before returning.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-05-15 14:02:39 +02:00
George Krajcsovits fdaafdb041
tsdb: check for context cancel before regex matching postings (#14096)
* tsdb: check for context cancel before regex matching postings

Regex matching can be heavy if the regex takes a lot of cycles to
evaluate and we can get stuck evaluating postings for a long time
without this fix. The constant checkContextEveryNIterations=100
may be changed later.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-05-15 06:26:19 +02:00
Arve Knudsen 5c4310aa37
[ENHANCEMENT] TSDB: Optimize querying with regexp matchers
Add method `PostingsForLabelMatching` to `tsdb.IndexReader`, to obtain postings for labels with a certain name and values accepted by a provided callback, and use it from `tsdb.PostingsForMatchers`.
The intention is to optimize regexp matcher paths, especially not having to load all label values before matching on them.

Plus tests, and refactor some `tsdb/index.Reader` methods.

Benchmarking shows memory reduction up to ~100%, and speedup of up to ~50%.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2024-05-09 10:55:30 +01:00
Arve Knudsen d699dc3c77
Fix language in docs and comments (#14041)
Fix language in docs and comments

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-05-08 17:57:09 +02:00
Matthieu MOREL 6f595c6762
golangci-lint: enable whitespace linter (#13905)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-11 09:27:54 +01:00
carrychair 856f6e49c8 fix function and struct name
Signed-off-by: carrychair <linghuchong404@gmail.com>
2024-03-09 17:53:17 +08:00
machine424 f477e0539a
Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21
Prevent adding back golang.org/x/exp/slices.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-02-28 14:54:53 +01:00
Bryan Boreham 93b72ec5dd tsdb: create SymbolTables for labels as required
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham 17f48f2b3b Tests: use replacement DeepEquals in more places
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-08 19:32:33 +00:00
Peter Štibraný e2b9cfeeeb
Enforce chunks ordering when writing index. (#8085)
Document conditions on chunks. Add check on chunk time ordering.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
2024-02-04 16:31:49 +01:00
Bryan Boreham 98c4889029
Merge pull request #9298 from Creatone/creatone/use-testify
tests: Move from t.Errorf and others.
2024-02-04 16:27:57 +01:00
Mikhail Fesenko 419dd265cc
Fix strange code, add messages to code brought in #8106 (#13509)
Signed-off-by: Mikhail Fesenko <proggga@gmail.com>
2024-02-02 10:00:38 +01:00