Use a heap for Next for merges, and
pre-compute if there's many postings on the
unset path.
Add posting lookup benchmarks
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Added methods needed to retain data based on a byte limitation rather than time. Limitation is only applied if the flag is set (defaults to 0). Both blocks that are older than the retention period and the blocks that make the size of the storage too large are removed.
2 new metrics for keeping track of the size of the local storage folder and the amount of times data has been deleted because the size restriction was exceeded.
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
Changes:
* Make `NewReader` method useful. It was impossible to use it, because closer was always nil.
* ReadSymbols, TOC and ReadOffsetTable are not public functions (used by Thanos).
* decbufXXX are now functions.
* More verbose errors.
* Removed unused crc32 field.
* Some var name changes to make it more verbose:
* symbols -> allocatedSymbols
* symbolsSlice -> symbolsV1
* symbols -> symbolsV2
*
* Pre-calculate symbolsTableSize.
* Initialized symbols for Symbols() method with valid length.
* Added test for Symbol method.
* Made Decoder LookupSymbol method public. Kept Decode public as it is useful as helper from index package.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
This change also uses the latest staticcheck version which comes with
new verifications, hence some clean up in the code.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Avoid a tree of merge objects, which can result in
what I suspect is n^2 calls to Seek when using Without.
With 100k metrics, and a regex of ^$ in BenchmarkHeadPostingForMatchers:
Before:
BenchmarkHeadPostingForMatchers-8 1 51633185216 ns/op 29745528 B/op 200357 allocs/op
After:
BenchmarkHeadPostingForMatchers-8 10 108924996 ns/op 25715025 B/op 101748 allocs/op
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This saves memory, about a quarter of the size of the postings map
itself with high-cardinality labels (not including the post ids).
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This reduces memory by only having to store the string's 16
bytes+map overheard once per label name, rather than duplicating it in every
entry for the label value.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Reuse the string already allocated for symbols
in the posting tables.
Use a slice for symbols in v2 format.
Move symbol size logic into the index code.
Avoid duplication of lookupSymbol logic.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
fixes: https://github.com/prometheus/tsdb/issues/426
Using `filepath.Join()` instead of strings containing forward slash path delimiters (needed for non-*nix OSes), as suggested by @krasi-georgiev
more meaningful names for serializedStringTuples and stringTuples structs
Signed-off-by: knrt10 <tripathi.kautilya@gmail.com>
Co-authored-by: Krasi Georgiev <kgeorgie@redhat.com>
Currently the offsets are cast into uint32 even though the index can
grow larger than 4GiB.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>