diff --git a/tsdb/docs/format/chunks.md b/tsdb/docs/format/chunks.md
index 54b8b000e..4243ab93e 100644
--- a/tsdb/docs/format/chunks.md
+++ b/tsdb/docs/format/chunks.md
@@ -34,22 +34,62 @@ in-file offset (lower 4 bytes) and segment sequence number (upper 4 bytes).
 └───────────────┴───────────────────┴──────────────┴────────────────┘
 ```
 
-## XOR chunk
+Notes:
+* `` has 1 to 10 bytes.
+* `encoding`: Currently either `XOR` or `histogram`.
+* `data`: See below for each encoding.
 
-TODO(beorn7): Add.
-
-## Histogram chunk
-
-TODO(beorn7): This is out of date. Update once settled on the (more or less) final format.
+## XOR chunk data
 
 ```
-┌──────────────┬─────────────────┬──────────────────────────┬──────────────────────────┬──────────────┐
-│ len │ schema │ pos-spans │ neg-spans │ data │
-└──────────────┴─────────────────┴──────────────────────────┴──────────────────────────┴──────────────┘
-
-span-section:
-
-┌──────────────┬──────────────────┬──────────────────┬────────────┐
-│ len │ length1 │ offset1 │ length2... │
-└──────────────┴──────────────────┴──────────────────┴────────────┘
+┌──────────────────────┬───────────────┬───────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬─────┐
+│ num_samples │ ts_0 │ v_0 │ ts_1_delta │ v_1_xor │ ts_n_dod │ v_n_xor │ ... │
+└──────────────────────┴───────────────┴───────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴─────┘
 ```
+
+### Notes:
+
+* `ts` is the timestamp, `v` is the value.
+* `...` means to repeat the previous two fields as needed, with `n` starting at 2 and going up to `num_samples` – 1.
+* `` has 2 bytes in big-endian order.
+* `` and `` have 1 to 10 bytes each.
+* `ts_1_delta` is `ts_1` – `ts_0`.
+* `ts_n_dod` is the “delta of deltas” of timestamps, i.e. (`ts_n` – `ts_n-1`) – (`ts_n-1` – `ts_n-2`).
+* `v_n_xor` is the result of `v_n` XOR `v_n-1`.
+* `` is a specific variable bitwidth encoding of the result of XORing the current and the previous value. It has between 1 bit and 77 bits.
+  See [code for details](https://github.com/prometheus/prometheus/blob/7309c20e7e5774e7838f183ec97c65baa4362edc/tsdb/chunkenc/xor.go#L220-L253).
+* `` is a specific variable bitwidth encoding for the “delta of deltas” of timestamps (signed integers that are ideally small).
+  It has between 1 bit and 68 bits.
+  See [code for details](https://github.com/prometheus/prometheus/blob/7309c20e7e5774e7838f183ec97c65baa4362edc/tsdb/chunkenc/xor.go#L179-L205).
+
+## Histogram chunk data
+
+```
+┌──────────────────────┬───────────────────────────────┬─────────────────────┬──────────────────┬──────────────────┬────────────────┐
+│ num_samples │ zero_threshold <1 or 9 bytes> │ schema │ pos_spans │ neg_spans │ samples │
+└──────────────────────┴───────────────────────────────┴─────────────────────┴──────────────────┴──────────────────┴────────────────┘
+```
+
+### Positive and negative spans data:
+
+```
+┌───────────────────┬────────────────────────┬───────────────────────┬─────┬──────────────────────────┬─────────────────────────┐
+│ num │ length_1 │ offset_1 │ ... │ length_num │ offset_num │
+└───────────────────┴────────────────────────┴───────────────────────┴─────┴──────────────────────────┴─────────────────────────┘
+```
+
+### Samples data:
+
+```
+TODO
+```
+
+### Notes:
+
+* `zero_threshold` has a specific encoding:
+  * If 0, it is a single zero byte.
+  * If a power of two between 2^-243 and 2^10, it is a single byte between 1 and 254.
+  * Otherwise, it is a byte with all bits set (255), followed by a float64, for a total length of 9 bytes.
+* `schema` is a specific value defined by the exposition format. Currently valid values are -4 <= n <= 8.
+* `` is a variable bitwidth encoding for signed integers, optimized for “delta of deltas” of bucket deltas. It has between 1 bit and 9 bytes.
+* `` is a variable bitwidth encoding for unsigned integers with the same bit-bucketing as ``.