* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
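For illustration, a minimal sketch of the pattern this migration follows (the directory and printed fields are arbitrary): os.ReadDir replaces ioutil.ReadDir and returns []fs.DirEntry, so the full os.FileInfo is only fetched where it is actually needed.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

func main() {
	// os.ReadDir (Go 1.16+) replaces ioutil.ReadDir and returns
	// []fs.DirEntry instead of []os.FileInfo, avoiding a stat per entry.
	entries, err := os.ReadDir(".")
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range entries {
		// Call Info() only where the full os.FileInfo is actually needed.
		info, err := e.Info()
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(e.Name(), info.Size())
	}
}
```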
Calling `wal.NewSegmentBufReader()` without any segments would cause a
`panic`, crashing Prometheus. This patch fixes the panic by making
segmentBufReader return an EOF if there are no segments.
This also means that an empty checkpoint directory (which should never
occur unless it has been tampered with, or the underlying filesystem,
e.g. NFS, has issues) is now ignored, and Prometheus continues to run
instead of panicking as it did before.
Fixes: https://github.com/prometheus/prometheus/issues/9605
Signed-off-by: Sunil Thaha <sthaha@redhat.com>
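A minimal sketch of the guard this patch describes, using illustrative stand-in types rather than the real segmentBufReader: with an empty segment list, Read reports io.EOF instead of indexing into the slice and panicking.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

type segmentBufReader struct {
	segs []*bytes.Reader // stand-ins for WAL segments
	cur  int
}

func (r *segmentBufReader) Read(b []byte) (int, error) {
	// Guard: with no segments there is nothing to read; return io.EOF
	// rather than panicking on an empty slice.
	if len(r.segs) == 0 {
		return 0, io.EOF
	}
	n, err := r.segs[r.cur].Read(b)
	if err == io.EOF && r.cur < len(r.segs)-1 {
		r.cur++
		return n, nil
	}
	return n, err
}

func main() {
	var r segmentBufReader
	_, err := r.Read(make([]byte, 8))
	fmt.Println(err) // EOF, not a panic
}
```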
Snappy cannot encode records larger than ~3.7 GB and will panic if an
encoding is attempted. Check to make sure that the record is smaller
than this before encoding.
In the future, we could improve this behavior to still compress large
records (or break them up into smaller records), but this avoids the
panic for users with very large single scrape targets.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
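A hedged sketch of one way such a guard can look, assuming the github.com/golang/snappy package, whose MaxEncodedLen returns a negative value when the input is too large to encode (Encode would panic on such input):

```go
package main

import (
	"fmt"

	"github.com/golang/snappy"
)

// maybeCompress compresses a record unless it is too large for snappy.
func maybeCompress(rec []byte) []byte {
	if snappy.MaxEncodedLen(len(rec)) < 0 {
		// Record is too large for snappy; keep it uncompressed
		// instead of letting Encode panic.
		return rec
	}
	return snappy.Encode(nil, rec)
}

func main() {
	fmt.Println(len(maybeCompress([]byte("hello, wal record"))))
}
```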
Right now a new segment might be created unnecessarily if the
uncompressed record would not fit, even though it would fit after
compression (which typically halves the record size).
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
* Exports metric for WAL write errors
Signed-off-by: John McBride <jpmmcbride@gmail.com>
* Correct name for counter
Signed-off-by: John McBride <jpmmcbride@gmail.com>
* Move WAL write failure to wal.go
Signed-off-by: John McBride <jpmmcbride@gmail.com>
* WAL write fail metric moved to Log for external consumers
Signed-off-by: John McBride <jpmmcbride@gmail.com>
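A hedged sketch of what exporting such a counter looks like with client_golang; the metric, type, and method names below are illustrative and the wiring is simplified, not the exact Prometheus code:

```go
package main

import (
	"errors"
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

type wal struct {
	writesFailed prometheus.Counter
}

func newWAL(reg prometheus.Registerer) *wal {
	w := &wal{
		writesFailed: prometheus.NewCounter(prometheus.CounterOpts{
			Name: "wal_writes_failed_total", // illustrative name
			Help: "Total number of WAL write operations that failed.",
		}),
	}
	if reg != nil {
		reg.MustRegister(w.writesFailed)
	}
	return w
}

// Log wraps the underlying write and increments the counter on failure, so
// external consumers of the WAL have their failed writes counted too.
func (w *wal) Log(write func() error) error {
	if err := write(); err != nil {
		w.writesFailed.Inc()
		return err
	}
	return nil
}

func main() {
	w := newWAL(prometheus.NewRegistry())
	fmt.Println(w.Log(func() error { return errors.New("disk full") }))
}
```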
In running Prometheus instances, compressing the records was shown to
reduce disk usage by half while incurring a negligible CPU cost.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
With the next release of client_golang, Summaries will not have
objectives by default.
As it turns out, for prometheus_tsdb_head_gc_duration_seconds and
prometheus_tsdb_wal_truncate_duration_seconds, the objective-less
default makes more sense than the current default.
To make sure we do the right thing before and after the upcoming
release of client_golang, I have set the objectives explicitly
wherever that was not the case so far:
- prometheus_tsdb_head_gc_duration_seconds and
prometheus_tsdb_wal_truncate_duration_seconds now explicitly have no
objectives.
- prometheus_tsdb_wal_fsync_duration_seconds now explicitly uses the
previous default objectives.
Signed-off-by: beorn7 <beorn@grafana.com>
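A hedged sketch of setting Summary objectives explicitly with client_golang, so behaviour does not depend on the library default; the metric names here are examples, not the real Prometheus ones:

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

func main() {
	// No objectives: the Summary only exposes _sum and _count.
	gcDuration := prometheus.NewSummary(prometheus.SummaryOpts{
		Name:       "example_gc_duration_seconds",
		Help:       "Duration of example GC runs.",
		Objectives: map[float64]float64{},
	})

	// The previous client_golang default objectives, set explicitly.
	fsyncDuration := prometheus.NewSummary(prometheus.SummaryOpts{
		Name:       "example_fsync_duration_seconds",
		Help:       "Duration of example fsync calls.",
		Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
	})

	gcDuration.Observe(0.01)
	fsyncDuration.Observe(0.002)
	fmt.Println("summaries created with explicit objectives")
}
```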
This reduces disk space usage so that small setups no longer require a
minimum of three 128MB files. It may also help with debugging WAL data
issues by making things a bit more deterministic.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Always create a new clean segment when starting the WAL.
* Ensure we flush the last page after repairing and before recreating the
new segment in Repair.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* fix wal panic when page flush fails.
New records should be added to the page only when the last flush
succeeded. Otherwise the page would still be full, and trying to add a
new record would panic.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
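A hedged sketch (not the actual WAL code) of the invariant described above: records are only appended once the previous flush has succeeded, otherwise the flush is retried first, so a still-full page is never appended to.

```go
package main

import (
	"errors"
	"fmt"
)

const pageSize = 8 // tiny page for illustration

type pagedWriter struct {
	page    []byte
	flushFn func([]byte) error
	lastErr error // outcome of the most recent flush
}

func (w *pagedWriter) flush() error {
	w.lastErr = w.flushFn(w.page)
	if w.lastErr == nil {
		w.page = w.page[:0]
	}
	return w.lastErr
}

func (w *pagedWriter) log(rec []byte) error {
	// If the previous flush failed, the page may still be full. Retry the
	// flush before appending rather than overrunning the page.
	if w.lastErr != nil {
		if err := w.flush(); err != nil {
			return err
		}
	}
	if len(w.page)+len(rec) > pageSize {
		if err := w.flush(); err != nil {
			return err
		}
	}
	w.page = append(w.page, rec...)
	return nil
}

func main() {
	fail := true
	w := &pagedWriter{flushFn: func([]byte) error {
		if fail {
			return errors.New("disk full")
		}
		return nil
	}}
	fmt.Println(w.log([]byte("12345678"))) // <nil>: fills the page
	fmt.Println(w.log([]byte("abc")))      // disk full: flush failed, page kept
	fail = false
	fmt.Println(w.log([]byte("abc"))) // <nil>: flush retried, record appended
}
```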
Since Go 1.12, no special handling is required for file.Sync().
@pborzenkov thanks for the pointer.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
If the corrupt segment is full, then we set donePages on open,
c59ed492b2/wal/wal.go (L235-L243)
Then when we try to repair, we set the segment to be a new segment but
we don't update the donePages: c59ed492b2/wal/wal.go (L334)
When we try to log to this segment, because donePages is full, we never
log anything to it and instead create a new one: c59ed492b2/wal/wal.go (L486)
This does not cause issues because we simply concatenate the segments on
read, thereby transparently skipping this `0b` segment.
Make WAL live tailer return EOF when there is a half-written record at the end of the file.
Previously, this would cause an infinite loop as we ignored EOFs when filling the buffer. We now differentiate between EOFs that read >0 bytes, and EOFs that didn't.
Add some more unit tests for tailing a corrupt WAL, and unify interfaces Reader and LiveReader for the purposes of testing.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
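A hedged sketch of the distinction described above, with an illustrative reader standing in for the live segment: an EOF that arrives together with bytes means a record is still being written and the tailer should retry, while an EOF with zero bytes means it is fully caught up.

```go
package main

import (
	"fmt"
	"io"
)

// tailReader simulates reading the live end of a WAL segment: it returns the
// bytes written so far together with io.EOF, as an io.Reader is allowed to.
type tailReader struct {
	data []byte
	off  int
}

func (t *tailReader) Read(p []byte) (int, error) {
	n := copy(p, t.data[t.off:])
	t.off += n
	return n, io.EOF
}

// fill reads into buf and reports whether the tailer is fully caught up.
func fill(r io.Reader, buf []byte) (n int, caughtUp bool, err error) {
	n, err = r.Read(buf)
	if err == io.EOF {
		// EOF after reading some bytes: likely a half-written record,
		// retry later. EOF with zero bytes read: nothing more for now.
		return n, n == 0, nil
	}
	return n, false, err
}

func main() {
	r := &tailReader{data: []byte("partial-record")}
	buf := make([]byte, 64)
	n, caughtUp, _ := fill(r, buf)
	fmt.Println(n, caughtUp) // 14 false: got bytes alongside EOF, retry later
	n, caughtUp, _ = fill(r, buf)
	fmt.Println(n, caughtUp) // 0 true: EOF with nothing read, caught up
}
```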
Test to corrupt segments mid-WAL, repair and check we can read the correct number of records.
Make segmentBufReader pad short segments with zeros, and only advance curr segment index after fully reading segment.
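A hedged sketch of zero-padding a short segment up to a page boundary so that page-aligned readers downstream always see complete pages; the 32KB page size matches the WAL's, but the helper itself is illustrative.

```go
package main

import "fmt"

const pageSize = 32 * 1024

// padToPage extends a segment that ends mid-page with zero bytes.
func padToPage(seg []byte) []byte {
	if rem := len(seg) % pageSize; rem != 0 {
		seg = append(seg, make([]byte, pageSize-rem)...)
	}
	return seg
}

func main() {
	short := make([]byte, 1000)        // a segment that ends mid-page
	fmt.Println(len(padToPage(short))) // 32768: padded to a full page
}
```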
* refactor NewSegmentsRangeReader to take multi WAL ranges
In case of an error when checkpointing the WAL, the error doesn't show
the exact WAL segment that is corrupted. This is because it uses
MultiReader to read multiple WAL files.
This refactoring allows NewSegmentsRangeReader to take more than a
single WAL range and read all of the ranges by iterating over each one.
This changes the logs from
create checkpoint: read segments: corruption after 4841144384 bytes:...
to
create checkpoint: read segments: corruption in segment
data/wal/00017351 at 123142208: ...
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
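A hedged sketch of the pattern behind this refactoring; the names below (segmentRange, readRange) are illustrative, not the real Prometheus API. Reading the ranges one segment at a time lets an error name the exact segment file instead of a byte offset into a concatenation of all of them.

```go
package main

import (
	"fmt"
	"path/filepath"
)

type segmentRange struct {
	dir         string
	first, last int
}

// readRange visits every segment in every range in order, so any error can
// be attributed to a specific segment file.
func readRange(ranges []segmentRange, read func(path string) error) error {
	for _, r := range ranges {
		for i := r.first; i <= r.last; i++ {
			path := filepath.Join(r.dir, fmt.Sprintf("%08d", i))
			if err := read(path); err != nil {
				return fmt.Errorf("corruption in segment %s: %w", path, err)
			}
		}
	}
	return nil
}

func main() {
	ranges := []segmentRange{{dir: "data/wal", first: 17350, last: 17352}}
	_ = readRange(ranges, func(path string) error {
		fmt.Println("reading", path)
		return nil
	})
}
```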
* repair wal when the record cannot be decoded
Currently repair is run only when the error happens in the reader.
A corruption can occur after the record is read and when it is decoded.
This change wraps the error at decoding as a CorruptionErr as this error
is expected to trigger a repair.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
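A hedged sketch of wrapping a decode failure so it is treated like other corruptions and triggers a repair; corruptionErr here is illustrative, standing in for the WAL package's own corruption error type that carries segment and offset.

```go
package main

import (
	"errors"
	"fmt"
)

type corruptionErr struct {
	segment int
	offset  int64
	err     error
}

func (e *corruptionErr) Error() string {
	return fmt.Sprintf("corruption in segment %d at %d: %v", e.segment, e.offset, e.err)
}

func decodeRecord(rec []byte, segment int, offset int64) error {
	if len(rec) == 0 {
		// Wrap the decoding failure as a corruption error so callers that
		// repair on corruption also repair on undecodable records.
		return &corruptionErr{segment: segment, offset: offset, err: errors.New("empty record")}
	}
	// ... decode the record ...
	return nil
}

func main() {
	err := decodeRecord(nil, 17351, 123142208)
	var cerr *corruptionErr
	fmt.Println(errors.As(err, &cerr)) // true: a repair would be triggered
}
```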
* return an error when the last wal segment record is torn.
This ensures that a repair will be run when the last record in a segment
is torn.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>