If the corrupt segment is full, then we set donePages on open,
c59ed492b2/wal/wal.go (L235-L243)
Then when we try to repair, we set the segment to be a new segment but
we don't update the donePages: c59ed492b2/wal/wal.go (L334)
We we try to log to this, because donePages is full, we will never log
anything to this segment and create a new one: c59ed492b2/wal/wal.go (L486)
This does not cause issues because we simply concatenate the segments on
read, there by transparently skipping this `0b` segment.
Make WAL live tailer return EOF when the there is a half-written record at the end of the file.
Previously, this would cause an infinite loop as we ignored EOFs when filling the buffer. We now differentiate between EOFs that read >0 bytes, and EOFs that didn't.
Add some more unit tests for tailing a corrupt WAL, and unify interfaces Reader and LiveReader for the purposes of testing.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Test to corrupt segments mid-WAL, repair and check we can read the correct number of records.
Make segmentBufReader pad short segments with zeros, and only advance curr segment index after fully reading segment.
* refactor NewSegmentsRangeReader to take multi WAL ranges
In case of an error when checkpointing the WAL the error doesn't show
the exact WAL index that is corrupter. this is because it uses
MultiReader to read multiply WAL files.
This refactoring allows the NewSegmentsRangeReader to take more than a
single WAL range and it reads all of the ranges by iterating each one.
this changes the logs from
create checkpoint: read segments: corruption after 4841144384 bytes:...
to
create checkpoint: read segments: corruption in segment
data/wal/00017351 at 123142208: ...
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* repair wal when the record cannot be decoded
Currently repair is run only when the error happens in the reader.
A corruption can occur after the record is read and when it is decoded.
This change wraps the error at decoding as a CorruptionErr as this error
is expected to trigger a repair.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* return an error when the last wal segment record is torn.
this ensures that a repair will be run when the last record in a segment
is torn.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* Add msg parameter to Equals function in testutil
Co-authored-by: Chris Marchbanks <csmarchbanks@gmail.com>
Signed-off-by: Camille Janicki <camille.janicki@gmail.com>
* Fix filehandling for windows
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Fix more windows filehandling issues
Windows: Close files before deleting Checkpoints.
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
Windows: Close writers in case of errors so they can be deleted
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
Windows: Close block so that it can be deleted.
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
Windows: Close file to delete it
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
Windows: Close dir so that it can be deleted.
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
Windows: close files so that they can be deleted.
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Review feedback
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
This reverts commit 98fe30438c.
After some discussion, it was concluded that we want the full
`prometheus_tsdb_...` prefix hardcoded in the library.
Signed-off-by: beorn7 <beorn@soundcloud.com>
The buffers we allocated were escaping to the heap, resulting in large
memory usage spikes during startup and checkpointing in Prometheus.
This attaches the buffer to the reader object to prevent this.
Signed-off-by: Fabian Reinartz <freinartz@google.com>
Allow to repair the WAL based on the error returned by a reader
during a full scan over all records.
Signed-off-by: Fabian Reinartz <freinartz@google.com>
This adds a new WAL that's agnostic to the actual record contents.
It's much simpler and should be more resilient than the existing one.
Signed-off-by: Fabian Reinartz <freinartz@google.com>