Skip WalPeriods that are empty #26237

Open
@hiltontj

Problem

#26223 resolved an issue that caused empty WriteBatches to be created. An empty WriteBatch had min and max timestamps of i64::MAX and i64::MIN, respectively, and could produce corrupted WAL files (see #25650).
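To illustrate why an empty batch ends up with inverted bounds, here is a minimal sketch. It assumes the bounds are computed by folding over the batch's timestamps starting from the sentinels i64::MAX and i64::MIN; the function name `timestamp_bounds` is hypothetical and is not taken from the codebase.

```rust
/// Hypothetical helper (not the actual implementation): computes the
/// (min, max) timestamp bounds of a batch by folding from sentinel values.
fn timestamp_bounds(timestamps_ns: &[i64]) -> (i64, i64) {
    timestamps_ns
        .iter()
        // Start with min = i64::MAX and max = i64::MIN so that any real
        // timestamp replaces the sentinel on the first comparison.
        .fold((i64::MAX, i64::MIN), |(min, max), &t| (min.min(t), max.max(t)))
}
```

With no timestamps to fold over, the sentinels are returned unchanged, so min is greater than max. That inverted pair is the signature of an empty batch described above.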

Corrupted WAL files may still exist in users' object stores, however, so we need to handle them without panicking and blocking server start in cases where completely clearing the WAL from the object store is not an option.

Proposed solution

Add a check here:

    // add this to the snapshot tracker, so we know what to clear out later if the replay
    // was a wal file that had a snapshot
    self.flush_buffer
        .lock()
        .await
        .replay_wal_period(WalPeriod::new(
            wal_contents.wal_file_number,
            Timestamp::new(wal_contents.min_timestamp_ns),
            Timestamp::new(wal_contents.max_timestamp_ns),
        ));

The check should ensure that if the min and max timestamps are equal to i64::MAX and i64::MIN, respectively, we skip adding the WalPeriod to the buffer (but still handle the snapshot if necessary). This will avoid the panic in #25650, but we also need to verify that skipping WalPeriods does not lead to other downstream issues.
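A rough sketch of what such a check could look like, written as a standalone example rather than a patch. The `WalContents` struct here is a simplified stand-in that carries only the three fields used in the snippet above, and `is_empty_wal_period` and `wal_files_to_track` are hypothetical names introduced for illustration.

```rust
/// Simplified stand-in for the replayed WAL file contents; only the three
/// fields referenced in the replay snippet are modeled here.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WalContents {
    wal_file_number: u64,
    min_timestamp_ns: i64,
    max_timestamp_ns: i64,
}

/// Hypothetical helper: true when the bounds are still the sentinel values,
/// which indicates the WAL file was written from an empty batch.
fn is_empty_wal_period(contents: &WalContents) -> bool {
    contents.min_timestamp_ns == i64::MAX && contents.max_timestamp_ns == i64::MIN
}

/// Hypothetical helper: returns the WAL file numbers whose periods would be
/// added to the buffer, skipping the empty ones.
fn wal_files_to_track(files: &[WalContents]) -> Vec<u64> {
    files
        .iter()
        .filter(|c| !is_empty_wal_period(c))
        .map(|c| c.wal_file_number)
        .collect()
}
```

In the real replay path, the equivalent guard would wrap only the replay_wal_period call, so that any snapshot handling for the same WAL file still runs. Matching on the exact sentinel pair, rather than on any min greater than max, keeps the check narrowly scoped to the known failure mode; whether a broader check is preferable is left open here.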

Getting this right might require some manual testing, since we have already resolved the bug that allowed these corrupt WAL files to be generated in the first place.
