
[storage] Allow for winding back QMDB #3444

@clabby


Overview

When managing multiple QMDB instances in an application, an edge case can occur where one DB commits successfully and the program crashes before the others can. When the application restarts, the databases will be in an inconsistent state, with some having applied a state transition that others haven't.



Failing forward here is not possible, since we can't apply state transitions cleanly on top of the desynchronized databases. We could state sync the DBs again to get them to a common target set in history, though doing so would leave gaps of blocks in marshal's database, which we're trying to avoid in #3381.

  • edit: This is also not robust against a total-cluster crash. Other nodes would be in the same inconsistent state, and everyone would endlessly ask peers for operations/proofs they don't have.

To allow recovery from this situation, it would be great if we could roll the databases back to a previous [inactivity_floor, commit_floor) range. It would be up to the user to keep enough information around to wind back if need be. For example, a stateful application that needs crash recovery when calling sync on multiple instances of QMDB would just need to retain the previous range's operations when pruning.
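To make the proposal concrete, here is a rough sketch of the recovery flow on a toy in-memory log. Everything in it (`ToyDb`, `CommitPoint`, `rewind_to`, `recover`, and the per-commit `height` tag) is hypothetical and invented for illustration; it is not the QMDB API, and it assumes the application tags each commit with a height that can be compared across instances.

```rust
// Toy model only: `ToyDb`, `CommitPoint`, and `recover` are hypothetical names,
// not part of QMDB or any real crate.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CommitPoint {
    pub height: u64,           // application-level height (assumed comparable across DBs)
    pub inactivity_floor: u64, // operations before this location are inactive
    pub commit_floor: u64,     // exclusive end of the committed range
}

#[derive(Debug, Default)]
pub struct ToyDb {
    pruned: u64,               // number of operations dropped from the front
    ops: Vec<String>,          // retained operations, starting at location `pruned`
    commits: Vec<CommitPoint>, // retained commit points, oldest first (at most 2)
}

impl ToyDb {
    pub fn apply(&mut self, op: &str) {
        self.ops.push(op.to_string());
    }

    /// Total number of operations ever appended (pruned + retained).
    pub fn size(&self) -> u64 {
        self.pruned + self.ops.len() as u64
    }

    pub fn latest(&self) -> Option<CommitPoint> {
        self.commits.last().copied()
    }

    /// Record a commit point covering everything applied so far.
    pub fn commit(&mut self, height: u64, inactivity_floor: u64) {
        let commit_floor = self.size();
        assert!(inactivity_floor <= commit_floor);
        self.commits.push(CommitPoint { height, inactivity_floor, commit_floor });
        if self.commits.len() > 2 {
            self.commits.remove(0); // keep only the current and previous range
        }
    }

    /// Prune inactive operations, but never past the oldest retained commit's
    /// inactivity floor, so the previous range stays recoverable.
    pub fn prune(&mut self) {
        let Some(oldest) = self.commits.first() else { return };
        let boundary = oldest.inactivity_floor;
        if boundary > self.pruned {
            self.ops.drain(..(boundary - self.pruned) as usize);
            self.pruned = boundary;
        }
    }

    /// Discard everything after the retained commit at `height`.
    pub fn rewind_to(&mut self, height: u64) -> Result<(), String> {
        let idx = self
            .commits
            .iter()
            .position(|c| c.height == height)
            .ok_or_else(|| format!("commit at height {height} is not retained"))?;
        let target = self.commits[idx];
        self.ops.truncate((target.commit_floor - self.pruned) as usize);
        self.commits.truncate(idx + 1);
        Ok(())
    }
}

/// On restart, rewind every database to the lowest height any of them committed.
pub fn recover(dbs: &mut [ToyDb]) -> Result<u64, String> {
    let mut common: Option<u64> = None;
    for db in dbs.iter() {
        let h = db.latest().ok_or("database has no commit")?.height;
        common = Some(common.map_or(h, |c| c.min(h)));
    }
    let common = common.ok_or("no databases")?;
    for db in dbs.iter_mut() {
        db.rewind_to(common)?;
    }
    Ok(common)
}
```

In this sketch, retaining only two commit points means recovery works only if the instances have drifted apart by at most one commit. If one instance gets two or more commits ahead, `rewind_to` returns an error instead of guessing, which matches the idea that keeping enough history around is the user's responsibility.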

related: #3381

Metadata

Labels

to-consider: Further consideration is needed whether this should be implemented

Projects

Status

Done
