Crash-atomic txn WAL markers (begin/commit pair) #127

@justrach

Description

Background

POST /db/:col/txn (added in this branch) is in-memory atomic under stripe locks: concurrent readers never observe a partial txn, and validation runs before any write. But it is not crash-atomic.

Current sequence inside Collection.applyTxn:

  1. Acquire all stripe locks (sorted, deadlock-free)
  2. Validate every op against current state + intra-batch dup check
  3. For each op: write its individual WAL record (doc_insert / doc_update / doc_delete), update in-memory indexes
  4. Release locks
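Step 1's "sorted, deadlock-free" acquisition can be illustrated with a small Python model. This is only a sketch: the stripe count, the `stripe_of` hash, and the `lock_stripes` helper are hypothetical names, not taken from `src/collection.zig`. The point is that every txn takes its stripe locks in ascending index order, so two txns can never each hold a lock the other is waiting for.

```python
import threading
import zlib
from contextlib import ExitStack

STRIPE_COUNT = 8  # assumed value, for illustration only
STRIPES = [threading.Lock() for _ in range(STRIPE_COUNT)]

def stripe_of(key: str) -> int:
    """Map a document key to a stripe index (hypothetical hash choice)."""
    return zlib.crc32(key.encode()) % STRIPE_COUNT

def lock_stripes(keys):
    """Acquire every stripe touched by `keys`, in ascending index order.

    Deduplicating first avoids locking the same stripe twice; the global
    ascending order is what rules out lock-order cycles between txns.
    """
    stack = ExitStack()
    for idx in sorted({stripe_of(k) for k in keys}):
        stack.enter_context(STRIPES[idx])
    return stack

with lock_stripes(["b", "a", "b"]):
    # Snapshot which stripes are held while inside the critical section.
    held = [lock.locked() for lock in STRIPES]
```

Leaving the `with` block releases the locks in reverse order of acquisition.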

If the process crashes partway through step 3 (say, after the first op's WAL record is written but before the second), WAL replay on restart applies only the ops that were logged and reconstructs a partial txn. The reader-visible invariant (no torn batch) therefore does not survive a crash.
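A toy Python model of the failure, as a sketch only: the WAL is modeled as a list of `(op, key, value)` tuples and `replay` is a stand-in for `WAL.replay`, not its real signature. Because each op is an independent record, replay cannot tell a complete batch from a truncated one.

```python
def replay(wal):
    """Rebuild the document map from (op, key, value) records, in log order."""
    docs = {}
    for op, key, value in wal:
        if op in ("doc_insert", "doc_update"):
            docs[key] = value
        elif op == "doc_delete":
            docs.pop(key, None)
    return docs

# A txn that is supposed to insert "a" and "b" together.
txn = [("doc_insert", "a", 1), ("doc_insert", "b", 2)]

# Simulate a crash after the first WAL record was written but before the second.
wal_after_crash = txn[:1]

recovered = replay(wal_after_crash)
print(recovered)  # {'a': 1}: the batch is torn
```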

TigerBeetle gets crash-atomicity by interleaving consensus with checkpointing; we don't have consensus, but we do have a single-node WAL we can wrap with markers.

Proposed solution

Add two new WAL op types:

  • OpType.txn_begin — payload is the txn id (process-monotonic u64, picked when applyTxn starts)
  • OpType.txn_commit — payload is the same txn id

applyTxn becomes:

write WAL: {.txn_begin, txn_id}
for each op: write WAL: {.doc_*, ..., txn_id} (extend record format with optional txn_id)
write WAL: {.txn_commit, txn_id}
update in-memory indexes
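The proposed ordering could look roughly like the Python sketch below. It is a model under stated assumptions, not the Zig implementation: records are `(op, txn_id, key, value)` tuples, the WAL is a plain list, and lock acquisition, validation, and fsync are left out. The id counter starts at 1 because 0 is reserved for non-txn writes.

```python
import itertools

_next_txn_id = itertools.count(1)  # 0 is reserved for non-txn legacy writes

def apply_txn(wal, docs, ops):
    """Log begin, every op, then commit; only then touch in-memory state."""
    txn_id = next(_next_txn_id)
    wal.append(("txn_begin", txn_id, None, None))
    for op, key, value in ops:
        wal.append((op, txn_id, key, value))
    wal.append(("txn_commit", txn_id, None, None))

    # In-memory state is updated only after the commit marker is logged.
    for op, key, value in ops:
        if op == "doc_delete":
            docs.pop(key, None)
        else:
            docs[key] = value
    return txn_id

wal, docs = [], {}
tid = apply_txn(wal, docs, [("doc_insert", "a", 1), ("doc_insert", "b", 2)])
```

After the call, `wal` holds four records: the begin marker, the two doc records, and the commit marker, all carrying the same `tid`.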

WAL replay (existing WAL.replay in src/storage/wal.zig):

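collect all records into a list
collect the set of txn_ids that have a txn_commit marker
for each record, in log order:
  if txn_id != 0 and there is no commit marker for it (begin without commit): skip it, so ALL doc records carrying that txn_id are dropped
  otherwise: apply the record as today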
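The replay filter can be sketched as a two-pass scan. This Python model assumes `(op, txn_id, key, value)` tuples and assumes txn ids are unique within the WAL being replayed. A process-monotonic counter that restarts at 1 after a restart would break that assumption unless it is seeded past the highest id already in the log.

```python
def replay(wal):
    """Two-pass replay: find committed txn ids, then apply records in log order."""
    committed = {txn_id for op, txn_id, _, _ in wal if op == "txn_commit"}
    docs = {}
    for op, txn_id, key, value in wal:
        if op in ("txn_begin", "txn_commit"):
            continue  # markers carry no document data
        if txn_id != 0 and txn_id not in committed:
            continue  # belongs to a txn that never committed: skip it
        if op == "doc_delete":
            docs.pop(key, None)
        else:
            docs[key] = value
    return docs

wal = [
    ("doc_insert", 0, "legacy", 0),  # txn_id 0: plain single-op write
    ("txn_begin", 1, None, None),
    ("doc_insert", 1, "a", 1),
    ("doc_insert", 1, "b", 2),
    ("txn_commit", 1, None, None),
    ("txn_begin", 2, None, None),
    ("doc_insert", 2, "c", 3),       # crash before txn_commit for txn 2
]
recovered = replay(wal)
print(recovered)  # {'legacy': 0, 'a': 1, 'b': 2}
```

Txn 2 has a begin marker but no commit marker, so its insert of `"c"` is dropped, while the legacy write and the committed txn 1 are applied.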

Acceptance criteria

  • Collection.applyTxn writes a txn_begin before, txn_commit after
  • WAL record format carries an optional txn_id (0 = non-txn legacy write)
  • WAL replay skips uncommitted txn ops on restart
  • New test: insert 2 docs in a txn, kill -9 the process after the doc WAL records are written but before the txn_commit marker, restart, verify neither doc is reachable
  • No perf regression on single-op insert path (txn_id=0 fast-path)
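One way the "optional txn_id, 0 = legacy" criterion could be encoded is a fixed-width field in the record header. The layout below is purely hypothetical (the field order, widths, and the `OP_DOC_INSERT` value are assumptions, not the format in `src/storage/wal.zig`). It only shows that a single-op write can pass `txn_id=0` without extra branching on the write path.

```python
import struct

# Hypothetical header: op type (u8), txn_id (u64), payload length (u32), little-endian.
HEADER = struct.Struct("<BQI")
OP_DOC_INSERT = 1  # illustrative value, not the project's actual enum

def encode(op, payload, txn_id=0):
    """Serialize one record; txn_id=0 marks a non-txn (legacy) write."""
    return HEADER.pack(op, txn_id, len(payload)) + payload

def decode(buf):
    """Parse one record from the start of `buf`."""
    op, txn_id, length = HEADER.unpack_from(buf)
    payload = buf[HEADER.size:HEADER.size + length]
    return op, txn_id, payload

legacy = decode(encode(OP_DOC_INSERT, b'{"k":1}'))
in_txn = decode(encode(OP_DOC_INSERT, b'{"k":2}', txn_id=42))
```

A fixed-width field costs 8 bytes per record even for legacy writes. A variable-length or flag-gated field would save space at the price of a branch in the decoder.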

Out of scope

  • Cross-collection txn (would need cross-stripe-set lock ordering across collections)
  • Distributed txn (no consensus protocol on this branch)

References

  • src/collection.zig applyTxn — current in-memory atomic impl
  • src/storage/wal.zig WAL.write / WAL.replay — where the markers and the replay filter live
  • parity/divergence.py test #6: already verifies in-memory atomic batch
  • TigerBeetle's approach: src/vsr/journal.zig (durable prepare records)
