[storage/journal/contiguous] avoid need to fsync when crossing blob boundaries #3790
roberto-bayardo wants to merge 12 commits into
Benchmark results
✅ PASSED: No benchmark exceeded the regression threshold.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR optimizes the contiguous journal so that append no longer needs to fsync when crossing section/blob boundaries. Instead, dirty sections are tracked and fsynced only on explicit commit() or sync(). A new DURABLE_SIZE_KEY watermark allows recovery to skip the prefix of fully-durable data while still being able to rebuild the suffix from the data journal when needed.
Changes:
- Replace per-section auto-sync in append/append_many with deferred fsync gated by a dirty_from_section tracker; introduce an op_lock to serialize mutators with commit/sync while preserving concurrent reads.
- Persist a durable-size watermark (DURABLE_SIZE_KEY) in fixed journal metadata; rework recovery to use the watermark and rebuild only the offsets suffix in the variable journal.
- Update qmdb test harnesses to call db.commit() (since apply_batch is no longer implicitly durable) and add new recovery/rewind tests.
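The deferred-fsync idea in the first bullet can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyJournal`, `Section`, and the fsync counter are hypothetical stand-ins, not the types in fixed.rs or variable.rs, and only the name `dirty_from_section` comes from the review text. Appends record the lowest section with unsynced writes, and `commit()` syncs from that section through the tail.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a blob; it only counts fsync calls.
#[derive(Default)]
pub struct Section {
    pub items: Vec<u64>,
    pub fsyncs: u32,
}

/// Toy journal that splits items into fixed-size sections (illustrative only).
pub struct ToyJournal {
    items_per_section: u64,
    size: u64,
    sections: BTreeMap<u64, Section>,
    /// Lowest section holding writes that have not been fsynced, if any.
    dirty_from_section: Option<u64>,
}

impl ToyJournal {
    pub fn new(items_per_section: u64) -> Self {
        assert!(items_per_section > 0);
        Self {
            items_per_section,
            size: 0,
            sections: BTreeMap::new(),
            dirty_from_section: None,
        }
    }

    /// Append without any fsync, even when the write starts a new section.
    pub fn append(&mut self, item: u64) -> u64 {
        let pos = self.size;
        let section = pos / self.items_per_section;
        self.sections.entry(section).or_default().items.push(item);
        // Appends only move forward, so the first dirty section seen is the lowest.
        self.dirty_from_section.get_or_insert(section);
        self.size += 1;
        pos
    }

    /// Sync every section from the lowest dirty one to the tail, then clear
    /// the tracker. Returns how many sections were synced.
    pub fn commit(&mut self) -> u32 {
        let Some(from) = self.dirty_from_section.take() else {
            return 0; // nothing written since the last commit
        };
        let mut synced = 0;
        for (_, section) in self.sections.range_mut(from..) {
            section.fsyncs += 1; // stands in for a real fsync of the blob
            synced += 1;
        }
        synced
    }

    /// Number of times the given section has been synced.
    pub fn fsyncs(&self, section: u64) -> u32 {
        self.sections.get(&section).map_or(0, |s| s.fsyncs)
    }
}
```

In this model, appending across several section boundaries costs no syncs, and a later `commit()` touches only the sections written since the previous commit.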
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| storage/src/journal/contiguous/fixed.rs | Adds durable-size metadata, defers fsync to commit/sync, reworks recovery and adds tests. |
| storage/src/journal/contiguous/variable.rs | Tracks dirty data sections, replaces upgradable lock with rwlock+mutex, rebuilds offsets suffix from anchor, adds tests. |
| storage/src/qmdb/current/sync/tests.rs | Adds explicit db.commit() calls in test harnesses now that apply_batch does not auto-sync. |
| storage/src/qmdb/any/sync/tests.rs | Adds explicit db.commit() calls in test harnesses. |
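The "rwlock+mutex" arrangement mentioned for variable.rs can be sketched roughly as follows. This is a guess at the general shape, not the PR's code: it uses blocking `std::sync` primitives rather than whatever async locks the journal actually uses, and `LockedJournal`, `items`, and `durable_len` are hypothetical names. The point is that mutators and `commit()` queue on one mutex, while readers only ever take the read side of the rwlock.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, RwLock};

/// Illustrative journal guarded by an operation mutex plus a state rwlock.
pub struct LockedJournal {
    /// Serializes mutators (append) with commit/sync.
    op_lock: Mutex<()>,
    /// In-memory state; write-locked only for the short update in append.
    items: RwLock<Vec<u64>>,
    /// Number of items covered by the last commit (hypothetical bookkeeping).
    durable_len: AtomicUsize,
}

impl LockedJournal {
    pub fn new() -> Self {
        Self {
            op_lock: Mutex::new(()),
            items: RwLock::new(Vec::new()),
            durable_len: AtomicUsize::new(0),
        }
    }

    /// Mutator: takes op_lock first, then the write lock briefly.
    pub fn append(&self, item: u64) -> usize {
        let _op = self.op_lock.lock().unwrap();
        let mut items = self.items.write().unwrap();
        items.push(item);
        items.len() - 1
    }

    /// Reader: never touches op_lock, so a long commit does not block it.
    pub fn read(&self, pos: usize) -> Option<u64> {
        self.items.read().unwrap().get(pos).copied()
    }

    /// Commit: holds op_lock for the whole operation but only a read lock on
    /// the state. A real implementation would fsync dirty blobs here.
    pub fn commit(&self) -> usize {
        let _op = self.op_lock.lock().unwrap();
        let len = self.items.read().unwrap().len();
        self.durable_len.store(len, Ordering::Release);
        len
    }

    pub fn durable_len(&self) -> usize {
        self.durable_len.load(Ordering::Acquire)
    }
}
```

Because `commit()` never takes the write side of `items`, concurrent `read()` calls can proceed while the (simulated) sync is in progress; only other appends and commits wait.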
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit 435903c.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 8db4cad.
Codecov Report
❌ Patch coverage is
@@ Coverage Diff @@
## main #3790 +/- ##
==========================================
+ Coverage 95.77% 95.78% +0.01%
==========================================
Files 486 486
Lines 200338 201613 +1275
Branches 4858 4879 +21
==========================================
+ Hits 191872 193123 +1251
- Misses 6834 6856 +22
- Partials 1632 1634 +2
... and 11 files with indirect coverage changes
The need to fsync when crossing blob boundaries can introduce unexpected delays and increased variance in journal.append() performance. It also complicates making append minimally read-blocking.
This PR removes append-time blob-boundary fsyncs. Instead, fixed and variable journals track dirty sections and fsync them when durability is explicitly requested via commit() or sync().
For recovery, fixed journals now persist RECOVERY_WATERMARK_KEY as a conservative replay boundary for layered users such as the variable journal. Fixed-journal recovery itself remains length-based: it recovers the longest contiguous prefix from retained blobs and truncates at the first short or missing section.
Variable-journal recovery continues to treat the data journal as the source of truth. It uses the offsets journal watermark as a bounded replay anchor when safe, rejects stale anchors beyond retained data, and truncates newer data when replay encounters a short retained section.
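The length-based rule for fixed journals ("longest contiguous prefix, truncate at the first short or missing section") can be written as a small pure function. This is a minimal sketch, not the recovery code in fixed.rs: `recover_bounds` is a hypothetical name, the map of section index to item count stands in for inspecting retained blobs, and it assumes pruning removes whole sections so the oldest retained section starts on a section boundary.

```rust
use std::collections::BTreeMap;

/// Given the item count found in each retained section, return
/// `(oldest_retained_position, recovered_size)`, or `None` if nothing is
/// retained. Sections after the first short or missing one are ignored,
/// which models truncating them during recovery.
pub fn recover_bounds(
    section_lens: &BTreeMap<u64, u64>,
    items_per_section: u64,
) -> Option<(u64, u64)> {
    // Assumption: the lowest retained section is the start of the journal.
    let (&first, _) = section_lens.iter().next()?;
    let oldest = first * items_per_section;
    let mut size = oldest;
    let mut expected = first;
    for (&section, &len) in section_lens {
        if section != expected {
            break; // missing section: the contiguous prefix ends here
        }
        let len = len.min(items_per_section);
        size += len;
        if len < items_per_section {
            break; // short section: keep its items, drop everything after
        }
        expected += 1;
    }
    Some((oldest, size))
}
```

For example, with 4 items per section and retained sections 0, 1, 2, 3 holding 4, 4, 2, 4 items, the recovered size is 10: section 2 is short, so section 3 is treated as data to truncate even though it is full.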
Detailed Summary
commit()andsync()can fsync all necessary data.sync().init().