Skip to content

[storage/journal] prevent read-blocking during contiguous journal prune#3812

Open
roberto-bayardo wants to merge 3 commits into
mainfrom
non-read-blocking-prune
Open

[storage/journal] prevent read-blocking during contiguous journal prune#3812
roberto-bayardo wants to merge 3 commits into
mainfrom
non-read-blocking-prune

Conversation

@roberto-bayardo
Copy link
Copy Markdown
Collaborator

@roberto-bayardo roberto-bayardo commented May 17, 2026

Summary

  • Split segmented journal pruning into unlink and commit phases so contiguous journals can perform physical blob removal without blocking readers.
  • Update contiguous prune() to serialize with mutators via an upgradable read guard, then upgrade only for the short in-memory commit.
  • Clarify the Storage::remove contract for already-open blob handles.
  • Add deterministic coverage for reads during prune unlink and remove-failure reopen recovery, including partial unlink failure.

Context

#2857

Testing

  • just test -p commonware-storage test_fixed_journal_reads_during_prune_unlink test_fixed_journal_prune_remove_failure_reopens_contiguous test_fixed_journal_partial_prune_remove_failure_reopens_suffix
  • just test -p commonware-storage journal::contiguous::fixed
  • just test -p commonware-storage journal::segmented

@roberto-bayardo roberto-bayardo requested a review from Copilot May 17, 2026 17:39
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 17, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
commonware-mcp 19c9731 May 20 2026, 11:37 PM

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 17, 2026

Benchmark results

Tip

PASSED: No benchmark exceeded the regression threshold.

Benchmark comparison table
Benchmark Baseline (main) Current Delta Threshold Status
qmdb::merkleize/variant=any::unordered::fixed::mmr keys=10000 ch=false sync=false 1.667 ms 1.746 ms +4.78% 10.00% ✅ PASS
qmdb::merkleize/variant=current::ordered::fixed::mmb chunk=256 keys=10000 ch=true sync=false 3.113 ms 3.310 ms +6.30% 10.00% ✅ PASS

Baseline commit(s): 64d3dc21d6b5

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Splits the fixed segmented journal's prune into a two-phase operation (unlink_before + commit_prune) so that the contiguous fixed journal can perform the slow storage remove calls under an upgradable read lock, allowing concurrent readers to continue using already-open section handles during physical removal. The runtime Storage::remove contract is updated to require that existing Blob handles remain readable after a named remove.

Changes:

  • Introduce unlink_before / commit_prune on Manager (and re-expose them on segmented Journal), with the existing one-shot prune reimplemented on top.
  • Rework contiguous::fixed::Journal::prune to downgrade the write lock to upgradable during unlink and re-upgrade only for the in-memory commit.
  • Add a BlockingContext test harness plus two new tests covering reader concurrency during prune and remove-failure recovery; document the new remove contract in the runtime crate.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
storage/src/journal/segmented/manager.rs Splits prune into unlink_before (storage-only) and commit_prune (map mutation).
storage/src/journal/segmented/fixed.rs Re-exports the two-phase prune to the contiguous layer.
storage/src/journal/contiguous/fixed.rs Uses upgradable-read downgrade/upgrade around unlink; adds BlockingContext and two new tests.
storage/Cargo.toml Adds governor dev-dependency used by BlockingContext's clock impls.
runtime/src/lib.rs Documents that named remove must not invalidate already-open Blob handles.
Cargo.lock Records new governor dependency for the storage crate.
Comments suppressed due to low confidence (1)

storage/src/journal/segmented/manager.rs:279

  • commit_prune is exposed at pub(super) / pub(in crate::journal) and unconditionally removes entries below min from self.blobs without verifying that unlink_before was previously called for the same min. If a caller invokes commit_prune without a matching successful unlink_before (e.g. in a future refactor), the blobs are silently dropped from the map while remaining present in storage, leaking them and potentially confusing re-init. The doc comment notes the precondition, but nothing enforces it. Consider, at minimum, a debug_assert! that the range is empty in storage, or restructuring the API so the two steps cannot be invoked independently.
    /// Commit a successful unlink by removing sections less than `min` from the in-memory map.
    ///
    /// This must only be called after [Self::unlink_before] succeeds for the same `min`.
    /// Returns true if any section handles were removed.
    pub(super) fn commit_prune(&mut self, min: u64) -> bool {
        let sections: Vec<u64> = self
            .blobs
            .range(..min)
            .map(|(&section, _)| section)
            .collect();
        let pruned = !sections.is_empty();
        for section in sections {
            let blob = self.blobs.remove(&section).unwrap();
            drop(blob);
            debug!(section, "pruned blob");
            self.tracked.dec();
            self.pruned.inc();
        }
        if pruned {
            self.oldest_retained_section = min;
        }

        pruned
    }

Comment thread storage/src/journal/contiguous/fixed.rs Outdated
Comment thread storage/src/journal/segmented/manager.rs Outdated
Comment thread runtime/src/lib.rs
Comment on lines 707 to 712
/// If no `name` is provided, the entire partition is removed.
/// If a `name` is provided, existing [Blob] handles for that blob must remain readable
/// until they are dropped, but the blob must be removed from future namespace lookups.
///
/// An Ok result indicates the blob is durably removed.
fn remove(
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test to confirm this behavior that should run on all storage backends.

Comment thread storage/src/journal/contiguous/fixed.rs
@roberto-bayardo roberto-bayardo force-pushed the non-read-blocking-prune branch from 6df45ae to 8b5dd16 Compare May 17, 2026 17:43
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 17, 2026

Deploying monorepo with  Cloudflare Pages  Cloudflare Pages

Latest commit: 19c9731
Status: ✅  Deploy successful!
Preview URL: https://329b83e5.monorepo-eu0.pages.dev
Branch Preview URL: https://non-read-blocking-prune.monorepo-eu0.pages.dev

View logs

@roberto-bayardo roberto-bayardo force-pushed the non-read-blocking-prune branch 4 times, most recently from 9e8f5f5 to 79b40ed Compare May 17, 2026 18:07
@roberto-bayardo roberto-bayardo requested a review from Copilot May 17, 2026 18:09
@roberto-bayardo
Copy link
Copy Markdown
Collaborator Author

bugbot run

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated no new comments.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 79b40ed. Configure here.

@roberto-bayardo roberto-bayardo force-pushed the non-read-blocking-prune branch 3 times, most recently from 6996cb5 to 66f9a7b Compare May 17, 2026 18:37
@roberto-bayardo roberto-bayardo marked this pull request as ready for review May 17, 2026 18:49
@roberto-bayardo roberto-bayardo moved this from In Progress to Ready for Review in Tracker May 17, 2026
@roberto-bayardo roberto-bayardo force-pushed the non-read-blocking-prune branch from 66f9a7b to 62299e4 Compare May 18, 2026 19:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated no new comments.

@roberto-bayardo roberto-bayardo changed the title [storage/journal] prevent read-blocking during contiguous fixed journal prune [storage/journal] prevent read-blocking during contiguous journal prune May 20, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 88.09107% with 68 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.70%. Comparing base (011acd4) to head (19c9731).
⚠️ Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
storage/src/journal/contiguous/variable.rs 86.34% 35 Missing and 2 partials ⚠️
storage/src/journal/contiguous/fixed.rs 86.75% 31 Missing ⚠️
@@            Coverage Diff             @@
##             main    #3812      +/-   ##
==========================================
- Coverage   95.72%   95.70%   -0.02%     
==========================================
  Files         472      473       +1     
  Lines      189348   193373    +4025     
  Branches     4597     4680      +83     
==========================================
+ Hits       181256   185073    +3817     
- Misses       6568     6728     +160     
- Partials     1524     1572      +48     
Files with missing lines Coverage Δ
runtime/src/lib.rs 97.73% <ø> (-0.05%) ⬇️
runtime/src/storage/mod.rs 97.91% <100.00%> (+0.15%) ⬆️
storage/src/journal/segmented/fixed.rs 97.57% <100.00%> (+0.03%) ⬆️
storage/src/journal/segmented/manager.rs 97.29% <100.00%> (+0.08%) ⬆️
storage/src/journal/segmented/variable.rs 93.95% <100.00%> (+0.04%) ⬆️
storage/src/journal/contiguous/fixed.rs 95.89% <86.75%> (-1.21%) ⬇️
storage/src/journal/contiguous/variable.rs 97.32% <86.34%> (-1.46%) ⬇️

... and 49 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 011acd4...19c9731. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Ready for Review

Development

Successfully merging this pull request may close these issues.

2 participants