Skip to content

Conversation

@zachschuermann
Copy link
Member

What changes are proposed in this pull request?

This PR adds new validation to ensure we don't write checkpoints/compaction whenever we have staged commits in the LogSegment. We want to prohibit this to prevent creating gaps in the log (if we allow checkpoint at version N but that version only has staged commit then old readers will break instead of just returning stale reads as in the intention of catalogManaged tables)

details

Since we generally treat staged commits as an encapsulated detail of LogSegment, we expose a new internal api LogSegment::validate_no_staged_commits and inject calls in our checkpoint writer and log compaction writer to ensure we don't write checkpoints/compactions for log segments with staged commits.

How was this change tested?

new UTs

@codecov
Copy link

codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 94.38202% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.90%. Comparing base (723df08) to head (d391b82).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
kernel/src/checkpoint/tests.rs 86.20% 1 Missing and 3 partials ⚠️
kernel/src/log_compaction/writer.rs 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1374      +/-   ##
==========================================
+ Coverage   84.87%   84.90%   +0.02%     
==========================================
  Files         114      114              
  Lines       28971    29060      +89     
  Branches    28971    29060      +89     
==========================================
+ Hits        24588    24672      +84     
- Misses       3213     3214       +1     
- Partials     1170     1174       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@zachschuermann zachschuermann force-pushed the no-staged-checkpoint-compact branch from eb5367c to 6df0dc8 Compare October 7, 2025 22:51
Copy link
Collaborator

@nicklan nicklan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Collaborator

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the fix

@zachschuermann zachschuermann merged commit 8e4f0bc into delta-io:main Oct 9, 2025
21 checks passed
@zachschuermann zachschuermann deleted the no-staged-checkpoint-compact branch October 9, 2025 21:11
samansmink pushed a commit to samansmink/delta-kernel-rs that referenced this pull request Oct 19, 2025
## What changes are proposed in this pull request?
This PR adds new validation to ensure we don't write
checkpoints/compaction whenever we have staged commits in the
LogSegment. We want to prohibit this to prevent creating gaps in the log
(if we allow checkpoint at version N but that version only has staged
commit then old readers will break instead of just returning stale reads
as in the intention of catalogManaged tables)

### details
Since we generally treat staged commits as an encapsulated detail of
`LogSegment`, we expose a new internal api
`LogSegment::validate_no_staged_commits` and inject calls in our
checkpoint writer and log compaction writer to ensure we don't write
checkpoints/compactions for log segments with staged commits.

## How was this change tested?
new UTs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants