Member

@zachschuermann zachschuermann commented Sep 24, 2025

What changes are proposed in this pull request?

Our current Transaction::commit(self, engine) returns a CommitResult, but that result doesn't retain enough information to build a post-commit Snapshot (and it feels a little clunky).

This PR refactors commit to return a new CommitResult with one of three variants: a committed, conflicted, or retryable transaction. Each variant just holds the transaction plus a little more metadata, replacing the old Committed/Conflict pattern.

// New
pub enum CommitResult {
    CommittedTransaction(CommittedTransaction),
    ConflictedTransaction(ConflictedTransaction),
    RetryableTransaction(RetryableTransaction),
}

// Old
pub enum CommitResult {
    Committed {
        version: Version,
        post_commit_stats: PostCommitStats,
    },
    Conflict(Transaction, Version),
}
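
For callers, the practical effect of the new shape is a three-way match on the result. The following is a minimal sketch of that, not the kernel's actual definitions: the structs are stand-ins, and the field names `commit_version`, `conflict_version`, and `error`, as well as the helper `describe`, are assumptions made for illustration.

```rust
// Stand-in types for illustration only. The real structs live in the kernel
// crate, and their fields and accessors may differ from what is shown here.
type Version = u64;

#[derive(Debug)]
pub struct CommittedTransaction {
    pub commit_version: Version, // hypothetical field
}

#[derive(Debug)]
pub struct ConflictedTransaction {
    pub conflict_version: Version, // hypothetical field
}

#[derive(Debug)]
pub struct RetryableTransaction {
    pub error: String, // hypothetical field; the real type likely holds an error value
}

#[derive(Debug)]
pub enum CommitResult {
    CommittedTransaction(CommittedTransaction),
    ConflictedTransaction(ConflictedTransaction),
    RetryableTransaction(RetryableTransaction),
}

/// Summarize the outcome of a commit attempt (hypothetical helper).
pub fn describe(result: &CommitResult) -> String {
    match result {
        CommitResult::CommittedTransaction(c) => {
            format!("committed at version {}", c.commit_version)
        }
        CommitResult::ConflictedTransaction(c) => {
            format!("conflict at version {}", c.conflict_version)
        }
        CommitResult::RetryableTransaction(r) => format!("retryable: {}", r.error),
    }
}
```

Because every variant wraps its own struct, more metadata can be added to one outcome later without changing the enum itself.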

This opens the door for a post-commit snapshot. It requires a bit more thought and testing, but it could look something like:

pub fn post_commit_snapshot(&self, engine: &dyn Engine) -> DeltaResult<SnapshotRef> {
    Snapshot::builder_from(self.transaction.read_snapshot.clone())
        .with_log_tail(new_path)
        .at_version(self.commit_version)
        .build(engine)
}

This PR affects the following public APIs

CommitResult API change: a new enum plus new structs for CommittedTransaction, ConflictedTransaction, and RetryableTransaction

How was this change tested?

refactor

@github-actions github-actions bot added the breaking-change Change that require a major version bump label Sep 24, 2025

codecov bot commented Sep 24, 2025

Codecov Report

❌ Patch coverage is 70.17544% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.87%. Comparing base (94a15e0) to head (73e7083).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| kernel/src/transaction/mod.rs | 81.25% | 8 Missing and 1 partial ⚠️ |
| ffi/src/transaction/mod.rs | 11.11% | 8 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1343      +/-   ##
==========================================
- Coverage   84.90%   84.87%   -0.03%     
==========================================
  Files         114      114              
  Lines       28935    28971      +36     
  Branches    28935    28971      +36     
==========================================
+ Hits        24566    24588      +22     
- Misses       3200     3213      +13     
- Partials     1169     1170       +1     


Collaborator

@scovich scovich left a comment

Approach LGTM

Collaborator

@nicklan nicklan left a comment

lgtm!


/// Compute a new snapshot for the table at the commit version. Note this is generally more
/// efficient than creating a new snapshot from scratch.
pub fn post_commit_snapshot(&self, engine: &dyn Engine) -> DeltaResult<SnapshotRef> {
Collaborator

If we know what we just committed, can we construct the snapshot without having to do any work at all?

Not for this PR, but should we have a follow-up?

Member Author

Actually, I realized this was more complex. Removed it; tracking in #916.

@zachschuermann zachschuermann changed the title from "feat!: rearchitect CommitResult, add post-commit-snapshot" to "refactor!: rearchitect CommitResult" Oct 6, 2025
Collaborator

@nicklan nicklan left a comment

mostly good, one comment

self.into_conflicted(commit_version),
)),
// TODO: we may want to be more selective about what is retryable
Err(e) => Ok(CommitResult::RetryableTransaction(self.into_retryable(e))),
Collaborator

should we not ensure at least that this is an IO error of some sort?

Member Author

Yeah, I went back and forth on this. I expanded the comment and made it treat only IOError as "retryable". I think the main item here is to define more clearly what write_json_file can return in error cases; we likely don't want the entire Error type there. Opened #1388.
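
To illustrate why the retryable category matters to callers, here is a minimal sketch of a bounded retry loop. It does not use the kernel API: `Attempt`, `commit_with_retries`, and the closure standing in for a commit call are all hypothetical names, and the payloads are simplified to a version number or a message.

```rust
// Hypothetical outcome of one commit attempt, simplified for illustration.
#[derive(Debug, PartialEq)]
pub enum Attempt {
    Committed(u64),    // version that was committed
    Conflicted(u64),   // version at which a conflict was found
    Retryable(String), // transient failure, e.g. an I/O error message
}

/// Call `attempt` up to `max_attempts` times, stopping at the first outcome
/// that is not retryable. Returns the last outcome observed.
pub fn commit_with_retries<F>(mut attempt: F, max_attempts: u32) -> Attempt
where
    F: FnMut(u32) -> Attempt,
{
    // Returned unchanged only if `max_attempts` is zero.
    let mut last = Attempt::Retryable("no attempts made".to_string());
    for n in 1..=max_attempts {
        last = attempt(n);
        // Committed and Conflicted are final; only Retryable loops again.
        if !matches!(last, Attempt::Retryable(_)) {
            break;
        }
    }
    last
}
```

A conflict is deliberately not retried here, since resolving it would presumably need conflict handling rather than repeating the same write.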

Collaborator

@DrakeLin DrakeLin left a comment

lgtm with nits

)),
// TODO: we may want to be more or less selective about what is retryable (this is tied
// to the idea of "what kind of Errors should write_json_file return?")
Err(e @ Error::IOError(_)) => {
Collaborator

woah never seen this syntax!
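
For context, the syntax in question is Rust's `@` binding: `e @ Error::IOError(_)` tests the value against the pattern and, on a match, binds the whole value to `e`. A minimal sketch follows, with a stand-in `Error` enum whose variants carry a `String`; the kernel's real `Error` type has different payloads, and `Outcome` and `classify` are hypothetical names.

```rust
// Stand-in error type for illustration; not the kernel's real `Error`.
#[derive(Debug, PartialEq)]
pub enum Error {
    IOError(String),
    Generic(String),
}

// Hypothetical classification of a failed write.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Retryable(Error),
    Fatal(Error),
}

/// Classify the result of a write: `None` on success, otherwise whether the
/// error is considered retryable.
pub fn classify(result: Result<(), Error>) -> Option<Outcome> {
    match result {
        Ok(()) => None,
        // `e @ Error::IOError(_)` matches only the IOError variant and binds
        // the entire error value to `e`, so it can be moved as a whole.
        Err(e @ Error::IOError(_)) => Some(Outcome::Retryable(e)),
        // Any other error is bound to `e` without a variant check.
        Err(e) => Some(Outcome::Fatal(e)),
    }
}
```

Without `@`, the arm would have to destructure the variant and rebuild the error value to pass it along.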

@zachschuermann zachschuermann merged commit 723df08 into delta-io:main Oct 9, 2025
19 of 21 checks passed
@zachschuermann zachschuermann deleted the txn-commit-stuff branch October 9, 2025 19:15
samansmink pushed a commit to samansmink/delta-kernel-rs that referenced this pull request Oct 19, 2025