@abhi-airspace-intelligence
Contributor

Description

Consider the following case:

  1. I kick off a compaction of a delta table
  2. Simultaneously, I start a write
  3. The compaction finishes, logically removing/adding some files
  4. I go to commit my write

This will fail, since by default the conflict checker assumes that the write depended on all the files in the table at a given state, so the commit is rejected with ConcurrentDeleteRead. However, if it is a true blind append, the writer never depended on the state of the table.

Instead, this PR marks a write commit as a blind append if it is in mode Append and there are no remove actions.
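
A minimal sketch of that heuristic, for illustration only. `SaveMode`, `Action`, and `is_blind_append` here are hypothetical stand-ins and not the actual delta-rs types or the code in this PR:

```rust
/// Hypothetical stand-in for the writer's save mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SaveMode {
    Append,
    Overwrite,
}

/// Hypothetical stand-in for the actions carried by a commit.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    Add { path: String },
    Remove { path: String },
}

/// Treat a write as a blind append when it is in Append mode and
/// its commit contains no remove actions.
pub fn is_blind_append(mode: SaveMode, actions: &[Action]) -> bool {
    mode == SaveMode::Append
        && !actions.iter().any(|a| matches!(a, Action::Remove { .. }))
}
```

Under this sketch, an Append commit that only adds files is flagged as a blind append, while an Overwrite commit, or any commit that carries a remove action, is not.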

Marked as draft bc I'd like to write tests and confirm this is the correct behavior.

Related Issue(s)

Likely closes #2700

Documentation

Reference here: https://books.japila.pl/delta-lake-internals/OptimisticTransactionImpl/?h=blind+app#isBlindAppend
Planning on diving into the delta-spark code when I have the time to ensure this is the correct heuristic.

Signed-off-by: Abhi Agarwal <[email protected]>
@github-actions github-actions bot added the binding/rust Issues for the Rust crate label Oct 22, 2025
@github-actions

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@abhi-airspace-intelligence abhi-airspace-intelligence changed the title from "Add support for blind appends" to "feat: add support for blind appends" Oct 22, 2025
@rtyler rtyler self-assigned this Oct 23, 2025
@rtyler
Member

rtyler commented Oct 23, 2025

👋 Thanks for starting the PR in draft to get the discussion going!

Blind appends are ... tricky 😄

For your specific situation I am curious if setting the appendOnly table property would suffice.

If that table property doesn't solve the issue you describe, then I would say
we have a bug in our commit code that should be rectified, because on an
appendOnly table there should be no problem writing.

There's not really anything in the protocol that states what the rules should
be for a blind append. In the scenario that you describe it is probably safe
enough to issue an append, but I'm not sure I would call that a blind append.

When I think of a blind append, I think about something that doesn't
care/consider whether the schema has been changed underneath either, i.e. isn't
considering if a metadata action happened between its current state and
whatever $latest_version is.

@abhiaagarwal
Contributor

abhiaagarwal commented Oct 23, 2025

@rtyler the offending code is here:

https://github.com/delta-io/delta-rs/blob/main/crates/core/src/kernel/transaction/conflict_checker.rs#L209-L212

Yes, my tables are already marked as append only, but I still run into the conflict bug. If there is no read predicate specified, then all the files that were in the snapshot are returned as having been read. However, in the case of blind appends, while the table state was read, it doesn't have any actual impact on what was written. Additionally, even if a table is not AppendOnly by property, I'd still say that marking commits as blind appends when they really are is a decent practice.

isBlindAppend isn't specified in the protocol, but delta-spark has special handling for it based on my look at the codebase.
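
To make the conflict described above concrete, here is a rough model of the check. It is a sketch under assumptions, not the code in conflict_checker.rs: `assumed_read_files`, `check_concurrent_delete_read`, and `ConflictError` are hypothetical names, and the read predicate is simplified to an optional path prefix.

```rust
use std::collections::HashSet;

/// Hypothetical error type mirroring the ConcurrentDeleteRead failure.
#[derive(Debug, PartialEq, Eq)]
pub enum ConflictError {
    ConcurrentDeleteRead(String),
}

/// Files the writer is assumed to have read from its snapshot.
/// With no read predicate (modeled here as a path prefix), every file
/// in the snapshot counts as read, unless the commit is a blind append.
pub fn assumed_read_files(
    snapshot_files: &[&str],
    read_prefix: Option<&str>,
    is_blind_append: bool,
) -> HashSet<String> {
    if is_blind_append {
        // A blind append did not depend on any existing file.
        return HashSet::new();
    }
    snapshot_files
        .iter()
        .copied()
        .filter(|f| read_prefix.map_or(true, |p| f.starts_with(p)))
        .map(String::from)
        .collect()
}

/// Fail if a concurrently committed transaction (e.g. a compaction)
/// removed a file that this writer is assumed to have read.
pub fn check_concurrent_delete_read(
    read_files: &HashSet<String>,
    removed_by_winner: &[&str],
) -> Result<(), ConflictError> {
    for path in removed_by_winner {
        if read_files.contains(*path) {
            return Err(ConflictError::ConcurrentDeleteRead((*path).to_string()));
        }
    }
    Ok(())
}
```

In this model, a plain append with no predicate conflicts with any compaction that removed files from its snapshot, while the same append flagged as blind has an empty read set and passes the check.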

@abhiaagarwal
Contributor

For what it's worth, this PR has entirely solved my problem: I can now run concurrent appends and compactions as a background tokio task without any concurrency issues.

@abhi-airspace-intelligence
Contributor Author

I opened an issue in the delta-spark repository

Development

Successfully merging this pull request may close these issues.

Optimizer Compact not running parallel to append writers

3 participants