Skip to content

Conversation

@chroma-droid
Copy link

This PR cherry-picks the commit acde656 onto rc/2026-01-02. If there are unresolved conflicts, please resolve them manually.

## Description of changes

_Summarize the changes made by this PR._

- Improvements & Bug fixes
  - N/A
- New functionality
- Eagerly commit blocks in ordered blockfile writer to avoid excessive
memory usage

## Test plan

_How are these changes tested?_

- [ ] Tests pass locally with `pytest` for python, `yarn test` for js,
`cargo test` for rust

## Migration plan

_Are there any migrations, or any forwards/backwards compatibility
changes needed in order to make sure this change deploys reliably?_

## Observability plan

_What is the plan to instrument and monitor this change?_

## Documentation Changes

_Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
section](https://github.com/chroma-core/chroma/tree/main/docs/docs.trychroma.com)?_
@github-actions
Copy link

github-actions bot commented Jan 3, 2026

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@propel-code-bot
Copy link
Contributor

Eager block commits in ordered blockfile writer

Updates ArrowOrderedBlockfileWriter to convert completed OrderedBlockDeltas into persisted Blocks as soon as they are sealed, keeping the per-writer in-memory state bounded. The commit path now iterates over pre-built blocks, reuses their IDs for migrations, and removes the now-unused OrderedBlockDelta::len helper.

Key Changes

• Replaced Inner.completed_block_deltas with completed_blocks, committing deltas immediately inside complete_current_delta and related split paths.
• Adjusted commit to await self.complete_current_delta, reuse the produced Block list for sparse-index updates, and feed their IDs directly into apply_migrations_to_blockfile.
• Removed OrderedBlockDelta::len and pushed split handling into the new asynchronous complete_current_delta flow.

Affected Areas

• rust/blockstore/src/arrow/ordered_blockfile_writer.rs
• rust/blockstore/src/arrow/block/delta/ordered_block_delta.rs

This summary was automatically generated by @propel-code-bot

Comment on lines +199 to +211
if delta.get_size::<K, V>() > self.root.max_block_size_bytes {
let split_blocks = delta.split::<K, V>(self.root.max_block_size_bytes);
for (split_key, split_delta) in split_blocks {
self.root
.sparse_index
.add_block(split_key, split_delta.id)
.map_err(|e| Box::new(e) as Box<dyn ChromaError>)?;
let block = self.block_manager.commit::<K, V>(split_delta).await;
inner.completed_blocks.push(block);
}
}
let block = self.block_manager.commit::<K, V>(delta).await;
inner.completed_blocks.push(block);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended

[Maintainability] This logic for splitting an oversized delta and committing its pieces is duplicated in the set method on lines 319-331.

To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider extracting this logic into a private helper method. This new method could take the delta and inner state, perform the necessary splitting and committing, and then be called from both complete_current_delta and set.

Context for Agents
This logic for splitting an oversized delta and committing its pieces is duplicated in the `set` method on lines 319-331.

To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider extracting this logic into a private helper method. This new method could take the `delta` and `inner` state, perform the necessary splitting and committing, and then be called from both `complete_current_delta` and `set`.

File: rust/blockstore/src/arrow/ordered_blockfile_writer.rs
Line: 211

@Sicheng-Pan Sicheng-Pan merged commit 0cfd3b9 into rc/2026-01-02 Jan 3, 2026
65 checks passed
@Sicheng-Pan Sicheng-Pan deleted the hotfix-6109/rc/2026-01-02 branch January 3, 2026 04:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants