deps(ballista): bring in object-store shuffle fixes #10919
Merged
Contributor
✅ Pull with Spice Passed
Contributor
Pull request overview
This PR updates the workspace’s Ballista dependencies to a newer spiceai/datafusion-ballista git revision in order to pick up the PrefixStore-based fix for S3/Azure shuffle locations that include a URL path prefix (e.g. s3://bucket/shuffle/prefix).
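The prefix problem can be sketched with a toy in-memory store. This is only an illustration, assuming a hypothetical `Bucket` and `Prefixed` wrapper written for this example; it is not the `object_store::prefix::PrefixStore` API, and the key names are made up.

```rust
use std::collections::HashMap;

/// Toy stand-in for a bucket: maps full object keys to bytes (hypothetical).
#[derive(Default)]
struct Bucket {
    objects: HashMap<String, Vec<u8>>,
}

impl Bucket {
    fn put(&mut self, key: &str, data: &[u8]) {
        self.objects.insert(key.to_string(), data.to_vec());
    }

    fn get(&self, key: &str) -> Option<&[u8]> {
        self.objects.get(key).map(|v| v.as_slice())
    }
}

/// Hypothetical wrapper that reattaches a fixed path prefix to every key,
/// so writers and readers agree on where an object lives.
struct Prefixed {
    inner: Bucket,
    prefix: String,
}

impl Prefixed {
    fn new(inner: Bucket, prefix: &str) -> Self {
        Prefixed {
            inner,
            prefix: prefix.trim_matches('/').to_string(),
        }
    }

    /// Build the full key: "<prefix>/<key>", or just "<key>" if no prefix.
    fn full_key(&self, key: &str) -> String {
        if self.prefix.is_empty() {
            key.to_string()
        } else {
            format!("{}/{}", self.prefix, key)
        }
    }

    fn put(&mut self, key: &str, data: &[u8]) {
        let full = self.full_key(key);
        self.inner.put(&full, data);
    }

    fn get(&self, key: &str) -> Option<&[u8]> {
        self.inner.get(&self.full_key(key))
    }
}
```

If a writer stores `job1/part-0.arrow` directly in the bucket while a reader looks under `shuffle/prefix/job1/part-0.arrow`, the lookup misses. Routing both sides through the same prefixed wrapper removes the mismatch.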
Changes:
- Bump `ballista-core`, `ballista-executor`, and `ballista-scheduler` git pins to a PR-specific revision intended to include the prefixed shuffle key fix.
- Update `Cargo.lock` to reflect the new Ballista git source resolution.
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| Cargo.toml | Updates the git rev pins for Ballista crates to a PR branch commit. |
| Cargo.lock | Updates the resolved Ballista git source commit recorded in the lockfile. |
3590800 to
80dffd0
Compare
auto-merge was automatically disabled
May 20, 2026 00:46
Pull request was converted to draft
b5991fa to
1b202a1
Compare
1b202a1 to
7b657ec
Compare
Contributor
Pull request overview
Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
Cargo.toml:429
- PR description says this bumps Ballista crates to the `spiceai-52.5` tip (rev `f62181cf...`), but the current change actually replaces the git deps with local `path` deps. Either the description should be updated or (preferably) the dependencies should be set to the new git `rev` so the change matches the stated intent.
ballista-core = { path = "/Users/phillip/code/apache/datafusion-ballista/ballista/core" } # LOCAL ITERATION (replace with git rev before pushing)
7b657ec to
b15de61
Compare
b15de61 to
b3b9af9
Compare
…tafusion-ballista PRs #42 + #43)

Bumps ballista-core / ballista-executor / ballista-scheduler to spiceai-52.5 tip (07be66a8), which carries:

1. (#42) Wrap ObjectStoreShuffleStorage in object_store::prefix::PrefixStore so the URL path is reattached to every key. Without this, writers uploaded to s3://bucket/<job>/... while readers looked under s3://bucket/<prefix>/<job>/... and got NotFound on every reduce stage.
2. (#43) Dispatch s3:// partition paths inside BallistaClient::fetch_partition to the existing object-store reader. Before this, the gRPC FetchPartition handler called tokio::fs::File::open("s3://...") and failed every single-batch query (q1) and reduce-stage fetch (q2+).
3. (#43) Replace per-batch serialize_batch_to_ipc_bytes with a long-lived StreamingMultipartIpcUploader: one StreamWriter per output partition means the IPC stream has one header and one EOS marker instead of one stream per batch concatenated together. Fixes the ArrowError(IpcError("Unexpected EOS")) we saw on multi-batch hash-repartition queries.
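The dispatch in item 2 amounts to inspecting the partition path before deciding how to open it. A minimal sketch, assuming a hypothetical `PartitionSource` enum and `classify_partition_path` helper (neither is a Ballista type), and treating any non-`file` URL scheme as an object-store location:

```rust
/// Where a shuffle partition lives, inferred from its path string.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum PartitionSource<'a> {
    /// A URL such as s3://bucket/key, to be read through an object-store client.
    ObjectStore {
        scheme: &'a str,
        bucket: &'a str,
        key: &'a str,
    },
    /// A plain filesystem path, safe to hand to a local file open.
    LocalFile(&'a str),
}

/// Classify a partition path so URL-style locations never reach a local open.
fn classify_partition_path(path: &str) -> PartitionSource<'_> {
    if let Some((scheme, rest)) = path.split_once("://") {
        if scheme == "file" {
            // file:///tmp/x -> local path "/tmp/x"
            return PartitionSource::LocalFile(rest);
        }
        // Everything up to the first '/' is the bucket; the remainder is the key.
        let (bucket, key) = rest.split_once('/').unwrap_or((rest, ""));
        return PartitionSource::ObjectStore { scheme, bucket, key };
    }
    PartitionSource::LocalFile(path)
}
```

A caller would match on the result and only fall through to a filesystem open for `LocalFile`, which avoids treating a string like `s3://...` as a local path.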
b3b9af9 to
aca1b14
Compare
ewgenius
approved these changes
May 21, 2026
Summary
Bumps `ballista-core`, `ballista-executor`, and `ballista-scheduler` to `ad17b153` from spiceai/datafusion-ballista#43. That branch includes the merged prefix fix from #42 plus two additional object-store shuffle read/write fixes.

The previous Ballista pin could write shuffle data to S3 but fail to read it back in common distributed-query paths. Prefixed shuffle locations dropped the URL path on write, final-stage fetches treated `s3://` partition paths as local files, and multi-batch partitions were uploaded as concatenated Arrow IPC streams. Those showed up as `NotFound`, local file-open failures, or `Unexpected EOS` errors when object-store shuffle was enabled.

Changes
- […] `s3://bucket/prefix/...` path.
- […] `BallistaClient::fetch_partition` when executors report `s3://` partition locations.
- […] `Cargo.lock` for the new Ballista revision.

Test plan
- […] `shuffle_location: s3://<bucket>/<two-segment-prefix>`:
  - `SELECT COUNT(*) FROM t`
  - `SELECT col, COUNT(*) FROM t GROUP BY col ORDER BY c DESC LIMIT 5`
- `cargo test -p ballista-core --lib shuffle_writer` and `cargo test -p ballista-core --lib shuffle_storage`
- `cargo fmt --all -- --check` and `cargo clippy --all-targets --workspace --all-features -- -D warnings`
- `cargo check -p runtime-cluster`
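The difference between one long-lived stream and per-batch streams glued together can be illustrated with a toy framing format. This is a sketch under stated assumptions: the one-byte `H`/`B`/`E` tags, `ToyStreamWriter`, `serialize_batch_alone`, and `read_stream` are all invented here, this is not the Arrow IPC format, and the strict reader does not reproduce Arrow's exact `Unexpected EOS` failure.

```rust
const HEADER: u8 = b'H'; // written once at the start of a stream
const BATCH: u8 = b'B'; // followed by a one-byte length and the payload
const EOS: u8 = b'E'; // end-of-stream marker, written once on finish

/// Long-lived writer: one header, any number of batches, one EOS.
struct ToyStreamWriter {
    buf: Vec<u8>,
}

impl ToyStreamWriter {
    fn new() -> Self {
        ToyStreamWriter { buf: vec![HEADER] }
    }

    fn write_batch(&mut self, payload: &[u8]) {
        assert!(payload.len() <= u8::MAX as usize, "toy format: payload too long");
        self.buf.push(BATCH);
        self.buf.push(payload.len() as u8);
        self.buf.extend_from_slice(payload);
    }

    fn finish(mut self) -> Vec<u8> {
        self.buf.push(EOS);
        self.buf
    }
}

/// Per-batch serialization: every batch becomes a complete stream of its own.
fn serialize_batch_alone(payload: &[u8]) -> Vec<u8> {
    let mut writer = ToyStreamWriter::new();
    writer.write_batch(payload);
    writer.finish()
}

/// Strict reader that expects exactly one stream: header, batches, EOS, end of input.
fn read_stream(bytes: &[u8]) -> Result<Vec<Vec<u8>>, String> {
    if bytes.first() != Some(&HEADER) {
        return Err("missing header".to_string());
    }
    let mut pos = 1;
    let mut batches = Vec::new();
    loop {
        match bytes.get(pos) {
            Some(&BATCH) => {
                let len = *bytes.get(pos + 1).ok_or("truncated length")? as usize;
                let start = pos + 2;
                let payload = bytes.get(start..start + len).ok_or("truncated payload")?;
                batches.push(payload.to_vec());
                pos = start + len;
            }
            Some(&EOS) => {
                return if pos + 1 == bytes.len() {
                    Ok(batches)
                } else {
                    Err("data after end-of-stream marker".to_string())
                };
            }
            Some(other) => return Err(format!("unexpected tag {other}")),
            None => return Err("missing end-of-stream marker".to_string()),
        }
    }
}
```

Writing two batches through one `ToyStreamWriter` yields a stream the reader accepts, while concatenating two `serialize_batch_alone` outputs leaves a second header and EOS after the first stream ends, which the single-stream reader rejects.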