Skip to content

deps(ballista): bring in object-store shuffle fixes#10919

Merged
phillipleblanc merged 2 commits into
trunkfrom
phillip/ballista-shuffle-prefix-fix
May 22, 2026
Merged

deps(ballista): bring in object-store shuffle fixes#10919
phillipleblanc merged 2 commits into
trunkfrom
phillip/ballista-shuffle-prefix-fix

Conversation

@phillipleblanc
Copy link
Copy Markdown
Contributor

@phillipleblanc phillipleblanc commented May 19, 2026

Summary

Bumps ballista-core, ballista-executor, and ballista-scheduler to ad17b153 from spiceai/datafusion-ballista#43. That branch includes the merged prefix fix from #42 plus two additional object-store shuffle read/write fixes.

The previous Ballista pin could write shuffle data to S3 but fail to read it back in common distributed-query paths. Prefixed shuffle locations dropped the URL path on write, final-stage fetches treated s3:// partition paths as local files, and multi-batch partitions were uploaded as concatenated Arrow IPC streams. Those showed up as NotFound, local file-open failures, or Unexpected EOS errors when object-store shuffle was enabled.

Changes

  • Repoint Ballista dependencies to the upstream branch commit that carries the object-store shuffle fixes.
  • Pull in prefix-aware shuffle writes so keys are written under the configured s3://bucket/prefix/... path.
  • Pull in object-store reads for BallistaClient::fetch_partition when executors report s3:// partition locations.
  • Pull in streaming multipart IPC uploads so each shuffle partition is one valid Arrow IPC stream across all batches.
  • Update Cargo.lock for the new Ballista revision.

Test plan

  • Manual end-to-end test with a local 1 scheduler / 2 executor Spice cluster and shuffle_location: s3://<bucket>/<two-segment-prefix>:
    • SELECT COUNT(*) FROM t
    • SELECT col, COUNT(*) FROM t GROUP BY col ORDER BY c DESC LIMIT 5
    • confirmed shuffle objects were written under the prefixed key with no stray job-id directories at the bucket root
  • Upstream Ballista unit tests: cargo test -p ballista-core --lib shuffle_writer and cargo test -p ballista-core --lib shuffle_storage
  • Upstream Ballista lint: cargo fmt --all -- --check and cargo clippy --all-targets --workspace --all-features -- -D warnings
  • cargo check -p runtime-cluster
  • GitHub CI

Copilot AI review requested due to automatic review settings May 19, 2026 07:34
@github-actions github-actions Bot added area/config kind/dependencies Pull requests that update a dependency file size/s labels May 19, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

✅ Pull with Spice Passed

Passing checks:

  • ✅ Title meets minimum length requirement (10 characters)
  • ✅ No banned labels detected
  • ✅ Has a label from required category kind/
  • ✅ Has a label from required category area/
  • ✅ Has at least one assignee: phillipleblanc

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the workspace’s Ballista dependencies to a newer spiceai/datafusion-ballista git revision in order to pick up the PrefixStore-based fix for S3/Azure shuffle locations that include a URL path prefix (e.g. s3://bucket/shuffle/prefix).

Changes:

  • Bump ballista-core, ballista-executor, and ballista-scheduler git pins to a PR-specific revision intended to include the prefixed shuffle key fix.
  • Update Cargo.lock to reflect the new Ballista git source resolution.

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 2 comments.

File Description
Cargo.toml Updates the git rev pins for Ballista crates to a PR branch commit.
Cargo.lock Updates the resolved Ballista git source commit recorded in the lockfile.

Comment thread Cargo.toml Outdated
Comment thread Cargo.toml Outdated
@phillipleblanc phillipleblanc force-pushed the phillip/ballista-shuffle-prefix-fix branch from 3590800 to 80dffd0 Compare May 19, 2026 07:58
@phillipleblanc phillipleblanc marked this pull request as ready for review May 19, 2026 07:58
@phillipleblanc phillipleblanc requested a review from a team as a code owner May 19, 2026 07:58
Copilot AI review requested due to automatic review settings May 19, 2026 07:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated no new comments.

@phillipleblanc phillipleblanc self-assigned this May 19, 2026
@sgrebnov sgrebnov enabled auto-merge May 19, 2026 16:25
@phillipleblanc phillipleblanc marked this pull request as draft May 20, 2026 00:46
auto-merge was automatically disabled May 20, 2026 00:46

Pull request was converted to draft

@phillipleblanc phillipleblanc force-pushed the phillip/ballista-shuffle-prefix-fix branch from b5991fa to 1b202a1 Compare May 20, 2026 00:48
Copilot AI review requested due to automatic review settings May 20, 2026 00:48
@phillipleblanc phillipleblanc force-pushed the phillip/ballista-shuffle-prefix-fix branch from 1b202a1 to 7b657ec Compare May 20, 2026 00:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

Cargo.toml:429

  • PR description says this bumps Ballista crates to the spiceai-52.5 tip (rev f62181cf...), but the current change actually replaces the git deps with local path deps. Either the description should be updated or (preferably) the dependencies should be set to the new git rev so the change matches the stated intent.
ballista-core = { path = "/Users/phillip/code/apache/datafusion-ballista/ballista/core" }      # LOCAL ITERATION (replace with git rev before pushing)

Comment thread Cargo.toml Outdated
@phillipleblanc phillipleblanc force-pushed the phillip/ballista-shuffle-prefix-fix branch from 7b657ec to b15de61 Compare May 20, 2026 04:04
@phillipleblanc phillipleblanc changed the title deps(ballista): pull in PrefixStore-based S3 shuffle key fix (datafusion-ballista PR #42) deps(ballista): pull in shuffle-on-object-store correctness fixes (datafusion-ballista PRs #42 + #43) May 20, 2026
Copilot AI review requested due to automatic review settings May 20, 2026 23:33
@phillipleblanc phillipleblanc force-pushed the phillip/ballista-shuffle-prefix-fix branch from b15de61 to b3b9af9 Compare May 20, 2026 23:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

Comment thread Cargo.toml Outdated
@phillipleblanc phillipleblanc changed the title deps(ballista): pull in shuffle-on-object-store correctness fixes (datafusion-ballista PRs #42 + #43) deps(ballista): bring in object-store shuffle fixes May 21, 2026
…tafusion-ballista PRs #42 + #43)

Bumps ballista-core / ballista-executor / ballista-scheduler to spiceai-52.5
tip (07be66a8), which carries:

  1. (#42) Wrap ObjectStoreShuffleStorage in object_store::prefix::PrefixStore
     so the URL path is reattached to every key — without this, writers
     uploaded to s3://bucket/<job>/... while readers looked under
     s3://bucket/<prefix>/<job>/... and got NotFound on every reduce stage.

  2. (#43) Dispatch s3:// partition paths inside BallistaClient::fetch_partition
     to the existing object-store reader. Before this the gRPC FetchPartition
     handler called tokio::fs::File::open("s3://...") and failed every
     single-batch query (q1) and reduce-stage fetch (q2+).

  3. (#43) Replace per-batch serialize_batch_to_ipc_bytes with a long-lived
     StreamingMultipartIpcUploader: one StreamWriter per output partition
     means the IPC stream has one header and one EOS marker instead of one
     stream per batch concatenated together. Fixes the
     ArrowError(IpcError("Unexpected EOS")) we saw on multi-batch hash-
     repartition queries.
@phillipleblanc phillipleblanc marked this pull request as ready for review May 21, 2026 06:55
@phillipleblanc phillipleblanc force-pushed the phillip/ballista-shuffle-prefix-fix branch from b3b9af9 to aca1b14 Compare May 21, 2026 06:55
Copilot AI review requested due to automatic review settings May 22, 2026 01:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 2 comments.

Comment thread Cargo.toml
Comment thread Cargo.toml
@phillipleblanc phillipleblanc enabled auto-merge May 22, 2026 05:56
@phillipleblanc phillipleblanc added this pull request to the merge queue May 22, 2026
Merged via the queue into trunk with commit cf8f12a May 22, 2026
72 of 76 checks passed
@phillipleblanc phillipleblanc deleted the phillip/ballista-shuffle-prefix-fix branch May 22, 2026 07:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/config kind/dependencies Pull requests that update a dependency file size/s size/xs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants