
Commit b3b9af9

deps(ballista): pull in shuffle-on-object-store correctness fixes (datafusion-ballista PRs #42 + #43)
Bumps ballista-core / ballista-executor / ballista-scheduler to the tip of phillip/shuffle-final-fetch-via-object-store (datafusion-ballista #43), which includes #42, already merged into spiceai-52.5.

Three independent object-store shuffle correctness fixes:

1. (#42) Wrap ObjectStoreShuffleStorage in object_store::prefix::PrefixStore so the URL path is reattached to every key. Without this, writers uploaded to s3://bucket/<job>/... while readers looked under s3://bucket/<prefix>/<job>/... and got NotFound on every reduce stage.

2. (#43) Dispatch s3:// partition paths inside BallistaClient::fetch_partition to the existing object-store reader. Before this, the gRPC FetchPartition handler called tokio::fs::File::open("s3://...") and failed every single-batch query (q1) and reduce-stage fetch (q2+).

3. (#43) Replace per-batch serialize_batch_to_ipc_bytes with a long-lived StreamingMultipartIpcUploader: one StreamWriter per output partition means the IPC stream has one header and one EOS marker instead of one stream per batch concatenated together. Fixes the ArrowError(IpcError("Unexpected EOS")) we saw on multi-batch hash-repartition queries.

Will repin to a stable spiceai-52.5 rev once #43 merges.
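The prefix mismatch behind fix 1 can be illustrated with a minimal, std-only sketch. This is not the Ballista or object_store code: `Bucket`, `Prefixed`, `write_without_prefix`, the `shuffle` prefix, and the `job-1/2/0/data.arrow` key are all hypothetical names chosen for illustration. It only assumes what the message states, namely that the writer dropped the URL path while the reader kept it.

```rust
use std::collections::HashMap;

/// Toy in-memory stand-in for a bucket: maps full object keys to bytes.
/// (Hypothetical; not the real object store.)
#[derive(Default)]
struct Bucket {
    objects: HashMap<String, Vec<u8>>,
}

/// Hypothetical wrapper that prepends a fixed prefix to every key,
/// in the spirit of wrapping a store in a prefixing layer.
struct Prefixed<'a> {
    bucket: &'a mut Bucket,
    prefix: String,
}

impl<'a> Prefixed<'a> {
    fn new(bucket: &'a mut Bucket, prefix: &str) -> Self {
        Prefixed {
            bucket,
            // Normalise so "shuffle", "/shuffle" and "shuffle/" behave the same.
            prefix: prefix.trim_matches('/').to_string(),
        }
    }

    /// Reattach the prefix to a job-relative key.
    fn full_key(&self, key: &str) -> String {
        if self.prefix.is_empty() {
            key.to_string()
        } else {
            format!("{}/{}", self.prefix, key)
        }
    }

    fn put(&mut self, key: &str, data: &[u8]) {
        let full = self.full_key(key);
        self.bucket.objects.insert(full, data.to_vec());
    }

    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.bucket.objects.get(&self.full_key(key))
    }
}

/// Models the pre-fix writer described above: it uploads under the bare
/// job-relative key and silently drops the URL path.
fn write_without_prefix(bucket: &mut Bucket, key: &str, data: &[u8]) {
    bucket.objects.insert(key.to_string(), data.to_vec());
}
```

Writing through `write_without_prefix` and then reading through `Prefixed::new(&mut bucket, "shuffle")` returns `None` for the same key, which mirrors the NotFound on reduce stages. Routing both the write and the read through the same `Prefixed` wrapper makes the keys agree.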
1 parent b935783 commit b3b9af9

2 files changed

Lines changed: 22 additions & 22 deletions


Cargo.lock

Lines changed: 19 additions & 19 deletions

Cargo.toml

Lines changed: 3 additions & 3 deletions
@@ -429,9 +429,9 @@ datafusion-substrait = { git = "https://github.com/spiceai/datafusion.git", rev
 datafusion-table-providers = { git = "https://github.com/datafusion-contrib/datafusion-table-providers.git", rev = "b798c391b6566c172d44361f8acc8472c958ca75" } # spiceai-52

-ballista-core = { git = "https://github.com/spiceai/datafusion-ballista.git", rev = "47e2b4946762c834d4a11532a25cc99c9e8a0b9d" } # spiceai-52.5
-ballista-executor = { git = "https://github.com/spiceai/datafusion-ballista.git", rev = "47e2b4946762c834d4a11532a25cc99c9e8a0b9d" } # spiceai-52.5
-ballista-scheduler = { git = "https://github.com/spiceai/datafusion-ballista.git", rev = "47e2b4946762c834d4a11532a25cc99c9e8a0b9d" } # spiceai-52.5
+ballista-core = { git = "https://github.com/spiceai/datafusion-ballista.git", rev = "ad17b1539a4244885208bbca641fc2814034fcca" } # phillip/shuffle-final-fetch-via-object-store (PR #43)
+ballista-executor = { git = "https://github.com/spiceai/datafusion-ballista.git", rev = "ad17b1539a4244885208bbca641fc2814034fcca" } # phillip/shuffle-final-fetch-via-object-store (PR #43)
+ballista-scheduler = { git = "https://github.com/spiceai/datafusion-ballista.git", rev = "ad17b1539a4244885208bbca641fc2814034fcca" } # phillip/shuffle-final-fetch-via-object-store (PR #43)

 delta_kernel = { git = "https://github.com/spiceai/delta-kernel-rs.git", rev = "47034733a0477f72e4f6abbbf6a27d0da069860a" } # spiceai-0.18.2