fix: route in-memory shuffle partitions to remote fetch when not present locally #17

Merged
lukekim merged 1 commit into spiceai-51 from lukim/in-memory-shuffle on Feb 5, 2026
@lukekim lukekim commented Feb 4, 2026

When using in-memory shuffle (shuffle_location: memory), shuffle partitions are stored in the executor's local InMemoryShuffleManager. Previously, the shuffle reader incorrectly assumed all memory:// paths could be read locally.

In distributed mode with multiple executors, a shuffle reader on executor A may need to read a partition that was written by executor B. That partition has a memory:// path but exists only in executor B's shuffle manager, so the local lookup on executor A failed with 'Shuffle partition not found in memory' errors.

This fix modifies split_partition_locations() to check whether a memory:// partition actually exists in the local shuffle manager before categorizing it for local read. If the partition doesn't exist locally, it's routed to the remote fetch path, which uses the Arrow Flight service. The Flight service already handles memory:// paths correctly: it runs on the executor that wrote the partition and reads from that executor's local shuffle manager.
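The routing decision can be illustrated with a minimal, self-contained sketch. This is not the code from the commit: `PartitionLocation` and `LocalShuffleStore` are hypothetical stand-ins for the real partition-location type and the InMemoryShuffleManager, and the signature of `split_partition_locations` here is an assumption. As a simplification, the sketch sends every non-memory:// path to the remote list, which the real function may not do.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a shuffle partition location.
#[derive(Debug, Clone, PartialEq)]
struct PartitionLocation {
    /// Executor that wrote the partition.
    executor_id: String,
    /// Partition path, e.g. "memory://job/stage/partition".
    path: String,
}

/// Hypothetical stand-in for the executor-local in-memory shuffle manager.
#[derive(Default)]
struct LocalShuffleStore {
    paths: HashSet<String>,
}

impl LocalShuffleStore {
    /// Records that a partition with this path is held in local memory.
    fn insert(&mut self, path: &str) {
        self.paths.insert(path.to_string());
    }

    /// Returns true only if this executor actually holds the partition.
    fn contains(&self, path: &str) -> bool {
        self.paths.contains(path)
    }
}

/// Splits locations into (local, remote).
///
/// A memory:// path is read locally only when the local store really holds
/// it; the scheme alone is not enough. Everything else goes to the remote
/// list (a simplification made for this sketch).
fn split_partition_locations(
    locations: Vec<PartitionLocation>,
    store: &LocalShuffleStore,
) -> (Vec<PartitionLocation>, Vec<PartitionLocation>) {
    locations
        .into_iter()
        .partition(|loc| loc.path.starts_with("memory://") && store.contains(&loc.path))
}
```

With a store that holds only the partition written by executor A, a location from executor B with a memory:// path lands in the remote list instead of triggering a failed local lookup.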

Fixes spiceai/spiceai#9290

@lukekim lukekim self-assigned this Feb 4, 2026
@lukekim lukekim added the bug label Feb 4, 2026
@lukekim lukekim merged commit 68afffb into spiceai-51 Feb 5, 2026
29 checks passed
@lukekim lukekim deleted the lukim/in-memory-shuffle branch February 5, 2026 02:38
2 participants