Commit 624119f
fix: route in-memory shuffle partitions to remote fetch when not present locally

When using in-memory shuffle (shuffle_location: memory), shuffle partitions are
stored in the executor's local InMemoryShuffleManager. Previously, the shuffle
reader incorrectly assumed that every memory:// path could be read locally.

In distributed mode with multiple executors, a shuffle reader on executor A may
need to read a partition that was written by executor B. That partition has a
memory:// path but exists only in executor B's shuffle manager, so the local
read on executor A failed with 'Shuffle partition not found in memory' errors.

This fix modifies split_partition_locations() to check whether a memory://
partition actually exists in the local shuffle manager before categorizing it
for local read. If the partition doesn't exist locally, it is routed to the
remote fetch path, which uses the Arrow Flight service. The Flight service
already handles memory:// paths correctly by reading from its own executor's
local shuffle manager.
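
The check described above can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the code from the commit: the `PartitionLocation` struct, the `new` and `contains` methods, and the function signature are assumptions made for the example, and non-memory paths are treated as remote only to keep the sketch short.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the executor's local in-memory shuffle store.
/// The real InMemoryShuffleManager holds partition data; here we only track paths.
struct InMemoryShuffleManager {
    partitions: HashSet<String>,
}

impl InMemoryShuffleManager {
    /// Assumed constructor for the example: registers the given paths as locally stored.
    fn new(paths: &[&str]) -> Self {
        Self {
            partitions: paths.iter().map(|p| p.to_string()).collect(),
        }
    }

    /// Assumed lookup: true if this executor wrote (and still holds) the partition.
    fn contains(&self, path: &str) -> bool {
        self.partitions.contains(path)
    }
}

/// Hypothetical, simplified partition location: just the partition's path.
#[derive(Debug, Clone, PartialEq)]
struct PartitionLocation {
    path: String,
}

/// Splits locations into (local, remote).
/// A memory:// partition is read locally only if the local manager actually
/// holds it; otherwise it goes to the remote fetch list.
/// Simplification: any non-memory path is also placed in the remote list.
fn split_partition_locations(
    locations: Vec<PartitionLocation>,
    manager: &InMemoryShuffleManager,
) -> (Vec<PartitionLocation>, Vec<PartitionLocation>) {
    locations
        .into_iter()
        .partition(|loc| loc.path.starts_with("memory://") && manager.contains(&loc.path))
}
```

With a manager that holds only one of two memory:// partitions, the first element of the returned tuple contains the partition that is present and the second contains the one that must be fetched from the executor that wrote it.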

Fixes spiceai/spiceai#92901
parent 20ef1eb commit 624119f
1 file changed
Lines changed: 23 additions & 2 deletions
| Hunk | Original lines | New lines | Change |
|---|---|---|---|
| 1 | 425-432 | 425-443 | 2 lines removed (original 428-429), 13 lines added (new 428-440) |
| 2 | 598-603 | 609-624 | 10 lines added (new 612-621) |