feat: add mlx chunked prefill support by gufengc · Pull Request #469 · GradientHQ/parallax

gufengc · 2026-06-01T14:05:25Z

Summary

Adds MLX/mac chunked prefill support with chunk-aware request length tracking and prefix-cache backed chunk progression.
Enables the prefix cache by default and adds --chunked-prefill-size (default 1024; 0 disables chunking).
Materializes MLX linear caches after prefill chunks to avoid lazy update graph accumulation across chunks.
Preserves downstream pipeline chunk ordering when multiple chunks for the same request id arrive before the previous chunk finishes.
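
The chunking rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `chunk_spans` is a hypothetical helper, and `cached_len` is an assumed stand-in for the number of prompt tokens the prefix cache already covers.

```python
from typing import Iterator, Tuple


def chunk_spans(
    prompt_len: int, chunk_size: int, cached_len: int = 0
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) token spans, one per prefill chunk.

    Hypothetical helper for illustration; names are not taken from the PR.
    chunk_size == 0 disables chunking, so the remainder is a single span.
    """
    if prompt_len < 0 or chunk_size < 0 or not 0 <= cached_len <= prompt_len:
        raise ValueError("invalid lengths")
    start = cached_len  # tokens already covered by the prefix cache are skipped
    if chunk_size == 0:
        if start < prompt_len:
            yield (start, prompt_len)
        return
    while start < prompt_len:
        end = min(start + chunk_size, prompt_len)
        yield (start, end)
        start = end


print(list(chunk_spans(10, 4)))                # [(0, 4), (4, 8), (8, 10)]
print(list(chunk_spans(10, 0)))                # [(0, 10)]
print(list(chunk_spans(10, 4, cached_len=8)))  # [(8, 10)]
```

Each span would be run as one prefill step, with the cache state left by the previous span acting as the prefix for the next.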

Why

Long-prefill requests on macOS MLX could exceed memory because the prefill path ran the full prompt at once. During chunked testing, downstream peers could also drop queued chunks with the same request id, causing two-node pipeline requests to hang. This PR chunks the MLX prefill work, reuses prefix cache state across chunks, and keeps distinct same-rid prefill chunks queued until their turn.
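
One way to picture the ordering fix is a per-request FIFO in which a new chunk for an in-flight request id is appended rather than dropped. The sketch below is a guess at the general shape, not the PR's implementation; `ChunkQueue` and its methods are hypothetical names.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Set


class ChunkQueue:
    """Per-request FIFO of pending prefill chunks (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._pending: Dict[str, Deque[object]] = defaultdict(deque)
        self._running: Set[str] = set()

    def push(self, rid: str, chunk: object) -> None:
        # Append rather than overwrite: a later chunk for the same rid
        # waits its turn instead of replacing or being dropped.
        self._pending[rid].append(chunk)

    def pop_ready(self, rid: str) -> Optional[object]:
        # Hand out the next chunk only if no chunk of this rid is in flight.
        if rid in self._running or not self._pending[rid]:
            return None
        self._running.add(rid)
        return self._pending[rid].popleft()

    def finish(self, rid: str) -> None:
        # Mark the in-flight chunk done so the next queued one can start.
        self._running.discard(rid)
        if not self._pending[rid]:
            del self._pending[rid]


q = ChunkQueue()
q.push("r1", "chunk0")
q.push("r1", "chunk1")            # arrives before chunk0 has finished
first = q.pop_ready("r1")         # "chunk0"
blocked = q.pop_ready("r1")       # None while chunk0 is still running
q.finish("r1")
second = q.pop_ready("r1")        # "chunk1"
```

Keying the queue by request id keeps chunks of one request strictly ordered while leaving other requests free to proceed.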

Validation

pre-commit run --all-files
pytest -> 143 passed, 5 skipped
Manual macOS two-node validation with mlx-community/Qwen3.5-0.8B-MLX-bf16: node0 hosted layers [0, 12), node1 hosted layers [12, 24), --chunked-prefill-size 128; a 1675-token prompt completed successfully with 8 generated tokens.
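
For scale, the arithmetic behind that run can be worked out directly. This assumes fixed-size chunks and nothing served from the prefix cache at the start, which the PR text does not state explicitly.

```python
prompt_tokens = 1675  # prompt length from the validation run
chunk_size = 128      # --chunked-prefill-size used in that run

# Assumption: fixed-size chunks, no initial prefix-cache hit.
full_chunks, remainder = divmod(prompt_tokens, chunk_size)
num_chunks = full_chunks + (1 if remainder else 0)

print(full_chunks, remainder, num_chunks)  # 13 11 14
```

Under those assumptions the prompt splits into 13 full chunks plus a final 11-token chunk, so the downstream node would receive 14 prefill chunks for the same request id.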

Add MLX chunked prefill support

24e1ad0

gufengc changed the title ~~[codex] Add MLX chunked prefill support~~ feat: add MLX chunked prefill support Jun 1, 2026

chore: fix chunked prefill test imports

55e7df8

gufengc changed the title ~~feat: add MLX chunked prefill support~~ feat: add mlx chunked prefill support Jun 1, 2026

gufengc marked this pull request as ready for review June 1, 2026 14:29

gufengc requested a review from a team June 1, 2026 14:29

gufengc merged commit 3888eed into main Jun 1, 2026
19 of 21 checks passed

gufengc deleted the codex/mac-chunked-prefill branch June 1, 2026 14:31

gufengc merged 2 commits into main from codex/mac-chunked-prefill
