[levanter] Share Pallas autotune helpers and restore compile offload by dlwh · Pull Request #4130 · marin-community/marin

dlwh · 2026-03-25T05:24:45Z

Move shard-aware autotune benchmarking out of fused cross-entropy into a shared Pallas helper and restore compile offload for shard-mapped autotune sweeps. This keeps fused CE behavior intact while making the benchmark path reusable from other kernels.

Fixes #4129
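For readers unfamiliar with the pattern, here is a minimal sketch of what a shared autotune benchmarking helper can look like. It is not the code in `autotune_utils.py`. The name `benchmark_candidates`, its parameters, and the injectable `timer` are all hypothetical, and the shard-map wrapping and compile offload from this PR are left out. With real JAX callables, each timed call would also need to block on the result (for example `jax.block_until_ready`) so that asynchronous dispatch does not distort the measurement.

```python
import time
from typing import Any, Callable, Dict, Hashable, Iterable, Tuple


def benchmark_candidates(
    build_fn: Callable[[Hashable], Callable[..., Any]],
    candidates: Iterable[Hashable],
    args: Tuple[Any, ...],
    warmup: int = 1,
    iters: int = 3,
    timer: Callable[[], float] = time.perf_counter,
) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Time each candidate config; return (fastest config, mean time per config).

    Hypothetical helper for illustration only. `build_fn(config)` returns the
    callable to benchmark for that config (e.g. a kernel with given block sizes).
    """
    timings: Dict[Hashable, float] = {}
    for config in candidates:
        fn = build_fn(config)
        # Warm-up calls absorb one-time costs such as compilation.
        for _ in range(warmup):
            fn(*args)
        start = timer()
        for _ in range(iters):
            fn(*args)
        timings[config] = (timer() - start) / iters
    if not timings:
        raise ValueError("no candidates to benchmark")
    best = min(timings, key=timings.get)
    return best, timings
```

Keeping the kernel-specific part behind `build_fn` is what would let kernels other than fused CE reuse the same sweep loop.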


dlwh · 2026-03-25T05:53:13Z

@claude review

claude · 2026-03-25T05:53:44Z

Claude Code is working…

I'll analyze this and get back to you.


dlwh · 2026-03-25T05:54:27Z

@dlwh-golem will you review please

dlwh-golem · 2026-03-25T05:56:14Z

No findings.

I reviewed the helper extraction in lib/levanter/src/levanter/kernels/pallas/autotune_utils.py and the fused CE integration in lib/levanter/src/levanter/kernels/pallas/fused_cross_entropy_loss/api.py. The extraction preserved the prior shard-map/manual-sharding/tracer behavior, and the new generic tests cover the moved logic plus the fused CE autotune sweep path.

I also ran:

uv run --project lib/levanter --group test python -m pytest -q lib/levanter/tests/kernels/test_pallas_autotune_utils.py lib/levanter/tests/kernels/test_pallas_fused_cross_entropy_loss.py

Result: 68 passed, 12 skipped in about 9s. Residual risk is low; the only thing not really proven yet is behavior for future non-CE adopters with more complex out_specs, but nothing in this PR looks blocking.

Codex work

Repository: marin-community/marin
Trigger: [levanter] Share Pallas autotune helpers and restore compile offload #4130 (comment)
Request ID: 45aad31d-3bc8-4e6a-834c-25a79966b7cb
Accepted: true


yonromai

Approved. I did not find a clear regression in the helper extraction or in the restored compile-offload path. The shared helper preserves the prior fused CE behavior around manual sharding, shard-map wrapping, tracer lowering, and fallback handling, and the targeted tests cover both the new helper and the fused CE sweep path.

Generated with Codex
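The "fallback handling" mentioned in the review can be illustrated with a small sketch. This is an assumption about the general pattern, not the behavior of the levanter helper. `pick_with_fallback`, `try_candidate`, and `default` are hypothetical names. The idea is that a candidate which fails to compile or run is skipped rather than aborting the sweep, and a default config is returned if nothing succeeds.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple


def pick_with_fallback(
    candidates: Iterable[Hashable],
    try_candidate: Callable[[Hashable], float],
    default: Hashable,
) -> Tuple[Hashable, Dict[Hashable, str]]:
    """Return the lowest-cost candidate, or `default` if every candidate fails.

    Hypothetical sketch: `try_candidate(config)` returns a cost (e.g. seconds)
    or raises if the config cannot be compiled or executed.
    """
    best, best_cost = None, float("inf")
    errors: Dict[Hashable, str] = {}
    for config in candidates:
        try:
            cost = try_candidate(config)
        except Exception as exc:  # a failing candidate must not abort the sweep
            errors[config] = repr(exc)
            continue
        if cost < best_cost:
            best, best_cost = config, cost
    if best is None:
        # Every candidate failed, so fall back to the known-safe default.
        return default, errors
    return best, errors
```

Returning the collected errors alongside the choice makes it possible to log why candidates were rejected without changing the result.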


dlwh added 4 commits March 20, 2026 22:10

[levanter] Preserve fused CE autotune shard-map fence

aa08c26

[levanter] Test fused CE autotune sharding states directly

efabfb1

Merge remote-tracking branch 'origin/main' into research/grug-moe-qwe…

b1a20cc

…n35-a3b-direct-compare # Conflicts: # lib/levanter/src/levanter/kernels/pallas/fused_cross_entropy_loss/api.py # lib/levanter/tests/kernels/test_pallas_fused_cross_entropy_loss.py

[levanter] Extract Pallas autotune sharding helpers

86b0918

dlwh added the agent-generated Created by automation/agent label Mar 25, 2026

claude Bot mentioned this pull request Mar 25, 2026

[levanter] Extract shared Pallas autotune sharding helpers #4129

Closed

dlwh added 2 commits March 24, 2026 22:30

[levanter] Restore autotune compile offload

59a04d1

Merge branch 'main' into codex/pallas-autotune-helper

490266c

Merge branch 'main' into codex/pallas-autotune-helper

0a97fb0

dlwh changed the title ~~[levanter] Extract shared Pallas autotune helpers~~ [levanter] Share Pallas autotune helpers and restore compile offload Mar 25, 2026

dlwh requested a review from yonromai March 25, 2026 06:34

yonromai approved these changes Mar 25, 2026

View reviewed changes

Merge branch 'main' into codex/pallas-autotune-helper

5e9a800

dlwh enabled auto-merge (squash) March 25, 2026 18:50

dlwh disabled auto-merge March 25, 2026 18:50

dlwh merged commit 6553f3f into marin-community:main Mar 25, 2026
32 of 33 checks passed

dlwh merged 8 commits into marin-community:main from dlwh:codex/pallas-autotune-helper
3 participants