Commit 1a033b2
iter 117b-3 NOT-THROUGHPUT-DELIVERING ✗: sparse-dispatch eager-mode overhead dominates dense fused bmm
KILLED at s10 (~11 min after launch). step_avg trajectory 111->33s converging
to ~28-30s steady-state. 42% slower than dense iter 95 baseline 23.5s at s10
(~33s); the projected steady-state would still be ~19-28% slower.
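The slowdown percentages follow from the step times quoted above. A minimal sketch of the arithmetic, assuming the rounded values from this message (23.5s baseline, ~33s at s10, 28-30s projected steady-state). `slowdown_pct` is a hypothetical helper, not part of the training code. Because the inputs are rounded, the s10 result lands near the quoted 42% rather than exactly on it.

```python
def slowdown_pct(step_s, baseline_s):
    """Percent by which step_s exceeds baseline_s."""
    return (step_s / baseline_s - 1.0) * 100.0


BASELINE_S = 23.5  # dense iter 95 step_avg, from the commit message

# Rounded step times quoted in the commit message.
observations = [
    ("s10 step_avg", 33.0),
    ("steady-state (low)", 28.0),
    ("steady-state (high)", 30.0),
]

for label, step_s in observations:
    print(f"{label}: {step_s:.1f}s -> {slowdown_pct(step_s, BASELINE_S):.1f}% slower")
```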
Throughput economics analysis (the principled finding):
- Dense MoE: E*N tokens via fused bmm (a single highly optimized kernel)
- Sparse C=8: 8N tokens via gather/scatter + grouped GEMM
- Gather/scatter overhead is O(N), independent of C (it is always paid)
- Grouped GEMM has worse memory access patterns than dense bmm
- Net: cost of 8N tokens + scatter overhead > cost of 15N fused-dense
For sparse to win: requires C <= 1 AND a Triton-fused dispatch kernel.
Eager-mode sparse dispatch at any C is throughput-neutral or negative.
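The inequality in the analysis can be written as a toy cost model. This is only a sketch under stated assumptions: costs are in arbitrary relative units for the MoE layer alone, and `grouped_penalty` and `dispatch_per_token` are hypothetical parameters with placeholder values, not measurements from this run. The function names are illustrative and do not come from the repository.

```python
def dense_cost(n_tokens, dense_factor):
    """Relative cost of the fused dense bmm path (dense_factor * N token-expert products)."""
    return dense_factor * n_tokens


def sparse_cost(n_tokens, capacity, grouped_penalty, dispatch_per_token):
    """Relative cost of sparse dispatch: grouped GEMM plus O(N) gather/scatter."""
    grouped_gemm = capacity * n_tokens * grouped_penalty
    dispatch = dispatch_per_token * n_tokens  # does not shrink as capacity drops
    return grouped_gemm + dispatch


def break_even_capacity(dense_factor, grouped_penalty, dispatch_per_token):
    """Capacity C at which both paths cost the same; sparse is cheaper only below it."""
    return (dense_factor - dispatch_per_token) / grouped_penalty


# Placeholder values, NOT measurements from this run.
N = 4096
DENSE_FACTOR = 15        # the "15N fused-dense" term from the Net line
GROUPED_PENALTY = 1.25   # assumed: grouped GEMM is 25% less efficient per token
EAGER_DISPATCH = 15.0    # assumed: eager gather/scatter costs as much as the dense path

dense = dense_cost(N, DENSE_FACTOR)
sparse = sparse_cost(N, capacity=8, grouped_penalty=GROUPED_PENALTY,
                     dispatch_per_token=EAGER_DISPATCH)
print(f"sparse/dense cost ratio at C=8: {sparse / dense:.2f}")
print(f"break-even capacity: "
      f"{break_even_capacity(DENSE_FACTOR, GROUPED_PENALTY, EAGER_DISPATCH):.2f}")
```

With these placeholders the break-even capacity is 0, so no capacity factor makes eager dispatch cheaper. In this model the only way to move the break-even above zero is to lower `dispatch_per_token`, which is what a fused dispatch kernel would aim for.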
Implication for queue:
- iter 117b-3 not promotable as-is at any capacity factor
- iter 117b-3b (sparse-Q attention) SKIPPED -- same code path class,
  will hit same overhead ceiling
- sparsity throughput axis PARKED until iter 117b-2-fix lands
  (Triton entmax with E=30->32 padding, then design fused dispatch)
- autonomous protocol jumps to iter 117b-2-fix priority
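The E=30->32 padding mentioned for iter 117b-2-fix can be illustrated in plain Python. This is a hedged sketch, not the planned Triton kernel: it assumes the padded expert slots are filled with `-inf` logits, and it uses sparsemax (entmax with alpha=2) as a stand-in because the message does not state which entmax variant is used. `round_up`, `pad_logits`, and `sparsemax` are hypothetical helpers.

```python
import math


def round_up(n, multiple):
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def pad_logits(logits, multiple=32, fill=-math.inf):
    """Pad the expert dimension so its length is a multiple of `multiple`."""
    target = round_up(len(logits), multiple)
    return list(logits) + [fill] * (target - len(logits))


def sparsemax(z):
    """Sparsemax over a list of logits (stand-in for the entmax variant in use)."""
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(sorted(z, reverse=True), start=1):
        cumsum += v
        if 1.0 + k * v > cumsum:  # entry k is still inside the support
            tau = (cumsum - 1.0) / k
    return [max(v - tau, 0.0) for v in z]


logits = [0.1 * i for i in range(30)]   # E = 30 router logits for one token
padded = pad_logits(logits)             # E = 32 after padding

probs = sparsemax(logits)
probs_padded = sparsemax(padded)

print(len(padded))        # 32
print(probs_padded[30:])  # padded slots receive zero routing weight
```

Under these assumptions the padded slots get zero weight and the first 30 probabilities match the unpadded result. A real kernel would still need to confirm that `-inf` inputs do not produce NaNs in intermediate arithmetic.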
Distinct from iter 117b-2 NOT-VIABLE: that was a kernel input-shape bug
(fixable via padding); this is an algorithmic finding (eager sparse
dispatch is structurally throughput-negative vs fused dense bmm).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent d9bce1c commit 1a033b2
1 file changed
Lines changed: 28 additions & 0 deletions
Diff: lines 2308-2335 added (28 lines); context lines 2305-2307 unchanged, and original lines 2308-2310 now at 2336-2338.