Commit add79c6
[Quantization] Support NVFP4 for inline-swiglu fused MoE experts (MiniMax-M3)
MiniMaxM3VLExperts is a standard transformers 5.x fused-experts container
(3-D gate_up_proj/down_proj + num_experts), but it applies SwiGLU inline and
has no act_fn submodule, so _is_fused_experts_module returned False. As a
result, the experts were never wrapped, nvfp4_experts_only enabled zero expert
quantizers, and export raised
NotImplementedError("...experts type 'MiniMaxM3VLExperts'...").
Drop the act_fn requirement from the detector. _QuantFusedExperts only
intercepts F.linear and never reads act_fn, and _export_fused_experts is
weight-only, so no export change is needed once detection wraps the experts.
Models needing custom forwards (Llama4, GptOss, DBRX, Qwen3-VL-MoE) are still
excluded at an earlier step by their explicit registrations.
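The detector change described above can be illustrated with a minimal, dependency-free sketch. It is not the actual _is_fused_experts_module code: the function name `looks_like_fused_experts`, the `require_act_fn` flag, and the `SimpleNamespace` stand-ins for modules and tensors are all assumptions made for illustration. It only shows why a container with 3-D gate_up_proj/down_proj and num_experts, but no act_fn, was rejected under the old rule and is accepted once that requirement is dropped.

```python
from types import SimpleNamespace


def _is_3d(param) -> bool:
    """True if `param` looks like a 3-D tensor (duck-typed via `.ndim`)."""
    return getattr(param, "ndim", None) == 3


def looks_like_fused_experts(module, require_act_fn: bool = False) -> bool:
    """Hypothetical detector sketch, not ModelOpt's implementation."""
    has_weights = _is_3d(getattr(module, "gate_up_proj", None)) and _is_3d(
        getattr(module, "down_proj", None)
    )
    has_count = isinstance(getattr(module, "num_experts", None), int)
    if not (has_weights and has_count):
        return False
    # Old rule (assumed shape): additionally insist on an `act_fn` attribute.
    if require_act_fn and not hasattr(module, "act_fn"):
        return False
    return True


# Stand-in for a tensor: only `.ndim` matters to this sketch.
weight_3d = SimpleNamespace(ndim=3)

# Inline-SwiGLU container: fused weights and num_experts, but no act_fn.
inline_swiglu = SimpleNamespace(
    gate_up_proj=weight_3d, down_proj=weight_3d, num_experts=8
)
# Conventional container that exposes act_fn.
with_act_fn = SimpleNamespace(
    gate_up_proj=weight_3d, down_proj=weight_3d, num_experts=8, act_fn=object()
)
# Not a fused-experts container: 2-D weights.
dense = SimpleNamespace(
    gate_up_proj=SimpleNamespace(ndim=2),
    down_proj=SimpleNamespace(ndim=2),
    num_experts=8,
)

print(looks_like_fused_experts(inline_swiglu, require_act_fn=True))  # old rule
print(looks_like_fused_experts(inline_swiglu))                       # new rule
```

The sketch deliberately checks structure only (weight rank and expert count), mirroring the commit's point that the wrapper never reads act_fn, so requiring it adds no safety.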
Flip the now-incorrect test_module_missing_act_fn test and add a detection +
calibration test for synthetic inline-SwiGLU experts. Add a CHANGELOG entry
and a MiniMax M3 row to the llm_ptq support matrix.
Validated end-to-end on GB200: 14,592 expert weight quantizers enabled,
260 GB NVFP4 checkpoint, wikitext-2 perplexity 5.083 -> 5.420 (+6.6%).
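As a quick check of the reported regression, the relative change can be recomputed from the two perplexity values. This sketch assumes 5.083 is the pre-quantization value and 5.420 the NVFP4 value, which is how the arrow reads; the helper name `relative_change_pct` is illustrative.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


ppl_before = 5.083  # assumed: perplexity before quantization
ppl_after = 5.420   # assumed: perplexity of the NVFP4 checkpoint

change = relative_change_pct(ppl_before, ppl_after)
print(f"{change:+.1f}%")
```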
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Yifan Jiang <19356972+yifjiang@users.noreply.github.com>
1 parent cc17f2c commit add79c6
4 files changed
Lines changed: 107 additions & 7 deletions
File tree
- examples/llm_ptq
- modelopt/torch/quantization/plugins
- tests/unit/torch/quantization/plugins
Lines changed: 94 additions & 2 deletions