Commit 361f7e3
Merge puzzletron compression algorithm (#1121)
### What does this PR do?
Implement the Puzzletron compression algorithm, based on the Puzzle paper
(https://arxiv.org/abs/2411.19146).
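For readers unfamiliar with the Puzzle paper, the core selection step can be pictured as choosing one block variant per layer so that total quality loss is minimised under a resource budget. The toy sketch below is illustrative only: the variant names, costs, memory figures, and the `select_architecture` helper are all hypothetical and are not part of the `modelopt` API. It also swaps the paper's mixed-integer programming solver for a brute-force search, which is only workable at this tiny scale.

```python
from itertools import product

# Hypothetical per-layer candidates: (variant_name, quality_cost, memory_mib).
# quality_cost is an arbitrary integer score; lower means less degradation.
LAYER_CANDIDATES = [
    [("ffn_100", 0, 400), ("ffn_50", 10, 200), ("no_op", 60, 0)],
    [("ffn_100", 0, 400), ("ffn_50", 30, 200), ("no_op", 90, 0)],
    [("ffn_100", 0, 400), ("ffn_50", 5, 200), ("no_op", 20, 0)],
]


def select_architecture(layer_candidates, memory_budget):
    """Pick one variant per layer, minimising total cost within the budget.

    Returns (total_cost, total_memory, [variant names]) or None if no
    combination fits. Brute force stands in for a real MIP solver here.
    """
    best = None
    for choice in product(*layer_candidates):
        memory = sum(variant[2] for variant in choice)
        if memory > memory_budget:
            continue  # violates the memory constraint
        cost = sum(variant[1] for variant in choice)
        if best is None or cost < best[0]:
            best = (cost, memory, [variant[0] for variant in choice])
    return best


result = select_architecture(LAYER_CANDIDATES, memory_budget=800)
print(result)  # (15, 800, ['ffn_50', 'ffn_100', 'ffn_50'])
```

The result is heterogeneous: the second layer keeps its full FFN because shrinking it is the most costly, while the other two layers are halved to meet the budget.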
<details>
<summary>The list of reviewed and merged MRs that resulted in the
feature/puzzletron branch</summary>
Merging dkorzekwa/any_model to feature/puzzletron
[Add anymodel directories to feature/puzzletron by danielkorzekwa · Pull
Request #974 ·
NVIDIA/Model-Optimizer](#974)
- merged
[Draft: anymodel activation scoring by danielkorzekwa · Pull Request
#989 ·
NVIDIA/Model-Optimizer](#989)
- merged
[Draft: Merge anymodel pruning by danielkorzekwa · Pull Request #990 ·
NVIDIA/Model-Optimizer](#990)
- merged
[Draft: Merging anymodel:build_library_and_stats by danielkorzekwa ·
Pull Request #993 ·
NVIDIA/Model-Optimizer](#993)
- merged
[Dkorzekwa/any model calc one block scores by danielkorzekwa · Pull
Request #994 ·
NVIDIA/Model-Optimizer](#994)
- merged
[Draft: merge any_model: mip_and_realize_models by danielkorzekwa · Pull
Request #995 ·
NVIDIA/Model-Optimizer](#995)
- merged
[Dkorzekwa/any model other models by danielkorzekwa · Pull Request
#1007 ·
NVIDIA/Model-Optimizer](#1007)
- merged
PR to 1007: #1039 - merged
[Dkorzekwa/anymodel gptoss by danielkorzekwa · Pull Request #1020 ·
NVIDIA/Model-Optimizer](#1020)
- merged
[Merge any_model tutorial by danielkorzekwa · Pull Request #1035 ·
NVIDIA/Model-Optimizer](#1035)
- merged
[Merge mbridge distillation for any_model by danielkorzekwa · Pull
Request #1036 ·
NVIDIA/Model-Optimizer](#1036)
- merged
[MR branch for the remaining difference between dkorzekwa/any_model an…
by danielkorzekwa · Pull Request #1047 ·
NVIDIA/Model-Optimizer](#1047)
- merged
[Dkorzekwa/decilm hf code cleanup by danielkorzekwa · Pull Request #1071
·
NVIDIA/Model-Optimizer](#1071)
- merged
[Dkorzekwa/decilm hf code cleanup 2 by danielkorzekwa · Pull Request
#1073 ·
NVIDIA/Model-Optimizer](#1073)
- merged
[Dkorzekwa/anymodel subblock stats by danielkorzekwa · Pull Request
#1085 ·
NVIDIA/Model-Optimizer](#1085)
- merged
[Dkorzekwa/anymodel subblock stats nodecilm by danielkorzekwa · Pull
Request #1102 ·
NVIDIA/Model-Optimizer](#1102)
- merged
[Dkorzekwa/decilm cleanup post subblockstats by danielkorzekwa · Pull
Request #1103 ·
NVIDIA/Model-Optimizer](#1103)
- merged
[code clean up by danielkorzekwa · Pull Request #1110 ·
NVIDIA/Model-Optimizer](#1110)
- merged
Merging into main:
[Activation hooks redesign (reuse hooks component across both minitron
and puzzletron) by danielkorzekwa · Pull Request #1022 ·
NVIDIA/Model-Optimizer](#1022)
- merged
[Dkorzekwa/puzzletron use importance hooks from prune by danielkorzekwa
· Pull Request #1115 ·
NVIDIA/Model-Optimizer](#1115)
- merged
</details>
<!-- Details about the change. -->
### Usage
Puzzletron tutorial:
https://github.com/NVIDIA/Model-Optimizer/tree/feature/puzzletron/examples/puzzletron
### Testing
The main e2e test for compressing 9 models with Puzzletron:
https://github.com/NVIDIA/Model-Optimizer/blob/feature/puzzletron/tests/gpu/torch/puzzletron/test_puzzletron.py
2-GPU nightly tests:
- https://github.com/NVIDIA/Model-Optimizer/actions/runs/24468209205/job/71501061203
- https://github.com/NVIDIA/Model-Optimizer/actions/runs/24470214159/job/71508152952
### Before your PR is "*Ready for review*"
- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: ✅
- Did you write any new necessary tests?: ✅
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
✅
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added Puzzletron: end-to-end heterogeneous pruning & NAS workflow with
AnyModel support, example pipelines, deployment and evaluation
utilities, and tools for converting/pruning and exporting compressed
checkpoints.
* **Documentation**
* Comprehensive Puzzletron tutorials, model-specific guides, evaluator
instructions, example configs, and changelog entry.
* **Chores**
* CI/workflow updates (extras installation, longer GPU test timeout),
pre-commit hook exclusion updated, and CODEOWNERS entries added.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Liana Mikaelyan <lmikaelyan@nvidia.com>
Signed-off-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com>
Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com>
Signed-off-by: jrausch <jrausch@nvidia.com>
Signed-off-by: root <root@pool0-00848.cm.cluster>
Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Co-authored-by: Liana Mikaelyan <lmikaelyan@nvidia.com>
Co-authored-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com>
Co-authored-by: J Rausch <38429553+j-rausch@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent dec2952 commit 361f7e3
235 files changed
Lines changed: 24885 additions & 166 deletions
File tree
- .github
- workflows
- docs/source
- examples
- llm_eval
- megatron_bridge
- results
- pruning
- puzzletron
- configs
- gptoss-20b_remove_experts_memory
- pruning
- llama-3_1-8B_pruneffn_memory
- pruning
- llama-3_2-3B_pruneffn_memory
- pruning
- mistral-small-24b-instruct-2501_pruneffn_memory
- pruning
- nemotron-nano-12b-v2
- pruning
- qwen2_5_7b_instruct_pruneffn_memory
- pruning
- qwen3-8b_pruneffn_memory
- pruning
- evaluation
- modelopt/torch
- export
- plugins
- prune/importance_hooks
- puzzletron
- activation_scoring
- activation_hooks
- anymodel
- converter
- model_descriptor
- models
- gpt_oss
- llama
- mistral_small
- nemotron_h_v2
- nemotron_h
- qwen2
- qwen3_vl
- qwen3
- puzzformer
- dataset
- mip
- plugins
- mbridge
- pruning
- replacement_library
- sewing_kit
- subblock_stats
- tools
- bypassed_training
- utils
- data
- utils
- tests
- _test_utils/torch
- puzzletron
- tokenizer
- examples
- megatron_bridge
- speculative_decoding
- gpu_megatron/torch/export
- gpu/torch
- puzzletron
- resources/configs
- Qwen
- Qwen2.5-7B-Instruct
- pruning
- Qwen3-8B
- pruning
- Qwen3-VL-30B-A3B-Instruct
- pruning
- meta-llama
- Llama-3.1-8B-Instruct
- pruning
- Llama-3.2-3B-Instruct
- pruning
- mistralai/Mistral-Small-24B-Instruct-2501
- pruning
- nvidia
- NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16
- pruning
- NVIDIA-Nemotron-Nano-12B-v2
- pruning
- openai/gpt-oss-20b
- pruning
- pruning
- quantization
- sparsity/attention_sparsity
- unit/torch
- export
- puzzletron