Commit ca7eb64
authored
DSV4 PTQ example with dequant on the fly (#1341)
### What does this PR do?
Type of change: new example <!-- Use one of the following: Bug fix, new
feature, new example, new tests, documentation. -->
Add deepseek v4 official modeling ptq example
### Usage
See readme, and it requires the vllm PR:
vllm-project/vllm#42209
### Testing
Tested with ptq and export of dsv4 flash and served with vllm.
### Before your PR is "*Ready for review*"
Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).
Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).
- Is this change backward compatible?: ✅ / ❌ / N/A <!--- If ❌, explain
why. -->
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ / ❌ / N/A
<!--- Mandatory -->
- Did you write any new necessary tests?: ✅ / ❌ / N/A <!--- Mandatory
for new features or examples. -->
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
✅ / ❌ / N/A <!--- Only for new features, API changes, critical bug fixes
or backward incompatible changes. -->
### Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added DeepSeek‑V4 routed‑expert post‑training quantization and an
NVFP4 checkpoint conversion utility.
* **Documentation**
* Expanded DeepSeek quantization guide with directory layout, updated
V3/V3.2 workflows, and detailed V4 routed‑expert calibration,
single/multi‑node examples, and export guidance.
* **Chores**
* Made example quantization scripts location‑independent.
* Updated pre‑commit license hook to skip DeepSeek example quantization
files.
<!-- review_stack_entry_start -->
[](https://app.coderabbit.ai/change-stack/NVIDIA/Model-Optimizer/pull/1341?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)
<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Meng Xin <mxin@nvidia.com>1 parent f9423c0 commit ca7eb64
8 files changed
Lines changed: 1271 additions & 9 deletions
File tree
- examples/deepseek
- deepseek_v3
- deepseek_v4
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
103 | 103 | | |
104 | 104 | | |
105 | 105 | | |
106 | | - | |
107 | | - | |
| 106 | + | |
| 107 | + | |
108 | 108 | | |
109 | 109 | | |
110 | 110 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
10 | 17 | | |
11 | 18 | | |
12 | 19 | | |
| |||
54 | 61 | | |
55 | 62 | | |
56 | 63 | | |
57 | | - | |
| 64 | + | |
58 | 65 | | |
59 | 66 | | |
60 | 67 | | |
61 | 68 | | |
62 | 69 | | |
63 | | - | |
| 70 | + | |
64 | 71 | | |
65 | 72 | | |
66 | 73 | | |
| |||
78 | 85 | | |
79 | 86 | | |
80 | 87 | | |
81 | | - | |
| 88 | + | |
82 | 89 | | |
83 | 90 | | |
84 | 91 | | |
| |||
91 | 98 | | |
92 | 99 | | |
93 | 100 | | |
94 | | - | |
| 101 | + | |
95 | 102 | | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
File renamed without changes.
Lines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| 19 | + | |
| 20 | + | |
19 | 21 | | |
20 | 22 | | |
21 | 23 | | |
| |||
84 | 86 | | |
85 | 87 | | |
86 | 88 | | |
87 | | - | |
| 89 | + | |
88 | 90 | | |
89 | 91 | | |
90 | 92 | | |
| |||
File renamed without changes.
0 commit comments