Pull requests: intel/auto-round
#1356  feat: handle per-tensor FP8 dequantization for Devstral models · opened Jan 27, 2026 by SwekeR-463 · 4 of 9 tasks
#1348  Refactor FP8 dequantization and detection using registry pattern · opened Jan 27, 2026 by scopophobic · 5 tasks done
#1346  Optimize CPU RAM peak memory during quantization (draft) · opened Jan 27, 2026 by lvliang-intel · 4 of 9 tasks
#1334  rm duplicate args of the quantization extra config · opened Jan 23, 2026 by WeiweiZhang1 · 1 of 9 tasks
#1326  add support for w4a16_mixed [enhancement, ready] · opened Jan 23, 2026 by n1ck-guo · 6 of 17 tasks
#1322  Autoround in vLLM Office Hours [documentation] · opened Jan 23, 2026 by yiliu30 · 1 of 18 tasks
#1321  enable glm4_moe_lite quantization & generation · opened Jan 22, 2026 by WeiweiZhang1 · 3 of 18 tasks
#1295  Optimize FP8 layer conversion by skipping weight initialization · opened Jan 16, 2026 by Copilot
#1289  Robust FP8 layer detection for ignore_layers (#1283) · opened Jan 15, 2026 by scopophobic
#1286  Fix ignore_layers not working for FP8 models · opened Jan 15, 2026 by Copilot · 11 tasks done
#1278  [WIP][refactor quantizers][step 1] refactor rtn and tuning · opened Jan 14, 2026 by n1ck-guo