Pull requests: vllm-project/llm-compressor

Pull requests list

Modernize sparsification info type hints
Labels: Refactor (Code cleanup and/or improvements to existing features), two-reviews (When a PR requires two reviews)
#2644 opened Apr 23, 2026 by prdeepakbabu
[do not land] GPTQ actorder regression test suite
Labels: awq (For any issue / PR related to AWQ support), fp8 (For any issue / PR related to FP8 support), gptq (For any PR / issue related to GPTQ support), llama (For any PR / issue related to Llama herd support), qwen (For any PR / issue related to Qwen support), w4a16
#2643 opened Apr 22, 2026 by HDCharles Collaborator Draft
3 tasks
Add SmoothQuant layer mappings for Cohere, DeepSeek V3, and Phi3
Labels: enhancement (New feature or request), smoothquant (For any issue / PR related to SmoothQuant support), transforms (Related to transforms-based modifiers like SpinQuant and Quip), two-reviews
#2639 opened Apr 21, 2026 by jayakumarpujar
4 tasks done
[AWQ] Seed grid search with identity baseline + fail fast on non-finite loss
Labels: awq, enhancement, Refactor, two-reviews
#2635 opened Apr 21, 2026 by juju812
2 of 3 tasks
[Deprecation] [Offload] [Tracing] Remove legacy offloading logic in tracing
Labels: Refactor, tracing (Issues related to model tracing)
#2633 opened Apr 20, 2026 by kylesayrs Collaborator
[Deprecation] Replace deprecated function usage
Labels: autoround (For any PR / issue related to autoround support), quality-failed, Refactor
#2632 opened Apr 20, 2026 by kylesayrs Collaborator
add example of w8a8fp8 for qwen3.5
Labels: documentation (Improvements or additions to documentation), enhancement, fp8, qwen, two-reviews
#2631 opened Apr 20, 2026 by zhangxin81
Adding test_group to lm-eval configs
Labels: enhancement, fp8, nvfp4 (For any PR / issue related to NVFP4 support), two-reviews, w4a16
#2623 opened Apr 16, 2026 by debroy-rh
Defer weight qparams to epoch end, unify calibration lifecycle
#2621 opened Apr 15, 2026 by HDCharles Collaborator
2 of 5 tasks
test gptq issue [not for land]
Labels: enhancement, gptq, nvfp4, quality-failed
#2617 opened Apr 14, 2026 by HDCharles Collaborator
Add actorder support for GPTQ block quantization
Labels: enhancement, fp8, gptq, ready (When a PR is ready for review), Refactor, two-reviews
#2616 opened Apr 14, 2026 by rk119
[Tests] Add transformers v5 modeling tests and clean up import guards
Labels: qwen, Refactor
#2614 opened Apr 13, 2026 by dsikka Collaborator
[not for land] DDP regression tests
Labels: awq, documentation, enhancement, llama, quality-failed, qwen
#2613 opened Apr 13, 2026 by HDCharles Collaborator
4 tasks done
fix: support transformers >= 5.0 (TORCH_INIT_FUNCTIONS fallback)
Labels: bug (Something isn't working), qwen, two-reviews, w4a16
#2608 opened Apr 12, 2026 by quivent
[oneshot] clean offload_dir during post-processing
#2605 opened Apr 10, 2026 by brian-dellabetta Collaborator Draft
3 tasks
[docs] deepseek v3.2 docs
Labels: documentation, ready
#2602 opened Apr 10, 2026 by brian-dellabetta Collaborator
fix: correct TOKENIZERS_PARALLELISM_ENV constant value
Labels: needs-rebase, ready, two-reviews
#2596 opened Apr 10, 2026 by kuishou68
[Refactor] Refactor splits to only use the "calibration" split (#2551)
Labels: needs-rebase, ready, Refactor, two-reviews
#2589 opened Apr 8, 2026 by arpitkh101
Observers refactor
Labels: needs-rebase
#2585 opened Apr 8, 2026 by HDCharles Collaborator
[Refactor] Consolidate Intermediate Offloading
Labels: needs-rebase, two-reviews
#2583 opened Apr 8, 2026 by menogrey Contributor
[AWQ] [gemma3] remove input layernorm mapping
#2571 opened Apr 6, 2026 by brian-dellabetta Collaborator
1 task
feat: add ActivationOrdering support for per-channel GPTQ quantization
Labels: needs-rebase, ready, two-reviews
#2525 opened Mar 26, 2026 by matdou
[Examples] Reorganize examples by model/scheme/algo hierarchy
Labels: documentation, needs-rebase
#2510 opened Mar 24, 2026 by dsikka Collaborator Draft