Open
0 of 1 issue completed
Labels: enhancement, good first issue, keep-open
Description
Functionality in llm-compressor and compressed-tensors could be further optimized with Triton JIT kernels and `torch.compile` to speed up execution. Candidates for replacement include:
- Quantization Compressors
  - `compress_weight` and `decompress_weight` functionality
- Observer
  - `calculate_qparams` functionality for the MinMax observer and MSE observer
- Update GPTQ
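
To make the observer item concrete, here is a minimal sketch of the kind of elementwise min/max computation that `torch.compile` can typically fuse. The names `minmax_qparams` and `make_compiled` are hypothetical and do not mirror the actual `calculate_qparams` signature in compressed-tensors; the asymmetric, per-output-channel scheme is an assumption chosen for illustration.

```python
import torch


def minmax_qparams(weight: torch.Tensor, num_bits: int = 8):
    """Hypothetical helper: per-row asymmetric scale and zero point from min/max."""
    qmin, qmax = 0, 2**num_bits - 1
    # Extend the range to include zero so that 0.0 is exactly representable.
    min_val = torch.clamp(weight.amin(dim=1), max=0.0)
    max_val = torch.clamp(weight.amax(dim=1), min=0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    # Guard against all-zero rows, which would otherwise give scale == 0.
    scale = torch.clamp(scale, min=torch.finfo(weight.dtype).eps)
    zero_point = torch.clamp(torch.round(qmin - min_val / scale), qmin, qmax)
    return scale, zero_point.to(torch.int32)


def make_compiled():
    """Wrap the eager function; tracing and compilation happen on the first call."""
    return torch.compile(minmax_qparams)
```

Keeping the eager function alongside the compiled wrapper gives updated tests a reference to compare against, for example with `torch.testing.assert_close`.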
A proposed solution should replace the existing code with the optimized implementation, include updated tests, and provide quick benchmarks showing the difference in performance.
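
A quick benchmark along these lines could look like the sketch below. It is only an illustration: `benchmark`, `fake_quantize`, and `compare` are hypothetical names, and the quantize/dequantize round trip stands in for whichever function is actually being replaced.

```python
import time

import torch


def benchmark(fn, *args, warmup: int = 3, iters: int = 20) -> float:
    """Return the mean wall-clock seconds per call of fn(*args)."""
    for _ in range(warmup):
        fn(*args)  # warm-up absorbs one-time costs such as compilation
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for queued GPU work before timing
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters


def fake_quantize(weight, scale, zero_point, qmin=0, qmax=255):
    """Stand-in workload: quantize to integers, then dequantize."""
    q = torch.clamp(torch.round(weight / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale


def compare(weight, scale, zero_point):
    """Check that eager and compiled outputs agree, then time both."""
    compiled = torch.compile(fake_quantize)
    torch.testing.assert_close(
        compiled(weight, scale, zero_point),
        fake_quantize(weight, scale, zero_point),
    )
    return {
        "eager_s": benchmark(fake_quantize, weight, scale, zero_point),
        "compiled_s": benchmark(compiled, weight, scale, zero_point),
    }
```

Calling `compare` on tensors with realistic weight shapes returns the two mean timings; the correctness check runs first so that a speed-up is never reported for a function whose output has changed.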