What's Changed
- Bump supported TensorRT-LLM version to v0.20.0
- Bump supported Torch-TensorRT version to v2.7.0
- Add KV cache quantization support for compressed-tensors
- Add weight-only, FP8, and KV cache quantization support for modelopt
- Add support for vision-language models
- Add support for draft-target models
- Drop support for AutoGPTQ
- Add new build arguments

Full Changelog: v0.2.1...v0.3.0