v0.3.0

@junstar92 released this 16 Jul 00:50
19c9865

What's Changed

  • Bump supported TensorRT-LLM version to v0.20.0
  • Bump supported Torch-TensorRT version to v2.7.0
  • Add support for KV cache quantization for compressed-tensors
  • Add support for weight-only, FP8, and KV cache quantization for modelopt
  • Add support for vision-language models
  • Add support for draft-target models
  • Drop support for AutoGPTQ
  • Add new build arguments

Full Changelog: v0.2.1...v0.3.0