Labels
- Model customization: Adding support for new model architectures or variants
- question: Further information is requested
Description
System Info
System Information:
- OS: Ubuntu 24.04 (DGX)
- Python version: 3.13
- CUDA version: 13.0
- GPU model(s): GB10
- Driver version: 580.126.09
- TensorRT-LLM version: 1.3.0rc6 (Docker)
How would you like to use TensorRT-LLM
I want to run inference with a Qwen3.5 27B model, but I don't know how to integrate it with TensorRT-LLM.
I have tried both 1.3.0rc5 and 1.3.0rc6; both failed.
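For context, the usual path for serving a Hugging Face checkpoint with TensorRT-LLM is the `trtllm-serve` entry point, which exposes an OpenAI-compatible endpoint. This is only a sketch of that general flow; `<qwen-model-repo>` is a placeholder, since the exact Hugging Face repo id for this model is not given here, and architecture support in a given release candidate is not guaranteed:

```shell
# Launch an OpenAI-compatible server from a Hugging Face checkpoint.
# <qwen-model-repo> is a placeholder -- substitute the real repo id or a local path.
trtllm-serve <qwen-model-repo> --host 0.0.0.0 --port 8000

# Once the server is up, send a test completion request:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "<qwen-model-repo>", "prompt": "Hello", "max_tokens": 16}'
```

If the architecture is not yet registered in the installed TensorRT-LLM build, this command fails at model load time, which is the point where an unsupported-architecture error would surface.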
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.