
[Usage]: How to use 1.3.0rc6 to deploy Qwen3.5 27B on DGX Spark #12000

@luningxie

Description


System Info

System Information:

  • OS: Ubuntu 24.04 (DGX Spark)
  • Python version: 3.13
  • CUDA version: 13.0
  • GPU model(s): GB10
  • Driver version: 580.126.09
  • TensorRT-LLM version: 1.3.0rc6 (Docker)


How would you like to use TensorRT-LLM

I want to run inference with Qwen3.5 27B, but I don't know how to integrate it with TensorRT-LLM.
I have tried both 1.3.0rc5 and 1.3.0rc6; both failed.
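For reference, serving a Hugging Face checkpoint with a TensorRT-LLM release container typically goes through `trtllm-serve`, which exposes an OpenAI-compatible endpoint. A minimal sketch follows; the container tag and the Hugging Face repo id for Qwen3.5 27B are assumptions here, so check the NGC catalog and the Hugging Face hub for the exact names:

```shell
# Pull the release container (tag is an assumption; verify the exact 1.3.0rc6 tag on NGC).
docker pull nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc6

# Launch the container with GPU access and start an OpenAI-compatible server.
# The model id below is a placeholder -- substitute the actual Qwen3.5 27B
# checkpoint repo id or a local path mounted into the container.
docker run --rm -it --gpus all --ipc=host -p 8000:8000 \
  nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc6 \
  trtllm-serve "Qwen/Qwen3.5-27B" --host 0.0.0.0 --port 8000
```

If the model architecture is not yet supported in the installed release, this is where the failure would surface, so including the exact error output from this step in the issue would help maintainers diagnose it.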

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Labels

  • Model customization <NV>: Adding support for new model architectures or variants
  • question: Further information is requested
