TRTLLM/TensorRT #24

@naveenmiriyaluredhat

Description

  1. What are the steps to create TensorRT engines?
  2. What happens during TensorRT engine creation?
  3. How does a TRTLLM engine differ from a vLLM engine?
  4. How does engine creation differ for pipeline parallelism (PP) vs. tensor parallelism (TP)?
  5. What do the enable_fmha and fuse_allreduce optimizations do?
