-
I think this would likely need refit support. Here is an example in native TensorRT: https://github.com/NVIDIA/TensorRT/tree/release/9.0/demo/Diffusion#generate-an-image-guided-by-a-text-prompt-and-using-specified-lora-model-weight-updates. There may need to be some APIs exposed in Torch-TensorRT to make this work out of the box.
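  For reference, a rough sketch of what refit looks like in the TensorRT Python API is below. The engine path and the `lora_merged_weights` dict (TensorRT weight names mapped to updated arrays) are placeholders, and the engine must have been built with the `REFIT` builder flag:

  ```python
  import numpy as np
  import tensorrt as trt

  # Sketch only: refit LoRA-updated weights into a prebuilt, refittable engine.
  # `lora_merged_weights` is a placeholder dict of {trt_weight_name: np.ndarray}.
  logger = trt.Logger(trt.Logger.WARNING)
  runtime = trt.Runtime(logger)

  with open("unet.engine", "rb") as f:  # placeholder engine path
      engine = runtime.deserialize_cuda_engine(f.read())

  refitter = trt.Refitter(engine, logger)
  for name, array in lora_merged_weights.items():
      refitter.set_named_weights(name, trt.Weights(np.ascontiguousarray(array)))

  # Apply the weight updates in place; the optimized graph is kept as-is.
  assert refitter.refit_cuda_engine()
  ```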
-
For future reference, there is now an API that makes LoRAs easier to use with Torch-TensorRT models: https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/mutable_torchtrt_module_example.py
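    A minimal sketch of using that mutable-module API with a diffusers pipeline is shown below; the model id, LoRA weights, and the exact compile-setting keys are assumptions based on the linked example and may differ between Torch-TensorRT versions:

    ```python
    import torch
    import torch_tensorrt
    from diffusers import DiffusionPipeline

    # Compile settings are assumptions; key names may vary across Torch-TensorRT versions.
    settings = {
        "use_python_runtime": True,
        "enabled_precisions": {torch.float16},
        "immutable_weights": False,  # allow weight refit instead of full recompilation
    }

    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # placeholder model id
        torch_dtype=torch.float16,
    ).to("cuda")

    # Wrap the UNet; it is compiled on first use and refit when its weights change.
    pipe.unet = torch_tensorrt.MutableTorchTensorRTModule(pipe.unet, **settings)
    image = pipe("a majestic castle in the clouds").images[0]

    # Loading/fusing a LoRA mutates the UNet weights; the mutable module detects this
    # and refits the existing TensorRT engine rather than rebuilding it from scratch.
    pipe.load_lora_weights("some-org/some-lora", weight_name="lora.safetensors")  # placeholder
    pipe.fuse_lora()
    image = pipe("a majestic castle in the clouds").images[0]
    ```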
-
Hi, I have a base model and several LoRA adapters trained on top of it. The base model is always loaded, and for each inference request I modify it by applying an adapter. I want to optimize the model with TensorRT. Is there a way to apply LoRA adapters to the optimized TensorRT model?
I would appreciate any ideas on where to start working on this problem. Thank you.
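  For context, this is roughly what the current eager-mode workflow looks like with PEFT (model and adapter ids are placeholders); the question is how to keep this per-request adapter switching once the base model is compiled with TensorRT:

  ```python
  import torch
  from transformers import AutoModelForCausalLM
  from peft import PeftModel

  # Placeholder ids: the base model stays loaded, adapters are switched per request.
  base = AutoModelForCausalLM.from_pretrained(
      "my-org/base-model", torch_dtype=torch.float16
  ).to("cuda")
  model = PeftModel.from_pretrained(base, "my-org/adapter-a", adapter_name="a")
  model.load_adapter("my-org/adapter-b", adapter_name="b")

  def infer(adapter_name, inputs):
      # Activate the adapter requested by the caller, then run the model.
      model.set_adapter(adapter_name)
      with torch.no_grad():
          return model(**inputs)
  ```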