Labels: enhancement (New feature or request)
Description
Currently, ONNX isn't used for inference of transformer models. Alongside using Ray, one approach to improve inference throughput further would be to:
- download models
- do the onnx export from the model
- load using onnxruntime
- catch errors for unsupported models
- do some initial validation of results?
Hopefully, we get some of this for free with Optimum if we wait a bit: https://github.com/huggingface/optimum