
Add onnx support  #580

@davanstrien

Description


Currently, ONNX isn't used for inference of transformer models. Alongside using Ray, one approach to further improve inference throughput would be to:

  • download the models
  • do the ONNX export from the model
  • load the exported model with ONNX Runtime
  • catch errors for unsupported models
  • do some initial validation of the results?

Hopefully, we get some of this for free with Optimum if we wait a bit: https://github.com/huggingface/optimum

Labels

enhancement (New feature or request)
