Labels: enhancement (New feature or request)
Description
Currently, ONNX isn't used for inference of transformer models. Alongside using Ray, one approach to improve inference throughput further would be to:
- download models
- do the onnx export from the model
- load using onnxruntime
- catch errors for unsupported models
- do some initial validation of results?
Hopefully, we get some of this for free with Optimum if we wait a bit: https://github.com/huggingface/optimum