Overview
Issue #762 tracks adding demos for the tt-forge-onnx frontend for both ONNX and PaddlePaddle, with models and inputs loaded from the tt-forge-models repository. For PaddlePaddle, loaders already exist, and demos will be created in tt-forge.
For ONNX, there is ongoing discussion about hosting ONNX files in public locations (e.g. HuggingFace) instead of private storage (e.g. S3). Making ONNX models publicly downloadable would help users run them with the tt-forge-onnx compiler.
Until ONNX models are onboarded to tt-forge-models, we can still provide ONNX demos in tt-forge using a conversion pipeline.
Proposed approach
- Load PyTorch models from tt-forge-models using existing loaders
- Convert them to ONNX using `torch.onnx.export` or another suitable API
- Compile and run the resulting ONNX models with the tt-forge-onnx compiler
This gives us ONNX demos in tt-forge for release validation, as sketched below.
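A minimal sketch of the pipeline, assuming a tt-forge-models-style loader and a `forge.compile` entry point. The loader import path and the compile call are illustrative assumptions; the actual tt-forge-models and tt-forge-onnx APIs may differ.

```python
# Minimal sketch of the PyTorch -> ONNX -> tt-forge-onnx pipeline.
# The loader import path and forge.compile call are assumptions for
# illustration; actual APIs in tt-forge-models / tt-forge-onnx may differ.
import onnx
import torch

import forge  # hypothetical tt-forge-onnx compiler entry point
from tt_forge_models.resnet.pytorch import ModelLoader  # hypothetical loader path

# 1. Load the PyTorch model and sample inputs via the existing loader.
loader = ModelLoader()
model = loader.load_model()
inputs = loader.load_inputs()

# 2. Export the model to ONNX with torch.onnx.export.
onnx_path = "resnet.onnx"
model.eval()
torch.onnx.export(model, (inputs,), onnx_path, opset_version=17)

# 3. Validate, then compile and run the ONNX model.
onnx_model = onnx.load(onnx_path)
onnx.checker.check_model(onnx_model)
compiled = forge.compile(onnx_model, sample_inputs=[inputs])  # assumed API
outputs = compiled(inputs)
```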
Suggested scope (examples)
Start with a subset of the models in #762, for example:
- Text: BERT, GPT-2, RoBERTa, T5, Perceiver, SqueezeBERT
- Vision: ResNet, AlexNet, EfficientNet, GoogLeNet, MobileNetV1, DenseNet, VGG
- Other: N-BEATS (TimeSeries), SSD300, YOLOv8, Autoencoder, SegFormer
Next steps (after ONNX hosting is resolved)
Once ONNX models are available in tt-forge-models (e.g. on HuggingFace or similar), the demos can be updated to load ONNX files directly instead of converting from PyTorch; this issue covers an interim solution until then.
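For example, once an ONNX file is hosted on HuggingFace, a demo could fetch and compile it directly. The repo id, filename, and input shape below are placeholders, and `forge.compile` is the same assumed entry point as in the sketch above.

```python
# Sketch of loading a hosted ONNX model directly, skipping PyTorch conversion.
# The repo_id/filename are placeholders; forge.compile is an assumed API.
import onnx
import torch
from huggingface_hub import hf_hub_download

import forge  # hypothetical tt-forge-onnx compiler entry point

# Download the ONNX file from its (hypothetical) HuggingFace repo.
onnx_path = hf_hub_download(
    repo_id="tenstorrent/resnet-onnx",  # placeholder repo
    filename="resnet.onnx",             # placeholder filename
)

# Load and compile directly with tt-forge-onnx.
onnx_model = onnx.load(onnx_path)
sample = torch.randn(1, 3, 224, 224)  # example shape; a real demo would use the model's loader
compiled = forge.compile(onnx_model, sample_inputs=[sample])  # assumed API
outputs = compiled(sample)
```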
cc: @nvukobratTT