
@MagellaX commented on Sep 9, 2025

Description

This introduces an initial ONNX import path encapsulated in a dedicated crate. The importer reads an ONNX model and produces a Luminal primitive graph plus named input/output GraphTensors for data ingress/egress. It is self-contained and does not affect existing runtime behavior.
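For orientation, here is a rough usage sketch. The crate path `luminal_onnx::load_onnx`, the tuple return shape, and the `set`/`execute`/`data` calls are illustrative assumptions about the API, not the exact signatures in this PR:

```rust
// Hypothetical usage sketch; crate name, entry point, and return shape are assumptions.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the ONNX file: returns the boxed primitive graph plus input/output
    // GraphTensors keyed by their original ONNX names.
    let (mut graph, inputs, outputs) = luminal_onnx::load_onnx("model.onnx")?;

    // Feed the named input, run the graph, and read back the named output
    // (set/execute/data follow Luminal's usual tensor workflow).
    inputs["input"].set(vec![1.0_f32, 2.0, 3.0, 4.0]);
    graph.execute();
    let result: Vec<f32> = outputs["output"].data();
    println!("{result:?}");
    Ok(())
}
```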

Key points

  • Provides a single entry point to load models and construct a runnable primitive graph, returning input/output tensors keyed by original ONNX names.
  • Uses protobufs generated via prost with a vendored protoc, avoiding external build dependencies and reducing CI friction.
  • Implements a pragmatic operator subset sufficient for basic models and initial end-to-end validation:
    • Constants and initializers
    • Elementwise Add/Sub/Mul/Div/Max/Min with simple broadcasting
    • Relu, Sigmoid, Tanh, Sqrt
    • MatMul and Gemm (alpha, beta, and transA/transB attributes)
    • Softmax with axis (including negative indices)
    • Reshape (supports -1 and 0 when the target shape is constant), Transpose (the axis and reshape conventions are sketched after this list)
    • Unsqueeze/Squeeze (via attribute or secondary input), Concat
  • Converts ONNX shapes (including dim_param) into Luminal Expressions, mapping each dim_param name to the same symbolic dimension wherever it appears.
  • Marks graph outputs for retrieval to ensure values are retained after execution.
  • Includes a smoke test that programmatically builds a small ONNX model (MatMul + Add + Softmax), imports it, runs the graph, and compares numerically to an equivalent Luminal graph.
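For the Softmax and Reshape items above, the relevant ONNX conventions are: a negative axis counts from the end of the shape, a 0 in a Reshape target copies the input dimension at that position, and a single -1 is inferred so the total element count is preserved. A standalone sketch of those rules (helper names are illustrative, not the importer's actual functions):

```rust
// Illustrative helpers only; the importer's real function names and signatures may differ.

/// Map an ONNX axis (which may be negative) to a concrete dimension index.
fn normalize_axis(axis: i64, rank: usize) -> usize {
    if axis < 0 { (axis + rank as i64) as usize } else { axis as usize }
}

/// Resolve an ONNX Reshape target: 0 copies the input dim at the same position,
/// a single -1 is inferred so the total element count is preserved.
/// e.g. resolve_reshape(&[2, 3, 4], &[0, -1]) == vec![2, 12]
fn resolve_reshape(input_shape: &[usize], target: &[i64]) -> Vec<usize> {
    let mut out: Vec<usize> = target
        .iter()
        .enumerate()
        .map(|(i, &d)| match d {
            0 => input_shape[i],
            -1 => 1, // placeholder, filled in below
            d => d as usize,
        })
        .collect();
    if let Some(pos) = target.iter().position(|&d| d == -1) {
        let total: usize = input_shape.iter().product();
        let known: usize = out
            .iter()
            .enumerate()
            .filter(|&(i, _)| i != pos)
            .map(|(_, &d)| d)
            .product();
        out[pos] = total / known;
    }
    out
}
```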

Design notes

  • The returned Graph is boxed to keep GraphTensor internal pointers stable while building and executing the graph (a possible return type is sketched after these notes).
  • A trimmed ONNX proto is included to cover the importer’s current needs; it can be swapped or expanded as coverage grows.
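One possible shape for that boxed return value (struct and field names are assumptions, not the crate's actual items): boxing the Graph gives it a stable heap address, so GraphTensor handles that point back into it stay valid as the whole bundle is moved around by the caller.

```rust
// Hypothetical return type; names and the prelude path are assumptions.
use std::collections::HashMap;
use luminal::prelude::{Graph, GraphTensor};

pub struct ImportedOnnx {
    // Boxed so the graph's address is stable while GraphTensors reference it.
    pub graph: Box<Graph>,
    // Keyed by the original ONNX input/output names.
    pub inputs: HashMap<String, GraphTensor>,
    pub outputs: HashMap<String, GraphTensor>,
}
```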

Limitations and scope

  • Not yet covering convolution, pooling, batch norm, gather/scatter, pad/clip, shape ops, or control flow.
  • Focused on floating point tensors and common inference paths; quantized/complex dtypes are not handled.
  • Broadcasting is intentionally simple and may need to be generalized to full ONNX multidirectional semantics (the full rule is sketched after this list).
  • Only import is implemented; export is out of scope here.
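For reference, full ONNX (NumPy-style) multidirectional broadcasting right-aligns the two shapes and requires each dimension pair to be equal or for one side to be 1, with missing leading dims treated as 1. A standalone sketch of that rule, independent of the importer's current code:

```rust
/// NumPy-style multidirectional broadcasting as specified by ONNX:
/// shapes are right-aligned; each dim pair must be equal or one must be 1.
/// Returns None if the shapes are incompatible.
/// e.g. broadcast_shape(&[2, 3, 1], &[3, 4]) == Some(vec![2, 3, 4])
fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let rank = a.len().max(b.len());
    let mut out = Vec::with_capacity(rank);
    for i in 0..rank {
        // Walk dimensions from the trailing end; missing leading dims count as 1.
        let da = if i < a.len() { a[a.len() - 1 - i] } else { 1 };
        let db = if i < b.len() { b[b.len() - 1 - i] } else { 1 };
        out.push(match (da, db) {
            (x, y) if x == y => x,
            (1, y) => y,
            (x, 1) => x,
            _ => return None, // incompatible, e.g. [2, 3] vs [4]
        });
    }
    out.reverse();
    Some(out)
}
```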

Near-term follow-ups

  • Expand op coverage: Conv/ConvTranspose, pooling ops, Reduce ops (Mean/Sum), ArgMax, Pad/Clip, Flatten/Shape/Gather.
  • Improve broadcasting to faithfully match ONNX rules across ranks and singleton dims.
  • Extend dtype support (int/bool/bfloat16/float16) and test mixed-type behavior.
  • Add an example CLI to load an ONNX file, feed inputs, and dump outputs for quick local validation.
  • Add model-based tests (small public ONNX models) to exercise importer paths beyond the smoke test.
  • Consider switching to the full upstream ONNX proto and documenting version compatibility.

Tracking
