[Tools][Export] Implement (any) Model Exporter (Pilot: smolVLA) #5

@copparihollmann

Context

New models (e.g., lerobot/smolVLA) often use PyTorch operators or control-flow patterns that the torch-mlir and iree-turbine export paths do not yet support. Debugging these export failures by hand is slow and hard to repeat. We need a tool that attempts to compile a model, captures the specific operator failures (e.g., aten::fft, aten::complex), and generates a structured "Gap Report" to guide the implementation of missing shims or MLIR lowerings.

Objective

Develop a Model Export Harness (src/tools/model_audit/) that automates the ingestion, tracing, and lowering analysis of arbitrary PyTorch models, using smolVLA as the primary integration test case.

Scope of Work

  1. Model Harness (harness.py):
    • Integration with transformers / lerobot to load models and automatically generate valid dummy inputs (shapes/types) for tracing.
    • Support for torch.export (AOT) and torch_mlir.compile (JIT) paths.
  2. Failure Classifier (analyzer.py):
    • Parses torch-mlir diagnostic logs to identify the root cause of export failure.
    • Classifies errors into: MISSING_OP, TYPE_MISMATCH, DYNAMIC_SHAPE_ERROR.
  3. Gap Reporter (reporter.py):
    • Outputs a shim_requirements.yaml listing the specific aten::* ops that need to be decomposed or registered in the compiler backend.
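
As a rough starting point for analyzer.py, the classifier could be a small ordered table of regex rules, one per category. The sketch below is only an assumption about how this might look: the patterns are illustrative and would need tuning against real torch-mlir diagnostics, and the names `FailureKind`, `classify`, and `_to_aten_name` are hypothetical.

```python
import re
from enum import Enum


class FailureKind(Enum):
    MISSING_OP = "MISSING_OP"
    TYPE_MISMATCH = "TYPE_MISMATCH"
    DYNAMIC_SHAPE_ERROR = "DYNAMIC_SHAPE_ERROR"


# Illustrative patterns only (assumption): sample real diagnostics and tune these.
# Rules are checked in order, so the most specific pattern comes first.
_RULES = [
    (FailureKind.MISSING_OP,
     re.compile(r"failed to legalize operation '([\w.]+)'")),
    (FailureKind.DYNAMIC_SHAPE_ERROR,
     re.compile(r"dynamic (?:shape|dimension)", re.IGNORECASE)),
    (FailureKind.TYPE_MISMATCH,
     re.compile(r"(?:type|dtype) mismatch|unsupported (?:element )?type", re.IGNORECASE)),
]


def _to_aten_name(op: str) -> str:
    """Rewrite a dialect-style name such as 'torch.aten.complex' as 'aten::complex'."""
    prefix = "torch.aten."
    return "aten::" + op[len(prefix):] if op.startswith(prefix) else op


def classify(log: str):
    """Return (kind, detail) for the first matching rule, or (None, None)."""
    for kind, pattern in _RULES:
        match = pattern.search(log)
        if match:
            # Rules with a capture group report the op name; others report the matched text.
            detail = _to_aten_name(match.group(1)) if match.groups() else match.group(0)
            return kind, detail
    return None, None
```

Keeping the rules in a flat list means a new category or message format only needs a new table entry, and an unmatched log falls through to `(None, None)` so it can be flagged for manual triage.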

Acceptance Criteria (Definition of Done)

We define success by the tool's ability to identify gaps in smolVLA and other reference models:

Test 1: Auto-Input Generation

  • Input: lerobot/smolVLA (or a mock VLA model class).
  • Condition: Run harness.py.
  • Success: The tool successfully infers input shapes (image + text tokens) and executes the model.forward() pass in eager mode without crashing.
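
One way to keep input inference testable without loading real weights is to first derive plain shape/dtype specs and only then materialize tensors. The sketch below covers the first half only. The config keys (`image_size`, `num_channels`, `max_text_tokens`), the input names, and the `InputSpec` type are hypothetical and are not taken from the lerobot or transformers APIs; the actual smolVLA inputs may be structured differently.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class InputSpec:
    shape: Tuple[int, ...]
    dtype: str  # dtype name, e.g. "float32" or "int64"


def infer_input_specs(config: Dict[str, int], batch_size: int = 1) -> Dict[str, InputSpec]:
    """Derive dummy-input specs (image + text tokens) from a hypothetical config dict."""
    size = config["image_size"]
    return {
        # Image batch in NCHW layout (assumed).
        "pixel_values": InputSpec(
            (batch_size, config["num_channels"], size, size), "float32"
        ),
        # Token ids padded to a fixed length so tracing sees static shapes.
        "input_ids": InputSpec((batch_size, config["max_text_tokens"]), "int64"),
    }
```

For example, `infer_input_specs({"image_size": 224, "num_channels": 3, "max_text_tokens": 16})` describes one 3x224x224 image and 16 token ids; the harness would turn each spec into a tensor before calling model.forward().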

Test 2: Missing Operator Detection

  • Input: A mock model containing an unsupported op (e.g., aten::complex or a specific unsupported FFT).
  • Condition: Run the export harness.
  • Success: The tool catches the crash/exception and outputs a JSON report identifying the specific missing op name.
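
The catch-and-report behavior in Test 2 could be sketched as a wrapper around whatever callable performs the export. Everything here is an assumption for illustration: `audit_export` and the report fields are hypothetical names, and the regex only recognizes op names spelled as `aten::name` or `torch.aten.name` in the exception text.

```python
import json
import re
from typing import Callable

# Matches names such as 'aten::complex' or 'torch.aten.fft_fft' (assumed spellings).
_OP_PATTERN = re.compile(r"(aten::\w+(?:\.\w+)*|torch\.aten\.\w+(?:\.\w+)*)")


def audit_export(export_fn: Callable[[], object]) -> str:
    """Run export_fn and return a JSON report instead of letting the export crash."""
    try:
        export_fn()
    except Exception as exc:  # deliberately broad: any export failure becomes a report
        message = str(exc)
        match = _OP_PATTERN.search(message)
        return json.dumps({
            "status": "FAILED",
            "missing_op": match.group(1) if match else None,
            "error_type": type(exc).__name__,
            "message": message,
        })
    return json.dumps({"status": "SUPPORTED"})
```

Passing a zero-argument callable keeps the wrapper independent of the export path, so the same function could wrap either the torch.export or the torch_mlir.compile attempt. A mock export that raises with a known op name is enough to exercise it in a unit test.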

Test 3: Shim Spec Generation

  • Input: The full smolVLA model (assuming current compiler stack fails on it).
  • Condition: Run the full audit suite.
  • Success: Generates artifacts/smolVLA_gaps.yaml containing:
    • missing_ops: List of unsupported ATen operators.
    • locations: Stack traces pointing to where these ops are used in the model code.
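
For discussion, here is one possible layout for the gaps file, written with the standard library only so the reporter has no YAML dependency. The schema (a flat `missing_ops` list plus a `locations` mapping from op name to frames) is an assumption, as is the function name `render_gaps_yaml`. Strings are quoted with `json.dumps`, which works because JSON strings are valid YAML double-quoted scalars.

```python
import json
from typing import Dict, List


def render_gaps_yaml(missing_ops: List[str], locations: Dict[str, List[str]]) -> str:
    """Render missing ops and their source locations as a small YAML document."""
    ops = sorted(set(missing_ops))  # de-duplicate and keep the output stable
    lines = ["missing_ops:" if ops else "missing_ops: []"]
    for op in ops:
        lines.append(f"  - {json.dumps(op)}")

    lines.append("locations:" if locations else "locations: {}")
    for op in sorted(locations):
        lines.append(f"  {json.dumps(op)}:")
        for frame in locations[op]:
            lines.append(f"    - {json.dumps(frame)}")
    return "\n".join(lines) + "\n"
```

Sorting and de-duplicating the op list keeps the artifact stable across runs, which makes diffs between compiler versions easier to read.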

Test 4: Successful Lowering (Regression)

  • Input: A simple ResNet18 (known supported).
  • Condition: Run the harness.
  • Success: Returns status: SUPPORTED and saves the valid .mlir file to artifacts.
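
The success path can be small. A minimal sketch, assuming the lowered module is already available as text and that artifacts are named after the model (`record_success` and the `mlir_path` field are hypothetical):

```python
from pathlib import Path
from typing import Dict


def record_success(mlir_text: str, model_name: str,
                   artifacts_dir: str = "artifacts") -> Dict[str, str]:
    """Write the lowered module to <artifacts_dir>/<model_name>.mlir and report success."""
    out_dir = Path(artifacts_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the artifacts folder if needed
    out_path = out_dir / f"{model_name}.mlir"
    out_path.write_text(mlir_text, encoding="utf-8")
    return {"status": "SUPPORTED", "mlir_path": str(out_path)}
```

Returning the path alongside the status lets a regression test assert both that the run was reported as SUPPORTED and that the .mlir file actually exists.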
