[Feature]: RFC : Multi-vendor runtime and Python registration build layout #320

@Peter9606

Status: Draft
Author: Peter Han
Created: 2026-03-31
Related: RFC : Device and runtime layer, RFC : Test tiering and backend matrix, RFC : Pluggable GPU compile backend, Issue #240: per-backend Conversion TableGen layout

Summary

This RFC proposes a build and registration layout that makes FlyDSL's MLIR/Python stack multi-vendor friendly while preserving the current ROCm behavior as the default.

The scope covers:

  1. lib/Runtime split and build selection for vendor JIT runtime wrappers.
  2. python/mlir_flydsl/FlyRegisterEverything.cpp registration model without hard-coded vendor assumptions.
  3. python/mlir_flydsl/dialects/FlyROCDL.td and Python dialect binding wiring as optional vendor components.
  4. Related CMake files in lib/ and python/mlir_flydsl/.

The design keeps one strict invariant: a single runtime stack per process. Different builds (or optional components within one build) can still support different vendors.

Motivation

Current build wiring is ROCm-first in multiple places:

  • FlyRegisterEverything.cpp unconditionally registers the fly_rocdl dialect and the ROCDL conversion passes.
  • python/mlir_flydsl/CMakeLists.txt always declares the ROCDL Python bindings and includes the upstream rocdl dialect sources.
  • FlyJitRuntime is built directly from lib/Runtime/FlyRocmRuntimeWrappers.cpp in the Python CMake, with HIP as a required dependency.

This makes it harder to:

  1. Build community/dev environments without vendor-specific target dialect/runtime packages.
  2. Add a second vendor backend without copy-pasting ROCm assumptions into core wiring.
  3. Keep pass registration behavior correct while avoiding duplicate MLIRPass registry instances.

Non-goals

  1. Supporting multiple vendor runtimes loaded simultaneously in one process.
  2. Defining full lowering pipelines for all future vendors in this RFC.
  3. Large API redesign of Python user-facing DSL.

Design goals

  1. Default-compatible: no behavior change for current ROCm-first builds unless options are changed.
  2. Composable: each vendor stack (dialect, conversion, CAPI, runtime wrapper) can be enabled/disabled with CMake options.
  3. Registry-safe: preserve global pass registry semantics used by _mlirRegisterEverything and FlyPythonCAPI.
  4. Incremental adoption: phases can land independently with low churn.

Current constraints that must remain true

From existing python/mlir_flydsl/CMakeLists.txt and runtime behavior:

  1. _mlirRegisterEverything must not statically link C++ pass libs that create a separate local MLIRPass registry.
  2. Passes should be registered through CAPI entry points linked via EMBED_CAPI_LINK_LIBS into FlyPythonCAPI / _mlirRegisterEverything.
  3. FlyPythonCAPI symbol visibility control (-fvisibility=hidden + version script) remains enforced to avoid LLVM symbol conflicts.

Proposed architecture

1) Runtime layer layout (lib/Runtime)

Introduce per-vendor runtime wrapper targets under lib/Runtime/ and move runtime build logic out of python/mlir_flydsl/CMakeLists.txt.

Proposed structure:

lib/Runtime/
  CMakeLists.txt
  rocm/
    CMakeLists.txt
    FlyRocmRuntimeWrappers.cpp
  # future
  iluvatar/
    CMakeLists.txt
    FlyIluvatarRuntimeWrappers.cpp

Build strategy options:

  • Option A (recommended for v1): a single selected runtime target per build
    • FLYDSL_JIT_RUNTIME=rocm|iluvatar|none
    • Builds one shared library with the stable output basename that the Python loader expects.
  • Option B: multiple runtime artifacts in one build
    • Build libfly_jit_runtime_rocm.so, libfly_jit_runtime_iluvatar.so, etc.
    • The Python backend chooses among them via jit_runtime_lib_basenames().

This RFC recommends Option A for initial rollout to reduce packaging and symbol-surface complexity.
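
The difference between the two options can be sketched from the loader's point of view. This is a minimal illustration, not FlyDSL's actual loader: the helper names, the multi_runtime flag, and the stable basename fly_jit_runtime are assumptions (the RFC names only the Option B artifacts and jit_runtime_lib_basenames(), whose real signature is not given here).

```python
from typing import List

# Assumed stable basename for Option A; the RFC does not spell it out.
STABLE_BASENAME = "fly_jit_runtime"


def candidate_runtime_basenames(runtime_kind: str, multi_runtime: bool = False) -> List[str]:
    """Return the library basenames a loader would try (hypothetical helper).

    runtime_kind mirrors the FLYDSL_JIT_RUNTIME values: rocm, iluvatar, none.
    """
    if runtime_kind == "none":
        # No JIT runtime wrapper was built, so there is nothing to load.
        return []
    if multi_runtime:
        # Option B: one artifact per vendor, e.g. libfly_jit_runtime_rocm.so.
        return [f"{STABLE_BASENAME}_{runtime_kind}"]
    # Option A: the build already selected the vendor; the name stays stable.
    return [STABLE_BASENAME]


def shared_lib_filename(basename: str) -> str:
    """Map a basename to a Linux shared-library file name."""
    return f"lib{basename}.so"
```

Under Option A the loader never needs to know the vendor, which is where the packaging and symbol-surface simplification comes from; under Option B the vendor becomes part of the file name and the selection moves into Python.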

2) Registration model (FlyRegisterEverything.cpp)

Replace hard-coded vendor registrations with generated or conditionally compiled blocks.

Recommended mechanism:

  1. Replace the source file with a template, FlyRegisterEverything.cpp.in.
  2. Have CMake assemble vendor registration fragments from the enabled options.
  3. Generate the final FlyRegisterEverything.cpp via configure_file.

Core registrations (always on):

  • mlirRegisterAllDialects
  • fly dialect handle insertion
  • mlirRegisterAllPasses
  • mlirRegisterFlyPasses

Vendor registrations (conditional):

  • dialect handle insertion (e.g. fly_rocdl)
  • conversion pass registration (e.g. mlirRegisterFlyToROCDLConversionPass)
  • vendor transform/attr passes (e.g. mlirRegisterFlyROCDLClusterAttrPass)

This avoids references to missing symbols when a vendor stack is not built.
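
The fragment-assembly step can be modeled outside CMake to make the intended behavior concrete. The sketch below imitates an @VAR@-style substitution in Python; the placeholder name, the template text, and the zero-argument call shapes are assumptions for illustration, and only the registration function names come from this RFC.

```python
from typing import Dict, List, Mapping

# Hypothetical placeholder that the .cpp.in template would carry.
PLACEHOLDER = "@FLYDSL_VENDOR_REGISTRATIONS@"

# Simplified stand-in for FlyRegisterEverything.cpp.in (not the real file).
TEMPLATE = "\n".join([
    "// core registrations (always on)",
    "mlirRegisterFlyPasses();",
    "// vendor registrations (conditional)",
    PLACEHOLDER,
])

# One fragment per vendor option; call signatures are assumed, not confirmed.
VENDOR_FRAGMENTS: Dict[str, List[str]] = {
    "FLYDSL_ENABLE_FLY_ROCDL": [
        "mlirRegisterFlyToROCDLConversionPass();",
        "mlirRegisterFlyROCDLClusterAttrPass();",
    ],
}


def render_registration_source(template: str, options: Mapping[str, bool]) -> str:
    """Substitute the placeholder with fragments of the enabled vendors only."""
    lines: List[str] = []
    for option, fragment in VENDOR_FRAGMENTS.items():
        if options.get(option, False):
            lines.extend(fragment)
    return template.replace(PLACEHOLDER, "\n".join(lines))
```

With the option off, the rendered source contains no vendor symbol at all, so the link step cannot reference a library that was never built.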

3) Python dialect bindings (python/mlir_flydsl/dialects/*.td)

Vendor dialect binding declarations become optional:

  • FlyOps.td binding remains always enabled (core Fly).
  • FlyROCDL.td binding is wrapped in if(FLYDSL_ENABLE_FLY_ROCDL).
  • Future vendor dialects (e.g. FlyIXDL.td) follow the same pattern.

In MLIRFlyDSLSources, include upstream vendor dialect python sources conditionally:

  • Add MLIRPythonSources.Dialects.rocdl only when ROCDL stack is enabled.

For generated binding files and stub generation:

  • Make the _fly_rocdl_*_gen.py copy/stub commands conditional.
  • Avoid a hard-coded module list that fails when a vendor module is absent.
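
One way to avoid a hard-coded list is to derive the set of generated files from the enabled options. The following is a sketch under assumptions: the helper name and the example file names used to exercise it are hypothetical, and only the _fly_rocdl_*_gen.py pattern and the option name come from this RFC.

```python
from fnmatch import fnmatch
from typing import Dict, List, Mapping

# Glob pattern of generated binding files owned by each vendor option.
VENDOR_PATTERNS: Dict[str, str] = {
    "FLYDSL_ENABLE_FLY_ROCDL": "_fly_rocdl_*_gen.py",
}


def select_generated_files(present: List[str], options: Mapping[str, bool]) -> List[str]:
    """Keep every file except those owned by a disabled vendor (hypothetical helper)."""
    disabled = [
        pattern
        for option, pattern in VENDOR_PATTERNS.items()
        if not options.get(option, False)
    ]
    return [
        name
        for name in present
        if not any(fnmatch(name, pattern) for pattern in disabled)
    ]
```

For example, given the hypothetical files _fly_ops_gen.py and _fly_rocdl_ops_gen.py, disabling FLYDSL_ENABLE_FLY_ROCDL leaves only the core file in the copy/stub step.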

4) CMake option model

Introduce a new CMake option:

  • FLYDSL_ENABLE_FLY_ROCDL (default ON)

This option gates the Fly->ROCDL-related build path consistently (dialect
bindings, conversion pass registration/CAPI wiring, and related Python module
packaging).

When adding support for additional hardware platforms in the future, introduce
new parallel CMake options (peer options to FLYDSL_ENABLE_FLY_ROCDL) to gate
their platform-specific stacks.

4.1) Build-time selection vs runtime env semantics

This RFC distinguishes two configuration layers with different responsibilities:

  1. Build-time (CMake) vendor selection decides which vendor components are
    present in the produced binaries (dialect bindings, conversion/CAPI targets,
    runtime wrapper shared libraries).
  2. Runtime env selection (FLYDSL_COMPILE_BACKEND,
    FLYDSL_RUNTIME_KIND) keeps the existing Python configuration surface and
    compile/runtime pairing validation model.

For the recommended v1 single-vendor build model, runtime env variables do not
provide additional vendor-switching capability beyond what is built:

  • If env is unset, defaults resolve to the built vendor.
  • If env explicitly selects a different vendor, FlyDSL must fail fast with a
    clear diagnostic (instead of failing later in JIT/link/load paths).

This preserves one stable user/programmatic interface across single-vendor and
future multi-vendor packaging modes while avoiding ambiguous "third selector"
configuration.
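
The fail-fast rule for single-vendor builds can be sketched as a small resolver. This is an illustration only: the function and exception names are hypothetical, and it assumes the env variables carry plain vendor names such as rocm, which this RFC does not specify.

```python
import os
from typing import Mapping, Optional


class VendorMismatchError(RuntimeError):
    """Raised when the env requests a vendor that this build does not contain."""


def resolve_vendor(
    env_var: str,
    built_vendor: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve a selector such as FLYDSL_COMPILE_BACKEND against the built vendor."""
    env = os.environ if env is None else env
    requested = env.get(env_var)
    if not requested:
        # Unset (or empty) env: default to whatever the build provides.
        return built_vendor
    if requested != built_vendor:
        # Fail before any JIT/link/load step can produce a confusing error.
        raise VendorMismatchError(
            f"{env_var}={requested!r} requested, but this build only "
            f"provides {built_vendor!r}"
        )
    return requested
```

Calling the resolver once at import or backend-initialization time keeps the diagnostic close to the misconfiguration instead of surfacing later as a missing symbol or library.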

5) CAPI and pass registration linkage

Keep the current registry-safe pattern:

  1. Vendor pass CAPI functions are implemented in vendor CAPI libs.
  2. _mlirRegisterEverything links those CAPI libs via EMBED_CAPI_LINK_LIBS conditionally.
  3. Do not add vendor C++ pass libs to _mlirRegisterEverything PRIVATE_LINK_LIBS.

This preserves global pass registration visibility for PassManager.parse().

Concrete file-level direction

lib/CMakeLists.txt

  • Add add_subdirectory(Runtime) (new).
  • Keep existing Conversion, Dialect, CAPI subdirectories.

lib/CAPI/CMakeLists.txt and lib/CAPI/Dialect/CMakeLists.txt

  • Keep the per-vendor CAPI subdirectories, but gate them with the same vendor options used by the Python build.
  • Ensure the CAPI targets needed for Python embedding exist only when the corresponding vendor option is enabled.

python/mlir_flydsl/CMakeLists.txt

Refactor into clear sections:

  1. Core Python sources/extensions.
  2. Optional vendor dialect Python bindings/extensions.
  3. _mlirRegisterEverything generation and conditional EMBED_CAPI_LINK_LIBS.
  4. Runtime wrapper dependency hook from lib/Runtime target(s), not inlined HIP build logic.
  5. Conditional stubgen/copy for vendor-generated python files.

python/mlir_flydsl/FlyRegisterEverything.cpp

  • Convert to generated source with conditional vendor blocks.
  • Keep module name _mlirRegisterEverything unchanged for compatibility.

python/mlir_flydsl/dialects/FlyROCDL.td

  • No semantic change in ROCDL op definitions.
  • Build inclusion becomes conditional in CMake.
