Suggestion Description
RFC: Multi-vendor runtime and Python registration build layout
Summary
This RFC proposes a build and registration layout that makes FlyDSL's MLIR/Python stack multi-vendor friendly while preserving current ROCm behavior as the default.
The scope covers:
- lib/Runtime split and build selection for vendor JIT runtime wrappers.
- python/mlir_flydsl/FlyRegisterEverything.cpp registration model without hard-coded vendor assumptions.
- python/mlir_flydsl/dialects/FlyROCDL.td and Python dialect binding wiring as optional vendor components.
- Related CMake files in lib/ and python/mlir_flydsl/.
The design keeps one strict invariant: a single runtime stack per process. Different builds (or optional components in one build) can still support different vendors.
Motivation
Current build wiring is ROCm-first in multiple places:
- FlyRegisterEverything.cpp unconditionally registers the fly_rocdl dialect and ROCDL conversion passes.
- python/mlir_flydsl/CMakeLists.txt always declares ROCDL Python bindings and includes upstream rocdl dialect sources.
- FlyJitRuntime is built directly from lib/Runtime/FlyRocmRuntimeWrappers.cpp in Python CMake, with HIP as a required dependency.
This makes it harder to:
- Build community/dev environments without vendor-specific target dialect/runtime packages.
- Add a second vendor backend without copy-pasting ROCm assumptions into core wiring.
- Keep pass registration behavior correct while avoiding duplicate MLIRPass registry instances.
Non-goals
- Supporting multiple vendor runtimes loaded simultaneously in one process.
- Defining full lowering pipelines for all future vendors in this RFC.
- Large API redesign of Python user-facing DSL.
Design goals
- Default-compatible: no behavior change for current ROCm-first builds unless options are changed.
- Composable: each vendor stack (dialect, conversion, CAPI, runtime wrapper) can be enabled/disabled with CMake options.
- Registry-safe: preserve global pass registry semantics used by _mlirRegisterEverything and FlyPythonCAPI.
- Incremental adoption: phases can land independently with low churn.
Current constraints that must remain true
From existing python/mlir_flydsl/CMakeLists.txt and runtime behavior:
- _mlirRegisterEverything must not statically link C++ pass libs that create a separate local MLIRPass registry.
- Passes should be registered through CAPI entry points linked via EMBED_CAPI_LINK_LIBS into FlyPythonCAPI / _mlirRegisterEverything.
- FlyPythonCAPI symbol visibility control (-fvisibility=hidden + version script) remains enforced to avoid LLVM symbol conflicts.
Proposed architecture
1) Runtime layer layout (lib/Runtime)
Introduce per-vendor runtime wrapper targets under lib/Runtime/ and move runtime build logic out of python/mlir_flydsl/CMakeLists.txt.
Proposed structure:
lib/Runtime/
  CMakeLists.txt
  rocm/
    CMakeLists.txt
    FlyRocmRuntimeWrappers.cpp
  # future
  iluvatar/
    CMakeLists.txt
    FlyIluvatarRuntimeWrappers.cpp
Build strategy options:
- Option A (recommended v1): single selected runtime target per build
  - FLYDSL_JIT_RUNTIME=rocm|iluvatar|none
  - Builds one shared library with the stable output basename expected by the Python loader.
- Option B: multi-runtime artifacts in one build
  - Build libfly_jit_runtime_rocm.so, libfly_jit_runtime_iluvatar.so, etc.
  - Python backend chooses by jit_runtime_lib_basenames().
This RFC recommends Option A for initial rollout to reduce packaging and symbol-surface complexity.
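The difference between the two options, as seen from the Python loader, can be sketched as follows. This is only an illustration: the helper names `candidate_runtime_libs` and `find_runtime_lib` are hypothetical, and the Option A basename `libfly_jit_runtime.so` is an assumed placeholder, since the RFC does not state the actual stable basename.

```python
from pathlib import Path
from typing import List, Optional

# Assumed placeholder for the single stable basename used by Option A.
SINGLE_RUNTIME_BASENAME = "libfly_jit_runtime.so"
# Per-vendor naming pattern taken from the Option B examples in the RFC.
PER_VENDOR_BASENAME = "libfly_jit_runtime_{vendor}.so"


def candidate_runtime_libs(vendor: str, multi_runtime: bool) -> List[str]:
    """Return the library basenames a loader would look for (hypothetical helper)."""
    if vendor == "none":
        return []  # build configured without a JIT runtime wrapper
    if multi_runtime:
        # Option B: one artifact per vendor, chosen by name.
        return [PER_VENDOR_BASENAME.format(vendor=vendor)]
    # Option A: one stable basename regardless of the selected vendor.
    return [SINGLE_RUNTIME_BASENAME]


def find_runtime_lib(lib_dir: Path, vendor: str, multi_runtime: bool) -> Optional[Path]:
    """Return the first candidate that exists in lib_dir, or None."""
    for name in candidate_runtime_libs(vendor, multi_runtime):
        path = lib_dir / name
        if path.is_file():
            return path
    return None
```

With Option A the loader never needs to know which vendor was built, which is one reason the packaging surface stays smaller.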
2) Registration model (FlyRegisterEverything.cpp)
Replace hard-coded vendor registrations with generated or conditionally compiled blocks.
Recommended mechanism:
- Replace source with template FlyRegisterEverything.cpp.in.
- CMake assembles vendor registration fragments from enabled options.
- Generate final FlyRegisterEverything.cpp via configure_file.
Core registrations (always on):
- mlirRegisterAllDialects
- fly dialect handle insertion
- mlirRegisterAllPasses
- mlirRegisterFlyPasses
Vendor registrations (conditional):
- dialect handle insertion (e.g. fly_rocdl)
- conversion pass registration (e.g. mlirRegisterFlyToROCDLConversionPass)
- vendor transform/attr passes (e.g. mlirRegisterFlyROCDLClusterAttrPass)
This avoids references to missing symbols when a vendor stack is not built.
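The fragment-assembly step can be modeled in a few lines of Python. This is a model of what the CMake logic would do, not the build code itself: the placeholder name `@FLY_VENDOR_REGISTRATIONS@`, the template text, and the zero-argument call form of the registration functions are all assumptions. Only the function names come from this RFC, and the `@VAR@` placeholder style mirrors what configure_file substitutes.

```python
from typing import Dict, Iterable, List

# Assumed template text; the real FlyRegisterEverything.cpp.in may differ.
TEMPLATE = """\
// Core registrations (always on) would appear here.
@FLY_VENDOR_REGISTRATIONS@
"""

# Per-vendor fragments. Function names come from the RFC; writing them as
# zero-argument calls is an assumption made for illustration.
VENDOR_FRAGMENTS: Dict[str, List[str]] = {
    "rocdl": [
        "mlirRegisterFlyToROCDLConversionPass();",
        "mlirRegisterFlyROCDLClusterAttrPass();",
    ],
}


def render_registration_source(enabled_vendors: Iterable[str]) -> str:
    """Substitute the placeholder with fragments of the enabled vendors only."""
    lines: List[str] = []
    for vendor in enabled_vendors:
        lines.extend(VENDOR_FRAGMENTS[vendor])
    return TEMPLATE.replace("@FLY_VENDOR_REGISTRATIONS@", "\n".join(lines))
```

When no vendor is enabled, the rendered source contains no vendor symbol at all, so nothing is left for the linker to resolve.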
3) Python dialect bindings (python/mlir_flydsl/dialects/*.td)
Vendor dialect binding declarations become optional:
- FlyOps.td binding remains always enabled (core Fly).
- FlyROCDL.td binding is wrapped by if(FLYDSL_ENABLE_FLY_ROCDL).
- Future vendor dialects (e.g. FlyIXDL.td) follow the same pattern.
In MLIRFlyDSLSources, include upstream vendor dialect python sources conditionally:
- Add MLIRPythonSources.Dialects.rocdl only when the ROCDL stack is enabled.
For generated binding files and stub generation:
- Make _fly_rocdl_*_gen.py copy/stub commands conditional.
- Avoid hard-coded module list that fails when a vendor module is absent.
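If stub generation is driven from a Python script rather than purely from CMake, one way to avoid a hard-coded module list is to probe which modules are actually present. The sketch below is an assumption about how that could look; `module_exists` and `available_modules` are hypothetical helpers, and the RFC does not prescribe this approach.

```python
import importlib.util
from typing import Iterable, List


def module_exists(name: str) -> bool:
    """Report whether a module can be located, without importing the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package of a
        # dotted name is missing; treat that as "not available".
        return False


def available_modules(candidates: Iterable[str]) -> List[str]:
    """Keep only the candidate modules that are present in this build."""
    return [name for name in candidates if module_exists(name)]
```

A stub-generation step would then iterate over `available_modules(...)` and skip vendor modules that were not built instead of failing on them.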
4) CMake option model
Introduce a new CMake option:
- FLYDSL_ENABLE_FLY_ROCDL (default ON)
This option gates the Fly->ROCDL-related build path consistently (dialect
bindings, conversion pass registration/CAPI wiring, and related Python module
packaging).
When adding support for additional hardware platforms in the future, introduce
new parallel CMake options (peer options to FLYDSL_ENABLE_FLY_ROCDL) to gate
their platform-specific stacks.
4.1) Build-time selection vs runtime env semantics
This RFC distinguishes two configuration layers with different responsibilities:
- Build-time (CMake) vendor selection decides which vendor components are
present in the produced binaries (dialect bindings, conversion/CAPI targets,
runtime wrapper shared libraries).
- Runtime env selection (FLYDSL_COMPILE_BACKEND, FLYDSL_RUNTIME_KIND) keeps
the existing Python configuration surface and compile/runtime pairing
validation model.
For the recommended v1 single-vendor build model, runtime env variables do not
provide additional vendor-switching capability beyond what is built:
- If env is unset, defaults resolve to the built vendor.
- If env explicitly selects a different vendor, FlyDSL must fail fast with a
clear diagnostic (instead of failing later in JIT/link/load paths).
This preserves one stable user/programmatic interface across single-vendor and
future multi-vendor packaging modes while avoiding ambiguous "third selector"
configuration.
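A minimal sketch of the fail-fast rule for a single-vendor build is shown below. The environment variable names are from this RFC. Everything else is assumed for illustration: the `BUILT_VENDOR` constant (which would be recorded at build time), the `VendorMismatchError` type, the `resolve_backend` helper, and the premise that both variables take a vendor name such as `rocm` as their value.

```python
import os
from typing import Mapping, Optional, Tuple

BUILT_VENDOR = "rocm"  # hypothetical: would be recorded by the build


class VendorMismatchError(RuntimeError):
    """Raised when the environment asks for a vendor this build lacks."""


def resolve_backend(
    env: Optional[Mapping[str, str]] = None,
    built_vendor: str = BUILT_VENDOR,
) -> Tuple[str, str]:
    """Return (compile_backend, runtime_kind), failing fast on a mismatch."""
    env = os.environ if env is None else env
    resolved = {}
    for var in ("FLYDSL_COMPILE_BACKEND", "FLYDSL_RUNTIME_KIND"):
        # Unset or empty values default to the vendor that was built.
        value = env.get(var, "").strip().lower() or built_vendor
        if value != built_vendor:
            raise VendorMismatchError(
                f"{var}={value!r} was requested, but this build only "
                f"includes {built_vendor!r}"
            )
        resolved[var] = value
    return resolved["FLYDSL_COMPILE_BACKEND"], resolved["FLYDSL_RUNTIME_KIND"]
```

Running this check once at import or first use surfaces a misconfiguration before any JIT, link, or library-load step is attempted.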
5) CAPI and pass registration linkage
Keep the current registry-safe pattern:
- Vendor pass CAPI functions are implemented in vendor CAPI libs.
- _mlirRegisterEverything links those CAPI libs via EMBED_CAPI_LINK_LIBS conditionally.
- Do not add vendor C++ pass libs to _mlirRegisterEverything PRIVATE_LINK_LIBS.
This preserves global pass registration visibility for PassManager.parse().
Concrete file-level direction
lib/CMakeLists.txt
- Add add_subdirectory(Runtime) (new).
- Keep existing Conversion, Dialect, CAPI subdirectories.
lib/CAPI/CMakeLists.txt and lib/CAPI/Dialect/CMakeLists.txt
- Keep per-vendor CAPI subdirs but gate with same vendor options used by Python.
- Ensure CAPI targets needed by Python embedding are available only when enabled.
python/mlir_flydsl/CMakeLists.txt
Refactor into clear sections:
- Core Python sources/extensions.
- Optional vendor dialect Python bindings/extensions.
- _mlirRegisterEverything generation and conditional EMBED_CAPI_LINK_LIBS.
- Runtime wrapper dependency hook from lib/Runtime target(s), not inlined HIP build logic.
- Conditional stubgen/copy for vendor-generated python files.
python/mlir_flydsl/FlyRegisterEverything.cpp
- Convert to generated source with conditional vendor blocks.
- Keep module name _mlirRegisterEverything unchanged for compatibility.
python/mlir_flydsl/dialects/FlyROCDL.td
- No semantic change in ROCDL op definitions.
- Build inclusion becomes conditional in CMake.
Operating System
No response
GPU
No response
ROCm Component
No response